Export SQL Server query result as xml file in Data Factory

Export SQL Server query result as xml file in Data Factory

Published on: 8 March 2022

Author: Ramesh Kanjinghat

Recently I had to copy data from SQL Server and store it as XML files in Azure Data Lake Storage. I had to use XML format because the client system understands only XML format. We use Azure Data Factory (ADF) and unfortunately ADF doesn't have built in datasets to store data in XML format. We can save data in formats like AVRO, Parquet, JSON etc but XML is not an option.
It is kind of a workaround but I was able to do it using DelimitedText format. So, I thought I will share it just in case. At high level we use FOR XML clause and DeimitedText dataset.
SQL Server select statement can return data in XML format. So, I have update my select statement to use this clause. Please refer, https://docs.microsoft.com/en-us/sql/relational-databases/xml/for-xml-sql-server?view=sql-server-ver15 to learn more about SQL Server FOR XML clause .

Step 1: Create a dataset

  • Go to Author > Datasets > New dataset.
    Create dataset step 1
    Create dataset step 1
  • Select Azure Data Lake Storage Gen2 from source list and hit Continue.
    Create dataset step 3
    Create dataset step 3
  • Select DelimitedText as format and hit Continue.
    Create dataset step 1
    Create dataset step 1
In the above screenshot you can see that XML is an option while creating a dataset. But once you create a dataset as XML format then that dataset cannot be used as Sink in the pipelines.
  • Provide a Name, select appropriate Linked service.
  • Provide the File path. Make sure that you only provide the folder path and not the file name. Leave the file box, number 5 in below image.
  • Hit OK.
    Create dataset step 5
    Create dataset step 5
  • Once the dataset is created select it and on the Connection tab select No quote character for the property Quote character .
This is very important. If we don't do this then the whole XML will be stored as a string value in the file. That is the whole xml content will be enclosed in either single quotes or double quotes, based on what you have selected.
Create dataset step 3
Create dataset step 3
  • Publish the dataset.

Step 2: Create a pipeline

  • Go to Author > Pipelines > New pipeline.
    Create pipeline step 1
    Create pipeline step 1
  • Provide a name , if you want to change the default name.
  • Drag and drop and copy activity and go to Source tab.
  • Select your Source dataset (SQL Server dataset).
  • Select Query as Use query option.
If it is a stored procedure then make sure that you update your SP have FOR XML clause.
Create pipeline step 1
Create pipeline step 1
  • Complete rest of the fields.
  • Go to Sink tab and select our DelimitedText dataset, in this case GenericFormat.
  • Change File extension to .xml.
    Create pipeline step 1
    Create pipeline step 1
  • Publish your pipeline.
Now run the pipeline. If it completes successfully then you can see an xml file created in your storage.

You may like