Exam DP-203: Synapse Data Pipelines

Data Pipelines in Synapse Studio

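Pipelines are created and managed on the Integrate tab of Synapse Studio.

Pipelines run on an integration runtime, which provides the compute for their activities.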

For pipelines to be able to do anything, you have to define Linked Services to the resources they will need to access.
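As a rough illustration of what a linked service definition holds, the sketch below builds one as a Python dictionary and prints it as JSON. The name "DataLakeGen2" and the storage account URL are hypothetical placeholders. "AzureBlobFS" is the connector type used for Azure Data Lake Storage Gen2, but the properties a real workspace generates (authentication in particular) will likely differ.

```python
import json

# Hypothetical linked service pointing at an ADLS Gen2 account.
# The name and URL are placeholders, not values from a real workspace.
linked_service = {
    "name": "DataLakeGen2",
    "properties": {
        "type": "AzureBlobFS",  # connector type for Data Lake Storage Gen2
        "typeProperties": {
            "url": "https://examplestorageaccount.dfs.core.windows.net"
        },
    },
}

# Serialize to the JSON form that a definition like this is stored in.
linked_service_json = json.dumps(linked_service, indent=2)
print(linked_service_json)
```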

To create a pipeline, click the + button and select "Pipeline".

When you create a pipeline in Synapse Studio, the control flow items are shown graphically, as in SSIS. Colored arrows between them (for example, green for success and red for failure) indicate which activity runs next on success, failure, or completion.
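In the pipeline JSON, those arrows correspond to a "dependsOn" list on each activity, with dependency conditions such as "Succeeded", "Failed", and "Completed". The sketch below is a simplified model of that idea, not the service's scheduling logic: the activity names are hypothetical, other activity fields are left out, and the helper only looks at one upstream activity at a time.

```python
# Hypothetical activities, reduced to the fields needed for this sketch.
activities = [
    {"name": "CopyRawFiles", "dependsOn": []},
    {
        "name": "LoadProducts",
        "dependsOn": [
            {"activity": "CopyRawFiles", "dependencyConditions": ["Succeeded"]}
        ],
    },
    {
        "name": "NotifyFailure",
        "dependsOn": [
            {"activity": "CopyRawFiles", "dependencyConditions": ["Failed"]}
        ],
    },
    {
        "name": "LogRun",
        "dependsOn": [
            {"activity": "CopyRawFiles", "dependencyConditions": ["Completed"]}
        ],
    },
]


def next_activities(activities, finished, outcome):
    """Return names of activities whose dependency on `finished` is met.

    `outcome` is "Succeeded" or "Failed". A "Completed" condition is
    treated as matching either outcome.
    """
    matches = []
    for activity in activities:
        for dep in activity["dependsOn"]:
            conditions = dep["dependencyConditions"]
            if dep["activity"] == finished and (
                outcome in conditions or "Completed" in conditions
            ):
                matches.append(activity["name"])
    return matches


print(next_activities(activities, "CopyRawFiles", "Succeeded"))
print(next_activities(activities, "CopyRawFiles", "Failed"))
```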

Once the pipeline has been created, click the curly brackets {} at the top right to display its JSON code. A pipeline can also be generated from supplied JSON code: click the + button and select "Import from Pipeline Template" (the JSON file must be supplied as a .zip).
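To get a feel for that JSON, the sketch below parses a small pipeline definition and lists its activities. The pipeline and activity names are hypothetical, and the JSON is trimmed to the general shape (a name, plus properties containing an activities list); a definition copied from the {} view will carry more fields.

```python
import json

# Hypothetical pipeline JSON, similar in shape to what the {} view displays.
pipeline_json = """
{
  "name": "ExamplePipeline",
  "properties": {
    "activities": [
      {"name": "StepOne", "type": "Copy", "dependsOn": []},
      {"name": "StepTwo", "type": "ExecuteDataFlow",
       "dependsOn": [
         {"activity": "StepOne", "dependencyConditions": ["Succeeded"]}
       ]}
    ]
  }
}
"""

pipeline = json.loads(pipeline_json)

# Collect (name, type) pairs for every activity in the pipeline.
summary = [(a["name"], a["type"]) for a in pipeline["properties"]["activities"]]
for name, activity_type in summary:
    print(f"{name}: {activity_type}")
```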

Data moves from a source, through optional transformations, to a sink.

Triggers run the pipeline: immediately, on a schedule, or in response to an event.
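For the scheduled case, a trigger definition is also JSON. The sketch below shows the general shape of a schedule trigger that runs a pipeline once a day. The trigger name, pipeline name, and start time are hypothetical, and a trigger created in the UI may include additional properties.

```python
import json

# Hypothetical daily schedule trigger; names and start time are placeholders.
trigger = {
    "name": "DailyTrigger",
    "properties": {
        "type": "ScheduleTrigger",
        "typeProperties": {
            "recurrence": {
                "frequency": "Day",  # unit of recurrence
                "interval": 1,       # every 1 day
                "startTime": "2024-01-01T06:00:00Z",
                "timeZone": "UTC",
            }
        },
        # Pipelines that this trigger starts.
        "pipelines": [
            {
                "pipelineReference": {
                    "referenceName": "ExamplePipeline",
                    "type": "PipelineReference",
                }
            }
        ],
    },
}

recurrence = trigger["properties"]["typeProperties"]["recurrence"]
print(f"Runs every {recurrence['interval']} {recurrence['frequency']}")
print(json.dumps(trigger, indent=2))
```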

The Monitor tab of Synapse Studio shows the run progress of a pipeline. Microsoft Purview can use this run information to track data lineage.

Create Pipeline

Under Activities, in Move and Transform, select Data Flow (another option is Copy Data). Name it "LoadProducts".

On the Settings tab of this Data Flow activity, to the right of the "Data Flow" drop-down list, click + New. This opens a new window for the data flow.

In this window, at the top left, click the greyed-out "Add Source" graphic to add a new source stream. Then, at the bottom, click the + New button to select a data source. Search for "Gen2" to find the Azure Data Lake Storage Gen2 type, then choose the CSV format. In the "Linked Service" drop-down list, select the existing linked service to the Gen2 storage.

Then click the browse button to choose the source file, which fills in the path to products.csv, and click OK to return to the data flow window.

Verify that "Source Type" is set to "Integration Dataset" (as opposed to "Inline" or "Workspace DB").
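The integration dataset created in these steps is itself stored as JSON. The sketch below approximates what a delimited-text dataset for products.csv might look like. The dataset name, linked service name, file system, and folder path are hypothetical placeholders (these notes do not give the actual path), and the small helper that joins the location into a path is only for illustration.

```python
import json

# Hypothetical dataset definition for the products.csv source file.
# Only the file name comes from the notes; everything else is a placeholder.
dataset = {
    "name": "ProductsCsv",
    "properties": {
        "type": "DelimitedText",  # dataset type for CSV files
        "linkedServiceName": {
            "referenceName": "DataLakeGen2",
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            "location": {
                "type": "AzureBlobFSLocation",
                "fileSystem": "files",
                "folderPath": "data",
                "fileName": "products.csv",
            },
            "columnDelimiter": ",",
            "firstRowAsHeader": True,
        },
    },
}


def dataset_path(dataset):
    """Join file system, folder, and file name into a single path string."""
    location = dataset["properties"]["typeProperties"]["location"]
    parts = [location["fileSystem"], location["folderPath"], location["fileName"]]
    return "/".join(parts)


print(dataset_path(dataset))
print(json.dumps(dataset, indent=2))
```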