Exam DP203 Copy Data tool

From MillerSql.com

Latest revision as of 18:45, 15 November 2024

Copy Data Tool

In Azure Synapse Analytics Studio, under the "Home" section on the left-hand side, click the large "Ingest" button.

From:

https://microsoftlearning.github.io/dp-203-azure-data-engineer/Instructions/Labs/01-Explore-Azure-Synapse.html

Select "Built-in copy task".

Source

In the definition of the source, click "New Connection". Create a new connection, and in the Linked service pane that appears, on the Generic protocol tab, select HTTP. Give it a name of "Products". Then enter the URL of a .csv file hosted on a GitHub share.
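Behind the scenes, this step generates a linked service definition. As a rough sketch (not the exact output of the tool), the JSON has approximately this shape; the URL is a placeholder, and Anonymous authentication is assumed since the GitHub file is public:

```json
{
    "name": "Products",
    "properties": {
        "type": "HttpServer",
        "typeProperties": {
            "url": "https://raw.githubusercontent.com/<account>/<repo>/main/<file>.csv",
            "enableServerCertificateValidation": true,
            "authenticationType": "Anonymous"
        }
    }
}
```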

Destination

In Destination, choose the type Azure Data Lake Storage Gen2, and select the existing connection from the popup list: synapse9cxqfg2-WorkspaceDefaultStorage

(Note that this existing connection can be seen beforehand under the Data - Linked tab.)

Give it a relative path of: files/product_data

and file name: products.csv
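The relative path and file name chosen above end up in the generated destination dataset. A hedged sketch of roughly what that DelimitedText dataset JSON looks like (the names come from this walkthrough; the remaining properties are assumptions based on typical defaults):

```json
{
    "name": "DestinationDataset_ghb",
    "properties": {
        "linkedServiceName": {
            "referenceName": "synapse9cxqfg2-WorkspaceDefaultStorage",
            "type": "LinkedServiceReference"
        },
        "type": "DelimitedText",
        "typeProperties": {
            "location": {
                "type": "AzureBlobFSLocation",
                "fileSystem": "files",
                "folderPath": "product_data",
                "fileName": "products.csv"
            },
            "columnDelimiter": ",",
            "firstRowAsHeader": true
        }
    }
}
```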

Run job

Run the job to perform the data load.
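The job the tool creates is a pipeline containing a Copy activity that wires the source and destination datasets together. Roughly, its JSON has this shape (an abbreviated sketch; the activity name and store settings are assumptions, and the real output includes additional properties):

```json
{
    "name": "Copy_ghb",
    "type": "Copy",
    "inputs": [
        { "referenceName": "SourceDataset_ghb", "type": "DatasetReference" }
    ],
    "outputs": [
        { "referenceName": "DestinationDataset_ghb", "type": "DatasetReference" }
    ],
    "typeProperties": {
        "source": {
            "type": "DelimitedTextSource",
            "storeSettings": { "type": "HttpReadSettings" }
        },
        "sink": {
            "type": "DelimitedTextSink",
            "storeSettings": { "type": "AzureBlobFSWriteSettings" }
        }
    }
}
```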

Then back in Azure Synapse Studio, in Data - Linked, notice that a new header item has appeared: "Integration Datasets". This contains two items, SourceDataset_ghb and DestinationDataset_ghb, referring to the above source and destination. The SourceDataset_ghb item shows the "Products" linked connection (note that this connection is also shown under Manage - Linked Services).

Note also that under the synapse9cxqfg2-WorkspaceDefaultStorage header item, the destination products.csv file now appears.

Note that there doesn't seem to be an item corresponding to the source URL (the GitHub one).