Exam DP203 Copy Data tool

From MillerSql.com
Latest revision as of 18:45, 15 November 2024

Copy Data Tool

In Azure Synapse Analytics Studio, under the "Home" section on the left-hand side, click the large "Ingest" button.

From:

https://microsoftlearning.github.io/dp-203-azure-data-engineer/Instructions/Labs/01-Explore-Azure-Synapse.html

Select "Built-in copy task".

Source

In the definition of the source, click "New Connection". In the Linked service pane that appears, on the Generic protocol tab, select HTTP. Give it the name "Products", then enter the URL of a .csv file hosted on GitHub.
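What the source dataset does with that file can be sketched in plain Python: fetch CSV text and treat the first row as the header, the way the Copy Data tool's source dataset does by default. The column names below are assumptions for illustration; the real columns depend on the .csv file the URL points at.

```python
import csv
import io

# Hypothetical sample standing in for the CSV fetched over HTTP;
# the actual column names come from the file hosted on GitHub.
SAMPLE_CSV = """ProductID,ProductName,ListPrice
771,Mountain-100 Silver 38,3399.99
772,Mountain-100 Silver 42,3399.99
"""

def parse_products(csv_text: str) -> list[dict]:
    """Parse CSV text with the first row as the header,
    mirroring the Copy Data tool's default source settings."""
    return list(csv.DictReader(io.StringIO(csv_text)))

rows = parse_products(SAMPLE_CSV)
```

Each row comes back as a dict keyed by the header names, which is also how the tool previews the source schema before the copy runs.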

Destination

In Destination, choose type Azure Data Lake Storage Gen 2, and select the existing connection from the popup list: synapse9cxqfg2-WorkspaceDefaultStorage

(Note that this existing connection can be seen beforehand under the Data - Linked tab.)

Give it a relative path of: files/product_data

and file name: products.csv
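The "relative path" and "file name" fields together identify where the file lands inside the linked storage container. A minimal sketch of how they combine (the abfss URI form in the comment is an assumption about the workspace's default storage, not something shown in the tool):

```python
import posixpath

def destination_blob_path(relative_path: str, file_name: str) -> str:
    """Join the Copy Data tool's 'relative path' and 'file name'
    fields into the full path within the linked ADLS Gen2 storage."""
    return posixpath.join(relative_path, file_name)

path = destination_blob_path("files/product_data", "products.csv")
# Hypothetically addressed as:
# abfss://<container>@<account>.dfs.core.windows.net/<path>
```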

Run job

Run the job to perform the data load.
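Conceptually, the job fetches the source over HTTP and writes it to the destination path. A minimal local stand-in for that copy, with an ordinary directory playing the role of the ADLS Gen2 container (the function name and layout are illustrative, not the tool's actual implementation):

```python
import pathlib
import urllib.request

def copy_http_to_lake(source_url: str, lake_root: pathlib.Path,
                      relative_path: str, file_name: str) -> pathlib.Path:
    """Sketch of the copy job: fetch the source over HTTP and write
    it under the destination's relative path and file name. A local
    directory stands in for the ADLS Gen2 container here."""
    with urllib.request.urlopen(source_url) as resp:
        data = resp.read()
    dest = lake_root / relative_path / file_name
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(data)
    return dest
```

For example, pointing it at the lab's CSV URL with `relative_path="files/product_data"` and `file_name="products.csv"` reproduces the destination layout configured above.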

Then, back in Azure Synapse Studio, under Data - Linked, notice that a new header item has appeared: "Integration Datasets". This contains two items, SourceDataset_ghb and DestinationDataset_ghb, referring to the above source and destination. The SourceDataset_ghb item shows the "Products" linked connection (note that this connection is also shown under Manage - Linked Services).

Note also that under the synapse9cxqfg2-WorkspaceDefaultStorage header item, the destination products.csv file now appears.

Note that there doesn't seem to be an item corresponding to the source URL (the GitHub one).