Exam DP203 Copy Data tool
Latest revision as of 18:45, 15 November 2024
Copy Data Tool
In Azure Synapse Analytics Studio, under the "Home" section on the left-hand side, click the large "Ingest" button.
From: https://microsoftlearning.github.io/dp-203-azure-data-engineer/Instructions/Labs/01-Explore-Azure-Synapse.html
Select "Built-in copy task".
Source
In the source definition, click "New connection". In the Linked service pane that appears, on the Generic protocol tab, select HTTP. Give the connection the name "Products", then enter the URL of a .csv file hosted in a GitHub repository.
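For reference, the "Products" HTTP linked service the wizard creates corresponds roughly to a JSON definition of the following shape, shown here as a Python dict. This is a sketch based on the usual Synapse/Data Factory linked service schema, not the wizard's exact output; the URL is a placeholder, not the lab's actual file location.

```python
# Approximate shape of the "Products" HTTP linked service definition.
# The URL below is a placeholder (the real lab URL is not reproduced here).
source_linked_service = {
    "name": "Products",
    "properties": {
        "type": "HttpServer",  # generic HTTP connector
        "typeProperties": {
            "url": "https://example.com/data/products.csv",  # placeholder
            "authenticationType": "Anonymous",  # public file, no credentials
        },
    },
}

print(source_linked_service["name"])
```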
Destination
For the destination, choose the type Azure Data Lake Storage Gen2 and select the existing connection from the pop-up list: synapse9cxqfg2-WorkspaceDefaultStorage.
(Note: this existing connection can also be seen beforehand under the Data - Linked tab.)
Give it a relative path of files/product_data and a file name of products.csv.
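The destination settings above end up in a generated dataset. A sketch of its likely JSON shape, expressed as a Python dict, is below; the dataset name matches the one that appears later under "Integration Datasets", but the exact property spellings are assumptions based on the standard Synapse/Data Factory dataset schema.

```python
# Approximate shape of the generated destination dataset definition.
destination_dataset = {
    "name": "DestinationDataset_ghb",
    "properties": {
        "linkedServiceName": {
            "referenceName": "synapse9cxqfg2-WorkspaceDefaultStorage",
            "type": "LinkedServiceReference",
        },
        "type": "DelimitedText",  # .csv output
        "typeProperties": {
            "location": {
                "type": "AzureBlobFSLocation",  # ADLS Gen2 location
                "fileSystem": "files",          # container
                "folderPath": "product_data",   # relative path entered above
                "fileName": "products.csv",
            },
        },
    },
}

print(destination_dataset["properties"]["typeProperties"]["location"]["fileName"])
```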
Run job
Run the job to perform the data load.
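When the job runs, the two generated datasets are tied together by a Copy activity in the generated pipeline. A hedged sketch of that activity's JSON shape follows, as a Python dict; the activity name "Copy_ghb" is hypothetical, and the source/sink type names are assumed from the standard pipeline schema for delimited-text copies.

```python
# Approximate shape of the Copy activity linking source and destination.
# "Copy_ghb" is a hypothetical name, not confirmed from the lab.
copy_activity = {
    "name": "Copy_ghb",
    "type": "Copy",
    "inputs": [
        {"referenceName": "SourceDataset_ghb", "type": "DatasetReference"}
    ],
    "outputs": [
        {"referenceName": "DestinationDataset_ghb", "type": "DatasetReference"}
    ],
    "typeProperties": {
        "source": {"type": "DelimitedTextSource"},
        "sink": {"type": "DelimitedTextSink"},
    },
}

print(copy_activity["type"])
```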
Then, back in Azure Synapse Studio, under Data - Linked, notice that a new header item has appeared: "Integration Datasets". It contains two items, SourceDataset_ghb and DestinationDataset_ghb, referring to the source and destination above. The SourceDataset_ghb item shows the "Products" linked connection. (Note: this connection is also shown under Manage - Linked services.)
Note also that under the synapse9cxqfg2-WorkspaceDefaultStorage header item, the destination products.csv file now appears.
Note that there doesn't seem to be an item corresponding to the source URL (the GitHub one).