Exam DP203 Dedicated SQL Pool
Dedicated SQL Pool.
Reads relational data from data Warehouses.
In Synapse Studio Manage - SQL Pools tab, notice that there is an item that has been set up by the tutorial: sql9cxqfg2
Click the arrow to the left of it to resume it. I think this starts up a virtual machine in the background.
Then once it has started, in Data - Workspace, there is header "SQL Database". And inside this is database: sql9cxqfg2 underneath of which is a folder for tables and other metadata. This is a relational database that exists (by virtue of being under "Workspace") only in Synapse Studio.
Once querying is complete, pause the pool to save money.
Set the performance level for the pool: DW100c to DW30000c
Set the collation of the SQL pool, which determines sort order and string comparison rules.
Dedicated SQL pools in Synapse Analytics don't support foreign key and unique constraints as found in other relational database systems like SQL Server. This means such uniqueness must be enforced during data load.
Indexes. The default index type is clustered columnstore. Though regular clustered indexes are also supported (and is needed for the VARBINARY(MAX) datatype).
Again they use a massively parallel processing (MPP) architecture, as opposed to the symmetric multiprocessing (SMP) architecture used in most OLTP database systems.
To implement this MPP architecture, the data is distributed on the nodes according to one of the following:
- Hash. Here a hash of the data is stored on the nodes. Use with FACT tables
- Round-Robin: Here the data is split across multiple nodes. Use with STAGING tables.
- Replicate. All the data is on all of the nodes. Useful for performance with small datasets. Use with DIMENSION tables.
Table Creation
To create a table, run:
CREATE TABLE dbo.DimCustomer ( CustomerKey INT IDENTITY NOT NULL, CustomerAlternateKey NVARCHAR(15) NULL, CustomerName NVARCHAR(80) NOT NULL, EmailAddress NVARCHAR(50) NULL, Phone NVARCHAR(25) NULL, StreetAddress NVARCHAR(100), City NVARCHAR(20), PostalCode NVARCHAR(10), CountryRegion NVARCHAR(20) ) WITH ( DISTRIBUTION = REPLICATE, CLUSTERED COLUMNSTORE INDEX );
Note the key column values are auto-generated with an IDENTITY field.