Exam DP203 Storage 2 Azure Data Lake Storage Gen2

From MillerSql.com
Revision as of 14:54, 29 November 2024 by NeilM (talk | contribs)

The second Storage type is: Azure Data Lake Storage Gen2

Built on Azure Blob Storage, but adds a hierarchical namespace so that blobs are seen as files in a real directory structure, with metadata stored about them. Optimised for analytics. Allows multiple changes to be made in a single atomic operation (for example, renaming or deleting a whole directory). Security: can assign rights to individual files and folders.

Can treat the data as if it's stored in a Hadoop Distributed File System (HDFS).
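Hadoop-compatible tools usually address the data through the ABFS driver, using a URI of the form abfss://<container>@<account>.dfs.core.windows.net/<path>. A minimal Python sketch of building such a URI follows; the function name and the account, container and path values are made up for illustration.

```python
def abfss_uri(account: str, container: str, path: str = "") -> str:
    """Build a secure ABFS URI for a path in a Data Lake Storage Gen2 account."""
    # Gen2 is addressed through the "dfs" endpoint rather than the "blob" endpoint.
    host = f"{account}.dfs.core.windows.net"
    # Strip any leading slash so the URI never contains a double slash.
    return f"abfss://{container}@{host}/{path.lstrip('/')}"


# Hypothetical names, for illustration only.
example_uri = abfss_uri("mydatalake", "raw", "/sales/2024/orders.csv")
print(example_uri)
```

A URI like this is what would typically be passed to Spark or another Hadoop-compatible engine as the file location.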

Permissions:

  1. access control lists (ACLs)
  2. Portable Operating System Interface (POSIX) style permissions (read, write, execute)
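ACLs on files and folders are commonly written in a short text form such as user::rwx,group::r-x,other::---, where an entry with a name between the colons applies to a specific user or group. The sketch below parses that form in plain Python to show what each entry grants. The AclEntry type and parse_acl function are hypothetical helpers for illustration, not part of any Azure SDK.

```python
from typing import List, NamedTuple


class AclEntry(NamedTuple):
    scope: str       # "access" or "default"
    kind: str        # "user", "group", "mask" or "other"
    principal: str   # empty string means the owning user or owning group
    read: bool
    write: bool
    execute: bool


def parse_acl(acl: str) -> List[AclEntry]:
    """Parse a comma-separated short-form ACL string into entries."""
    entries = []
    for item in acl.split(","):
        parts = item.strip().split(":")
        scope = "access"
        # A leading "default" marks an entry inherited by new child items.
        if parts[0] == "default":
            scope = "default"
            parts = parts[1:]
        if len(parts) != 3 or len(parts[2]) != 3:
            raise ValueError(f"malformed ACL entry: {item!r}")
        kind, principal, perms = parts
        entries.append(
            AclEntry(scope, kind, principal,
                     perms[0] == "r", perms[1] == "w", perms[2] == "x")
        )
    return entries


# "1234" stands in for a user's object ID (made-up value).
for entry in parse_acl("user::rwx,group::r-x,other::---,user:1234:r--"):
    print(entry)
```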

Can be browsed and managed with Azure Storage Explorer.

Redundancy options:

  1. Locally redundant storage (LRS)
  2. Geo-redundant storage (GRS)

To enable Azure Data Lake Storage Gen2 in the Azure Portal, either:

  1. Enable "hierarchical namespace" in the "Advanced" page when creating the storage account, or
  2. Run the Data Lake Gen2 upgrade wizard in the storage account.

(When the upgrade is complete, its name will change from Blob Storage to Data Lake Storage.)

Note that the upgrade wizard will fail unless you first untick the two "soft delete" options under Data Management - Data Protection.

Blob containers can be created in both cases. The difference is that in the Gen2 scenario, any folders created underneath a container are real folders, not virtual folders that exist only as a prefix in the blob's filename. Note that "container" in this context is just another name for a file system.
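The difference between a virtual folder and a real folder can be illustrated with a small in-memory model in Python. This is a simplified sketch, not the actual storage implementation: a flat namespace is modelled as a dictionary keyed by full blob name, and a hierarchical namespace as nested dictionaries. All names are made up.

```python
def rename_virtual_folder(blobs: dict, old_prefix: str, new_prefix: str) -> int:
    """Flat namespace: the 'folder' is only a name prefix, so each blob is renamed."""
    touched = 0
    for name in list(blobs):  # copy the keys, since the dict is modified in the loop
        if name.startswith(old_prefix):
            blobs[new_prefix + name[len(old_prefix):]] = blobs.pop(name)
            touched += 1
    return touched


def rename_real_folder(tree: dict, old_name: str, new_name: str) -> int:
    """Hierarchical namespace: the folder is one entry, so one change is enough."""
    tree[new_name] = tree.pop(old_name)
    return 1


flat = {"logs/a.txt": b"1", "logs/b.txt": b"2", "data/c.txt": b"3"}
hierarchical = {"logs": {"a.txt": b"1", "b.txt": b"2"}, "data": {"c.txt": b"3"}}

# Number of entries each approach has to change to rename "logs" to "archive".
print(rename_virtual_folder(flat, "logs/", "archive/"))
print(rename_real_folder(hierarchical, "logs", "archive"))
```

In the flat model the number of changes grows with the number of files under the prefix, whereas the hierarchical model changes a single directory entry, which is why a directory rename can be a single atomic operation in Gen2.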