DP-201: Designing an Azure Data Solution
DP-201: Designing an Azure Data Solution is one of the exams required for the Microsoft Certified: Azure Data Engineer Associate certification.
Exam requirements
The official exam document is published here: https://docs.microsoft.com/en-us/learn/certifications/exams/dp-201
Exam preparation
Books covering the exam
Video training for the exam
Microsoft Partner Network
- DP-200: Implementing an Azure Data Solution
https://partner.microsoft.com/en-us/training/assets/collection/dp-200-implementing-an-azure-data-solution#/
- DP-201: Designing an Azure Data Solution
https://partner.microsoft.com/en-us/training/assets/collection/dp-201-designing-an-azure-data-solution#/
PluralSight
- Microsoft Azure Data Engineer (DP-200)
https://app.pluralsight.com/paths/certificate/microsoft-azure-data-engineer-dp-200
- Azure Data Solution (DP-201)
https://app.pluralsight.com/paths/certificate/azure-data-solution-dp-201
SQLBits
- XVIII (2019) Conference
https://sqlbits.com/content/Event18?type=3
Online training
Microsoft Learn (free)
- Azure Fundamentals
https://docs.microsoft.com/en-us/learn/paths/azure-fundamentals/
- Azure for the Data Engineer
https://docs.microsoft.com/en-us/learn/paths/azure-for-the-data-engineer/
- Store data in Azure
https://docs.microsoft.com/en-us/learn/paths/store-data-in-azure/
- Work with relational data in Azure
https://docs.microsoft.com/en-us/learn/paths/work-with-relational-data-in-azure/
- Work with NoSQL data in Azure Cosmos DB
https://docs.microsoft.com/en-us/learn/paths/work-with-nosql-data-in-azure-cosmos-db/
- Large-Scale Data Processing with Azure Data Lake Storage Gen2
https://docs.microsoft.com/en-us/learn/paths/data-processing-with-azure-adls/
- Implement a Data Streaming Solution with Azure Streaming Analytics
https://docs.microsoft.com/en-us/learn/paths/implement-data-streaming-with-asa/
- Implement a Data Warehouse with Azure Synapse Analytics
https://docs.microsoft.com/en-us/learn/paths/implement-sql-data-warehouse/
- Data engineering with Azure Databricks
https://docs.microsoft.com/en-us/learn/paths/data-engineer-azure-databricks/
- Perform data science with Azure Databricks
https://docs.microsoft.com/en-us/learn/paths/perform-data-science-azure-databricks/
Instructor-led training
Microsoft Learning Partner
- Course DP-200T01-A: Implementing an Azure Data Solution
https://docs.microsoft.com/en-us/learn/certifications/courses/dp-200t01
- Course DP-201T01-A: Designing an Azure Data Solution
https://docs.microsoft.com/en-us/learn/certifications/courses/dp-201t01
Exam Objectives
Design Azure Data Storage Solutions (40-45%)
- Recommend an Azure data storage solution based on requirements
- Choose the correct data storage solution to meet the technical and business requirements
https://docs.microsoft.com/en-us/azure/architecture/guide/technology-choices/data-store-overview
https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/data-storage
- Choose the partition distribution type
https://docs.microsoft.com/en-us/azure/architecture/best-practices/data-partitioning-strategies
https://docs.microsoft.com/en-us/azure/architecture/best-practices/data-partitioning
- Design non-relational cloud data stores
- Design data distribution and partitions
https://docs.microsoft.com/en-us/azure/cosmos-db/partition-data
https://docs.microsoft.com/en-us/azure/cosmos-db/global-dist-under-the-hood
- Design for scale (including multi-region, latency, and throughput)
https://docs.microsoft.com/en-us/azure/cosmos-db/set-throughput
https://docs.microsoft.com/en-us/azure/cosmos-db/partition-data
- Design a solution that uses Cosmos DB, Data Lake Storage Gen2, or Blob storage
https://docs.microsoft.com/en-us/azure/cosmos-db/social-media-apps
https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blob-static-website-how-to
- Select the appropriate Cosmos DB API
https://docs.microsoft.com/en-us/learn/modules/choose-api-for-cosmos-db/
- Design a disaster recovery strategy
https://docs.microsoft.com/en-us/azure/cosmos-db/how-to-backup-and-restore
- Design for high availability
https://docs.microsoft.com/en-us/azure/cosmos-db/high-availability
- Design relational cloud data stores
- Design data distribution and partitions
https://docs.microsoft.com/en-us/azure/sql-data-warehouse/sql-data-warehouse-tables-distribute
- Design for scale (including multi-region, latency, and throughput)
https://docs.microsoft.com/en-us/azure/cosmos-db/scaling-throughput
https://docs.microsoft.com/en-us/azure/cosmos-db/optimize-cost-throughput
https://docs.microsoft.com/bs-latn-ba/azure/cosmos-db/consistency-levels-tradeoffs
- Design a solution that uses SQL Database and Azure Synapse Analytics (SQL Data Warehouse)
- Design a disaster recovery strategy
https://docs.microsoft.com/en-us/azure/azure-sql/database/active-geo-replication-overview
https://docs.microsoft.com/en-us/azure/azure-sql/database/disaster-recovery-guidance
https://docs.microsoft.com/en-us/azure/azure-sql/virtual-machines/windows/business-continuity-high-availability-disaster-recovery-hadr-overview
- Design for high availability
https://docs.microsoft.com/en-us/azure/azure-sql/database/high-availability-sla
Design Data Processing Solutions (25-30%)
- Design batch processing solutions
- Design batch processing solutions that use Data Factory and Azure Databricks
https://www.youtube.com/watch?v=0PjSYaV85t0
- Identify the optimal data ingestion method for a batch processing solution
https://docs.microsoft.com/en-us/azure/architecture/data-guide/technology-choices/batch-processing#technology-choices-for-batch-processing
https://docs.microsoft.com/en-us/azure/data-explorer/ingest-data-overview#choosing-the-most-appropriate-ingestion-method
- Identify where processing should take place, such as at the source, at the destination, or in transit
- Identify transformation logic to be used in the Mapping Data Flow in Azure Data Factory
- Design real-time processing solutions
- Design for real-time processing by using Stream Analytics and Azure Databricks
https://docs.microsoft.com/en-us/azure/azure-databricks/databricks-stream-from-eventhubs
https://docs.microsoft.com/en-us/azure/architecture/reference-architectures/data/stream-processing-databricks
- Design and provision compute resources
https://docs.microsoft.com/en-us/azure/batch/batch-pool-vm-sizes
Design for Data Security and Compliance (25-30%)
- Design security for source data access
- Plan for secure endpoints (private/public)
https://docs.microsoft.com/en-us/azure/storage/common/storage-private-endpoints
- Choose the appropriate authentication mechanism, such as access keys, shared access signatures (SAS), and Azure Active Directory (Azure AD)
https://docs.microsoft.com/en-us/azure/storage/common/storage-auth-aad
https://docs.microsoft.com/en-us/azure/storage/common/storage-sas-overview
https://docs.microsoft.com/en-us/azure/storage/common/storage-account-manage#access-keys
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-control-access
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-manage-logins
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-aad-authentication
- Design security for data policies and standards
- Design data encryption for data at rest and in transit
https://docs.microsoft.com/en-us/azure/security/fundamentals/encryption-overview#encryption-of-data-at-rest
https://docs.microsoft.com/en-us/azure/security/fundamentals/encryption-overview#encryption-of-data-in-transit
- Design for data auditing and data masking
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-auditing
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-dynamic-data-masking-get-started
- Design for data privacy and data classification
https://docs.microsoft.com/en-us/azure/security/fundamentals/protection-customer-data
https://docs.microsoft.com/en-us/azure/information-protection/deploy-aip-scanner
- Design a data retention policy
https://docs.microsoft.com/en-us/azure/sql-database/sql-database-long-term-backup-retention-configure
- Plan an archiving strategy
https://azure.microsoft.com/en-us/solutions/architecture/backup-archive-on-premises/
https://azure.microsoft.com/en-us/blog/announcing-the-public-preview-of-azure-archive-blob-storage-and-blob-level-tiering/
https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blob-storage-tiers#archive-access-tier
- Plan to purge data based on business requirements
https://docs.microsoft.com/en-us/azure/storage/blobs/storage-lifecycle-management-concepts?tabs=azure-portal