Like many people, I kicked off this year with a long list of resolutions. My list includes items like learning how to do a backflip, but that’s a story for another time! One of my top three goals, and one that has come up regularly over the last couple of years, was to make an active effort to learn cloud computing. Luckily, my current employer has chosen Microsoft Azure as its cloud platform, which allowed me to focus on it.

Having just completed my machine learning course with Georgia Tech (in May 2020) and being more fluent in Python, I found the Azure DP-100 exam to be the obvious choice.

What is the DP-100 Exam?

The Azure DP-100 Exam: Designing and Implementing a Data Science Solution on Azure is for data geeks who use machine learning techniques to implement a data science solution using Azure Machine Learning.

The DP-100 Exam is designed to evaluate candidates for the following tasks:

  • Create, define and manage an Azure Machine Learning workspace.
  • Create and run experiments that log metrics and train machine learning models.
  • Create and manage datastores and datasets, and use them in machine learning experiments.
  • Create and manage compute resources, and use them to run machine learning experiments at scale in the cloud.
  • Deploy predictive models as real-time or batch inference services, and consume them from client applications.
  • Perform hyperparameter tuning and use automated machine learning to train models.
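To give a flavour of the "create and run experiments that log metrics" task in the list above, here is a minimal sketch using the Azure Machine Learning Python SDK (v1, the azureml-core package). It assumes a workspace config.json has already been downloaded next to the script; the experiment name and the metric value are made up for illustration, so treat it as a starting point rather than exam material.

```python
def log_demo_metric(experiment_name: str = "dp100-demo") -> None:
    """Start an interactive run, log one metric, and mark the run complete.

    The experiment name is a hypothetical placeholder.
    """
    # Imported lazily so this file still loads on a machine without the SDK.
    from azureml.core import Experiment, Workspace

    # Assumption: a config.json for your workspace sits in the working directory.
    ws = Workspace.from_config()

    experiment = Experiment(workspace=ws, name=experiment_name)
    run = experiment.start_logging()

    # Illustrative value only; in practice this comes from model evaluation.
    run.log("accuracy", 0.91)
    run.complete()


if __name__ == "__main__":
    log_demo_metric()
```

Running it requires an Azure subscription and an existing workspace; the logged metric then shows up under the experiment in Azure Machine Learning studio.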

Here is a diagram displaying the learning outcome visually.

For more details on the scope of the exam, click here.

Exam Prerequisites

Quick Installation

It is recommended you create a Python virtual environment (Miniconda preferred but virtualenv works too) and install the SDK in it.

pip install --upgrade azureml-sdk[explain,automl,notebooks]

Install the Azure Machine Learning SDK
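Once the install finishes, a quick sanity check from Python can confirm that the SDK is visible in the active environment. The snippet below is a small sketch that uses only the standard library; the helper name sdk_installed is my own, not something the SDK provides.

```python
import importlib.util


def sdk_installed(module_name: str = "azureml.core") -> bool:
    """Return True if the module can be located in the current environment."""
    try:
        return importlib.util.find_spec(module_name) is not None
    except ModuleNotFoundError:
        # Raised when a parent package (here "azureml") is missing altogether.
        return False


if __name__ == "__main__":
    print("azureml.core available:", sdk_installed())
```

If this prints False, the most likely cause is that the virtual environment used for the pip install is not the one currently activated.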

Statistics/Machine Learning

Here are some key concepts to know prior to sitting the exam:

  • Supervised Learning
    1. Regression methods: Linear Regression, Decision Trees, Random Forest, Deep Learning, etc.
    2. Classification methods: Logistic Regression, Linear Discriminant Analysis, Support Vector Machine, Multi-classification, etc.
  • Unsupervised Learning methods: Principal Component Analysis, \(K\)-means Clustering, etc.
  • Data Wrangling: Imputation, Imbalanced Classes, etc.
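As a refresher on the data wrangling bullet, here is a small pure-Python sketch of two common ideas: mean imputation for missing values, and inverse-frequency class weights for imbalanced labels. The function names are my own, and the weighting formula, n_samples / (n_classes * class_count), is the same "balanced" heuristic that scikit-learn uses; real projects would normally reach for a library rather than hand-rolling this.

```python
from collections import Counter
from statistics import mean


def impute_mean(values):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


if __name__ == "__main__":
    print(impute_mean([1.0, None, 3.0]))
    print(balanced_class_weights(["a", "a", "a", "b"]))
```

Rare classes get larger weights, so a model that supports per-class weights pays more attention to them during training.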

Programming Languages

The bulk of the questions focus on Python programming. In other words, as a candidate sitting the exam, you must be comfortable reading Python code without any trace of sweat on your forehead (to my R-fellows, you know who you are!). Some R code might also be thrown at you in the exam.

Either way, be comfortable reading and writing code in both languages.

Exam Preparation

There aren’t that many centralized resources available to prepare for this exam. However, here are the ones that I used during my preparation time:

Documentation

Projects using Azure Machine Learning

Visit the following repos to see projects contributed by Azure Machine Learning users:

I hope you found this blogpost useful. Please drop a comment/remark below :-)