Databricks Certified Machine Learning Professional
Validates the ability to design, implement, and manage enterprise-scale machine learning solutions on the Databricks platform. The exam covers model development with SparkML and distributed training; MLOps practices, including model lifecycle management, validation testing, environment architectures with Databricks Asset Bundles, automated retraining workflows, and drift detection with Lakehouse Monitoring; and model deployment strategies with Model Serving. The exam consists of 60 multiple-choice questions over 120 minutes.
Exam domains
- Model Development (47%)
Build scalable ML models on Databricks using SparkML pipelines, distributed training with TorchDistributor, DeepSpeed, and Ray, and hyperparameter tuning with Hyperopt and Optuna. Engineer features via the Databricks Feature Store with offline tables and online stores, and track experiments, signatures, and model versions through advanced MLflow patterns.
- MLOps (43%)
Operate production ML on Databricks with comprehensive unit and integration testing, environment promotion via Databricks Asset Bundles, and automated retraining workflows on Lakeflow Jobs. Detect drift in production by monitoring inference tables, feature drift, and model quality with Lakehouse Monitoring, and remediate it using MLflow Model Registry stages, webhooks, and aliases.
- Model Deployment (10%)
Deploy registered MLflow models for real-time inference on Databricks Model Serving endpoints, as well as for batch and Structured Streaming inference, including custom pyfunc models and GPU-backed serving. Manage rollout strategies such as A/B testing, canary releases, and shadow deployments while tracking served-model versions and traffic splits across endpoints.
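As a rough illustration of the traffic-split idea behind canary releases in the Model Deployment domain, the sketch below routes requests between two served-model versions by percentage. It is a minimal pure-Python sketch, not the Databricks Model Serving API or its internal routing: the function `route_request` and the names `champion` and `challenger` are hypothetical, and the stable-hash bucketing is an assumption chosen only to make the split deterministic.

```python
import hashlib


def route_request(request_id: str, traffic_split: dict[str, int]) -> str:
    """Pick a served-model name for a request.

    traffic_split maps a (hypothetical) served-model name to an integer
    percentage; the percentages must sum to 100.
    """
    if sum(traffic_split.values()) != 100:
        raise ValueError("traffic percentages must sum to 100")

    # Hash the request id into a stable bucket in [0, 100), so the same
    # request id is always routed to the same served model.
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100

    # Walk the cumulative percentage ranges until the bucket falls inside one.
    cumulative = 0
    for served_model, percent in traffic_split.items():
        cumulative += percent
        if bucket < cumulative:
            return served_model
    raise RuntimeError("unreachable when percentages sum to 100")


# Canary release: send roughly 10% of traffic to the new version.
canary_split = {"champion": 90, "challenger": 10}
routed = [route_request(f"req-{i}", canary_split) for i in range(10_000)]
challenger_share = routed.count("challenger") / len(routed)
print(f"challenger share: {challenger_share:.3f}")
```

Raising the challenger percentage in steps (for example 10, 50, 100) while watching its quality metrics is the usual shape of a canary rollout; an A/B test keeps the split fixed instead and compares outcomes between the two groups.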
Sources
Questions are grounded in 50 references from official and authoritative materials.