Databricks Certified Generative AI Engineer Associate
Validates the ability to design, build, and deploy generative AI solutions on Databricks: implementing RAG applications with Vector Search, deploying and managing foundation models through Mosaic AI Model Serving, preparing and chunking data for LLM applications, applying AI governance practices, and evaluating GenAI application quality. The exam consists of 45 scored multiple-choice questions over 90 minutes.
Exam domains
- Application Development (30%)
Build RAG pipelines and agentic applications with the Mosaic AI Agent Framework, LangChain/LlamaIndex, and PyFunc/ChatModel flavors logged to MLflow, wiring retrievers backed by Mosaic AI Vector Search to chat models served via Foundation Model APIs or Mosaic AI Model Serving. Implement prompt engineering (system prompts, few-shot, structured outputs), tool calling, guardrails, and chain-of-thought reasoning, and instrument the application with MLflow Tracing to capture the inputs, outputs, retrievals, and intermediate steps of every span.
- Assembling and Deploying Apps (22%)
Package GenAI applications as MLflow models, register them in Unity Catalog (three-level catalog.schema.model), and deploy with Mosaic AI Model Serving and Agent Serving endpoints using CPU/GPU compute, scale-to-zero, and provisioned throughput for production SLAs. Front endpoints with Mosaic AI Gateway for unified routing across Databricks-hosted and external LLMs (OpenAI, Anthropic), enable payload logging to inference tables, and orchestrate batch and streaming GenAI workloads with Lakeflow Jobs and Databricks Asset Bundles for CI/CD.
- Data Preparation (14%)
Prepare unstructured and structured data for GenAI workloads using Delta Lake on the lakehouse, including parsing PDFs and HTML, choosing a chunking strategy (fixed-size, recursive, semantic), and enriching chunks with metadata to improve retrieval quality. Compute embeddings with Foundation Model APIs or external embedding endpoints and index them into Mosaic AI Vector Search (Delta Sync or Direct Access indexes) under Unity Catalog, applying schema design and chunk-level filters that support hybrid keyword + vector search.
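
As an illustration of the fixed-size chunking strategy named under Data Preparation, the following is a minimal sketch in plain Python. It is not a Databricks or Mosaic AI API: the `Chunk` record, the `chunk_fixed_size` function, and the character-based `size` and `overlap` parameters are all hypothetical names chosen for this example. Each chunk carries simple metadata (source name and start offset) of the kind that could later back chunk-level filters.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    """Hypothetical chunk record: text plus metadata usable for filtering."""
    chunk_id: str
    source: str
    start: int
    text: str


def chunk_fixed_size(text: str, source: str, size: int = 200, overlap: int = 50) -> List[Chunk]:
    """Split text into windows of `size` characters that overlap by `overlap`."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap  # how far the window advances each time
    chunks: List[Chunk] = []
    for index, start in enumerate(range(0, len(text), step)):
        chunks.append(
            Chunk(
                chunk_id=f"{source}#{index}",
                source=source,
                start=start,
                text=text[start:start + size],
            )
        )
        # Stop once the current window already reaches the end of the text,
        # so no trailing chunk made only of overlap is emitted.
        if start + size >= len(text):
            break
    return chunks


if __name__ == "__main__":
    for chunk in chunk_fixed_size("abcdefghij", source="doc", size=4, overlap=2):
        print(chunk.chunk_id, chunk.start, chunk.text)
```

Character windows keep the sketch short; in practice the window is often measured in tokens, and recursive or semantic strategies split on document structure or meaning rather than at fixed offsets.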
Sources
Questions are grounded in 100 references from official and authoritative materials.