NVIDIA Certified Associate - Generative AI Multimodal
Validates foundational skills needed to design, implement, and manage AI systems that synthesize and interpret data across text, image, and audio modalities using NVIDIA platforms. Covers experimentation with multimodal models, core machine learning and AI knowledge including transformer and diffusion architectures, multimodal data processing across text/image/audio/video, software development for multimodal applications, data analysis and visualization, performance optimization for inference and training, and trustworthy AI practices. The exam covers seven domains: Experimentation (25%), Core Machine Learning and AI Knowledge (20%), Multimodal Data (15%), Software Development (15%), Data Analysis and Visualization (10%), Performance Optimization (10%), and Trustworthy AI (5%). Format: 50-60 multiple-choice questions, 60 minutes, proctored online.
Exam domains
- Experimentation25%
Iterative model development with NeMo Framework: prompt engineering for VLMs (Cosmos Nemotron, Llama 3.2 Vision, NeVA), in-context learning with multi-image references, structured JSON outputs, and tracking experiments via NeMo Curator and evaluation harnesses.
- Core Machine Learning and AI Knowledge20%
Foundations of generative AI across modalities: transformer and diffusion architectures, vision-language models with discrete latent codes, CUDA/cuDNN, Tensor Core acceleration, and the training-vs-inference lifecycle on NVIDIA NeMo and the NGC catalog.
- Multimodal Data15%
Curating image, video, audio, and text corpora with NeMo Curator (ExactDuplicates, FuzzyDuplicates MinhashLSH, FastTextLangId, PII redaction, quality heuristics) and Cosmos Tokenizers for spatiotemporal compression via 3D causal convolution and wavelet downsampling.
- Software Development15%
Building multimodal apps on NVIDIA NIM microservices with OpenAI-compatible APIs: NV-CLIP, vision/embedding/reranking NIMs, Riva ASR/TTS, Maxine Audio2Face, plus Triton Inference Server deployment on Kubernetes via the NIM Operator.
Sources
Questions are grounded in 50 references from official and authoritative materials.