NVIDIA Certified Associate - AI Infrastructure and Operations
Validates foundational knowledge of AI computing infrastructure and operations, including GPU architecture fundamentals, NVIDIA software stack, AI cluster design and scaling, data center networking with NVLink and InfiniBand, GPU monitoring with DCGM, container orchestration for AI workloads, and virtualization with Multi-Instance GPU. The exam covers three domains: Essential AI Knowledge (38%), AI Infrastructure (40%), and AI Operations (22%). Format: 50 multiple-choice questions, 60 minutes, proctored online.
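As a rough planning aid, the stated weights and format can be turned into approximate per-domain question counts and a per-question time budget. This is only a sketch: it assumes question counts track the published weights exactly, which the overview does not confirm, and the function names are illustrative.

```python
# Domain weights (percent) and exam format as stated in the overview above.
DOMAIN_WEIGHTS = {
    "Essential AI Knowledge": 38,
    "AI Infrastructure": 40,
    "AI Operations": 22,
}
TOTAL_QUESTIONS = 50
TIME_LIMIT_MINUTES = 60


def estimated_questions(weights, total):
    """Estimate questions per domain, ASSUMING counts follow the weights exactly."""
    return {name: round(pct * total / 100) for name, pct in weights.items()}


def seconds_per_question(minutes, total):
    """Average time budget per question, in seconds."""
    return minutes * 60 / total


if __name__ == "__main__":
    for name, count in estimated_questions(DOMAIN_WEIGHTS, TOTAL_QUESTIONS).items():
        print(f"{name}: ~{count} questions")
    budget = seconds_per_question(TIME_LIMIT_MINUTES, TOTAL_QUESTIONS)
    print(f"Time per question: {budget:.0f} s")
```

Under that assumption the split works out to roughly 19, 20, and 11 questions, with about 72 seconds available per question.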
Exam domains
- AI Infrastructure (40%)
NVIDIA accelerated compute building blocks: Hopper H100 GPUs with HBM3 memory, 4th-gen NVLink and 3rd-gen NVSwitch, DGX H100 8-GPU systems, ConnectX-7 InfiniBand fabrics, and reference architectures (DGX BasePOD/SuperPOD). Includes storage tiers, fabric topologies, and CPU/GPU/network sizing for AI clusters.
- Essential AI Knowledge (38%)
Foundations of AI/ML, deep learning, and generative AI on NVIDIA platforms, including CUDA, cuDNN, the NGC catalog, and Tensor Core acceleration. Covers training versus inference workflows and the role of NVIDIA AI Enterprise software in deploying production models.
- AI Operations (22%)
Day-2 operations for GPU clusters: DCGM telemetry, health checks and diagnostics, Fabric Manager for NVSwitch, GPU Operator on Kubernetes, and Multi-Instance GPU (MIG) partitioning with profiles like 1g.10gb and 3g.40gb. Covers job scheduling, multi-tenant GPU sharing, and incident response.
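MIG profile names such as 1g.10gb and 3g.40gb encode a number of compute slices and an amount of memory. The sketch below parses those names and checks whether a set of instances stays within a per-GPU budget. The budget of 7 compute slices and 80 GB is an assumption for an 80 GB GPU, the helper names are hypothetical, and the check is deliberately simplified: real MIG placement follows additional rules enforced by the driver, so passing this check does not guarantee a valid layout.

```python
import re

# ASSUMPTION: budget for a single 80 GB GPU (7 compute slices, 80 GB memory).
MAX_COMPUTE_SLICES = 7
MAX_MEMORY_GB = 80

# Matches names of the form "<compute slices>g.<memory in GB>gb".
PROFILE_RE = re.compile(r"^(\d+)g\.(\d+)gb$")


def parse_profile(name):
    """Split a profile name like '3g.40gb' into (compute_slices, memory_gb)."""
    match = PROFILE_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized MIG profile name: {name!r}")
    return int(match.group(1)), int(match.group(2))


def fits_on_gpu(profiles):
    """Return True if summed compute slices and memory stay within the budget.

    Simplified budget check only; it ignores real MIG placement constraints.
    """
    parsed = [parse_profile(p) for p in profiles]
    compute = sum(slices for slices, _ in parsed)
    memory = sum(mem for _, mem in parsed)
    return compute <= MAX_COMPUTE_SLICES and memory <= MAX_MEMORY_GB


if __name__ == "__main__":
    print(fits_on_gpu(["3g.40gb", "3g.40gb"]))             # 6 slices, 80 GB
    print(fits_on_gpu(["1g.10gb"] * 7))                    # 7 slices, 70 GB
    print(fits_on_gpu(["3g.40gb", "3g.40gb", "1g.10gb"]))  # 7 slices, 90 GB
```

In this model, two 3g.40gb instances or seven 1g.10gb instances fit on one GPU, while two 3g.40gb plus one 1g.10gb exceed the assumed memory budget.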
Sources
Questions are grounded in 50 references from official and authoritative materials.