NVIDIA Certified Professional - AI Operations
Validates the ability to monitor, troubleshoot, and optimize AI infrastructure operations using NVIDIA tools. Covered skills include installing and deploying NVIDIA software stacks with Base Command Manager, administering Slurm and Kubernetes GPU clusters (user access and resource management), scheduling AI training and inference jobs, and troubleshooting and optimizing GPU performance and connectivity. The exam covers four domains: Installation and Deployment (31%), Administration (23%), Workload Management (23%), and Troubleshooting and Optimization (23%). Format: 70-75 multiple-choice questions, 120 minutes, proctored online.
Exam domains
- Installation and Deployment (31%)
Provision DGX/HGX clusters with Base Command Manager and Mission Control: bring up head and compute nodes, configure cluster networking (in-band, IPMI, RDMA), deploy Slurm, Kubernetes with the NVIDIA GPU Operator (driver, container toolkit, device plugin, DCGM exporter, MIG manager), Run:AI, and DOCA Services, and apply firmware, driver, and software-image updates across node categories.
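As a rough illustration of the GPU Operator step above, the sketch below assembles a `helm install` command line that toggles the listed components (driver, container toolkit, device plugin, DCGM exporter, MIG manager). The release name, namespace, and the helper `gpu_operator_install_args` are assumptions made for this example. It also assumes the `nvidia` Helm repository has already been added. The function only builds the argument list and does not run Helm.

```python
# Minimal sketch: build (but do not execute) a helm command for the GPU Operator.
# Assumption: the "nvidia" Helm repo is already configured on the machine.
# The helper name, release name, and namespace are hypothetical choices.

# Chart value keys for the components named in the domain description.
COMPONENT_KEYS = ("driver", "toolkit", "devicePlugin", "dcgmExporter", "migManager")


def gpu_operator_install_args(release="gpu-operator", namespace="gpu-operator",
                              mig_strategy="single", components=None):
    """Return an argv list for `helm install` with per-component toggles."""
    if mig_strategy not in ("none", "single", "mixed"):
        raise ValueError(f"unknown MIG strategy: {mig_strategy}")

    # Everything is enabled unless the caller overrides it.
    enabled = {key: True for key in COMPONENT_KEYS}
    for key, value in (components or {}).items():
        if key not in enabled:
            raise ValueError(f"unknown component: {key}")
        enabled[key] = bool(value)

    args = ["helm", "install", release, "nvidia/gpu-operator",
            "--namespace", namespace, "--create-namespace", "--wait"]
    for key in COMPONENT_KEYS:
        args += ["--set", f"{key}.enabled={str(enabled[key]).lower()}"]
    args += ["--set", f"mig.strategy={mig_strategy}"]
    return args


if __name__ == "__main__":
    # Example: nodes whose image already ships the NVIDIA driver.
    print(" ".join(gpu_operator_install_args(components={"driver": False})))
```

Disabling the driver component, as in the usage line, is a common choice when the node image already ships a preinstalled driver. Whether that applies depends on how the software images are built.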
- Administration (23%)
Run day-2 cluster operations across Slurm partitions, Kubernetes nodes, Run:AI projects/departments, and Base Command Manager: manage users and RBAC, configure MIG geometries via the GPU Operator MIG manager (single vs mixed strategy, all-1g.10gb/all-balanced profiles), enable GPU time-slicing or MPS, and operate the AI data center reference architecture end-to-end.
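The MIG manager workflow mentioned above can be sketched as follows: a node is labeled with `nvidia.com/mig.config=<profile>`, and the resource name that pods request depends on the MIG strategy. The helper names and the node name `dgx-01` are hypothetical. The code only constructs the `kubectl` arguments and does not contact a cluster.

```python
# Minimal sketch of the MIG manager workflow, assuming the GPU Operator's
# MIG manager is watching the nvidia.com/mig.config node label.
# Function names and the node name are illustrative, not an NVIDIA API.


def mig_label_command(node, profile):
    """Return a kubectl argv that asks the MIG manager to apply a profile."""
    return ["kubectl", "label", "node", node,
            f"nvidia.com/mig.config={profile}", "--overwrite"]


def mig_resource_name(strategy, slice_profile):
    """Return the extended resource a pod requests under each MIG strategy.

    single: all GPUs on the node share one geometry, and slices are
            advertised under the ordinary nvidia.com/gpu resource.
    mixed:  each slice type is advertised separately, for example
            nvidia.com/mig-1g.10gb.
    """
    if strategy == "single":
        return "nvidia.com/gpu"
    if strategy == "mixed":
        return f"nvidia.com/mig-{slice_profile}"
    raise ValueError(f"unknown MIG strategy: {strategy}")


if __name__ == "__main__":
    print(" ".join(mig_label_command("dgx-01", "all-1g.10gb")))
    print(mig_resource_name("single", "1g.10gb"))
    print(mig_resource_name("mixed", "1g.10gb"))
```

The `single` strategy keeps pod specs unchanged, at the cost of requiring one geometry per node. The `mixed` strategy allows profiles such as `all-balanced`, but workloads must name the slice type they want.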
- Workload Management (23%)
Deploy training and inference workloads across Slurm (sbatch, srun, GRES requests such as --gres=gpu:<count>), Kubernetes (nvidia.com/gpu resource limits, MIG slice scheduling, nodeSelectors), and Run:AI projects with quotas and fair-share scheduling; pull and run containerized stacks from NGC, allocate GPUs across teams, and use job statistics and queue tooling to balance throughput against utilization.
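To make the two GPU request styles above concrete, the sketch below generates a Slurm batch script that requests GPUs through GRES, and a Kubernetes Pod manifest that requests them through the `nvidia.com/gpu` resource limit. The helper names, job and partition names, the image tag placeholder, and the node label value are all assumptions for illustration. Nothing is submitted to a scheduler.

```python
import json

# Minimal sketch: the same "N GPUs for one job" request expressed for Slurm
# and for Kubernetes. Helper names and all example values are hypothetical.


def sbatch_script(job_name, partition, gpus, command):
    """Render a batch script that requests GPUs per node via --gres."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --partition={partition}",
        f"#SBATCH --gres=gpu:{gpus}",
        f"srun {command}",
    ]
    return "\n".join(lines) + "\n"


def gpu_pod_manifest(name, image, gpus=1, node_selector=None):
    """Build a Pod manifest that requests GPUs as an extended resource."""
    pod = {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [{
                "name": name,
                "image": image,
                # GPUs are requested under limits and cannot be overcommitted.
                "resources": {"limits": {"nvidia.com/gpu": gpus}},
            }],
        },
    }
    if node_selector:
        pod["spec"]["nodeSelector"] = dict(node_selector)
    return pod


if __name__ == "__main__":
    print(sbatch_script("train", "gpu", 8, "python train.py"))
    manifest = gpu_pod_manifest(
        "train",
        "nvcr.io/nvidia/pytorch:<tag>",  # placeholder tag, pick one from NGC
        gpus=1,
        node_selector={"nvidia.com/gpu.product": "<product-label>"},  # placeholder
    )
    # kubectl accepts JSON as well as YAML, so json.dumps output can be applied.
    print(json.dumps(manifest, indent=2))
```

Under the mixed MIG strategy, the resource key in `limits` would name a slice type (for example `nvidia.com/mig-1g.10gb`) instead of `nvidia.com/gpu`.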
- Troubleshooting and Optimization (23%)
Sources
Questions are grounded in 50 references from official and authoritative materials.