GPU visibility is deployment policy; hardcoding it in the image reduces portability.
| Property | Value       |
| -------- | ----------- |
| Severity | Warning     |
| Category | Correctness |
| Default  | Enabled     |
| Auto-fix | Partial     |

Description

Detects ENV instructions that hardcode GPU device visibility variables (NVIDIA_VISIBLE_DEVICES, CUDA_VISIBLE_DEVICES) inside the image. GPU visibility is deployment policy that should be set at runtime via docker run --gpus, NVIDIA_VISIBLE_DEVICES in the orchestrator, or similar mechanisms — not baked into the container image.
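In practice, the runtime mechanisms mentioned above look like this (illustrative commands; `my-image` is a placeholder, and the `--runtime=nvidia` variant assumes the NVIDIA Container Toolkit is installed on the host):

```shell
# Expose all GPUs to the container at run time — no ENV baked into the image
docker run --gpus all my-image

# Restrict the container to specific devices via the --gpus flag
docker run --gpus '"device=0,1"' my-image

# Or pass visibility through the environment when using the NVIDIA runtime
docker run --runtime=nvidia -e NVIDIA_VISIBLE_DEVICES=0 my-image
```

Because the choice of devices lives in the run command, the same image runs unchanged on hosts with different GPU topologies.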

Why this matters

  • Portability — images with hardcoded device indices or UUIDs cannot run on hosts with different GPU topologies without rebuilding
  • Orchestrator conflict — Kubernetes device plugins, Slurm, and other schedulers set GPU visibility externally; image-level settings can conflict with or override orchestrator intent
  • Redundancy — official nvidia/cuda base images already set NVIDIA_VISIBLE_DEVICES=all via image labels; re-declaring it in the Dockerfile is pure noise

What is flagged

| Pattern | Flagged? | Fix safety |
| --- | --- | --- |
| `ENV NVIDIA_VISIBLE_DEVICES=all` on `nvidia/cuda:*` base | Yes (redundant) | FixSafe — safe to delete |
| `ENV NVIDIA_VISIBLE_DEVICES=0` or `=0,1` (device indices) | Yes | FixSuggestion |
| `ENV NVIDIA_VISIBLE_DEVICES=GPU-<uuid>` or `MIG-<uuid>` | Yes | FixSuggestion |
| `ENV CUDA_VISIBLE_DEVICES=<non-empty>` | Yes | FixSuggestion |
| `ENV NVIDIA_VISIBLE_DEVICES=all` on non-CUDA base | No — intentional for custom GPU images | — |
| `ENV NVIDIA_VISIBLE_DEVICES=none` / `void` / empty | No — intentional disable signal | — |
| `ENV NVIDIA_VISIBLE_DEVICES=${VAR}` (variable reference) | No — parameterized, not hardcoded | — |
| `ENV CUDA_VISIBLE_DEVICES=none` / `NoDevFiles` / empty | No — intentional disable | — |

Examples

Violation

```dockerfile
# Redundant: nvidia/cuda already sets NVIDIA_VISIBLE_DEVICES=all
FROM nvidia/cuda:12.2.0-runtime-ubuntu22.04
ENV NVIDIA_VISIBLE_DEVICES=all

# Hardcoded device indices make the image non-portable
FROM nvidia/cuda:12.2.0-runtime-ubuntu22.04
ENV NVIDIA_VISIBLE_DEVICES=0,1

# CUDA_VISIBLE_DEVICES bakes deployment policy into the image
FROM ubuntu:22.04
ENV CUDA_VISIBLE_DEVICES=0

# GPU UUIDs are host-specific
FROM ubuntu:22.04
ENV NVIDIA_VISIBLE_DEVICES=GPU-aaaa-bbbb-cccc-dddd-eeee-ffffffffffff
```

No violation

```dockerfile
# NVIDIA_VISIBLE_DEVICES=all on a non-CUDA base is intentional
FROM ubuntu:22.04
ENV NVIDIA_VISIBLE_DEVICES=all

# Disable signals are intentional
FROM nvidia/cuda:12.2.0-runtime-ubuntu22.04
ENV NVIDIA_VISIBLE_DEVICES=none

# Variable references are parameterized, not hardcoded
FROM nvidia/cuda:12.2.0-runtime-ubuntu22.04
ARG GPU_DEVICES=all
ENV NVIDIA_VISIBLE_DEVICES=${GPU_DEVICES}
```

Auto-fix behavior

The rule offers two fix safety levels:
  • FixSafe (applied with --fix): removes the redundant NVIDIA_VISIBLE_DEVICES=all on nvidia/cuda base images. This is 100% behavior-preserving because the base image already sets this value.
  • FixSuggestion (applied with --fix --fix-unsafe): removes hardcoded device indices, UUIDs, or CUDA_VISIBLE_DEVICES values. This improves portability but changes deployment semantics — the user must ensure GPU visibility is provided at runtime.
For multi-key ENV instructions, only the flagged key is removed; other keys are preserved.

Configuration

This rule has no rule-specific options. Its severity can be adjusted like any other rule:

```toml
[rules.tally."gpu/no-hardcoded-visible-devices"]
severity = "warning"
```
