Version: 3.0.0-alpha (Diátaxis)

Concepts — GPU

Architecture

Hikube allows attaching NVIDIA GPUs directly to virtual machines and Kubernetes clusters. GPU allocation is managed by the NVIDIA GPU Operator on the Kubernetes side, and by PCI passthrough on the virtual machine side (KubeVirt).


Terminology

| Term | Description |
| --- | --- |
| GPU Operator | NVIDIA GPU Operator — automatically manages drivers, the device plugin, and the GPU runtime on Kubernetes nodes. |
| Device Plugin | Kubernetes plugin that exposes GPUs as schedulable resources (nvidia.com/<model>). |
| PCI Passthrough | Technique that assigns a physical GPU directly to a VM, providing native performance. |
| CUDA | NVIDIA parallel computing platform, used for GPU acceleration (ML, HPC, rendering). |
| Instance Type | CPU/RAM resource profile for the VM. Sized based on the number of GPUs (8–16 vCPU per GPU recommended). |

Available GPU types

| GPU | Architecture | Memory | Performance (INT8) | Use case |
| --- | --- | --- | --- | --- |
| L40S | Ada Lovelace | 48 GB GDDR6 | 362 TOPS | Inference, development, prototyping |
| A100 | Ampere | 80 GB HBM2e | 312 TOPS | ML training, fine-tuning |
| H100 | Hopper | 80 GB HBM3 | 1979 TOPS | LLM, exascale computing, distributed training |

GPU identifiers in manifests

| GPU | gpus[].name / nvidia.com/ value |
| --- | --- |
| L40S | nvidia.com/AD102GL_L40S |
| A100 | nvidia.com/GA100_A100_PCIE_80GB |
| H100 | nvidia.com/H100_94GB |

GPU on virtual machines

GPUs are attached to VMs via PCI passthrough:

  • The physical GPU is dedicated to the VM (native performance)
  • Declared in spec.gpus[] of the VMInstance manifest
  • Multi-GPU is possible (repeat entries in gpus[])
  • NVIDIA drivers must be installed inside the VM

Recommended CPU/GPU ratio

Plan for 8 to 16 vCPU per GPU. For a single GPU, a u1.2xlarge (8 vCPU, 32 GB RAM) is a good starting point.
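Putting the points above together, a VMInstance with a single L40S might look like the following sketch. The gpus[] field, the nvidia.com/AD102GL_L40S identifier, and the u1.2xlarge sizing come from this page; the apiVersion, kind spelling, and other surrounding fields are illustrative assumptions, not a verbatim schema.

```yaml
# Sketch of a VMInstance with one L40S attached via PCI passthrough.
# apiVersion and field layout are assumptions; gpus[].name and the
# instance type come from the tables on this page.
apiVersion: apps.cozystack.io/v1alpha1   # assumed API group
kind: VMInstance
metadata:
  name: gpu-workstation
spec:
  instanceType: u1.2xlarge          # 8 vCPU / 32 GB RAM — fits the 8–16 vCPU per GPU guideline
  gpus:
    - name: nvidia.com/AD102GL_L40S
    # For multi-GPU, repeat entries:
    # - name: nvidia.com/AD102GL_L40S
```

Remember that the NVIDIA drivers still have to be installed inside the guest OS; passthrough only hands the PCI device to the VM.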


GPU on Kubernetes

GPUs are exposed to pods via the NVIDIA Device Plugin:

  • The GPU Operator must be enabled on the cluster (plugins.gpu-operator.enabled: true)
  • Pods request a GPU via resources.limits (e.g., nvidia.com/AD102GL_L40S: 1)
  • The Kubernetes scheduler places the pod on a node that has the requested GPU
  • GPU nodes are configured in node groups with the gpus[] field
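As a sketch of the pod side, a container requests a GPU through resources.limits using the identifier from the table above; this is the standard Kubernetes Pod schema, and the container image shown is just an example.

```yaml
# Pod requesting one L40S; the scheduler places it on a node that
# exposes this resource through the NVIDIA Device Plugin.
apiVersion: v1
kind: Pod
metadata:
  name: cuda-inference
spec:
  containers:
    - name: app
      image: nvcr.io/nvidia/cuda:12.4.1-runtime-ubuntu22.04  # example CUDA image
      resources:
        limits:
          nvidia.com/AD102GL_L40S: 1   # GPU resource name from the identifier table
```

Note that GPU resources can only appear under limits (Kubernetes does not allow overcommitting extended resources), and the GPU Operator must already be enabled on the cluster for the resource to exist.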

VM vs Kubernetes comparison

| Criterion | GPU on VM | GPU on Kubernetes |
| --- | --- | --- |
| Isolation | Dedicated GPU (passthrough) | GPU shared via device plugin |
| Performance | Native performance | Native performance |
| Flexibility | Full OS, manual drivers | Containers, automatic scaling |
| Multi-GPU | Via spec.gpus[] | Via resources.limits |
| Use case | Workstations, interactive environments | ML pipelines, large-scale inference |

Limits and quotas

| Parameter | Value |
| --- | --- |
| GPU per VM | Multiple (depending on availability) |
| GPU per Kubernetes pod | Multiple (via resources.limits) |
| GPU types | L40S, A100, H100 |
| Max GPU memory | 80 GB (A100/H100) |

Further reading