Version: 2.0.2

GPUs on Hikube

Hikube provides access to NVIDIA accelerators through GPU passthrough, enabling workloads that require hardware acceleration. GPUs are available to two types of workloads: virtual machines and Kubernetes pods.


🎯 Usage Types

GPU with Virtual Machines

GPUs can be directly attached to virtual machines via VFIO-PCI GPU passthrough, providing complete and exclusive access to the accelerator.

Use cases:

  • Applications requiring complete GPU control
  • Legacy or specialized workloads
  • Isolated development environments
  • Graphics applications (rendering, CAD)

GPU with Kubernetes

GPUs can be allocated to Kubernetes workers and then assigned to pods via resource requests/limits.

Use cases:

  • Containerized AI/ML workloads
  • Automatic scaling of GPU applications
  • GPU resource sharing between applications
  • Complex orchestration of parallel jobs
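
Pod-level allocation relies on Kubernetes extended resources. A minimal sketch of the relevant container fields, assuming the standard NVIDIA device plugin resource name `nvidia.com/gpu` (for extended resources, Kubernetes requires requests and limits to be equal, so declaring the limit alone is sufficient):

```yaml
# Container-level resource stanza requesting one GPU.
# nvidia.com/gpu is the resource name exposed by the NVIDIA device plugin.
resources:
  limits:
    nvidia.com/gpu: 1   # the request defaults to the same value
```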

🖥️ Available Hardware

Hikube offers three types of NVIDIA GPUs:

NVIDIA L40S

  • Architecture: Ada Lovelace
  • Memory: 48 GB GDDR6 with ECC
  • Performance: 362 TOPS (INT8), 91.6 TFLOPS (FP32)
  • Typical usage: Generative AI, inference, real-time rendering

NVIDIA A100

  • Architecture: Ampere
  • Memory: 80 GB HBM2e with ECC
  • Performance: 624 TOPS (INT8), 312 TFLOPS (Tensor)
  • Typical usage: ML training, high-performance computing

NVIDIA H100

  • Architecture: Hopper
  • Memory: 80 GB HBM3 with ECC
  • Performance: 1979 TOPS (INT8), 989 TFLOPS (Tensor)
  • Typical usage: LLMs, transformers, exascale computing

πŸ—οΈ Architecture​

GPU Allocation with VMs​

GPU Allocation with Kubernetes​


βš™οΈ Configuration​

GPU on VM​

apiVersion: apps.cozystack.io/v1alpha1
kind: VirtualMachine
spec:
  instanceType: "u1.xlarge"
  gpus:
    - name: "nvidia.com/AD102GL_L40S"
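
In principle, more than one device can be listed to attach several GPUs to a single VM; whether this works in practice depends on the instance type and on host capacity, so treat the following as an illustrative sketch only:

```yaml
# Sketch: attaching two GPUs to one VM (subject to flavor/host limits).
gpus:
  - name: "nvidia.com/AD102GL_L40S"
  - name: "nvidia.com/AD102GL_L40S"
```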

GPU on Kubernetes Worker

apiVersion: apps.cozystack.io/v1alpha1
kind: Kubernetes
spec:
  nodeGroups:
    gpu-workers:
      instanceType: "u1.xlarge"
      gpus:
        - name: "nvidia.com/AD102GL_L40S"
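
Node groups with and without GPUs can coexist in the same cluster, so GPU capacity can be provisioned separately from general-purpose workers. A sketch reusing only the fields shown above (group names are illustrative):

```yaml
# Sketch: one CPU-only node group plus one GPU node group.
apiVersion: apps.cozystack.io/v1alpha1
kind: Kubernetes
spec:
  nodeGroups:
    workers:                     # CPU-only group
      instanceType: "u1.xlarge"
    gpu-workers:                 # each worker gets an L40S via passthrough
      instanceType: "u1.xlarge"
      gpus:
        - name: "nvidia.com/AD102GL_L40S"
```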

GPU in Kubernetes Pod

apiVersion: v1
kind: Pod
spec:
  containers:
    - name: gpu-app
      image: nvidia/cuda:12.0-runtime-ubuntu20.04
      resources:
        limits:
          nvidia.com/gpu: 1
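
Because the device plugin handles scheduling, the same container can also run under any standard workload controller rather than a bare Pod. A sketch using a plain Kubernetes Deployment (names are illustrative):

```yaml
# Sketch: the same GPU container managed by a Deployment.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-app
spec:
  replicas: 1                   # scale horizontally across available GPUs
  selector:
    matchLabels:
      app: gpu-app
  template:
    metadata:
      labels:
        app: gpu-app
    spec:
      containers:
        - name: gpu-app
          image: nvidia/cuda:12.0-runtime-ubuntu20.04
          resources:
            limits:
              nvidia.com/gpu: 1   # one GPU per replica
```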

📋 Approach Comparison

| Aspect      | GPU on VM                | GPU on Kubernetes       |
| ----------- | ------------------------ | ----------------------- |
| Isolation   | Complete (1 GPU = 1 VM)  | Shared (orchestrated)   |
| Performance | Native (passthrough)     | Native (device plugin)  |
| Management  | Manual                   | Automated               |
| Scaling     | Vertical only            | Horizontal + vertical   |
| Sharing     | No                       | Yes (between pods)      |
| Complexity  | Simple                   | Complex                 |

🚀 Next Steps

For Virtual Machines

For Kubernetes