GPUs on Hikube

Hikube provides access to NVIDIA accelerators via GPU Passthrough, enabling the execution of workloads requiring hardware acceleration. GPUs are available for two types of workloads: virtual machines and Kubernetes pods.


🎯 Usage Types

GPU with Virtual Machines

GPUs can be directly attached to virtual machines via VFIO-PCI GPU passthrough, providing complete and exclusive access to the accelerator.

Use cases:

  • Applications requiring complete GPU control
  • Legacy or specialized workloads
  • Isolated development environments
  • Graphics applications (rendering, CAD)

GPU with Kubernetes

GPUs can be allocated to Kubernetes workers and then assigned to pods via resource requests/limits.

Use cases:

  • Containerized AI/ML workloads
  • Automatic scaling of GPU applications
  • GPU resource sharing between applications
  • Complex orchestration of parallel jobs

🖥️ Available Hardware

Hikube offers three types of NVIDIA GPUs:

NVIDIA L40S

  • Architecture: Ada Lovelace
  • Memory: 48 GB GDDR6 with ECC
  • Performance: 362 TOPS (INT8), 91.6 TFLOPS (FP32)
  • Typical usage: generative AI, inference, real-time rendering

NVIDIA A100

  • Architecture: Ampere
  • Memory: 80 GB HBM2e with ECC
  • Performance: 624 TOPS (INT8), 312 TFLOPS (FP16 Tensor)
  • Typical usage: ML training, high-performance computing

NVIDIA H100

  • Architecture: Hopper
  • Memory: 80 GB HBM3 with ECC
  • Performance: 1979 TOPS (INT8), 989 TFLOPS (FP16 Tensor)
  • Typical usage: LLMs, transformers, exascale computing

πŸ—οΈ Architecture​

GPU Allocation with VMs​

GPU Allocation with Kubernetes​


βš™οΈ Configuration​

GPU on VM

```yaml
apiVersion: apps.cozystack.io/v1alpha1
kind: VirtualMachine
spec:
  instanceType: "u1.xlarge"
  gpus:
    - name: "nvidia.com/AD102GL_L40S"
```
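Since `gpus` is a YAML list, more than one accelerator could plausibly be attached to the same VM. A hedged sketch, assuming the platform allows multiple passthrough devices per VM and the chosen instance type has the capacity:

```yaml
apiVersion: apps.cozystack.io/v1alpha1
kind: VirtualMachine
spec:
  instanceType: "u1.xlarge"
  gpus:
    # Each entry is a dedicated passthrough device, exclusive to this VM
    - name: "nvidia.com/AD102GL_L40S"
    - name: "nvidia.com/AD102GL_L40S"
```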

GPU on Kubernetes Worker

```yaml
apiVersion: apps.cozystack.io/v1alpha1
kind: Kubernetes
spec:
  nodeGroups:
    gpu-workers:
      instanceType: "u1.xlarge"
      gpus:
        - name: "nvidia.com/AD102GL_L40S"
```

GPU in Kubernetes Pod

```yaml
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: gpu-app
      image: nvidia/cuda:12.0-runtime-ubuntu20.04
      resources:
        limits:
          nvidia.com/gpu: 1
```
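Because the scheduler treats `nvidia.com/gpu` as a countable resource, GPU pods can also be scaled horizontally through higher-level controllers. A minimal sketch using a standard Deployment (the name, labels, and replica count here are illustrative, not part of the platform API):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-inference    # hypothetical name
spec:
  replicas: 2            # each replica lands on a node with a free GPU
  selector:
    matchLabels:
      app: gpu-inference
  template:
    metadata:
      labels:
        app: gpu-inference
    spec:
      containers:
        - name: gpu-app
          image: nvidia/cuda:12.0-runtime-ubuntu20.04
          resources:
            limits:
              nvidia.com/gpu: 1   # one GPU per replica
```

A replica stays Pending until a GPU becomes allocatable, so the replica count should not exceed the number of GPUs attached to the worker node group.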

📋 Approach Comparison

| Aspect      | GPU on VM                | GPU on Kubernetes     |
|-------------|--------------------------|-----------------------|
| Isolation   | Complete (1 GPU = 1 VM)  | Shared (orchestrated) |
| Performance | Native (passthrough)     | Native (device plugin)|
| Management  | Manual                   | Automated             |
| Scaling     | Vertical only            | Horizontal + vertical |
| Sharing     | No                       | Yes (between pods)    |
| Complexity  | Simple                   | Complex               |

🚀 Next Steps

For Virtual Machines

For Kubernetes