# GPUs on Hikube
Hikube provides access to NVIDIA accelerators through GPU passthrough for workloads that require hardware acceleration. GPUs are available to two types of workloads: virtual machines and Kubernetes pods.
## Usage Types

### GPU with Virtual Machines
GPUs can be attached directly to virtual machines via VFIO-PCI passthrough, giving the VM complete and exclusive access to the accelerator.
Use cases:
- Applications requiring complete GPU control
- Legacy or specialized workloads
- Isolated development environments
- Graphics applications (rendering, CAD)
### GPU with Kubernetes
GPUs can be allocated to Kubernetes workers and then assigned to pods via resource requests/limits.
Use cases:
- Containerized AI/ML workloads
- Automatic scaling of GPU applications
- GPU resource sharing between applications
- Complex orchestration of parallel jobs
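The request/limit model above can be illustrated with a small scheduling sketch (pure illustration, not Hikube or Kubernetes scheduler code; all names are made up): each worker advertises an allocatable count of `nvidia.com/gpu`, and a pod is placed only on a node with enough unallocated GPUs.

```python
# Illustrative sketch of extended-resource scheduling (NOT the actual
# Kubernetes scheduler): a pod fits only on a node whose remaining
# "nvidia.com/gpu" allocatable count covers the pod's request.

def schedule(pods, nodes):
    """Assign each pod to the first node with enough free GPUs."""
    free = dict(nodes)  # node name -> free GPU count
    placement = {}
    for pod, requested in pods:
        for node, available in free.items():
            if available >= requested:
                free[node] -= requested
                placement[pod] = node
                break
        else:
            placement[pod] = None  # no node fits: pod stays Pending
    return placement

nodes = {"gpu-worker-1": 2, "gpu-worker-2": 1}
pods = [("train-job", 2), ("infer-a", 1), ("infer-b", 1)]
print(schedule(pods, nodes))
# {'train-job': 'gpu-worker-1', 'infer-a': 'gpu-worker-2', 'infer-b': None}
```

Note how `infer-b` remains unscheduled once both workers are full; in a real cluster it would sit in `Pending` until a GPU frees up or the node group scales out.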
## Available Hardware
Hikube offers three types of NVIDIA GPUs:
### NVIDIA L40S
- Architecture: Ada Lovelace
- Memory: 48 GB GDDR6 with ECC
- Performance: 362 TOPS (INT8), 91.6 TFLOPS (FP32)
- Typical usage: Generative AI, inference, real-time rendering
### NVIDIA A100
- Architecture: Ampere
- Memory: 80 GB HBM2e with ECC
- Performance: 624 TOPS (INT8), 312 TFLOPS (FP16 Tensor)
- Typical usage: ML training, high-performance computing
### NVIDIA H100
- Architecture: Hopper
- Memory: 80 GB HBM3 with ECC
- Performance: 1979 TOPS (INT8), 989 TFLOPS (FP16 Tensor)
- Typical usage: LLMs, transformers, exascale computing
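As a rough sizing aid, the memory figures above indicate whether a model's weights fit on a single card. The sketch below is illustrative only (not official guidance); the 20% activation/workspace overhead factor is an assumption.

```python
# Rough single-GPU memory check: weights = params * bytes_per_param,
# scaled by an assumed overhead factor for activations/workspace.
GPU_MEMORY_GB = {"L40S": 48, "A100": 80, "H100": 80}  # from the specs above

def fits(params_billion, bytes_per_param=2, overhead=1.2):
    """Return, per GPU, whether its memory covers the estimated footprint."""
    need_gb = params_billion * bytes_per_param * overhead
    return {gpu: mem >= need_gb for gpu, mem in GPU_MEMORY_GB.items()}

print(fits(7))   # 7B model in FP16 (~16.8 GB): fits on all three cards
print(fits(30))  # 30B model in FP16 (~72 GB): only the 80 GB cards
```

Larger models, longer contexts, or FP32 weights raise the footprint further, which is when multi-GPU setups or quantization come into play.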
## Architecture

### GPU Allocation with VMs

### GPU Allocation with Kubernetes
## Configuration

### GPU on VM
```yaml
apiVersion: apps.cozystack.io/v1alpha1
kind: VirtualMachine
spec:
  instanceType: "u1.xlarge"
  gpus:
    - name: "nvidia.com/AD102GL_L40S"
```
### GPU on Kubernetes Worker
```yaml
apiVersion: apps.cozystack.io/v1alpha1
kind: Kubernetes
spec:
  nodeGroups:
    gpu-workers:
      instanceType: "u1.xlarge"
      gpus:
        - name: "nvidia.com/AD102GL_L40S"
```
### GPU in Kubernetes Pod
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-app
spec:
  containers:
    - name: gpu-app
      image: nvidia/cuda:12.0.0-runtime-ubuntu20.04
      resources:
        limits:
          nvidia.com/gpu: 1
```
## Approach Comparison
| Aspect | GPU on VM | GPU on Kubernetes |
|---|---|---|
| Isolation | Complete (1 GPU = 1 VM) | Shared (orchestrated) |
| Performance | Native (passthrough) | Native (device plugin) |
| Management | Manual | Automated |
| Scaling | Vertical only | Horizontal + Vertical |
| Sharing | No | Yes (between pods) |
| Complexity | Simple | Complex |
## Next Steps

### For Virtual Machines
- Create a GPU VM → Practical guide
- API Reference → Complete configuration
### For Kubernetes
- GPU Clusters → Workers with GPU
- Advanced configuration → GPU NodeGroups