# GPUs on Hikube
Hikube provides access to NVIDIA accelerators through GPU passthrough for workloads that require hardware acceleration. GPUs are available to two types of workloads: virtual machines and Kubernetes pods.
## Usage Types

### GPU with Virtual Machines
GPUs can be attached directly to virtual machines via VFIO-PCI passthrough, giving the VM complete and exclusive access to the accelerator.
Use cases:
- Applications requiring complete GPU control
- Legacy or specialized workloads
- Isolated development environments
- Graphics applications (rendering, CAD)
### GPU with Kubernetes
GPUs can be allocated to Kubernetes workers and then assigned to pods via resource requests/limits.
Use cases:
- Containerized AI/ML workloads
- Automatic scaling of GPU applications
- GPU resource sharing between applications
- Complex orchestration of parallel jobs
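The request/limit model above can be illustrated with a small scheduling sketch (pure illustration, not Hikube or Kubernetes scheduler code; all names are made up): each worker advertises an allocatable count of `nvidia.com/gpu`, and a pod is placed only on a node with enough unallocated GPUs.

```python
# Illustrative sketch of extended-resource scheduling (NOT the actual
# Kubernetes scheduler): a pod fits only on a node whose remaining
# "nvidia.com/gpu" allocatable count covers the pod's request.

def schedule(pods, nodes):
    """Assign each pod to the first node with enough free GPUs."""
    free = dict(nodes)  # node name -> free GPU count
    placement = {}
    for pod, requested in pods:
        for node, available in free.items():
            if available >= requested:
                free[node] -= requested
                placement[pod] = node
                break
        else:
            placement[pod] = None  # no node fits: pod stays Pending
    return placement

nodes = {"gpu-worker-1": 2, "gpu-worker-2": 1}
pods = [("train-job", 2), ("infer-a", 1), ("infer-b", 1)]
print(schedule(pods, nodes))
# {'train-job': 'gpu-worker-1', 'infer-a': 'gpu-worker-2', 'infer-b': None}
```

Note how `infer-b` remains unscheduled once both workers are full; in a real cluster it would sit in `Pending` until a GPU frees up or the node group scales out.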
## Available Hardware
Hikube offers three types of NVIDIA GPUs:
### NVIDIA L40S
- Architecture: Ada Lovelace
- Memory: 48 GB GDDR6 with ECC
- Performance: 362 TOPS (INT8), 91.6 TFLOPS (FP32)
- Typical usage: Generative AI, inference, real-time rendering
### NVIDIA A100
- Architecture: Ampere
- Memory: 80 GB HBM2e with ECC
- Performance: 624 TOPS (INT8), 312 TFLOPS (FP16 Tensor)
- Typical usage: ML training, high-performance computing
### NVIDIA H100
- Architecture: Hopper
- Memory: 80 GB HBM3 with ECC
- Performance: 1979 TOPS (INT8), 989 TFLOPS (FP16 Tensor)
- Typical usage: LLMs, transformers, exascale computing
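As a rough sizing aid, the memory figures above indicate whether a model's weights fit on a single card. The sketch below is illustrative only (not official guidance); the 20% activation/workspace overhead factor is an assumption.

```python
# Rough single-GPU memory check: weights = params * bytes_per_param,
# scaled by an assumed overhead factor for activations/workspace.
GPU_MEMORY_GB = {"L40S": 48, "A100": 80, "H100": 80}  # from the specs above

def fits(params_billion, bytes_per_param=2, overhead=1.2):
    """Return, per GPU, whether its memory covers the estimated footprint."""
    need_gb = params_billion * bytes_per_param * overhead
    return {gpu: mem >= need_gb for gpu, mem in GPU_MEMORY_GB.items()}

print(fits(7))   # 7B model in FP16 (~16.8 GB): fits on all three cards
print(fits(30))  # 30B model in FP16 (~72 GB): only the 80 GB cards
```

Larger models, longer contexts, or FP32 weights raise the footprint further, which is when multi-GPU setups or quantization come into play.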
## Architecture

### GPU Allocation with VMs

### GPU Allocation with Kubernetes
## Configuration

### GPU on VM
```yaml
apiVersion: apps.cozystack.io/v1alpha1
kind: VirtualMachine
spec:
  instanceType: "u1.xlarge"
  gpus:
    - name: "nvidia.com/AD102GL_L40S"
```
### GPU on Kubernetes Worker
```yaml
apiVersion: apps.cozystack.io/v1alpha1
kind: Kubernetes
spec:
  nodeGroups:
    gpu-workers:
      instanceType: "u1.xlarge"
      gpus:
        - name: "nvidia.com/AD102GL_L40S"
```
### GPU in Kubernetes Pod
```yaml
apiVersion: v1
kind: Pod
metadata:
  name: gpu-app
spec:
  containers:
    - name: gpu-app
      image: nvidia/cuda:12.0.0-runtime-ubuntu20.04
      resources:
        limits:
          nvidia.com/gpu: 1
```
## Approach Comparison
| Aspect | GPU on VM | GPU on Kubernetes |
|---|---|---|
| Isolation | Complete (1 GPU = 1 VM) | Shared (orchestrated) |
| Performance | Native (passthrough) | Native (device plugin) |
| Management | Manual | Automated |
| Scaling | Vertical only | Horizontal + Vertical |
| Sharing | No | Yes (between pods) |
| Complexity | Simple | Complex |
## Next Steps

### For Virtual Machines
- Create a GPU VM → Practical guide
- API Reference → Complete configuration
### For Kubernetes
- GPU Clusters → Workers with GPU
- Advanced configuration → GPU NodeGroups