Documentation Index
Fetch the complete documentation index at: https://docs.hyperbolic.ai/docs/llms.txt
Use this file to discover all available pages before exploring further.
Cluster Configuration
Configure your dedicated cluster with the exact specifications your workloads require. Our team works with you to design the optimal setup for your use case.
GPU Selection
Choose from the latest NVIDIA hardware:
| GPU Model | Memory | Best For |
| --- | --- | --- |
| Blackwell (B200) | 192GB HBM3e | Cutting-edge training and inference |
| H200 | 141GB HBM3e | Next-gen LLM training and inference |
| H100 | 80GB HBM3 | Industry standard for large-scale training |

Custom Mix: You can combine different GPU types in a single cluster for specialized workloads.
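Since GPU types can be mixed in one cluster, aggregate memory is worth a quick sanity check during capacity planning. A minimal sketch using the per-GPU figures from the table above (the function name and cluster shapes are illustrative):

```python
# Per-GPU memory in GB, taken from the table above.
GPU_MEMORY_GB = {
    "B200": 192,  # Blackwell, HBM3e
    "H200": 141,  # HBM3e
    "H100": 80,   # HBM3
}

def total_hbm_gb(cluster: dict[str, int]) -> int:
    """Total GPU memory for a (possibly mixed) cluster, e.g. {"H100": 64, "H200": 8}."""
    return sum(GPU_MEMORY_GB[model] * count for model, count in cluster.items())

# A mixed cluster: 32x H100 for training plus 8x H200 for long-context inference.
print(total_hbm_gb({"H100": 32, "H200": 8}))  # 32*80 + 8*141 = 3688 GB
```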
Networking Options
High-speed interconnects are critical for distributed training. Choose the right networking for your needs:
InfiniBand (400Gb/s)
Ultra-low latency networking for distributed training at scale.
Best for: Multi-node training with 32+ GPUs
Latency: Sub-microsecond
Topology: Fat-tree or custom
RoCE (200Gb/s)
High-speed Ethernet alternative with RDMA support.
Best for: Mixed training/inference workloads
Latency: Low-microsecond range
Easier integration with existing infrastructure
NVLink
Direct GPU-to-GPU communication within a node.
Best for: Intra-node communication
Bandwidth: Up to 900GB/s
Included with multi-GPU nodes
Custom Topology
Need something specific? We can design custom network architectures for your requirements.
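The recommendations above can be condensed into a rule of thumb. A sketch only; the function and thresholds are ours, mirroring this section's guidance (NVLink handles intra-node traffic in every case, so this only picks the inter-node fabric):

```python
def recommend_interconnect(gpus: int, gpus_per_node: int = 8,
                           mixed_workload: bool = False) -> str:
    """Map this section's networking guidance onto cluster shape."""
    if gpus <= gpus_per_node:
        return "NVLink only (single node)"
    if mixed_workload:
        return "RoCE 200Gb/s"        # easier integration; training + inference
    if gpus >= 32:
        return "InfiniBand 400Gb/s"  # multi-node training at scale
    return "RoCE 200Gb/s"

print(recommend_interconnect(64))                       # InfiniBand 400Gb/s
print(recommend_interconnect(16, mixed_workload=True))  # RoCE 200Gb/s
```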
Compute Specifications
Configure the CPU and memory for your nodes:
| Component | Options |
| --- | --- |
| CPU | Up to 256 cores per node |
| RAM | Up to 2TB per node |
| Internet Bandwidth | 10-100 Gbps connectivity |
Storage Solutions
Local Storage
Fast NVMe storage attached directly to your nodes.
Capacity: Up to 30TB per node
Performance: Up to 7GB/s read/write
Best for: Training checkpoints, scratch space
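To see whether the 30TB local cap covers your checkpoint retention policy, a rough sizing sketch (the 16 bytes/parameter figure is our assumption for bf16 weights plus fp32 Adam optimizer state; weights-only bf16 checkpoints are closer to 2 bytes/parameter):

```python
def checkpoint_tb(params_b: float, bytes_per_param: int = 16) -> float:
    """Approximate checkpoint size in TB for params_b billion parameters.

    16 bytes/param assumes bf16 weights plus fp32 Adam state; adjust for your setup.
    """
    return params_b * 1e9 * bytes_per_param / 1e12

def fits_local_nvme(params_b: float, keep: int, cap_tb: float = 30.0) -> bool:
    """Do `keep` retained checkpoints fit within the local NVMe capacity?"""
    return checkpoint_tb(params_b) * keep <= cap_tb

print(round(checkpoint_tb(70), 2))   # ~1.12 TB per full 70B-parameter checkpoint
print(fits_local_nvme(70, keep=20))  # True: ~22.4 TB, under the 30 TB cap
```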
Shared Storage
Network-attached storage accessible from all nodes.
| Type | Capacity | Best For |
| --- | --- | --- |
| Lustre | Up to 1PB | High-performance parallel I/O |
| NFS | Up to 1PB | General-purpose shared storage |
Object Storage
S3-compatible storage for datasets and artifacts.
Capacity: Unlimited
Best for: Large datasets, model artifacts, backups
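Large artifacts go into S3-compatible stores via multipart upload, which caps an upload at 10,000 parts with part sizes between 5 MiB and 5 GiB (and a single object at 5 TiB). A sketch of picking a valid part size for a big checkpoint:

```python
import math

MIN_PART = 5 * 1024**2    # 5 MiB: S3 multipart minimum part size (last part exempt)
MAX_PART = 5 * 1024**3    # 5 GiB: maximum part size
MAX_PARTS = 10_000        # maximum number of parts per upload
MAX_OBJECT = 5 * 1024**4  # 5 TiB: maximum single-object size

def plan_multipart(size_bytes: int, target_part: int = 64 * 1024**2) -> tuple[int, int]:
    """Return a (part_size, part_count) pair that respects the S3 multipart limits."""
    if size_bytes > MAX_OBJECT:
        raise ValueError("exceeds the 5 TiB S3 single-object limit")
    part = min(MAX_PART, max(target_part, MIN_PART, math.ceil(size_bytes / MAX_PARTS)))
    return part, math.ceil(size_bytes / part)

# A 140 GB checkpoint with the default 64 MiB target part size:
part, count = plan_multipart(140 * 10**9)
print(part // 1024**2, "MiB parts,", count, "parts")  # 64 MiB parts, 2087 parts
```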
Backup Options
Automated snapshots
Point-in-time recovery
Cross-region replication (optional)
Example Configurations
LLM Training Cluster
Optimized for training large language models:
| Component | Specification |
| --- | --- |
| GPUs | 64x H100 80GB |
| Networking | InfiniBand 400Gb/s |
| CPU | 128 cores per node |
| RAM | 1TB per node |
| Storage | 15TB NVMe local + 500TB Lustre shared |
Contact sales for pricing based on your specific configuration and commitment terms.
Inference Cluster
Optimized for high-throughput inference:
| Component | Specification |
| --- | --- |
| GPUs | 16x H100 80GB |
| Networking | RoCE 200Gb/s |
| CPU | 64 cores per node |
| RAM | 512GB per node |
| Storage | 8TB NVMe local |
Research Cluster
Flexible configuration for R&D teams:
| Component | Specification |
| --- | --- |
| GPUs | 32x H200 141GB |
| Networking | NVLink + RoCE |
| CPU | 128 cores per node |
| RAM | 1TB per node |
| Storage | 10TB NVMe local + 200TB NFS shared |
Security Options
VPN Access: Secure connectivity to your cluster
Private Networking: Isolated network environment
Custom Firewall Rules: Control inbound/outbound traffic
Compliance Configurations: SOC2, HIPAA-ready setups available
Next Steps
Get Started: Begin the reservation process with our sales team
Cluster Management: Learn about deployment and monitoring options