Documentation Index
Fetch the complete documentation index at: https://docs.hyperbolic.ai/docs/llms.txt
Use this file to discover all available pages before exploring further.
Platform Comparison Guide
Choose the right Hyperbolic service based on your specific needs. This comprehensive comparison will help you understand the differences between our three core offerings.Quick Comparison Table
| Feature | On-Demand GPU | Serverless Inference | Reserved Clusters |
|---|---|---|---|
| Best For | Training, development, experiments | Production APIs, prototypes | Enterprise workloads |
| Setup Time | < 5 minutes | Instant | 24-48 hours |
| Minimum Commitment | None (hourly) | None (pay-per-use) | 3 months |
| Pricing Model | $/hour | $/1M tokens | $/month |
| GPU Access | Full SSH access | API only | Full SSH access |
| Scaling | Manual | Automatic | Pre-configured |
| Support Level | Community | Standard | Priority |
| SLA | 99.5% | 99.9% | 99.99% |
Detailed Service Comparison
On-Demand GPUs
When to Use:
- Training custom models
- Running experiments and notebooks
- Batch processing jobs
- Need full control over environment
- Temporary compute needs
- Full root access via SSH
- Custom Docker images
- Persistent storage options
- Choose specific GPU models
- Multiple GPU configurations
- H100 SXM: Starting at $1.39/hour
- H200 SXM: Starting at $1.99/hour
- No setup fees or commitments
- No automatic scaling
- Manual instance management
- Availability depends on supply
- No built-in load balancing
Serverless Inference
When to Use:
- Production API endpoints
- Quick prototyping
- Variable/unpredictable traffic
- Using standard models
- Cost-sensitive applications
- Instant deployment
- Auto-scaling
- OpenAI-compatible API
- 25+ pre-loaded models
- Pay only for usage
- Check our pricing page for up-to-date details.
- No custom models (without arrangement)
- Rate limits apply
Reserved Clusters
When to Use:
- 24/7 production workloads
- Guaranteed availability needed
- High-volume consistent usage
- Custom configurations required
- Enterprise compliance needs
- Dedicated resources
- Custom configurations
- Priority support
- 99.99% SLA
- Volume discounts up to 40%
- Custom quotes based on:
- Cluster size (32-100+ GPUs)
- Contract length (3-24 months)
- GPU type and configuration
- Typical savings: Depends on usage and contract length
- Minimum 3-month commitment
- 24-48 hour setup time
- Less flexibility
- Capacity planning required
Use Case Scenarios
Scenario 1: AI Startup Building an App
- Development Phase
- MVP/Beta Phase
- Production Scale
Recommended: On-Demand GPU
- Experiment with different models
- Train custom models
- Test various configurations
- No commitment while iterating
Scenario 2: Research Team
Recommended: On-Demand GPU- Access to latest GPU models
- Full control for experiments
- Flexible rental periods
- Multiple concurrent experiments
Scenario 3: Enterprise Integration
Recommended: Reserved Clusters- Compliance requirements (SOC2, HIPAA)
- Dedicated resources
- Custom security configurations
- SLA guarantees
- Direct support channel
Scenario 4: Hackathon Project
Recommended: Serverless Inference- Instant API access
- No setup required
- Free tier available
- Focus on building, not infrastructure
Getting Started
Assess Your Needs
- Workload type (training vs inference)
- Usage pattern (constant vs variable)
- Budget constraints
- Timeline requirements
Start Small
- Try Serverless for quick testing
- Rent On-Demand for a few hours
- Run benchmarks and calculate costs
Need Help Deciding?
Talk to Sales
Get personalized recommendations and custom quotes
Contact Support
Get help from our support team
Get Started
Get started with Hyperbolic in minutes

