
Replicate Pricing 2026

Plans, hidden costs, and cheaper alternatives compared

Is Replicate worth the price?

85/10

Replicate's pay-as-you-go pricing for public models is fair: you are billed per second of compute or per output, and the 'scale to zero' feature means there is no charge when idle. That makes it highly cost-effective for intermittent use.

Dedicated hardware offers more control but can become expensive quickly: an 8x H100 GPU instance costs $43.92/hr, and idle time is billed. This structure is best for developers and researchers who need flexible, on-demand access to ML models, with dedicated options for production-grade private models.

Pricing Plans

Pay-as-you-go (Public Models)

Usage-based

  • CPU: $0.0001/sec
  • Nvidia T4 GPU: $0.000225/sec
  • Nvidia L40S GPU: $0.000975/sec
  • Up to 8x H100 GPU: $0.0122/sec
  • Image models: $0.025–$0.09 per output
  • LLMs: $3.00–$3.75 per million input tokens
  • Video models: $0.09–$0.25 per second of output
  • Scale to zero — no charge when idle
  • Thousands of community models included
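The per-second rates above are easier to compare once converted to hourly figures or to the cost of a single run. The following is a minimal sketch, not an official calculator: the rates are copied from the list above, while the function names and the example run length and token count are hypothetical.

```python
# Per-second rates in USD, copied from the pay-as-you-go list above.
RATES_PER_SEC = {
    "cpu": 0.0001,
    "t4": 0.000225,
    "l40s": 0.000975,
}


def hourly_rate(per_sec: float) -> float:
    """Convert a per-second rate to an hourly rate (3600 seconds per hour)."""
    return round(per_sec * 3600, 4)


def run_cost(hardware: str, seconds: float) -> float:
    """Cost of one run that occupies the given hardware for `seconds`."""
    return RATES_PER_SEC[hardware] * seconds


def token_cost(input_tokens: int, price_per_million: float) -> float:
    """Cost of LLM input tokens at a per-million-token price."""
    return input_tokens / 1_000_000 * price_per_million


if __name__ == "__main__":
    for name, rate in RATES_PER_SEC.items():
        print(f"{name}: ${hourly_rate(rate)}/hr")
    # Hypothetical workload: a 10-second prediction on a T4.
    print(f"10 s on T4: ${run_cost('t4', 10):.5f}")
    # Hypothetical workload: 250,000 input tokens at the low end ($3.00/M).
    print(f"250k input tokens: ${token_cost(250_000, 3.00):.2f}")
```

Because public models scale to zero, only the seconds a run actually consumes enter this calculation.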

Dedicated Hardware (Private Models)

From $0.09/hr

  • CPU Small: $0.09/hr ($0.000025/sec)
  • Up to 8x H100 GPU: $43.92/hr ($0.0122/sec)
  • Dedicated instances for custom models
  • Billed for all time instances are online, including idle time
  • Fast-booting fine-tunes exempt from idle charges
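Since dedicated instances are billed while idle, the price per hour of useful work depends on how busy the instance is. The sketch below illustrates that arithmetic under a simplifying assumption: "utilization" is the fraction of online time spent serving requests, a term introduced here for illustration and not taken from Replicate's documentation. The function names are hypothetical.

```python
def effective_hourly_cost(hourly_rate: float, utilization: float) -> float:
    """Cost per hour of useful work when idle time is also billed.

    `utilization` is the busy fraction of online time, in (0, 1].
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return hourly_rate / utilization


def idle_spend(hourly_rate: float, hours_online: float, utilization: float) -> float:
    """Portion of the bill that pays for idle time."""
    if not 0 <= utilization <= 1:
        raise ValueError("utilization must be in [0, 1]")
    return hourly_rate * hours_online * (1 - utilization)


if __name__ == "__main__":
    h100_8x = 43.92  # USD/hr, from the dedicated hardware list above
    # Hypothetical scenario: the instance is busy a quarter of the time.
    print(f"Effective rate at 25% busy: ${effective_hourly_cost(h100_8x, 0.25):.2f}/hr")
    print(f"Idle spend over 10 h online: ${idle_spend(h100_8x, 10, 0.25):.2f}")
```

At full utilization the effective rate equals the list price; the lower the utilization, the more of the bill goes to idle time.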

Enterprise

Custom

  • Volume discounts
  • Dedicated support
  • Custom SLAs
  • Contact sales for pricing

Hidden Costs & Gotchas

Dedicated hardware charges for idle time

High costs for sustained dedicated GPU use

Potential for rapid spend on high-end GPUs
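To put the sustained-use risk in numbers, the sketch below projects what an always-on dedicated instance costs over a day and over a month. It assumes a 30-day month and continuous uptime; the function name is hypothetical, and the hourly price is the 8x H100 figure quoted on this page.

```python
H100_8X_HOURLY = 43.92  # USD/hr, dedicated 8x H100 price quoted on this page


def always_on_cost(hourly: float, days: int = 30, hours_per_day: float = 24.0) -> float:
    """Total spend for an instance kept online `hours_per_day` for `days` days."""
    return round(hourly * hours_per_day * days, 2)


if __name__ == "__main__":
    print(f"One day:  ${always_on_cost(H100_8X_HOURLY, days=1):,.2f}")
    print(f"30 days:  ${always_on_cost(H100_8X_HOURLY):,.2f}")
```

Under these assumptions a single 8x H100 instance left running costs about $1,054 per day, or roughly $31,600 per 30-day month.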

Which Plan Do You Need?

ML developers and researchers

Startups with fluctuating ML needs

Teams deploying custom private models

How Replicate Compares to Competitors

Compared to AWS SageMaker, Replicate offers a simpler, developer-centric pricing model, particularly for public models, which benefit from 'scale to zero'. SageMaker's instance pricing can be complex. Replicate's dedicated 8x H100 GPU at $43.92/hr is competitive but requires careful management to avoid idle charges, unlike some serverless ML platforms that abstract away infrastructure costs more completely.

Replicate Pricing FAQ

How much does Replicate cost?

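Replicate uses usage-based pricing. Public models are billed pay-as-you-go, from $0.0001/sec on CPU, with image models at $0.025–$0.09 per output. Dedicated hardware for private models starts at $0.09/hr. Enterprise pricing is custom; contact Replicate sales for a quote.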

Does Replicate have a free plan?

No free plan is listed. Both the Pay-as-you-go and Dedicated Hardware plans are paid, with dedicated hardware starting at $0.09/hr for CPU Small. Public models do scale to zero, however, so there is no charge when idle.

Is there a cheaper alternative to Replicate?

Yes. Popular alternatives to Replicate include Hugging Face, Together AI, RunPod, and Nexus Repository. Free alternatives include Hugging Face and Nexus Repository. Compare them side-by-side on Toolradar.

Cheaper alternatives to Replicate

Direct competitors with similar features. Many offer free tiers or lower per-seat pricing.