Qdrant occupies the sweet spot between developer-friendly open source and production-grade managed cloud.
The open-source engine is genuinely full-featured — unlike some competitors that gate critical features behind paid tiers, Qdrant's filtering, quantization, and distributed mode are all available self-hosted at zero cost. The free cloud tier at 1 GB RAM is the most generous among vector database providers (Pinecone's free tier limits to fewer dimensions and lower throughput).
The managed cloud pricing is usage-based with no per-query charges, which makes costs predictable once you size your cluster — you pay for infrastructure, not API calls. The main trade-off is that Qdrant Cloud does not publish exact per-resource rates on its website, requiring the pricing calculator or sales contact to get precise numbers.
A typical production cluster runs $150-200/month for moderate workloads, which is competitive with Pinecone Serverless and significantly cheaper than Pinecone's pod-based pricing. For teams already running Kubernetes, self-hosting Qdrant is the highest-value option — the Helm chart is well-maintained and horizontal scaling is straightforward.
Opaque per-unit pricing: Qdrant Cloud does not publish per-vCPU, per-GB-RAM, or per-GB-storage rates on its pricing page. You must use the pricing calculator at cloud.qdrant.io/calculator or contact sales to estimate costs, which makes budget planning harder than with competitors that publish transparent rate cards.
High availability doubles or triples cost: The free tier is single-node. Production high availability requires 3-node replication, which roughly triples your compute and storage costs; a $150/month single-node cluster becomes about $450/month with HA. Plan for this in production budgets.
Backup storage is billed separately: Snapshots and backups consume additional storage charged on top of your cluster's primary disk allocation. Frequent backups of large collections can add 20-50% to your storage costs.
Embedding model costs are separate: Qdrant stores and searches vectors but does not generate them. You need an embedding API (OpenAI, Cohere, or an open-source model) to convert text or images to vectors before storing them. At scale, embedding API costs can exceed Qdrant infrastructure costs, so budget for both.
Egress and data transfer: Cloud providers charge for data transferred out of a region. If your application queries Qdrant from a different region or cloud provider, egress fees apply; co-locate your application and your Qdrant cluster in the same region.
Premium tier minimum spend: The Premium tier requires an annual minimum commitment (amount undisclosed). Organizations that want SSO and VPC links cannot access these features on a purely usage-based model; the minimum spend acts as an effective price floor.
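These overheads compound, so it is worth running the numbers before committing. A back-of-envelope sketch using the figures quoted in this review (3x for HA replication, 20-50% backup storage overhead); it pessimistically applies the backup overhead to the whole bill rather than to storage alone, so treat it as an upper bound:

```python
def monthly_cost(base_node_usd, ha=False, backup_overhead=0.0):
    """Rough monthly estimate from this review's figures.

    base_node_usd: single-node cluster price (e.g. 150-200 for a starter).
    ha: 3-node replication roughly triples compute and storage cost.
    backup_overhead: extra storage as a fraction (0.2-0.5 for frequent
        backups). Simplification: applied to the whole bill, not just disk.
    """
    nodes = 3 if ha else 1
    return base_node_usd * nodes * (1 + backup_overhead)

# Single node, no backups: the $150 starter stays at $150.
print(monthly_cost(150))                                  # 150.0
# Production: HA tripling plus 30% backup overhead.
print(monthly_cost(150, ha=True, backup_overhead=0.3))    # 585.0
```

The jump from $150 to roughly $585/month is why the HA and backup line items above deserve a row in any production budget.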
Example scenario: a production RAG application with 2 million vectors (768 dimensions), running for 12 months at moderate query volume.
Free: A permanently free single-node cluster with 0.5 vCPU, 1 GB RAM, and 4 GB disk on Qdrant Cloud. Enough to store roughly 500,000 vectors at 768 dimensions and run basic similarity search queries. Includes free cloud inference with selected embedding models, so you can test the full pipeline (embed → store → search) without any external API costs. No credit card required, no time limit.
Standard: Pay-as-you-go pricing billed hourly based on actual vCPU, RAM, and storage consumption. Supports dedicated clusters with flexible scaling on AWS, GCP, or Azure. Includes high availability (multi-node replication), backup and disaster recovery, and a 99.5% uptime SLA. A typical starter production cluster with 2 vCPU, 8 GB RAM, and 30 GB storage runs approximately $150-200/month. Scale up by adding nodes or resizing; costs increase linearly with resources.
Premium: Everything in Standard plus SSO (SAML/OIDC), private VPC links for network isolation, enhanced support with dedicated engineers, and a 99.9% uptime SLA (up from 99.5%). Requires a minimum annual spend commitment; contact sales for pricing. Designed for organizations where vector search is a core production dependency and downtime has direct revenue impact.
Self-hosted: Qdrant is open source (Apache 2.0). Run it on your own infrastructure (bare metal, cloud VMs, or Kubernetes) at zero licensing cost. The full feature set is available self-hosted, including HNSW indexing, filtering, quantization, and distributed mode. You trade Qdrant Cloud's managed convenience for complete data sovereignty and elimination of vendor pricing. A single 8 GB RAM VM on AWS ($60-80/month) can handle millions of vectors.
Hybrid Cloud: Qdrant manages and monitors your clusters, but they run in your cloud account or data center. Combines the operational simplicity of managed cloud with the data residency of self-hosted. Pricing is usage-based like Standard but with an additional management fee. Ideal for regulated industries (healthcare, finance) where vector data cannot leave the corporate network.
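Sizing the example scenario above (2 million vectors at 768 dimensions) against these tiers is simple arithmetic: each float32 dimension is 4 bytes. A rough sketch that counts only the raw vectors (the HNSW index and payloads add overhead on top):

```python
def raw_vector_gb(num_vectors, dims, bytes_per_dim=4):
    """Raw float32 vector footprint in decimal GB, ignoring
    HNSW index and payload overhead, which add on top."""
    return num_vectors * dims * bytes_per_dim / 1e9

# 2M vectors at 768 dims: ~6.1 GB raw, comfortably inside the
# 30 GB disk of the starter Standard cluster.
print(round(raw_vector_gb(2_000_000, 768), 2))   # 6.14
# The free tier's ~500k vectors at 768 dims: ~1.5 GB raw, which
# relies on Qdrant's on-disk storage given the 1 GB RAM cap.
print(round(raw_vector_gb(500_000, 768), 2))     # 1.54
```

By this estimate the scenario fits a single starter Standard node on disk; whether 8 GB RAM suffices depends on quantization and how much of the index you keep in memory.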
Startup: Start with the free cloud tier (1 GB RAM) for prototyping your RAG or search pipeline; it handles 500k vectors without cost. When you move to production, a Standard cluster at $150-200/month covers most early-stage workloads. Avoid over-provisioning: start with a 2 vCPU / 8 GB RAM node and scale up based on actual latency metrics. If you are already running Kubernetes, self-hosting with the Helm chart saves 50-70% over managed cloud.
Enterprise: Choose the Premium tier for SSO, VPC links, and the 99.9% SLA. The annual minimum spend commitment is the entry cost for enterprise features; negotiate based on projected usage. For regulated industries, Hybrid Cloud lets Qdrant manage clusters that run inside your own infrastructure, solving data residency requirements without the self-hosting ops burden. Budget for 3-node HA clusters (3x single-node cost) and backup storage (20-50% overhead).
Freelancer: The free cloud tier is more than enough for client projects and personal RAG applications. For production client deployments, a Standard cluster at $150-200/month is reasonable to pass through to the client. Alternatively, self-host Qdrant on a $5-10/month VPS (Hetzner, DigitalOcean) for small-scale projects; the Docker image is lightweight and the Rust engine is efficient with minimal resources.