
Serverless vector and full-text search engine built on object storage for AI applications.
Pricing tiers: $64/month, $256/month, or contact us.
Turbopuffer reduces cost by being built from first principles on object storage. It separates compute from storage and moves data between NVMe and object storage, so resources are used only where they are needed. This lowers overall infrastructure cost compared to solutions that rely on more expensive, always-on compute and storage configurations.
Turbopuffer's pricing is based on logical bytes for vector storage. Logical storage size for vectors is calculated as the number of vectors multiplied by the vector dimension and the size of the data type (e.g., 4 bytes for float32). Full-text search attributes, vector attributes, and other attributes are billed based on their compressed logical size.
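The logical-size formula above can be worked through with a short calculation. This is a minimal sketch, not Turbopuffer's billing code: the function name `logical_vector_bytes` and the `float16` entry are illustrative assumptions, and it covers only vector storage, not the compressed attribute sizes.

```python
# Bytes per element by data type. float32 = 4 bytes comes from the text;
# the float16 entry is an illustrative assumption.
BYTES_PER_ELEMENT = {"float32": 4, "float16": 2}


def logical_vector_bytes(num_vectors: int, dimensions: int, dtype: str = "float32") -> int:
    """Logical vector storage = number of vectors x dimension x bytes per element."""
    return num_vectors * dimensions * BYTES_PER_ELEMENT[dtype]


# Example: 1M vectors with 768 dimensions stored as float32.
size = logical_vector_bytes(1_000_000, 768)
print(f"{size:,} bytes (~{size / 1e9:.2f} GB)")  # 3,072,000,000 bytes (~3.07 GB)
```

Attribute storage is billed on compressed logical size, which depends on the data itself, so it is left out of this estimate.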
Turbopuffer supports hybrid search. Users can combine vector similarity search with full-text search and metadata filtering within a single query, which gives more precise and relevant results for complex AI applications and recommendation systems.
For a single namespace, Turbopuffer supports up to 500 million documents, totaling approximately 2TB of data. The maximum write throughput for a single namespace is 10,000 writes per second, with data ingestion rates up to 32 MB per second.
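To put these limits in perspective, the following rough sketch estimates the minimum time needed to fill a single namespace at the stated maximum rates. It assumes decimal units (1 MB = 10^6 bytes, 1 TB = 10^12 bytes) and that one write corresponds to one document; neither assumption is confirmed by the source, and the function name is hypothetical.

```python
# Limits quoted in the text for a single namespace.
MAX_WRITES_PER_SEC = 10_000
MAX_INGEST_BYTES_PER_SEC = 32 * 10**6  # 32 MB/s, assuming decimal megabytes


def min_ingest_seconds(num_docs: int, total_bytes: int) -> float:
    """Lower bound on ingest time: the slower of the two limits dominates."""
    by_writes = num_docs / MAX_WRITES_PER_SEC  # assumes one write per document
    by_bytes = total_bytes / MAX_INGEST_BYTES_PER_SEC
    return max(by_writes, by_bytes)


# Example: 500 million documents totaling 2 TB (decimal).
seconds = min_ingest_seconds(500_000_000, 2 * 10**12)
print(f"{seconds:,.0f} s (~{seconds / 3600:.1f} hours)")
```

Under these assumptions the byte-rate limit is the bottleneck, so loading a full namespace takes most of a day at best.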
For enterprise customers, Turbopuffer provides the following security and compliance features: a SOC 2 report, a GDPR-ready Data Processing Agreement (DPA), a HIPAA-ready Business Associate Agreement (BAA), Single Sign-On (SSO), Customer Managed Encryption Keys (CMEK) per namespace, and Private Networking. These features support data protection and compliance with industry-specific regulations.
For a warm namespace, Turbopuffer achieves a p50 latency of 8ms and a p99 latency of 35ms for vector search queries (e.g., 768 dimensions, 1M documents). For a cold namespace, where data must first be retrieved from underlying storage, the p50 latency is 343ms and the p99 latency is 554ms.
Source: turbopuffer.com