Applies Git-like version control to data lakes for managing data lifecycle and provenance.
Enables isolated testing, instant rollbacks, and reproducible AI/ML training.
Integrates with existing data and AI stacks, supporting various storage, compute, and orchestration tools.
Pricing: Free plan available
Best for: Growing teams
Pros & Cons
Pros
Accelerates AI delivery and development velocity.
Ensures data quality and compliance with isolated testing and instant rollbacks.
Reduces storage costs by avoiding data duplication.
Streamlines data science and MLOps workflows.
Provides transparent, traceable, and repeatable development for AI.
Cons
Some advanced features, such as Iceberg REST Catalog and Metadata Search, are available only in the Enterprise plan.
SOC2 support is exclusively offered in the Enterprise plan.
The managed lakeFS Cloud service currently supports only AWS, Azure, and GCP, which may limit options for teams on other cloud platforms.
Key Features
Format-Agnostic Data Version Control
Cloud-Agnostic
Zero Clone copy for isolated environment (via branches)
Atomic Data Promotion (via merges)
Configurable Garbage Collection
Data CI/CD Using lakeFS Hooks
Role-Based Access Control (RBAC)
Integrates with Your Data Stack
Audit Logs
Pricing Plans
Open Source
Free forever
Format-Agnostic Data Version Control
Cloud-Agnostic
Zero Clone copy for isolated environment (via branches)
Atomic Data Promotion (via merges)
Data Stays in One Place
Configurable Garbage Collection
Data CI/CD Using lakeFS Hooks
Integrates with Your Data Stack
Role-Based Access Control (RBAC)
Single Sign On (SSO)
SCIM Support
IAM Roles Mount Capability
Audit Logs
Transactional Mirroring
Iceberg REST Catalog
Metadata Search
Multiple Storage Backends Support
Simplified Garbage Collection (Managed or Standalone)
SOC2 Support
SLA
Enterprise
Contact sales
Unlimited seats
Format-Agnostic Data Version Control
Cloud-Agnostic
Zero Clone copy for isolated environment (via branches)
Atomic Data Promotion (via merges)
Data Stays in One Place
Configurable Garbage Collection
Data CI/CD Using lakeFS Hooks
Integrates with Your Data Stack
Role-Based Access Control (RBAC)
Single Sign On (SSO)
SCIM Support
IAM Roles Mount Capability
Audit Logs
Transactional Mirroring
Iceberg REST Catalog
Metadata Search
Multiple Storage Backends Support
Simplified Garbage Collection (Managed or Standalone)
lakeFS is a data version control system designed to bridge the AI infrastructure gap by bringing software engineering best practices to data management. It provides a control plane for AI-ready data, enabling teams to manage the data lifecycle, provenance, and unified access for AI and data initiatives. Built on a scalable architecture, lakeFS lets users test pipeline and model changes in isolation on production data without creating copies, roll back instantly after data incidents, and enforce data quality and compliance standards.
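To make the ideas of copy-free branching, atomic promotion, and instant rollback more concrete, here is a minimal conceptual sketch in Python. It is not lakeFS code and does not use the lakeFS API: the `ToyRepo` class and all of its methods are hypothetical names invented for illustration. It assumes a simplified model in which a commit is an immutable mapping from paths to object IDs and a branch is just a pointer to a commit; deletions and merge conflicts are ignored.

```python
"""Toy model of Git-like data versioning. Hypothetical; NOT lakeFS or its API."""


class ToyRepo:
    def __init__(self):
        self.commits = []            # each commit: a {path: object_id} snapshot
        self.heads = {"main": None}  # branch name -> index of its head commit
        self.staged = {"main": {}}   # uncommitted changes per branch

    def _snapshot(self, branch):
        # Copies only the small path -> object ID mapping, never the objects.
        head = self.heads[branch]
        return dict(self.commits[head]) if head is not None else {}

    def create_branch(self, name, source="main"):
        # A new branch is just another pointer to the source's head commit.
        self.heads[name] = self.heads[source]
        self.staged[name] = {}

    def put(self, branch, path, object_id):
        # Stage a change on one branch; other branches are unaffected.
        self.staged[branch][path] = object_id

    def commit(self, branch):
        snapshot = self._snapshot(branch)
        snapshot.update(self.staged[branch])
        self.commits.append(snapshot)
        self.heads[branch] = len(self.commits) - 1
        self.staged[branch] = {}
        return self.heads[branch]

    def merge(self, source, dest):
        # Simplified merge: source entries win, deletions are ignored.
        # Readers of dest see the change only when the head pointer moves.
        merged = self._snapshot(dest)
        merged.update(self._snapshot(source))
        self.commits.append(merged)
        self.heads[dest] = len(self.commits) - 1
        return self.heads[dest]

    def reset(self, branch, commit_id):
        # Rollback: point the branch back at an earlier commit.
        self.heads[branch] = commit_id

    def read(self, branch, path):
        return self._snapshot(branch).get(path)


if __name__ == "__main__":
    repo = ToyRepo()
    repo.put("main", "data/a.parquet", "obj-1")
    good = repo.commit("main")

    repo.create_branch("experiment")                   # isolated environment
    repo.put("experiment", "data/a.parquet", "obj-2")
    repo.commit("experiment")
    print(repo.read("main", "data/a.parquet"))         # main is untouched so far

    repo.merge("experiment", "main")                   # promote the change
    repo.reset("main", good)                           # roll back if needed
    print(repo.read("main", "data/a.parquet"))
```

Because branches and commits in this sketch only reference object IDs, creating a branch or rolling back never copies or rewrites the underlying data, which is the general idea behind the "without creating copies" and "roll back instantly" claims above.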
The platform helps make training reproducible by tracking data used in experiments and model training, offering full visibility into data history with a built-in audit trail, and automatically satisfying model governance requirements. It also reduces data access friction by allowing users to work with any tool on remote data as if it were local, manage access permissions across all storage from one place, and keep GPUs busy without waiting for data. lakeFS integrates seamlessly with a wide range of object storage solutions, compute engines, ingest technologies, data formats, orchestration tools, and ML/AI stacks, making it a versatile solution for organizations looking to accelerate AI delivery, ensure reproducibility, and reduce data friction.
What is lakeFS?
lakeFS is a data version control system that applies Git-like operations to data lakes. It helps manage the data lifecycle, provenance, and unified access for AI and data teams, enabling reproducible experiments, isolated testing, and instant rollbacks for data incidents.
How much does lakeFS cost?
lakeFS follows a freemium model. The Open Source version is free forever, and the Enterprise plan includes unlimited seats, with pricing available by contacting sales.
Is lakeFS free?
Yes, lakeFS has a free Open Source version that is available forever. There is also a paid Enterprise plan with additional features.
Who is lakeFS for?
lakeFS is for AI and data teams, data engineers, ML engineers, and organizations looking to manage data at scale, accelerate their data, AI, and ML initiatives, ensure reproducibility, and improve data quality and governance.