
LakeFS
Apply Git-like version control to your data lake for reproducible AI and streamlined data workflows.
Freemium
Tracked since 2026
0 reviews tracked
The Bottom Line
Entry price
Free plan available, paid tiers above
Biggest pro
Accelerates AI delivery and development velocity.
Biggest con
Specific advanced features like Iceberg REST Catalog and Metadata Search are only available in the Enterprise plan.
TL;DR - LakeFS
- Applies Git-like version control to data lakes for managing data lifecycle and provenance.
- Enables isolated testing, instant rollbacks, and reproducible AI/ML training.
- Integrates with existing data and AI stacks, supporting various storage, compute, and orchestration tools.
Pricing: Free plan available
Best for: Growing teams
What is LakeFS?
lakeFS is a data version control system that brings software engineering best practices to data management, with the stated aim of bridging the AI infrastructure gap. It provides a control plane for AI-ready data, enabling teams to manage the data lifecycle, provenance, and unified access for AI and data initiatives. Built on a scalable architecture, lakeFS lets users test pipeline and model changes in isolation on production data without creating copies, roll back instantly after data incidents, and enforce data quality and compliance standards.
The platform helps make training reproducible by tracking data used in experiments and model training, offering full visibility into data history with a built-in audit trail, and automatically satisfying model governance requirements. It also reduces data access friction by allowing users to work with any tool on remote data as if it were local, manage access permissions across all storage from one place, and keep GPUs busy without waiting for data. lakeFS integrates seamlessly with a wide range of object storage solutions, compute engines, ingest technologies, data formats, orchestration tools, and ML/AI stacks, making it a versatile solution for organizations looking to accelerate AI delivery, ensure reproducibility, and reduce data friction.
Available on: Web
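The isolation and rollback behavior described above can be illustrated with a toy in-memory model. This is a conceptual sketch only, not the lakeFS API or its internal design: the `ToyRepo` class and all of its method names are hypothetical. It assumes a branch is just a pointer to a commit and that object contents are stored once by hash, which is why creating a branch copies no data.

```python
"""Toy in-memory model of Git-like data versioning (hypothetical, not the lakeFS API)."""
import hashlib


class ToyRepo:
    def __init__(self):
        self.blobs = {}                  # content hash -> bytes, each stored once
        self.commits = {}                # commit id -> {"parent": id, "tree": {path: hash}}
        self.branches = {"main": None}   # branch name -> head commit id
        self.staged = {"main": {}}       # branch name -> uncommitted {path: hash}

    def _tree(self, branch):
        head = self.branches[branch]
        return dict(self.commits[head]["tree"]) if head else {}

    def put(self, branch, path, data):
        digest = hashlib.sha256(data).hexdigest()
        self.blobs.setdefault(digest, data)
        self.staged[branch][path] = digest

    def commit(self, branch, message):
        tree = self._tree(branch)
        tree.update(self.staged[branch])
        parent = self.branches[branch]
        payload = repr((parent, sorted(tree.items()), message)).encode()
        commit_id = hashlib.sha256(payload).hexdigest()[:12]
        self.commits[commit_id] = {"parent": parent, "tree": tree}
        self.branches[branch] = commit_id
        self.staged[branch] = {}
        return commit_id

    def create_branch(self, name, source):
        # Only the head pointer is copied; no blob is duplicated.
        self.branches[name] = self.branches[source]
        self.staged[name] = {}

    def read(self, branch, path):
        tree = self._tree(branch)
        tree.update(self.staged[branch])
        return self.blobs[tree[path]]

    def rollback(self, branch, commit_id):
        # Move the branch head back to a known-good commit.
        if commit_id not in self.commits:
            raise KeyError(commit_id)
        self.branches[branch] = commit_id
        self.staged[branch] = {}


repo = ToyRepo()
repo.put("main", "users.csv", b"id,name\n1,Ada\n")
good = repo.commit("main", "initial load")

# Branch for an experiment: production data is visible, nothing is copied.
repo.create_branch("experiment", "main")
blobs_after_branch = len(repo.blobs)
repo.put("experiment", "users.csv", b"id,name\n1,ADA\n")
repo.commit("experiment", "uppercase names")

# Simulate a data incident on main, then roll back to the good commit.
repo.put("main", "users.csv", b"corrupted")
repo.commit("main", "bad write")
repo.rollback("main", good)
```

In this sketch the experiment never changes what readers of `main` see, and the rollback is a pointer move rather than a data restore.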
Pros & Cons
Pros
- Accelerates AI delivery and development velocity.
- Ensures data quality and compliance with isolated testing and instant rollbacks.
- Reduces storage costs by avoiding data duplication.
- Streamlines data science and MLOps workflows.
- Provides transparent, traceable, and repeatable development for AI.
Cons
- Specific advanced features like Iceberg REST Catalog and Metadata Search are only available in the Enterprise plan.
- SOC2 support is exclusively offered in the Enterprise plan.
- The managed service, lakeFS Cloud, currently supports AWS, Azure, and GCP, which may limit options for users on other cloud platforms.
Key Features
- Format-Agnostic Data Version Control
- Cloud-Agnostic
- Zero Clone copy for isolated environment (via branches)
- Atomic Data Promotion (via merges)
- Configurable Garbage Collection
- Data CI/CD Using lakeFS Hooks
- Role-Based Access Control (RBAC)
- Integrates with Your Data Stack
- Audit Logs
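Two of the features listed above, data CI/CD using hooks and atomic promotion via merges, work together: checks run before a merge, and the target changes only if every check passes. The following is a minimal sketch of that idea over plain dictionaries. It is not the lakeFS hooks API or its configuration format; `merge`, `HookFailed`, and both check functions are hypothetical names, and deletions are ignored for brevity.

```python
"""Toy sketch of a pre-merge quality gate (hypothetical, not the lakeFS hooks API)."""


class HookFailed(Exception):
    """Raised when a pre-merge check rejects an object."""


def no_empty_files(path, data):
    return len(data) > 0


def csv_has_id_header(path, data):
    # Assumed convention for this example: CSV files start with an "id" column.
    if not path.endswith(".csv"):
        return True
    return data.split(b"\n", 1)[0].startswith(b"id,")


def merge(source, target, hooks):
    """Promote added or changed objects from source into target, all or nothing."""
    changed = {p: d for p, d in source.items() if target.get(p) != d}
    for path, data in changed.items():
        for hook in hooks:
            if not hook(path, data):
                raise HookFailed(f"{hook.__name__} rejected {path}")
    target.update(changed)  # reached only if every check passed
    return sorted(changed)


hooks = [no_empty_files, csv_has_id_header]
main = {"users.csv": b"id,name\n1,Ada\n"}

# A branch that adds a valid file is promoted.
promoted = merge({**main, "orders.csv": b"id,total\n7,19.99\n"}, main, hooks)

# A branch that adds an empty file is rejected and main stays unchanged.
try:
    merge({**main, "empty.csv": b""}, main, hooks)
    rejected = False
except HookFailed:
    rejected = True
```

Running the checks before touching the target is what makes the promotion atomic in this sketch: consumers of `main` see either the whole change or none of it.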
Pricing Plans
Pricing checked May 28, 2026
Open Source
Free forever
- Format-Agnostic Data Version Control
- Cloud-Agnostic
- Zero Clone copy for isolated environment (via branches)
- Atomic Data Promotion (via merges)
- Data Stays in One Place
- Configurable Garbage Collection
- Data CI/CD Using lakeFS Hooks
- Integrates with Your Data Stack
- Role-Based Access Control (RBAC)
- Single Sign On (SSO)
- SCIM Support
- IAM Roles Mount Capability
- Audit Logs
- Transactional Mirroring
- Multiple Storage Backends Support
- Simplified Garbage Collection (Managed or Standalone)
- SLA
- Run locally
Enterprise
Contact sales
- Unlimited seats
- Format-Agnostic Data Version Control
- Cloud-Agnostic
- Zero Clone copy for isolated environment (via branches)
- Atomic Data Promotion (via merges)
- Data Stays in One Place
- Configurable Garbage Collection
- Data CI/CD Using lakeFS Hooks
- Integrates with Your Data Stack
- Role-Based Access Control (RBAC)
- Single Sign On (SSO)
- SCIM Support
- IAM Roles Mount Capability
- Audit Logs
- Transactional Mirroring
- Iceberg REST Catalog
- Metadata Search
- Multiple Storage Backends Support
- Simplified Garbage Collection (Managed or Standalone)
- SOC2 Support
- SLA
- Run locally
LakeFS FAQ
What is LakeFS?
lakeFS is a data version control system that applies Git-like operations to data lakes. It helps manage the data lifecycle, provenance, and unified access for AI and data teams, enabling reproducible experiments, isolated testing, and instant rollbacks for data incidents.
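One way to picture the reproducibility claim is that each training run records an identifier for the exact data version it used. The sketch below assumes a hypothetical scheme in which that identifier is a hash over a manifest of object hashes. The function names are illustrative only and do not come from lakeFS, which exposes its own commit identifiers.

```python
"""Toy sketch of pinning a training run to a data version (hypothetical names)."""
import hashlib
import json


def snapshot_id(objects):
    """Deterministic fingerprint of a {path: bytes} snapshot."""
    manifest = {p: hashlib.sha256(d).hexdigest() for p, d in objects.items()}
    canonical = json.dumps(manifest, sort_keys=True).encode()
    return hashlib.sha256(canonical).hexdigest()[:16]


def record_run(model_name, objects, params):
    """Store the data version next to the hyperparameters of a run."""
    return {"model": model_name, "data_version": snapshot_id(objects), "params": params}


def can_reproduce(run, objects):
    """True if the given snapshot is exactly the data the run was trained on."""
    return run["data_version"] == snapshot_id(objects)


train = {"train/a.parquet": b"rows-v1", "train/b.parquet": b"rows-v1"}
run = record_run("churn-model", train, {"lr": 0.01})

# Any change to any object yields a different data version.
drifted = {**train, "train/b.parquet": b"rows-v2"}
```

With the data version stored in the run record, a later audit can check whether a given snapshot matches what the model actually saw.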
How much does LakeFS cost?
lakeFS follows a freemium model. The Open Source version is free forever, and the Enterprise plan, which includes unlimited seats, is priced on request through sales.
Is LakeFS free?
Yes, lakeFS has a free Open Source version that is available forever. There is also a paid Enterprise plan with additional features.
Who is LakeFS for?
lakeFS is for AI and data teams, data engineers, ML engineers, and organizations looking to manage data at scale, accelerate their data, AI, and ML initiatives, ensure reproducibility, and improve data quality and governance.
Source: lakefs.io