lakeFS

Apply Git-like version control to your data lake for reproducible AI and streamlined data workflows.

Tracked since 2026
0 reviews tracked

The Bottom Line

Entry price

Free plan available, paid tiers above

Biggest pro

Accelerates AI delivery and development velocity.

Biggest con

Specific advanced features like Iceberg REST Catalog and Metadata Search are only available in the Enterprise plan.

TL;DR - lakeFS

  • Applies Git-like version control to data lakes for managing data lifecycle and provenance.
  • Enables isolated testing, instant rollbacks, and reproducible AI/ML training.
  • Integrates with existing data and AI stacks, supporting various storage, compute, and orchestration tools.
Pricing: Free plan available
Best for: Growing teams

What is lakeFS?

Editorial review
lakeFS is a data version control system that brings software engineering practices to data management. It acts as a control plane for AI-ready data, giving teams a single place to manage the data lifecycle, provenance, and access for AI and data initiatives.

With lakeFS, teams can test pipeline and model changes in isolation against production data without creating copies, roll back instantly after a data incident, and enforce data quality and compliance standards. It makes training reproducible by tracking the data used in experiments and model training, exposes the full data history through a built-in audit trail, and helps satisfy model governance requirements. It also reduces data access friction: users can work with any tool on remote data as if it were local, manage access permissions across all storage from one place, and keep GPUs busy instead of waiting for data.

lakeFS integrates with a wide range of object stores, compute engines, ingest technologies, data formats, orchestration tools, and ML/AI stacks, which makes it a fit for organizations that want to accelerate AI delivery, ensure reproducibility, and reduce data friction.
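
To make the "isolated testing without copies" and "instant rollback" claims concrete, here is a minimal conceptual sketch in Python. It is not the lakeFS API or its actual implementation: ToyRepo, Commit, and the s3://example-bucket addresses are hypothetical names invented for illustration. The idea it models is that a commit maps logical paths to physical object addresses and a branch is only a pointer to a commit, so creating a branch or rolling back moves a pointer instead of copying data.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Commit:
    """Hypothetical snapshot: logical path -> physical object address."""
    objects: Dict[str, str]
    message: str
    parent: Optional["Commit"] = None


class ToyRepo:
    """Toy model of Git-like branching over object storage (illustration only)."""

    def __init__(self) -> None:
        self.branches: Dict[str, Commit] = {
            "main": Commit(objects={}, message="initial")
        }

    def create_branch(self, name: str, source: str) -> None:
        # Zero-copy: the new branch points at the same commit as its source.
        self.branches[name] = self.branches[source]

    def commit(self, branch: str, changes: Dict[str, Optional[str]], message: str) -> Commit:
        head = self.branches[branch]
        objects = dict(head.objects)  # copies metadata only, never the data itself
        for path, address in changes.items():
            if address is None:
                objects.pop(path, None)  # None marks a deletion
            else:
                objects[path] = address
        new_head = Commit(objects=objects, message=message, parent=head)
        self.branches[branch] = new_head
        return new_head

    def merge(self, source: str, dest: str) -> None:
        # Simplification: fast-forward merges only. All changes become
        # visible on dest at once because a single pointer moves.
        src, dst = self.branches[source], self.branches[dest]
        node: Optional[Commit] = src
        while node is not None and node is not dst:
            node = node.parent
        if node is None:
            raise ValueError("branches diverged; this toy only fast-forwards")
        self.branches[dest] = src

    def rollback(self, branch: str, steps: int = 1) -> None:
        # Rollback moves the branch pointer back to an earlier commit.
        head = self.branches[branch]
        for _ in range(steps):
            if head.parent is None:
                raise ValueError("cannot roll back past the initial commit")
            head = head.parent
        self.branches[branch] = head


repo = ToyRepo()
repo.commit("main", {"events/day1.parquet": "s3://example-bucket/obj-a1"}, "load day 1")

# Test a change in isolation: main is untouched while the experiment runs.
repo.create_branch("experiment", source="main")
repo.commit("experiment", {"events/day1.parquet": "s3://example-bucket/obj-b2"}, "rewrite day 1")
print(repo.branches["main"].objects["events/day1.parquet"])  # s3://example-bucket/obj-a1

# Promote atomically, then undo the promotion after a hypothetical incident.
repo.merge("experiment", "main")
repo.rollback("main")
print(repo.branches["main"].objects["events/day1.parquet"])  # s3://example-bucket/obj-a1
```

The sketch leaves out everything that makes a real system hard (three-way merges, conflict handling, garbage collection of unreferenced objects, access control), so read it as a mental model rather than a description of how lakeFS works internally.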

Available on: Web

Pros & Cons

Pros

  • Accelerates AI delivery and development velocity.
  • Ensures data quality and compliance with isolated testing and instant rollbacks.
  • Reduces storage costs by avoiding data duplication.
  • Streamlines data science and MLOps workflows.
  • Provides transparent, traceable, and repeatable development for AI.

Cons

  • Specific advanced features like Iceberg REST Catalog and Metadata Search are only available in the Enterprise plan.
  • SOC2 support is exclusively offered in the Enterprise plan.
  • The managed lakeFS Cloud service currently runs only on AWS, Azure, and GCP, which may limit options for teams on other cloud platforms.

Key Features

  • Format-Agnostic Data Version Control
  • Cloud-Agnostic
  • Zero Clone copy for isolated environment (via branches)
  • Atomic Data Promotion (via merges)
  • Configurable Garbage Collection
  • Data CI/CD Using lakeFS Hooks
  • Role-Based Access Control (RBAC)
  • Integrates with Your Data Stack
  • Audit Logs

Pricing Plans

Pricing checked May 28, 2026

Open Source

Free forever

  • Format-Agnostic Data Version Control
  • Cloud-Agnostic
  • Zero Clone copy for isolated environment (via branches)
  • Atomic Data Promotion (via merges)
  • Data Stays in One Place
  • Configurable Garbage Collection
  • Data CI/CD Using lakeFS Hooks
  • Integrates with Your Data Stack
  • Role-Based Access Control (RBAC)
  • Single Sign On (SSO)
  • SCIM Support
  • IAM Roles
  • Mount Capability
  • Audit Logs
  • Transactional Mirroring
  • Multiple Storage Backends Support
  • Simplified Garbage Collection (Managed or Standalone)
  • SLA
  • Run locally

Enterprise

Contact sales

  • Unlimited seats
  • Format-Agnostic Data Version Control
  • Cloud-Agnostic
  • Zero Clone copy for isolated environment (via branches)
  • Atomic Data Promotion (via merges)
  • Data Stays in One Place
  • Configurable Garbage Collection
  • Data CI/CD Using lakeFS Hooks
  • Integrates with Your Data Stack
  • Role-Based Access Control (RBAC)
  • Single Sign On (SSO)
  • SCIM Support
  • IAM Roles
  • Mount Capability
  • Audit Logs
  • Transactional Mirroring
  • Iceberg REST Catalog
  • Metadata Search
  • Multiple Storage Backends Support
  • Simplified Garbage Collection (Managed or Standalone)
  • SOC2 Support
  • SLA
  • Run locally

lakeFS FAQ

What is lakeFS?

lakeFS is a data version control system that applies Git-like operations to data lakes. It helps manage the data lifecycle, provenance, and unified access for AI and data teams, enabling reproducible experiments, isolated testing, and instant rollbacks for data incidents.

How much does lakeFS cost?

lakeFS follows a freemium model. The Open Source version is free forever, and the Enterprise plan, which includes unlimited seats, is priced on request through sales.

Is lakeFS free?

Yes, lakeFS has a free Open Source version that is available forever. There is also a paid Enterprise plan with additional features.

Who is lakeFS for?

lakeFS is for AI and data teams, data engineers, ML engineers, and organizations looking to manage data at scale, accelerate their data, AI, and ML initiatives, ensure reproducibility, and improve data quality and governance.

Source: lakefs.io
