DVC

Manage data and machine learning models like code with Git-like version control.

Tracked since 2026
0 reviews tracked

The Bottom Line

Entry price

Free plan available, paid tiers above

Biggest pro

Free and open source

Biggest con

Requires familiarity with Git concepts

TL;DR - DVC

  • Applies Git-like version control to data and machine learning models.
  • Enables reproducibility, collaboration, and traceability for data science projects.
  • Scalable for both individual data scientists and enterprise AI teams.
Pricing: Free plan available
Best for: Growing teams

What is DVC?

Editorial review
DVC (Data Version Control) brings software engineering practices, specifically Git-like version control, to data, AI/ML, and data science teams. It lets users manage data and models the same way they manage code, which supports reproducibility, collaboration, and traceability in data-intensive projects.

The vendor positions two tiers. For individual data scientists, DVC is an easy-to-use Git extension for small data science projects that adds data version control to existing workflows with minimal overhead. For enterprise AI and data engineering teams, the plans listed on dvc.org point to lakeFS, a highly scalable data version control infrastructure for complex AI operations and big data environments, including petabyte-scale multimodal object stores and data lakes.

By applying version control to data, DVC helps teams track changes, revert to previous versions, and keep data consistent across the stages of their machine learning pipelines. The goal is to make data science projects more robust, collaborative, and easier to manage, much as Git did for software development.

Available on: Web, macOS, Linux, Windows

Pros & Cons

Pros

  • Free and open source
  • Brings software engineering best practices to data science
  • Enhances reproducibility and collaboration
  • Scalable for various project sizes
  • Integrates well with existing Git workflows

Cons

  • Requires familiarity with Git concepts
  • May have a learning curve for new users

Key Features

  • Git-like data version control
  • Integration with Git
  • Support for large datasets and models
  • Scalable for enterprise AI operations
  • Compatible with object stores and data lakes
  • VS Code extension available

Pricing Plans

lakeFS Enterprise

Contact us

  • Highly scalable data version control infrastructure
  • Designed for complex AI operations and big data environments
  • Petabyte-scale multimodal object stores and data lakes

lakeFS (Free and open source)

Free

DVC (Free and open source)

Free

  • Easy to use data version control Git extension
  • For small data science projects
  • Apply data version control to your data science workflows with minimal overhead

DVC FAQ

How does DVC integrate with existing Git workflows for data science projects?

DVC functions as a Git extension, allowing data scientists to apply version control practices directly to their data within their established Git repositories. This integration enables tracking of data and models alongside code with minimal overhead, streamlining data science workflows.
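
The general idea of tracking data alongside code can be sketched in a few lines of Python. This is an illustration of the pointer-file pattern only, not DVC's implementation: the function names `track` and `checkout`, the `.ptr` suffix, the JSON layout, and the choice of SHA-256 are all assumptions made for this example. The data file is copied into a content-addressed cache, and a small pointer file, which is what Git would version, records its hash.

```python
import hashlib
import json
import shutil
import tempfile
from pathlib import Path


def track(data_file: Path, cache_dir: Path) -> Path:
    """Copy data_file into a content-addressed cache and write a pointer file.

    The pointer file (hypothetical ".ptr" suffix) is small enough to commit
    to Git; the data itself stays in cache_dir, outside version control.
    """
    digest = hashlib.sha256(data_file.read_bytes()).hexdigest()
    cached = cache_dir / digest[:2] / digest[2:]
    cached.parent.mkdir(parents=True, exist_ok=True)
    if not cached.exists():
        shutil.copy2(data_file, cached)
    pointer = data_file.with_name(data_file.name + ".ptr")
    pointer.write_text(json.dumps({"sha256": digest, "path": data_file.name}))
    return pointer


def checkout(pointer: Path, cache_dir: Path) -> Path:
    """Restore the data version that a pointer file refers to."""
    meta = json.loads(pointer.read_text())
    digest = meta["sha256"]
    target = pointer.with_name(meta["path"])
    shutil.copy2(cache_dir / digest[:2] / digest[2:], target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        data = root / "train.csv"
        data.write_text("x,y\n1,2\n")
        pointer = track(data, root / "cache")
        print(pointer.name, pointer.read_text())
```

Because each pointer file changes whenever the data changes, checking out an older Git commit and then restoring from the pointer brings back the matching data version.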

What is the primary use case for DVC compared to lakeFS?

DVC is designed for individual data scientists and small data science projects, providing an easy-to-use Git extension for data version control. In contrast, lakeFS is a highly scalable data version control infrastructure built for enterprise AI and data engineering teams managing petabyte-scale multimodal object stores and data lakes.

Can DVC manage large datasets, or is it better suited for smaller data science projects?

DVC is described as an 'easy to use data version control Git extension for small data science projects.' It brings software engineering best practices to data, but it is positioned for projects with smaller data footprints; petabyte-scale management is left to solutions like lakeFS.

What kind of data storage does DVC support for versioning?

DVC does not store large data files in the Git repository itself. Git tracks small metadata files that point to the data, while the data contents are kept in a local cache and can be pushed to remote storage. Supported remotes include Amazon S3, Google Cloud Storage, Azure Blob Storage, SSH servers, HDFS, and local or network file systems.
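
To make the separation between Git and data storage concrete, here is a minimal Python sketch in which a plain directory stands in for external storage such as an object store. It is not DVC's code: `add_to_cache`, `push`, `pull`, the two-level directory layout, and SHA-256 hashing are assumptions chosen for illustration. Objects are addressed by their content hash, so a push only copies what the storage location does not already hold.

```python
import hashlib
import shutil
from pathlib import Path


def object_path(root: Path, digest: str) -> Path:
    """Map a content hash to a file path under root (hypothetical layout)."""
    return root / digest[:2] / digest[2:]


def add_to_cache(data: bytes, cache: Path) -> str:
    """Store bytes in the local cache and return their content hash."""
    digest = hashlib.sha256(data).hexdigest()
    dest = object_path(cache, digest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    dest.write_bytes(data)
    return digest


def push(cache: Path, remote: Path) -> int:
    """Copy cached objects that the remote lacks; return how many were copied."""
    copied = 0
    for obj in cache.glob("*/*"):
        if not obj.is_file():
            continue
        dest = object_path(remote, obj.parent.name + obj.name)
        if not dest.exists():
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(obj, dest)
            copied += 1
    return copied


def pull(digest: str, cache: Path, remote: Path) -> Path:
    """Fetch one object from the remote into the local cache if it is missing."""
    dest = object_path(cache, digest)
    if not dest.exists():
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(object_path(remote, digest), dest)
    return dest
```

In this sketch a teammate who has only the hash (for example, from a pointer file committed to Git) can call `pull` against the shared location to obtain the same bytes.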

How does DVC facilitate collaboration in data science teams?

By applying a Git-like model to data, DVC enables data science teams to manage data collaboratively, similar to how code is managed. This allows for versioning, tracking changes, and sharing data and models effectively among team members, fostering better collaboration and reproducibility.

Source: dvc.org
