Applies software engineering best practices to data and ML pipelines.
Provides tools for pipeline visualization, data cataloging, and project standardization.
Enables reproducible, maintainable, and modular data science and engineering code.
Pricing: Free forever
Best for: Individuals & startups
Pros & Cons
Pros
Promotes clean, reproducible, and maintainable code for data pipelines.
Standardizes project structure and encourages team collaboration.
Reduces time spent on 'plumbing' and allows focus on core problems.
Facilitates seamless transition from development to production.
Offers extensive integrations with popular data and ML tools.
Cons
Requires familiarity with Python and software engineering concepts.
Has a learning curve for users new to structured pipeline development.
Key Features
Pipeline Visualisation (Kedro-Viz) with data lineage and execution details
Data Catalog with lightweight connectors for various file formats and systems (S3, GCP, Azure, Pandas, Spark, etc.)
Project Template for standardizing configuration, code, tests, and notebooks
Dedicated IDE support for Visual Studio Code (code navigation, autocompletion)
Pipeline Abstraction with dataset-driven workflow and automatic dependency resolution
Coding Standards (pytest, Sphinx, ruff, Python logging)
Flexible Deployment (single/distributed machine, Argo, Prefect, Kubeflow, AWS SageMaker, Databricks)
Kedro is an open-source Python framework that applies software engineering best practices to the development of data and machine learning pipelines. It provides a structured approach to building reproducible, maintainable, and modular code, helping data scientists and engineers transition seamlessly from exploratory development to production.
The framework offers scaffolding for complex projects, standardizes code organization, and includes tools for pipeline visualization, data cataloging, and flexible deployment. By abstracting away tedious 'plumbing' tasks and automating dependency resolution, Kedro allows teams to focus on problem-solving and collaborate more effectively on data and ML initiatives.
The project is hosted by the Linux Foundation (LF AI & Data).
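To make "dataset-driven workflow and automatic dependency resolution" more concrete, here is a minimal sketch of the general idea in plain Python. It does not use Kedro's API. The functions `clean`, `total`, and `report`, the `NODES` list, and the `run` helper are all hypothetical names invented for this illustration. Each step declares the named datasets it reads and writes, and the execution order is derived from those names rather than from the order in which the steps are listed.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical toy steps; each is an ordinary function.
def clean(raw):
    return [x for x in raw if x is not None]

def total(cleaned):
    return sum(cleaned)

def report(cleaned, total_value):
    return f"{len(cleaned)} rows, total {total_value}"

# (function, input dataset names, output dataset name).
# Deliberately listed out of order to show that order is inferred.
NODES = [
    (report, ["cleaned", "total_value"], "summary"),
    (total, ["cleaned"], "total_value"),
    (clean, ["raw"], "cleaned"),
]

def run(nodes, catalog):
    """Run every node once, after the nodes that produce its inputs."""
    producers = {output: (func, inputs) for func, inputs, output in nodes}
    # Map each produced dataset to the produced datasets it depends on.
    # Inputs that no node produces (here "raw") must come from the catalog.
    graph = {
        output: [name for name in inputs if name in producers]
        for output, (_, inputs) in producers.items()
    }
    data = dict(catalog)
    for name in TopologicalSorter(graph).static_order():
        func, inputs = producers[name]
        data[name] = func(*(data[i] for i in inputs))
    return data

result = run(NODES, {"raw": [3, None, 4, 5]})
print(result["summary"])  # prints "3 rows, total 12"
```

The dictionary passed as `catalog` stands in for a data catalog: steps refer to data by name only, so where the data actually lives can change without touching the step functions. Kedro's own pipeline and catalog classes offer far more than this sketch; consult the official documentation for the real interfaces.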
How much does Kedro cost?
Kedro is an open-source project and is free to use. You can install it using pip or conda.
Is Kedro free?
Yes, Kedro is an open-source framework and is completely free to use.
Who is Kedro for?
Kedro is for data scientists, machine learning engineers, and data engineers who want to build production-ready, reproducible, and maintainable data and machine learning pipelines. It's also beneficial for product leads whose teams need to standardize their data science workflows.