Chapter 2: Pachyderm Basics

Pachyderm is a data science platform that enables data scientists to build an end-to-end machine learning workflow covering the most important stages of the machine learning life cycle, from data ingestion all the way to production.

If you are familiar with Git, a version control system for code, you will find many similarities between the core concepts of Git and Pachyderm. Version control systems such as Git, together with hosting services such as GitHub, have become an industry standard for developers worldwide. Git enables you to keep a history of changes in your code and go back to an earlier version when needed. Data scientists deserve a platform that will let them track the versions of their experiments, ...

