Preface

Pachyderm is a distributed version control platform for building end-to-end data science workflows. Since its creation in 2016, Pachyderm has become a go-to solution for organizations large and small. Its core functionality is open source and is supported by a vibrant community of engineers. This book walks you through basic and advanced examples of Pachyderm usage, helping you get started quickly and integrate a reliable data science solution into your infrastructure.

Reproducible Data Science with Pachyderm provides a clear overview of Pachyderm, along with instructions for installing and running it in the cloud and for using Pachyderm Hub, the Software-as-a-Service (SaaS) version of Pachyderm. This book ...
