Comet for Data Science

Book description

Gain the key knowledge and skills required to manage data science projects using Comet

Key Features

  • Discover techniques to build, monitor, and optimize your data science projects
  • Move from prototyping to production using Comet and DevOps tools
  • Get to grips with the Comet experimentation platform

Book Description

This book provides concepts and practical use cases that you can use to quickly build, monitor, and optimize data science projects. Using Comet, you will learn how to manage almost every step of the data science process, from data collection through to creating, deploying, and monitoring a machine learning model.

The book starts by explaining the features of Comet, along with exploratory data analysis and model evaluation in Comet. You'll see how Comet gives you the freedom to choose from a selection of programming languages, depending on which is best suited to your needs. Next, you will focus on workspaces, projects, experiments, and models. You will also learn how to build a narrative from your data, using the features provided by Comet. Later, you will review the basic concepts behind DevOps and how to extend the GitLab DevOps platform with Comet, further enhancing your ability to deploy your data science projects. Finally, you will cover various use cases of Comet in machine learning, NLP, deep learning, and time series analysis, gaining hands-on experience with some of the most interesting and valuable data science techniques available.

By the end of this book, you will be able to confidently build data science pipelines according to bespoke specifications and manage them through Comet.

What you will learn

  • Prepare for your project with the right data
  • Understand the purposes of different machine learning algorithms
  • Get up and running with Comet to manage and monitor your pipelines
  • Understand how Comet works and how to get the most out of it
  • See how you can use Comet for machine learning
  • Discover how to integrate Comet with GitLab
  • Work with Comet for NLP, deep learning, and time series analysis

Who this book is for

This book is for anyone who has programming experience and wants to learn how to manage and optimize a complete data science lifecycle using Comet and other DevOps platforms. Although an understanding of basic data science and programming concepts is needed, no prior knowledge of Comet or DevOps is required.

Table of contents

  1. Comet for Data Science
  2. Foreword
  3. Contributors
  4. About the author
  5. About the reviewers
  6. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
    4. Download the example code files
    5. Download the color images
    6. Conventions used
    7. Get in touch
    8. Share Your Thoughts
  7. Section 1 – Getting Started with Comet
  8. Chapter 1: An Overview of Comet
    1. Technical requirements
      1. comet-ml
      2. matplotlib
      3. numpy
      4. pandas
      5. scikit-learn
    2. Motivation, purpose, and first access to the Comet platform
      1. Motivation
      2. Purpose
      3. First access to the Comet platform
    3. Getting started with workspaces, projects, experiments, and panels
      1. Workspaces
      2. Projects
      3. Experiments
      4. Panels
    4. First use case – tracking images in Comet
      1. Downloading the dataset
      2. Dataset cleaning
      3. Building the visualizations
      4. Integrating the graphs in Comet
      5. Building a panel
    5. Second use case – simple linear regression
      1. Initializing the context
      2. Defining, fitting, and evaluating the model
      3. Showing results in Comet
    6. Summary
    7. Further reading
  9. Chapter 2: Exploratory Data Analysis in Comet
    1. Technical requirements
      1. pandas Profiling
      2. seaborn
      3. sweetviz
    2. Introducing EDA
      1. Problem setting
      2. Data preparation
      3. Preliminary data analysis
      4. Preliminary results
    3. Exploring EDA techniques
      1. Loading and preparing the dataset
      2. Non-visual EDA
      3. Visual EDA
    4. Using Comet for EDA
      1. Comet logs
      2. Panels
      3. Comet Report
    5. Summary
    6. Further reading
  10. Chapter 3: Model Evaluation in Comet
    1. Technical requirements
    2. Introducing model evaluation
      1. Data splitting
      2. Choosing metrics
    3. Exploring model evaluation techniques
      1. Loading and preparing the dataset
      2. Regression
      3. Classification
      4. Clustering
    4. Using Comet for model evaluation
      1. Comet Log
      2. Comet Dashboard
      3. Registry
      4. Reports
    5. Summary
    6. Further reading
  11. Section 2 – A Deep Dive into Comet
  12. Chapter 4: Workspaces, Projects, Experiments, and Models
    1. Technical requirements
      1. Python
      2. R
      3. Java
    2. Exploring the Comet UI
      1. Workspaces
      2. Projects
    3. Using experiments and models
      1. Experiments
      2. Models
    4. Exploring other languages supported by Comet
      1. R
      2. Java
    5. First use case – offline and existing experiments
      1. Running an offline experiment
      2. Continuing an existing experiment
      3. Improving an existing experiment offline
    6. Second use case – model optimization
      1. Creating and configuring an Optimizer
      2. Optimizing the model
      3. Showing the results in Comet
    7. Summary
    8. Further reading
  13. Chapter 5: Building a Narrative in Comet
    1. Technical requirements
    2. Discovering the DIKW pyramid
      1. Data
      2. Information
      3. Knowledge
      4. Wisdom
    3. Moving from data to wisdom
      1. Turning data into information
      2. Turning information into knowledge
      3. Turning knowledge into wisdom
    4. Choosing the correct chart type
      1. A line chart
      2. A bar chart
      3. An area chart
      4. A pie chart
    5. Using Comet to build a narrative
      1. Using JavaScript panels
      2. Building advanced reports
    6. Summary
    7. Further reading
  14. Chapter 6: Integrating Comet into DevOps
    1. Technical requirements
      1. Python
      2. Docker
      3. Kubernetes
    2. Exploring DevOps and MLOps principles and best practices
      1. The DevOps life cycle
      2. Moving from DevOps to MLOps
    3. Combining Comet and DevOps/MLOps
      1. Comet in the DevOps life cycle
      2. Setting up the Comet REST API service
      3. Using the Comet REST API
    4. Implementing Docker
      1. Overview of Docker
      2. Running Comet in a Docker container
    5. Implementing Kubernetes
      1. The Kubernetes architecture
      2. Configuring Kubernetes
      3. Deploying a local Kubernetes cluster
    6. Summary
    7. Further reading
  15. Chapter 7: Extending the GitLab DevOps Platform with Comet
    1. Technical requirements
      1. Python
      2. Git client
    2. Introducing the concept of CI/CD
      1. An overview of CI/CD
      2. The concept of an SCS
      3. The CI/CD workflow
    3. Implementing the CI/CD workflow in GitLab
      1. Creating/modifying a GitLab project
      2. Exploring GitLab's internal structure
      3. Exploring GitLab concepts for CI/CD
      4. Building the CI/CD pipeline
      5. Creating a release
    4. Integrating Comet with GitLab
      1. Running Comet in the CI/CD workflow
      2. Using webhooks
    5. Integrating Docker with the CI/CD workflow
    6. Summary
    7. Further reading
  16. Section 3 – Examples and Use Cases
  17. Chapter 8: Comet for Machine Learning
    1. Technical requirements
      1. shap
    2. Introducing machine learning
      1. Exploring the machine learning workflow
      2. Classifying machine learning systems
      3. Exploring machine learning challenges
      4. Explaining machine learning models
    3. Reviewing the main machine learning models
      1. Supervised learning
      2. Unsupervised learning
    4. Reviewing the scikit-learn package
      1. Preprocessing
      2. Dimensionality reduction
      3. Model selection
      4. Supervised and unsupervised learning
    5. Building a machine learning project from setup to report
      1. Reviewing the scenario
      2. Selecting the best model
      3. Calculating the SHAP value
      4. Building the final report
    6. Summary
    7. Further reading
  18. Chapter 9: Comet for Natural Language Processing
    1. Technical requirements
    2. Introducing basic NLP concepts
      1. Exploring the NLP workflow
      2. Classifying NLP systems
      3. Exploring NLP challenges
      4. Reviewing the most popular models’ hubs
    3. Exploring the Spark NLP package
      1. Introducing the Spark NLP package
      2. Integrating Spark NLP with Comet
    4. Setting up the environment for Spark NLP
      1. Installing Java
      2. Installing Scala (optional)
      3. Installing Apache Spark
      4. Installing PySpark and Spark NLP
    5. Using NLP, from project setup to report building
      1. Configuring the environment
      2. Loading the dataset
      3. Implementing a pretrained pipeline
      4. Logging results in Comet
      5. Using a custom pipeline
      6. Building the final report
    6. Summary
    7. Further reading
  19. Chapter 10: Comet for Deep Learning
    1. Technical requirements
      1. gradio
      2. TensorFlow
    2. Introducing basic deep learning concepts
      1. Introducing neural networks
      2. Exploring the difference between deep learning and neural networks
      3. Classifying deep learning networks
    3. Exploring the TensorFlow package
      1. Introducing the TensorFlow package
      2. Integrating TensorFlow with Comet
    4. Using deep learning – from project setup to report building
      1. Introducing Gradio
      2. Loading the dataset
      3. Implementing a basic model
      4. Exploring results in Comet
      5. Building a prediction interface
      6. Building the final report
    5. Summary
    6. Further reading
  20. Chapter 11: Comet for Time Series Analysis
    1. Technical requirements
      1. Prophet
      2. statsmodels
    2. Introducing basic concepts related to time series analysis
      1. Loading a time series in Python
      2. Checking whether a time series is stationary
      3. Exploring the time series components
      4. Identifying breakpoints in a time series
    3. Exploring the Prophet package
      1. Introducing the Prophet package
      2. Integrating Prophet with Comet
    4. Using time series analysis from project setup to report building
      1. Configuring the Deepnote environment
      2. Loading and preparing the dataset
      3. Checking stationarity in data
      4. Building the models
      5. Exploring results in Comet
      6. Building the final report
    5. Summary
    6. Further reading
    7. Why subscribe?
  21. Other Books You May Enjoy
    1. Packt is searching for authors like you
    2. Share Your Thoughts

Product information

  • Title: Comet for Data Science
  • Author(s): Angelica Lo Duca, Gideon Mendels
  • Release date: August 2022
  • Publisher(s): Packt Publishing
  • ISBN: 9781801814430