Data Engineering

Managing Data Downtime

Published by O'Reilly Media, Inc.

Content level: Beginner to intermediate

How to apply observability to your data pipelines

This live event utilizes Jupyter Notebook technology

Do your product dashboards look funky? Are your quarterly reports way off? Are you sick and tired of running a SQL query only to discover that the dataset you’re using is broken or just plain wrong? These errors are highly costly and affect almost every team, yet they’re typically only addressed on an ad hoc basis and in a reactive manner.

As companies increasingly rely on data to guide operations and drive decision making, you need to ensure that your data pipelines are consistently healthy and reliable. In the same way that software developers tackle application downtime, data professionals face their own availability challenge: data downtime, the periods when your data is partial, erroneous, missing, or otherwise inaccurate. To identify and eliminate data downtime, teams must leverage the five pillars of data observability and embrace automated checks to monitor pipeline performance.

Join experts Barr Moses and Ryan Kearns to learn how to minimize data downtime and gain observability into your data ecosystem. You’ll explore the concept of data downtime and see how to measure it to assess the quality and health of your data, using SQL, a sample data table, and a Jupyter notebook. From there, you’ll apply software engineering principles of observability to your data through five key pillars of data health: volume, schema, lineage, freshness, and distribution. Along the way, you’ll set service-level objectives for data observability on your data table and implement basic data observability checks. You’ll end by building your own anomaly detection algorithm to capture data downtime incidents in your data table.

What you’ll learn and how you can apply it

By the end of this live online course, you’ll understand:

  • What data downtime is and how to measure it
  • How to determine the quality of your data
  • The five pillars of data observability
  • How to set SLOs for data observability
  • Basic data observability checks
  • Best practices for eliminating data downtime

And you’ll be able to:

  • Apply best practices from DevOps to data analytics and data engineering
  • Write SQL scripts that accomplish basic data observability checks (see the sketch after this list)
  • Identify broken data pipelines
  • Perform basic data lineage searches
  • Set alerts for data quality issues
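
To make the SQL checks concrete, here is a minimal sketch of a freshness check and a volume check, run from a notebook with Python’s built-in sqlite3 module. The database, table, and column names (pipeline.db, events, date_added) are hypothetical stand-ins for the course’s own sample table:

    import sqlite3

    # Hypothetical database and table; the course's sample data will differ.
    conn = sqlite3.connect("pipeline.db")

    # Freshness: how many days since the table last received new rows?
    freshness_sql = """
    SELECT MAX(date_added) AS last_update,
           julianday('now') - julianday(MAX(date_added)) AS days_since_update
    FROM events;
    """

    # Volume: daily row counts, to spot days where ingestion dropped off.
    volume_sql = """
    SELECT DATE(date_added) AS day, COUNT(*) AS row_count
    FROM events
    GROUP BY day
    ORDER BY day;
    """

    print(conn.execute(freshness_sql).fetchone())
    print(conn.execute(volume_sql).fetchall())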

This live event is for you because...

  • You’re a data professional who depends on reliable, accurate data to generate rich analytics and won’t settle for anything less.
  • You have a love-hate relationship with SQL and are constantly on the lookout for query hacks.
  • You believe that data downtime doesn’t receive the diligence it deserves.
  • You want to learn new ways to fold observability best practices into your data management routine.

Prerequisites

  • A basic understanding of SQL
  • Familiarity with common data warehouse technologies and the principles of DevOps observability

Schedule

The time frames are only estimates and may vary according to how the class is progressing.

Introducing data downtime (40 minutes)

  • Presentation: Walk-through of a data downtime incident; defining data downtime; what data downtime looks like under the hood; measuring data downtime
  • Group discussion: Have you encountered data downtime in your pipelines or analytics?; How much time do you spend on data downtime incidents?
  • Jupyter Notebook exercise: Find the data issues in a dataset; measure data downtime (see the sketch after this module)
  • Q&A
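
As a rough illustration of the measurement exercise (the notebook’s own metric may differ), one simple way to quantify data downtime is to count the days in a reporting window on which a table received no new rows. The ingestion log below is invented for the example:

    from datetime import date, timedelta

    # Hypothetical ingestion log: day -> rows ingested. Missing days and
    # zero-row days both count as downtime here.
    rows_per_day = {
        date(2024, 1, 1): 1520,
        date(2024, 1, 2): 1498,
        # Jan 3 is absent entirely: the pipeline never ran.
        date(2024, 1, 4): 0,  # the pipeline ran but loaded nothing
        date(2024, 1, 5): 1510,
    }

    start, end = date(2024, 1, 1), date(2024, 1, 5)
    window = [start + timedelta(days=i) for i in range((end - start).days + 1)]

    downtime_days = [d for d in window if rows_per_day.get(d, 0) == 0]
    print(f"{len(downtime_days)} downtime day(s) of {len(window)}: {downtime_days}")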

Break (5 minutes)

Introducing data observability (40 minutes)

  • Presentation: Traditional methods of data quality monitoring—row counts and ad hoc queries; additional important measurements; applying best practices from software engineering and DevOps observability to data—SLOs, SLAs, and monitoring, alerting, and triaging; the five pillars of data observability—volume, schema, freshness, lineage, and distribution
  • Jupyter Notebook exercise: Identify the five pillars of data observability in your dataset (see the sketch after this module)
  • Q&A
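
As one concrete example of the distribution pillar, a per-field null-rate query can surface an upstream change: a sudden jump in nulls often means a broken join or schema drift. This is a sketch, and the column names (user_id, amount) are hypothetical:

    import sqlite3

    conn = sqlite3.connect("pipeline.db")  # hypothetical database

    # Distribution: the share of NULLs per field. Track this over time and
    # alert when it jumps well above its historical baseline.
    null_rate_sql = """
    SELECT
      AVG(CASE WHEN user_id IS NULL THEN 1.0 ELSE 0.0 END) AS user_id_null_rate,
      AVG(CASE WHEN amount  IS NULL THEN 1.0 ELSE 0.0 END) AS amount_null_rate
    FROM events;
    """
    print(conn.execute(null_rate_sql).fetchone())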

Break (5 minutes)

Detecting data anomalies (40 minutes)

  • Presentation: What is anomaly detection?; What are data anomalies, and how do you find them? (manual approaches, how AI can help); signs you have anomalous data
  • Jupyter Notebook exercise: Create an anomaly detection algorithm (for data volume or freshness; see the sketch after this module)
  • Q&A
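
One common shape for such an algorithm, offered here as a sketch rather than the course’s exact solution, is a rolling z-score over daily row counts: flag any day whose count sits more than a few standard deviations from the trailing window’s mean:

    import statistics

    def volume_anomalies(daily_counts, window=7, threshold=3.0):
        """Flag indices whose row count deviates from the trailing window
        by more than `threshold` standard deviations."""
        flagged = []
        for i in range(window, len(daily_counts)):
            trailing = daily_counts[i - window:i]
            mean = statistics.mean(trailing)
            std = statistics.pstdev(trailing)
            if std > 0 and abs(daily_counts[i] - mean) > threshold * std:
                flagged.append(i)
        return flagged

    counts = [1500, 1510, 1495, 1505, 1490, 1502, 1508, 120, 1497]  # day 7 collapses
    print(volume_anomalies(counts))  # -> [7]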

Break (5 minutes)

Eliminating data downtime (35 minutes)

  • Presentation: Data observability principles to help you eliminate data downtime
  • Jupyter Notebook exercise: Use your anomaly detection algorithm on your dataset; consider a few approaches to ensure long-term data observability (see the sketch below)
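
Turning a detector like the one above into an ongoing safeguard is mostly glue: run it on fresh counts and notify someone when it fires. A hypothetical, self-contained sketch with illustrative thresholds:

    import statistics

    # Hypothetical alerting glue: compare today's row count against a trailing
    # baseline and notify on a large deviation.
    def alert_on_volume_anomaly(trailing_counts, todays_count,
                                threshold=3.0, notify=print):
        mean = statistics.mean(trailing_counts)
        std = statistics.pstdev(trailing_counts)
        if std > 0 and abs(todays_count - mean) > threshold * std:
            notify(f"Data downtime suspected: got {todays_count} rows, "
                   f"expected ~{mean:.0f}")

    alert_on_volume_anomaly([1500, 1510, 1495, 1505, 1490, 1502, 1508], 120)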

Wrap-up and Q&A (10 minutes)

Your Instructor

  • Barr Moses

    Barr Moses is cofounder and CEO of Monte Carlo, a data reliability company backed by Accel and other top Silicon Valley investors. Previously, she was VP of customer operations at customer success company Gainsight, where she helped scale the company 10x in revenue and, among other functions, built the data and analytics team; a management consultant at Bain & Company; and a research assistant in the Statistics Department at Stanford. She also served in the Israeli Air Force as a commander of an intelligence data analyst unit. Barr holds a BSc in mathematical and computational science from Stanford.
