Data Superstream: Analytics Engineering
Published by O'Reilly Media, Inc.
Successful data-driven organizations need access to high-quality data. Analytics engineering—an amalgam of data engineering and data analysis—supports this critical requirement by transforming data and by participating in testing, deployment, and documentation. As a discipline, analytics engineering has evolved from providing clean datasets to end users to enabling them to answer their own questions about that data. These expert-led sessions will get you up to speed on this quickly evolving field and take you through the tools and technologies at the forefront of data transformation.
About the Data Superstream Series: This three-part Superstream series is designed to help your organization maximize the business impact of your data. Each day covers different topics, with unique sessions lasting no more than four hours. And they’re packed with insights from key innovators and the latest tools and technologies to help you stay ahead of it all.
What you’ll learn and how you can apply it
- Explore the future of analytics engineering and learn how to prepare yourself and your organization for this transformation
- Learn the five kinds of work that data teams are taking over because of the democratization of analytics engineering
- See how to apply new approaches to datasets through the lens of an analytics engineer, from ingestion to exploration to modeling and presentation
- Simplify data discovery at your organization through testing strategies, code review processes, naming conventions, and more
- Discover the metrics layer—a new piece of the modern data stack maintained by analytics engineers
- Explore the landscape of data orchestration tools within the modern data stack and learn how to implement them
This live event is for you because...
- You’re a data practitioner looking to understand analytics engineering, one of the hottest new areas in data today.
- You’re a data analyst, BI analyst, or data warehouse developer who wants to make the move to a career in analytics engineering.
- You’re an analytics engineer who wants to deepen your knowledge of the space and explore new tools and methodologies for the modern data stack.
Prerequisites
- Come with your questions
- Have a pen and paper handy to capture notes, insights, and inspiration
Recommended follow-up:
- Read Hands-On Data Visualization (book)
- Read Lean Analytics (book)
- Read Communicating with Data (book)
- Read Snowflake: The Definitive Guide (early release book)
- Read The Self-Service Data Roadmap (book)
- Read Data Quality Fundamentals (early release book)
- Read Data Governance: The Definitive Guide (book)
Schedule
The time frames are only estimates and may vary according to how the class is progressing.
Alistair Croll: Introduction (5 minutes) - 8:00am PT | 11:00am ET | 3:00pm UTC/GMT
- Alistair Croll welcomes you to the Data Superstream.
Elizabeth Caley: Keynote–Making the Invisible Visible: Data for the New World (10 minutes) - 8:05am PT | 11:05am ET | 3:05pm UTC/GMT
- The most valuable data in the world right now is literally invisible, but it affects 7 billion people 90% of the time. How can the rapidly evolving field of analytics engineering unlock business- and life-changing insights for decision makers? And how will the planet change over the next decades if we do this well? Join Elizabeth Caley to find out.
- Elizabeth Caley is co-CEO of Poppy Health. Over the last 20 years, she's held executive roles at global technology companies including Microsoft. Prior to Poppy Health, EC was COO of Meta, an ML company that organized the world’s scientific information. After the acquisition of Meta by Mark Zuckerberg and Priscilla Chan, EC was an executive at the Chan Zuckerberg Initiative, overseeing tech and science information initiatives for social good. EC is an active contributor to the Bay Area, Oxford, and Toronto technology communities as a mentor, advisor, and speaker. She’s mentored 125 science-based startups and has been honored as one of the most inspiring women in STEM.
Anna Filippova: The Future of Analytics Engineering (30 minutes) - 8:15am PT | 11:15am ET | 3:15pm UTC/GMT
- What does it mean to do analytics engineering, and how is it different from doing analytics, data science, or data engineering? What are the core skills you need to call yourself an analytics engineer (and how do you acquire them to land your first gig)? And, most importantly, how can you prepare yourself and your organization for this transformation? Join Anna Filippova for a deep dive into analytics engineering. You’ll discover what the community of analytics engineering practitioners looks like today—and think through where it's going—as you explore the structural forces of the data industry that have made this the hottest new job in data today.
- Anna Filippova tends to the dbt Community garden of over 25,000 members as the director of community at dbt Labs. She also writes about the intersection of modern data tools and open source in the Analytics Engineering Roundup. Previously, Anna built the first analytics engineering team at GitHub. In her past life, Anna published research on building, maintaining, and sustaining open source communities. She also studied how distributed and open source communities worked, fought, and learned during a postdoc at Carnegie Mellon, and she earned a PhD in communication and media from the National University of Singapore. From time to time, you can find Anna traveling the coast of California and working from her camper van. And she’s always open to an AMA session.
Emilie Schario: The Five Kinds of Work Analytics Engineering Empowers Your Data Teams to Do (30 minutes) - 8:45am PT | 11:45am ET | 3:45pm UTC/GMT
- With the rise of the modern data stack, it’s easier than ever to get started with a data warehouse. And with the modern data stack and analytics engineering, data teams are no longer confined to dashboard building. Emilie Schario walks you through the five kinds of work that data teams are taking over because of the democratization of analytics engineering: operational analytics, metrics management, insight/exploratory work, experimentation reporting, and servicing other teams.
- Emilie Schario is a data strategist in residence at Amplify Partners. Previously, she was the director of data at Netlify, where she led 8% of the company's headcount, and was the first data analyst at many companies, including GitLab, Doist, and Smile Direct Club. She lives in Columbus, GA, with her husband, son, and dog.
- Break (10 minutes)
Lewis Davies: From Data to Dashboard (30 minutes) - 9:25am PT | 12:25pm ET | 4:25pm UTC/GMT
- Join Lewis Davies for an end-to-end view of how analytics engineers approach a brand-new dataset. Starting from the stakeholder request, you’ll explore the key milestones and considerations including ingestion (how you'll move data from the source to destination); exploration (understanding never-before-seen data and the importance of domain knowledge); modeling (making your machine-optimized data more human friendly); and presentation (choosing a delivery format—it doesn't always have to be a dashboard!—and iterative improvements). You’ll also get insight into the true first step: questioning the stakeholder to make sure this is the right project to work on at the current time.
- Lewis Davies is a senior analytics engineer at Aula Education, where he provides reliable data and analytics to thousands of university educators every day. Having worked in a range of roles at Deliveroo, Hootsuite, and Best Buy, he believes that end-to-end ownership is the key to building high-quality data products. Lewis is based in Durham, England.
Jacob Frackson: Facilitating Data Discovery with Analytics Engineering and dbt (30 minutes) - 9:55am PT | 12:55pm ET | 4:55pm UTC/GMT
- Data discovery is a challenge for large and small businesses alike, and for a variety of reasons that span from the simple to the complex. Jacob Frackson presents a roadmap for using analytics engineering and dbt to facilitate data discovery at your organization. You’ll explore the four stages that will take you from ground zero to a mature data discovery ecosystem: testing to verify data quality; version control and built-in documentation to make data modeling inspectable; Jinja and modularity to make data logic extensible; and naming conventions and code reviews to make data products intuitive.
- Jacob Frackson is a senior analytics consultant at Montreal Analytics—a full stack data consultancy that helps organizations implement modern data stacks to establish a single source of truth, lower maintenance costs, and accelerate time to insights. Jacob leads the company’s growing enablement practice.
- Break (5 minutes)
Benn Stancil: On the Metrics Layer—A New Piece of the Modern Data Stack Maintained by Analytics Engineers (30 minutes) - 10:30am PT | 1:30pm ET | 5:30pm UTC/GMT
- Join Benn Stancil for a primer on all things metrics layer. You’ll learn what it is, why it's useful, and why it could become a critical part of every data stack. Benn will also explain why the metrics layer matters to analytics engineers and how they can benefit from it specifically. You’ll then explore the different versions that people are building and the strategies that will help you determine which is right for you.
- Benn Stancil is a cofounder and chief analytics officer at Mode, a collaborative data platform that unites an analyst-centered workflow with modern business intelligence. Benn currently leads Mode’s data organization but has also held roles leading Mode’s product, marketing, solutions, and executive teams. Previously, he worked on analytics teams at Microsoft and Yammer. Benn regularly writes about data and technology at benn.substack.com.
Nick Acosta: Orchestrating a Modern Data Stack (30 minutes) - 11:00am PT | 2:00pm ET | 6:00pm UTC/GMT
- A single analytics engineering data pipeline can consist of dozens of different modular services that are each performing different operations on data. As the number of data pipelines grows, it can be difficult to manage which services of a modern data stack should be running, when they should run, and what data they expect. Data orchestration tools solve these problems by composing all of the tasks that make up a pipeline into a single logical workflow that allows analytics engineers to effectively decouple and integrate the tools and technologies they’re using. Join Nick Acosta for an introduction to data orchestration tools and their place within the modern data stack, then explore examples of effective options for implementing these tools.
- Nick Acosta is a developer advocate at Fivetran who enjoys helping developers automate data pipelines. Nick’s the author and maintainer of Fivetran's Airflow provider and specializes in integrating the modern data stack with data orchestration and infrastructure-as-code tools. He studied computer science at Purdue University and the University of Southern California and has worked in high-performance computing at Hewlett-Packard and AI at IBM.
Alistair Croll: Closing Remarks (5 minutes) - 11:30am PT | 2:30pm ET | 6:30pm UTC/GMT
- Alistair Croll closes out today’s event.
Upcoming Data Superstream events:
- Building Data Pipelines and Connectivity - August 10, 2022
Your Host
Alistair Croll
Alistair Croll is an entrepreneur, author, and conference organizer. He's written four books on technology and society, including the best-selling Lean Analytics, which has been translated into eight languages. He's the cofounder of web performance startup Coradiant (acquired by BMC), the Year One Labs startup accelerator, and a number of other early-stage companies.
A prolific speaker, Alistair was a visiting executive at Harvard Business School, where he helped create a course on data science and critical thinking. He's founded and chaired a number of the world's leading technology events, including Cloud Connect, Strata, Startupfest, Scaletech, and the FWD50 Digital Government conference. He's currently working on Just Evil Enough, the subversive marketing playbook. Alistair lives in Montreal, Canada, and writes at acroll.substack.com.