Generative AI for Automating Data Pipelines and Analytic Queries
Published by O'Reilly Media, Inc.
A code generation framework for data engineering and analytics
Course outcomes
- Understand key architectural patterns for integrating GenAI within data platforms
- Learn how to automatically generate data pipelines and analytical patterns
- Understand deployment and security strategies and best practices involving GenAI
Course description
Join expert Ashish Mrig for a deep dive into generative data engineering. You’ll grasp the basic concepts and context before learning how to generate data pipelines and analytics code rather than writing it manually. You’ll understand how design patterns can shorten lead time, automate undifferentiated work, and allow data teams to move up the data value chain. Through demonstrations of real-world use cases built on top of key data platform design paradigms and data models, you’ll gain a thorough understanding of how to build generative data pipeline engines and deploy them in your production environments.
What you’ll learn and how you can apply it
- Understand generative AI capabilities with different data platforms
- Apply large language models (such as GPT-4) to generate code
- Use architecture and design strategies that apply GenAI’s code generation capabilities to complex data engineering and analytics use cases
- Understand and use guidelines and strategies for successful production deployment using GenAI
This live event is for you because...
- You’re a data engineer, business/data analyst, product manager, data leader, architect, data scientist, BI analyst, or IT executive who’s looking to streamline the process of collecting, cleaning, transforming, and organizing data.
Prerequisites
- A basic understanding of SQL, Python, ETL, data storage, warehousing, and modeling, as well as cloud platforms such as AWS, Google Cloud, or Azure (helpful but not required)
Recommended follow-up
- Read Prompt Engineering for Generative AI (book)
- Read Data Pipelines Pocket Reference (book)
Schedule
The time frames are only estimates and may vary according to how the class is progressing.
Introduction (35 minutes)
- Presentation: Overview of core data platforms
- Q&A
Overview of generative AI technology (30 minutes)
- Presentation: History and evolution of GenAI technology; overview of GenAI models; key implementation methods and algorithms
- Q&A
- Break
Key design patterns for integrating GenAI within data platforms (35 minutes)
- Presentation: Integration architecture; design choices; tech stack
- Q&A
How to automatically generate data pipelines (45 minutes)
- Demonstration: Case study—ETL business problem; GenAI design pattern; implementation details and results
- Break
How to build a bot to expose your data silos (35 minutes)
- Demonstration: Case study—analytics business problem; GenAI design pattern; implementation details and results
- Q&A
Your instructor
Ashish Mrig
A Boston-based data technologist and AI/ML practitioner, Ashish has been practicing the data engineering craft for 20+ years, since before it was considered cool. He has a deep understanding of core design patterns, abstractions, and automations, and one of his core competencies is reducing data chaos and building scalable data platforms. He loves to tackle new challenges and push engineering boundaries. In his free time, he loves to play tennis and visit national parks.