Scripting with Python and SQL for Data Engineering

Video description

Scripting with Python and SQL for Data Engineering

Learn Python data structures, web scraping, SQL, and MySQL from the ground up. Master essential skills for collecting, storing, and working with data.

In this hands-on course for beginners, you'll learn how to:

  • Store and manipulate data using Python lists, dictionaries, JSON
  • Write reusable scripts to process data
  • Connect Python to databases like SQLite and MySQL
  • Query, import, and export data with SQL
  • Scrape and parse websites using Beautiful Soup and Scrapy
  • Persist scraped data to databases and files

You'll use the following example repositories to practice:

Key topics include:

  • Mapping, iterating, and persisting data structures
  • Creating modules, scripts, and workflows in Python
  • SQL essentials - queries, statements, aggregations
  • Setting up connections from Python to SQLite and MySQL
  • Scraping data locally and at scale with spiders
  • Storing scraped data to optimize pipelines

You'll build your data wrangling skills through practical examples and hands-on coding exercises in every lesson. By the end of the course, you'll have experience building end-to-end data engineering scripts.

Whether you're a beginner looking to learn Python and SQL, or want to develop robust data engineering skills, this course will get you started. Enroll now and start collecting, storing, and working with data using Python and SQL

You'll gain hands-on experience building Python scripts and SQL queries for common data engineering tasks.

This course is divided in 4 weeks:

Week 1 Working with Data in Python

By the end of Week 1 you'll be able to:

  • Apply Python data structures like lists, dicts
  • Extract data from sources like CSV, JSON
  • Load and persist data using JSON

Lesson 1: Data Structures in Python

Lesson Outline

  • Lists, tuples, dictionaries
  • Working with pandas DataFrames
  • Loading data files like CSV into data structures

Lesson 2: Reading and Writing Data

Lesson Outline

  • Reading and writing CSV files
  • Serializing Python objects with JSON
  • Parsing and dumping JSON data

Lesson 3: Persisting and Loading Data in Python

Lesson Outline

  • Loading data from files
  • Saving data from Python to disk
  • Loading and saving data to JSON

Week 2 Python Scripting and SQL

By the end of Week 2 you'll be able to:

  • Write reusable Python scripts
  • Use SQLite to persist data
  • Query SQLite databases with Python

Lesson 1: Python Scripting Techniques

Lesson Outline

  • Writing modular, reusable Python scripts
  • Exception handling and logging
  • Python virtual environments

Lesson 2: Python with SQLite

Lesson Outline

  • Creating SQLite databases from Python
  • Writing tables with SQLAlchemy
  • Querying SQLite from Python with SQLAlchemy

Week 3 Learning Objectives

By the end of Week 3 you'll be able to:

  • Scrape and collect data from websites
  • Build scalable scraping scripts
  • Persist scraped data to files/databases

Lesson 1: Web Scraping with Python

Lesson Outline

  • HTML parsing and structure
  • Using Beautiful Soup for scraping
  • Storing scraped data in Python

Lesson 2: Scalable Web Scraping

Lesson Outline

  • Scraping best practices
  • Scaling scraping with multiprocessing
  • Storing scraped data in databases

Week 4 Learning Objectives

By the end of Week 4 you'll be able to:

  • Connect to MySQL from Python
  • Execute SQL statements and queries
  • Import and export data from MySQL

Lesson 1: Python and MySQL

Lesson Outline

  • Installing MySQL and configuration
  • Connecting Python to MySQL
  • Executing queries and statements

Lesson 2: Running SQL queries from VSCode

  • Use Visual Studio Code to build SQL queries
  • Execute and review SQL queries from Visual Studio Code

Lesson 3: Importing and Exporting Data

Lesson Outline

  • Loading and exporting CSV data
  • Best practices for moving data into MySQL
  • Automating data imports with Python

About your instructor

Alfredo Deza has over a decade of experience as a Software Engineer doing DevOps, automation, and scalable system architecture. Before getting into technology he participated in the 2004 Olympic Games and was the first-ever World Champion in High Jump representing Peru. He currently works in Developer Relations at Microsoft and is an Adjunct Professor at Duke University. This solid background in technology and teaching, including his experience teaching and authoring content about DevOps and MLOps will give you everything you need to get started applying these powerful concepts.

Resources

Table of contents

  1. Lesson 1
    1. "Introduction To Python Data Strucctures"
    2. "Using Lists For Data"
    3. "Using Dictionaries For Data"
    4. "Overview Of Other Data Structures"
    5. "Recap Of Data Structures"
    6. "Introduction Choosing Data Structures"
    7. "Iterating Over Lists"
    8. "Iterating Over Other Data Structures"
    9. "Storing Data In Between Data Structures"
    10. "Recap Of Mapping Data"
    11. "Introduction To Data Sources"
    12. "Loading Data From Files"
    13. "Working With Json"
    14. "Saving Data To Disk"
    15. "Recap Of Persisting Data"
  2. Lesson 2
    1. "Introduction To Scripting In Python"
    2. "Creating A Script As A Module"
    3. "Traversing The Filesystem"
    4. "Recap Of Python Scripting"
    5. "Introduction To Embedded Databases"
    6. "What Is Sqlite"
    7. "Creating And Connecting To Sqlite"
    8. "Saving And Querying"
    9. "Recap Of Sqlite And Python"
    10. "Introduction To Querying With Sql"
    11. "Basic Sql Commands"
    12. "Extracting Distinct Data"
    13. "Searching With Sql"
    14. "Recap Of Querying With Sql"
  3. Lesson 3
    1. "Introduction To Web Scraping"
    2. "Challenges With Web Data"
    3. "Parsing Html With Htmlparser"
    4. "Recap Of Web Scraping"
    5. "Introduction To Scrapy And Xpath"
    6. "Creating A Web Scraping Project"
    7. "Parsing With Xpath"
    8. "Using A Spider For Web Scrapping"
    9. "Recap Of Scrapy And Xpath"
    10. "Challenges With Web Scraping"
    11. "Scraping Locally"
    12. "Persisting Data With Csv And Json"
    13. "Persisting Data In Sqlite"
    14. "Recap Of Persistence And Efficiency"
  4. Lesson 4
    1. "Introduction To Working With Mysql"
    2. "Connecting To A Mysql Server"
    3. "Recap Of Working With Vscode And Mysql"
    4. "Challenges With Running Mysql Queries"
    5. "Using Vscode To Execute Mysql Queries"
    6. "Recap Of Running Mysql Queries"
    7. "Challenges With Importing To Databases"
    8. "Importing Csv Data To Mysql"
    9. "Exporting Data From Mysql"
    10. "Recap Of Importing And Exporting Data"

Product information

  • Title: Scripting with Python and SQL for Data Engineering
  • Author(s): Alfredo Deza
  • Release date: August 2023
  • Publisher(s): Pragmatic AI Labs
  • ISBN: 28190528VIDEOPAIML