From Pandas to Polars
Published by O'Reilly Media, Inc.
Working with Polars in Python
Course Outcomes
- Use Polars to load and transform data from CSV or Parquet files
- Write Polars code that takes advantage of parallelization and query optimization
- Work with larger-than-memory datasets
- Apply Polars to the correct use cases in relation to other data processing frameworks such as Pandas and Spark
Course Description
Polars typically runs much faster than Pandas and scales to larger datasets, but the two libraries differ in some fundamental ways. This course explains those differences and shows how to write Polars code that takes full advantage of its parallelization and query optimization. You'll gain a deep understanding of Polars' unique features and learn optimization techniques to harness its full potential. Through hands-on exercises, you'll tackle real-world scenarios, optimizing code for speed and resource utilization. By the end of the course, you'll have a strong command of Polars and the skills to apply it to data challenges in your organization.
The course is composed of Jupyter notebooks that demonstrate the key concepts needed by data scientists to solve day-to-day challenges. Each notebook ends with exercises that help to develop attendees’ understanding of the concepts.
What you’ll learn and how you can apply it
- The key differences between Pandas and Polars
- How to load and transform CSV and Parquet data in Polars
- How to write queries that run in parallel
- How to write queries that take advantage of query optimization and fast-track algorithms
- When Polars can work with larger-than-memory datasets and how to write queries that do so
This live event is for you because...
- You're a data scientist with basic Python skills and experience with other dataframe libraries such as Pandas.
Prerequisites
- Basic skills with Python and Pandas (attendees should be able to load a CSV and do a groupby aggregation with Pandas)
- No Polars experience is necessary
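As a quick self-check of the Pandas prerequisite above, the following sketch loads a small CSV and runs a groupby aggregation. The data, column names, and function name are hypothetical and only serve to make the example runnable.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a CSV file on disk.
CSV_TEXT = """city,category,amount
Dublin,food,12.5
Dublin,travel,40.0
Cork,food,8.0
Cork,food,11.0
Galway,travel,25.0
"""


def mean_amount_by_city(csv_text: str) -> pd.Series:
    """Load CSV text into a DataFrame and average `amount` per city."""
    df = pd.read_csv(io.StringIO(csv_text))
    return df.groupby("city")["amount"].mean()


if __name__ == "__main__":
    print(mean_amount_by_city(CSV_TEXT))
```

If this kind of code looks familiar, you should be comfortable with the level of Pandas assumed in the course.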
Course Setup
- Setup instructions are provided in a zip file
Recommended Preparation
- Install Polars (for example, with pip install polars)
Recommended Follow-Up
- Read Python Polars: The Definitive Guide (book in Early Release)
Schedule
The time frames are only estimates and may vary according to how the class is progressing.
Segment 1: Core Ideas I (60 minutes)
- Presentation: Pandas to Polars: need-to-know
- Exercise: Loading and transforming data I
- Q&A
- Break
Segment 2: Core Ideas II (60 minutes)
- Presentation: Effective transformations
- Exercise: Loading and transforming data II
- Q&A
- Break
Segment 3: Optimizing Polars (60 minutes)
- Presentation: Lazy mode and optimization
- Exercise: Optimizing queries
- Q&A
- Break
Segment 4: Scaling Polars (60 minutes)
- Presentation: Scaling Polars
- Exercise/Lab: Scaling Polars
- Q&A
Your Instructor
Liam Brannigan
Liam Brannigan has been a Polars contributor for 2 years with a focus on documentation and accessibility for new users. Liam is also a Lead Data Scientist and has been deploying Polars in production machine learning pipelines for over a year. Liam has a PhD in Physical Oceanography from the University of Oxford and is a data science communicator on numerous social media platforms.