High Performance Python, 3rd Edition

Book description

Your Python code may run correctly, but what if you need it to run faster? This practical book shows you how to locate performance bottlenecks and significantly speed up your code in high-data-volume programs. By explaining the fundamental theory behind design choices, this expanded edition of High Performance Python helps experienced Python programmers gain a deeper understanding of Python's implementation.

How do you take advantage of multicore architectures or clusters? Or build a system that scales up and down without losing reliability? Authors Micha Gorelick and Ian Ozsvald reveal concrete solutions to many issues and include war stories from companies that use high-performance Python for social media analytics, productionized machine learning, and more.

  • Get a better grasp of NumPy, Cython, and profilers
  • Learn how Python abstracts the underlying computer architecture
  • Use profiling to find bottlenecks in CPU time and memory usage
  • Write efficient programs by choosing appropriate data structures
  • Speed up matrix and vector computations
  • Process DataFrames quickly with pandas, Dask, and Polars
  • Speed up your neural networks and GPU computations
  • Use tools to compile Python down to machine code
  • Manage multiple I/O and computational operations concurrently
  • Convert multiprocessing code to run on local or remote clusters
  • Deploy code faster using tools like Docker

Table of contents

  1. Brief Table of Contents (Not Yet Final)
  2. 1. Understanding Performant Python
    1. The Fundamental Computer System
      1. Computing Units
      2. Memory Units
      3. Communications Layers
    2. Putting the Fundamental Elements Together
      1. Idealized Computing Versus the Python Virtual Machine
    3. So Why Use Python?
    4. How to Be a Highly Performant Programmer
      1. Good Working Practices
      2. Optimizing for the Team Rather than the Code Block
      3. The Remote Performant Programmer
      4. Some Thoughts on Good Notebook Practice
      5. Getting the Joy Back into Your Work
    5. The future of Python
      1. Where did the GIL go?
      2. Does Python have a JIT?
  3. 2. Profiling to Find Bottlenecks
    1. Profiling Efficiently
    2. Introducing the Julia Set
    3. Calculating the Full Julia Set
    4. Simple Approaches to Timing—print and a Decorator
    5. Simple Timing Using the Unix time Command
    6. Using the cProfile Module
    7. Visualizing cProfile Output with SnakeViz
    8. Using line_profiler for Line-by-Line Measurements
    9. Using memory_profiler to Diagnose Memory Usage
    10. Combining CPU and Memory Profiling with Scalene
    11. Introspecting an Existing Process with PySpy
    12. VizTracer for an interactive time-based call stack
    13. Bytecode: Under the Hood
      1. Using the dis Module to Examine CPython Bytecode
      2. Digging into bytecode specialisation with Specialist
      3. Different Approaches, Different Complexity
    14. Unit Testing During Optimization to Maintain Correctness
      1. No-op @profile Decorator
    15. Strategies to Profile Your Code Successfully
    16. Wrap-Up
  4. 3. Lists and Tuples
    1. A More Efficient Search
    2. Lists Versus Tuples
      1. Lists as Dynamic Arrays
      2. Tuples as Static Arrays
    3. Wrap-Up
  5. 4. Dictionaries and Sets
    1. How Do Dictionaries and Sets Work?
      1. Inserting and Retrieving
      2. Deletion
      3. Resizing
      4. Hash Functions and Entropy
    2. Wrap-Up
  6. 5. Iterators and Generators
    1. Iterators for Infinite Series
    2. Lazy Generator Evaluation
    3. Wrap-Up
  7. 6. Pandas, Dask and Polars
    1. Pandas
      1. Pandas’s Internal Model
      2. Arrow and NumPy
      3. Applying a Function to Many Rows of Data
      4. Numba to Compile NumPy for Pandas
      5. Building DataFrames and Series from Partial Results Rather than Concatenating
      6. There’s More Than One (and Possibly a Faster) Way to Do a Job
      7. Advice for Effective Pandas Development
    2. Dask for Distributed Data Structures and DataFrames
      1. Diagnostics
      2. Parallel Pandas with Dask
      3. Parallelized apply with Swifter on Dask
    3. Polars for Fast DataFrames
    4. Wrap-Up
  8. 7. Compiling to C
    1. What Sort of Speed Gains Are Possible?
    2. JIT Versus AOT Compilers
    3. Why Does Type Information Help the Code Run Faster?
    4. Using a C Compiler
    5. Reviewing the Julia Set Example
    6. Cython
      1. Compiling a Pure Python Version Using Cython
    7. pyximport
      1. Cython Annotations to Analyze a Block of Code
      2. Adding Some Type Annotations
    8. Cython and numpy
      1. Parallelizing the Solution with OpenMP on One Machine
    9. Numba
    10. PyPy
      1. Garbage Collection Differences
      2. Running PyPy and Installing Modules
    11. A Summary of Speed Improvements
    12. When to Use Each Technology
      1. Other Upcoming Projects
    13. Foreign Function Interfaces
      1. ctypes
      2. cffi
      3. f2py
      4. CPython Extensions: C
      5. CPython Extensions: Rust
    14. Wrap-Up
  9. 8. Asynchronous I/O
    1. Introduction to Asynchronous Programming
    2. How Does async/await Work?
      1. Serial Crawler
      2. Asynchronous Crawler
    3. Shared CPU–I/O Workload
      1. Serial
      2. Batched Results
      3. Full Async
    4. Wrap-Up
  10. 9. The multiprocessing Module
    1. An Overview of the multiprocessing Module
    2. Estimating Pi Using the Monte Carlo Method
    3. Estimating Pi Using Processes and Threads
      1. Using Python Objects
      2. Replacing multiprocessing with Joblib
      3. Random Numbers in Parallel Systems
      4. Using numpy
    4. Finding Prime Numbers
      1. Queues of Work
    5. Verifying Primes Using Interprocess Communication
      1. Serial Solution
      2. Naive Pool Solution
      3. A Less Naive Pool Solution
      4. Using Manager.Value as a Flag
      5. Using Redis as a Flag
      6. Using RawValue as a Flag
      7. Using mmap as a Flag
      8. Using mmap as a Flag Redux
    6. Sharing numpy Data with multiprocessing
    7. Synchronizing File and Variable Access
      1. File Locking
      2. Locking a Value
    8. Wrap-Up
  11. 10. Clusters and Job Queues
    1. Benefits of Clustering
    2. Drawbacks of Clustering
      1. $462 Million Wall Street Loss Through Poor Cluster Upgrade Strategy
      2. Skype’s 24-Hour Global Outage
    3. Common Cluster Designs
    4. How to Start a Clustered Solution
    5. Ways to Avoid Pain When Using Clusters
    6. Two Clustering Solutions
      1. Using IPython Parallel to Support Research
    7. NSQ for Robust Production Clustering
      1. Queues
      2. Pub/sub
      3. Distributed Prime Calculation
    8. Other Clustering Tools to Look At
    9. Docker
      1. Docker’s Performance
      2. Advantages of Docker
    10. Wrap-Up
  12. About the Authors

Product information

  • Title: High Performance Python, 3rd Edition
  • Author(s): Micha Gorelick, Ian Ozsvald
  • Release date: May 2025
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781098165963