Book description
Python is a first-class tool for many researchers, primarily because of its libraries for storing, manipulating, and gaining insight from data. Several resources exist for individual pieces of this data science stack, but only with the new edition of Python Data Science Handbook do you get them all: IPython, NumPy, pandas, Matplotlib, scikit-learn, and other related tools.
Working scientists and data crunchers familiar with reading and writing Python code will find the second edition of this comprehensive desk reference ideal for tackling day-to-day issues: manipulating, transforming, and cleaning data; visualizing different types of data; and using data to build statistical or machine learning models. Quite simply, this is the must-have reference for scientific computing in Python.
With this handbook, you'll learn how:
- IPython and Jupyter provide computational environments for scientists using Python
- NumPy includes the ndarray for efficient storage and manipulation of dense data arrays
- Pandas contains the DataFrame for efficient storage and manipulation of labeled/columnar data
- Matplotlib includes capabilities for a flexible range of data visualizations
- Scikit-learn helps you build efficient and clean Python implementations of the most important and established machine learning algorithms
Table of contents
- Preface
- I. Jupyter: Beyond Normal Python
- 1. Getting Started in IPython and Jupyter
- 2. Enhanced Interactive Features
- 3. Debugging and Profiling
- II. Introduction to NumPy
- 4. Understanding Data Types in Python
- 5. The Basics of NumPy Arrays
- 6. Computation on NumPy Arrays: Universal Functions
- 7. Aggregations: min, max, and Everything in Between
- 8. Computation on Arrays: Broadcasting
- 9. Comparisons, Masks, and Boolean Logic
- 10. Fancy Indexing
- 11. Sorting Arrays
- 12. Structured Data: NumPy’s Structured Arrays
- III. Data Manipulation with Pandas
- 13. Introducing Pandas Objects
- 14. Data Indexing and Selection
- 15. Operating on Data in Pandas
- 16. Handling Missing Data
- 17. Hierarchical Indexing
- 18. Combining Datasets: concat and append
- 19. Combining Datasets: merge and join
- 20. Aggregation and Grouping
- 21. Pivot Tables
- 22. Vectorized String Operations
- 23. Working with Time Series
- 24. High-Performance Pandas: eval and query
- IV. Visualization with Matplotlib
- 25. General Matplotlib Tips
- 26. Simple Line Plots
- 27. Simple Scatter Plots
- 28. Density and Contour Plots
- 29. Customizing Plot Legends
- 30. Customizing Colorbars
- 31. Multiple Subplots
- 32. Text and Annotation
- 33. Customizing Ticks
- 34. Customizing Matplotlib: Configurations and Stylesheets
- 35. Three-Dimensional Plotting in Matplotlib
- 36. Visualization with Seaborn
- V. Machine Learning
- 37. What Is Machine Learning?
- 38. Introducing Scikit-Learn
- 39. Hyperparameters and Model Validation
- 40. Feature Engineering
- 41. In Depth: Naive Bayes Classification
- 42. In Depth: Linear Regression
- 43. In Depth: Support Vector Machines
- 44. In Depth: Decision Trees and Random Forests
- 45. In Depth: Principal Component Analysis
- 46. In Depth: Manifold Learning
- 47. In Depth: k-Means Clustering
- 48. In Depth: Gaussian Mixture Models
- 49. In Depth: Kernel Density Estimation
- 50. Application: A Face Detection Pipeline
- Index
- About the Author
Product information
- Title: Python Data Science Handbook, 2nd Edition
- Author(s):
- Release date: December 2022
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781098121228