7.14 Intro to Data Science: pandas Series and DataFrames
NumPy’s array is optimized for homogeneous numeric data that’s accessed via integer indices. Data science presents unique demands for which more customized data structures are required. Big data applications must support mixed data types, customized indexing, missing data, data that’s not structured consistently and data that needs to be manipulated into forms appropriate for the databases and data analysis packages you use.
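To make the "homogeneous data, integer indices" point concrete, here is a minimal sketch. The variable names and values are made up for illustration and are not from the book.

```python
import numpy as np

# A homogeneous numeric array: every element shares one dtype.
numbers = np.array([10, 20, 30])
print(numbers.dtype.kind)  # 'i' (an integer dtype; exact width depends on platform)
print(numbers[1])          # elements are reached by integer position

# Mixing types does not give a mixed array: NumPy coerces every
# element to one common dtype, here a Unicode string dtype.
mixed = np.array([1, 2.5, 'three'])
print(mixed.dtype.kind)    # 'U'
print(mixed)               # all three elements are now strings
```

Because the whole array is coerced to a single type, numeric operations on `mixed` no longer work, which is one reason mixed-type data calls for a different structure.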
Pandas is the most popular library for dealing with such data. It provides two key collections that you’ll use in several of our Intro to Data Science sections and throughout the data science case studies: Series for one-dimensional collections and DataFrames for two-dimensional ...
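As a rough illustration of the two collections, the sketch below builds a small Series and a small DataFrame. The names, labels and values are hypothetical sample data, chosen only to show custom indexing, missing data and mixed column types.

```python
import numpy as np
import pandas as pd

# One-dimensional Series with custom (string) index labels and a
# missing value represented by NaN. Sample data is hypothetical.
grades = pd.Series([87, 100, np.nan], index=['Wally', 'Eva', 'Sam'])
print(grades['Eva'])        # look up by label rather than by position
print(grades.isna().sum())  # count of missing entries

# Two-dimensional DataFrame whose columns hold different data types.
df = pd.DataFrame(
    {'major': ['Math', 'Physics', 'Biology'],
     'grade': [87.0, 100.0, np.nan]},
    index=['Wally', 'Eva', 'Sam'],
)
print(df.loc['Eva', 'grade'])  # select by row label and column name
print(df.dtypes)               # each column keeps its own dtype
```

Unlike the NumPy array, each DataFrame column keeps its own type, and both collections can be indexed by meaningful labels instead of integer positions.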