Book description
This is the eBook of the printed book and may not include any media, website access codes, or print supplements that may come packaged with the bound book.
For introductory-level Python programming and/or data-science courses.
A groundbreaking, flexible approach to computer science and data science
The Deitels’ Introduction to Python for Computer Science and Data Science: Learning to Program with AI, Big Data and the Cloud offers a unique approach to teaching introductory Python programming, appropriate for both computer-science and data-science audiences. Providing the most current coverage of topics and applications, the book is paired with extensive traditional supplements as well as Jupyter Notebooks supplements. Real-world datasets and artificial-intelligence technologies allow students to work on projects making a difference in business, industry, government and academia. Hundreds of examples, exercises, projects (EEPs), and implementation case studies give students an engaging, challenging and entertaining introduction to Python programming and hands-on data science.
The book’s modular architecture enables instructors to conveniently adapt the text to a wide range of computer-science and data-science courses offered to audiences drawn from many majors. Computer-science instructors can integrate as many or as few data-science and artificial-intelligence topics as they’d like, and data-science instructors can integrate as much or as little Python as they’d like. The book aligns with the latest ACM/IEEE CS-and-related computing curriculum initiatives and with the Data Science Undergraduate Curriculum Proposal sponsored by the National Science Foundation.
Table of contents
- Intro to Python® for Computer Science and Data Science
- Deitel® Series Page
- Contents
- Preface
- Python for Computer Science and Data Science Education
- Modular Architecture
- Audiences for the Book
- Key Features
- Chapter Dependencies
- Computing and Data Science Curricula
- Data Science Overlaps with Computer Science
- Jobs Requiring Data Science Skills
- Jupyter Notebooks
- Docker
- Class Tested
- “Flipped Classroom”
- Special Feature: IBM Watson Analytics and Cognitive Computing
- Teaching Approach
- Software Used in the Book
- Python Documentation
- Getting Your Questions Answered
- Student and Instructor Supplements
- Instructor Supplements on Pearson’s Instructor Resource Center
- Instructor Examination Copies
- Keeping in Touch with the Authors
- Acknowledgments
- About the Authors
- About Deitel® & Associates, Inc.
- Before You Begin
- 1 Introduction to Computers and Python
- Objectives
- Outline
- 1.1 Introduction
- 1.2 Hardware and Software
- 1.3 Data Hierarchy
- 1.4 Machine Languages, Assembly Languages and High-Level Languages
- 1.5 Introduction to Object Technology
- 1.6 Operating Systems
- 1.7 Python
- 1.8 It’s the Libraries!
- 1.9 Other Popular Programming Languages
- 1.10 Test-Drive: Using IPython and Jupyter Notebooks
- 1.10.1 Using IPython Interactive Mode as a Calculator
- Self Check
- 1.10.2 Executing a Python Program Using the IPython Interpreter
- Self Check
- 1.10.3 Writing and Executing Code in a Jupyter Notebook
- Opening JupyterLab in Your Browser
- Creating a New Jupyter Notebook
- Renaming the Notebook
- Evaluating an Expression
- Adding and Executing Another Cell
- Saving the Notebook
- Notebooks Provided with Each Chapter’s Examples
- Opening and Executing an Existing Notebook
- Closing JupyterLab
- JupyterLab Tips
- More Information on Working with JupyterLab
- Self Check
- 1.11 Internet and World Wide Web
- 1.12 Software Technologies
- 1.13 How Big Is Big Data?
- 1.14 Case Study—A Big-Data Mobile Application
- 1.15 Intro to Data Science: Artificial Intelligence—at the Intersection of CS and Data Science
- Exercises
- 2 Introduction to Python Programming
- Objectives
- Outline
- 2.1 Introduction
- 2.2 Variables and Assignment Statements
- 2.3 Arithmetic
- 2.4 Function print and an Intro to Single- and Double-Quoted Strings
- 2.5 Triple-Quoted Strings
- 2.6 Getting Input from the User
- 2.7 Decision Making: The if Statement and Comparison Operators
- 2.8 Objects and Dynamic Typing
- 2.9 Intro to Data Science: Basic Descriptive Statistics
- 2.10 Wrap-Up
- Exercises
- 3 Control Statements and Program Development
- Objectives
- Outline
- 3.1 Introduction
- 3.2 Algorithms
- 3.3 Pseudocode
- 3.4 Control Statements
- 3.5 if Statement
- 3.6 if…else and if…elif…else Statements
- 3.7 while Statement
- 3.8 for Statement
- 3.9 Augmented Assignments
- 3.10 Program Development: Sequence-Controlled Repetition
- 3.11 Program Development: Sentinel-Controlled Repetition
- 3.12 Program Development: Nested Control Statements
- 3.13 Built-In Function range: A Deeper Look
- 3.14 Using Type Decimal for Monetary Amounts
- 3.15 break and continue Statements
- 3.16 Boolean Operators and, or and not
- 3.17 Intro to Data Science: Measures of Central Tendency—Mean, Median and Mode
- 3.18 Wrap-Up
- Exercises
- 4 Functions
- Objectives
- Outline
- 4.1 Introduction
- 4.2 Defining Functions
- 4.3 Functions with Multiple Parameters
- 4.4 Random-Number Generation
- 4.5 Case Study: A Game of Chance
- 4.6 Python Standard Library
- 4.7 math Module Functions
- 4.8 Using IPython Tab Completion for Discovery
- 4.9 Default Parameter Values
- 4.10 Keyword Arguments
- 4.11 Arbitrary Argument Lists
- 4.12 Methods: Functions That Belong to Objects
- 4.13 Scope Rules
- 4.14 import: A Deeper Look
- 4.15 Passing Arguments to Functions: A Deeper Look
- 4.16 Function-Call Stack
- 4.17 Functional-Style Programming
- 4.18 Intro to Data Science: Measures of Dispersion
- 4.19 Wrap-Up
- Exercises
- 5 Sequences: Lists and Tuples
- Objectives
- Outline
- 5.1 Introduction
- 5.2 Lists
- 5.3 Tuples
- 5.4 Unpacking Sequences
- 5.5 Sequence Slicing
- 5.6 del Statement
- 5.7 Passing Lists to Functions
- 5.8 Sorting Lists
- 5.9 Searching Sequences
- 5.10 Other List Methods
- 5.11 Simulating Stacks with Lists
- 5.12 List Comprehensions
- 5.13 Generator Expressions
- 5.14 Filter, Map and Reduce
- 5.15 Other Sequence Processing Functions
- 5.16 Two-Dimensional Lists
- 5.17 Intro to Data Science: Simulation and Static Visualizations
- 5.17.1 Sample Graphs for 600, 60,000 and 6,000,000 Die Rolls
- Self Check
- 5.17.2 Visualizing Die-Roll Frequencies and Percentages
- Launching IPython for Interactive Matplotlib Development
- Importing the Libraries
- Rolling the Die and Calculating Die Frequencies
- Creating the Initial Bar Plot
- Setting the Window Title and Labeling the x- and y-Axes
- Finalizing the Bar Plot
- Rolling Again and Updating the Bar Plot—Introducing IPython Magics
- Saving Snippets to a File with the %save Magic
- Command-Line Arguments; Displaying a Plot from a Script
- Self Check
- 5.18 Wrap-Up
- Exercises
- 6 Dictionaries and Sets
- Objectives
- Outline
- 6.1 Introduction
- 6.2 Dictionaries
- 6.2.1 Creating a Dictionary
- 6.2.2 Iterating through a Dictionary
- Self Check
- 6.2.3 Basic Dictionary Operations
- Self Check
- 6.2.4 Dictionary Methods keys and values
- Self Check
- 6.2.5 Dictionary Comparisons
- Self Check
- 6.2.6 Example: Dictionary of Student Grades
- 6.2.7 Example: Word Counts
- Self Check
- 6.2.8 Dictionary Method update
- Self Check
- 6.3 Sets
- 6.4 Intro to Data Science: Dynamic Visualizations
- 6.5 Wrap-Up
- Exercises
- 7 Array-Oriented Programming with NumPy
- Objectives
- Outline
- 7.1 Introduction
- 7.2 Creating arrays from Existing Data
- 7.3 array Attributes
- 7.4 Filling arrays with Specific Values
- 7.5 Creating arrays from Ranges
- 7.6 List vs. array Performance: Introducing %timeit
- 7.7 array Operators
- 7.8 NumPy Calculation Methods
- 7.9 Universal Functions
- 7.10 Indexing and Slicing
- 7.11 Views: Shallow Copies
- 7.12 Deep Copies
- 7.13 Reshaping and Transposing
- 7.14 Intro to Data Science: pandas Series and DataFrames
- 7.14.1 pandas Series
- Creating a Series with Default Indices
- Displaying a Series
- Creating a Series with All Elements Having the Same Value
- Accessing a Series’ Elements
- Producing Descriptive Statistics for a Series
- Creating a Series with Custom Indices
- Dictionary Initializers
- Accessing Elements of a Series Via Custom Indices
- Creating a Series of Strings
- Self Check
-
7.14.2 DataFrames
- Creating a DataFrame from a Dictionary
- Customizing a DataFrame’s Indices with the index Attribute
- Accessing a DataFrame’s Columns
- Selecting Rows via the loc and iloc Attributes
- Selecting Rows via Slices and Lists with the loc and iloc Attributes
- Selecting Subsets of the Rows and Columns
- Boolean Indexing
- Accessing a Specific DataFrame Cell by Row and Column
- Descriptive Statistics
- Transposing the DataFrame with the T Attribute
- Sorting Rows by Their Indices
- Sorting by Column Indices
- Sorting by Column Values
- Copy vs. In-Place Sorting
- Self Check
- 7.15 Wrap-Up
- Exercises
- 8 Strings: A Deeper Look
- Objectives
- Outline
- 8.1 Introduction
- 8.2 Formatting Strings
- 8.2.1 Presentation Types
- Integers
- Characters
- Strings
- Floating-Point and Decimal Values
- Self Check
- 8.2.2 Field Widths and Alignment
- Explicitly Specifying Left and Right Alignment in a Field
- Centering a Value in a Field
- Self Check
- 8.2.3 Numeric Formatting
- Formatting Positive Numbers with Signs
- Using a Space Where a + Sign Would Appear in a Positive Value
- Grouping Digits
- Self Check
- 8.2.4 String’s format Method
- Multiple Placeholders
- Referencing Arguments By Position Number
- Referencing Keyword Arguments
- Self Check
- 8.3 Concatenating and Repeating Strings
- 8.4 Stripping Whitespace from Strings
- 8.5 Changing Character Case
- 8.6 Comparison Operators for Strings
- 8.7 Searching for Substrings
- 8.8 Replacing Substrings
- 8.9 Splitting and Joining Strings
- 8.10 Characters and Character-Testing Methods
- 8.11 Raw Strings
- 8.12 Introduction to Regular Expressions
- 8.13 Intro to Data Science: Pandas, Regular Expressions and Data Munging
- 8.14 Wrap-Up
- Exercises
- 9 Files and Exceptions
- Objectives
- Outline
- 9.1 Introduction
- 9.2 Files
- 9.3 Text-File Processing
- 9.4 Updating Text Files
- 9.5 Serialization with JSON
- 9.6 Focus on Security: pickle Serialization and Deserialization
- 9.7 Additional Notes Regarding Files
- 9.8 Handling Exceptions
- 9.9 finally Clause
- 9.10 Explicitly Raising an Exception
- 9.11 (Optional) Stack Unwinding and Tracebacks
- 9.12 Intro to Data Science: Working with CSV Files
- 9.13 Wrap-Up
- Exercises
- 10 Object-Oriented Programming
- Objectives
- Outline
- 10.1 Introduction
- 10.2 Custom Class Account
- 10.3 Controlling Access to Attributes
- 10.4 Properties for Data Access
- 10.5 Simulating “Private” Attributes
- 10.6 Case Study: Card Shuffling and Dealing Simulation
- 10.7 Inheritance: Base Classes and Subclasses
- 10.8 Building an Inheritance Hierarchy; Introducing Polymorphism
- 10.9 Duck Typing and Polymorphism
- 10.10 Operator Overloading
- 10.11 Exception Class Hierarchy and Custom Exceptions
- 10.12 Named Tuples
- 10.13 A Brief Intro to Python 3.7’s New Data Classes
- 10.14 Unit Testing with Docstrings and doctest
- 10.15 Namespaces and Scopes
- 10.16 Intro to Data Science: Time Series and Simple Linear Regression
- 10.17 Wrap-Up
- Exercises
- 11 Computer Science Thinking: Recursion, Searching, Sorting and Big O
- Objectives
- Outline
- 11.1 Introduction
- 11.2 Factorials
- 11.3 Recursive Factorial Example
- 11.4 Recursive Fibonacci Series Example
- 11.5 Recursion vs. Iteration
- Self Check
- 11.6 Searching and Sorting
- 11.7 Linear Search
- 11.8 Efficiency of Algorithms: Big O
- 11.9 Binary Search
- 11.10 Sorting Algorithms
- 11.11 Selection Sort
- 11.12 Insertion Sort
- 11.13 Merge Sort
- 11.14 Big O Summary for This Chapter’s Searching and Sorting Algorithms
- 11.15 Visualizing Algorithms
- 11.16 Wrap-Up
- Exercises
- 12 Natural Language Processing (NLP)
- Objectives
- Outline
- 12.1 Introduction
- 12.2 TextBlob
- Self Check
- 12.2.1 Create a TextBlob
- Self Check
- 12.2.2 Tokenizing Text into Sentences and Words
- Self Check
- 12.2.3 Parts-of-Speech Tagging
- Self Check
- 12.2.4 Extracting Noun Phrases
- Self Check
- 12.2.5 Sentiment Analysis with TextBlob’s Default Sentiment Analyzer
- Self Check
- 12.2.6 Sentiment Analysis with the NaiveBayesAnalyzer
- Self Check
- 12.2.7 Language Detection and Translation
- Self Check
- 12.2.8 Inflection: Pluralization and Singularization
- Self Check
- 12.2.9 Spell Checking and Correction
- Self Check
- 12.2.10 Normalization: Stemming and Lemmatization
- Self Check
- 12.2.11 Word Frequencies
- Self Check
- 12.2.12 Getting Definitions, Synonyms and Antonyms from WordNet
- Self Check
- 12.2.13 Deleting Stop Words
- Self Check
- 12.2.14 n-grams
- Self Check
- 12.3 Visualizing Word Frequencies with Bar Charts and Word Clouds
- 12.4 Readability Assessment with Textatistic
- 12.5 Named Entity Recognition with spaCy
- 12.6 Similarity Detection with spaCy
- 12.7 Other NLP Libraries and Tools
- 12.8 Machine Learning and Deep Learning Natural Language Applications
- 12.9 Natural Language Datasets
- 12.10 Wrap-Up
- Exercises
- 13 Data Mining Twitter
- Objectives
- Outline
- 13.1 Introduction
- 13.2 Overview of the Twitter APIs
- 13.3 Creating a Twitter Account
- 13.4 Getting Twitter Credentials—Creating an App
- 13.5 What’s in a Tweet?
- 13.6 Tweepy
- 13.7 Authenticating with Twitter Via Tweepy
- 13.8 Getting Information About a Twitter Account
- 13.9 Introduction to Tweepy Cursors: Getting an Account’s Followers and Friends
- 13.10 Searching Recent Tweets
- 13.11 Spotting Trends: Twitter Trends API
- 13.12 Cleaning/Preprocessing Tweets for Analysis
- 13.13 Twitter Streaming API
- 13.14 Tweet Sentiment Analysis
- 13.15 Geocoding and Mapping
- Self Check
- 13.15.1 Getting and Mapping the Tweets
- Get the API Object
- Collections Required By LocationListener
- Creating the LocationListener
- Configure and Start the Stream of Tweets
- Displaying the Location Statistics
- Geocoding the Locations
- Displaying the Bad Location Statistics
- Cleaning the Data
- Creating a Map with Folium
- Creating Popup Markers for the Tweet Locations
- Saving the Map
- Self Check
- 13.15.2 Utility Functions in tweetutilities.py
- Self Check
- 13.15.3 Class LocationListener
- 13.16 Ways to Store Tweets
- 13.17 Twitter and Time Series
- 13.18 Wrap-Up
- Exercises
- 14 IBM Watson and Cognitive Computing
- Outline
- 14.1 Introduction: IBM Watson and Cognitive Computing
- 14.2 IBM Cloud Account and Cloud Console
- 14.3 Watson Services
- 14.4 Additional Services and Tools
- 14.5 Watson Developer Cloud Python SDK
- 14.6 Case Study: Traveler’s Companion Translation App
- 14.7 Watson Resources
- 14.8 Wrap-Up
- Exercises
- 15 Machine Learning: Classification, Regression and Clustering
- Outline
- 15.1 Introduction to Machine Learning
- 15.2 Case Study: Classification with k-Nearest Neighbors and the Digits Dataset, Part 1
- 15.3 Case Study: Classification with k-Nearest Neighbors and the Digits Dataset, Part 2
- 15.4 Case Study: Time Series and Simple Linear Regression
- 15.5 Case Study: Multiple Linear Regression with the California Housing Dataset
- 15.5.1 Loading the Dataset
- 15.5.2 Exploring the Data with Pandas
- Self Check
- 15.5.3 Visualizing the Features
- Self Check
- 15.5.4 Splitting the Data for Training and Testing
- 15.5.5 Training the Model
- Self Check
- 15.5.6 Testing the Model
- 15.5.7 Visualizing the Expected vs. Predicted Prices
- 15.5.8 Regression Model Metrics
- Self Check
- 15.5.9 Choosing the Best Model
- 15.6 Case Study: Unsupervised Machine Learning, Part 1—Dimensionality Reduction
- 15.7 Case Study: Unsupervised Machine Learning, Part 2—k-Means Clustering
- Self Check
- 15.7.1 Loading the Iris Dataset
- 15.7.2 Exploring the Iris Dataset: Descriptive Statistics with Pandas
- 15.7.3 Visualizing the Dataset with a Seaborn pairplot
- 15.7.4 Using a KMeans Estimator
- 15.7.5 Dimensionality Reduction with Principal Component Analysis
- 15.7.6 Choosing the Best Clustering Estimator
- 15.8 Wrap-Up
- Exercises
- 16 Deep Learning
- Objectives
- Outline
- 16.1 Introduction
- 16.2 Keras Built-In Datasets
- 16.3 Custom Anaconda Environments
- 16.4 Neural Networks
- 16.5 Tensors
- 16.6 Convolutional Neural Networks for Vision; Multi-Classification with the MNIST Dataset
- Self Check
- 16.6.1 Loading the MNIST Dataset
- Self Check
- 16.6.2 Data Exploration
- 16.6.3 Data Preparation
- Self Check
- 16.6.4 Creating the Neural Network
- Adding Layers to the Network
- Convolution
- Adding a Convolution Layer
- Dimensionality of the First Convolution Layer’s Output
- Overfitting
- Adding a Pooling Layer
- Adding Another Convolutional Layer and Pooling Layer
- Flattening the Results
- Adding a Dense Layer to Reduce the Number of Features
- Adding Another Dense Layer to Produce the Final Output
- Printing the Model’s Summary
- Visualizing a Model’s Structure
- Compiling the Model
- Self Check
- 16.6.5 Training and Evaluating the Model
- Self Check
- 16.6.6 Saving and Loading a Model
- Self Check
- 16.7 Visualizing Neural Network Training with TensorBoard
- 16.8 ConvnetJS: Browser-Based Deep-Learning Training and Visualization
- 16.9 Recurrent Neural Networks for Sequences; Sentiment Analysis with the IMDb Dataset
- 16.10 Tuning Deep Learning Models
- 16.11 Convnet Models Pretrained on ImageNet
- 16.12 Reinforcement Learning
- 16.13 Wrap-Up
- Exercises
- Convolutional Neural Networks
- Recurrent Neural Networks
- ConvnetJS Visualization
- Convolutional Neural Network Projects and Research
- Recurrent Neural Network Projects and Research
- Automated Deep Learning Project
- Reinforcement Learning Projects and Research
- Generative Deep Learning
- Deep Fakes
- Additional Research
- 17 Big Data: Hadoop, Spark, NoSQL and IoT
- Objectives
- Outline
- 17.1 Introduction
- 17.2 Relational Databases and Structured Query Language (SQL)
- 17.3 NoSQL and NewSQL Big-Data Databases: A Brief Tour
- 17.4 Case Study: A MongoDB JSON Document Database
- 17.4.1 Creating the MongoDB Atlas Cluster
- 17.4.2 Streaming Tweets into MongoDB
- Use Tweepy to Authenticate with Twitter
- Loading the Senators’ Data
- Configuring the MongoClient
- Setting up Tweet Stream
- Starting the Tweet Stream
- Class TweetListener
- Counting Tweets for Each Senator
- Show Tweet Counts for Each Senator
- Get the State Locations for Plotting Markers
- Grouping the Tweet Counts by State
- Creating the Map
- Creating a Choropleth to Color the Map
- Creating the Map Markers for Each State
- Displaying the Map
- Self Check for Section 17.4
- 17.5 Hadoop
- 17.5.1 Hadoop Overview
- 17.5.2 Summarizing Word Lengths in Romeo and Juliet via MapReduce
- 17.5.3 Creating an Apache Hadoop Cluster in Microsoft Azure HDInsight
- 17.5.4 Hadoop Streaming
- 17.5.5 Implementing the Mapper
- 17.5.6 Implementing the Reducer
- 17.5.7 Preparing to Run the MapReduce Example
- 17.5.8 Running the MapReduce Job
- Self Check for Section 17.5
- 17.6 Spark
- 17.7 Spark Streaming: Counting Twitter Hashtags Using the pyspark-notebook Docker Stack
- 17.7.1 Streaming Tweets to a Socket
- 17.7.2 Summarizing Tweet Hashtags; Introducing Spark SQL
- Importing the Libraries
- Utility Function to Get the SparkSession
- Utility Function to Display a Barchart Based on a Spark DataFrame
- Utility Function to Summarize the Top-20 Hashtags So Far
- Getting the SparkContext
- Getting the StreamingContext
- Setting Up a Checkpoint for Maintaining State
- Connecting to the Stream via a Socket
- Tokenizing the Lines of Hashtags
- Mapping the Hashtags to Tuples of Hashtag-Count Pairs
- Totaling the Hashtag Counts So Far
- Specifying the Method to Call for Every RDD
- Starting the Spark Stream
- Self Check for Section 17.7
- 17.8 Internet of Things and Dashboards
- 17.8.1 Publish and Subscribe
- 17.8.2 Visualizing a PubNub Sample Live Stream with a Freeboard Dashboard
- 17.8.3 Simulating an Internet-Connected Thermostat in Python
- 17.8.4 Creating the Dashboard with Freeboard.io
- 17.8.5 Creating a Python PubNub Subscriber
- Message Format
- Importing the Libraries
- List and DataFrame Used for Storing Company Names and Prices
- Class SensorSubscriberCallback
- Function Update
- Configuring the Figure
- Configuring the FuncAnimation and Displaying the Window
- Configuring the PubNub Client
- Subscribing to the Channel
- Ensuring the Figure Remains on the Screen
- Self Check for Section 17.8
- 17.9 Wrap-Up
- Exercises
- Index
Product information
- Title: Intro to Python for Computer Science and Data Science: Learning to Program with AI, Big Data and the Cloud
- Author(s):
- Release date: May 2019
- Publisher(s): Pearson
- ISBN: 9780135404676