Book description
Take your first steps to becoming a fully qualified data analyst by learning how to explore complex datasets
Key Features
- Master each concept through practical exercises and activities
- Discover various statistical techniques to analyze your data
- Implement everything you’ve learned on a real-world case study to uncover valuable insights
Book Description
Every day, businesses operate around the clock, and a huge amount of data is generated at a rapid pace. This book helps you analyze this data and identify key patterns and behaviors that can help you and your business understand your customers at a deep, fundamental level.
SQL for Data Analytics, Third Edition is a great way to get started with data analysis, showing how to effectively sort and process information from raw data, even without any prior experience.
You will begin by learning how to form hypotheses and generate descriptive statistics that can provide key insights into your existing data. As you progress, you will learn how to write SQL queries to aggregate, calculate, and combine SQL data from sources outside of your current dataset. You will also discover how to work with advanced data types, like JSON. By exploring advanced techniques, such as geospatial analysis and text analysis, you will be able to understand your business at a deeper level. Finally, the book lets you in on the secret to getting information faster and more effectively by using advanced techniques like profiling and automation.
By the end of this book, you will be proficient in the efficient application of SQL techniques in everyday business scenarios and looking at data with the critical eye of an analytics professional.
What you will learn
- Use SQL to clean, prepare, and combine different datasets
- Aggregate basic statistics using GROUP BY clauses
- Perform advanced statistical calculations using window functions
- Import data into a database to combine with other tables
- Export SQL query results into various sources
- Analyze special data types in SQL, including geospatial, date/time, and JSON data
- Optimize queries and automate tasks
- Think about data problems and find answers using SQL
Who this book is for
If you're a database engineer looking to transition into analytics or a backend engineer who wants to develop a deeper understanding of production data and gain practical SQL knowledge, you will find this book useful. This book is also ideal for data scientists or business analysts who want to improve their data analytics skills using SQL. Basic familiarity with SQL (such as basic SELECT, WHERE, and GROUP BY clauses) and a good understanding of linear algebra, statistics, and PostgreSQL 14 are necessary to make the most of this book.
Table of contents
- SQL for Data Analytics
- Third edition
- Preface
- 1. Understanding and Describing Data
- Introduction
- Data Analytics and Statistics
- Types of Statistics
- Methods of Descriptive Statistics
- Univariate Analysis
- Exercise 1.01: Creating a Histogram
- Exercise 1.02: Calculating the Quartiles for Add-On Sales
- Exercise 1.03: Calculating the Central Tendency of Add-On Sales
- Exercise 1.04: Dispersion of Add-On Sales
- Bivariate Analysis
- Exercise 1.05: Calculating the Pearson Correlation Coefficient for Two Variables
- Interpreting and Analyzing the Correlation Coefficient
- Activity 1.02: Exploring Dealership Sales Data
- Working with Missing Data
- Statistical Significance Testing
- SQL and Analytics
- Summary
- 2. The Basics of SQL for Analytics
- Introduction
- The World of Data
- Relational Databases and SQL
- PostgreSQL Relational Database Management System (RDBMS)
- Exercise 2.01: Running Your First SELECT Query
- SELECT Statement
- The WHERE Clause
- The AND/OR Clause
- The IN/NOT IN Clause
- ORDER BY Clause
- The LIMIT Clause
- IS NULL/IS NOT NULL Clause
- Exercise 2.02: Querying the salespeople Table Using Basic Keywords in a SELECT Query
- Activity 2.01: Querying the customers Table Using Basic Keywords in a SELECT Query
- Creating Tables
- Basic Data Types of SQL
- Data Structures: JSON and Arrays
- Column Constraints
- Updating Tables
- Adding and Removing Columns
- Adding New Data
- Updating Existing Rows
- Exercise 2.04: Updating the Table to Increase the Price of a Vehicle
- Deleting Data and Tables
- Deleting Values from a Row
- Deleting Rows from a Table
- Deleting Tables
- Exercise 2.05: Deleting an Unnecessary Reference Table
- Activity 2.02: Creating and Modifying Tables for Marketing Operations
- SQL and Analytics
- Summary
- 3. SQL for Data Preparation
- 4. Aggregate Functions for Data Analysis
- 5. Window Functions for Data Analysis
- 6. Importing and Exporting Data
- Introduction
- The COPY Command
- Using Python with Your Database
- Getting Started with Python
- Improving PostgreSQL Access in Python with SQLAlchemy and pandas
- What is SQLAlchemy?
- Using Python with SQLAlchemy and pandas
- Reading and Writing to a Database with pandas
- Writing Data to the Database Using Python
- Exercise 6.02: Reading, Visualizing, and Saving Data in Python
- Improving Python Write Speed with COPY
- Reading and Writing CSV Files with Python
- Best Practices for Importing and Exporting Data
- Going Passwordless
- Summary
- 7. Analytics Using Complex Data Types
- 8. Performant SQL
- Introduction
- The Importance of Highly Efficient SQL
- Database Scanning Methods
- Query Planning
- Exercise 8.01: Interpreting the Query Planner
- Activity 8.01: Query Planning
- Index Scanning
- The B-Tree Index
- Exercise 8.02: Creating an Index Scan
- Activity 8.02: Implementing Index Scans
- The Hash Index
- Exercise 8.03: Generating Several Hash Indexes to Investigate Performance
- Activity 8.03: Implementing Hash Indexes
- Effective Index Use
- Killing Queries
- Functions and Triggers
- Function Definitions
- Exercise 8.05: Creating Functions without Arguments
- Activity 8.04: Defining a Largest Sale Value Function
- Exercise 8.06: Creating Functions with Arguments
- Activity 8.05: Creating Functions with Arguments
- Triggers
- Exercise 8.07: Creating Triggers to Update Fields
- Activity 8.06: Creating a Trigger to Track Average Purchases
- Summary
- 9. Using SQL to Uncover the Truth: A Case Study
- Introduction
- Case Study
- The Scientific Method
- Exercise 9.01: Preliminary Data Collection Using SQL Techniques
- Exercise 9.02: Extracting the Sales Information
- Activity 9.01: Quantifying the Sales Drop
- Exercise 9.03: Launch Timing Analysis
- Activity 9.02: Analyzing the Difference in the Sales Price Hypothesis
- Exercise 9.04: Analyzing Sales Growth by Email Opening Rate
- Exercise 9.05: Analyzing the Performance of the Email Marketing Campaign
- Conclusions
- In-Field Testing
- Summary
- Appendix
- 1. Understanding and Describing Data
- 2. The Basics of SQL for Analytics
- 3. SQL for Data Preparation
- 4. Aggregate Functions for Data Analysis
- 5. Window Functions for Data Analysis
- 6. Importing and Exporting Data
- 7. Analytics Using Complex Data Types
- 8. Performant SQL
- 9. Using SQL to Uncover the Truth: A Case Study
Product information
- Title: SQL for Data Analytics - Third Edition
- Author(s):
- Release date: August 2022
- Publisher(s): Packt Publishing
- ISBN: 9781801812870