Statistics for Data Science

by James D. Miller

Released November 2017

Publisher(s): Packt Publishing

ISBN: 9781788290678

Book description

Get your statistics basics right before diving into the world of data science

About This Book

No need to take a degree in statistics: read this book and get a strong statistics base for data science and real-world programs
Implement statistics in data science tasks such as data cleaning, mining, and analysis
Learn all about probability, statistics, numerical computations, and more with the help of R programs

Who This Book Is For

This book is intended for developers who want to enter the field of data science and are looking for concise information on statistics, supported by insightful programs and simple explanations. Some basic hands-on experience with R will be useful.

What You Will Learn

Analyze the transition from a data developer to a data scientist mindset
Get acquainted with the R programs and the logic used for statistical computations
Understand mathematical concepts such as variance, standard deviation, probability, matrix calculations, and more
Learn to implement statistics in data science tasks such as data cleaning, mining, and analysis
Learn the statistical techniques required to perform tasks such as linear regression, regularization, model assessment, boosting, SVMs, and working with neural networks
Get comfortable with performing various statistical computations for data science programmatically

In Detail

Data science is an ever-evolving field that is growing rapidly in popularity. It draws on techniques and theories from the fields of statistics, computer science, and, most importantly, machine learning, databases, data visualization, and so on.

This book takes you through an entire journey of statistics, from knowing very little to becoming comfortable using various statistical methods for data science tasks. It starts off with simple statistics and then moves on to the statistical methods that are used in data science algorithms. The R programs for statistical computation are clearly explained, along with the logic behind them. You will come across various mathematical concepts, such as variance, standard deviation, probability, matrix calculations, and more. You will learn only what is required to implement statistics in data science tasks such as data cleaning, mining, and analysis. You will learn the statistical techniques required to perform tasks such as linear regression, regularization, model assessment, boosting, SVMs, and working with neural networks.

By the end of the book, you will be comfortable performing various statistical computations for data science programmatically.

Style and approach

A comprehensive, step-by-step guide with real-world examples