Book description
Explore the world of data science from scratch with Julia by your side
About This Book
An in-depth exploration of Julia's growing ecosystem of packages
Work with the most powerful open-source libraries for deep learning, data wrangling, and data visualization
Learn about deep learning using Mocha.jl and bring speed and high performance to data analysis on large data sets
Who This Book Is For
This book is aimed at data analysts and aspiring data scientists who have a basic knowledge of Julia or are completely new to it. It will also appeal to those who are competent in R and Python and wish to adopt Julia to improve their data science skill set. A good background in statistics and computational mathematics is beneficial.
What You Will Learn
Apply statistical models in Julia for data-driven decisions
Understand the process of data munging and data preparation using Julia
Explore techniques to visualize data using Julia and D3-based packages
Use Julia to create self-learning systems with cutting-edge machine learning algorithms
Create supervised and unsupervised machine learning systems using Julia, and explore ensemble models
Build a recommendation engine in Julia
Dive into Julia’s deep learning framework and build a system using Mocha.jl
In Detail
Julia is a fast, high-performance language that is perfectly suited to data science. It has a mature package ecosystem and is now feature complete, which makes it a good tool for a data science practitioner. A famous Harvard Business Review article called data scientist the sexiest job of the 21st century (https://hbr.org/2012/10/data-scientist-the-sexiest-job-of-the-21st-century).
This book will help you become familiar with Julia's rich and continuously evolving ecosystem, allowing you to stay on top of your game.
This book contains the essentials of data science and gives a high-level overview of advanced statistics and techniques. You will generate insights by performing inferential statistics and reveal hidden patterns and trends using data mining. The book offers practical coverage of statistics and machine learning, and you will develop the knowledge to build statistical models and machine learning systems in Julia with attractive visualizations.
You will then delve into the world of deep learning in Julia and come to understand Mocha.jl, the framework with which you can create artificial neural networks and implement deep learning.
This book addresses the challenges of real-world data science problems, including data cleaning, data preparation, inferential statistics, statistical modeling, building high-performance machine learning systems and creating effective visualizations using Julia.
Style and approach
This practical, easy-to-follow, yet comprehensive guide will get you learning about Julia with respect to data science. Each topic is explained thoroughly and placed in context. For the more inquisitive, we dive deeper into the language and its use cases. This is the one true guide to working with Julia in data science.
Table of contents
Julia for Data Science
- Julia for Data Science
- Credits
- About the Author
- About the Reviewer
- www.PacktPub.com
- Preface
- 1. The Groundwork – Julia's Environment
- 2. Data Munging
- What is data munging?
- What is a DataFrame?
- The NA data type and its importance
- DataArray – a series-like data structure
- DataFrames – tabular data structures
- Installation and using DataFrames.jl
- Working with DataFrames
- The Split-Apply-Combine strategy
- Reshaping the data
- Sorting a dataset
- Formula - a special data type for mathematical expressions
- Pooling data
- Web scraping
- Summary
- References
- 3. Data Exploration
- 4. Deep Dive into Inferential Statistics
- Installation
- Understanding the sampling distribution
- Understanding the normal distribution
- Type hierarchy in Distributions.jl
- Univariate distributions
- Truncated distributions
- Understanding multivariate distributions
- Understanding matrixvariate distributions
- Distribution fitting
- Confidence interval
- Understanding z-score
- Understanding the significance of the P-value
- Summary
- References
- 5. Making Sense of Data Using Visualization
- Difference between using and importall
- Pyplot for Julia
- Unicode plots
- Visualizing using Vega
- Data visualization using Gadfly
- Installing Gadfly
- Interacting with Gadfly using plot function
- Using Gadfly to plot DataFrames
- Using Gadfly to visualize functions and expressions
- Generating an image with multiple layers
- Generating plots with different aesthetics using statistics
- Generating plots with different aesthetics using Geometry
- Elements - scale
- Elements - guide
- Understanding how Gadfly works
- Summary
- References
- 6. Supervised Machine Learning
- 7. Unsupervised Machine Learning
- 8. Creating Ensemble Models
- 9. Time Series
- 10. Collaborative Filtering and Recommendation System
- 11. Introduction to Deep Learning
Product information
- Title: Julia for Data Science
- Author(s):
- Release date: September 2016
- Publisher(s): Packt Publishing
- ISBN: 9781785289699