Book description
The book begins by presenting supervised and unsupervised machine learning (ML) and deep learning (DL) models, then examines big data frameworks alongside ML and DL frameworks. Author Tshepo Chris Nokeri considers a parametric model known as the Generalized Linear Model, as well as two survival regression models: the Cox Proportional Hazards model and the Accelerated Failure Time (AFT) model. The book also presents a binary classification model (logistic regression) and an ensemble model (Gradient Boosted Trees). It introduces DL through an artificial neural network known as the Multilayer Perceptron (MLP) classifier, covers cluster analysis with the K-Means model, and explores dimension reduction techniques such as Principal Component Analysis and Linear Discriminant Analysis. It closes by unpacking automated machine learning.
This book is for intermediate-level data scientists and machine learning engineers who want to learn how to apply key big data frameworks together with ML and DL frameworks. You will need prior knowledge of basic statistics, Python programming, probability theory, and predictive analytics.
- Understand widespread supervised and unsupervised learning, including key dimension reduction techniques
- Know the big data analytics layers such as data visualization, advanced statistics, predictive analytics, machine learning, and deep learning
- Integrate big data frameworks with a hybrid of machine learning frameworks and deep learning frameworks
- Design, build, test, and validate skilled machine learning models and deep learning models
- Optimize model performance using data transformation, regularization, outlier remedying, hyperparameter optimization, and data split ratio alteration
Table of contents
- Cover
- Front Matter
- 1. Exploring Machine Learning
- 2. Big Data, Machine Learning, and Deep Learning Frameworks
- 3. Linear Modeling with Scikit-Learn, PySpark, and H2O
- 4. Survival Analysis with PySpark and Lifelines
- 5. Nonlinear Modeling with Scikit-Learn, PySpark, and H2O
- 6. Tree Modeling and Gradient Boosting with Scikit-Learn, XGBoost, PySpark, and H2O
- 7. Neural Networks with Scikit-Learn, Keras, and H2O
- 8. Cluster Analysis with Scikit-Learn, PySpark, and H2O
- 9. Principal Component Analysis with Scikit-Learn, PySpark, and H2O
- 10. Automating the Machine Learning Process with H2O
- Back Matter
Product information
- Title: Data Science Solutions with Python: Fast and Scalable Models Using Keras, PySpark MLlib, H2O, XGBoost, and Scikit-Learn
- Author(s): Tshepo Chris Nokeri
- Release date: October 2021
- Publisher(s): Apress
- ISBN: 9781484277621