Streamlit for Data Science - Second Edition

Book description

An easy-to-follow and comprehensive guide to creating data apps with Streamlit, including how-to guides for working with cloud data warehouses like Snowflake, using pretrained Hugging Face and OpenAI models, and creating apps for job interviews.

Key Features

  • Create machine learning apps with random forest, Hugging Face, and GPT-3.5 Turbo models
  • Gain insight into how experts harness Streamlit through in-depth interviews with Streamlit power users
  • Discover the full range of Streamlit’s capabilities via hands-on exercises to effortlessly create and deploy well-designed apps

Book Description

If you work with data in Python and are looking to create data apps that showcase ML models and make beautiful interactive visualizations, then this is the ideal book for you. Streamlit for Data Science, Second Edition, shows you how to create and deploy data apps quickly, all within Python. This helps you create prototypes in hours instead of days!

Written by a prolific Streamlit user and senior data scientist at Snowflake, this fully updated second edition builds on the practical nature of the previous edition with exciting updates, including connecting Streamlit to data warehouses like Snowflake, integrating Hugging Face and OpenAI models into your apps, and connecting to databases and building Streamlit apps on top of them. Plus, there is a fully updated code repository on GitHub to help you practice your newfound skills.

You'll start your journey with the fundamentals of Streamlit and gradually build on this foundation by working with machine learning models and producing high-quality interactive apps. The practical examples of both personal data projects and work-related data-focused web applications will help you get to grips with more challenging topics such as Streamlit Components, beautifying your apps, and quick deployment.

By the end of this book, you'll be able to create dynamic web apps in Streamlit quickly and effortlessly.

What you will learn

  • Set up your first development environment and create a basic Streamlit app from scratch
  • Create dynamic visualizations using built-in and imported Python libraries
  • Discover strategies for creating and deploying machine learning models in Streamlit
  • Deploy Streamlit apps with Streamlit Community Cloud, Hugging Face Spaces, and Heroku
  • Integrate Streamlit with Hugging Face, OpenAI, and Snowflake
  • Beautify Streamlit apps using themes and components
  • Implement best practices for prototyping your data science work with Streamlit

Who this book is for

This book is for data scientists and machine learning enthusiasts who want to get started with creating data apps in Streamlit. It is well suited to junior data scientists looking to gain valuable new skills in a specific and actionable fashion, and it is also a great resource for senior data scientists looking for a comprehensive overview of the library and how people use it. Prior knowledge of Python programming is a must, and you’ll get the most out of this book if you’ve used Python libraries like pandas and NumPy in the past.

Table of contents

  1. Preface
    1. Who this book is for
    2. What this book covers
    3. Acknowledgment
    4. To get the most out of this book
    5. Get in touch
  2. An Introduction to Streamlit
    1. Technical requirements
    2. Why Streamlit?
    3. Installing Streamlit
      1. Organizing Streamlit apps
      2. Streamlit plotting demo
    4. Making an app from scratch
      1. Using user input in Streamlit apps
    5. Finishing touches – adding text to Streamlit
    6. Summary
  3. Uploading, Downloading, and Manipulating Data
    1. Technical requirements
    2. The setup – Palmer’s Penguins
    3. Exploring Palmer’s Penguins
    4. Flow control in Streamlit
    5. Debugging Streamlit apps
    6. Developing in Streamlit
    7. Exploring in Jupyter and then copying to Streamlit
    8. Data manipulation in Streamlit
    9. An introduction to caching
    10. Persistence with Session State
    11. Summary
  4. Data Visualization
    1. Technical requirements
    2. San Francisco Trees – a new dataset
    3. Streamlit visualization use cases
    4. Streamlit’s built-in graphing functions
    5. Streamlit’s built-in visualization options
      1. Plotly
      2. Matplotlib and Seaborn
      3. Bokeh
      4. Altair
      5. PyDeck
      6. Configuration options
    6. Summary
  5. Machine Learning and AI with Streamlit
    1. Technical requirements
    2. The standard ML workflow
    3. Predicting penguin species
    4. Utilizing a pre-trained ML model in Streamlit
    5. Training models inside Streamlit apps
    6. Understanding ML results
    7. Integrating external ML libraries – a Hugging Face example
    8. Integrating external AI libraries – an OpenAI example
      1. Authenticating with OpenAI
      2. OpenAI API cost
      3. Streamlit and OpenAI
    9. Summary
  6. Deploying Streamlit with Streamlit Community Cloud
    1. Technical requirements
    2. Getting started with Streamlit Community Cloud
    3. A quick primer on GitHub
    4. Deploying with Streamlit Community Cloud
      1. Debugging Streamlit Community Cloud
      2. Streamlit Secrets
    5. Summary
  7. Beautifying Streamlit Apps
    1. Technical requirements
    2. Setting up the SF Trees dataset
      1. Working with columns in Streamlit
      2. Exploring page configuration
    3. Using Streamlit tabs
    4. Using the Streamlit sidebar
    5. Picking colors with a color picker
    6. Multi-page apps
    7. Editable DataFrames
    8. Summary
  8. Exploring Streamlit Components
    1. Technical requirements
    2. Adding editable DataFrames with streamlit-aggrid
    3. Creating drill-down graphs with streamlit-plotly-events
    4. Using Streamlit Components – streamlit-lottie
    5. Using Streamlit Components – streamlit-pandas-profiling
    6. Interactive maps with st-folium
    7. Helpful mini-functions with streamlit-extras
    8. Finding more Components
    9. Summary
  9. Deploying Streamlit Apps with Hugging Face and Heroku
    1. Technical requirements
    2. Choosing between Streamlit Community Cloud, Hugging Face, and Heroku
    3. Deploying Streamlit with Hugging Face
    4. Deploying Streamlit with Heroku
      1. Setting up and logging in to Heroku
      2. Cloning and configuring our local repository
      3. Deploying to Heroku
    5. Summary
  10. Connecting to Databases
    1. Technical requirements
    2. Connecting to Snowflake with Streamlit
    3. Connecting to BigQuery with Streamlit
      1. Adding user input to queries
      2. Organizing queries
    4. Summary
  11. Improving Job Applications with Streamlit
    1. Technical requirements
    2. Using Streamlit for proof-of-skill data projects
      1. Machine learning – the Penguins app
      2. Visualization – the Pretty Trees app
    3. Improving job applications in Streamlit
      1. Questions
      2. Answering Question 1
      3. Answering Question 2
    4. Summary
  12. The Data Project – Prototyping Projects in Streamlit
    1. Technical requirements
    2. Data science ideation
    3. Collecting and cleaning data
    4. Making an MVP
      1. How many books do I read each year?
      2. How long does it take for me to finish a book that I have started?
      3. How long are the books that I have read?
      4. How old are the books that I have read?
      5. How do I rate books compared to other Goodreads users?
    5. Iterative improvement
      1. Beautification via animation
      2. Organization using columns and width
      3. Narrative building through text and additional statistics
    6. Hosting and promotion
    7. Summary
  13. Streamlit Power Users
    1. Fanilo Andrianasolo
    2. Adrien Treuille
    3. Gerard Bentley
    4. Arnaud Miribel and Zachary Blackwood
    5. Yuichiro Tachibana
    6. Summary
  14. Other Books You May Enjoy
  15. Index

Product information

  • Title: Streamlit for Data Science - Second Edition
  • Author(s): Tyler Richards
  • Release date: September 2023
  • Publisher(s): Packt Publishing
  • ISBN: 9781803248226