Graph Data Modeling in Python

Book description

Learn how to transform, store, evolve, refactor, model, and create graph projections using the Python programming language. Purchase of the print or Kindle book includes a free PDF eBook.

Key Features

  • Transform relational data models into graph data models while learning key applications along the way
  • Discover common challenges in graph modeling and analysis, and learn how to overcome them
  • Practice real-world use cases of community detection, knowledge graphs, and recommendation networks

Book Description

Graphs have become increasingly integral to powering the products and services we use in our daily lives, driving social media, online shopping recommendations, and even fraud detection. With this book, you’ll see how a good graph data model can help enhance efficiency and unlock hidden insights through complex network analysis.

Graph Data Modeling in Python will guide you through designing, implementing, and harnessing a variety of graph data models using the popular open source Python libraries NetworkX and igraph. Following practical use cases and examples, you’ll find out how to design optimal graph models capable of supporting a wide range of queries and features. Moreover, you’ll seamlessly transition from traditional relational databases and tabular data to the dynamic world of graph data structures that allow powerful, path-based analyses. As well as learning how to manage a persistent graph database using Neo4j, you’ll also get to grips with adapting your network model to evolving data requirements.

By the end of this book, you’ll be able to transform tabular data into powerful graph data models. In essence, you’ll build your knowledge from beginner to advanced-level practitioner in no time.

What you will learn

  • Design graph data models and master schema design best practices
  • Work with the NetworkX and igraph frameworks in Python
  • Store, query, ingest, and refactor graph data
  • Store your graphs in a persistent graph database with Neo4j
  • Build and work with projections and put them into practice
  • Refactor schemas and learn tactics for managing an evolved graph data model

Who this book is for

If you are a data analyst or database developer interested in learning about graph databases and how to curate and extract data from them, this is the book for you. It is also beneficial for data scientists and Python developers looking to get started with graph data modeling. Although knowledge of Python is assumed, no prior experience in graph data modeling theory and techniques is required.

Table of contents

  1. Graph Data Modeling in Python
  2. Contributors
  3. About the authors
  4. About the reviewer
  5. Preface
    1. Who this book is for
    2. What this book covers
    3. To get the most out of this book
    4. Download the example code files
    5. Conventions used
    6. Get in touch
    7. Share your thoughts
    8. Download a free PDF copy of this book
  6. Part 1: Getting Started with Graph Data Modeling
  7. Chapter 1: Introducing Graphs in the Real World
    1. Technical requirements
    2. Why should you use graphs?
      1. Composite components of a graph
    3. The fundamentals of nodes and edges and the properties of a graph
      1. Undirected graphs
      2. Directed graphs
      3. Node properties
      4. Heterogeneous graphs
      5. Schema design
    4. Comparing RDBs and GDBs
      1. GDBs to the rescue
    5. The use of graphs across various industries
    6. Introduction to NetworkX and igraph
      1. NetworkX basics
      2. igraph basics
    7. Summary
  8. Chapter 2: Working with Graph Data Models
    1. Technical requirements
    2. Making the transition from tabular to graph data
      1. Examining the data
      2. Designing a schema
    3. Implementing the model in Python
      1. Adding nodes and attributes
      2. Adding edges
      3. Writing a generic graph import method
    4. The most popular TV show – a real-world use case
      1. Examining the graph structure
      2. Measuring connectedness
      3. Looking at the top degree nodes
      4. Using select() to interrogate the graph
      5. Properties of our popular nodes
    5. Summary
  9. Part 2: Making the Graph Transition
  10. Chapter 3: Data Model Transformation – Relational to Graph Databases
    1. Technical requirements
    2. Recommending a game to a user
      1. Installing MySQL
      2. Setting up a MySQL database
      3. Querying MySQL in Python
      4. Examining the data in Python
      5. Path-based analytics in tabular data
    3. From relational to graph databases
      1. Schema design
    4. Ingestion considerations
      1. Path-based analytics in igraph
    5. Our recommendation system
      1. Generic MySQL to igraph methods
      2. A more advanced recommendation system using Jaccard similarity
    6. Summary
  11. Chapter 4: Building a Knowledge Graph
    1. Technical requirements
    2. Introducing knowledge graphs
    3. Cleaning the data for our knowledge graph
    4. Ingesting data into a knowledge graph
      1. Designing a knowledge graph schema
      2. Linking text to terms
      3. Constructing the knowledge graph
    5. Knowledge graph analysis and community detection
      1. Examining the knowledge graph structure
      2. Identifying abstracts of interest
      3. Identifying fields with community detection
    6. Summary
  12. Part 3: Storing and Productionizing Graphs
  13. Chapter 5: Working with Graph Databases
    1. Technical requirements
    2. Using graph databases
      1. Neo4j as a graph database
      2. The Cypher query language
      3. Querying Neo4j from Python
    3. Storing a graph in Neo4j
      1. Preprocessing data
      2. Moving nodes, edges, and properties to Neo4j
    4. Optimizing travel with Python and Cypher
      1. Travel recommendations
    5. Moving to ingestion pipelines
    6. Summary
  14. Chapter 6: Pipeline Development
    1. Technical requirements
    2. Graph pipeline development
      1. A graph database for retail
    3. Designing a schema and pipeline
      1. Setting up a new database
      2. Schema design
      3. Adding static product information
      4. Simulating customer interactions
    4. Making product recommendations
      1. Product recommendations by brand
      2. Drawing on other customers' purchases
      3. Using similarity scores to recommend products
    5. Summary
  15. Chapter 7: Refactoring and Evolving Schemas
    1. Technical requirements
    2. Refactoring reasoning
      1. Change in relational and graph databases
    3. Effectively evolving with graph schema design
    4. Putting the changes into development
      1. Initializing a new database
      2. Adding constraints
      3. Pre-change schema
      4. Updating the schema
    5. Summary
  16. Part 4: Graphing Like a Pro
  17. Chapter 8: Perfect Projections
    1. Technical requirements
    2. What are projections?
    3. How to use a projection
      1. Creating a projection in igraph
      2. Creating a projection in Neo4j
    4. Putting the projection to work
      1. Analyzing the igraph actor projection
      2. Exploring connected components
      3. Exploring cliques in our graph
      4. Analyzing the Neo4j film projection
    5. Summary
  18. Chapter 9: Common Errors and Debugging
    1. Technical requirements
    2. Debugging graph issues
    3. Common igraph issues
      1. No nodes in the graph
      2. Node IDs in igraph
      3. Adding properties
      4. Using the select method
      5. Chained statements and select
      6. Efficiency and path lengths
    4. Common Neo4j issues
      1. Slow writing from file to Neo4j
      2. Indexing for query performance
      3. Caching results
      4. Memory limitations
      5. Handling duplicates with MERGE
      6. Handling duplicates with constraints
      7. EXPLAIN, PROFILE, and the eager operator
    5. Summary
  19. Index
    1. Why subscribe?
  20. Other Books You May Enjoy
    1. Packt is searching for authors like you
    2. Share your thoughts
    3. Download a free PDF copy of this book

Product information

  • Title: Graph Data Modeling in Python
  • Author(s): Gary Hutson, Matt Jackson
  • Release date: June 2023
  • Publisher(s): Packt Publishing
  • ISBN: 9781804618035