This short book is designed to help readers quickly gain a working knowledge of building knowledge graph and GraphRAG applications for process industry operations. It covers the complete journey: understanding why traditional retrieval-augmented generation (RAG) falls short on relationship-heavy plant questions, designing a domain ontology, extracting entities from unstructured incident reports with LLMs, querying the resulting knowledge graph in natural language, and finally deploying everything as a web application. Through a hands-on, tutorial-style approach, readers will learn how to set up Neo4j, write Cypher queries that traverse multi-hop relationships, use OpenAI structured outputs for reliable extraction, build a natural-language-to-Cypher pipeline with LangChain, and combine graph traversal with vector search for hybrid retrieval. The application-focused approach makes the material accessible to practicing and aspiring process engineers and data scientists. Upon completion, readers will be able to confidently build and deploy GraphRAG solutions for their plants and make informed design decisions suited to their industrial environments.
The following topics are broadly covered:
• Why knowledge graphs and GraphRAG matter for process industry operations
• Where traditional approaches (relational databases and simple RAG) fall short
• Setting up Neo4j AuraDB and learning the Cypher query language from scratch
• Designing a domain ontology for process troubleshooting data
• LLM-powered entity and relationship extraction with OpenAI structured outputs
• Automated ingestion of incident reports into a Neo4j knowledge graph
• Building a natural-language-to-Cypher query pipeline with LangChain
• Demo Application: A complete Streamlit-based process troubleshooting assistant