Metadata Recycling into Graph Data Models
This book is 80% complete
Last updated on 2019-03-24
About the Book
Why waste good legacy data models just because you are moving to NoSQL? Legions of data modelers have spent countless hours producing (mostly) rather good representations of business contexts, also known as data models.
I tried to get an overview of the data modeling tools market (see the addendum below) and found a list of 77 ERD-supporting tools. On top of that, Wikipedia has a list of 49 UML tools, many of which are also used for data modeling. The history of these tools goes back to the mid-1980s, with the CASE tools (see also below) based on the then-emerging IBM PC/AT computers.
A reader says that "this book was a serendipitous find for me as it will help out with a challenging project at work... We are attempting to create a consolidated data model out of all the different databases and source code bases, and metadata recycling can help us overcome that".
There must be hundreds of thousands of good, reusable data models! Why waste such a large resource of business metadata?
Much as in data science, we need to be able (in "Metadata Science") to read, transform, scope, reduce, enhance and adapt metadata to modern database technologies. Not least graph databases, if you ask me.
This book explains how to do that. Cypher®-scripts are included for Concept Maps (CmapTools®), CSDL, XML Schemas, StarUML v1 and UML® via XMI®. More to follow. It also explains how to build a simple graph-based metadata repository for:
- Business level concept models
- Solution level logical data models, and
- Physical models.
The repository has full lineage back to the source data model, and it supports identities, uniqueness, mandatory fields and basic datatypes.
Cypher scripts for repository handling are also part of the book (under an MIT license).
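To make the idea of lineage across the three model levels a little more concrete, here is a minimal Python sketch. It is not the book's Cypher repository: the `ModelElement` class, its field names and the sample element names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ModelElement:
    """One element in a (hypothetical) three-level metadata repository."""
    name: str
    level: str                       # "concept", "solution" or "physical"
    datatype: Optional[str] = None   # basic datatype, if known
    mandatory: bool = False          # mandatory field?
    unique: bool = False             # part of an identity / uniqueness rule?
    derived_from: Optional["ModelElement"] = None  # lineage link


def lineage(element: ModelElement) -> List[str]:
    """Follow derived_from links from an element back to its origin."""
    path = [element]
    while path[-1].derived_from is not None:
        path.append(path[-1].derived_from)
    return [f"{e.level}:{e.name}" for e in path]


# Assumed sample data: one business concept carried through all three levels.
concept = ModelElement("Customer Name", "concept")
solution = ModelElement("customerName", "solution", datatype="string",
                        mandatory=True, derived_from=concept)
physical = ModelElement("customer_name", "physical", datatype="STRING",
                        mandatory=True, derived_from=solution)

print(lineage(physical))
```

In a graph database the `derived_from` link would naturally become a relationship between nodes, which is one reason a graph suits this kind of repository.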
We suggest a choice of two approaches:
- Fast Track Data Models (agile)
- Super Data Models (crafted, using the repository)
I started looking at things this way because of a recent client situation. The challenge was building a graph database based on data, which:
- Resided in Oracle®
- Were modeled as UML® class diagrams, and
- Were also available in JSON-format.
The target was a graph database and I started looking at metadata transformation using Neo4j®. One of the nice things about graph technology is that you can evolve the data model as you go. Furthermore, you have a very powerful, declarative language with many useful procedures and functions. It is called Cypher®, and it is becoming the SQL of graph.
To make a long story short: we built (or rather, mostly generated) a graph model by extracting metadata, mostly from an XMI® representation of the UML® model, but also from some JSON meta files. We used only Cypher® scripts, and it was a whole lot easier than we thought it would be. This saved us a lot of time and gave us room to think carefully about the scope of the forthcoming graph data model.
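The project described here used only Cypher scripts. Purely to illustrate the general idea of turning class metadata into graph statements, here is a small Python sketch under stated assumptions: the XML fragment is a simplified, hypothetical stand-in for a real XMI export, and the `Concept` and `Property` labels and the `HAS_PROPERTY` relationship type are invented for the example.

```python
import xml.etree.ElementTree as ET

# Namespace URI assumed for this sample; real exports vary by tool and version.
XMI_NS = "http://www.omg.org/spec/XMI/20131001"

# Simplified, hypothetical XMI-like fragment (real exports are far richer).
XMI_SAMPLE = """<xmi:XMI xmlns:xmi="http://www.omg.org/spec/XMI/20131001"
         xmlns:uml="http://www.omg.org/spec/UML/20131001">
  <uml:Model name="Demo">
    <packagedElement xmi:type="uml:Class" xmi:id="c1" name="Customer">
      <ownedAttribute xmi:id="a1" name="customerName"/>
    </packagedElement>
    <packagedElement xmi:type="uml:Class" xmi:id="c2" name="Order">
      <ownedAttribute xmi:id="a2" name="orderDate"/>
    </packagedElement>
  </uml:Model>
</xmi:XMI>"""


def extract_classes(xmi_text):
    """Return (class name, [attribute names]) for every UML class found."""
    root = ET.fromstring(xmi_text)
    classes = []
    for element in root.iter("packagedElement"):
        if element.get(f"{{{XMI_NS}}}type") == "uml:Class":
            attributes = [a.get("name") for a in element.findall("ownedAttribute")]
            classes.append((element.get("name"), attributes))
    return classes


def to_cypher(classes):
    """Generate Cypher statements as text (labels are hypothetical).

    Names are interpolated directly for readability; production code
    should pass them as query parameters instead.
    """
    statements = []
    for class_name, attributes in classes:
        statements.append(f"MERGE (:Concept {{name: '{class_name}'}})")
        for attribute in attributes:
            statements.append(
                f"MATCH (c:Concept {{name: '{class_name}'}}) "
                f"MERGE (c)-[:HAS_PROPERTY]->(:Property {{name: '{attribute}'}})"
            )
    return statements


for statement in to_cypher(extract_classes(XMI_SAMPLE)):
    print(statement)
```

The sketch only generates statement text; running it against a database, and handling associations, datatypes and keys, is left out.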
Looking for a graph database modeling tool? Everything you need for recycling data models is available inside this book: 47 scripts (and growing), plus the repository model that provides data lineage.
Missing a data modeling tool on the list? Suggestions for other formats to support can be sent to @VizDataModeler (info at graphdatamodeling dot com).
Overview of the Value Proposition of Recycling Metadata
- Concerns Drive Full Scale Data Modeling
- Full Scale Data Architecture
The Atoms and Molecules of Data Models
- Entity-Relationship Modeling
- Relational and SQL
- Object Orientation and UML®
- Graph Data Models
- Object Role Modeling (a.k.a. Fact Modeling)
- The Key Things
- The Universal Constituent Parts of Data Models
The Process of Generating Data Models
- The Mechanics
- Designing a Flexible Approach
Super Data Models
- Load & Transform
- Subset to get the Solution Model
- Extend the Solution Model
- Transform to get a Physical Model
- Optimize the Database
Fast Track Data Models
- Load & Transform
- Auto-generate a Physical Model
- Optimize the Database
Build Your Own Metadata Repository
- The Repository on One Page
- Concept Models
- Solution Models
- Physical Models
- Lineage in the repository
- Metadata details
Concept Maps (CmapTools®) as Graph Data Models
- About Concept Maps
- Load & Transform Concept Maps
- The Cookbook for Concept Maps
- From Concept Model to Super Data Model or Fast Track?
Conceptual Schemas in OData CSDL Format
- About CSDL, Common Schema Definition Language from OASIS
- Looking at the CSDL file
- Building the Concept Model from the CSDL Model
XML Schemas from the W3C
- About XML Schemas
- Looking at the XSD (XML) file
- Building the Concept Model from the XSD Model
StarUML ERD Models
- About StarUML
- Looking at the UML (XML) file
- Building the Concept Model from the UML® ERD Model
UML® as XMI® the EA Way
- About UML® and XMI® and EA
- The example: Open, Public Data
- Looking at the XMI®-file
- Building the concept model from the UML® model
Addendum: Improving Metadata Quality
- The Business is King
- Data Names Matter
- Finding Standard Data Structures
- Establishing Identity and Uniqueness
- Presenting the Business Flow
- Presenting the Keys
- Presenting State Changes
- Presenting Versions of Data
- Housekeeping Proper
- Scalar Data Types
- Time Zones
- Design is Decisions
- Which Objects and Which Relationships?
- Presenting Relationships and Missing References
- Presenting the Right Level of Detail
- Good Relationships
- Identity, Uniqueness and Keys Revisited
- Missing Information
- Other books by Thomas Frisendal