Metadata Recycling into Graph Data Models

This book is 80% complete

Last updated on 2019-03-24

About the Book

Why waste good legacy data models just because you are moving to NoSQL? Legions of data modelers have spent countless hours and days producing (mostly) rather good representations of a business context, also known as data models.

I tried to get an overview of the data modeling tools market (see addendum below) and found a list of 77 ERD-supporting tools. On top of that, Wikipedia has a list of 49 UML tools, many of which are also used for data modeling. The history of these tools goes back to the mid-1980s with the CASE tools (see also below), based on the emerging IBM PC/AT computers.

A reader says that "this book was a serendipitous find for me as it will help out with a challenging project at work... We are attempting to create a consolidated data model out of all the different databases and source code bases, and metadata recycling can help us overcome that".

There must be hundreds of thousands of good, reusable data models! Why waste such a large resource of business metadata?

Much as in data science, we need to be able (in "Metadata Science") to read, transform, scope, reduce and enhance data models, and to adapt them to modern database technologies. Not least graph databases, if you ask me.

This book explains how to do that. Cypher®-scripts are included for Concept Maps (CmapTools®), CSDL, XML Schemas, StarUML v1 and UML® via XMI®. More to follow. It also explains how to build a simple graph-based metadata repository for:

  • Business level concept models
  • Solution level logical data models, and
  • Physical models.

The repository has full lineage back to the source data model, and it supports identities, uniqueness, mandatory fields and basic datatypes.
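
To give a feel for what lineage across the three levels means, here is a minimal in-memory sketch in Python. It is not the book's repository, which is graph-based and handled with Cypher scripts. The class `ModelElement`, its fields, and the sample element names are all assumptions made up for this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelElement:
    """One element in a hypothetical three-level metadata repository."""
    name: str
    level: str                       # "concept", "solution" or "physical"
    datatype: Optional[str] = None   # basic datatype, if known
    mandatory: bool = False          # mandatory field?
    unique: bool = False             # uniqueness / identity?
    derived_from: Optional["ModelElement"] = None  # lineage link (assumed name)


def lineage(element: ModelElement) -> list[str]:
    """Follow the derived_from links back to the source model element."""
    trail = []
    current: Optional[ModelElement] = element
    while current is not None:
        trail.append(f"{current.level}:{current.name}")
        current = current.derived_from
    return trail


# Invented sample data: one element carried through all three levels.
concept = ModelElement("Customer Number", "concept")
solution = ModelElement("customerNumber", "solution", "string", True, True, concept)
physical = ModelElement("customer_no", "physical", "string", True, True, solution)

print(lineage(physical))
```

In a graph database the `derived_from` link would more naturally be a relationship between nodes, so that lineage becomes a simple path query.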

Cypher scripts for repository handling are also part of the book (under an MIT license).

We suggest a choice of two approaches:

  • Fast Track Data Models (agile)
  • Super Data Models (crafted, using the repository)

I started looking at things this way because of a recent client situation. The challenge was building a graph database based on data, which:

  • Resided in Oracle®
  • Were modeled as UML® class diagrams, and
  • Were also available in JSON-format.

The target was a graph database and I started looking at metadata transformation using Neo4j®. One of the nice things about graph technology is that you evolve the data model as you go. Furthermore you have a very powerful, declarative language with many, many useful procedures and functions. It is called Cypher®, and it is becoming the SQL of graph.

To make the story short: We built (mostly we generated) a graph model by way of extracting metadata, mostly from an XMI®-representation of the UML® model, but also from some JSON meta files. We only used Cypher®-scripts and it was a whole lot easier than we thought it would be. This saved us a lot of time and gave us good opportunities for caring about the scope of the forthcoming graph data model.
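
The general idea of extracting metadata from XMI and generating graph statements can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Cypher-only approach the book uses: the XMI fragment is invented and heavily simplified, and the `Concept` and `Property` labels, the `HAS_PROPERTY` relationship type, and the function name `xmi_to_cypher` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Attribute name for xmi:type in ElementTree's {namespace}local notation.
XMI_NS = "http://www.omg.org/spec/XMI/20131001"
XMI_TYPE = f"{{{XMI_NS}}}type"

# Invented, simplified XMI fragment; real exports carry far more detail.
SAMPLE = """<xmi:XMI xmlns:xmi="http://www.omg.org/spec/XMI/20131001"
         xmlns:uml="http://www.omg.org/spec/UML/20131001">
  <uml:Model name="Demo">
    <packagedElement xmi:type="uml:Class" name="Customer">
      <ownedAttribute name="customerNumber"/>
      <ownedAttribute name="name"/>
    </packagedElement>
    <packagedElement xmi:type="uml:Class" name="Order">
      <ownedAttribute name="orderDate"/>
    </packagedElement>
  </uml:Model>
</xmi:XMI>"""


def xmi_to_cypher(xmi_text: str) -> list[str]:
    """Turn UML classes and their attributes into Cypher statements."""
    root = ET.fromstring(xmi_text)
    statements = []
    for element in root.iter("packagedElement"):
        if element.get(XMI_TYPE) != "uml:Class":
            continue  # skip packages, associations and so on
        class_name = element.get("name")
        # Naive quoting: names containing a single quote would need escaping.
        statements.append(f"MERGE (:Concept {{name: '{class_name}'}})")
        for attribute in element.findall("ownedAttribute"):
            prop_name = attribute.get("name")
            statements.append(
                f"MATCH (c:Concept {{name: '{class_name}'}}) "
                f"MERGE (c)-[:HAS_PROPERTY]->(:Property {{name: '{prop_name}'}})"
            )
    return statements


for statement in xmi_to_cypher(SAMPLE):
    print(statement)
```

Generating statements as text keeps the sketch independent of any database driver; the output could be reviewed and scoped before it is run against a graph database.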

Looking for a graph database modeling tool? Now, everything you need for recycling of data models is available inside this book. 47 scripts (and growing) plus the repository model providing data lineage.

Missing a data modeling tool on the list? Suggestions of other formats to be supported can be sent to @VizDataModeler (info at graphdatamodeling dot com).

About the Author

Thomas Frisendal

Thomas Frisendal is an experienced data guy with more than 30 years on the IT vendor side and as an independent consultant. He has worked with databases and data modeling since the late 70s; since 1995 primarily on data warehouse projects. He has a strong urge to visualize everything as graphs, even data models! He excels in the art of turning data into information and knowledge. His approach to information-driven analysis and design is "New Nordic" in the sense that it represents the traditional Nordic values such as superior quality, functionality, reliability and innovation by new ways of communicating the structure and meaning of the business context.

Thomas is an active writer and speaker.

He has previously published:

  • Design Thinking Business Analysis: Business Concept Mapping Applied, Springer, 2012
  • Graph Data Modeling for NoSQL and SQL: Visualize Structure and Meaning, Technics Publications, 2017
  • Visual Design of GraphQL Data, first on Leanpub, then on Apress, 2018

He is blogging at Dataversity.

Thomas lives in Copenhagen, close to the Airport.

Table of Contents

  • Introduction
  • Acknowledgements
  • Overview of the Value Proposition of Recycling Metadata
    • Concerns Drive Full Scale Data Modeling
    • Full Scale Data Architecture
  • The Atoms and Molecules of Data Models
    • Entity-Relationship Modeling
    • Relational and SQL
    • Object Orientation and UML®
    • Graph Data Models
    • Object Role Modeling (a.k.a. Fact Modeling)
    • The Key Things
    • The Universal Constituent Parts of Data Models
  • The Process of Generating Data Models
    • The Mechanics
    • Designing a Flexible Approach
  • Super Data Models
    • Load & Transform
    • Subset to get the Solution Model
    • Extend the Solution Model
    • Transform to get a Physical Model
    • Optimize the Database
  • Fast Track Data Models
    • Load & Transform
    • Auto-generate a Physical Model
    • Optimize the Database
  • Build Your Own Metadata Repository
    • The Repository on One Page
    • Concept Models
    • Solution Models
    • Physical Models
    • Lineage in the repository
    • Metadata details
  • Concept Maps (CmapTools®) as Graph Data Models
    • About Concept Maps
    • Load & Transform Concept Maps
    • The Cookbook for Concept Maps
    • From Concept Model to Super Data Model or Fast Track?
  • Conceptual Schemas in OData CSDL Format
    • About CSDL, Common Schema Definition Language from OASIS
    • Looking at the CSDL file
    • Building the Concept Model from the CSDL Model
  • XML Schemas from the W3C
    • About XML Schemas
    • Looking at the XSD (XML) file
    • Building the Concept Model from the XSD Model
  • StarUML ERD Models
    • About StarUML
    • Looking at the UML (XML) file
    • Building the Concept Model from the UML® ERD Model
  • UML® as XMI® the EA Way
    • About UML® and XMI® and EA
    • The example: Open, Public Data
    • Looking at the XMI®-file
    • Building the concept model from the UML® model
  • Addendum: Improving Metadata Quality
    • The Business is King
    • Data Names Matter
    • Finding Standard Data Structures
    • Establishing Identity and Uniqueness
    • Presenting the Business Flow
    • Presenting the Keys
    • Presenting State Changes
    • Presenting Versions of Data
    • Housekeeping Proper
    • Scalar Data Types
    • Time Zones
    • Design is Decisions
    • Which Objects and Which Relationships?
    • Presenting Relationships and Missing References
    • Presenting the Right Level of Detail
    • Good Relationships
    • Identity, Uniqueness and Keys Revisited
    • Missing Information
  • Other books by Thomas Frisendal
