A Data Engineer's Manual

About the Book

A great deal of recent hype has been directed toward data scientists, who use powerful algorithms and visualization tools to develop new ways of analyzing business data and to find new insights. This is challenging, creative work, but by itself a new model or report provides only a one-time benefit. An increasingly important role has received much less attention than it deserves: that of the data engineer, who can take a new model or algorithm and automate it, making it repeatable and accessible to non-expert users such as managers and customers. These unsung heroes create analytical systems, also called "data products", that are critical for organizations to reap ongoing benefits from their data assets.

In A Data Engineer's Manual, we dive into a hierarchy of fundamental knowledge you'll need to understand and work on data products. We will explore "data in the wild", that is, what forms it takes and how it is communicated over the Internet; learn about the roles played by different types of databases (relational, dimensional, and NoSQL); and examine how new data technologies change analytics workflows and deliver value to the business.

About the Author

Joseph W. Clark, Ph.D.

Joseph W. Clark, Ph.D. has researched and taught information systems and data analytics topics since 2006, most recently at the University of Maine. He was one of the first generation of Web developers in the 1990s, and has been fascinated with databases and data modeling since he first learned how relational databases could power dynamic websites, around 1999. His academic interests lately have been at the intersection of data analytics and entrepreneurship, and new types of workflows such as Agile, Lean, and Design Thinking. His most ambitious project yet is raising four children with his beautiful wife, Xiaofang.

Table of Contents

  • Preface: Productizing Data
    • Analytics and Business Value
    • A Hierarchy of Data Engineering Knowledge
    • References & Recommended Reading
  • Chapter 1: Atoms, Bytes, and Databases
    • Atoms
    • Bytes
    • The Trouble With Files
    • Databases
    • Data Engineering
    • References & Recommended Reading
  • Chapter 2: Data in the Wild
    • The Context of Data Sharing
    • Formats for Data Interchange
    • Other Technologies for Data Serialization
    • References & Recommended Reading
  • Chapter 3: A Multitude of Databases
    • Defining the Database
    • Data Models
    • Databases in Applications
    • Choices in Logical Database Design
    • References & Recommended Reading
  • Chapter 4: Analytics in the Database
    • Querying a Database
    • The Query Optimizer
    • Complex Analyses Made Simple
    • Crunching Big Data with MapReduce in the Database
    • Logic in the Database
    • Summary
    • References & Recommended Reading
  • Chapter 5: Opening Your Data to the World
    • The language of the Internet
    • Communicating in Data
    • REST to the Rescue
    • Designing an API
    • Summary
    • References & Recommended Reading
  • Chapter 6: Data for Humans
    • Value from data
    • Analytics for humans or for machines?
    • Augmenting the brain
    • Iterating toward enlightenment
    • References and Recommended Reading
  • Chapter 7: A Workflow for Analytics Development
    • The “Black Box” View
    • A Pipeline to Analytics
    • Agility in Analytics
    • References and Recommended Reading
  • About the Author
