An interview with Christoph Molnar
  • January 30th, 2019

Christoph Molnar, Author of Interpretable Machine Learning: A Guide for Making Black Box Models Explainable

50 MIN
In this Episode

Christoph Molnar is the author of the Leanpub book Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. In this interview, Leanpub co-founder Len Epp talks with Christoph about his background, what it takes to work on a Ph.D., his book and interpretability, as well as machine learning generally, some dystopian possibilities for the future, and at the end, they talk a little bit about his experience as a self-published author, including the systemic use of a free translation tool.

This interview was recorded on January 22, 2019.

The full audio for the interview is here. You can subscribe to the Frontmatter podcast in iTunes or add the podcast URL directly.

This interview has been edited for conciseness and clarity.



Len: Hi, I'm Len Epp from Leanpub, and in this Leanpub Frontmatter Podcast, I'll be interviewing Christoph Molnar.

Based in Munich, Christoph is a data scientist and interpretable machine learning researcher. He is writing his Ph.D. at Ludwig-Maximilians-Universität, and as a researcher he has a particular interest in making the decisions from algorithms more understandable for humans. That's an important topic in our current era that people are becoming more aware of, and we'll be talking about it later in the podcast.

Christoph blogs about machine learning and statistics on his Machine Master Blog, and you can follow him on Twitter @ChristophMolnar.

Christoph is the author of the Leanpub book Interpretable Machine Learning: A Guide for Making Black Box Models Explainable. In this interview, we're going to talk about Christoph's background and career, professional interests, his book, and at the end, we'll talk a little bit about his experience being a self-published author.

So, thank you Christoph for being on the Frontmatter Podcast.

Christoph: Hi Len, thanks for the invite.

Len: I always like to start these interviews by asking people for their origin story. So, I was wondering if you could talk a little bit about where you grew up, and how you first became interested in technology generally?

Christoph: I grew up in Munich, that's also where I am right now. I also went to university here, where I studied statistics, Bachelor's and Master's. After that I actually went away from university. Many people suggested I should do a Ph.D. I had some kind of bad experience with my Master's thesis, so I had a lot of pressure.

I actually promised myself, "I am never going to do such kind of writing again." Which is quite funny, because I'm now writing the book. I went away for a couple of years working in Switzerland, then got back to Munich to start my Ph.D. on the topic of interpretable machine learning.

Len: That's really interesting. Just looking you up on LinkedIn and stuff like that, I guessed at what had happened. By coincidence, a very similar thing happened to me. I finished a Master's degree. There was nothing sort of bad that happened in my case. But I swore for various reasons I would never do any more university. And then I went to London and worked for a couple of years, and about halfway through, I was working 12 hours a day, six days a week, and then I found myself on my Sundays researching my old topic. So I knew I had more in me, and I went and did a doctorate. Did something similar happen to you?

Christoph: Yes, it's quite similar actually. So there was no like big, bad incident or something. The thesis went well and everything. It was just the pressure I put on myself, like it has to be perfect, it has to be right. What if I write something that's wrong? Just the stress I put myself under. Nothing from the outside really, it was just me. When I was through with my Master's thesis, I just thought, "No, I don't enjoy this writing, this putting something out there and not being sure if it's all correct." And I just wanted to get some work experience.

And then I worked, and saw how imperfect the world is, how kind of unprofessional everything is. This really helped me to see - I mean, also working with deadlines, where you just have to finish it; it cannot be perfect, you just have to finish it. Things like that switched my mindset a little. And also my second job, I worked part time. So I had one day off, and I promised myself not just to be lazy on the day, but actually work on projects.

It turned out that one of the projects was learning about interpretable machine learning. I started reading papers, and then I thought, maybe there's some good place where I can find a good overview of this topic. But I didn't find anything, so I started writing. It turned into a book at some point, and also into my Ph.D.

Len: I've got a couple of questions about your Ph.D. that I'd like to ask in a couple of minutes. The first thing I'd like to say is - we have something in North America called a Public Service Announcement. To anyone listening who is doing a graduate degree that involves writing a thesis, like a Master's degree that has a thesis part of it or a doctorate: just finish.

In my years, I've seen many, many people fall into what I call the "Ph.D. hole," where - it takes various forms, but one of them is, "I'm writing the greatest work that has ever been written on this subject and everyone in the world is going to read it." I'm making fun of it a little bit, but it's actually quite tragic. A lot of people lose a decade out of their life to simple to describe, but actually internally very complex, psychological traps. The best advice I ever got was "just finish."

I think you put it in a very good context there, where it's like - what people need to understand is that the rest of the world that you'll be in after you finish, is not going to be anything like the world that you're in. It's going to be worse. And that's not a way of detracting from the experience of the years you have researching, doing your graduate degree - it's just an observation about the nature of the rest of your life and other types of work.

With that public service announcement over - so you studied statistics. It's interesting, because I think people of say, my generation, might think that statistics is a kind of obscure thing that maybe social scientists go into. But nowadays it is very much not like that. Was it a kind of like a sexy thing to get into for you?

Christoph: It certainly wasn't as sexy as it is today. I actually didn't start with statistics. I started with electrical engineering, like for half a year. But I already knew there was statistics, and I didn't like electrical engineering that much. What attracted me with statistics was that I didn't have to decide what field I want to work in later, because statistics are like a tool box in the end, that you can apply to medicine, to insurance, to any field really. And also, like mathematics and programming. So that was a good package for me.

Len: And when you were in Zurich, were you working for a medical company?

Christoph: I worked one year for a startup, and then later two years in medical research. It was more like classic statistics, writing a paper together with a rheumatologist.

Len: This is slightly random, but it is a sort of sub-theme of the podcast. What's the startup scene like in Switzerland?

Christoph: That's a good question. I don't have anything to compare it with. I mean there are startups, and this was, well, of course, a banking startup. Because it's Switzerland.

Christoph: I think there's also quite an active community around cryptocurrency and some startups in the field of insurance and banking - the fields where Zurich or Switzerland is traditionally strong.

Len: That's really interesting. I had no idea, and it totally makes sense once you say it out loud.

So you're working on a Ph.D. right now? I have two questions about that, I guess. One is, how are Ph.D. programs generally structured in Germany? Do you do, for example, some classes and some teaching and then move to your dissertation, like in North America? The other contrast is - where I did my doctorate was in England, and I was all but dissertation from day one.

Christoph: There might be a few like graduate programs really. In my case, at the institute where I do my Ph.D., they don't have a program of classes. I mean, we have once a year a meeting where we can present our research and our progress. But there's less of a structured credit program. The funding [?], they fund my Ph.D., and they have attached to that a graduate program. So I have a few workshops. But really it's not like mandatory to finish my Ph.D. It's just parallel to that.

So the big question is, do you do it like a monolith kind of thing? Like one Ph.D. thesis? Or like cumulative? I opted for the second one, to have individual publications.

Len: Is the first one that you were describing the Habilitationsschrift, or is that something else? I learned about this years ago. Is that something totally different?

Christoph: Let me think for a second. That's for [becoming a professor].

Len: My second question is, what's the topic of your thesis or your dissertation?

Christoph: Well, it's on the topic of machine learning - I'm not sure if I should go into detail at this moment, but I am writing like methodological papers, so I don't have a specific data set, or a field I work in. But I develop methods that help to interpret machine learning models. My plan is to write a couple of papers and be finished within my three years. I'm now in my second year.

Len: Good luck, I wish you all the best.

Christoph: Thank you.

Len: It's a big endeavor.

Moving on to the subject of your book, I was wondering if we could talk a little bit first about what machine learning is?

Christoph: Yes, so machine learning is when machines learn from data to make predictions, or to learn patterns. Often it's for making predictions. That's actually a sub-field of machine learning, which is called supervised learning. I guess it's the biggest part of machine learning - or at least one very big part of machine learning. I like to compare it with regular programming: in regular programming, you give explicit instructions to the computer and say, "Okay, if this happens, do this. If that happens, do the other thing."

In machine learning, you turn this into: you still have to program, but you turn this into like a problem where the computer can learn from data to make decisions. So you move from explicit instructions to implicit knowledge that's in data. Machine learning is like a set of methods that can extract knowledge from data.
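The contrast between explicit instructions and knowledge learned from data can be sketched in a few lines of Python. This is only an illustration, not something from the interview or the book: the function names, the fixed threshold of 1000, and the tiny labeled data set are all made up for the example. The first function is a hand-written rule; the second finds a comparable rule by looking at labeled examples.

```python
def rule_based(amount):
    # Regular programming: the programmer writes the threshold by hand.
    return amount > 1000


def learn_threshold(examples):
    """Pick the threshold that misclassifies the fewest labeled examples.

    `examples` is a list of (value, label) pairs, where label is True or False.
    This is a deliberately tiny stand-in for a learning algorithm.
    """
    values = sorted(value for value, _ in examples)
    # Candidate thresholds are the midpoints between neighboring values.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]

    def errors(threshold):
        return sum((value > threshold) != label for value, label in examples)

    return min(candidates, key=errors)


# Hypothetical labeled data: (amount, was it flagged as "large"?)
data = [(200, False), (450, False), (900, False),
        (1500, True), (2200, True), (3100, True)]

threshold = learn_threshold(data)
print(threshold)  # 1200.0: the rule came from the data, not from the programmer
```

In both cases somebody still writes a program, but in the second case the program describes how to learn the rule rather than the rule itself.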

Len: And because it will become important going forward, when we talk about this - I think a lot of people think an algorithm is kind of a magical thing. Can you maybe talk a little bit about what an algorithm actually is?

Christoph: Yes, an algorithm is like a recipe, really. Like cooking. An algorithm tells you like the steps you have to take to get to your result, which is then the final dish. And when you program a computer and you write an algorithm, it's really a step-by-step instruction.

In machine learning, the algorithm is the part that actually learns what I would call a model. It's the instruction on how the computer should learn from data.

Len: I'm curious. I'm sure you have to deal with this all the time, talking with people about it. But there's a difference between, say, machine learning and deep learning and artificial intelligence. As we define our terms, can you maybe explain a little bit about what deep learning is? Or at least what makes it different from machine learning?

Christoph: I would say that deep learning is a sub-field of machine learning. Because it is still about learning from data. But deep learning is focused on like deep neural networks. So anything that works with deep neural networks, really.

Len: And what's the difference between all this and artificial intelligence?

Christoph: Artificial intelligence is a bigger term that completely encapsulates machine learning, but also covers other fields.

Len: So, just to pick a specific example that listeners may have heard of - if Google gives something that it's built millions of images of cats and other images, and then sort of trains the thing that it's built to identify, "Is there a cat in this image or not?" - is that machine learning?

Christoph: Yes, definitely. And it's deep learning. Usually when you see something with images nowadays, it's something with deep neural networks. Because they are particularly good at detecting stuff in images. The other field where these deep neural networks are really good is text.

Len: It's a really interesting topic, and there's just so many ways into it - to try and tease out what's so important about our present life and our future life when it comes to machine learning. But I wanted to ask you a question about something pretty specific, and hopefully a little bit fun.

When researching for this interview, I sort of scrolled through your Twitter feed a little bit and came across some tweets which you might not even remember, about a company called Faception.

Christoph: Yeah.

Len: Where you wrote, and I'm going to quote you back at yourself from Twitter, so maybe that's a bit rude, but it was - I find it really funny. Sarcastically you said that people "complain it's unscientific to infer behavior from facial features", and then that "their chain of logic is rock-solid: DNA influences behavior, DNA influences your face, = We can see from your face whether you play bingo or are a terrorist."

I wanted to talk to you about a couple of things there. One of them which is like a high level question, but, do you think that machine learning is going to be able to give its users conclusions along the lines that Faception is aiming for? So just bracketing the matter of the face itself, but broader patterns of activity. Let's limit this not to activity, but just to physical characteristics.

Christoph: Well you can always - even if you don't know anything about machine learning, you can always ask yourself like, "What do you want to predict?" Or, "What does the company want to predict?" In this case, it was, I don't know? I think like behavioral things like if - actually the bingo player thing was not a joke, it's actually on the website. And the other question is like, "What do you use as inputs?" Which in this case, are images of like profile photos. And then you can ask yourself, "Do you really think you can infer if someone is a bingo player from these photos?"

Len: That's incredible. And I think they're talking about personality analytics and things like that. Can you talk a little bit about what that is?

Christoph: What they are trying to do is predict personality traits from images. And well, I was sarcastic in this tweet. There's also a big history about that, like, there have been a lot of misuses in history, trying to infer like behavioral traits from your face. There are many things that can go wrong if you do this with machine learning. You can have a biased set of photos. They can come from very different sources. So maybe you find some things that could distinguish, or can predict, a behavioral trait. But this is maybe just like the setting of the photo or something -

Len: I've got some questions about that. That's a huge topic that you're broaching there. Just for people listening who might not have heard of it, one of the things I think Christoph might have been referring to in the past was something called phrenology.

Christoph: Yeah exactly, that's the word.

Len: In the 19th century, where it became quite en vogue to try to come to conclusions about a person's personality, based literally on bumps on their skull and other features of their skulls. This idea has surfaced in various ways throughout history. And here we are again, facing a new manifestation of it, where there's a company - and I presume that there's lots of other companies out there trying the same thing - to take pictures of your face, or pictures of your life, and draw conclusions about your personality from that.

This is a slight digression, but I wanted to ask you about this, because presumably after you finish your doctorate, you'll be in the job market - and hopefully I'm not sabotaging anything for you asking you this question. But back in the dot com boom days in the late 90s, I was working in a job that involved reading about mergers and acquisitions and IPOs all day long. And it was incredible. If something just added dot com to the name of its company, all of a sudden it had all this credibility.

This has been happening notoriously with blockchain and stuff like that, but also companies like this one that you were criticizing, seem to be able to add words like "technology" and "machine learning" to their website and boom, they've maybe got some investors or some customers. Is that something that's happening now? And if it is, how can people tell the difference between a legitimate use of machine learning and a fraudulent one?

Christoph: I believe that this is the case - that we see a lot of companies that just write AI or machine learning on the outside, and they get investor money just by doing that. Same as we have seen with blockchain. I've also seen or heard of companies that actually use very simple models, and then they call themselves like, "Oh, we do some AI," but it's just something very simple that they do.

Then the question is, how can you decide whether it's legitimate or maybe just over-hyped? It's very difficult, because you have to look inside what they're actually doing.

But you can also have some common sense, like you have to ask yourself, "Have they access to good data? The thing that they want to predict, can they predict it from this kind of data set at all?" So just something you [can answer if you're not in machine learning]. "Do you think the knowledge is in the data?" Because that limits what the machine learning algorithm can do.

Len: On the subject of data actually, that brings me back to what I said when you broached a big topic a couple of minutes ago. One of the things that you wrote about in that funny Twitter thread, but you write about elsewhere, and is sort of notorious in the machine learning world, is that data collection to some extent can rely on implicit assumptions - that data itself can be biased in the mode of its collection, for example. Or the sort of sources of the data can themselves have biases.

I think I might be inventing this slightly, but there's an example that sort of corresponds to this, that I read about, where a machine learning machine was making mistakes in identifying one animal from another. And it was because when the machine had been trained to view the animal, the animal had always had trees in the background, or something like that. And then all of a sudden, if you started seeing that animal without trees in the background - well the machine, of course it doesn't understand what the animal is, or that it's an animal. When you fed it certain things, it spat out certain results, that over time more and more were ones that its users were looking for in response to the questions they were asking. But the data had this problem in it. We'll touch on that a little bit when we talk about interpretability.

What do people working in this space do to try and clean up their data before they feed it into the machine?

Christoph: That's really a tough thing to do, because usually you have a really big data set. And sometimes you also don't know what you're looking for.

For example, the case you mentioned - like deciding between classes. You have to notice as a human that maybe all of these images have trees in the background. And by the way, it can be fine. Like if all the future images would also have those trees in the background, you can always distinguish a dog by this background, and it will work just fine in the future. But [?] it might go wrong in the future. This is a really difficult problem, to get the right kind of data set.

Len: And is that solved by something called - partly, or at least, is it an attempt to solve that, through something called labeling?

Christoph: The labeling - usually you have the data, and you have the thing you want to predict. Assuming we mean the same thing, labeling is that you attach a label to the data. The label is the thing you want to predict. Because sometimes you don't even have that.

For example, I worked in a startup. This was during my student time. We built tools with machine learning to predict things for documents.

For example, we had one document-type classifier, which would say for a document if it's- So you'd just feed in like a PDF document and the machine learning model tells you in the end if it's a bill, or maybe it's just some reminder of something. And the model just does the classification.

For example, labeling manually. This is also a source, it can already introduce bias. Because sometimes maybe I don't recognize a certain kind of bill, and I always label it incorrectly. Then I already have a bias.

But also at the startup, we started with our own documents. And this might or might not be the same kind of distribution as the documents that will later be fed to the machine learning model, when it's actually part of a product.

Len: Thanks for that explanation - actually I was asking, more or less, "What is labeling?" Because I'd come across the term in your work, but wasn't quite clear on what it was. I had the impression that it was manual - at least at the beginning, this work.

I think one of the questions that people who start getting into machine learning have a little bit is - well, really, how much is being done by the technology, and how much is actually being done by people? It's interesting, because, to put it very crudely - if you need to do a lot of manual work putting it in, and then you need to do a lot of manual judgement with what comes out - what's the point of the in-between?

Christoph: I think in many cases you have to do this labeling in advance. So some data sets already might come with labels, but usually the expensive labels are the ones that come from humans. For example, a doctor, based on some x-ray, might be classifying like what kind of fracture it is or whatever. This could also be labeling - for example for x-ray images. What was the question?

Len: The question was basically, I think that people who might be skeptical about some - let's say companies that represent themselves as having solved the problem with machine learning. How much of the work is actually manual, and how much is really being done by the machine?

Christoph: Yeah, I think the big hope is to do it just once in the beginning and then the machine runs flawlessly. I think that's the big hope. But I have also heard of startups that like - I'm not sure if these stories are true, but for like chat bots. Because chat bots are really difficult to program, and I think they're not really any good, chat bots, like where you chat and the bot answers you. I've heard of startups that still have humans in the loop to actually answer the questions - in the hope that the machine at some point can take over, because it learned from all those dialogues.

Len: We'll get into interpretability here in a little. So there might be some areas in life, where we might be willing to just rely on a machine to give us some guidance. "Show me a funny cat picture." But there are others where - not only might we not want to simply rely on it, or just trust it that completely, but also we might actually be required to show an explanation. For example, if something that you've built gives you some guidance about what to do next in the medical treatment of a person, you actually need to be able to understand why it's giving you that guidance. Is that correct?

Christoph: That's very situation-dependent, I think. If something works very well, like, I don't know, navigation - like a navigation app, then it just tells you, "Okay go down there," and you just trust it because it worked in the past. I don't think you need necessarily always an explanation for that.

But if it's bigger decisions, or especially if it's a wrong decision or if you think it's the wrong decision, then these are cases where you might want an explanation. Or also if you have experts working with some system, especially like in the example you mentioned with doctors. I guess usually you would, at least in the beginning, always have an explanation with the predictions you give.

Len: Speaking of situation dependency, that gives me an opportunity to get right into the guts of your book. Near the beginning you have this really interesting technique where you tell a couple of very short stories, about ways that machine learning might insert itself into our lives in the not so distant future, one of which is about a medical pump that kills a patient by giving him too much morphine. And then there's a discussion between two of the hospital staff about what happened, why the machine did it. And one of them actually speculates, "Well, maybe it was putting him out of his misery?" And then his colleague says, "Look, it's just a bug."

And then the second story is about someone in a subway station who is suddenly denied access to the subway, because she discovers her civic trust score has dropped. This is something people might be familiar with - both from science fiction, or from reading somewhat exaggerated stories about something that's happening in China. But a civic trust score is partly the idea that your government might have a surveillance system in place that assigns a score to you about how good a citizen you are. And basically there are rewards and punishments associated with your score.

But her situation is interesting, because it's sort of Kafkaesque. She doesn't know why her civic trust score dropped. And this is something that I think is at the heart of the project of your book - is that interpreting what machines do when they're using machine learning algorithms or processes, is really important from a lot of perspectives. So there's the doctor who has to justify a decision to the insurance company. There's also the person who doesn't get the job because something read their resume and didn't move them up the ladder.

Can you talk a little bit about how can we interpret output from machine learning systems?

Christoph: These stories are quite dystopian. They are really, really dark. I think - well, we have some techniques. I mean if it were really easy to explain what a machine learning model does, we could just program it ourselves. But also, this means that they have some kind of complexity behind them. So it's not easy to summarize what they do with just a few lines, or with just a few examples.

But we have techniques - for example, to understand what were the most important inputs to the prediction. And how do these inputs change the prediction? For example, when we have a machine learning model that gives credit scores or predicts the likelihood that someone will pay back a loan, and then you have certain inputs, like how much does this person earn and what sector does this person work in? Well, you could explain overall what your system does. Like, what were the most important inputs? Maybe the most important input, it's the salary of the person.

But we also have techniques - for example, to explain individual predictions. So if one person is rejected, then we can try to explain why this person was rejected.

One technique is counterfactual explanation. Where we can say, "Okay, this person got rejected, how would this person have to change so that he or she would get accepted, like a loan application would get accepted?" And then it's a search where we try to change some individual inputs. "What if this one person would not have a temporary contract, but have an unlimited contract? Would this change the prediction?" And then we also have other techniques for similar goals.
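The search Christoph describes can be sketched as a small Python example. Everything in it is hypothetical: `toy_credit_model` is an invented scoring rule standing in for a real model, and the feature names and candidate values are chosen only for illustration. Real counterfactual methods search more carefully, for instance preferring the smallest or most plausible changes.

```python
def toy_credit_model(applicant):
    """Made-up stand-in for a trained model: True means the loan is approved."""
    score = applicant["salary"] / 1000
    score += 20 if applicant["contract"] == "permanent" else 0
    score -= 5 * applicant["credit_cards"]
    return score >= 50


def single_feature_counterfactuals(model, applicant, alternatives):
    """Change one input at a time and report the changes that flip the prediction."""
    original = model(applicant)
    flips = []
    for feature, values in alternatives.items():
        for value in values:
            if value == applicant[feature]:
                continue
            changed = dict(applicant, **{feature: value})
            if model(changed) != original:
                flips.append((feature, value))
                break  # keep only the first listed change that works per feature
    return flips


applicant = {"salary": 45000, "contract": "temporary", "credit_cards": 2}

# Hypothetical alternative values to try for each input.
alternatives = {
    "salary": [50000, 55000, 60000, 65000],
    "contract": ["permanent"],
    "credit_cards": [1, 0],
}

print(toy_credit_model(applicant))  # False: the application is rejected
print(single_feature_counterfactuals(toy_credit_model, applicant, alternatives))
# [('salary', 60000), ('contract', 'permanent')]
```

In this toy setup the explanation for the rejected applicant would read: the loan would have been approved with a salary of 60,000, or with a permanent contract; giving up credit cards alone would not have been enough.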

Len: So for example, if someone were applying for a job, they would've had to give the system they're interacting with some data about themselves, like their first name and their last name, where they went to school, their job experience, things like that. Let's say they get rejected - you could test what happened by say just changing their first name to a typically male name from a typically female name, or something like that, and see if something else comes out.

Christoph: Yes.

Len: That makes a lot of sense to someone outside the space like me. But I guess one question I have is, what if the algorithm itself has changed in the meantime, if it's a self-learning system? I think you talk about model-agnostic interpretation methods? How do those work?

Christoph: These counterfactual explanations are also model-agnostic. Model-agnostic means that these methods don't need to look inside the model. The question is, how can we know anything without looking inside? Well, you can always try to change the inputs and observe changes in the output. It's surprising how much you can infer about the model behavior, just by manipulating inputs and observing what happens. And also these counterfactual explanations, like, "What happens if I?" Or, "How would this person have to change, to change the prediction from rejected to approved?" This is also model-agnostic, because we don't have to understand inner workings, but we can still learn something about the model if the answer's, "Yeah, you have to increase your salary-" Or, "You should quit five of your credit cards to get the loan."
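The idea of changing inputs and observing the output can be sketched as follows. This is a minimal illustration under stated assumptions: `black_box` is an invented scoring function that we pretend we cannot read, and `probe` and `effect_range` are hypothetical helper names. The probing code only ever calls the model; it never inspects how the score is computed.

```python
def black_box(applicant):
    """Hypothetical opaque model returning a score from 0 to 100.

    We pretend we cannot see this code and can only call it.
    """
    score = 20 + applicant["salary"] // 1000 - 5 * applicant["credit_cards"]
    return max(0, min(100, score))


def probe(model, instance, feature, values):
    """Vary one input, hold the others fixed, and record the model's output."""
    return [(value, model(dict(instance, **{feature: value}))) for value in values]


def effect_range(model, instance, feature, values):
    """How far the output moves across the probed values of one feature."""
    outputs = [output for _, output in probe(model, instance, feature, values)]
    return max(outputs) - min(outputs)


applicant = {"first_name": "Anna", "salary": 40000, "credit_cards": 3}

print(probe(black_box, applicant, "salary", [30000, 40000, 50000]))
# [(30000, 35), (40000, 45), (50000, 55)]
print(effect_range(black_box, applicant, "credit_cards", [0, 3, 6]))     # 30
print(effect_range(black_box, applicant, "first_name", ["Anna", "Paul"]))  # 0
```

For this made-up model, swapping the first name leaves the score unchanged, while salary and the number of credit cards both move it. That is the same kind of test Len describes with the job application, applied to a model whose inner workings stay closed.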

Len: This brings up a really interesting topic. I'm going to quote you from your book here. You say, "I predict we will see a future with a lot more machine learning algorithms integrated in every aspect of our life and, coming with that, also regulation and assessments for algorithms, especially in the health, legal and financial industries." It's really interesting you bring up regulation there. When technologies emerge, typically there's a lot of excitement, and it's later on the government comes in to sort of wag its finger and says, "Party's at least partially over, now."

One of the questions I have about the kind of regulation that we're going to see emerging around machine learning algorithms is - in order to interpret a company's machine learning algorithm, I might need access to it. But companies are, of course, notoriously protective of all their intellectual property. Do you think that we're going to end up in a world where, for example, if an insurance company is using algorithms to decide how much to charge its customers, governments are going to step in and say, "Well maybe we'll let you use this technology, but you need to give us a way in the back door to make sure everything that's happening is in alignment with our laws and customs in our society."

Christoph: I think this will be very sector-dependent. So for example, if you develop medical products, you'll have to prove that your products work the way they are intended to work, and that they are safe. And now we have companies starting to use machine learning, at least for part of those products - maybe, for example, to detect things in x-rays, or like in other kinds of imaging. And now they have to prove that they are safe and reliable. I think then we will see interpretability as a requirement - or at least, I'm not sure how else I would show that they are - how they could prove reliability and safety.

Len: I'm sort of jumping ahead a little bit here. But at the end of your book, you have a section called, "A look into the crystal ball," and a line about how in the future, robots and programs will explain themselves.

Christoph: Yeah, that's a big--

Len: You set it up that you're making sort of fun predictions there. But that's a really interesting one. I think, in its own way, it shows the complexity of the problems around trust and ethics. Because if you don't trust it, you're not going to trust its explanation either. Will the programs explaining themselves be enough? Will we want to have another layer of explanation coming from somewhere else?

I'm just going to kind of go to the view from 30,000 feet here. But it's just so interesting. You mentioned that your stories were dystopian, and they are. The short stories at the beginning of the book where you talk about some future applications of machine learning.

But it reminds me of an interesting and very sad story, where a few years ago in the province of Quebec, in Canada - a man and his daughter were out for a motorcycle ride, and they died because they went over a hill and they ran into a car that had been stopped right in the middle of the highway. And this car had been stopped by a woman who had seen some ducks crossing the road. And so she just decided to stop in the middle of the road to let the ducks go past, without thinking about what might happen to people coming from behind her. Presumably she hadn't noticed that she'd just gone over a hill, or something like that.

I think there are like a million people in northwestern China who are in internment camps now. And there's problems of that kind all over the world. There's things we encounter in our day-to-day lives, like the woman in your story who, because she's lost her civic trust score, she's not let through the door. But of course we can think of all kinds of examples in history, and in our current time, when people are denied entry to places based on judgements that people make about them.

And so when I said, "The view from 30,000 feet," at the same time, if a Tesla is in a semi-autonomous driving mode and gets in an accident, everybody freaks out.

What is it about machines that drives people to put them into a different box when they're making decisions, as opposed to when a direct person is making a decision? Let's put it this way - why, and this might be mixing things up - but why are we freaked out about a robot causing an accident, and not a person?

Christoph: Maybe because of its novelty. This is a new thing, and I think anything that's new and shiny, we talk a lot more about it and question things that we stopped questioning in other areas of our lives.

So for example, the Tesla car accident. There's a lot of talk about autonomous cars. But maybe next to where you live, there might be a street where, just because of the setting, like maybe there's a sign missing or whatever - there are a lot of accidents. But it's not in the news as big as like a single accident of a Tesla. But hundreds of thousands are dying every year from car accidents.

So there are certainly, as you mentioned, two ways of viewing kind of similar things. But because one involved a machine, we shine some extra spotlights on it. I think we need also to have some discussions. But sometimes it can like be a bit extreme in the sense that we already have those problems, but we don't talk about them as much.

Len: That's really interesting. I've never thought about things in those terms before. It's a reminder of things we've stopped talking about, or questions we've stopped asking ourselves about. I'm going to think about that. I see a new path for thinking about that. It's so interesting that there is this aspect of repression to things that we just sort of push out of our thinking in our day-to-day lives - like the million people that die every year in car accidents doesn't bother us, but something else does.

And there's something about the inhumanity, perhaps, of a machine doing something, that allows us to actually get past the thing we're repressing about ourselves. Because now it's about something else.

I guess my last question is - as someone who's sort of working in this field, some of the things that we read about every day - like is self-driving coming? Do you think that that's something that's actually going to be a part of our day-to-day lives, generally within the next 10 years?

Christoph: Self-driving cars?

Len: Yeah. I know it's a totally selfish question, I know it's not your field. But you're closer to it than I am and probably everybody listening, so -

Christoph: My prediction's probably as good as any. I think with these new technologies, they kind of end up differently than we think they might. I mean, the change from horses to cars meant also a lot of changes in the infrastructure and how these vehicles in the end are used.

I'm not sure if it will come in 10 years or 20 or - I think in some ways it will come. I mean, like small features - we already have staying in lane, and things like this. But when they come, it will not be like today, just with self-driving cars. Some other things need to change, I think. But I can't say if it's in 10 years or 20 or 100 or 200.

Len: And I gather you're not optimistic about chat bots either? Well, I shouldn't say "either."

Christoph: Well, at least I don't see that they're working at the moment. I think still - these are really big, difficult goals, like autonomous driving and chat bots that can like talk to us like a human. I think there is still like a lot of small things that we can already do with machine learning. And if we do those, we can already achieve a lot.

Len: What are a couple of examples of those things, to end this part of the podcast on a positive note?

Christoph: Like repetitive work, like classifying documents. For example in big companies - or like labeling images. When people [can do some] labeling in images, and then a machine can do it, once it's well-trained of course - just to go through vast amounts of data and extract some knowledge, or automatically label it.

Len: Moving on to the next part of the interview. There's a couple of interesting features about how you went about writing your book. One of them I think is that you first published it open source on GitHub.

Christoph: Yes. I started out not with Leanpub - well actually, it starts a bit earlier. I use R. It's a statistical programming language, but it evolved to be able to do a lot of things. And one is to build websites, where you can automatically insert figures that are computed from data.

There's also one software package, it's called bookdown, that allows you to create a mix of web page and a book. I still publish it in this kind of way, my book. And that's how I started all of it. I still have this website, and also the code is completely open source from my book. It's on GitHub, so anyone who wants can get the code, play with it, and maybe add some things to the book and print their own version if they want to.

Len: And you published the book in progress, I believe?

Christoph: Yes I did.

Len: How did you go about that? How did you decide when you had enough to first publish?

Christoph: I had a few chapters, and I just said, "Okay, now it's time to announce to the world that this exists, this thing." I did so on Twitter, and it kind of took off from there. I had a lot of like likes and retweets, and people started actually reading it. I was quite surprised. I mean, I know it was an interesting topic, but I didn't think so many people would go to this website and actually read what I write.

This was quite good feedback, and also motivation to keep going. And then a bit later came the idea to also - I found Leanpub and decided to also publish there, to make it a bit more book-like and a bit more official, I would say.

Len: It's really interesting your experience, there. That's something that a lot of people who publish in-progress - whether it's on Leanpub or not, I mean of course we think everybody publishing in-progress should be on Leanpub at some point in the life of their project - finishing a book is largely a matter of motivation. And publishing early can really give you that motivation. Because if you see other people reading your book, even if it's like one person sometimes, it's enough. It's enough to keep you going.

On the flip side, another reason to publish a book in progress is - you might publish it, and you might not get any retweets and you might not get any likes. And then instead of spending three years on a project that, in the end, it turns out no one wanted to read about, you might find out a lot earlier on.

Another really interesting thing you did writing your book, was that you used a translation service called DeepL. I wanted to ask you a bit about that, because I'm sure that people who've written Leanpub books before have used translation services here and there, but you're the first person I've come across who talked about it.

This really interesting moment we're entering into, when it comes to automated translation - it's come along a lot more quickly than I personally thought it would. One time, after a talk in town, I was having dinner with the speaker who was actually German - here where I live in Canada - and we went to a restaurant, and he took a picture of the menu, and it was translated into German. And he's like, "I can tell you this is almost perfect."

So, you were writing a book with something a little bit more complicated than a menu, or at least, maybe not a hipster menu. But how did that work? Did you actually take every paragraph that you wrote and put it through DeepL?

Christoph: So DeepL, I started using a bit later in the process. Actually when I started, I didn't know about DeepL. And maybe, well it's a nice bridge back to what we talked about before. DeepL, I think the "D" comes from "deep learning." But at least, I'm sure they are using deep learning.

This big leap you mentioned, it comes from applying neural networks to machine translation. So this big leap is an example of a successful story of machine learning.

I'm not a native English speaker, as you might have heard by now. And also when I write, I don't write perfect English.

So I started trying out this DeepL, kind of misusing it actually. Because I didn't write in German, I already wrote in English. So I started translating like section by section or paragraph by paragraph into German and back into English - just like copy-paste twice. And then I get like an English version of my text that went twice through the translation. It works much better than you would expect. Of course it's not perfect, so you couldn't like take it and just copy/paste it back into my book. So I compared line by line, and if I like the machine translation more than my original text - I then used the machine translation if it was better.

Len: It's really interesting. So you would write something in English, which is not your first language. And then you would use DeepL to translate it into German, which is your first language. And then you would use DeepL to translate it back into English. And then using your understanding of German and English, along with this machine - you would then edit the output of that process.

Christoph: Funnily enough, I didn't even look at the German translation usually. Just a few times, because I was curious when something went wrong. But actually it could've been Spanish for my purpose as well. I don't really speak Spanish. But the act of translating it twice, it kind of smooths out a lot of errors, even typos, and like adds the commas at the right places, and stuff like this.
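
As an illustrative aside, the round-trip workflow Christoph describes (English to a pivot language and back to English, then comparing the result against the original by hand) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not his actual tooling: he describes copy-pasting into DeepL manually. The `translate` callable, its `(text, source, target)` signature, and the toy `fake_translate` stand-in are all hypothetical and do not represent DeepL's API.

```python
from typing import Callable, List, Tuple

# Hypothetical signature for any translation backend:
# translate(text, source_lang, target_lang) -> translated text
Translator = Callable[[str, str, str], str]


def round_trip(paragraph: str, translate: Translator, pivot: str = "DE") -> str:
    """Translate an English paragraph into the pivot language and back."""
    pivoted = translate(paragraph, "EN", pivot)
    return translate(pivoted, pivot, "EN")


def review_pairs(
    paragraphs: List[str], translate: Translator, pivot: str = "DE"
) -> List[Tuple[str, str]]:
    """Collect (original, candidate) pairs where the round trip changed the text.

    The pairs are meant for manual review: a human still decides,
    paragraph by paragraph, which version to keep.
    """
    pairs = []
    for paragraph in paragraphs:
        candidate = round_trip(paragraph, translate, pivot)
        if candidate != paragraph:
            pairs.append((paragraph, candidate))
    return pairs


def fake_translate(text: str, source: str, target: str) -> str:
    """Toy stand-in so the sketch runs offline (assumption, not a real service).

    It pretends that translating back into English fixes one known typo.
    """
    if target == "EN":
        return text.replace("teh", "the")
    return text


if __name__ == "__main__":
    draft = ["This is teh first paragraph.", "This one is fine."]
    for original, candidate in review_pairs(draft, fake_translate):
        print("ORIGINAL: ", original)
        print("CANDIDATE:", candidate)
```

Passing the translator in as a parameter reflects Christoph's point that the pivot language hardly matters for this purpose: the same loop would work with `pivot="ES"` or with a different backend, and the human comparison step stays the same.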

Len: That's just so fascinating, being on the publishing side of things, to think about where something like that might be in 10 years. In particular, I would say - I suspect that for certain types of, say, philosophy or cultural criticism or something like that, actually successfully translating using processes like that might be something we can never do. But there are all kinds of things, particularly, for example, let's say programming books, where - one thing we know from Leanpub and from in-progress publishing is that good enough is good enough if someone needs something.

And if we could be in a world where you could write in your native language and then have your work available to people who can't read that language automatically - it would be just an amazing advance for spreading knowledge.

So, thank you for explaining that process, and for doing it. Because it's just such an exciting moment.

The last question I always like to ask on this podcast, is - if there were one thing on Leanpub that we could build for you, or one problem we could fix - what would you ask us to do?

Christoph: I know that you ask this question, and I thought like, what was my pain point with Leanpub? It took me kind of a long time, because I used, in the beginning, this bookdown package to publish it on a website. And then it's already done in Markdown, but I still had to convert it to like your style of Markdown. It took longer than I expected.

One thing that would have helped me, would be some better error messages. Because it turned out that a lot of things were because of some things I couldn't write in formulas, like mathematical equations. But I didn't get the right error message, so I did a lot of back and forth, like trial and error, to understand what I did, or what doesn't work, or what does work. So that's a little technical thing, but this -

Len: No, thank you very much for bringing that up. That's something that I wish we were better at as well. What Christoph's talking about, for those listening, is that if you're writing a Leanpub book using, let's say - if you're writing it in plain text, you might do something that doesn't look right after you create the PDF, EPUB, or MOBI, or read online version of your book - or you might just actually, in some cases, cause our book generation process to fail. And when that happens, it would be much better for authors, and it would be much better for Leanpub support, if we had more robust error messages. I mean, the reason it blows up is because there's a problem. But even if we can't say what the problem is, if we can narrow down where in the text it might be, in various ways, that would really help with that.

Christoph: Yes.

Len: It would help us automate support, and it would help authors fix problems on their own as well. And it helps us find bugs and things like that. So thank you for your vote for improved error messages. It's something that we definitely think about from time to time.

Well, thank you very much Christoph, for taking the time to do this interview. I really appreciate it. And thank you for being a Leanpub author.

Christoph: Thank you for letting me be a Leanpub author.

Len: Thanks.

Podcast info & credits
  • Published on January 30th, 2019
  • Interview by Len Epp on January 22nd, 2019
  • Transcribed by Alys McDonough