Daniel Godoy, Author of Deep Learning with PyTorch Step-by-Step: A Beginner's Guide
A Leanpub Frontmatter Podcast Interview with Daniel Godoy, Author of Deep Learning with PyTorch Step-by-Step: A Beginner's Guide
Daniel Godoy is the author of the Leanpub book Deep Learning with PyTorch Step-by-Step: A Beginner's Guide. In this interview, Leanpub co-founder Len Epp talks with Daniel about his background and his varied career, economics, machine learning and self-driving cars, his book, and at the end, they talk a little bit about his experience as a self-published author.
This interview was recorded on February 17, 2022.
The full audio for the interview is here: https://s3.amazonaws.com/leanpub_podcasts/FM199-Daniel-Godoy-2022-02-17.mp3. You can subscribe to the Frontmatter podcast in iTunes here https://itunes.apple.com/ca/podcast/leanpub-podcast/id517117137 or add the podcast URL directly here: https://itunes.apple.com/ca/podcast/leanpub-podcast/id517117137.
This interview has been edited for conciseness and clarity.
Transcript
Len: Hi, I'm Len Epp from Leanpub, and in this episode of the Frontmatter podcast I'll be interviewing Daniel Godoy.
Based in Berlin, Daniel is a data scientist, developer, teacher and writer, who has been working for over twenty years in a range of industries and sectors, including banking, government, and retail, amongst others. In recent years, he has also been teaching machine learning and distributed computing technologies at Data Science Retreat, the longest-running Berlin-based bootcamp on the subject.
You can read Daniel's popular posts at dvgodoy.medium.com and Towards Data Science.
You can follow him on Twitter @dvgodoy and check out his profile on LinkedIn.
Daniel is the author of the Leanpub book Deep Learning with PyTorch Step-by-Step: A Beginner's Guide.
In the book, Daniel uses a conversational and first-principles approach to help beginners interested in learning about Deep Learning and PyTorch, a tool used to make it easier to build models in the Python programming language.
In this interview, we’re going to talk about Daniel's background and career, professional interests, his book, and at the end we'll talk about his experience as a self-published book author.
So, thank you Daniel for being on the Leanpub Frontmatter Podcast.
Daniel: Thank you for having me here, and thanks for the nice introduction.
Len: Thanks. I always like to start these interviews by asking people for their origin story. So, I was wondering if you could talk a little bit about where you grew up, and how you made your way into your career?
Daniel: I was born in Rio, so I guess everyone knows Rio. But when I was very, very young - four years old or so, my parents moved to Porto Alegre, a city in the south of Brazil. I lived there until 2015, so most of my life.
When I was very young, I started coding, programming, already. So this is something that has been with me since I was eight years old, and even younger than that.
In 1982, my father bought a computer. I started copying code from the BASIC language to the computer, and running the stuff. That's something that has been with me all my life.
Then I did a degree in Computer Science, still in Porto Alegre. After that I worked for a little bit over ten years as a software developer in a small company - also in Porto Alegre - where I used to develop banking systems for financial institutions - credit, lending. All these kinds of things. I always had a taste for the financial sector. It is something that always got my interest. That was a very nice fit.
But then at some point, I was not so happy with being in the same place for ten years. I kind of got tired of that. Then I tried something completely different.
I got a government job in Brazil. That is very popular there. Because you get a nice, nice pay slip. It's very chill most of the time.
I started working for the Secretary of Finance for the Treasury Park in the South of Brazil. It was a completely different world, right? I was not coding or programming so much anymore. I was still doing some of that, to help improve the processing and stuff in the job. But most of it was about making financial projections for revenues and debt ratios for the State. These kinds of things.
A funny thing is that at some point, in 2012 - sometimes it's like with the butterfly effect thing - I really like to think that some of the stuff that happens to you in life is so - a moment that - someone assuming something that's so simple, but ended up having a major impact on everything.
I remember I was having a conversation with a colleague of mine, and we're like, "Gee, we are analyzing this - now if we -" For people that are not familiar with it, it's kind of boring - debt to revenue ratios and stuff.
And it was like, "Why is our State lagging so much behind the others? Why is the city where I used to live in so bad in terms of that?" Then we started making hypotheses and trying to figure out what happened in simulations with that. It really caught my interest, because of, "Okay, well, maybe I can try to simulate what happened, right? And make a projection, a forecast based on that."
We got something going. Then a colleague of mine said, "Maybe you should write a paper on that?"
And then I wrote a paper on that, using stochastic simulation to make these forecasts. I submitted this paper to an award, the National Treasury Award in Brazil. And then won that. I never thought I would have won that.
I was like, "Okay, yeah, I wrote something that was cool. I submitted it. Maybe someone can acknowledge that?"
There was all this fuss about it. One of the major newspapers in Brazil got in touch. They made an article based on the results and all stuff. That was a life-changing moment. Because at that moment -
I like analyzing data. I can pinpoint that was the starting point of my career as a data scientist. That was like, "Okay, I want to do that."
I started investing a little bit more. I wrote more things for this - every night, in the public sector in Brazil, analyzing other things - like efficiency, and other topics. But then, after five or six years, I was like, "Okay, there is nothing much left for me to do here." I wanted to do something different again. I wanted to invest more into a data science career.
My wife always wanted to move abroad. She wanted to be a diplomat. She wanted to work at the embassy and stuff. But she was doing something else.
And then we were like, "Okay, maybe we can try to go to Europe to see what we can do there?"
That's when I found Data Science Retreat, this boot camp in Berlin that you also mentioned in the introduction. I applied to it. They selected me for the boot camp. And then at the end of 2015, we moved there. I took unpaid leave from the job, because I did not know if this was going to work or not, right? It's a major, major change.
But then three months passed. I graduated from the program. I managed to get a job in Berlin, where there's more fintech.
And then things started happening. After a year or so, they invited me to teach there. That was something that was also kind of out of the blue. I mean, I always liked explaining concepts to - trying to figure out how things work. That is something that I was always interested in doing.
But then I was like, "Okay, I'm teaching a bunch of students that are also like me - trying to get their career started in data science." That was very nice. Because then I found out that - okay, not only do I like data science - but I also like teaching.
I've changed jobs a couple of times since then. But I've been teaching at Data Science Retreat from 2016 until the beginning of 2020. Then with the pandemic, everything changed, right? Because it was mainly an onsite program. It would make sense to go there and interact with your colleagues and develop projects. And all of a sudden, yeah - freeze everything, everyone is on lockdown.
As in everything that matters, good and bad - there are bad things tied to everything, right? In that case, the lockdown gave me the chance to start writing the book.
In March, 2020 - I had the idea, but we never started doing it - "Yeah, tomorrow I'm going to do that. Yeah, tomorrow I'm going to do that." And then, "Okay, now I'm stuck at home. I don't have anything else to do. Maybe now it's time to start writing this."
That's when I started the book. I didn't keep track of everything. But maybe I spent like 1,000 hours or so, writing that? It was huge.
Len: Thanks very much for sharing that great story. There's so many parts to it. We'll get to talking specifically about your book and the process of writing it, and what got the ball rolling there as well. And thanks for talking a little bit too, about how the pandemic gave you the time or the opportunity to do it.
It's funny, in the publishing industry, that was true generally. A lot of conventional publishers were like, "Enough. We've had enough manuscripts submitted to us." And they always get more than they want anyway.
But actually on this podcast, and on Leanpub - we've had quite a few guests and quite a few authors who had a similar experience. I mean, including I think one - one story was a guy's friend was going to be in quarantine for two weeks. He's like, "Let me help you edit your book?" And the guy was like, "Well, then I'd better get started writing it." I've had another guest who I think went to visit his parents, and then got stuck because of a sudden lockdown in - I think, Northern Italy? And he's like, "Well, time to write my book." We've had a few guests who've had a similar experience like that.
But just going back to the beginning of your story - your dad got a computer, and you were figuring out how to code on that and stuff - and then you did a Computer Science degree in the mid-90s. One question that comes up often on this podcast, in some form or another, is, if you were starting out now, you were 18 years old, say - and you were intending to have a career that was technology-related, computers-related, would you get a formal university degree in Computer Science? Or would you choose another path, given all the changes and all the new tools and resources that are available now?
Daniel: I think a formal degree is still important, to some extent, even though you can get lots of things from the internet for free.
I would say that the biggest challenge is to organize yourself and stick to a schedule, right? Because I think one of the things that you have - if you're enrolled in a course, and you are in an institution, and even if you're paying - or in Brazil, for instance - I was attending a public university, so it was free.
But still, you have a commitment. And then you have to go there. I think that that helps. Organizing. Someone has put a lot of thought into organizing the content, and seeing how it fits better.
Of course it's always going to be at a slower pace than what you'll get yourself, if you're doing stuff online, right? But I would stick with the formal education still, yes.
Len: Speaking of formal education - you got your Computer Science degree, and you worked, I believe, as a programmer in banking for a while. But then you made the shift to the Department of the Treasury, for the government. I can see from your LinkedIn profile that you got a Bachelor's degree in Economics later on in life, and a Master's and an MBA as well.
I'm curious, did your Computer Science education help you when you were suddenly being given problems in finance? Like, I don't know? Interest coverage ratios, or as you were saying, interest to debt ratios, or debt to income ratios, and stuff like that. Did having a Computer Science background help you face those challenges?
Daniel: Once you know how to code, you can solve pretty much anything, right?
That's funny. Because especially in the government, there were lots of things that were done in a suboptimal way, because everyone knows how to use Excel, but not everyone knows how to use Excel properly, right? A couple of VBA macros. And some people say, "Yeah, this is not real coding." I mean - okay, I'm not into that discussion. But even some coding, and something very simple as a VBA macro, really helps, and makes processes much more efficient. So, yeah - definitely. Do it. I think that knowing how to code and knowing properly -
Also, algorithms, right? Because one thing that I remember learning in college, was - okay, graph theory. It was so abstract when I got this in 1995. What am I going to use that for?
But then, at some point later in life - okay, maybe I can use these? There are things in real life that you can use stuff like that for.
That's probably one of the reasons why I also stand for the formal education. Because maybe if you're doing a curriculum for yourself - I mean, nowadays not so much for graph theory. But there is always some topic that you may neglect, because you don't think that's important - because you don't know that may be important, right?
But yeah, going back to the original question - definitely. I think that having great computers and actually knowing how to code, really helps with everything.
Len: Yeah, I know. It's funny, I'm not going to talk about myself too much, but I had a kind of funny backwards experience from yours, where I went into investment banking and I had to do financial modelling. It was a lot of time in Excel. And I didn't realize that I was coding. Like, I had no idea that I was basically doing like functional programming or something like that, right?
I mean, you'd have way more stories than me. But I remember - I had no idea what a VBA macro was, until some guys from one of the big five accounting firms gave us a spreadsheet that had hidden sheets in it, that were locked. I was so angry that they would do that, right? Because it's like - and I don't know - for anyone who's worked with Excel a lot, there's conventions when you're passing files around, right? One is - either you give someone the output, or you give them the whole thing. But to give them the output with the calculations hidden in the same spreadsheet, is just offensive.
And so I found a VBA macro that I could use to crack the password protection in Excel, to get those hidden sheets. And it didn't really make - I mean it actually did help a lot, understanding what was really going on, but it wasn't like finding out any real secrets. It was just rude on my counterparty's part to hide that.
But, yes - and there's a lot of people who work in finance, will know how to use Excel. But if you actually really push and learn - to get to another level, then - it's not just like now you're faster or something, there's all kinds of things you can do. And it can actually be just inherently like enjoyable. It sounds ridiculous, but it's true. Some people just love spreadsheets and what you can do with them.
And so, I wanted to ask you a little bit - a lot of our guests - I think there's something about self-publishing, and being independent, and a lot of our guests are people who've moved around a fair amount. Not that that's so unusual nowadays, or even in many times in the past. But what was it like for you moving from Brazil to Berlin - to Germany?
Daniel: That was something, man. First, because we had to pack everything that we could, and move to a country, 40 days - getting all your affairs in order to move to a different place, and all the legal stuff - that was insane. And fitting everything that you - not everything, actually - but everything that you want to keep into five or six larger bags, that's very challenging.
I admit, I kind of cheated on that. I still have like a lot of stuff in a storage unit in Brazil. Because like, "Okay, I cannot handle that." But they are still there. And I have to return, I will at some point.
But apart from these practicalities, right - there are a lot of different things, when you compare Brazil and Germany. Those are two different cultures, right?
I was a bit surprised. Because I mean, in Brazil - or I mean, at least I had some ideas, that I expected it to be more similar, to be honest.
I was a bit surprised at a lot of - many, many different small things sometimes are different. You're like, "Okay, whoa, whoa - how does that work?" But of course, there are the good and the bad sides.
Specifically, for the good side, one of the major problems in Brazil that everyone knows, is violence, right? Criminality is off the charts. This is something that you grow up in, and then you will be like - you learn to be aware of your surroundings and to pay attention - if you're getting into the car, or getting out of the car.
This is something that like, three months in, in Germany - we're like, "Oh, we feel lighter now. What's happening?" Because we don't have to look over our shoulder. You can get on your phone and not worry about someone trying to snatch your phone and running away.
That was very liberating. That was really, a really good sensation. That, and most people take for granted, right? When I was discussing this with the people in Germany - and not only Germans, but the Spaniards, Portuguese people, Ukrainians - everyone else. Then you'll say, "Yeah, this is so cool, you don't have to worry about this." You're like, "Why? What do you mean?" And then I tell them, "Our reality is, in the South American country -" And they're like, "Oh, really?" That was one of the other key differences.
And then of course, you have all of these small differences, right? One thing that I miss the most, is the food. Because Brazilian food is - especially where I come from - you go to these all you can eat buffet-style. You may stuff your face with food. And barbeques all over the place. And this is not a - you can't find that in Germany, or in Europe, for that matter. But life is about tradeoffs.
In that case, I got a different career. I don't know if I would be a published author, a self-published author - if I hadn't moved to Europe. Because the kind of connections - the kind of people that you get to know in Europe, is so different. Because in Brazil, we don't have foreigners, right? You don't have these exchanges, these opportunities. It is very local.
And there, I'm getting in touch with many people from all over the world - especially teaching at Data Science Retreat. Which, I think, was one of the best things that I've done, taking this job as a teacher. I've got like 150 or so students from all over the world, with different experiences. And that is so nice. Because you have this network. You exchange ideas. You'll make referrals for other jobs, right? It's also nicer to find a new job if you have this network, so -
Len: That's really fascinating. I've got a question about your move into data science specifically, and then how that happened. But I just, I can't help it - there's some joke in there about the difference between a currywurst stand and a Brazilian steakhouse.
But it's funny what you say about moving somewhere, and you realize that there's something that's not there, that is where you're from - but you never really thought about it, until it was not there anymore.
I remember when I moved from Saskatchewan - it's a province in Canada, to London. Not that outside bars, there was a lot of violence. But if you were a guy and you were at the bar, the possibility of a bar fight was just always there. It's just a punchy place, right?
And so I remember some people asking me, like, "Oh, don't you feel kind of intimidated by being in the big city?" And I'm like, "No, I've never felt safer, at least in bars, than I did in London." Because it's just - at least in the parts of London that I normally went to - not all, though - punch ups just weren't really a thing.
Anyway, on that note - so, eventually you found yourself in Germany, and you were doing data science. Had you been doing data science before you started teaching, I imagine?
Daniel: That was about this paper that I wrote with the simulation. It was not data science as we know it today, it was on the way, or towards data science - like the Medium publication. Because I was doing stochastic simulation for these Treasury indicators and stuff. But I was using MATLAB back then. I was not doing data science using Python - as everyone does today, as I'm doing today. But it was very useful in the sense -
I was doing vectorization, like making everything vectorized operations. It's something that I struggled a bit to learn while using MATLAB. But once you learn it, that's awesome, right? And now we can use NumPy and everything else, and it works the same. Really struggling to understand this while using MATLAB to perform these stochastic simulations for that paper really helped me with the foundations for the stuff that would come after, right?
Len: And just for anyone listening, let's say - can you maybe tell us what a stochastic simulation is?
Daniel: Basically what happens is that - the forecast that we had, that was being done before - it would assume that there was a growth rate of 3% a year, something like that. It could be two on average. But the thing is, of course - one year, the revenue would grow 5%. The other year, it would go down 1%. You never know.
Basically what I did there, was, instead of assuming this linear growth, I would do something randomized between like minus 3% and plus 2% - it depends on the parameters of the simulation. But you just draw these values from a normal distribution - you draw samples for one scenario, and compute how much it would lead to.
And then again, and then again, and then again. It would like do this 10,000 times. Basically as if there were 10,000 parallel universes, where different things happen, right?
Then you know, okay - if I would go to all this, all those 10,000 universes - and then I would average them out, what would be my result? And not only the average, but you can also get - okay - if I want to get to 80% of these universes, how much does it -? What's the range that I can find this -? Whatever indicator that I was trying to simulate.
Basically it's just randomly sampling values that you can use for something. For, in this case, forecasting indicators. And doing this a lot of times, to have an idea of how much it can vary. Does that give you enough?
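Editor's note: The procedure Daniel describes can be sketched in a few lines of Python. This is only an illustration and not the code from his paper (he mentions using MATLAB at the time). The function name, the starting revenue of 100, the 2% mean growth, the 3% standard deviation and the ten-year horizon are all assumptions chosen for the example.

```python
import numpy as np


def simulate_revenue(initial_revenue, mean_growth, std_growth,
                     n_years, n_scenarios, seed=0):
    """Return an array of simulated revenue paths, one row per scenario."""
    rng = np.random.default_rng(seed)
    # Draw one growth rate per scenario and per year from a normal
    # distribution, instead of assuming the same fixed rate every year.
    growth = rng.normal(mean_growth, std_growth, size=(n_scenarios, n_years))
    # Compound the yearly growth rates along each scenario (vectorized).
    return initial_revenue * np.cumprod(1.0 + growth, axis=1)


# Hypothetical inputs: revenue index of 100, growth ~ Normal(2%, 3%), 10 years.
paths = simulate_revenue(100.0, 0.02, 0.03, n_years=10, n_scenarios=10_000)

final = paths[:, -1]                        # revenue in the last year
mean_final = final.mean()                   # average over all scenarios
low, high = np.percentile(final, [10, 90])  # range covering the middle 80%

print(f"average: {mean_final:.1f}, 80% range: {low:.1f} to {high:.1f}")
```

Here `mean_final` is the average across the 10,000 simulated "universes", and `low` and `high` bound the middle 80% of them, which is the kind of range Daniel refers to.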
Len: Yes, that's very good, including the idea that you iterate and you do lots and lots of them, right? It's not just one simulation, as it were.
I'm going to throw you a bit of a curveball question here. When words like "data" and "simulation" come up, and in particular, data science - and when computers are involved, people often just assume that what they're shown is true, especially if charts and numbers are involved.
This has come up a few times in discussions on this podcast. Where it's like - if you're on the analysis side, you present to people who are not familiar with the way the analysis works. And they often feel like you've presented them with the truth. "This is the way it's going to go in the next year, or the next five years," or something like that.
I guess my curveball question for you then, is - is economics a science?
Daniel: Yeah, okay - good, good. Oh, that's a tricky question. Because I mean - can I not answer that question, or are you kidding?
Len: You can opt out, that's for sure.
Daniel: No, I know. But I mean, that's tricky, right? I think that - I mean, it is a science in a sense that you can - why does it need to be a science, right? You have organized knowledge, and then you have hypotheses or assumptions, you try to make predictions and stuff, right?
What I don't agree with are some of the assumptions that are made in economics. I mean, of course - you have the different schools of thought, and everyone has different assumptions. And some of those are simply not realistic, right?
For instance, one is ergodicity. Basically - to have the expected value.
Let's say that you're making a bet, right? I propose a bet to you. Okay, here's the thing. Either you will get 100 million dollars, or, if you lose, you will have to give me one million dollars.
I'm assuming that you don't have a million dollars for this scenario, right? Depending on the probability of winning or losing this, you may say the expected value of this bet is positive. You should take the bet. Let's say, for the sake of argument, that 90% of the time you would lose - and 10% of the time you get this 100 million.
The thing is, if you were able to live 100 different lives and make this bet in multiple universes, if you would be able to bet all these, the average over all these universes, would have a positive value. It would be profiting.
But you are only one. You cannot connect or talk to the other yous in the multiple universes. If you lose and you don't have a million dollars, now you're in a bad shape - right? To not say a curse word here.
And then - but this is the assumption that you have ergodicity - you can just do-over. There is no do-over in life. How can you make this an assumption of a model, right?
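Editor's note: The arithmetic behind Daniel's bet can be checked with a short Python sketch. The probabilities and amounts are the ones he gives. The random seed and the 100,000 simulated "lives" (rather than the 100 he mentions, so that the averages are stable) are assumptions made for the illustration.

```python
import random

WIN_PROB = 0.10            # 10% of the time you win
WIN_AMOUNT = 100_000_000   # you receive 100 million dollars
LOSS_AMOUNT = 1_000_000    # otherwise you owe one million dollars

# Expected value of the bet: 0.10 * 100M - 0.90 * 1M = 9.1M, so it is positive.
expected_value = WIN_PROB * WIN_AMOUNT - (1 - WIN_PROB) * LOSS_AMOUNT


def play_once(rng):
    """Outcome of taking the bet a single time."""
    return WIN_AMOUNT if rng.random() < WIN_PROB else -LOSS_AMOUNT


rng = random.Random(42)    # hypothetical seed, for reproducibility only
outcomes = [play_once(rng) for _ in range(100_000)]

# Average over many parallel "lives": close to the expected value.
ensemble_average = sum(outcomes) / len(outcomes)
# Share of individual lives that end up owing a million dollars.
share_of_lives_ruined = sum(o < 0 for o in outcomes) / len(outcomes)

print(f"expected value: {expected_value:,.0f}")
print(f"average over all lives: {ensemble_average:,.0f}")
print(f"share of lives that lose: {share_of_lives_ruined:.1%}")
```

The average over all simulated lives is positive, yet roughly nine out of ten individual lives end with a debt they cannot pay, which is the gap between the ensemble view and the single-life view that Daniel is pointing at.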
For me this is - the problem that I see with this - and then I know that if people would like challenge this, and people will not agree with that, what I'm saying here. But that's my opinion on the topic, right? For sure, there are many different opinions on that.
Len: Thank you very much for sharing that very specific opinion. It's one I - I mean, I'm not a trained economist - but it's one I share very, very much.
Particularly I think it's - when people are in areas, like let's say, psychology or economics, and they're maybe professors and they write papers and they teach at universities and things like that - they can often trick themselves into a kind of intellectual chauvinism, where they think it's really straightforward what reason is, for example. What the rational thing is to do, or what is in a person's interests to do, right?
But what counts as rational is actually outside the realm of economics. That's philosophy and ethics, even. What counts as being in your interests, is a question for philosophers and ethicists, and not for economics or for psychology.
One of my favorite examples of the kind of thing you're talking about - people set up these little scenarios. And this is what I call, I used the word "chauvinism": they proceed as though it's really obvious what the right answer is. And it's like - I'm sure you've heard the one where - I always forget the name of it, because I just don't have any respect for it.
But the scenario where it's like - there's you and this other person, right? And you can either choose to have that person get $99, and you get $1. Or neither of you gets anything.
Well, the supposedly rational thing is to take the $1, and let the other person get the $99, because one is more than zero.
But that only follows if there's an unstated fact of this scenario, which is counterintuitive, which is that value is absolute, rather than relative. Which is not how money works, right?
A scenario in which me and this other person are, with respect to everything else, equal - and now he has relatively more of something than I do - is actually worse for me, it's not better. If you see what I mean, right? You have to add the extra thing, like, "Oh and by the way - in this crazy universe, money has absolute rather than relative value." What does that even mean? It's not clear.
But you can't get through to people when they're behind this kind of wall. And that's one of the reasons I asked the leading question, about whether economics is a science. Because often people use the status of economics as a science, to kind of get away with not really thinking through some of the deeper philosophical questions that they just take for granted, like, "What's rational," or, "What's in a person's interest?"
Daniel: This is called the ultimatum game.
They say, "You would rather be with more money than less money." But I think it's not only about the relative versus absolute, but a measure of fairness, your judgement on what's fair, what's not. Do you think it's fair that the other guy has $99, and you have only $1?
They have experiments with that, and they say, "Okay, as long as--" I don't remember now. I read about this. But it's been a long time. I think if you have one third and the other person has two thirds, that's acceptable for most people. But if it gets more skewed than that, then it's a no-go.
Of course it changes, right? But the idea is that the outcome is not fair, and therefore it doesn't matter if you get more money with that. Because if you think that it's unfair, then of course you'll say, "Okay, whatever - I don't mind losing the money, as long as I can let the other person know that was not fair."
What I don't know, is that - of course - these experiments, they're always made with like tiny amounts of money, right? You get like one buck or ten bucks, or something like that.
I wonder what would happen if someone did an experiment with, like, life-changing amounts of money. Let's say that I give you a hundred thousand dollars, and the other guy has five million. Would you take it or not? I think that changes a lot, right? Because even if you think, "Okay, that's unfair as hell - but I mean, that's a hundred thousand dollars. I mean, anyone can use $100,000, right?" I don't know, probably we'll never know the answer to that, unless one of the billionaires decides to investigate psychological aspects of these things.
Len: That's so interesting, both of your examples. It sounds like you have a kind of existentialist approach to economics. Which is really fascinating. But yes, thank you very much for going down this path. I was excited to get the chance to ask you that question, but I wasn't sure you'd be up for it.
And so, you started teaching, and you also started writing blog posts and things like that. I was wondering if you could talk a little bit about how that got started for you?
Daniel: I'm a bit funny, and I was like at some point - I think it was 2017 or so? I thought, "Okay, what if I try to teach a course myself?" The initial idea was to teach in Portuguese. I thought, "Okay, this is a market that's not covered by anyone. They just have it in English. You have lots of competitors in English, but in Portuguese, there is almost nothing." I thought, "Yeah, maybe I can do that." And I organized the course. And then I was talking to a friend, and he was like, "Ok cool, I know you, that's fine - but who the hell are you?"
He said, "No one knows you. Why would anyone give you any money to teach something? Because what are your qualifications, right?" Of course, I was already working as a data scientist, but still. "Okay, I have to get known. How do I do that? Why don't I start writing something?"
That's how the first blog post came to be. Because I'm like, "Okay, I'm going to write something and publish it, there is only upside to that." I got to this topic that I really liked, the activation functions in deep learning. Because they do all these really nice twists and turns in the feature space - that's very technical now. But they really produce some nice animations, where you can visualize what's happening.
I read a blog post from another guy, that was very nice - but way over my head. Extremely technical, lots of equations and stuff, and I wanted to do something simpler. I could understand it better, and could like show people that.
The guy from Towards Data Science was really happy with the post, and he featured it in the publication on the front page. I got like 2,000 views on the first day, which was super exciting. Refreshing like crazy. "Oh, 1,800, 1,900, 2,000 - yay." It was a very nice feeling, having 2,000 people read what I wrote.
And then I wrote some more, until, at some point I wrote one about PyTorch. That was the one that eventually led to the book. That started in a way that would not anticipate going in that direction, right?
Len: And that post, I think got something like 280,000 views - or something like that?
Daniel: The first one, the most popular one is about binary cross-entropy. It has almost half a million now.
Len: Wow.
Daniel: The other day I realized - I Googled myself in Google Scholar, and I got 35 citations on that blog post. Which is like - I never thought of that, but there are a lot of citations on this blog post. That was really great.
Len: That must be an amazing experience - but also really good motivation to write a book on that, when you know that you've got an audience out there already.
On that note, the book is Deep Learning with PyTorch Step-by-Step: A Beginner's Guide. It's a very big book, as you mentioned - and all the time you spent working on it. I was just wondering if you could talk a little bit about what deep learning is, and what PyTorch is.
Daniel: Basically, you have machine learning - right? Which is where you have traditional [?] - and then you have deep learning, which is based on neural networks.
Neural networks, well, they were not so popular until 2014 or so. But then, when I was a student at Data Science Retreat in 2015, there were already some developments. Some of my colleagues, they used those neural networks in their projects. But it was very primitive, in a way that you did not have access to GPUs.
There's a combination of different things that enabled deep learning to be a thing, as it is today. One is the availability of data. Because these are very data-hungry models. You have to have like thousands or sometimes millions of data points or images or something, to train those models - and you need a lot of computational power. GPUs and all this, it's something that was not so widespread, five, seven years ago. The combination of these two things allowed all the deep learning to flourish.
And then of course, you have the big players - right? They have a major impact. You have Google, Facebook - what do they do? They do their own models - and then once they train those, they release those to the public. They'll have the pre-trained model that you can use for something yourself. Because it's not like you and I or someone is going to train like a huge amount like that on their own.
Because, if you try that on the cloud, you'll be out thousands of dollars. And so, that will not be so easy. Pretty much everything relies on these pre-trained models. Because you can use them as a base. Someone else already did the hard work, now you just have to make small adjustments.
Then you may have an application that - the famous cats and dogs. The other day I read about this application that this person, she was detecting if a cat was happy or not. And then you're thinking, "That seems silly, right?" If you will just look at that, "Happy or not?" Why would the cat be unhappy? Turns out, when the cat was unhappy, it was actually sick. This application was able to detect early signs of sickness in cats. One was dehydrated, the other had some other problem. And then these people took the cats that were not happy, and took them to the vet - and then they were able to treat them early.
That's really cool, right? What I did not know, what I was curious to know, is like where did she get the unhappy cats pictures from, right? Because you need the pictures from the unhappy cats to create a model. That information - couldn't find it. This is the kind of thing that you can do. And you do that based on these other models that are already pre-trained by one of the big players. That's deep learning, right - in general.
PyTorch is a framework that you can use to actually handle those models - train them or use them to make these predictions about a cat being happy or not.
PyTorch itself was developed by Facebook. It's been around since 2016. And then there are two major competing frameworks - TensorFlow, which came first, from Google, and PyTorch. I like PyTorch better. I mean, of course, I'm biased, right? I wrote a whole book about it.
But I like it because it's kind of fun to use if you are a Python programmer, and you know how to handle Python. It feels so easy and so natural, to go through coding using PyTorch.
This is what I was trying to convey in the book, and first in the blog post. I actually had a good time learning PyTorch in the beginning, because, I don't know, it just feels natural.
Len: Thanks very much for sharing that. And of course we'll have links to the blog post that we mentioned, and everything else in here as well, so you can also get a sense of Daniel's conversational and fun, very fun style. Which you can also see in the About the Book description on the landing page on Leanpub for the book.
It's interesting that there's all these big players like Google and Facebook, making competing things - but that they're also releasing trained models or datasets to other people to use.
Probably most people who pay attention to some of this stuff in the press, will know about giving one of these models a set of pictures, some of which are cats. And then, what it does is it runs various algorithms, and learns to identify which images are cats. And so it can run some simulation, or some effort to identify what are cats.
And it can check against, "Well, how many of my guesses were correct, and how many were false?" And blah, blah. Then it can iterate, and it can get better.
But basically, what the algorithms are doing, is looking for correlations in the data - generally speaking, if I understand it correctly? I mean, I'm sure that's a big oversimplification.
Daniel: It's not so much as correlations, but I mean - this is tricky to explain, especially if you don't have a way to show people what's happening. But, well - basically what happens is you map - in this case, an image - right? Or the feature that you're using in the model. It will have a gigantic dimension in space, right? Because you have like two-dimensional, that we can see and plot nicely.
And if you have four-dimensional, where it just starts getting weird, and then - there is this joke that, "Okay, how do you imagine something in 14 dimensions? You just close your eyes, say '14' really loud - and then go." But that's silly.
Then when you talk about deep learning models, they have like millions and millions of parameters. That will be millions of dimensions. And that's impossible to visualize, or to even understand what's going on in a visual way.
But the idea is that all these instances, this image - let's say that you're doing cats happy or unhappy, like I was saying - what will happen is that the happy cats, they will be mapped into a region of these crazy dimensions in space. And the unhappy cats will be mapped to a different region.
Now, if you were saying this will be like three-dimensional, you have like a bowl, right? And then you see on the left side of your bowl, there are the happy cats. On the right side of the bowl, there are the unhappy cats. And then maybe, if you've got a well-trained model and it's easy enough, you can cut the bowl in half - and you have happy cats on one side, unhappy on the other. That's an oversimplification, but the idea is that you're going to separate them. It's not so much about correlation, but about mapping into these dimensions, multi-dimensional spaces.
It's weird. Just describing it feels weird. If I can draw this like I do to the students, it helps.
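A minimal sketch of the "cut the bowl in half" picture, in plain Python rather than PyTorch and not taken from the book. The six three-dimensional points, their labels, and the helper names `predict_proba` and `train` are all made up for illustration; a real image model works in far more dimensions. The loop guesses, measures how wrong the guesses were, and nudges a flat dividing surface until the happy cats fall on one side and the unhappy cats on the other:

```python
import math

# Hypothetical 3-D "features" for six cats; label 1 = happy, 0 = unhappy.
points = [
    (-2.0, 0.5, 1.0), (-1.5, -0.3, 0.2), (-1.0, 1.0, -0.5),  # left side of the bowl
    (1.0, 0.2, 0.8), (1.8, -0.6, -0.1), (2.2, 0.9, 0.4),     # right side of the bowl
]
labels = [1, 1, 1, 0, 0, 0]


def predict_proba(weights, bias, point):
    """Probability that a point is 'happy', via a sigmoid over a linear score."""
    z = sum(w * x for w, x in zip(weights, point)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def train(points, labels, lr=0.5, epochs=200):
    """Plain gradient descent on the average binary cross-entropy loss."""
    weights = [0.0, 0.0, 0.0]
    bias = 0.0
    n = len(points)
    for _ in range(epochs):
        grad_w = [0.0, 0.0, 0.0]
        grad_b = 0.0
        for point, label in zip(points, labels):
            # Gradient of the loss with respect to the linear score
            error = predict_proba(weights, bias, point) - label
            for i, x in enumerate(point):
                grad_w[i] += error * x / n
            grad_b += error / n
        # Nudge the dividing plane against the gradient
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    return weights, bias


weights, bias = train(points, labels)
# The plane weights . x + bias = 0 is the "cut" through the bowl.
predictions = [1 if predict_proba(weights, bias, p) >= 0.5 else 0 for p in points]
print(predictions)
```

Because the made-up points are cleanly separated along the first coordinate, a single flat cut is enough here. Deep learning models matter when the raw inputs are not separable like this and first have to be mapped into a space where they are.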
Len: I really love, personally, weird descriptions of things. Because they remind you that, even though you may be someone like me, who's read a lot of articles about this kind of thing, you really don't understand it. And just giving you a sense that there's this - what to you is a mystery, there's something solid behind it. But it has to be presented to you as a mystery, because you don't understand it.
It's interesting too, because I mean, of course - other things that people will be familiar with - the news about these kinds of models, is sometimes they can make mistakes, for example. They can identify people of a certain race as being more likely to do X, Y, and Z for all kinds of reasons that have nothing to do with race whatsoever. If there's bad data, basically, you can get bad results. And so you have to be careful about that.
But another very interesting, and in its own way, kind of controversial application, are self-driving vehicles. I know that, for example, Tesla uses PyTorch. And in this case, it's not happy cat or sad cat, it's like, "cyclist" or "street sign."
Just generally speaking, what's your sense of where technologies like that are going to be in, say, five years? And, again - that's a total like curveball kind of question. But -
Daniel: Yeah, that's tricky, right? Self-driving - that's interesting, and it's nice to think about in theoretical terms.
But it's delayed, because if you read the news from 2016, they'll say, "In 2019 we're going to have full self-driving cars doing everything." And then we're like - okay, three years past that already.
And some of the companies are not doing that anymore. I mean, Tesla's still doing it, but it's a hard problem.
And then, technically speaking, I find this challenging and interesting. But from the practical point of view, I wonder if that's wise?
Because I remember maybe ten years ago or so, before deep learning and computer vision became so widespread, the talk about self-driving cars was not about a smart car, but about a smart road.
You remember these Minority Report or I, Robot movies? The cars were evenly spaced on the road, and they were being driven by the sensors or whatever, in the road. They would move really fast, because you know everything was in place.
Because at the end of the day, the traffic - that's a control problem, right? If you had some central authority handling traffic, or in that case, something controlling traffic, it would be optimal, right? The problem is that you have multiple players competing.
And then in that sense, if you have one guy trying to drive at 200 kilometers per hour, and the other guy driving at 60 kilometers per hour, and you have a self-driving car in the middle of it trying to figure out these very different profiles of drivers, and trucks and whatever - and all of a sudden, there is a pedestrian crossing where he or she shouldn't be crossing.
There are so many challenges for that. I understand the appeal from the technical point of view. Because I mean, come on - every programmer loves a challenge. If it's something really hard and it's a puzzle, "Yeah, let me try to solve it."
But I wonder if there would be a different and easier way? On the other hand, if you think of that - how would you implement something like a smart road? You would need all the cars to be compatible with that road, which is a challenge in itself, right? And then you have to have the government or the private sector doing this renovation of the road, so it's possible. This will not happen anytime soon.
On the other hand, the self-driving car - if you put in enough effort yourself, you may try to have it done or on the road before that. You don't have to agree with anyone else, in that sense. Pros and cons, like I said - with everything out there, there are tradeoffs.
Len: Thanks very much for sharing that. That's such a fascinating concept, that we could talk about for hours, I'm sure. But I mean, it is true though, that to a hammer, everything looks like a nail, right? And if you're a programmer, and you're into technology, like computing technology - you're like, "Let's use this."
But when it comes to smart roads, for example - well, here's an example of smart roads: they're all one way, right? All of a sudden, self-driving just advanced ten years, if roads are all one way, or if they're all one lane, or something like that. You know what I mean? Those kinds of things could make a huge difference.
It does seem to me that often - I mean, this is an ordinary observation. But when new technologies come around, we try and replicate the old way of doing things with a new technology, instead of thinking about a new way of doing things, right? And so, the conventional self-driving car that people are thinking of, is like, "Oh, it's just like mine. It can do anything it wants. And it can go anywhere, doing anything it wants, and making decisions all the time." And it's like, "Ah, that might not be - there's a million people a year who die because we have that way of driving as a default mode, that just anybody can get in their car and go anywhere and do anything."
My personal view, on that note, is that, if and when the day comes that people don't own their cars anymore, that's when standardization starts to happen, of the kind that you were describing as a potential thing. Like when you don't own your car anymore, then you don't care about its features or even what its color is, or something like that - potentially.
And so all of a sudden, the opportunities for standardization come in, and then we could see a lot of movement on that area.
But yeah, thanks very much for sharing those thoughts. This is a personal interest of mine, and it's great to hear an expert in these technologies like you talk about it.
Moving on to the last part of the interview, where we talk about the process of writing your book, and things like that.
It's a very big book. You said you spent many, many hours on it. I was wondering if you could talk a little bit about your approach to it as a writer? You mentioned earlier, when you're talking about university degrees - discipline and plans, and things like that. Did you say to yourself, "Okay -?" Or did you work out after a while, "I'm going to get up at four in the morning and write for three hours every day?" Did you have a system like that that you used to work on the book?
Daniel: That's one of the reasons that I mentioned having a schedule, or something that's important, when you're attending institutions. Because I know, myself - I'm not that good at doing it myself. I had the rough idea of what I would like to accomplish with the book when I started. I knew that the blog posts were like the baseline, the initial building blocks of that.
But I was very optimistic. "Yeah, when I write this, like two months from now it's going to be over." It took like eleven months or so, so I was way off the initial planning.
But also, I didn't want to make that like a job. I didn't want to set a deadline for myself. Because these parts - sometimes programming, coding - or in that sense, also writing - is a creative process, right? It's less than a job or a task that needs to be done, and more of a creative process. So, I wanted to let it flow.
Basically, I started writing. I had some idea what I wanted to do there. But, no, I was not sticking to anything very strict. I started writing. And then there would be good days and bad days. Some days I would write, I don't know? 4,000 words in a single day. Because it was really going.
Other days, it was like, "Okay, I don't feel like writing." But then I would still try to force a little bit, to write at least 200 words or so. Even if I had to review it the next day, to not let it - because if you procrastinate too much, then you end up not doing it. I was trying to keep at least some of the writing -
But the important part, was to get all the code - since this was a book about PyTorch, and code is the major part of it - was to get all the code working and organized first. I had all these big notebooks with lots of code on it.
Once it was, "Okay, I'm happy with the code," then I started telling the story that was in my mind. Then I started writing. So, "Okay, yeah - this is not working the way that I'm telling the story." And so I would go to the code and make some tweaks, and then go back to the text.
The idea was to tell a story, and more than that: to explain it in a way that would be more clear. Because even for me, when I'm trying to learn something new, I would go there, and be like, "Okay, this seems cryptic." Or sometimes you see the way that it is presented, it just overlooks a lot of steps, as if they were obvious. They may be obvious to you once you've done this for many years. But it's not obvious to the person that's reading it. They're like, "Okay, how the hell did he go from here to there?" I was trying to bridge out those gaps as I would see them, coming from other sources, and say, "Okay, how do you go from this to that? I want to go step by step." That's where the "step-by-step" from the title comes from, right?
It's a painful process to dissect every single step of the way, to show how it works, without assuming much. That was the main idea.
I learned a lot myself in the book. Because many things I learned before writing, and then like, "Yeah, I'm going to talk about that." And then all of a sudden, "Okay, but what's the impact of changing these, or changing that - or how does this have an effect on the result when you're training the model?"
I investigated lots of scenarios and different permutations to see how they will play out. I found lots of new stuff, even for me, during the process of writing. Which was very nice. The writing process was a lot of work, but it was very much enjoyable. Because it was being creative, ultimately.
Len: That reminded me of an interview I did with an author named Eric Matthes, who's got this book on Leanpub called Beginners Python Cheat Sheets. He has, I think, one of the bestselling beginner’s books on Python in the world. And he said something very similar to what you said, which is like, really getting it step by step - really doing that is such a challenge. And you do learn a lot along the way.
But it is something you kind of have to enjoy at the same time, right? If you're not getting any enjoyment out of it, then it's tricky. But it is funny how you really do have to push yourself to not skip stuff.
Daniel: I mean, I'm curious by nature, to always have more. A friend of mine was helping me with like reviewing the first draft of the things. And then at some point, he was like, "Well, why don't you talk about optimizers?" I was not planning to talk about optimizers. But when he mentioned it, "Yeah, ok, where can I add optimizers?" There were like another 40 or 50 pages in the book about optimizers alone. I tried to cut some stuff short, because it was already 1,000 pages. But it could have been longer. Because I really liked digging into the data and learning more about it.
And that's when, "Okay, I need to finish this. I mean, some people bought the book already. They are waiting for it to be finished, I need to finish."
This is the difference between being self-published and working with traditional publishing, right? Because there you're going to have to fit your work within the parameters that they expect you to do. Then you have to make it shorter. Then you can't go into so much detail. Because they don't want to have a 1,000-page book, they only have up to - I don't know? 500 or so. I like the freedom of being able to decide myself, "Yeah, I want to go crazy and have 50 more pages on optimizers alone." And that's it, right? I like that, alright?
Again, there is a tradeoff. Because on the other hand, I have to control myself enough not to extend myself too much in the book. But, yeah, I think it would work.
Len: You mentioned in there that while you were still working on the book, people had bought the book already, and were waiting for it to be finished. I looked this up before - you published your book in-progress. You published the first two chapters, I think was the first thing that you did when you launched it. And then published more chapters along the way. What was your experience like publishing like that? Did you get lots of people kind of helping you with corrections or suggestions or complaints or demands?
Daniel: I think one of the points that - doing it partially, it's interesting. Because, I mean - knowing that some people already bought your book, and they are expecting to have it finished, really helps you get going. Because I didn't want to let anyone down, right? I made a promise to myself, and to the people that already bought the book, that I would publish one chapter a month. And of course, then I got two or three chapters ahead. If there was any delay, I would have some buffer.
Len: Smart.
Daniel: And it was hard to keep up until the end with a chapter a month. I got lots of feedback from my friends. They are all present in the acknowledgements page of my book. Because they really helped me with reading and most of the things like, "Okay, what do you mean by that?" Or, "Can you make it more clear?"
Because even when I was trying to do step-by-step - sometimes my own bias, or by knowing something for so long - would get in the way, and then I would just jump over something, right? They would check me in, like, "Hey, why is that? Where does this come from?" This was very helpful.
Luckily I don't remember any complaints. I remember there were a couple of refunds along the way. One person complained about the layout, "I don't like the layout very much." Because layout is really, really hard to get right. I spent like ten months writing the book, and between formatting everything, and getting everything right, and then I organized a paperback edition - I was doing that. Just like another ten months.
Len: Oh, ten months?
Daniel: Yeah. Because I hired someone to proofread everything. Because I mean, I'm Brazilian, right? I'm not a native speaker. I tried my best to write it in correct English, but some of the expressions or stuff that I used, was not sounding so well to a native speaker, so she helped me with that. And then I had to review all that.
And formatting, and then sometimes the pictures would not be so great. I have to do it over. So, yeah - I mean, of course I made a lot of mistakes at the beginning, because it was my first book, right? I shouldn't have started with a 1,000-page book. I should have started with something smaller, to learn first. For the next one - I'm going to save some time on that.
Len: Actually, in the interests of saving time - and helping people out listening, who might be planning their first book or working on their first book - what tools did you use to write the book? I'm assuming you didn't use Microsoft Word, if you were doing a lot of formatting and it was 1,000 pages long. Can you just let us know the detail of what you used?
Daniel: Actually, there were two stages. First, I was writing in Markdown. I used this editor called Typora, which is very nice. I really liked it. Because the problem with doing Markdown - some of the editors, they have two windows, right? In one you write Markdown, the other you see it. And then it's not nice. Because it gets - at least for me, it would get in the way of flow.
In Typora, you write in Markdown - it will render it automatically in the same window. It felt like, okay, you can do stuff, you would write all of that. But then in there, when I - also when trying to make the PDF version, I bumped into some problems that I couldn't overcome with that. And then I had to convert everything to Asciidoctor, Asciidoctor PDF, Asciidoctor EPUB. Documents, need to make all these conversions. I had to make this change to convert some of the chapters for that. Then I was able to get the PDFs in the way that I wanted, autoformatting.
The funny thing is that, even if I did this, after three or four chapters, I still kept writing the book in Typora. Because it felt so good to write it. Then I was already mastering the conversion tool, just writing everything in Typora - and then in the end, convert Markdown through Asciidoctor. Make some tweaks, and then we can write the PDFs for the book.
But that was not the most challenging part. I think that the worst part is getting all these images correct. Because then you have all the resolutions, and I have to change and resize and rescale these so many times. That was a lot of work. But yeah, I'm proud of that. I'm really happy with the result.
Len: Oh it's an amazing book, very well-produced. Thank you very much for sharing those details, including the challenges in the ways you - you have to find your path and develop it yourself.
It's one of the interesting things about self-publishing, is that, of course you're free of the constraints that the publisher might impose on you, and stuff like that. But you're also, in a sense, free of all the professional help they can give you producing your book. And so, you have to learn how to do it yourself, or pay someone to do it for you. But even then, you still have to learn how to work with people who are paid to do this kind of thing. You still have to learn a lot about it. Because it is hard to make a well-formatted book, particularly one with lots of pictures and things like code samples, and things like that.
The last question we always save for guests on the podcast, if they're a Leanpub author, is: if there was one terribly broken and frustrating thing about Leanpub that we could fix for you, or if there was one magical feature that we could build for you, can you think of anything you would ask us to do?
Daniel: I don't think anything is terribly broken. I'm really happy with the overall experience with Leanpub. The only thing that - at the very beginning when I tried to use the integration with GitHub - so, writing the Markdown, it'll just send it to Leanpub itself. Sometimes, when it was rendering the code - and there'd be gaps - it would not respect the boundaries of the background. Then it would be a little bit above, or a little bit below.
That's one of the reasons that I ended up doing the book using Asciidoctor. Because I couldn't solve this. When I was trying to do the code snippets, sometimes they would not be rendered properly.
But that was the only thing, the only problem that I had. Since I had like a whole lot of snippets of code in the book, that was something that I could not like handle, or get over with. That's why I made the change.
But apart from that, I'm happy. I really like the Table of Contents thing that you can put the HTML inside and put it there. So, no complaints - don't worry.
Len: Thanks very much for sharing that. Yeah, there are some cases where sometimes the formatting of the output in the Leanpub book isn't exactly what the author wants. If it's not exactly what you want, some people are like, "That's fine." And other people are like, "No, I want it to be exactly like I want it to be." It all depends on the project and the author, and things like that. Thanks very much for sharing that. That's some very important feedback for us to hear.
The HTML feature that you're mentioning - so, if you're uploading a book to Leanpub, as opposed to using one of our own writing flows - we can't see the Table of Contents in the file that you've uploaded. And so we can't show it on the book landing page.
But we do give you an option to - if you know a little bit of HTML, you can actually tell Leanpub what to show for the Table of Contents on the book landing page, which really helps readers discover the book, and decide whether they want to buy it, and whether it's going to give them what they want.
Well thank you very much, Daniel, for taking the time out of your evening to talk to me, and talk to our audience. And thank you very much for using Leanpub as the platform for your really amazing book.
Daniel: It was very nice to talk to you, I'm really happy to participate. This was my first podcast, thank you so much for the invitation.
Len: Oh thanks very much. You did a very good job. Thanks.
Daniel: Thanks a lot.
Len: Thanks.
And as always, thanks to all of you for listening to this episode of the Frontmatter podcast. If you like what you heard, please rate and review it wherever you found it, and if you'd like to be a Leanpub author, please visit our website at leanpub.com.
