Leanpub Podcast Interview #39: Patrick Applegate
published Nov 23, 2016
Patrick Applegate is the co-author of the Leanpub book Risk Analysis in the Earth Sciences: A Lab Manual with Exercises in R. In this interview, Leanpub co-founder Len Epp talks with Patrick about his career, his book, and his experience self-publishing on Leanpub.
This interview was recorded on July 14, 2016.
This interview has been edited for conciseness and clarity.
Len: Hi, I’m Len Epp from Leanpub, and in this Leanpub Podcast I’ll be interviewing Patrick Applegate. Patrick currently holds the position of scientific programmer with SCRiM, the Sustainable Climate Risk Management network, hosted by the Earth and Environmental Systems Institute at Penn State University. [Patrick was working in the Sustainable Climate Risk Management network when this interview was conducted. He has recently begun working for Research Square, a company that helps to streamline the scientific publishing process for scientists and publishers. - eds.]
Patrick has a PhD in Geosciences, and has authored or co-authored 20 peer-reviewed scientific papers. He has done research on areas like estimating the age of glacial deposits, and the contribution that ice sheets make to sea level rise. He is the co-author of the Leanpub book Risk Analysis in the Earth Sciences: A Lab Manual with Exercises in R.
Risk Analysis in the Earth Sciences is a free textbook that teaches readers about statistical concepts required for doing assessments of climate risks. In this interview, we’re going to talk about Patrick’s professional interests, his book, his experiences using Leanpub, and ways we can improve Leanpub for him and other authors. So thank you Patrick for being on the Leanpub podcast.
Len: I usually like to start these interviews by asking people for their origin stories. I was wondering if you could tell us how you first became interested in your academic field, and your journey to your current position with the SCRiM Network at Penn State.
Patrick: Sure. I actually started out life as a geologist, but ended up in this position, and along the way, I sort of ended up as a data scientist and a technology educator.
So it turns out there are lots of different kinds of geologists, and I was the type of geologist who goes and looks at glaciers to see how they’re changing, and at traces in the landscape that tell us how glaciers have changed in the past.
We do this because glaciers are really sensitive to past climates, so they can tell us about how the climate has changed in the past, if we know something about how the glacier has changed. That tells us that it got warmer or colder, depending on the direction of the change. We’re also really interested in glaciers and ice sheets, because when they melt, they make the sea level go up.
And that really matters because if all the ice on land right now were to melt, sea level would go up by tens of meters or hundreds of feet. And people living along the coast right now would definitely notice that. People are already starting to notice that. And so if even a small fraction of the land ice that’s out there right now were to melt, that would really matter for people. And we’re interested in what could happen in the future. Understanding the past helps us say something about the future.
So I stopped being a straight up geologist, and became more of a computer person during my PhD. The landscape tells us how a glacier has changed in the past, but we need a time signal in order to get at rates of change. There’s a method out there that lets us take samples from the landscape and figure out the ages of different land forms, and that provides the time signal that we need, in order to look at how fast glaciers have changed in the past.
Turns out that method isn’t perfect. There’s a lot more noise in the data than we would expect, based on our measurement techniques. And so I spent my PhD developing computer models that help us to understand that noise, and to pick out the signal in already published data sets. So that was sort of my first step from being a geologist, towards being somebody who spent more time in front of a computer.
But then a couple of years before I finished my degree, the Fourth Intergovernmental Panel on Climate Change Report came out. This was in 2007. So the IPCC is a big body that’s organized by the United Nations, and it really organizes scientists to assess every few years what’s known about the climate system and how it works. So the report that I’m talking about right now came out in 2007. There was another report that some of your listeners have probably heard of, that came out more recently in 2013.
So going back to that 2007 report, one of the main conclusions was that we really didn’t know how the great ice sheets were going to contribute to sea level rise in the future. And that really surprised me, because at that point, people had already been building computer models of how the ice sheets work for decades. We know a lot about how ice sheets behave.
So my question at that point was - well how is it that we still don’t know enough to predict what’s going to happen with the ice sheets and sea level rise? And so to try and understand this question better, my first job after my PhD involved going to Stockholm University in Sweden, and working with one of these computer models, to try and understand where these remaining questions were coming from.
So there again - step number two away from being a geologist, towards being a computer person, working with computer models. So once I was done with that, eventually that project came to an end, and I moved back to the US. I kept working with sea level rise, but the group that I moved into - and that’s the SCRiM group that you’ve already mentioned - uses the R programming language for everything.
So that group is very heavily invested in the R programming language. And this is one of the key languages that people use in data science - that’s a discipline that combines programming, statistics, and a real understanding of data, to pull out insights. I know that you’ve already interviewed Roger Peng, Jeff Leek and Brian Caffo for this podcast series, so it’s a lot of fun to be following them here.
Anyway, so I picked up R there, and took another step away from being a geologist, and more into the computer realm. During my time here, I’ve spent a lot of time writing code, and a lot of time teaching people - teaching other people how they can write their own code. So I co-taught a course that developed into the book that we published with you guys at Leanpub. And then I spent a year as an instructor at one of our branch campuses. And I spent time there bringing R programming into those courses, which was a lot of fun.
So at some point I realized that I really enjoy writing code and teaching other people to write code. And so that’s the story of how I went from being a geologist to being a data scientist and technology educator.
Len: Thanks very much, that’s a really great answer. You actually answered my next two or three questions spontaneously.
I guess I’d like to ask you next about SCRiM. Tell us a little bit about the Sustainable Climate Risk Management Network - what its purpose is, who are the people working on it, things like that.
Patrick: Sure. SCRiM’s a network of people, mostly at universities, who look for strategies to help us manage the risk caused by future climate change. So we expect that climate change is going to have consequences for us now and in the future. Particularly in the future. We’d like to figure out how we can best address those challenges. And so, the network’s mission statement says that the strategies that SCRiM is trying to come up with, need to be sustainable. They have to be scientifically sound, technologically feasible, economically efficient, and ethically defensible. I’m just reading from the website here, so that I don’t mess this up.
Patrick: As you can imagine, creating strategies that satisfy all those requirements, requires bringing together a lot of people with very different expertise. And so SCRiM is this organisation that makes that conversation possible. And I should mention that what really makes this work is a generous grant from the US National Science Foundation. So we thank them for that support.
Len: You had an article published in 2015 about something called “solar radiation management geoengineering”. I read the tech/environment news, so I think I have some idea of what that is, but I was wondering if you could explain - as an expert - a little bit more about what that is, and maybe talk about climate engineering more generally - the current state of affairs and that discipline.
Patrick: Right. So one of the things that I loved about Jeff Leek’s podcast in your series here, was that he was careful to say at some point, “that’s actually a little outside my expertise”. And so, I’m going to give you a partial answer here. And I’m going to try to flag the parts that I know more about, versus the parts that I know less about. Does that sound fair?
Len: Oh perfect, yes, yeah.
Patrick: Okay. So the part of this problem that I’m the most familiar with, has to do with the connection to sea level rise and the ice sheets - as I’ve said before. And so you mentioned solar radiation management, which is also called “albedo modification”. This is the idea that we could deliberately change the earth’s climate to be something that we would prefer - by changing the earth’s reflectivity, essentially. So if we were to somehow make things so that a little more solar radiation went back into space, instead of making it through the earth’s atmosphere to the surface - then that would make things cooler down here, and maybe we would like that better.
So that’s all fine, and I should mention that this technology has not been tested. People are starting to talk about testing it. There are other strategies out there for potentially changing the climate. But the aspect that we looked into, had to do with some claims that some scientists had made, that this would help us avoid sea level rise. So on the face of it, this makes a lot of sense. As temperatures go up, the glaciers and ice sheets melt. The ocean waters expand, and that all makes sea level go up.
So it seems pretty logical, that if you were to just make it so the temperatures went down - or didn’t go as high - that should then save you from sea level rise. So our work involved using one of these computer models of ice sheet behavior. And essentially what we were able to show is that it’s a lot less effective than you would think in terms of preventing sea level rise from the ice sheets - that the anticipated benefit of solar radiation management and other geo-engineering schemes probably won’t materialize if they’re put into practice.
Len: And why is that?
Patrick: So what’s going on is that the ice sheet creates its own weather, because it’s both white on the top, which makes it reflective, and it’s also tall, and so there’s this thing called the atmospheric lapse rate, which means the temperatures decrease with elevation. So the higher you go on a mountain, the colder it gets. Because the Greenland ice sheet’s a couple of kilometers tall, and it’s above sea level in its middle part, and because it’s reflective, it’s cold up there. But if the ice sheet gets a little bit smaller, then that reduces that happy weather generating effect that the ice sheet has.
So if it shrinks a little bit, and then you try to save it by bringing the temperature down, you’re unlikely to be successful. Because the system develops an inertia that keeps it going in the direction you don’t want it to go.
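[A rough numerical illustration of the lapse-rate effect Patrick describes, written in Python rather than R. The 6.5 °C per kilometer lapse rate is the standard-atmosphere value; the elevations and the sea-level temperature are illustrative assumptions, not figures from the interview or from the ice sheet model used in the study. - eds.]

```python
# Toy illustration of the elevation feedback: a lower ice surface sits in warmer air.
# All numbers are illustrative assumptions, not values from the interview.

LAPSE_RATE_C_PER_KM = 6.5  # standard-atmosphere lapse rate, assumed constant


def surface_temperature_c(sea_level_temp_c: float, elevation_km: float) -> float:
    """Air temperature at a given elevation under a constant lapse rate."""
    return sea_level_temp_c - LAPSE_RATE_C_PER_KM * elevation_km


def warming_from_thinning_c(old_elevation_km: float, new_elevation_km: float) -> float:
    """Extra surface warming caused only by the ice surface getting lower."""
    return LAPSE_RATE_C_PER_KM * (old_elevation_km - new_elevation_km)


if __name__ == "__main__":
    # Hypothetical ice surface at 3.0 km with 0 °C air at sea level.
    print(surface_temperature_c(0.0, 3.0))
    # The same surface after thinning by 0.5 km.
    print(warming_from_thinning_c(3.0, 2.5))
```

Under these assumptions, a surface that drops from 3.0 km to 2.5 km warms by 3.25 °C with no change in the wider climate, so cooling the climate afterwards has to make up for that extra warming before the ice can recover.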
Len: I think that a lot of our listeners are probably on board with the idea that climate change is real. I think probably many of them also know that a rise of tens of meters would be very disastrous. It wouldn’t just mean some beaches go away. It would be very bad. Without committing yourself to any kind of position professionally, what’s your personal feeling about the direction things are going with respect to attempts to mitigate the future impact of climate change?
Patrick: Okay so - I want to be careful to emphasize that we’re not talking just yet about a future where sea level goes up by hundreds of feet or tens of meters. It’s more that relatively small changes can still create problems for us. And we want to think in advance about how we’re going to meet those challenges. And that’s really what SCRiM ends up being about - understanding what the challenges might look like, and developing strategies to try and address those.
In terms of where I feel our efforts are going, there I really have to say, that is totally outside of what I can address, I’m sorry.
Len: Oh, no, that’s okay. I understand and I very much appreciate your straightforwardness about that.
Just changing topic slightly. I can see from your bio that you’ve taught college level courses, and you’ve done two postdocs. I wanted to ask a little bit about the value you see in university education. I ask because, especially in the startup tech scene, there are people who repeat the refrain that university education is no longer necessary. It’s sort of the more extreme version of the “We should all learn online now” argument. And I know it’s a big question, but I was curious to hear your opinion on this topic, especially as someone involved as a sort of academic scientist in university life.
Patrick: Right. The value that I have seen in my education is that, from a straight-up personal level, I’ve gotten to do a lot of cool things as a result of the education that I’ve received. So I’ve gotten to do field work in Greenland and Peru and the western United States. I’ve gotten to visit glaciers in Canada and Iceland. Lots of different places. And that’s the kind of thing that I certainly wouldn’t have received if I’d studied all these things in an online program.
Also there is a habit of thinking, and a way that you’re taught to think in advanced education, that you certainly can acquire for yourself outside of it, but I still think that the system of graduate education that we have here in Canada and the United States, and also in Europe, is really very excellent. And that’s something that’s worth preserving and maintaining.
Now I can speak to the other side of your question here, because I have gone on all these Coursera courses, and also done some online education through Penn State. And I have to say that if you’re a very self-directed learner, absolutely, you can pick up everything you need to know that way. To do lots of different things.
So I guess what I would say is that for sure, if your goal is to be someone who’s more on the tech side, then maybe the best strategy really is to do some personal projects, get involved in projects that other people are doing. Find a codebase to work on. And pick up what you need to do, what you need to know from the internet. But if you’re doing something that’s more, certainly more hands on, or requires you to travel and see things, I still think in-person education winds up being a good way to go. Better or worse, I won’t say.
Len: Okay. Turning to the topic of your book, you mentioned already the programming language, R, and I was wondering if you could explain a little bit about why its popularity is growing amongst scientists, and data scientists in particular?
Patrick: Sure. So reasons why people love R. R is a great language, because it’s free, it’s open source. There are a lot of people that use it, and they share their code. So free and open source is nice, because if you’re working at a university or a company that has site licenses for whatever statistical packages you want to be using, that’s all fine. But if you’re cash-strapped, or you’re working in a smaller organisation - boy it sure is nice to know that there’s a full-featured statistical and computing environment, that you can just go download off the internet, and do whatever you want with.
So we’re getting to the point in the modern world where you really can operate almost exclusively in a Free Software environment, and basically not pay for things, not have to pay for things. It used to be that if you wanted a good vector drawing program, you had to get Adobe Illustrator. These days, you can download Inkscape. And it’s kind of the same way in scientific computing. Pretty much everything you need can be had for free. So that’s very nice.
The other part of it, I mentioned that people share their code. And so, R has this great system that’s called the package system, where, in one line of code, you can download and install these nicely-formatted software libraries that other people have written. And then you’re off to the races. You can use their code to do whatever you wanted to do. And that’s just a tremendous time saver, because it saves you from reinventing the wheel, and also from testing any tools that you might build.
So you could write your own Markov chain Monte Carlo algorithm for example. But why would you when someone else has already done it, and a million people have used it - and it works?
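[To give a sense of what a hand-written Markov chain Monte Carlo routine involves, here is a minimal random-walk Metropolis sampler. It is a sketch in Python rather than R, it is not code from the book, and the function names, step size, and standard-normal target are all assumptions chosen for illustration. - eds.]

```python
import math
import random


def log_standard_normal(x: float) -> float:
    """Log-density of a standard normal, up to an additive constant."""
    return -0.5 * x * x


def metropolis(log_density, n_samples, step=1.0, start=0.0, seed=0):
    """Random-walk Metropolis sampler for a one-dimensional target density."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    x = start
    log_p = log_density(x)
    samples = []
    for _ in range(n_samples):
        # Propose a move by adding Gaussian noise to the current position.
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_density(proposal)
        # Accept with probability min(1, p_new / p_old), computed on the log scale.
        if rng.random() < math.exp(min(0.0, log_p_new - log_p)):
            x, log_p = proposal, log_p_new
        # The current position is recorded whether or not the move was accepted.
        samples.append(x)
    return samples


if __name__ == "__main__":
    draws = metropolis(log_standard_normal, 20000)
    print(sum(draws) / len(draws))  # sample mean, which should be close to 0
```

Even this short version leaves out tuning the step size, discarding early samples, and checking convergence, which is part of why a widely used and tested package is usually the safer choice.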
Len: Speaking on the topic of free versus paid, when you were first getting on board with Leanpub, you emailed us, and asked us to set your book so its maximum price would be zero.
Just to explain to people listening, this is relatively unique. On Leanpub, books are sold on a variable pricing model, so that you set a minimum price for the book and a suggested price, and then customers can pay the minimum price, or anything between the minimum price and the suggested price, or anything above the suggested price. I think currently the maximum is $500. So what you can do, is you can set your book to a zero minimum price, and a zero suggested price. But it’s still possible that people can buy your book.
Now if you email us, like Patrick did, we are happy ourselves to set the maximum price to be zero. So it’s a totally free book that people can’t pay for. And I just wanted to ask you - I mean, I think I know the answer, but I wanted to ask you to explain to anyone listening who’s maybe picked up your book and is now listening to the podcast, why the book is free?
Patrick: Why is the book free? Well you probably have already guessed the answer. As I mentioned, we were funded to do all the work that the book is based on by the US National Science Foundation. And so, we got paid to do this from public funds. And it seemed to us best to make that free and open source. So not only is the book free on Leanpub, but you can go download all the source off of GitHub and build your own book if you want. So that’s basically the answer - that we felt that this was really the most honest way of distributing what we had done, so that other people could use it.
Len: Yeah, thanks. I think it sounded like a very good principle - that if something is publicly funded, then making it available for free sounds appropriate.
Len: We don’t currently have a ton of textbooks on Leanpub, but we would love to have more. From your perspective, is there anything missing from Leanpub that you think might help us attract more textbooks or accommodate the needs of textbook authors and editors better?
Patrick: You know, I can’t think of anything. I do think that Leanpub is a great platform for publishing this kind of thing. So you have some penetration, particularly in the technology field already. I wouldn’t be surprised if in the future you had more people writing textbooks there. So I’ve contemplated putting together, essentially, book forms of notes from classes that I’ve taught. I wouldn’t hesitate to do it.
Len: Do you think that for a textbook author, that the in-progress model of publishing that Leanpub allows people to use, is something that might be useful? For example, I’m curious to know if someone writing a textbook would be open to changing a chapter that they’ve written, based on feedback that they’ve publicly received from a reader.
Patrick: That’s a good question. The in-progress model that Leanpub follows is tremendously helpful to us. We’ve recently released an updated version of the book, that actually incorporates a lot of feedback that we’ve gotten from our classroom users. Some of the authors have taught a course with the book, and gotten feedback, where people said, “Hey, this part needs improvement. I didn’t understand it”. So where we can, we’ve gone and fixed those things. And we’re also going to add a bunch more chapters, so the book’s going to get something like 50% longer in the not-too-distant future.
I would think that - particularly for fast-moving fields, that’s very, very nice. In terms of textbook authors, sort of traditional textbooks - you have to have some kind of minimum length, so ideally a textbook is something that you can use to teach a one semester course, or even a full year course. So, you do want to make sure that your book has some number of chapters before you publish it. That I think is the one thing that, if you’re going to be doing like a calculus textbook, the body of knowledge there does not change rapidly. And also, it has to be a particular length, in order to be of use. Does that make any sense?
Len: Oh yeah, that totally makes sense. That’s a fantastic answer, thank you. It hadn’t occurred to me before, and it’s a very good point, that oftentimes a textbook is meant to be taught along with a course, and that a professor probably isn’t going to buy your book in the hope that chapter five will be ready in time for week five. The in-progress approach, being able to change things rapidly based on feedback that you’ve received from people, or to publish new chapters later, is all very useful, probably more so than for the average Leanpub book. But the book needs to be complete, with respect to what it’s doing, before someone else would take up that textbook and use it in their course. So that’s a very interesting thing for us to learn.
Patrick: Yeah the standard for the Minimum Viable Product for a textbook that’s going to be used in the classroom is probably a bit higher.
Len: Yeah, yeah. And I think, I believe you wrote your book using your own tools, and you produced a PDF and then you uploaded it to Leanpub. I’m curious if you would think about writing in plain text and in Leanpub Flavored Markdown or Markua - as we’re developing in the future. Is that the kind of thing that you would find attractive?
Patrick: Potentially yes. This was a case where we had a technology in mind that we wanted to use to write the book. I got really excited, when I realized that Leanpub would let us put up our own home-brewed PDF, and I noticed recently that you do have GitHub integration, which could make this very simple for us. The challenge, or the reason why we did things the way we did, is because in R you have this great technology that’s called R Markdown.
So it’s R-flavored Markdown, and that lets you have pieces of R code interleaved with your human readable text. The interpreter goes through, generates a nice LaTeX document, and then turns that into a PDF. But it also processes the code, and dumps any results to the PDF document. So that is very nice, because if you then go back and change the code part of the document, you don’t have to remember to update the output part. You just have to check and make sure it doesn’t look funny afterwards. Does that make any sense?
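[The idea of interleaving code with text and regenerating the output automatically can be shown with a toy analogue. The sketch below is in Python, not R; it is not how R Markdown is implemented, and the `<<code>>` and `<<end>>` markers are hypothetical, invented for this illustration rather than taken from R Markdown syntax. - eds.]

```python
import contextlib
import io

CHUNK_OPEN = "<<code>>"   # hypothetical marker, not R Markdown syntax
CHUNK_CLOSE = "<<end>>"   # hypothetical marker, not R Markdown syntax


def render(document: str) -> str:
    """Run each code chunk and splice its printed output into the text."""
    namespace = {}      # shared across chunks, so later chunks see earlier variables
    output_lines = []
    chunk = None        # None while reading prose; a list while inside a chunk
    for line in document.splitlines():
        if line.strip() == CHUNK_OPEN:
            chunk = []
        elif line.strip() == CHUNK_CLOSE and chunk is not None:
            buffer = io.StringIO()
            with contextlib.redirect_stdout(buffer):
                exec("\n".join(chunk), namespace)
            output_lines.extend(chunk)  # keep the code visible in the result
            output_lines.extend("## " + out for out in buffer.getvalue().splitlines())
            chunk = None
        elif chunk is not None:
            chunk.append(line)
        else:
            output_lines.append(line)
    return "\n".join(output_lines)


EXAMPLE = """Average rise across three made-up scenarios, in centimeters:
<<code>>
rises_cm = [30, 50, 100]
print(sum(rises_cm) / len(rises_cm))
<<end>>
Editing the list above changes the printed number on the next render."""

if __name__ == "__main__":
    print(render(EXAMPLE))
```

Because the printed result is produced each time the document is rendered, there is no separate copy of the output to keep in sync with the code, which is the benefit Patrick describes.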
Len: It totally makes sense, thanks for that - that’s really great to hear about that process. I was wondering if you intend to make a print version of your textbook?
Patrick: We haven’t thought about that. I would have to say - I mean certainly that is a discussion that would have to happen with particularly my co-editor, Klaus Keller, but also the other authors. And that’s something that hasn’t come up so far.
Len: One thing I always like is to finish off interviews by asking people, if you could ask us directly to add one feature, or to fix one problem - and we would go to work doing it for you - what would that be, or is there anything that you can think of?
Patrick: Things that could be added? Well you and I had a little email correspondence recently about the past downloads page, I think? And the - this is so trivial that I hesitate to bring it up. Let’s just say I can’t think of anything substantial. From an author’s side, I don’t think there is anything.
Len: Increasingly that’s the answer that I get to that question from people that I interview. I think I might have to try closing off these interviews with another question, or maybe try formulating it a different way from now on. But I wanted to say, thanks very much. I think people will find it really interesting to hear about data science from an environmental perspective, and environmental science.
Also other people in your position who are thinking about publishing textbooks on Leanpub will probably be very interested to hear about your process and how you - especially at the end, how you described incorporating feedback and improving the textbook for current and for future readers.
So, thanks very much Patrick for being a Leanpub author, and for being on the Leanpub Podcast.
Patrick: Thanks Len, this was a lot of fun.
Len: One last thing, is there anything that I left out that you wish I’d asked you about in this interview?
Patrick: I want to be sure that I give appropriate credit to my co-authors on the book. It’s been great fun to work with these people. So we’re expanding the book; we’re going to add more chapters. But the people who are on the book right now include Klaus Keller, Ryan Sriver, Greg Garner, Alexander Bakker, and Richard Alley.
I should mention Klaus in particular. He’s head of the SCRiM Network, and he was co-instructor on the course that gave rise to the book. Ryan’s an atmospheric scientist, who thinks a lot about tropical meteorology and hurricanes. Greg’s an atmospheric scientist who works on decision science. Alexander thinks about sea level rise. And Richard Alley’s a great guru of ice sheet science and climate science in general. So there’s a lot of different expertise among the people who’ve been involved with this book. And we’re only going to add people as time goes on.