Forging Python


Introduction

Being honest may not get you many friends but it’ll always get you the right ones.

- John Lennon

Youngstar: Hey Graybeard, tell our readers a bit about yourself.

Graybeard: Why don’t you introduce me and I’ll introduce you?

Youngstar: Great idea. Let’s see - you’ve been around the IT industry since punch cards, and you’re the best proof that you can teach an old dog new tricks. I have no idea how you find time to learn all the cool stuff you know. You’re smart and usually pretty quiet until you start talking about technology. You’re currently not doing much work but still manage to earn a lot. Oh - and you have a quirky sense of humor.

Graybeard: Cute, I’m not that old though. About you … You’re somewhat new to IT, finished college a few years back. You’re very bright and motivated and you sold your company not long ago for way too much money. You also like to learn and are one of the few people who get my humor. Oh - and you’re one of the great examples that a woman can make it in high tech.

Youngstar: Thanks. And yeah - I’m good at pretending to like your humor.

Graybeard: At least you try, my wife doesn’t even bother.

Youngstar: Oh, and we’re fictional characters.

Graybeard: We are?

Youngstar: Don’t pretend you don’t know it. How does that make you feel?

Graybeard: Really? This is not that kind of book.

Youngstar: Can you recall how we met?

Graybeard: I think it was just when I was leaving that company to start freelancing. And you just arrived, still wet behind the ears.

Youngstar: Yeah, I think we had about a month together before you left. Man those were big shoes to fill!

Graybeard: I hope the smell wasn’t that bad.

Youngstar: It was OK, I killed most of the fungus. Can you tell the readers about this book?

Graybeard: After I left, we decided to meet about once a week in “The Forge”.

Youngstar: “The Forge” is a great pub just down the road.

Graybeard: Thanks for the closed captioning. And yes - it’s a great pub. We were geeking out regularly and I was kinda mentoring you when you started that startup doing that online thingie.

Youngstar: That was both great help and a lot of fun.

Graybeard: Yeah, and we keep meeting about once a week. But it has been less fun since you made all that money selling your company and became a snob.

Youngstar: I truly hope you’re joking. Also you saw some of that money, if you recall you got some equity for all the advice you gave.

Graybeard: I’m joking. Money didn’t spoil you, and once you’re out of this big company we might hack together on a new one.

Youngstar: Anything else our readers need to know?

Graybeard: The meetings we had were around Python. But I think most of the things we talked about apply to other things as well.

Youngstar: I agree. Well, that’s about the time we have for the introduction. The attention span of the average reader nowadays is pretty low. Hope you’ll have as much fun reading the book as we had in those drinking meetings.

Graybeard: Cheers!

Writing Good Code

Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.

- Bill Mitchell

Youngstar: Your code is always easy to read and maintain. How do you do it?

Graybeard: Thanks! It took me a lot of time and practice to get there. And I’m still improving.

Youngstar: That’s a long journey, I don’t have so much time. Can you share some of the highlights?

Graybeard: Will do, but you need to keep improving.

Youngstar: Yeah, yeah - I’ll “sharpen my axe”.

Graybeard: Good girl! The main theme is simplicity.

Youngstar: Like in KISS?

Graybeard: Somewhat. As developers, we spend most of our time reading code, not writing it.

Youngstar: Which means it needs to be readable.

Graybeard: Exactly.

Youngstar: OK, so how do I write readable code?

Graybeard: By rewriting. I see the first iterations of code I write as sketches.

Youngstar: How do I find time to write several iterations of code?

Graybeard: I don’t think you can afford not to. As someone said: “The worst thing you can do to your code is to stop writing it the first time it works.”

Also, Fred Brooks said: “plan to throw one away; you will, anyhow.” Which means it’ll happen anyway.

Youngstar: Is this from The Mythical Man-Month? That’s an old book.

Graybeard: It’s old but about people, and people haven’t changed that much since it was written.

Youngstar: We haven’t changed a lot in the last 10,000 years. Back on track, what else will help me write good code?

Graybeard: Reading good code.

Youngstar: Where will I find that? I know where to find bad code - it’s everywhere.

Graybeard: Not everywhere. There are a few places where you can see amazing code. For example, almost everything written by Peter Norvig.

Youngstar: Yes, I’ve seen his spell checker, it’s awesome!

Graybeard: It is. There’s also some good code and advice in the AOSA book.

Youngstar: Oh, I read some chapters. The one on Berkeley DB was good. I’ll keep on reading this book.

Graybeard: Yup, and along the way you’ll find people to follow and read their code. You might even find a good mentor.

Youngstar: That I have. Though he’s getting old.

Graybeard: Like wine, I get better with age.

Youngstar: You keep telling yourself that. Anything else about writing good code?

Graybeard: Read bad code.

Youngstar: Learn from other people’s mistakes?

Graybeard: Yes, but also look out for things you do. From time to time I go and read “How to Write Unmaintainable Code” and try to see if I do anything they say there in my code.

Youngstar: OK, will pay it a visit. What else?

Graybeard: What code does not have any bugs?

Youngstar: Eh… none?

Graybeard: Exactly!

Youngstar: You lost me there grandpa.

Graybeard: The code you don’t write, or delete.

Youngstar: Oh. It’s also the fastest.

Graybeard: Exactly. In a way code is our enemy, we’d like to have less of it.

Youngstar: Can you give me an example?

Graybeard: Sure. Assume you’re asked to process some data in Excel files. This will require you to install an external library to read Excel (such as xlrd). However, if you ask them to send over the files in CSV format - there’s already a csv module in Python. No need to install and maintain third-party packages.
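
A minimal sketch of the CSV route, using only the standard library. The column names and the sample data are made up for illustration; a real script would open the file it was sent instead of an in-memory string.

```python
import csv
import io

# Hypothetical CSV export of the spreadsheet (assumed columns: name, amount).
SAMPLE = """name,amount
alice,10.5
bob,4
"""


def total_amount(csv_file):
    """Sum the 'amount' column of a CSV file-like object."""
    reader = csv.DictReader(csv_file)  # uses the first row as field names
    return sum(float(row["amount"]) for row in reader)


total = total_amount(io.StringIO(SAMPLE))
print(total)
```

For a file on disk, the csv documentation recommends opening it with newline="" before handing it to the reader.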

Youngstar: I see.

Graybeard: Also, a lot of times, after a while and due to specification changes, you have code that does nothing. Make sure to delete it. One of my most productive days was deleting a few thousand lines of unused code.

Youngstar: How did that happen?

Graybeard: Specification changes, libraries came about that did the same work …

Youngstar: I’m starting to see what you mean by “code is our enemy”. What else?

Graybeard: Keep your functions short and with a small number of parameters. A good rule of thumb is no more than forty lines of code per function.

Youngstar: Forty? Doesn’t seem much.

Graybeard: It’s not a law of nature, but it’ll make your code nicer. It’ll make you think in small pieces of code, which are easier to understand and maintain.

Youngstar: Also avoid globals?

Graybeard: Yup. I like functional code since it’s easier to reason about. However, you can’t avoid state, no matter how hard you try.

Youngstar: Sometimes TDD helps with that.

Graybeard: Yes, especially when you start out. It forces you to write small pieces of code that are easy to test. However Google for “TDD is dead” for some interesting discussion about TDD.

Youngstar: OK. Any more?

Graybeard: Did I tell you that old linguistics joke?

Youngstar: Old and linguistics? Must be a good one - do tell.

Graybeard: I’ll make it brief. During the Cold War the US created an automatic system for translating from Russian to English. When the system was ready they tested it by giving it an English sentence, translating it to Russian and back. The input was “The spirit is willing but the flesh is weak.” and the output was “The vodka is good but the meat is rotten.”

Youngstar: Ha! Not that bad.

Graybeard: The secret is in starting with low expectations.

Youngstar: OK, and how is this related to what we’re talking about?

Graybeard: The idea is that every language has a different way of saying the same things. In Python we call it “pythonic code”.

Youngstar: I heard that term before. Mostly with reference to the Zen of Python.

Graybeard: Good old Tim Peters, he is someone to learn from.

Youngstar: So learn how to speak the language?

Graybeard: Yes. A lot of people when they start write Java in Python, C in Python etc… But you need to learn how to properly speak the language.
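
As an illustration of the difference, here is the same small task written twice: once the way a C programmer might write it in Python, and once in a more pythonic style. Both functions and the sample names are invented for this sketch.

```python
names = ["Lana", "Cyril", "Pam"]  # made-up sample data


def shout_c_style(items):
    """'C in Python': manual index bookkeeping and string concatenation."""
    result = []
    i = 0
    while i < len(items):
        result.append(str(i) + ": " + items[i].upper())
        i = i + 1
    return result


def shout_pythonic(items):
    """Pythonic: enumerate, a list comprehension and an f-string."""
    return [f"{i}: {item.upper()}" for i, item in enumerate(items)]


print(shout_c_style(names))
print(shout_pythonic(names))
```

Both versions return the same list; the second one simply says it the way the language prefers.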

Youngstar: OK, will do. Any other advice?

Graybeard: The most important thing is to have a good mental model of what you do. You’ll have people talking about building an ontology, which means figuring out how to talk about things.

Youngstar: The “two hard things…”?

Graybeard: Naming is important, especially in Python, which is dynamically typed.

Youngstar: It’s also hard to get right.

Graybeard: Yeah, it usually takes me a couple of iterations until I get names right. Generic names like “object”, “other”, … are a red flag.

But back to ontology, it’s important to define what “things” are. At a place I worked we got a bug report that we count unique users wrong. The code seemed OK so my boss went to talk to people. Turned out we had four different definitions of “unique users” in the company.

Youngstar: Ouch. I see what you mean - it starts before you code.

Graybeard: Sometimes things emerge as you write the code, then you need to revise your model.

Youngstar: OK, will do. Anything else?

Graybeard: There are many rules to follow - DRY, SPOT, minimizing coupling … You’ll find them as you go.

Youngstar: Any reference?

Graybeard: There’s a good summary in “The Art of Unix Programming”, and in many other places.

One trick you can do is see if you can understand your code without the comments.

Youngstar: OK. I’ll practice and read. More beer?

Graybeard: You keep asking these rhetorical questions.

Which Python?

Gentlemen, choose your weapons.

- A Night in Casablanca

Youngstar: I’ve been thinking of using PyPy for my new project, I heard it’s super fast.

Graybeard: Before we get into that, let’s take a step back. Why use Python?

Youngstar: Seriously? Coming from you?

Graybeard: Programming languages are tools, not religion like some people tend to make them.

Youngstar: And if all you have is a hammer…

Graybeard: Exactly. You have some experience with other languages.

Youngstar: Mainly thanks to you.

Graybeard: So again, why Python?

Youngstar: I’m most productive with Python. Going from zero to working is fastest.

Graybeard: OK, so speed of development - which is important in a startup. What else?

Youngstar: There are many great packages I can use.

Graybeard: Yes, a good ecosystem. Audrey Tang said that “perl5 is just syntax; CPAN is the language”. I believe this is true for Python as well.

Youngstar: CPAN is Perl’s PyPI?

Graybeard: Yes. What other reasons do you have for choosing Python?

Youngstar: It’s open source?

Graybeard: And why is that a good thing?

Youngstar: It means nobody can take it away from me. And worst case, I can fix bugs in Python before an official release.

Graybeard: Yup. Gimme one more.

Youngstar: Oh, the community is great. People are usually nice and helpful, and there are a lot of articles and videos out there.

Graybeard: Right. Now let’s try to think of places where you won’t use Python, it’ll help clarify some things.

Youngstar: Embedded?

Graybeard: You mean small devices or real time requirements?

Youngstar: I guess both.

Graybeard: Yeah, it’s hard to fit Python on small devices. However it’s possible and MicroPython does a good job.

Youngstar: I’ve never heard about MicroPython, I’ll take a look.

Graybeard: As for real time - most garbage collected languages don’t fit the bill. Anything else Python’s not good for?

Youngstar: I guess if you need a lot of formal checking of your system.

Graybeard: Yeah. This leads me to what I call “the cost of error”, which has implications for many areas of both development and business. For example, Jane Street is a trading company that uses OCaml - they claim it helps them make sure their code is correct.

Youngstar: I guess that in trading systems you feel the pain of bugs right away.

Graybeard: Yeah, ask someone from Knight Capital sometime. On the other hand, I worked in an HFT firm once and we used Python and made money.

Youngstar: Yeah, yeah - we all heard your war stories many times.

Graybeard: Be nice to your elders. Did we miss anything else?

Youngstar: I can’t think of anything else - do tell.

Graybeard: Hiring is one.

Youngstar: You mean finding programmers?

Graybeard: Yes, try to recruit some good Haskell programmers sometime.

Youngstar: Try recruiting good programmers in any language.

Graybeard: Right. Remind me what your startup is all about.

Youngstar: It’s a backend thingie with REST API.

Graybeard: Seriously? This is almost as bad as “It doesn’t work!” bug reports. However it’ll do for now. Looks like Python is a good fit for you.

Youngstar: What a surprise…

Graybeard: Huh! Now let’s try to see which Python. What Python distributions do you know?

Youngstar: There’s CPython, Jython, IronPython, PyPy and now I know of MicroPython. Oh and there’s the subject of Python 2 and Python 3.

Graybeard: IronPython is for .NET shops, which you’re not. Jython is for Java shops or for when you need to use Java libraries - and I don’t think this is your case either.

Youngstar: And I’m running on hosted servers so MicroPython is not for me as well.
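
With this many Pythons around, it helps to be able to ask the interpreter which one is actually running your code. A minimal sketch using only the standard library; describe_python is a hypothetical helper name.

```python
import platform
import sys


def describe_python():
    """Return the implementation name and version of the running interpreter."""
    implementation = platform.python_implementation()  # e.g. 'CPython' or 'PyPy'
    version = ".".join(str(part) for part in sys.version_info[:3])
    return f"{implementation} {version}"


description = describe_python()
print(description)
```

Logging this line at startup is a cheap way to notice that a server is running a different Python than the one you tested on.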

Graybeard: When will you want to use PyPy?

Youngstar: For the speed?

Graybeard: TANSTAAFL

Youngstar: Gesundheit!

Graybeard: It’s an acronym for “there is no such thing as a free lunch”. What’s the downside of using PyPy?

Youngstar: Well, packages I guess. Not all of them support PyPy.

Graybeard: Yes. Going off mainstream has its downside.

Youngstar: Says the man who uses Arch Linux.

Graybeard: Trust me, there are days I regret it. But most days I’m very happy - it fits my preferences. Which is exactly what the Python you choose should do for you. So let me ask you - what are your speed requirements?

Youngstar: The faster the better?

Graybeard: Then why not pick assembly as your programming language?

Youngstar: I see what you mean. I need to write some business requirements and then see if Python fits them. I have a hunch it will.

Graybeard: In God we trust; all others must bring data.

Youngstar: Good one, yours?

Graybeard: Not mine - W. Edwards Deming’s.
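
Bringing data can start small. The sketch below uses the standard timeit module to compare two ways of building a string; the two functions are invented for illustration, and the numbers will differ per machine and per Python implementation, which is exactly why you measure.

```python
import timeit


def join_with_plus(n):
    """Build a string of the numbers 0..n-1 by repeated concatenation."""
    text = ""
    for i in range(n):
        text += str(i)
    return text


def join_with_join(n):
    """Build the same string with str.join."""
    return "".join(str(i) for i in range(n))


for func in (join_with_plus, join_with_join):
    # Time 200 calls of each candidate on the same input size.
    seconds = timeit.timeit(lambda: func(1000), number=200)
    print(f"{func.__name__}: {seconds:.4f}s for 200 calls")
```

Running the same script under CPython and PyPy gives you numbers for your own workload instead of a hunch.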

Youngstar: I’ll spec and measure. Now let’s talk about Python 2 vs Python 3.

Graybeard: OK. Python 3 is the future, choose it.

Youngstar: That was easy! Should I tell it to all the people who still use Python 2?

Graybeard: There are many good reasons to keep using Python 2.

Youngstar: Because you’re an old fossil who can’t change?

Graybeard: Get off my lawn!

Youngstar: Sure, can I finish my beer first?

Graybeard: Funny. You might find yourself using Python 2 eventually.

Youngstar: Because of dependencies?

Graybeard: I’d say this is the main reason. However the situation has improved significantly in the last couple of years. If you head over to Python 3 Wall of Superpowers (which used to be called “Python 3 Wall of Shame”) you’ll see mostly green now, which means most “top downloaded” packages support Python 3 now.

Youngstar: What other reasons are there? Legacy code?

Graybeard: You won’t believe how fast the new cool code you wrote a while ago becomes legacy code. Most of the time we improve existing code, not write new stuff. If you already have a decent code base, writing new code from scratch is a dangerous thing. Read “Things You Should Never Do” once.

Youngstar: How do you find the time to read all of these things?

Graybeard: I don’t have time not to. But this is something for a later conversation. Another thing you learn with experience is to appreciate things that work. Zach Holman, then at GitHub, said “Your product should be cutting edge, not your tech … stability is sexy.”

Youngstar: I wonder how we make progress then.

Graybeard: Sometimes the advantages of new technology outweigh the risk. Also, people are way too optimistic for their own good.

Youngstar: Oh, what about Anaconda? I heard people talking about it.

Graybeard: Anaconda is based on CPython, and comes bundled with scientific packages. There are other scientific Python distributions out there but it seems to be the dominant one. If you plan to use a lot of scientific packages, such as numpy, scipy, matplotlib and others, give Anaconda a try.

Youngstar: I don’t have plans for that now, and as you said earlier switching is not that painful.

Graybeard: Just make sure you have a good test suite.

Youngstar: Will do, but testing is a big subject and we’re getting to the point where my boyfriend gets jealous of you. Final recommendation?

Graybeard: Don’t be lazy, do your homework and find the right Python, or other programming language, for you. Note that switching from one Python to another shouldn’t be that difficult. At one place we had to switch from Python 3 to 2 due to a dependency issue, and it took us about half a day to do that.

Youngstar: So the decision is not that crucial?

Graybeard: It is, don’t take it lightly. We were lucky the switch was easy, you might not be.

IDEs and Editors

All mail clients suck. This one just sucks less.

- Michael R. Elkins (mutt website)

Youngstar: What are you using to write Python code?

Graybeard: Vim, I use it for everything.

Youngstar: Cool, so I’ll start using it.

Graybeard: Hold your horses. Mastering Vim is a long and sometimes painful experience. I’ve been using it for more than 15 years and I’m still learning.

Youngstar: Whoa! I don’t have 15 years, I need to get productive now.

Graybeard: Since you’re going to spend most of your time inside an editor/IDE - try to pick a good one and master it.

Youngstar: I know I’ll regret this… But which one should I use?

Graybeard: It’s not that simple, there are several factors you need to consider. In the end, it’s a matter of personal taste. Check out the editor war sometime.

Youngstar: Editor war?

Graybeard: Yeah, some people get too passionate sometimes.

Youngstar: OK. Let’s start with what you’re using. Why are you using Vim?

Graybeard: As I said - it takes time to master Vim and get used to its dual editing mode. However, once you’ve mastered Vim you’ll be super productive with it, not just in Python but with almost any other language. Vim itself is a pretty bare-bones editor, but it has a rich plugin ecosystem which can transform it into a powerful IDE. One of the main advantages (at least for backend developers like me) is that on most Unix-like systems it’s already there. Vim can work in “terminal mode” which does not require a windowing system. This means you can SSH to a box and start editing. Oh - and you can write Vim scripts in Python.

Youngstar: Isn’t Vim old?

Graybeard: In tech old usually means working - take me for example.

Youngstar: Ha! What’s the other editor old developers use? The lispy one?

Graybeard: Emacs?

Youngstar: That’s the one.

Graybeard: Emacs is a text editor that does everything. It has excellent Python support with python-mode and many core Python developers use it.

Youngstar: Then why don’t you use it?

Graybeard: Since I picked the dark side of the editor war.

Youngstar: And something more modern?

Graybeard: Before going modern, I’d like to stress that both of these editors take a lot of work to master. But once you grok them, both will offer you things that most other editors or IDEs will not.

Youngstar: Noted, I’ll invest some time learning one of them. Maybe emacs just to annoy you.

Graybeard: I never get annoyed by stupid editors people pick.

Youngstar: Something more modern?

Graybeard: I’m seeing a lot of people using PyCharm, from JetBrains, the makers of IntelliJ. There’s also PyDev which sits on top of Eclipse.

Youngstar: IntelliJ? Eclipse? Aren’t those Java IDEs?

Graybeard: They started there, but now they are very powerful general purpose IDEs. You will need Java to run them, and a lot of memory. A strong CPU won’t hurt as well.

Youngstar: And PyCharm/PyDev are the Python environments?

Graybeard: Yes. There’s also Aptana which is Eclipse already bundled with PyDev.

Youngstar: Doesn’t it take time to start them?

Graybeard: People usually have them running for weeks at a time. You can switch projects without closing the IDE.

Youngstar: OK. Any other options?

Graybeard: In the Windows world, Visual Studio comes with excellent Python support called PTVS.

Youngstar: Windows? Visual Studio? You?

Graybeard: Some claim that Visual Studio is the best IDE out there, but then again - they are using Windows ;)

Youngstar: Thanks but I don’t think I’ll switch to Windows just for that.

Graybeard: Smart girl.

Youngstar: After all the brainwashing you did?

Graybeard: I prefer “showing you the light”.

Youngstar: Yeah, yeah. Back on track - any more?

Graybeard: There are so many.

Spyder is good if you’re doing a lot of scientific Python or coming from Matlab. It’s not as polished but fits better with scientific development.

There are also Atom, Sublime, and many other good editors out there with Python support. If the above are not enough, the Python web site has wiki pages for both editors and IDEs.

Youngstar: As usual, I’m more confused than before.

Graybeard: My advice - pick one or two (and make sure Vim is one of them ;)) and try them out. Do a little project with each, see what fits your work style and then start specializing. I personally try a new one every now and then - but always get back to Vim eventually. Maybe I’m too old to learn new tricks.

Youngstar: OK. Anything I need to pay attention to while learning or using these IDEs?

Graybeard: Most of them have good integration with linters, make sure to enable it.

Youngstar: Linters?

Graybeard: Programs that check your code for common errors and coding conventions. We’ll talk more about them when discussing testing, but the editor will mark lines with errors so you can fix them right away. For example, I use flake8 integration in Vim.
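
To get a feel for what a linter does, here is a toy check written with the standard ast module. It is not how flake8 is implemented and it covers a single, simplified rule (top-level names imported but never referenced); unused_imports and the sample source are invented for this sketch.

```python
import ast


def unused_imports(source):
    """Return sorted (line, name) pairs for imports never referenced by name."""
    tree = ast.parse(source)
    imported = {}
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # 'import a.b' binds the name 'a'; 'import a as x' binds 'x'
                name = (alias.asname or alias.name).split(".")[0]
                imported[name] = node.lineno
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported[alias.asname or alias.name] = node.lineno
    # Every bare name that appears anywhere in the code
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted((line, name) for name, line in imported.items() if name not in used)


SAMPLE = "import os\nimport sys\n\nprint(sys.argv)\n"
problems = unused_imports(SAMPLE)
print(problems)
```

A real linter runs dozens of checks like this one and, when wired into the editor, marks the offending line while the code is still fresh in your head.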

Youngstar: Fixing errors closer to when you introduce them is always better.

Graybeard: Yes. I think some of them run the tests in the background whenever the code changes.

Youngstar: Cool!

Graybeard: Depends on how fast your tests are.

Youngstar: I can see that. Any other advice?

Graybeard: What? That was not enough for you? I guess another good piece of advice is to be patient.

Youngstar: Have you seen my hair color? I wasn’t born with the patience gene.

Graybeard: You kids … The point is that it takes time to master an editor or an IDE. Give it time, and you’ll see your productivity jumping. I call it the “output” part of a programmer I/O.

Youngstar: Programmer I/O? As in input/output?

Graybeard: Yes. Most of your time as a developer should be spent thinking. However reading and writing are also part of the process and a good editor or IDE can increase the output part. Another thing that writing fast does is that you can write several drafts of your code and not lock into the first one you write.

Youngstar: Good point. I guess I’ll brush up on my speed reading to get the input part faster.

Graybeard: Yes. We programmers spend a lot of time reading, both code and technical documents.

Youngstar: And in your case a lot of Sci-Fi.

Graybeard: Where do you think I get all my ideas from?

Youngstar: Thin air?

Graybeard: You’re too kind, I thought you were going to mention a certain body part.

Youngstar: What are you? Six?

Graybeard: Mentally? Not much more. But I see this conversation has taken a bad turn so I’ll stop here.

Youngstar: Right as usual, cheers!

Project Structure

organizations which design systems … are constrained to produce designs which are copies of the communication structures of these organizations.

- Conway’s Law

Youngstar: How should I structure my code? I currently have everything in one directory and it looks messy.

Graybeard: Are you facing a specific problem?

Youngstar: Not really, but I assume I should be more organized.

Graybeard: As the bad guy in a very bad movie said: “Assumptions are the mother of all !#?@ups”.

Youngstar: Which movie was that?

Graybeard: “Under Siege 2” if my one bit memory serves me right.

Youngstar: Don’t think I saw that one.

Graybeard: Trust me - you’re not missing anything. But back to your question. Why are you trying to fix something that you don’t know is broken?

Youngstar: You’re probably right. I’ll leave it for now.

Graybeard: I didn’t say it’s not broken. I just said you think it’s not broken.

Youngstar: OK, enlighten me.

Graybeard: Do you have some tests?

Youngstar: Sure!

Graybeard: How do you make sure they don’t get to production?

Youngstar: Why shouldn’t they?

Graybeard: Ask GitHub, who had a few hours of downtime a while back. The cause was tests deleting the production database.

Youngstar: Ouch!

Graybeard: Yes, and GitHub are not the only ones bitten by this problem.

Python has an established way to organize projects. It’s not mandatory but I found it’s a good practice. Let’s assume that the name of your project is archer.

Youngstar: Do you have to bring that TV show into everything?

Graybeard: Please be quiet, I’m trying to teach you something here. I’m also still hurt you didn’t take my suggestion for a project name.

Youngstar: I’m being quiet.

Graybeard draws the following diagram on a napkin:

archer
├── README.md
├── Makefile
├── run_tests.py
├── requirements.txt
├── archer
│   └── __init__.py
├── docs
└── tests
    └── test_archer.py

Graybeard: Let’s go over this. The top archer directory is your project - the one you clone from source control.

The second archer directory is your Python package where the code is. tests are outside of the code so they won’t get deployed.
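
If you want to try this layout out, the sketch below creates it as empty placeholder files. The file and directory names come from the napkin diagram; create_skeleton is a hypothetical helper, and the demo builds the tree in a temporary directory so it leaves nothing behind.

```python
import tempfile
from pathlib import Path

# Files and directories from the diagram, relative to the project root.
FILES = [
    "README.md",
    "Makefile",
    "run_tests.py",
    "requirements.txt",
    "archer/__init__.py",
    "tests/test_archer.py",
]
DIRS = ["docs"]


def create_skeleton(root):
    """Create empty placeholders for the layout and return the created paths."""
    root = Path(root)
    for name in DIRS:
        (root / name).mkdir(parents=True, exist_ok=True)
    for name in FILES:
        path = root / name
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*"))


with tempfile.TemporaryDirectory() as tmp:
    created = create_skeleton(Path(tmp) / "archer")
    print("\n".join(created))
```

Note that tests sits next to the archer package, not inside it, which is what keeps the tests out of a deployment that ships only the package.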

Youngstar: And the rest of the files?

Graybeard: Every project should have a README with at least an elevator pitch. This focuses people on what we’re doing here. It should also contain instructions for developers not found in the docs.

The docs directory is the generated documentation, I don’t usually have docs other than what’s in the code and in the README.

Youngstar: .md stands for Markdown, right?

Graybeard: Yes. You can also use reStructuredText or plain text. But Markdown has become very dominant these days. There are several variants of Markdown, pick one and stick to it.

Youngstar: Markdown it is then.

Graybeard: What else? Oh, I usually have a main Makefile to automate some tasks, requirements.txt to specify external requirements, and one script to run all the tests. We’ll discuss what’s in requirements.txt and run_tests.py when we talk about dependencies and testing.

Youngstar: OK, I’ll try to remind you - considering your one bit memory.

Graybeard: Yay, an external memory! I’ll drink to that.

As said, this is my personal preference which is based on how many Python projects are structured. You might find another one better for you but I suggest you start with it.

Youngstar: Anything else?

Graybeard: Yes, don’t overthink this or spend too much time on it. Start with something and fix it only if it becomes a problem.

Youngstar: That’s advice you give for many things.

Graybeard: Because it’s a good one, and hopefully one day you’ll make it a habit.

Youngstar: Is there a way to automatically generate documentation?

Graybeard: Yeah, write simple code that people can understand.

Youngstar: That’s a manual way.

Graybeard: OK “Miss Always Right”. I stand, actually sit, corrected.

I say that the only updated documentation is the code itself.

Youngstar: That’s good in the general case, however sometimes I need to write tricky code. For example when optimizing.

Graybeard: Optimization is a subject for another talk. But you’re right, when you do stuff that is not that obvious - write good docstrings and comments.

Youngstar: Are there tools to generate nice documentation from docstrings?

Graybeard: Of course. In the Python world we mostly use Sphinx. It has a format for documentation strings and can generate HTML, PDF and maybe other formats. A nice feature of Sphinx is that it can run doctest tests.

Youngstar: doctest is where you write snippets of code in your docstrings?

Graybeard: Exactly, and I find it cool that you have testable documentation.
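
A minimal sketch of testable documentation with the standard doctest module. The function word_count is invented for illustration; the snippets after >>> in its docstring are both the documentation and the tests.

```python
import doctest


def word_count(text):
    """Count whitespace-separated words in text.

    >>> word_count("sploosh")
    1
    >>> word_count("danger zone")
    2
    >>> word_count("")
    0
    """
    return len(text.split())


# Collect the examples from the docstring and run them.
finder = doctest.DocTestFinder()
runner = doctest.DocTestRunner(verbose=False)
for test in finder.find(word_count, "word_count", globs={"word_count": word_count}):
    runner.run(test)
results = runner.summarize()
print(results)
```

In a regular module the usual shortcut is python -m doctest on the file, or doctest.testmod() at the bottom of it; the explicit finder and runner are used here only to keep the sketch self-contained.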

Youngstar: How about the “big stuff”? Things that don’t fit inside one module?

Graybeard: You have the README for that and also Sphinx can have top level documentation. Note that if you have documentation, you’ll need to add checking it as part of the code review.

Youngstar: How did we get from project structure to writing documentation?

Graybeard: Not sure. Last thing about documentation is that several times I saw people investing a lot of time in generating very nice documentation that nobody looks at.

Youngstar: I’ll start with simple documentation. Anything else about project structure?

Graybeard: There are more files you might need. A MANIFEST.in file to help with packaging, ChangeLog to list changes, NOTICE.txt or LICENSE.txt for specifying the license, tox.ini for running tests on multiple versions of Python, and many other files. Start with the least amount of items and add new ones only when you need to.

Youngstar: Then trim and restructure periodically?

Graybeard: Exactly.

Youngstar: What about setup.py, I’ve seen it in many projects.

Graybeard: setup.py is used for packaging. Do you need packaging?

Youngstar: Currently I deploy directly from git.

Graybeard: So you probably don’t need packaging. setup.py is mostly used when creating packages for other people to use and in open source code. There are a lot of options there, and when you decide to release some of your code as open source we can talk about it.

Youngstar: I’ll live without setup.py for now. Priorities …

Graybeard: Very good.

Managing Dependencies

Only the paranoid survive.

- Andy Grove

Youngstar: You won’t believe the stupid bug I was chasing today.

Graybeard: Do tell.

Youngstar: I was updating some packages …

Graybeard: and one of the new versions had a regression bug and it took you all day to figure it out.

Youngstar: What do you know? I’m not that special after all.

Graybeard: Oh, you are unique - just like everybody else.

Youngstar: Funny! So how can I avoid bugs like this in the future?

Graybeard: Congrats, you know that the best way to solve a bug is to make sure that it’s impossible to introduce such bugs in the future.

Youngstar: Yeah, forgot who taught me that …

Graybeard: Buy me another beer and I’ll refresh your memory.

Youngstar: Sure thing. Now back to my question…

Graybeard: How do you manage your dependencies?

Youngstar: I have a requirements.txt with one package per line, and I run pip install -r requirements.txt to install them.

Graybeard: You know you can specify a specific version using ==. For example, requests==2.9.1.

Youngstar: I didn’t know that. But why would you do that - you won’t get all the bug fixes … Doh!

Graybeard: Exactly!

Youngstar: Then I should probably version all my packages.

Graybeard: I agree.
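
A small sketch of a check that could run before a deploy and list the requirements that are not pinned with ==. The helper name unpinned and the sample file contents (other than requests==2.9.1) are made up, and the parsing is deliberately naive: it does not handle URLs, environment markers or line continuations.

```python
def unpinned(requirements_text):
    """Return requirement lines that do not pin an exact version with ==."""
    loose = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options such as -r
            continue
        if "==" not in line:
            loose.append(line)
    return loose


# Hypothetical requirements.txt contents for the demo.
SAMPLE = """\
requests==2.9.1
flask
# a comment
six>=1.10
"""

loose = unpinned(SAMPLE)
print(loose)
```

An empty list means every dependency will install at the same version tomorrow as it did today.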

Youngstar: I know I’ll regret this… But any other pointers on dependency management?

Graybeard: As I said many times, one of the biggest factors in your development practices is the price of error. For example, it’s much harder to fix a bug in an embedded system than in a small site web server. The bigger the cost of error, the stricter you want to be with your requirements and the more you want stable builds.

For example, do you use virtual environments?

Youngstar: Yes, I use virtualenv.

Graybeard: Why?

Youngstar: So that packages are installed in isolation per project and not globally in the system.

Graybeard: Good, this is one more isolation level. By the way, newer versions of Python come with the venv module, which does basically the same work.

Youngstar: That’s nice, one less dependency. Any differences between virtualenv and venv?

Graybeard: Two that I’m aware of. One is that with virtualenv you can specify a different Python interpreter, for example even if your default Python is 3 you can still create a virtual environment with the Python 2 interpreter. The second is that virtualenv has a Python module to set up the virtual environment from Python. This way you don’t need to run activate before running your code; you can do it from your Python script.

Also, since venv is in the Python standard library, it’ll be updated only when a new version of Python is released. virtualenv will probably have a faster release cycle.
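
As a small illustration, the standard library venv module can create (not activate) an environment from Python code. This is only a sketch: the temporary directory is a stand-in for a real project location, and with_pip=False is chosen just to keep the example fast.

```python
import os
import tempfile
import venv

# Hypothetical location; a real project would pick a fixed directory.
env_dir = os.path.join(tempfile.mkdtemp(), "env")

# with_pip=False keeps the example fast and offline-friendly;
# pass with_pip=True to get pip installed inside the environment.
venv.create(env_dir, with_pip=False)

# Environments created by venv contain a pyvenv.cfg file at their root.
print(os.path.exists(os.path.join(env_dir, "pyvenv.cfg")))
```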

Youngstar: Good to know. The downside of using virtual environments is I need to teach my IDE which is the right Python.

Graybeard: Which IDE are you playing with right now?

Youngstar: atom.

Graybeard: That’s a cool one, almost as good as Vim.

Youngstar: Yeah, yeah. Any other pointers for managing dependencies?

Graybeard: Don’t use the system Python.

Youngstar: Why?

Graybeard: In general, it’s preferred to leave the system Python alone since a lot of system utilities are written in Python and a system upgrade might break your code. Red Hat based distros use a lot of Python.

Youngstar: I thought virtualenv makes sure you don’t use any system package.

Graybeard: And third-party packages. But what will happen to your code once the next debian ships with Python 3 as default?

Youngstar: I see. Is debian a popular distro?

Graybeard: Very, several other distros are based on debian, such as Ubuntu, Mint and others. Changes to debian will find their way to these distros eventually.

Youngstar: I use Mint, now I remember reading somewhere it’s debian based.

Graybeard: And of course if you don’t use a virtual environment and install new packages, you might break system scripts.

Youngstar: One more reason to use virtual environments.

Graybeard: Yup. Now what happens if pypi is down when you deploy?

Youngstar: I’m pretty much screwed, but how can I overcome this?

Graybeard: In some cases it might be OK to wait for pypi to come back up. It has been more stable in recent years. If you need to deploy no matter what, then you need to pre-build your dependencies and tell pip to install them from your servers.

Youngstar: pip can do that?

Graybeard: pip can do many things, this is one of them. See the --index-url and --find-links options of pip install.
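
A hedged sketch of how a deploy script might put those options together. The helper pip_install_command and the /srv/wheels directory are hypothetical; --index-url, --find-links and --no-index are real pip install options.

```python
import sys


def pip_install_command(requirements, wheel_dir=None, index_url=None):
    """Build a pip command that installs from our own server or directory."""
    cmd = [sys.executable, "-m", "pip", "install"]
    if index_url:
        # Use an internal package index instead of pypi.
        cmd += ["--index-url", index_url]
    if wheel_dir:
        # Look for pre-built packages in a local directory or URL.
        cmd += ["--find-links", wheel_dir]
        if not index_url:
            # Without an index of our own, don't contact any index at all.
            cmd.append("--no-index")
    cmd += ["-r", requirements]
    return cmd


# Hypothetical directory holding pre-built packages.
print(pip_install_command("requirements.txt", wheel_dir="/srv/wheels"))
```

The resulting list can be handed to subprocess.check_call in the deployment script.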

Youngstar: OK.

Graybeard: Now about the version of the C compiler…

Youngstar: I write Python code, not C.

Graybeard: You can write Python modules in C, and there are many good reasons for doing that - but mostly as a last resort. It’s likely that one of your dependencies is a C extension. Then you’ll need a C compiler and possibly some libraries and header files. Some libraries require a Fortran compiler.

Youngstar: Fortran?

Graybeard: Yes, sometimes a Fortran compiler can do better optimization than a C compiler.

Youngstar: How do people in the Windows world find a C compiler?

Graybeard: There’s a free C compiler for every major platform: gcc or clang on Unix-like systems, and the Microsoft compiler comes free nowadays.

Youngstar: Good to know. And what’s the solution here for the C extensions problem?

Graybeard: The idea is that you build all your dependencies in advance and then use them. The latest packaging format is called wheel. It’s basically a zip file that contains both the Python code and the compiled extension as a shared library.
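
Since a wheel is a zip file, it can be inspected with the standard zipfile module. The sketch below builds a tiny stand-in archive so it runs on its own; the package name demo and its contents are made up, and the archive is not a complete, installable wheel.

```python
import os
import tempfile
import zipfile

# Build a tiny stand-in archive so the example is self-contained.
# The names are made up and this is not a valid, installable wheel.
wheel_path = os.path.join(tempfile.mkdtemp(), "demo-1.0-py3-none-any.whl")
with zipfile.ZipFile(wheel_path, "w") as zf:
    zf.writestr("demo/__init__.py", "version = '1.0'\n")
    zf.writestr("demo-1.0.dist-info/METADATA", "Name: demo\nVersion: 1.0\n")

# Any .whl file can be opened the same way to see what is inside.
with zipfile.ZipFile(wheel_path) as zf:
    names = zf.namelist()

print(names)
```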

Youngstar: What happened to eggs?

Graybeard: wheel is the new egg.

Youngstar: I’ll get the T-Shirt.

Graybeard: Some companies have a “build machine” which has all the required dependencies to build the packages. This way you don’t need to install a lot of tools on your production machines. This build machine is usually also the one serving these third party packages. By the way, this process of keeping third party dependencies locally is sometimes known as “vendoring”.

Youngstar: How deep does this rabbit hole go?

Graybeard: Just you wait Alice. Oh! The places we’ll go… Dependency management is an old and unsolved problem. Pick any package manager: yum, apt, gem, npm … - all of them have their problems.

Youngstar: Consolation of fools… Can we get back to the Python realm?

Graybeard: Yes.

Youngstar: And …

Graybeard: Hold on, collecting my thoughts… OK. If you’re doing a lot of scientific computing - numpy, pandas, matplotlib and other packages - pip installing them can be a pain. Try installing matplotlib on OSX when you have some spare time.

Youngstar: Right… Should I wax my legs while doing it?

Graybeard: Not sure what will hurt more. Anyway … There’s an alternate package manager called conda. conda was developed by Continuum to solve the problem of installing scientific packages. Over time it became a general installer and you can install other packages with it. Note that not all of the packages on PyPI can be installed with conda.

Youngstar: What do I do then?

Graybeard: conda plays well with pip and you can use both. conda has its own notion of “environments” and it installs pip in them for just this case. conda supports Linux, Windows, OSX, ARM …

Youngstar: Do you get royalties from Continuum?

Graybeard: Nope, but since I’ve been doing a lot of scientific Python lately it has saved me tons of time and agony. Going deeper …

You can use docker. This will give you a system where you know exactly what’s going on - which version of Python, of libc … However docker comes with its own set of issues - mainly what’s called “orchestration” but I won’t get into that. The simple approach is just to run a single container as your application on the host.

Youngstar: OK.

Graybeard: Alan Kay once said “People who are really serious about software should make their own hardware.”

Youngstar: Let’s stop here, I have no intention of starting a hardware company.

Graybeard: CPUs have bugs as well, you might want to control the version of CPU you use.

Youngstar: OK. A related question - How do you choose which package to use?

Graybeard: If the package implements a known protocol or connects to an external tool (such as a database), chances are that the main site of the protocol/tool will list recommended “language bindings”. For example, the bottom part of the msgpack site has a “Languages” section with Python pointing to msgpack-python.

Youngstar: And if I don’t find a reference to Python in the main site?

Graybeard: Most packages are hosted on public sites such as GitHub. There you can see the project “health” - how many committers, commit history and last commit, number of open bugs …

Ask around, the Python community is very friendly and helpful. There are also sites that have a curated list of packages. However don’t blindly trust them, make up your own mind. I find they have a tendency to recommend the shiny new toys.

Youngstar: Err toward mature packages.

Graybeard: “Stability is sexy.”

Youngstar: We need to have a talk about how you define “sexy”, but another time.

Graybeard: Ha!

Another thing you should do is test before you use. Pick a package or two and try them out to see how they behave. Try to simulate the real environment and load as much as you can, and always make sure to write code in a way that makes switching packages as easy as possible.

Youngstar: Do I really need to do so much even before writing even one line of code?

Graybeard: This is sometimes called “accidental complexity.” But no, don’t start with having your own build machine and internal PyPI. Start simple with pip, a virtual environment and a versioned requirements file.

Youngstar: Pain vs Gain?

Graybeard: Exactly. Start with minimal effort that works for you and grow when you need.

Youngstar: Thanks for that. My head is full and my beer glass is empty - time to go home.

Graybeard: Cheers!

Storage

Two rules of database systems

  1. It takes 7 years minimum to create a production-ready database system
  2. You’re not an exception to rule 1

- Luca Candela

Youngstar: I need to store some data and was thinking of using MySQL, what do you think?

Graybeard: I think you mean MariaDB.

Youngstar: What?

Graybeard: MariaDB is the community fork of MySQL, done after Oracle bought MySQL.

Youngstar: Like OpenOffice and LibreOffice?

Graybeard: Exactly.

Youngstar: OK. Now that we clarified this issue, can we get back to my initial question?

Graybeard: I don’t think I know enough about your data to give you a good answer.

Youngstar: Currently I don’t have much data. Some user information, some session data. Things are very much in flux so it’s hard to know.

Graybeard: I’ll give you my usual advice - start simple.

Youngstar: Gee, why didn’t I think of that? What do you mean by “simple”?

Graybeard: When you start with a database such as MySQL you add complexity to your system. You need to serialize/deserialize your objects, you have schemas to design and update - and schema migration can be tricky. Using MySQL also means you need a server, users, backup …

Youngstar: OK, so what do you suggest?

Graybeard: When I need storage, I usually start with shelve. It’s very much like a dict which is backed to disk. The main limitation is that the keys have to be strings, the values can be anything that pickle can handle. I don’t have to worry about serialization, schemas and other things.

Youngstar: How do I query it?

Graybeard: By running for loops in Python.
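
A minimal sketch of what that looks like, assuming a handful of made-up user records stored in a temporary location. Keys are strings, values are ordinary picklable objects, and the “query” is a plain loop over the values.

```python
import os
import shelve
import tempfile

# Hypothetical database location and records.
db_path = os.path.join(tempfile.mkdtemp(), "users")

with shelve.open(db_path) as db:
    # Keys must be strings; values can be anything pickle can handle.
    db["alice"] = {"name": "Alice", "age": 31}
    db["bob"] = {"name": "Bob", "age": 24}
    db["carol"] = {"name": "Carol", "age": 45}

with shelve.open(db_path) as db:
    # "Querying" is a plain loop (here, a generator expression) over the values.
    over_30 = sorted(user["name"] for user in db.values() if user["age"] > 30)

print(over_30)  # ['Alice', 'Carol']
```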

Youngstar: Isn’t it slow?

Graybeard: (sighs) Speed again? What’s your speed requirement? How many objects do you have? Have you profiled your code? …

Youngstar: OK, OK …

Graybeard: As a rule of thumb, for a system that’s not that loaded and around tens of thousands of objects - shelve will work reasonably well.

Youngstar: Is it thread safe?

Graybeard: Is your application multi-threaded?

Youngstar: I haven’t decided on the web server yet, so I don’t know.

Graybeard: Well, if you find you need to be thread safe - slap a threading.Lock on it. It’s a good idea to have your own data access layer anyway, so switching storage backends shouldn’t be that hard. Writing a nice DAL also forces you to think about your storage API. Most of the time the usual CRUD is enough, maybe some search as well.

Youngstar: DAL? CRUD?

Graybeard: DAL is Data Access Layer. CRUD is Create, Read, Update, Delete.

Youngstar: Ah. What about ORMs? I heard SQLAlchemy is great.
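
A sketch of a tiny data access layer along these lines, with a threading.Lock around a shelve file. The class name UserStore and its method names are hypothetical; the point is that callers only see create, read, update and delete, so the storage behind them can be swapped later.

```python
import os
import shelve
import tempfile
import threading


class UserStore:
    """Hypothetical data access layer: callers never see shelve directly."""

    def __init__(self, path):
        self._db = shelve.open(path)
        self._lock = threading.Lock()  # guards every access to the shelf

    def create(self, user_id, user):
        with self._lock:
            if user_id in self._db:
                raise KeyError("user exists: %s" % user_id)
            self._db[user_id] = user

    def read(self, user_id):
        with self._lock:
            return self._db[user_id]

    def update(self, user_id, user):
        with self._lock:
            if user_id not in self._db:
                raise KeyError("no such user: %s" % user_id)
            self._db[user_id] = user

    def delete(self, user_id):
        with self._lock:
            del self._db[user_id]

    def close(self):
        with self._lock:
            self._db.close()


store = UserStore(os.path.join(tempfile.mkdtemp(), "users"))
store.create("alice", {"name": "Alice"})
store.update("alice", {"name": "Alice", "admin": True})
print(store.read("alice"))
store.delete("alice")
store.close()
```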

Graybeard: I have mixed feelings about ORMs. On one hand they save you a lot of boilerplate coding. However I found out that when your data usage becomes more sophisticated, you need to work around them. Also I haven’t found a good ORM for NoSQL databases yet. If you end up using an ORM, make sure it’s easy to rip it out if it becomes more of a problem than a solution.

Youngstar: NoSQL as in MongoDB?

Graybeard: Yup. There are so many of them.

Youngstar: Are they better than SQL ones?

Graybeard: It really depends on your usage. I found NoSQL databases good for early stages when your data model is still in flux and schemas are just in your way. I usually start with shelve and switch to a NoSQL database if I need support for a large amount of data or a client/server architecture.

Youngstar: Will I need client/server support?

Graybeard: My crystal ball is broken today. However the answer is probably yes. You usually run more than one server for failover or load handling, and you’ll want all of these servers looking at the same data.

Youngstar: I guess if I can make my server stateless it’ll be best.

Graybeard: Good insight. In practice this is really hard to achieve, but a good goal to strive for. I worked at a company that stored all the data required in HTTP cookies. This meant the client sent all the data we needed in every request, which saved us a lot of database queries.

Youngstar: When will you pick an SQL database?

Graybeard: There are many parameters that point to an SQL database. One thing is that many people know SQL, and if you have many hands touching the data - it’s a good thing. Also many tools, mainly reporting ones, work well with SQL.

The other thing is that some of the SQL databases, I personally prefer PostgreSQL, are wicked fast when you have many more reads than writes.

Also, SQL databases tend to be older, which means they are more stable and have more tooling and knowledge around them.

Youngstar: You prefer older? You love all this new and shiny stuff.

Graybeard: I know, but I’ve been bitten by a “new” database. At one company we worked with a two-year-old database. About 90% of our downtime was due to database issues.

Youngstar: Ouch.

Graybeard: Yes. I hear the situation has improved since then, it takes time for a database to mature and be production ready.

Youngstar: OK, I’ll learn some SQL then.

Graybeard: It’s not just SQL you need to learn but also NoSQL. There are many ways to model your data and you need to know things like normalization, fact tables, type 2 dimension tables and more. One of the more effective ways I know is to start from the UI and think about the queries you’re going to perform. After that you start modeling the data.

Thinking and designing your data layer is very important. In “The Mythical Man-Month” Fred Brooks says: “Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won’t usually need your flowcharts; they’ll be obvious.”

Youngstar: flowcharts?

Graybeard: Yeah, this book is from 1975.

Youngstar: 75? Are you kidding me?

Graybeard: It’s timeless. It talks mostly about people and communication, and people haven’t changed a lot in the last few thousand years.

Youngstar: But still … 75?

Graybeard: Read it for yourself and decide. Well worth the time in my opinion.

Youngstar: OK… Going back to present day - any more advice?

Graybeard: Couple tidbits:

You’ll probably have some complex queries in your code. I recommend saving them in external files - SQL, JSON … and not in code. I once worked at a company that used the Spring framework. They went halfway and stored the SQL queries in the Spring XML configuration files. It was really hard to read the SQL embedded in the XML, there was no syntax highlighting and viewing diffs was a mess.

The second thing is that most of Python’s SQL database drivers support accessing columns by name and not just by index. Accessing by index is both less readable and prone to error: someone changes the SQL query and suddenly row[2] is not the column you want. For example in sqlite3 you need to set the connection’s row_factory attribute to sqlite3.Row and then each column can be accessed both by position and by name.

Youngstar: OK, I’ll remember these. Now what about backup? How often do I need to backup my databases?
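
A short sqlite3 sketch of the second tidbit, using an in-memory database. The users table and its contents are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # rows now support access by column name

# Hypothetical table and data.
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'Alice', 'alice@example.com')")

row = conn.execute("SELECT id, name, email FROM users").fetchone()
print(row[2])        # by position: breaks silently if the SELECT changes
print(row["email"])  # by name: keeps working if columns are reordered
```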

Graybeard: You don’t need backup.

Youngstar: I don’t?

Graybeard: No - you need recovery. You’ll be surprised how many companies had backups of their data but couldn’t restore from it when time came.

Youngstar: So backup is part of recovery. How often should I do it?

Graybeard: Again, depending on your audit and recovery needs - this question can have very different answers. Another thing is that backups tend to grow in size and accumulate, so have a good purging policy. One more thing is that if you use a hosted database - that might take care of backup and recovery for you.

Youngstar: Hosted?

Graybeard: Yup. And considering that they take all the operations headache from you it might be a good solution. Google has BigQuery, Amazon has Redshift, there’s compose and many others.

An extra benefit for BigQuery and Redshift is that they scale. Both claim they can process billions of records in seconds.

Youngstar: Don’t they cost money?

Graybeard: TANSTAAFL. Don’t make the common mistake of underestimating the cost of running your own servers. Deployment, monitoring, alerting, backup and more - all take time and effort. And developer time is expensive. In The Art of Unix Programming Eric Raymond says the rule of Economy is: “Programmer time is expensive; conserve it in preference to machine time.” This is true in most cases, whenever you can save developer time - do it.

This is also why people like Google App Engine, zero ops.

Youngstar: I have to say now I’m totally confused.

Graybeard: Yeah, too many options is not a good thing. Remember this when we talk about monitoring. But for now - just start with shelve or something as simple. When things get more interesting - go over the queries you do and the business requirements, and then select the right solution. Who knows? You might find yourself using a graph database in the end.

Youngstar: A graph database?

Graybeard: Yes. You store not just objects, but also the relationships between them. Look up neo4j, which is a very popular graph database; they have some good usage examples on their site.

Youngstar: Any other types of databases I need to know of?

Graybeard: There are so many. I think we considered the main ones except search based ones.

Youngstar: Like Elasticsearch?

Graybeard: Yes, it’s actually my favorite.

Youngstar: My God, it’s full of databases!

Graybeard: Yes Dave. Also, it’s not uncommon to use more than just one database. For example a combination of SQL for fast queries and a search database for textual search. Some people use Redis for fast key/value and MongoDB for document storage. It really all depends, but having just one is a big plus.

Youngstar: I’ll start simple and grow when it hurts.

Graybeard: Wise words to end the night. My beer is empty and home is calling. Next time…

Testing

A computer lets you make more mistakes faster than any invention in human history, with the possible exceptions of handguns and tequila.

- Mitch Radcliffe

Youngstar: I fixed a bug today and accidentally introduced a new one.

Graybeard: Sounds like the “99 little bugs in the code” poem.

Youngstar: I can guess the rest of it.

Graybeard: Don’t you have regression tests?

Youngstar: I have a few unit tests, but that’s about it. What are regression tests?

Graybeard: Tests that guard against exactly what happened to you - they check that new changes didn’t break anything old. There are many kinds of tests and this is an important one.

Youngstar: So I should do more regression testing?

Graybeard: Let’s back off a bit. Why do you test?

Youngstar: Well, for one thing to make sure I don’t break anything.

Graybeard: Any other reason?

Youngstar: Check that the code runs as intended?

Graybeard: These are mainly unit tests. More reasons?

Youngstar: Hmm, nothing comes to mind currently. What are more reasons?

Graybeard: There are many - integration tests check that all parts of the system connect together. Fuzzing tries to bring down your system with unusual input and there are many more kinds of tests.

What do you think are the down sides of testing?

Youngstar: Downside? Let’s see … Well - they take time to write, that’s for sure.

Graybeard: Anything else?

Youngstar: Every time I change my code - I need to change the tests as well. This makes sure that I didn’t mess anything up, but also takes more time.

Graybeard: Yes. This is what the guys in Getting Real call “mass”. The more mass you have, the harder it is to make changes.

The amount and kind of testing is influenced by the cost of error. If you’re writing a life support system - you’ll use much more testing than what you need in your little project right now.

The main point here is that testing is a “pain vs gain” balance. Make sure the pain of the extra mass and time is worth the gain.

Youngstar: Speaking of tests, do you practice TDD?

Graybeard: Sometimes, mostly when working with new developers. I found out it helps them design clean code. You should fit the methodology to the team you’re working with. I personally write tests after the first or second draft of the code is working.

Youngstar: How do you know it’s working?

Graybeard: I try it out in the REPL.

Youngstar: The what?

Graybeard: REPL stands for “read eval print loop”, you might also know it as “the interactive prompt”. You write little pieces of code and test them as you go. After I’m done and happy with the code, I write some tests.

People underestimate how much the REPL helps during development, give it a try next time.

Youngstar: OK, I will. Which testing framework do you use?

Graybeard: I personally prefer nose. But I’ve used py.test and unittest with discover mode as well - all of them are good.

Youngstar: Why do you prefer nose?

Graybeard: I find nose simpler, and I always go for simple. Also love their test generators which let you run the same test with different input (AKA table driven testing). Their xunit output is great for Jenkins integration as well.

Oh, and I also use tox for testing the same code on multiple versions/implementations of Python.
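
A sketch of the table driven style using a nose-style test generator. The add function and the test data are made up; under nose the runner would iterate the generator itself, so the loop at the bottom is only there to make the example run on its own.

```python
def add(a, b):
    """Hypothetical function under test."""
    return a + b


def check_add(a, b, expected):
    assert add(a, b) == expected


def test_add():
    # Each row is (a, b, expected); every yielded tuple is one test case.
    cases = [
        (1, 2, 3),
        (0, 0, 0),
        (-1, 1, 0),
    ]
    for a, b, expected in cases:
        yield check_add, a, b, expected


# nose would do this iteration itself; done by hand here so the
# example runs without nose installed.
for func, *args in test_add():
    func(*args)
```

Adding a test case is then a matter of adding one row to the table.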

Youngstar: I’ll start with nose then, don’t need multi version testing currently. How do I run the tests?

Graybeard: nose comes with a nosetests script that discovers and executes tests. But this is usually the last thing I run.

Youngstar: Last? What do you run before it?

Graybeard: A few things. First, I check that there are no calls to pdb in the code.

Youngstar: pdb is the Python debugger?

Graybeard: Yes, you can insert calls to it if the breakpoint condition becomes too complicated. We’ll talk about debugging later. Another thing I do is clean all the compiled modules.

Youngstar: The .pyc files that are generated on import? Why?

Graybeard: Say you renamed a module but forgot to change the import in your code. Since the .pyc of the old module is still there - your test will pass.

Youngstar: Gotcha.

Graybeard: I also run a linter before the tests and fail on any output. I use flake8, which combines pyflakes and pep8.

Youngstar: Does pep8 check for coding conventions?

Graybeard: Yes, this is how I avoid wasting time on coding convention talks. If the code passes pep8 - it’s fine. However don’t get too stuck on coding conventions, see Raymond Hettinger’s talk called Beyond PEP8.

Youngstar: Will do, anything else?

Graybeard: Nope. After that I run the test suite.

Youngstar: Sounds like a lot of steps. Knowing you, you probably have a script to do this.

Graybeard: Correct, I’ll mail it over if I remember. But I’m sure you can code it yourself.

Youngstar: I’ll remind you.

Graybeard: Thanks. Having one command to run your tests also makes sure other members in your team don’t forget steps. I’m not the only one with a one bit memory.
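
One possible shape for such a script, as a sketch only. The function names are hypothetical, and run_all assumes that the flake8 and nosetests commands are installed and on the PATH.

```python
import os
import re
import subprocess

# Matches "import pdb" and "pdb.set_trace(" left over from debugging.
PDB_CALL = re.compile(r"\bimport pdb\b|\bpdb\.set_trace\(")


def python_files(root):
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".py"):
                yield os.path.join(dirpath, name)


def find_pdb_calls(root):
    """Return (path, line number) pairs for leftover pdb usage."""
    hits = []
    for path in python_files(root):
        with open(path) as fp:
            for lnum, line in enumerate(fp, 1):
                if PDB_CALL.search(line):
                    hits.append((path, lnum))
    return hits


def remove_pyc(root):
    """Delete compiled modules so stale .pyc files can't hide import errors."""
    removed = 0
    for dirpath, _, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".pyc"):
                os.remove(os.path.join(dirpath, name))
                removed += 1
    return removed


def run_all(root="."):
    """Run every step in order; return non-zero on the first failure."""
    hits = find_pdb_calls(root)
    if hits:
        print("pdb calls found:", hits)
        return 1
    remove_pyc(root)
    # Assumes flake8 and nose are installed in the current environment.
    for cmd in (["flake8", root], ["nosetests", root]):
        if subprocess.call(cmd) != 0:
            return 1
    return 0
```

Keeping this in the source tree means the same single command works on a developer machine and on the build server.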

Youngstar: In some cases I found out the tests run for a long time. Which makes it annoying to run them every time I make a change.

Graybeard: My rule of thumb is that developers won’t run tests that take more than about a minute.

Youngstar: So how do you run longer tests?

Graybeard: With my friend Jenkins.

Youngstar: It’s the system that monitors your source tree and runs tests on every change?

Graybeard: Yes. It’s called “continuous integration” or CI for short. Jenkins can do much more but at heart this is exactly what it does.

I separate the tests into faster ones that can run on a developer machine without too much setup and longer ones that run on Jenkins. Both nose and pytest have a way to mark tests and pick a subset of tests to run. In unittest I use environment variables and a special exception called SkipTest.
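
A sketch of the unittest variant, assuming an environment variable named RUN_SLOW_TESTS (a made-up name) that only the build server sets. The test class and its contents are placeholders.

```python
import os
import unittest

# Hypothetical variable name; the CI job would set it, developers would not.
RUN_SLOW = os.environ.get("RUN_SLOW_TESTS") == "1"


class TestReports(unittest.TestCase):
    def test_fast(self):
        self.assertEqual(sum([1, 2, 3]), 6)

    def test_slow(self):
        if not RUN_SLOW:
            raise unittest.SkipTest("set RUN_SLOW_TESTS=1 to run slow tests")
        self.assertEqual(sum(range(10 ** 6)), 499999500000)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestReports)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print("run:", result.testsRun, "skipped:", len(result.skipped))
```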

Youngstar: And when Jenkins runs the tests it selects all of them?

Graybeard: Yup. A common mistake that people make is to write a lot of code in the Jenkins execute field.

Youngstar: Why is it a mistake?

Graybeard: Since then it’s usually not in source control.

Youngstar: Ah! And then if you want to make changes to how tests are run - you change the script and commit.

Graybeard: Exactly. Note that Jenkins can do much more but start simple as always.

Youngstar: Another thing I recall we talked about was to make sure tests don’t get into production.

Graybeard: Yes, try to make it impossible for tests to get to or touch production.

Youngstar: Any more advice?

Graybeard: Yes - cleanup at start of the test.

Youngstar: Say what?

Graybeard: Most test frameworks give you setup and teardown methods. Most people create what they need in the setup, for example setting up database tables and populating them with data. Then they use the teardown to clean up everything. The problem is that teardown gets called even when the tests fail, and then if you want to debug - the data is missing. If on the other hand you use only the setup method, cleaning up first and then populating, you’ll still have data to debug if the tests fail.
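
A minimal unittest sketch of this idea, with a scratch directory standing in for the database. The directory name and file contents are made up; the important part is that setUp cleans first and there is no tearDown.

```python
import os
import shutil
import tempfile
import unittest

# Hypothetical scratch directory standing in for a test database.
DATA_DIR = os.path.join(tempfile.gettempdir(), "forge_test_data")


class TestOrders(unittest.TestCase):
    def setUp(self):
        # Clean up leftovers from the previous run first, then populate.
        shutil.rmtree(DATA_DIR, ignore_errors=True)
        os.makedirs(DATA_DIR)
        with open(os.path.join(DATA_DIR, "orders.txt"), "w") as fp:
            fp.write("order-1\norder-2\n")

    # No tearDown on purpose: the data stays around for debugging.

    def test_count(self):
        with open(os.path.join(DATA_DIR, "orders.txt")) as fp:
            self.assertEqual(len(fp.readlines()), 2)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestOrders)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(os.listdir(DATA_DIR))  # the data is still there after the run
```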

Youngstar: Will do.

Graybeard: The last thing to remember…

Youngstar: Yay, there’s more!

Graybeard: Testing is a craft in itself, and done right it’ll save you a lot of agony. But no matter how hard you test - bugs will get out into production and you need to be ready for that. Monitoring and alerting is something we’ll talk about next time. NASA, which has a very strict and thorough development process, still manages to ship bugs to outer space.

Youngstar: Really?

Graybeard: Yup. But they have a system in place to fix bugs in outer space as well.

Youngstar: I guess I’ll have to mock some parts of the system for testing, any advice on this?

Graybeard: In general - don’t mock! Every time you use a mock you cheat and don’t really test your system. Mocks are another “mass” you acquire, and they need to be updated to match what they are mocking. I’ve found out that with a little effort you can usually avoid mocking. I once worked at a company where we were doing web scraping: getting HTML pages, parsing them, analyzing and storing in a database (Elasticsearch by the way). At first someone suggested we mock the HTTP connection and get canned HTML. But with a bit more coding we created an HTTP server using Flask which returned canned HTML pages. This way we also tested our connection infrastructure, and when we wanted to test accessing pages with user/password - it was easy to add these kinds of pages to the test HTTP server.

However sometimes the cost of not mocking is too much - “pain vs gain” again. There’s a mock module in the Python 3 standard library (unittest.mock), and for Python 2 it’s available on pypi.
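
A sketch of the canned-pages idea. To stay self-contained it swaps Flask for the standard library http.server, which is an assumption of this example rather than what that team used; the page content and handler name are made up.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# Canned pages keyed by path; contents are made up for the example.
PAGES = {"/index.html": b"<html><body><h1>Hello</h1></body></html>"}


class CannedHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        body = PAGES.get(self.path)
        if body is None:
            self.send_error(404)
            return
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep test output quiet


# Port 0 asks the OS for any free port.
server = HTTPServer(("127.0.0.1", 0), CannedHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

# The code under test would do this fetch over a real HTTP connection.
conn = HTTPConnection("127.0.0.1", server.server_port)
conn.request("GET", "/index.html")
html = conn.getresponse().read()
conn.close()

server.shutdown()
server.server_close()
print(html)
```

Because the request travels over a real socket, the connection code is exercised too, which a mocked HTTP call would skip.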

Youngstar: Any more advice?

Graybeard: Testing is a bottomless pit. We can talk about it for hours, but I’m getting tired and I think we covered the main points. Also my beer is empty - going home now.

Youngstar: Cheers.

Configuration

Amateurs think about tactics, but professionals think about logistics.

- General Robert H. Barrow

Youngstar: I now have two environments where the code runs. We have a production environment but we also have a QA environment. I have an if env == 'PROD': in my code but I’m not too happy about it. I also remember you once said I should try to minimize if in my code. How would you handle it?

Graybeard: What makes you think you have only two environments?

Youngstar: Oh, you’re right. There’s also the local development environment on my machine.

Graybeard: Yeah, and the number of environments will grow. You might want to check a new database version, a new package version …

Youngstar: Eeeek, again accidental complexity bites us in the behind.

Graybeard: How much did you drink? You usually get depressed later on.

Youngstar: You’re right, lemme get another round and you can tell me how to solve my problems.

Graybeard: Sure, I’ll wait.

Youngstar fetches a new round, they drink in silence for a few minutes.

Graybeard: OK, did you figure how to solve your problem by now?

Youngstar: I thought of some kind of configuration system, then have a configuration file per environment. Probably use JSON since writing my own format is bad.

Graybeard: Why JSON?

Youngstar: There’s already a parser and it’s a well-known format.

Graybeard: Would you like to have some comments in your configuration?

Youngstar: Probably yes … that rules out JSON. YAML?

Graybeard: YAML is a great format for configuration. I use it a lot, but there’s something even simpler.

Youngstar: YAML is pretty simple, you just load the configuration file. The only way it could be simpler is if the configuration were already in Python … Oh - so I’ll use Python.

Graybeard: Yes. I usually use a system where I have config.py and just import it. Having said that, a YAML (or other format) based system is good as well. But start the simplest way you can.

Youngstar: But then how do I get a different configuration per system?

Graybeard: You have an overrides file where you place values per system, something like this in config.py (Graybeard writes on a napkin)

# config.py

# Defaults, suitable for local development
db_host = 'localhost'
db_port = 8000

# Optional per-system overrides
try:
    from config_local import *  # noqa
except ImportError:
    pass

Youngstar: What’s # noqa?

Graybeard: Oh, a force of habit. Most linters consider import * as an error. # noqa tells flake8 to ignore this line. We talked about flake8 a while ago when we talked about testing.

Youngstar: Yeah I remember. So in your system, if there is a config_local.py next to the file, everything written there will override what’s in config.py.

Graybeard: In the import path, not just next to config.py.

Youngstar: Yeah, and I see where you can use PYTHONPATH to get different config_local.py per environment.

Graybeard: Yes. In most cases the deployment system, say Ansible, will generate config_local.py based on the environment.

Youngstar: And I guess the default in config.py should be for local development environment?

Graybeard: That’s right.

Youngstar: This system looks good enough for my usage, anything else?

Graybeard: There are many ways to do configuration, and you should pick the one that fits your case. We talked about overrides, the usual order is default < configuration < environment variables < command line switches. You can use something like ChainMap for this.
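
A sketch of that override order with collections.ChainMap. The setting names, the APP_ prefix for environment variables and the hard-coded dictionaries are all made up; in real code the layers would come from a config file, os.environ and something like argparse.

```python
import os
from collections import ChainMap

# Hypothetical setting names and values for each layer.
defaults = {"db_host": "localhost", "db_port": "8000"}
config_file = {"db_host": "db.internal"}
environment = {
    key[len("APP_"):].lower(): value
    for key, value in os.environ.items()
    if key.startswith("APP_")
}
command_line = {"db_port": "9000"}  # e.g. parsed with argparse

# ChainMap searches left to right, so the highest priority goes first.
config = ChainMap(command_line, environment, config_file, defaults)

print(config["db_port"])  # 9000, from the command line layer
```

Lookups fall through to the next layer only when a key is missing, so each layer needs to list just the values it overrides.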

Youngstar: OK. I guess adding command line support helps in quickly testing other systems.

Graybeard: Yes, sometimes the script that starts your program (say docker) gives all the right switches. Then you can go without a configuration system at all in your code.

Youngstar: It’s not true, you just moved the configuration system to the deployment/running system.

Graybeard: I said “in your code”. Glad you caught that, many people when they talk about “zero configuration” mean “in the code”. There’s a nice thing about not having configuration in your code, but I found out that the code is usually tested better than the configuration system. I prefer to have the complexity where there are more tests.

Youngstar: What about storing configuration in a server?

Graybeard: People do that as well, they use systems like ZooKeeper, Consul and others for this.

Youngstar: Then you need just to know where the configuration server is.

Graybeard: Yeah, but then someone needs to populate the configuration values on the server.

Youngstar: Agree. Anything else about configuration?

Graybeard: There’s so much more. Some people believe you should use just environment variables.

Youngstar: Why?

Graybeard: Read the 12 factor app and see.

Youngstar: Yay, more reading.

Graybeard: As we said, the IT automation system (Ansible, docker …) can generate fixed values. The database host will have the same name (say elastic) and the IP will change from system to system.

Youngstar: I am just using fabric, should I switch to Ansible?

Graybeard: Depends on the complexity of your deployment. fabric is very simple, so I usually start there and switch to something more complex only when I need to. If you use a docker-based system, tools like docker-compose and kubernetes have their own systems for hooking containers together.

Youngstar: And then my code uses less configuration.

Graybeard: Exactly. But beware of jumping into docker - it’s cool but comes with its own set of problems.

Youngstar: Which are?

Graybeard: Let’s talk about it later when we discuss deployment.

Youngstar: OK. I guess as usual I’ll start simple and grow in complexity when I need to.

Graybeard: So young and so wise.

Youngstar: That’s right. Anything else I should know regarding configuration?

Graybeard: If you look at the code I wrote, only db_host and db_port are defined. But in some cases you’ll need a URI, something like pgsql://<db_host>:<db_port>. Instead of having everyone constructing this URI themselves you can add a line db_uri = 'pgsql://%s:%s' % (db_host, db_port) after the import from config_local.

Youngstar: What if I want a totally different URI? Say add my own user and password?

Graybeard: There’s no end to where you can go with this. I usually find out these edge cases are not worth the trouble of supporting them. Sometimes I have a utility function db_uri which will generate the URI and it can be as complex as you want. But there will always be an edge case where your configuration system falls short. As long as it supports the majority of cases - you’re fine.
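
A minimal sketch of such a utility function could look like this (codes on a napkin). The db_host and db_port values here are stand-ins for what you’d import from config_local, and the optional user and password arguments are just my assumption of what an edge case might need:

```python
from urllib.parse import quote

# Stand-ins for the values normally imported from config_local.
db_host = 'localhost'
db_port = 5432


def db_uri(host=None, port=None, user=None, password=None):
    """Build a database URI, falling back to the configured host and port."""
    host = host or db_host
    port = port or db_port
    auth = ''
    if user:
        # Quote credentials so characters like '@' or ':' don't break the URI.
        auth = quote(user, safe='')
        if password:
            auth += ':' + quote(password, safe='')
        auth += '@'
    return 'pgsql://%s%s:%s' % (auth, host, port)
```

Most callers would just call db_uri() and the odd one out passes its own user and password.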

Youngstar: As usual, simple things go very deep with you.

Graybeard: A good configuration system will reduce the complexity in your code. This complexity doesn’t go away, but it’s contained somewhere else, which is a good thing.

Youngstar: What about passwords and other “secret” stuff? Where do I store it?

Graybeard: Make sure they don’t end up in the configuration files or get checked in by mistake. We’ll have a talk on security later (and had one on configuration management already).

Youngstar: OK then.

Debugging

If debugging is the process of removing bugs, then programming must be the process of putting them in.

- Edsger Dijkstra

Youngstar: I have a bug at work that I just can’t figure out. How do you debug?

Graybeard: I mostly don’t.

Youngstar: Come on, you’re not that good.

Graybeard: Oh, I have not mastered the art of writing bug free code… yet. What I’m saying is that I don’t debug in the traditional sense of using a debugger.

Youngstar: Ah, so how do you solve code problems?

Graybeard: Ever heard about Rob Pike?

Youngstar: The name rings a bell, not sure from where.

Graybeard: Look him up, he did a lot. Anyway he once said:

“If you dive into the bug, you tend to fix the local issue in the code, but if you think about the bug first, how the bug came to be, you often find and correct a higher-level problem in the code that will improve the design and prevent further bugs.”

I think it was his experience when working with Ken Thompson.

Youngstar: Ken Thompson of Unix?

Graybeard: Among other things.

Youngstar: That’s all very nice, but to get to understanding I need to debug some time.

Graybeard: Right. However I’m a backend guy and most of the time debugging is impossible. I use mostly logging to understand what’s going on. If I do debug, it’s usually with the command line debugger that comes with Python - pdb.

Youngstar: Why not a visual one?

Graybeard: Since most of the time I’m in an SSH session to a server, or in a docker container - which makes UI hard or impossible. Also once you get to know pdb it’s very effective.

Youngstar: Just like mastering Vim? OK, I’ll spend some time with it.

Graybeard: However, if you use a good IDE it’ll have a visual debugger and sometimes these are nice. As we discussed before, knowing your IDE well will save you tons of time.

Youngstar: OK. What else?

Graybeard: Why do you assume there’s more?

Youngstar: Since with you there’s always more.

Graybeard: Fair point. One of the tricks I use is to sometimes place a “hard” breakpoint in the code. I do this when the condition for the breakpoint becomes pretty complex.

Youngstar: I thought pdb supports conditional breakpoints.

Graybeard: You’re right. I can do that in pdb or other debuggers but in some cases it’s much easier to specify the condition in Python code. What you do is something like this (codes on napkin):

if some_complex_condition():
    import pdb; pdb.set_trace()

Youngstar: I thought there were no semi-colons in Python.

Graybeard: There are, but they’re rarely used. In this case where it’s just debugging it’s convenient to have it in one line. I have a Vim abbreviation for this line.

Youngstar: I bet you do.

Graybeard: Then you run your code normally, not via pdb. And once the condition is met - you’ll get the pdb prompt. If you have IPython installed you can use its debugger instead of pdb, it’s a bit nicer. You do it like this (codes again on a napkin):

if some_complex_condition():
    from IPython.core.debugger import Pdb; Pdb().set_trace()

Youngstar: And you make sure this is not left in the code with your test script.

Graybeard: Exactly.

But as I said earlier, I mostly use logs. It’s an art to get the right balance between huge logs and too little information. Try to err on the TMI side.

Youngstar: TMI as in “Too Much Information”?

Graybeard: Yes. Storage is very cheap compared to programmer time.

Youngstar: But what if the logs get too big?

Graybeard: You usually save only a window of time backwards. There are great tools for log rotation, both in the standard library and Unix utilities.
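
On the standard library side, a rough sketch might look like this (codes on a napkin). The function name make_rotating_logger, the seven day window and the log format are my own choices for the example; TimedRotatingFileHandler is the real handler from logging.handlers:

```python
import logging
from logging.handlers import TimedRotatingFileHandler


def make_rotating_logger(name, path, keep_days=7):
    """Return a logger that starts a new file at midnight and keeps
    only the last `keep_days` rotated files."""
    handler = TimedRotatingFileHandler(
        path, when='midnight', backupCount=keep_days)
    handler.setFormatter(logging.Formatter(
        '%(asctime)s %(levelname)s %(name)s: %(message)s'))
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)  # err on the TMI side
    logger.addHandler(handler)
    return logger
```

If you’d rather rotate by size than by time, RotatingFileHandler from the same module takes maxBytes instead of when.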

Youngstar: Like logrotate?

Graybeard: Exactly. You can also ship logs to log aggregation services, we’ll talk about logging and monitoring later.

Oh, and Python’s logging module can listen on a socket and change the logging configuration at run time. This way you can temporarily lower the log level in one of your modules for a while, collect enough data and then return it to the normal level.
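
A rough sketch of starting that listener (codes on yet another napkin). start_log_config_listener is a name I made up; logging.config.listen and logging.config.stopListening are the standard library calls. Anyone who can reach the port can reconfigure your logging, so keep it closed to the outside world:

```python
import logging.config


def start_log_config_listener(port=logging.config.DEFAULT_LOGGING_CONFIG_PORT):
    """Start a background thread that accepts new logging configurations
    over a socket and return that thread.

    To shut it down call logging.config.stopListening() and then join
    the returned thread.
    """
    thread = logging.config.listen(port)
    thread.daemon = True  # don't keep the process alive just for this
    thread.start()
    return thread
```

The client side sends the new configuration as bytes, preceded by its length packed as a four-byte big-endian integer (struct.pack('>L', n)).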

Youngstar: Cool, I’ll look it up. Anything else about debugging?

Graybeard: Today’s systems usually have more than one part. Debugging such a system is even more complicated. One thing I found that helps is to pass around a context object between sub systems. This way you can search the logs and get a logical view of an operation between several sub systems.

Youngstar: What’s in the context object?

Graybeard: Anything you think is useful. The bare minimum is just an identifier for the current operation/session.
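
Here’s a bare-bones sketch of the idea (codes on a napkin), assuming a plain dict is enough for the context. All the names (new_context, ContextLogger, handle_request, store) are made up for the example; only logging.LoggerAdapter comes from the standard library:

```python
import logging
import uuid


def new_context(**extra):
    """Create a context dict; the only required field is the operation id."""
    ctx = {'op_id': uuid.uuid4().hex}
    ctx.update(extra)
    return ctx


class ContextLogger(logging.LoggerAdapter):
    """Prefix every message with the operation id from the context."""

    def process(self, msg, kwargs):
        return '[op=%s] %s' % (self.extra['op_id'], msg), kwargs


def handle_request(ctx, payload):
    # Hypothetical first sub system.
    log = ContextLogger(logging.getLogger('frontend'), ctx)
    log.info('got request %r', payload)
    store(ctx, payload)  # pass the same context to the next sub system


def store(ctx, payload):
    # Hypothetical second sub system.
    log = ContextLogger(logging.getLogger('storage'), ctx)
    log.info('storing %r', payload)
```

Searching the logs for a single op value then shows the whole operation, no matter which sub system wrote the line.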

Another thing people sometimes do is connect to a running service and inspect what’s going on with the Python REPL. There are several such systems, see Twisted manhole for example.

Youngstar: OK. Armed with this knowledge I’m heading back to the office.

Graybeard: Remind me to talk with you about work/life balance sometime.

Youngstar: OK.

Graybeard: But before you head back, another thing that really helps is giving it time. Letting what Daniel Kahneman calls “system 2” work on the problem.

Youngstar: System 2?

Graybeard: Yeah, not a very imaginative name. Think of it as the part of your brain that works below the surface. It’s the one that does most of the leaps in understanding but it needs time. Instead of heading back to the office, go home and watch a video called “Hammock Driven Development” by Rich Hickey.

Youngstar: Oh, we definitely need to talk about work/life balance and how you have time to learn all this stuff.

Now that you mention it, I see my empty beer glass. I guess I’m over my “Ballmer Peak”, so I’ll go home and watch that video.

Graybeard: Kudos on knowing your XKCD.

Youngstar: Thanks and g’night.

Deployment

May the queries flow, and the pagers remain silent.

- SRE Benediction

Youngstar: I’d like to place my code out there in alpha state so people can play with it.

Graybeard: Getting feedback early is a very good thing. Where are you going to put the code?

Youngstar: That’s what I was going to ask you. There are so many options - AWS, GAE, Heroku, Azure, my own servers … Which one do you use?

Graybeard: I use the one that fits my needs.

Youngstar: That was helpful.

Graybeard: The point is that there’s no “one size fits all”. It depends on many factors. And I use different hosting solutions in different situations.

Youngstar: One of these factors is if I can place my data outside?

Graybeard: Yes. A lot of companies think their data is safer if they keep it in house. However I tend to trust the Google/Amazon security experts much more than the local IT.

Youngstar: I don’t know much about security.

Graybeard: We’ll fix that later. However today it’s more common for companies to host data outside. And even companies that say “we host data ourselves” usually mean “on our hosted servers”. Sometimes you can’t host data outside due to legal reasons or some compliance policies.

Youngstar: IANAL, but I think I’m OK with hosting data outside.

Graybeard: What most companies underestimate is the cost of having your own servers. Scaling up becomes much more painful. And you need people doing rotation who can drive at 3AM to some Colo, have the right keys and know how to reboot the servers.

Youngstar: Colo?

Graybeard: Short for “co-location centre”. It’s usually a secure place for your servers with good network, security and other goodies.

Youngstar: So not from the office network?

Graybeard: Sadly I’ve seen that too.

Youngstar: OK, I’ll start with the cloud then. Which one?

Graybeard: There are many options and many variables you need to consider. As usual - some research required.

Youngstar: Such as pricing?

Graybeard: Pricing is one aspect. However most companies don’t fathom how time consuming operations can be.

Youngstar: And by time you mean money.

Graybeard: Exactly. I’d do my best to limit my operational involvement.

Youngstar: OK, less ops is better. What else?

Graybeard: Try to avoid vendor lock-in.

Youngstar: By using open standards?

Graybeard: Yes, and also creating abstractions in your code.

Youngstar: “All problems in computer science can be solved by another level of indirection”.

Graybeard: Did you catch my quote addiction? Was this David Wheeler?

Youngstar: Yup. Just stumbled on this the other day.

Graybeard: Another thing you need to take into consideration when choosing who to use is size and reputation.

Youngstar: Very much like selecting technologies to use.

Graybeard: As the old joke says: “Nobody ever got fired for buying IBM”. Sometimes it’s OK to bet on younger products, but infrastructure is something you need working.

Youngstar: “Stability is sexy”.

Graybeard: Oh, you actually listen to what I say. I’m flattered.

Youngstar: Yeah, yeah. Go on.

Graybeard: Once you’ve decided on hosting that fits your budget and seems decent enough, you need to fit deployment to your process. The ideal today is called continuous delivery - once tests pass on Jenkins, the code goes to production.

Youngstar: I heard that deployment is painful.

Graybeard: It doesn’t have to be. There’s a piece by the late Aaron Swartz called “Lean into the Pain”. He says that just like sport, we need to do the stuff that hurts us a lot in order to get better at it.

Youngstar: And when we deploy a lot it won’t be an issue.

Graybeard: Yup. Note that there are deploys and there are deploys. Most of them will be a non issue, but some of them will give you a headache.

Youngstar: Can you give me an example?

Graybeard: Changing a database schema in a non backward-compatible way.

Youngstar: Which means you need to re-process all the data?

Graybeard: Yes. And also you’ll have some processes still working with the old format and some working with the new format.

Youngstar: Ouch!

Graybeard: There’s a reason NoSQL is popular.

Youngstar: You can make breaking changes in NoSQL too.

Graybeard: That you can, but it’s sometimes easier. You pay in other areas, pick your poison.

Youngstar: OK. I’ll think about the data and try to automate the deployment as much as possible.

Graybeard: Good plan. Another thing which is hard in some platforms is zero downtime.

Youngstar: I read about it. So many options - Blue-Green, canary releases, rolling deployments …

Graybeard: As usual, go simple and scale when you need. Some platforms like GAE do it for you.

Youngstar: Cool. They scale as well?

Graybeard: Yes. So do AWS and others. You need to take care to limit scaling, otherwise a spike in load can make you bankrupt.

Youngstar: Ouch!

Graybeard: It also hurts when users can’t access your site due to load.

Youngstar: I’ll pick my poison.

Graybeard: You’re learning. It’s all about trade-offs.

Youngstar: What else?

Graybeard: You need to make sure you don’t have snowflake servers.

Youngstar: I thought servers like cold temperatures.

Graybeard: What Martin Fowler means is a unique server that you can’t rebuild if it’s gone.

Youngstar: So automate again. Which tool? Ansible, SaltStack, Chef, Terraform?

Graybeard: Do your homework and ask around. I usually start simple with Fabric and move to the heavyweights when I need them.

Youngstar: OK. I will.

Graybeard: Automation also helps with avoiding errors. Some people swear by checklists, but manage to forget a step.

Youngstar: I get it, you sent me the “automate all the things” meme enough times already.

Graybeard: OK, moving on then … It’s important that production won’t be your only environment. You need one or more for QA.

Youngstar: But probably not that fancy.

Graybeard: Yup. So make sure to parameterize everything - cluster size, machine type …

Youngstar: What about Docker?

Graybeard: Docker helps in some aspects - it takes you out of dependency hell. However it comes with another level of orchestration.

Youngstar: TANSTAAFL?

Graybeard: Exactly. Docker also lets you create a copy of the production environment on your local machine, which is handy.

Youngstar: Anything else?

Graybeard: A nice thing is to mark deployment times on your monitoring graphs. This way if you see a spike in errors it’s easy to see if it’s related to a specific release.

Youngstar: Just a vertical line?

Graybeard: Any way you want, as long as it’s visible.

Youngstar: OK.

Graybeard: Also make sure you can do a rollback as well. If a release goes bad you need to be able to quickly get back. Blue-Green and rolling releases help with this.

Youngstar: Don’t forget the cute canaries.

Graybeard: That’s right. They were helpful at the coal mines and they are helpful now. Every release is a risk.

Youngstar: And we don’t like risk.

Graybeard: Yeah. In “Keys to SRE” Ben Treynor talks about “error budget”. If a deployment went bad and there’s down time - it takes out of your error budget and you release less.

Youngstar: Sounds reasonable. It seems there’s so much infrastructure to build and process to develop.

Graybeard: Yeah. And backups which work, and security and …

Youngstar: OK. I get it - ops is a lot of time and money. Final advice before my head explodes?

Graybeard: Get more beer?

Youngstar: I mean deployment wise.

Graybeard: I usually start with GAE which is zero ops and once things start to heat up - I look into other platforms. Or stay in GAE if it gives me all that I need.

Youngstar: OK. I’ll take a good look at my architecture and see if it can fit in one of the no-ops hosting options. And now that beer please.

Graybeard: Sure thing.

Monitoring & Alerting

On a long enough timeline, the survival rate for everyone drops to zero.

- “Fight Club” movie

Youngstar: Our logging system paid off this week.

Graybeard: Do tell.

Youngstar: A customer called to say they are missing some data. A quick search in the log files found that one sub system was down for a couple of days, we brought it back up and the missing data was in front of the customer’s eyes in about an hour.

Graybeard: Fixing a system in an hour is indeed good. However I think you can do better.

Youngstar: Better than that? How?

Graybeard: You need to know about problems before your customers do.

Youngstar: Well, we have great logging. But we look at the logs after we found out there’s a problem. We do monitor our machines for load, disk space and other things. However this was an application crash and didn’t cause a system problem, it actually reduced the load.

Graybeard: Two things: One is that monitoring without alerting is not that helpful - nobody is watching the graphs 24/7. Second is that there are better things to monitor than disk space.

Youngstar: Let’s take these one at a time. You’re saying I need some automated system that will alert me when a metric goes funky?

Graybeard: Yes. You usually start with a fixed threshold, but as your system grows complex you need more advanced methods. Remember that if you have too many alerts - people will ignore them. It’s the classic “the boy who cried wolf” story. There are some cool new systems now that apply “anomaly detection” algorithms to metrics. There are even companies that provide a service where you send them your metrics and they alert when they find an anomaly.

Youngstar: I’ll start simple with manual thresholds and move to more sophisticated stuff later.

Graybeard: Yup. “start simple” always wins. Other questions you need to ask yourself about alerting are “who?” and “how?”.

Youngstar: We’re a small team, I guess everyone should pitch in.

Graybeard: Yeah. One company I worked with had a good rotation system. There were weekly shifts, rotating at Monday noon. Each shift had a primary and secondary role.

Youngstar: I don’t believe that everyone can solve every problem.

Graybeard: Yeah, but it’s the Pareto principle - most errors are easy to solve. The big bonus is that everyone feels the pain of a failing system, starts writing more robust code and pays more attention in code reviews.

I saw a great talk called “Keys to SRE” by the guy who started the SRE team in Google.

Youngstar: SRE?

Graybeard: Site Reliability Engineering. It’s the group that makes sure things keep running at Google.

Youngstar: OK.

Graybeard: Where was I? … Oh yeah, in the video he mentions that a couple of sleepless nights do wonders to the stability of code people write.

Youngstar: I can see that. And I think that will be a good fit for my small team. I’ll give it a try - getting woken up at 3am gets old real fast. How do you actually alert?

Graybeard: Usually with an alert to a cellphone, PagerDuty seems to be very popular. It’s also good to alert the ops chat room.

Youngstar: OK. And if I recall you recommend doing a postmortem on every issue.

Graybeard: Yeah, start with 5 whys and develop your own system. Along the way update your “red book” for what to do when shit happens.

Youngstar: I thought shit happens all the time.

Graybeard: That’s right. Now let’s talk about what to monitor.

Youngstar: I guess the usual - disk space, load, memory …

Graybeard: Right and wrong.

Youngstar: Gee, that’s helpful.

Graybeard: Let me ask you - how does a disk that’s 80% full affect your revenue?

Youngstar: Hmm. Well, it’s an indication that I’m going to have a problem and this might drive out users. Hard to place a number on this.

Graybeard: Right. Also let’s say everything looks OK system wise but your users can’t see data from the last 2 days.

Youngstar: I guess I need to check that as well.

Graybeard: Most people start “bottom up” from system metrics to system health. But the more important one is “system health”, you need to monitor your KPIs.

Youngstar: The what?

Graybeard: KPI - Key Performance Indicator. You need to be up to date with your TLAs.

Youngstar: Three Letter Acronym?

Graybeard: Yup. Take Netflix for example, they have one major KPI they monitor called SPS - starts per second. It follows a wave pattern, and if there’s some deviation from this pattern - they take a look.

Youngstar: I see. But then you need to hook your own monitoring to your programs. It’s also harder to find a problem in a wave-like pattern, which I guess differs from country to country and changes over the holidays.

Graybeard: Yes, it’s harder but better. Most of the time people measure what’s easy and not what’s important. Take highway police for example.

Youngstar: What about them?

Graybeard: They do a lot of speed traps, not because speed is the major cause of accidents, but because it’s easy to measure. Unlike reckless driving, which is far more dangerous but harder to catch.

Youngstar: I see. And how do I find these all important KPIs?

Graybeard: That’s a business question, I’m a tech guy. You’re the one owning a company - go and figure it out. As usual start simple and optimize along the way.

Youngstar: What about the other monitoring - disk, CPU, memory …

Graybeard: Keep them, but try to figure out how they affect your business.

Youngstar: Anything else?

Graybeard: Yes - automate as much as you can.

Youngstar: For example?

Graybeard: If the disk is getting full, and you know a place where you can clean up - do it. Even better run what I call a janitor process periodically to clean things up.
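
A janitor can be as small as this sketch (codes on a napkin). The function name clean_old_files and the seven day cutoff are just assumptions for the example, pick what fits your data. You’d run it from cron or whatever scheduler you already have:

```python
import os
import time


def clean_old_files(directory, max_age_days=7, now=None):
    """Delete regular files in `directory` that were not modified in the
    last `max_age_days` days. Returns the list of removed paths."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 24 * 60 * 60
    # Read the listing first so we don't delete while iterating.
    with os.scandir(directory) as it:
        entries = list(it)
    removed = []
    for entry in entries:
        # Skip directories and symlinks, only touch plain files.
        if entry.is_file(follow_symlinks=False) and entry.stat().st_mtime < cutoff:
            os.remove(entry.path)
            removed.append(entry.path)
    return removed
```

Logging the returned list is a cheap way to know what the janitor threw away.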

Youngstar: Sounds good. What system do you recommend for this?

Graybeard: There are many, many systems out there. See what you need and what they offer and try to find a good match. As usual go with boring reliable technology. Lately I’ve been using the ELK stack, but that’s just a personal preference. I already had Elasticsearch in place, so not using yet another system looked like a win to me. But really - have a look around, there are many and it might be that one of them is a better fit to your needs than ELK.

Youngstar: Great, more homework. Anything else?

Graybeard: It’s a good idea to do “ops drills” where you simulate problems and people solve them.

Youngstar: I guess we’ll have plenty of the real thing to practice on.

Graybeard: It’s better to deal with your first outage not at 3am with a customer shouting over the phone. Also other team members can look and learn.

Youngstar: Isn’t that what Netflix’s Chaos Monkey does?

Graybeard: Sort of, but wait until you get there. By the way they have more tools that destroy things. It’s called the Simian Army now.

Youngstar: Oh my… I need another drink to reflect on that. Want some?

Graybeard: OK, I get the hint. I’ll shut up about monitoring and alerting now :)

Security

First rule of computer security: don’t buy a computer. Second rule: if you buy one, don’t turn it on.

- Dark Avenger

Youngstar: I was going over our HTTP logs and found some weird stuff there.

Graybeard: “Little Bobby Tables”?

Youngstar: There was some SQL injection, some attempts to run scripts and other fishy requests. How do I protect myself against such things?

Graybeard: One thing you need to keep in mind is that if someone is really targeting you - you will get hacked. Hackers managed to get into NASA, banks and many other secure places.

Youngstar: So I should just give up?

Graybeard: Why do you lock your door when you leave the house?

Youngstar: So bad people won’t be able to get in?

Graybeard: And you think that people who rob banks can’t get in your house?

Youngstar: They’ll be able to. But I do it to deter most casual thieves. Oh, I see where you’re going with this. I shouldn’t make myself an easy target.

Graybeard: Exactly. I’ll give you some simple rules to follow. Keep in mind I’m not a security expert.

Youngstar: If I had a penny on every thing you’re not an expert in…

Graybeard: You’ll probably have problems carrying all this weight.

Youngstar: Ha. OK, rules?

Graybeard: Let’s start with the social aspect. All the security in the world won’t help if you have weak passwords, if your computer doesn’t ask for login when you turn it on, if people write passwords on a sticky note, or blindly click on any link sent to them.

Youngstar: You mean phishing?

Graybeard: Yup. And other social hacks. The key is to be aware, keep learning and educating people.

Youngstar: Good paranoid culture, sounds like fun.

Graybeard: Nah, just be careful - that’s all. You don’t think locking your door makes you paranoid.

Youngstar: You’re right. But you told me that only the paranoids survive.

Graybeard: That was Andy Grove, not me.

Youngstar: OK. Apart from culture?

Graybeard: One more thing about culture is that you need to make security part of the process. Do security reviews of your code - both as part of code reviews and as dedicated security audits. Appoint someone in your company to be in charge of security.

Youngstar: Anything special I should look for in those reviews?

Graybeard: Try to think like the bad guy. “How can I break this piece of code?”. Read “The Security Mindset” by Bruce Schneier to get some ideas.

Youngstar: OK. What else?

Graybeard: We usually think of security in layers. There’s network layer, server layer, deception layer, encryption layer and more. Each has its own set of tools and practices. Think about the layers that are more valuable and effective and invest your time there.

Youngstar: Deception?

Graybeard: Yeah, something called honeypots.

Youngstar: Now I can’t get the image of Winnie the Pooh out of my head.

Graybeard: Funny, now I can’t either. In any case, security is a cat & mouse game and you need to be updated all the time. One good practice is to keep things patched. Depending on your hosting choice, the provider usually does a good job patching. But you should keep track and make sure you’re up to date.

Youngstar: OK. I’ll patch away.

Graybeard: Note that some patches require reboots. You need to be ready for this and plan how to keep things up while rebooting.

Youngstar: I remember our talk on “hot deploys”. Any security tools I should familiarize myself with?

Graybeard: There are many. A good starting point is what comes with Kali Linux.

Youngstar: Isn’t Kali some Hindu goddess?

Graybeard: Envy of the competition?

Youngstar: Never envy, always cautious.

Graybeard: If you have time and money, you can hire a pentesting team.

Youngstar: pentesting?

Graybeard: Penetration testing. These companies will try to break into your site and will give you a report.

Youngstar: Like in Sneakers?

Graybeard: Yup.

Youngstar: I’ll go and watch it again. I love Robert Redford.

Graybeard: Should I tell your boyfriend he should be worried?

Youngstar: … Sure, I like to keep him on his toes.

Graybeard: The poor guy. I hope he appreciates his luck.

Youngstar: Let’s get back to security please?

Graybeard: OK. Do what you did - monitor your logs. Add some automation to alert you when something fishy happens. There are several tools for that, the technical term you’re looking for is SIEM.

Youngstar: OK. You mentioned hosting companies doing patches. Do they do more?

Graybeard: Yeah they do, sometimes for free since it’s their reputation as well, sometimes at cost. And there are companies who give security as a service, WAF for example.

Youngstar: I’ll Google what WAF is. How much should I spend on security?

Graybeard: You need to think about how much each security breach will cost you, not just money but also reputation. Then prioritize and protect.

Youngstar: Oh, I like that slogan.

Graybeard: Now about secrets…

Youngstar: Secrets? I don’t have any.

Graybeard: Sure you do. Your email password, keys to your hosting provider and more.

Youngstar: Oh these, what about them?

Graybeard: How do you keep them safe?

Youngstar: I keep them in a file encrypted with gpg. The master password is in my head.

Graybeard: And if you have software that needs some of these keys?

Youngstar: I set it in the environment when deploying.

Graybeard: And how does the deploy script know?

Youngstar: It asks me.

Graybeard: So it’s not fully automated then.

Youngstar: Yup. By the way, is gpg good enough?

Graybeard: It’s better than rot13, which I saw people use.

Youngstar: rot13?

Graybeard: It’s a substitution cypher where each letter is replaced with the letter 13 places after it, in a cyclic manner.

Youngstar: And since there are 26 letters in the English alphabet, if you rot13 and rot13 you’ll get the original.
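
Something like this, using the rot13 codec that ships with Python’s codecs module (scribbles on a napkin). The sample text is just something I made up:

```python
import codecs

secret = 'Hello, Forge!'  # sample text, not a real secret
once = codecs.encode(secret, 'rot13')   # shift every letter by 13
twice = codecs.encode(once, 'rot13')    # shift by another 13

print(once)
print(twice == secret)  # True, 13 + 13 = 26 brings every letter back
```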

Graybeard: Yes. Not that secure but I’ve seen people use it. You can implement it with a single tr command8.

Youngstar: You and your aliases. Let’s get back to how can I fully automate my secrets.

Graybeard: Some of the automation systems like Ansible have modules that automate this process. There are special databases for managing secrets and some companies roll their own.

Youngstar: NIH syndrome?

Graybeard: Probably. Sadly it’s a very common syndrome.

Youngstar: Any other things I should know?

Youngstar: Right. Now I’m heading back to my place, and will make sure the door is locked.

Graybeard: Sadly they didn’t invent virtual guard dogs like the beast you have at home.

Youngstar: What do you mean beast? He’s a cutie!

Graybeard: He is cute, but also very big and scary sometimes.

Youngstar: And probably needs a walk, I’m out of here.

Graybeard: Cheers.

Going Faster

Write clear, precise code. Every ten years it will run 1,000 times faster.

- Joe Armstrong

Youngstar: We’re starting to get traffic on our site and some of the servers became busy. I think I need to rewrite some of my modules in C.

Graybeard: You know the three rules of optimization9?

Youngstar: Nope.

Graybeard: First rule is: “Don’t.”

Youngstar: Very helpful.

Graybeard: Actually it is. Second rule is: “Don’t… yet”.

Youngstar: And the third is “never”?

Graybeard: Nope, it’s: “Profile before you optimize.”

Youngstar: That one I get, but why avoid optimization?

Graybeard: Because there are so many better ways to make things run faster than writing code which is hard to understand and maintain.

Youngstar: Do tell.

Graybeard: Let’s start with the industry’s obsession with speed. The question you should ask is not “Can I make it faster?” but “Is it fast enough?”.

Youngstar: What’s “enough?”

Graybeard: This is an excellent question, and a lot of companies are not asking it. Try to extract numbers from the product manager/business people. They need to understand that you’ll build a totally different system if they need minutes or milliseconds.

Youngstar: Minutes?

Graybeard: There are batch systems in enterprises that run once a day, so even days might be a valid answer.

Youngstar: And if I hit these numbers, spend my time elsewhere developing new features?

Graybeard: Exactly. A lot of the time people say “make it as fast as you can”. Don’t let them get away with it.

Youngstar: And how do I do that?

Graybeard: I tell them something like: “OK, but I’ll need a supercomputer and two years to get as fast as I can.”

Youngstar: Nice!

Graybeard: One thing you want to do before optimizing is make sure your code works.

Youngstar: Doh!

Graybeard: You’d be surprised how many times people optimize bugs. Make sure you have a good regression/acceptance test before you start. Also spend time with the code and understand what it’s doing.

Youngstar: Makes sense.

Graybeard: After you’re ready, the first thing you should do is profile.

Youngstar: I know about the Python profilers and pstats.
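
I usually do something along these lines (scribbles on a napkin). It’s only a sketch: slow_sum and profile are names I made up for the example, while cProfile and pstats come from the standard library:

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Deliberately naive work so the profiler has something to show."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def profile(func, *args, **kwargs):
    """Run func under cProfile. Return its result and a text report of
    the ten entries with the highest cumulative time."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    stats.sort_stats('cumulative').print_stats(10)
    return result, out.getvalue()


if __name__ == '__main__':
    _, report = profile(slow_sum, 1000000)
    print(report)
```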

Graybeard: Excellent, these will help you identify the problem. Note that there are several UI front ends to pstats and some IDEs have excellent integration.

Youngstar: You? Preaching UI?

Graybeard: Sometimes a picture is worth a thousand words. Note however that pictures can lie just as well as words.

Youngstar: Meaning?

Graybeard: You need to understand what you’re viewing and how you measured it. For example people who use Windows should disable the anti-virus software before running profilers.

Youngstar: Oh! “Lies, damn lies and benchmarks.”10?

Graybeard: Exactly. Also note that there are several kinds of profilers. You usually start with time based ones, but there are event based, memory and other profilers out there - know the tools.

Youngstar: I’ll make sure to have more than a hammer in my toolbox. However, my system is more complex than just one component. How can I find out how much time each part takes?

Graybeard: I tend to use a timing decorator on functions. This decorator logs the function execution time and then I can see what’s taking time. This, combined with a context object to know which functions belong to the same request, helps me understand what’s going on.
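
A minimal version of such a decorator might look like this (codes on a napkin). The names timed and parse and the 'timing' logger are mine, made up for the sketch, and a real one would also log the operation id from the context object:

```python
import functools
import logging
import time

log = logging.getLogger('timing')


def timed(func):
    """Log how long each call to func takes, even if it raises."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            log.info('%s took %.6f seconds', func.__qualname__, elapsed)
    return wrapper


@timed
def parse(text):
    # Hypothetical function we want to measure.
    return text.split()
```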

Youngstar: Something like yslow?

Graybeard: Not as fancy, but yes.

Once you’ve identified the most promising candidates to optimize, it’s time to evaluate how much effort it’ll take to make each one better and pick the one with the best speedup for the effort.

Youngstar: Pain vs Gain again?

Graybeard: It’s always there.

Youngstar: Now that I have what to optimize, how do I do it?

Graybeard: There are many tools and techniques out there. I’ll try to point out some of the major ones. But do your homework.

Youngstar: I always do.

Graybeard: Always?

Youngstar: OK, when I feel like it.

Graybeard: Ha! The first easy solution is to throw hardware at it. I heard that some people at Google have a sticker on their laptop saying “My other computer is a data center.”

Youngstar: Oh, I need one of these.

Graybeard: I sometimes joke that the hardest part of my consulting gig is to convince people they don’t need a “big data” solution. Amazon is about to offer instances with two terabytes of memory.

Youngstar: Two terabytes? Wow!

Graybeard: Relax, in a few years you’ll have it in your phone.

Youngstar: The never ending Moore’s law?

Graybeard: Something like that. The idea is that machines are way cheaper than developer time. It’s an old idea, check out the Rule of Economy sometime.

Youngstar: I read TAOUP, some good guidelines there.

Graybeard: You do do your homework.

Youngstar: I try. What else?

Graybeard: Cheat whenever you can.

Youngstar: Huh?

Graybeard: I don’t recall the exact saying, but it says something like “the fastest code is the one not being executed.” Caching is a great example, and a lot of the time you can get away with a fast approximation rather than actually doing the full calculation.
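
Caching can be as small as one decorator. The sketch below uses functools.lru_cache from the standard library; the Fibonacci function is just a stand-in for any expensive, repeatable computation.

```python
import functools


@functools.lru_cache(maxsize=None)
def fib(n):
    """Naive recursive Fibonacci; the cache computes each n only once."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


print(fib(30))
print(fib.cache_info())
```

Without the cache this recursion repeats the same sub-calls over and over. With it, each value is computed once and later calls are dictionary lookups. The usual caveat applies: cache only functions whose result depends solely on their arguments.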

Youngstar: For example?

Graybeard: Floating points.

Youngstar: Oh, the 1.1 * 1.1 != 1.21 thing people always complain about?

Graybeard: Exactly. Floating point sacrifices accuracy for speed.

Youngstar: And you can use the decimal module for accurate results.
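
A short illustration of the trade-off, using only the standard library. Note that Decimal is exact for decimal fractions like these but still works at a finite (configurable) precision, and it is noticeably slower than float.

```python
from decimal import Decimal
import math

# Binary floating point cannot represent 1.1 exactly, so the product drifts.
product = 1.1 * 1.1
print(product == 1.21)              # False
print(math.isclose(product, 1.21))  # True: compare with a tolerance instead

# Decimal stores base-10 digits, so this multiplication is exact.
exact = Decimal("1.1") * Decimal("1.1")
print(exact == Decimal("1.21"))     # True
```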

Graybeard: Exactly. After you throw hardware at it and cheat, it’s time to reach for algorithms and data structures. And you need to know the strengths and weaknesses of the ones you’re using.

Youngstar: Like list append is fast but prepend is slow?

Graybeard: Yup. If you need to prepend a lot what do you use?

Youngstar: A deque.
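
A rough way to see the difference is to time both approaches. This is a sketch, not a rigorous benchmark; the function names are made up and the absolute numbers depend on your machine.

```python
from collections import deque
import timeit


def prepend_list(n):
    """Each insert(0, ...) shifts every existing element: O(n) per call."""
    items = []
    for i in range(n):
        items.insert(0, i)
    return items


def prepend_deque(n):
    """appendleft is O(1) because a deque grows at both ends."""
    items = deque()
    for i in range(n):
        items.appendleft(i)
    return items


n = 20_000
print("list :", timeit.timeit(lambda: prepend_list(n), number=3))
print("deque:", timeit.timeit(lambda: prepend_deque(n), number=3))
```

The trade-off is that indexing into the middle of a deque is slow, so it pays off only when the work happens at the ends.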

Graybeard: Very nice. Also try to use the builtin data structures, they are written in super optimized C.

Youngstar: Yeah, the notes in the dict implementation make my head spin.

Graybeard: Apart from algorithms, you need to know a bit about computer architecture.

Youngstar: Access times for various hardware?

Graybeard: Yes. There’s a huge performance penalty for a cache miss.

Youngstar: Yeah, you’ve sent me that tweet. I haven’t watched the video about “cache friendly algorithms” yet.

Graybeard: So you don’t do your homework? I’m confused now.

Youngstar: Told you - when I feel like it.

Graybeard: Moving on. Once you’ve exhausted all the options to make things faster inside Python, and Python will take you a long way - it’s time to look at some alternatives.

Youngstar: Like C extensions?

Graybeard: Before you go down that path, and sometimes you do need to go there, there are some less painful options.

Youngstar: Cython?

Graybeard: That’s one popular option. There’s also numba, which is a JIT compiler that can also offload work to the GPU, and you can use an alternative Python implementation such as pypy.

Youngstar: Hold on, you told me not to use pypy.

Graybeard: You didn’t have a reason, and there’s a price to pay.

Youngstar: Third party libraries?

Graybeard: Exactly. But sometimes pypy will give you the speed boost you need with almost zero effort.

Youngstar: OK. Anything else?

Graybeard: There are many speed optimization tricks. They are related to the Python implementation you’re using. Things like using __slots__ for memory reduction, avoiding dot lookup and many others. See the wiki for more, and you’ll probably pick up more as you go.
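
The two tricks named here can be sketched as follows. The class and function names are illustrative only, and the actual gains depend on the interpreter and its version, so measure before relying on either.

```python
import math


class PointDict:
    """Regular class: every instance carries its own __dict__."""
    def __init__(self, x, y):
        self.x = x
        self.y = y


class PointSlots:
    """__slots__ replaces the per-instance __dict__ with fixed storage."""
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


def norms_dotted(points):
    """Looks up math.sqrt (a global plus an attribute) on every iteration."""
    return [math.sqrt(p.x * p.x + p.y * p.y) for p in points]


def norms_local(points, sqrt=math.sqrt):
    """Binds sqrt to a local name once, avoiding the repeated dot lookup."""
    return [sqrt(p.x * p.x + p.y * p.y) for p in points]


points = [PointSlots(3, 4), PointSlots(6, 8)]
print(norms_dotted(points), norms_local(points))
```

The price of `__slots__` is flexibility: you can no longer add arbitrary attributes to an instance.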

Youngstar: OK.

Graybeard: Also there are many other tools you can use. For example strace lets you see what system calls your program is making.

Youngstar: I’ve played with strace, it’s fun.

Graybeard: Oh, you’re getting around to my definition of fun now?

Youngstar: Busted! What about parallelization?

Graybeard: This is also an option. Always remember Amdahl’s law, don’t expect miracles. There are many ways to parallelize, from threads, to processes to different machines. Do your homework again.
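
Amdahl's law says that if a fraction p of the work can be parallelized over n workers, the best possible speedup is 1 / ((1 - p) + p / n). A small sketch (the function name is made up) shows how quickly the serial part dominates:

```python
def amdahl_speedup(parallel_fraction, workers):
    """Upper bound on speedup when only part of the work runs in parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


# Even with 90% of the work parallelized, the speedup never reaches 10x.
for workers in (2, 8, 64, 1024):
    print(workers, round(amdahl_speedup(0.9, workers), 2))
```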

Youngstar: Threading, multiprocessing and Celery?

Graybeard: There are so many solutions out there. There’s the new concurrent.futures module in the standard library. And for multi machine parallelization there are many solutions, from Spark to distributed to many more. However before you go down that path, try a better algorithm and better hardware. Going “big data” is painful.
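
A minimal sketch of concurrent.futures, assuming an I/O bound task (simulated here with a short sleep). The `fetch` and `fetch_all` names are hypothetical. For CPU bound work in CPython, ProcessPoolExecutor is usually the better fit because threads share the interpreter lock.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def fetch(item):
    """Stand-in for a network or disk call that mostly waits."""
    time.sleep(0.05)
    return item * 2


def fetch_all(items, workers=8):
    """Run fetch over items in a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fetch, items))


start = time.perf_counter()
results = fetch_all(range(8))
print(results, f"{time.perf_counter() - start:.2f}s")
```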

Youngstar: OK. And writing stuff in C?

Graybeard: Once you know this is the right solution, you have several options, from the native C API to SWIG and others. But I’d start with Cython.

Youngstar: OK. So I’ll start by not optimizing and see how it goes for me.

Graybeard: Yes. The last thing to remember is not to expect miracles. Raymond Hettinger phrased it nicely: Much of the doubling of speed for core Python that has occurred over the last decade has occurred one little step at a time, none of them being individually “dramatic”11.

Youngstar: OK, I’ll remember that - baby steps.

Graybeard: And now, let’s continue with our baby steps toward the Ballmer Peak.

Youngstar: Two beers coming up.

Process

The only ‘best practice’ you should be using all the time is ‘Use Your Brain’.

- Steven Robbins

Youngstar: We’re up to six people now.

Graybeard: Nice. Recruiting is hard.

Youngstar: You have no idea… Actually you do :)

Graybeard: Yeah, I’ve done my share of interviews.

Youngstar: But your interview methodology is flawed since I passed.

Graybeard: I’m labeling you as true positive.

Youngstar: That’s a new one. Nice to be on the good side of the confusion matrix.

Graybeard: Do you see any changes once your size went up?

Youngstar: Oh yeah. Things look much more chaotic and it seems like I spend too much of my time … not programming.

Graybeard: Sounds typical.

Youngstar: How can I beat typical and get to hacking?

Graybeard: It’s called process.

Youngstar: I was looking into that. Agile, Scrum, XP, Lean, Kanban, CMMI, … There are so many.

Graybeard: You know I didn’t manage much - it cuts down my hacking time.

Youngstar: Yeah, but you trained many of your managers.

Graybeard: That I did. I’ll give you some guidelines. The first thing is to think about why we need a process.

Youngstar: To remove chaos?

Graybeard: Chaos is not all bad. I think it was Michael Crichton who said that all functioning societies are somewhere between total stagnation and total chaos.

Youngstar: Currently we’re too much into the chaos side, some structure will help.

Graybeard: There’s a great article by Clay Shirky called “A Group Is Its Own Worst Enemy”. Highly recommended.

Youngstar: I’ll add it to my reading list.

Graybeard: Oh, we’re going to fill this list today. One of the main points in Clay’s article is that in order for a group to function, individuals need to make some sacrifices.

Youngstar: The same way I don’t park my car in the middle of the road but in a free parking spot?

Graybeard: Exactly. I see a process as a set of rules that make the group function better. It’s very important that your team members understand it and see the benefit of the process. Otherwise it’s just more red tape for them.

Youngstar: OK. So process metrics should be visible?

Graybeard: Yes. And you should select the metrics carefully. People will try to make them look good. For example if you measure number of bugs in production, people won’t write risky code.

Youngstar: What’s good about risky code?

Graybeard: Sometimes it’s the right solution. Even a good refactoring is risky.

Youngstar: Pain vs gain again?

Graybeard: Yes, but we’re digressing too much. Let’s get practical now.

Youngstar: Practical is my middle name.

Graybeard: Didn’t know you changed it. Anyway … Start with a light process. Jared Spool said “Too much process is just like no process, except slower.”

Youngstar: And nobody likes red tape.

Graybeard: Yup. I’ve seen many teams, each has its own process. It depends on the team, the product, the culture and many other things. Most start with Scrum since it’s very light and flexible.

Youngstar: What’s your variation?

Graybeard: It varies. The parts I tend to do are - sprint planning, morning scrum, code reviews and retrospectives.

Youngstar: Sprint planning is where you decide on what to do this sprint?

Graybeard: Yes. It depends on the length of the sprint, I like one week sprints.

Youngstar: Do you leave spare time to unknowns?

Graybeard: If you have enough metrics, most of these unknowns will be “known unknowns”. People go on vacation, get sick, bugs in production, new customer demo … All of these are pretty predictable once you have enough data.

Youngstar: What about the “unknown unknowns”?

Graybeard: Deal with them when they happen. The hardest part is to make the management/product team understand that that’s it - nothing else will get into the sprint that week.

Youngstar: What’s called “feature creep”?

Graybeard: In part yes. OK, the next thing is morning scrums, sometimes called standups. I tend to do them late-ish so people will be there.

Youngstar: And so you can sleep late?

Graybeard: I exercise in the morning.

Youngstar: Oh right. If I recall correctly each member says what they did yesterday, what they plan to do today and blockers.

Graybeard: Exactly. And you need to be very strict about keeping these short. Technical solutions are not part of the standup. I had teams do the standup in the chat room. This way you also have a history of standups.

Youngstar: You love your chat rooms.

Graybeard: It’s simple math. 1 to 1 communication is $$O(n^2)$$, whereas a chat room is $$O(n)$$. Add “The Mythical Man-Month” to your reading list.
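
The arithmetic behind this can be sketched in a few lines, assuming every pair of people counts as one channel and a chat room costs one connection per member. The function names are made up for illustration.

```python
def pairwise_channels(people):
    """Number of distinct 1 to 1 conversations among `people` members."""
    return people * (people - 1) // 2


def chat_room_connections(people):
    """Everyone joins the same room: one connection per person."""
    return people


for people in (3, 6, 12, 24):
    print(people, pairwise_channels(people), chat_room_connections(people))
```

For a team of six that is 15 possible 1 to 1 channels against six people in a single room, and the gap widens quickly as the team grows.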

Youngstar: That old book?

Graybeard: 1975. But since it talks about people, and people haven’t changed much since 1975 - it’s still very relevant.

Youngstar: OK. Next was code reviews. Anything special there?

Graybeard: Just make sure to do them. Allocate time for them in the process and make sure they happen quickly.

Youngstar: Quickly since the code in the master branch drifts?

Graybeard: Yes. And also since the person who wrote the code has moved on to another task and starts to forget.

Youngstar: And the last thing - retrospectives?

Graybeard: It comes from the old concept of the OODA loop.

Youngstar: OODA?

Graybeard: Observe, Orient, Decide, Act. But the most important part is the loop. In a retrospective you reflect on what you did and find out how to improve. A lot of research on successful teams shows that this is a key practice. Sadly a lot of teams skip it.

Youngstar: And you use “5 whys” in the retrospectives?

Graybeard: Up to you. The most common practice is to ask “what went well” and these are things you keep. Then ask “what went wrong” and these are parts you need to improve.

Youngstar: Improve the process?

Graybeard: Yes, the aim of the process is to help the team work better. If it doesn’t - fix it.

Youngstar: Any tools you recommend?

Graybeard: A tool to track work is essential. There are many out there, free or not, hosted or not. A lot of services that host code, such as github, provide a simple issue system which is usually good enough.

Youngstar: What about JIRA?

Graybeard: JIRA is great and has excellent support and ecosystem. But I’d start with something free and move there only if there’s a need.

Youngstar: OK, what else?

Graybeard: Source control is a must, git seems to be the one most people use today and it has great tooling around it.

Youngstar: How’s source control connected to process?

Graybeard: It helps with collaboration. I work in a feature branch per issue. Once the branch passes tests and code review - it is merged to master.

Youngstar: OK.

Graybeard: You need a tool to review code.

Youngstar: github has it as well?

Graybeard: Yeah, and bitbucket, and gitlab, and …

Youngstar: OK, OK …

Graybeard: A tool that’s a bit different is gerrit. Apart from code reviews, it also automatically merges code to master once you get enough plus ones.

Youngstar: Plus one? As in the Apache voting system?

Graybeard: Yes. Another tool I find essential is group chat.

Youngstar: Yeah, we talked about this already. Anything specific?

Graybeard: Both Slack and HipChat are nice and come with many bells and whistles. Internally companies usually install a Jabber server. Oh, and don’t forget IRC.

Youngstar: IRC? With the hashtags and AFK? It’s still a thing?

Graybeard: Very much, it works and there’s a lot of tooling around it.

Youngstar: I’ll start with hosted, save me operations.

Graybeard: I agree.

Youngstar: Any other tools?

Graybeard: A place for documentation. Most of the collaboration tools mentioned earlier have a Wiki based system. Some companies use Google Docs or another online office suite.

Youngstar: I already have Google Docs since I use them for my email. They also provide calendar and other tools.

Graybeard: Yup. Note they’re not the only player, but they are a good default choice. And again you save on operations.

Youngstar: Anything else?

Graybeard: I think that’s pretty much it. Let’s see… Start sprint and use issue system. Code in branch and optionally update wiki, submit to code review. Communicate in chat room and do retrospective in wiki with new issue… Oh, I forgot Jenkins.

Youngstar: The CI system?

Graybeard: Yes and much more. There’s a joke: “I don’t care if it works on your machine, we’re not shipping your machine.”

Youngstar: Ha. I’ve seen this “works on my machine” too many times.

Graybeard: Another nice-to-have is the bots in the chat room. Google chatops to see what you can do with these bots. It’s also a lot of fun to write and use them.

Youngstar: And fun is important to keep people around.

Graybeard: Not to mention that you spend a lot of time at work, so it better be fun… This reminds me of another book.

Youngstar: Great.

Graybeard: It’s called Peopleware.

Youngstar: Like hardware but for people?

Graybeard: Exactly. It’s an excellent read. You reminded me of it by talking about fun. In this book they talk about how many companies don’t calculate the price of replacing an employee - which is very high.

Youngstar: Recruiting, training …

Graybeard: Also most people are not at 100% until about a year at work.

Youngstar: Ouch!

Graybeard: Yup.

Youngstar: OK. Tomorrow I’ll install a bunch of tools and make sure we have fun. And be more productive.

Graybeard: It’s all about the journey. Tell people it takes time to adopt a new process. Show them this (draws on a napkin)

Process Adoption

Youngstar: I see, good one.

Graybeard: Note that process is procedure. You’ve succeeded once it becomes culture. Peter Drucker said “Culture Eats Strategy For Breakfast.” When people come to standup because this is the way we communicate and not because the boss told them - that’s when you win.

Youngstar: Interesting … This also affects hiring.

Graybeard: Very much. You need to think about the company you want to create. I prefer places where everything is allowed unless explicitly forbidden.

Youngstar: Don’t we all? I’ll think about that on my journey home.

Graybeard: Oh, one more book for you.

Youngstar: You do deliver on your promises, which one is that?

Graybeard: It’s called “Getting Real”. You’ll like the “meetings are toxic” part of it.

Youngstar: Yes I will.

Graybeard: Alright, g’night and happy reading.

Youngstar: G’night.

Time Management

I love deadlines. I like the whooshing sound they make as they fly by.

- Douglas Adams

Youngstar: Sorry I’m late, got stuck at work.

Graybeard: Happens a lot lately.

Youngstar: Yeah. It feels like I’m no longer in control of my time. You on the other hand seem to have a lot of free time. How do you do it?

Graybeard: It’s all about focus and priorities.

Youngstar: Do tell.

Graybeard: I once read an article called “The Sad, Beautiful Fact That We’re All Going To Miss Almost Everything”. It states that even if you limit yourself to the last 250 years of literature, and only in English, you still won’t be able to read all the good books even if you read a hundred books per year.

Youngstar: I don’t think even you read a hundred books per year.

Graybeard: I did as a teenager. But work has slowed me down a lot, not to mention family.

Youngstar: Good trade-off for the family bit.

Graybeard: Agree, but I also enjoy my work.

Youngstar: OK, I get it - there’s not enough time to do everything we want.

Graybeard: It’s not just about getting it, you need to internalize it and not regret missing things.

Youngstar: I’ll work on it. I guess this is connected to prioritizing things?

Graybeard: Exactly. Once you understand you won’t get to do everything, you prioritize and do the things that matter most, without regretting or getting tempted to bite off more than you can chew.

Youngstar: Sounds hard.

Graybeard: Yeah, I’m still learning to say “no”. But this gives me enough time to work on the things I think are important and do them well.

Youngstar: What if the things that are important require a lot of time?

Graybeard: Then you prioritize what parts are more important and do them. I think it was Patrick Lencioni who said “If everything is important, then nothing is.”

Youngstar: Any tips on how to prioritize things?

Graybeard: There are many systems out there. I’d probably start with the Eisenhower Method.

Youngstar: What’s that?

Graybeard: (Draws on a napkin)

    ^
    |
    +---------------+-------------+
    |               |             |
 u  | urgent        | urgent      |
 r  | not important | important   |
 g  |               |             |
 e  +---------------+-------------+
 n  |               |             |
 t  | not urgent    | not urgent  |
    | not important | important   |
    |               |             |
    +---------------+-------------+-->
               important

The idea is that you want to be in the not urgent/important square. If something is urgent/important - JFDI.

Youngstar: Cool. What about the other two?

Graybeard: The one to avoid is urgent/not important, it tends to suck time.

Youngstar: And the last one I just don’t do?

Graybeard: As much as you can. Successful companies tend to spend their time in the not urgent/important area. If you find you spend most of your time in urgent/important - fix your process.

Youngstar: Any other tips?

Graybeard: There are other systems out there - GTD, Inbox Zero … Try out some, but for me it’s more of a gut feeling than anything else.

Youngstar: Don’t get me started on your gut.

Graybeard: Took me years to develop it, used a lot of beer in the process.

Youngstar: Funny! I guess I’ll move “Blink” to the top of my reading list.

Graybeard: Nice prioritizing! Two common mistakes people make are not following their priorities and forgetting to prioritize things outside work.

Youngstar: My boyfriend has been complaining I work too much.

Graybeard: Ouch! Nobody on their deathbed has ever said “I wish I had spent more time at the office”.

Youngstar: Yours?

Graybeard: Nope, not sure who said that.

Youngstar: I guess they didn’t achieve much in their life either.

Graybeard: Oh, you’ll find lazy people accomplish a lot. According to Larry Wall the three great virtues of a programmer are: Laziness, Impatience and Hubris.

Youngstar: Larry Wall of Perl?

Graybeard: And patch and many other things.

Youngstar: OK. Can we move to focus now?

Graybeard: Sure. The basic idea is that humans are very bad at context switching.

Youngstar: I can chew gum and walk at the same time.

Graybeard: I’m talking about things you need to concentrate on. What is called getting in the flow.

Youngstar: Oh, that book by the guy whose name I can’t pronounce?

Graybeard: Yes, did you read it already?

Youngstar: Nope, I’ll re-prioritize.

Graybeard: Good call. The main idea is that it takes about 15-20 minutes to get into a flow state where you can do things which require a lot of mental effort. Snapping out of it is a matter of seconds.

Youngstar: That’s why I do most of the good work early in the morning when no one is bothering me.

Graybeard: Which says something about your work environment.

Youngstar: Yeah, I just realized I need to fix this. I do like my mornings.

Graybeard: I knew you’d find the lazy person inside you.

Youngstar: I never lost it. It always surprises me how easy it is to become a workaholic.

Graybeard: Yup.

Youngstar: OK, focus and priorities. Last words before I head home?

Graybeard: I’ll keep you a bit more since I think it’s important. Apologize to your boyfriend for me.

Youngstar: Sure.

Graybeard: One thing is that you need time to think. John Cleese says you need a place in time and space to be creative.

Youngstar: Meaning?

Graybeard: Allocate time for yourself, without interruptions. Use this time to think and reflect. Don’t just get carried away with day to day. I do it when I jog.

Youngstar: OK. I guess I need to start jogging.

Graybeard: Find your own sport! Which also should be a priority for you.

Youngstar: What’s the matter? Afraid I’ll outrun you?

Graybeard: You already do. And you’re more than welcome to join my silent jogging.

Youngstar: Will do. Anything else?

Graybeard: A good exercise that will force you to prioritize is going home early. Say 4pm.

Youngstar: Yeah, right.

Graybeard: I’m serious. Once you go home early, you start to prioritize and focus like nobody’s business. I’ve seen many people doing long hours and hardly working.

Another good thing about short work days is that they let you recharge. People get “burned” when doing long hours.

Youngstar: I can feel it, seems like my productivity isn’t what it used to be.

Graybeard: Oh, and sleep - sleep is way underrated. People perform badly when they’re sleep deprived.

Youngstar: Didn’t they use sleep deprivation as a kind of torture?

Graybeard: They did. So get your work/life balance in order and stop torturing yourself.

Youngstar: Said the person who’s keeping me here.

Graybeard: And here I was thinking you enjoy our meetings.

Youngstar: I do, but I need some down time.

Graybeard: Go home then. I’ll see you next time.

Youngstar: G’night.

Asking Questions

“The best way to get the right answer on the Internet is not to ask a question, it’s to post the wrong answer.”

- Ward Cunningham

Graybeard: In other news … I’m going to be away for a while.

Youngstar: Where are you going to hike this time?

Graybeard: The Appalachian trail, not sure I’ll do all of it but I’ll probably be offline for a month.

Youngstar: That’s awesome! Send pictures.

Graybeard: What part of offline don’t you understand?

Youngstar: Doh! Oh wait… who’s going to help me then?

Graybeard: Don’t you have other friends?

Youngstar: Yes, but they don’t know Python and other programming skills like you.

Graybeard: Arguably, the most important skill you’ll acquire as a developer is the ability to ask good questions12. Let’s talk a bit about how you can get help somewhere other than here.

Youngstar: OK, but I do expect to meet you here regularly when you’re back.

Graybeard: Of course. Where do you go to find answers now?

Youngstar: Google.

Graybeard: Google is a good start. How are your search skills?

Youngstar: Hmm, never thought of that. I’d guess I’m an average googler.

Graybeard: Searching is important, invest time in getting better. Google has some excellent free online courses on searching called Power Searching. As a rule of thumb add python to every query.

Youngstar: Neat, I’ll do that. Another place I usually find myself is StackOverflow.

Graybeard: Yeah, they have great SEO.

Youngstar: SEO? I can’t keep up with all these acronyms.

Graybeard: Nobody can, I use acronymfinder a lot. In this case SEO stands for “search engine optimization”.

Youngstar: I found out that in most cases StackOverflow has the answer to what I’m searching for.

Graybeard: Yeah, StackOverflow and the rest of the StackExchange sites are a great resource. Sometimes it’s biased toward people who answer fastest, not necessarily those who give the best answer.

What happens when you don’t find your answer on StackOverflow?

Youngstar: I ask there, most times I get an answer within an hour.

Graybeard: Edward Hodnett said “If you do not ask the right questions, you do not get the right answers. A question asked in the right way often points to its own answer.” Do you ask the right questions?

Youngstar: I think so, not sure. Any guidelines?

Graybeard: I’ll give you two reading assignments. One is “How To Ask Questions The Smart Way” by Eric Raymond where he goes into great detail on how to formulate questions. The second is a piece of research done by the takipi people called “The anatomy of a Great Stack Overflow Question”.

Youngstar: Sure, I’ll add them to my pile of things to read. Can you give me the TL;DR?

Graybeard: (sighs) Sure, but it doesn’t mean you’re off the hook for reading these.

Youngstar: Agree.

Graybeard: The two main points are respecting people’s time and giving enough context. If people feel that you’re asking them a trivial question instead of RTFM - they won’t answer you. However if they see that you tried something before asking them - they’ll be more inclined to help. The other thing is giving context. The worst question you can get is “Help, it doesn’t work”. Give enough information about what you were trying to do, what you expected to happen and what actually happened.

Youngstar: This sounds a lot like bug reports.

Graybeard: Well, we’re in a technical world - so a lot of things that don’t work seem like bugs. Check out the PEBKAC acronym sometime.

Also be polite and make sure to thank people. Oh, and don’t feed the trolls.

Youngstar: Trolls are the ones who argue just for the sake of arguing?

Graybeard: Yeah, and they usually do it in a foul manner.

Youngstar: Any other advice?

Graybeard: Get a rubber duck.

Youngstar: What? Like the ones kids use in a bath?

Graybeard: What do you mean by “kids”? I have one like that.

Youngstar: For real?

Graybeard: You’re never too old to have a bath with your rubber duck. Has it ever happened to you that you were stuck on something, went to a co-worker for help, started to describe the problem and somewhere in the middle said - “never mind, found the solution.”?

Youngstar: Sure, several times. How does this relate to a rubber duck?

Graybeard: There’s something about formulating our questions verbally or in writing that helps us solve the problem. The idea here is that instead of wasting a co-worker’s time, you talk to the rubber duck. This is known as Rubber Ducking. Place a rubber duck next to your monitor and describe your problems to it when you’re stuck.

Youngstar: Here goes my social life … not that I had much of one.

Graybeard: You can also write it down instead of talking to the duck. But try to do it as if explaining the problem to someone. Richard Feynman has this algorithm for solving problems:

  1. Write down the problem.
  2. Think real hard.
  3. Write down the solution.

Youngstar: And this guy got a Nobel prize?

Graybeard: He was very smart. The first step is the one that relates to our current discussion. Once you write the problem down, it’ll be easier to find a solution.

Youngstar: Got it. Back to the online world - any more good places?

Graybeard: Well, there’s the good old Python newsgroup - comp.lang.python. I access it via the Google groups web interface.

Youngstar: Newsgroup? It’s still alive?

Graybeard: Yes, I lurk there and answer questions from time to time as well.

Youngstar: What about IRC?

Graybeard: That’s a different mode of communication. So far we talked about async13 communication where you ask a question and someone replies when they can. Now we’re moving to synchronous communication - very much like the conversation we’re having now.

Youngstar: Never thought about it like that.

Graybeard: Most people don’t. One common mistake is to try to use one form of communication as the other. Mostly people who think you should reply to an email right now. But back to IRC - the #python IRC channel on freenode is very busy with more than a thousand users logged in.

Youngstar: Whoa!

Graybeard: Yes. That’s why they seem rude, but they’re just trying to be efficient. The room topic says: “don’t ask to ask, just ask”. If you’re not familiar with IRC jargon, take the time to learn it. My best advice is to hang around just reading conversations before you jump in the water.

Youngstar: I’ll do that. Do people actually manage to talk in such a noisy environment?

Graybeard: Sure. Once you get the hang of it, it can be fun - but it’s time consuming and requires attention. This is synchronous communication after all.

Youngstar: So Google, StackOverflow, comp.lang.python and #python IRC channel. Anything else?

Graybeard: I think that will give you enough for now. Since we’re on the subject of asking questions - let’s talk about the questions you should ask yourself.

Youngstar: I ask myself questions all the time.

Graybeard: I’m not talking about the usual day to day existential stuff. I’m talking about work related questions.

Youngstar: Like retrospective?

Graybeard: Yes, we’ve covered that when talking about development process. But it’s worth repeating. A lot of studies show that companies/groups who excel at what they do, do a lot of retrospective/debriefing.

Youngstar: I was in some retrospectives where people got heated up.

Graybeard: Yes, don’t just do a retrospective - do it well. Note that retrospectives are not just for “end of sprint” but also for incidents. This is where Toyota’s “5 whys” comes in handy.

Youngstar: OK, I’ll brush up on how to do good retrospectives.

Graybeard: Other questions you should ask yourself are about the state of things. I learned from a very talented person a trick about staying in focus. The idea is to find one question, and keep asking it repeatedly. For example “Why aren’t we deploying?” - this will focus you on a JFDI attitude.

Youngstar: Language! I already gave you a pass on RTFM.

Graybeard: RTFM stands for “Read the Fine Manual”, what’s wrong with that?

Youngstar: Yeah, right!

Graybeard: But you’re right. You should speak up whenever you feel the environment gets toxic. A good culture needs gardening all the time.

Youngstar: Thanks. I’ll figure out a focusing mantra for me. Looks like a fun exercise. Any other questions I should ask myself?

Graybeard: It depends on what you’re doing and the state of things. For example in “Zero to One” Peter Thiel writes about seven questions all businesses must ask themselves.

Youngstar: And they are?…

Graybeard: Go read the book, it’s a good one.

Youngstar: I’ll need to rent a bigger place just to have room for all these books.

Graybeard: Nah, one Kindle is all you need.

Youngstar: That’s right.

Youngstar: Well, that will keep me covered until you get back. When do you leave?

Graybeard: We have time for one more beer before you drive me to my flight.

Notes

1Keep it simple, Stupid.

2Do not repeat yourself

3Single point of truth

4High Frequency Trading

5Integrated Development Environment

6Object Relational Mapping

7There ain’t no such thing as a free lunch

8ge '[N-Mn-m]' '[A-MN-Za-mn-z]'14

9From the wonderful c2 wiki

10Originally “Lies, damned lies, and statistics.” attributed to Mark Twain

11http://bugs.python.org/issue25823

12IBM’s iconic CEO Thomas J. Watson famously said “The ability to ask the right question is more than half the battle of finding the answer.”

13Short for asynchronous.

14The answer of course is encrypted with rot13 ☺