What happens when you screw up?

Non-engineers want to know: what happens when a big bug is found in your software, and the bug is causing real users real problems, and you’re the one who wrote the code?

It’s true: engineers do sometimes write bad code, and sometimes it makes it into production.

But shipping production software involves a lot more than writing code. It goes beyond that one engineer. That engineer is not the only person who saw or ran that code.

  • The changes were probably reviewed by other engineers.
  • The code was probably tested by QA.
  • The component that the code is in probably has unit tests that were run during the build.

In short, in a sizable professional software organization, a single person doesn’t really have the power to screw up all alone. So the right thing to do when a production bug bites you is to figure out how you, as an organization, let that happen.

  • Was the code review process too lax?
  • Was the documentation on this component inaccurate?
  • Is the coverage of the unit tests insufficient?

What “happens to” the engineer who typed the code in question is, hopefully, that they participate in a post-mortem review that helps the team figure out how to improve things and reduce the likelihood of similar problems in the future.

For more on this, read the utterly excellent and inspiring “Blameless PostMortems and a Just Culture” essay by John Allspaw of Etsy.