The Unreported Bugs

I’ve been working hard with my team to reduce our customer-impacting issues (CIs) from approximately 80 when I started to 5 as I write this. We’ve been having a 1-hour call each week with the whole team to look at our backlog, investigate issues and propose fixes.

During the process, we had to add a lot of additional monitoring and logging to find the root cause of some CIs. That monitoring and logging has uncovered something really interesting: issues that people have been encountering for a long time but never reported. These are the unreported bugs.

Investigation

While investigating the logs around a known CI assigned to us, we noticed 2 additional issues that Bugsnag (the tool we were using for this particular logging) had picked up. We divided up the investigation between us - one developer took on the investigation of the first error and I took on the investigation of the second.

We each dug into CloudWatch and Honeycomb, piecing together the patterns that led to this unwanted behaviour. The developer and I then paired and exchanged notes. We questioned each other, interrogated the system and thought of even more permutations that could cause the problems. After gathering all of this information, we knew what we had to do to fix it and we did it: investigated, fixed, tested and released within a couple of days.

It was amazing to us that this issue had been affecting approximately 20% of our users for quite some time, essentially making the product unusable for them, and yet they had never reported it. So, naturally, we started to wonder: what other issues are customers experiencing and not reporting?

Enlightenment

This thought process led us to our next wonderful idea. We all agreed to keep the weekly 1-hour CI review call, but when we have no CIs to review, we're going to look at the error logs instead. Each week, we'll review at least one set of error logs to assess whether the error is expected or unexpected. If it's unexpected, we'll investigate the root cause and, where possible, get working on a fix for it.
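
A review like this could start from something as small as the Python sketch below. It is an illustration only, not the tooling described in this post. It assumes the error events have already been exported from the logging tool as simple records, and the field names (`error_class`, `user_id`), the `EXPECTED_ERRORS` set and the sample events are all made up for the example.

```python
from collections import defaultdict

# Hypothetical: error classes the team has already judged to be expected.
EXPECTED_ERRORS = {"SessionExpired"}


def triage(events, total_users):
    """Group error events by class and rank unexpected ones by share of users affected."""
    if total_users <= 0:
        raise ValueError("total_users must be positive")

    users_by_error = defaultdict(set)
    counts = defaultdict(int)
    for event in events:
        users_by_error[event["error_class"]].add(event["user_id"])
        counts[event["error_class"]] += 1

    report = []
    for error_class, users in users_by_error.items():
        report.append({
            "error_class": error_class,
            "occurrences": counts[error_class],
            "affected_users": len(users),
            "user_share": len(users) / total_users,
            "expected": error_class in EXPECTED_ERRORS,
        })

    # Unexpected errors first, then the ones touching the most users.
    report.sort(key=lambda row: (row["expected"], -row["user_share"]))
    return report


# Made-up sample data standing in for one week's exported error events.
events = [
    {"error_class": "SessionExpired", "user_id": "u1"},
    {"error_class": "CheckoutTimeout", "user_id": "u2"},
    {"error_class": "CheckoutTimeout", "user_id": "u2"},
    {"error_class": "CheckoutTimeout", "user_id": "u3"},
]

for row in triage(events, total_users=10):
    print(row)
```

Ranking by the share of users affected, rather than by raw occurrence count, is a deliberate choice here: an error that quietly hits a large slice of users is exactly the kind that can go unreported.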

Whether a bug or CI is reported or not, if it's happening for anyone, it's affecting their perception of the quality of your product and negatively impacting their experience. If it's an easy fix and you have the capacity to do it, make the time, and maybe you'll surprise and delight your users.