I'm a software developer working as a senior consultant at Kentor in Stockholm, Sweden. My core competence is as a technical specialist within development and system architecture. In my heart I am, and probably will remain, a programmer. I still think programming is tremendously fun, more than 20 years after I first tried it. That's why my blog is named Passion for Coding. Anders is a DZone MVB, is not an employee of DZone, and has posted 80 posts at DZone.

My Worst Bug

11.29.2013

This post is a copy of a comment I made to a DZone post asking for the worst bug ever found and solved.

The worst bug I’ve ever tracked down and fixed was a system freeze hidden in some 300,000 lines of code. It only occurred when the device was left untouched for about an hour (typically over a lunch break) while mounted in a grader and connected to a high-precision GPS. I only had a few days to find and solve it.

The device used GPS measurements to automatically control the height and angle of the grader’s blade by connecting to the grader’s hydraulic system. We’re talking about a system where the data sent out actually did things in the physical world instantaneously.

We had been working on the system for nearly a year, and it was during the final field tests that the bug was found. It only occurred once every few days, but that was frequent enough to block the release of the product. The big problem was that we were never able to reproduce the bug in a lab with a debugger attached. It only occurred when the device was wired to a proper million-dollar machine. The GPS receivers alone cost tens of thousands of dollars.

The code base was large and completely multithreaded. The only thing I could do was start reading.

I started at the core GUI message pump and tried to follow all code paths to find out where the GUI thread could possibly get stuck. After a few days of digging, I found that someone had created an extra message pump in a section of the GUI that displayed the results of some background tasks. It was a hack to keep the background tasks running and reporting status so that the GUI could be updated. Once I had removed it and reworked the code into a proper message-passing design, we no longer experienced any hangs.
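To illustrate the general idea of that rework, here is a minimal Python sketch of a message-passing design. It is not the original code, and every name in it (`background_task`, `drain_status`, `run_demo`) is hypothetical. The assumption is that workers never pump GUI messages themselves; they only post status to a thread-safe queue, and the one and only GUI loop drains that queue without blocking.

```python
import queue
import threading


def background_task(task_id, steps, status_queue):
    """Hypothetical worker: posts progress messages and never touches the GUI."""
    for step in range(1, steps + 1):
        status_queue.put((task_id, step, steps))
    # A step of None marks completion for this task.
    status_queue.put((task_id, None, steps))


def drain_status(status_queue, display):
    """Called from the single GUI loop; applies pending messages without blocking."""
    finished = []
    while True:
        try:
            task_id, step, total = status_queue.get_nowait()
        except queue.Empty:
            return finished
        if step is None:
            display[task_id] = "done"
            finished.append(task_id)
        else:
            display[task_id] = f"{step}/{total}"


def run_demo(task_count=3, steps=5):
    """Stand-in for the one GUI loop; `display` plays the role of the widgets."""
    status_queue = queue.Queue()
    display = {}
    workers = [
        threading.Thread(target=background_task, args=(i, steps, status_queue))
        for i in range(task_count)
    ]
    for worker in workers:
        worker.start()
    pending = set(range(task_count))
    while pending:
        # A real GUI loop would also dispatch input and paint events here.
        pending -= set(drain_status(status_queue, display))
    for worker in workers:
        worker.join()
    return display
```

Because only one loop ever dispatches messages, there is no nested pump that could end up waiting on something the outer loop is responsible for delivering.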

The project manager was happy too – he got another bug to mark as fixed in his Excel sheet, although he complained that I had taken a long time to fix mine compared to the other bugs, which were mostly fix-a-typo or move-a-widget-2px-left style bugs…



Published at DZone with permission of Anders Abel, author and DZone MVB. (source)
