Despite the best efforts of modern development teams to eliminate bugs before release, crashes still happen. Yet while commit and integration processes have improved, thanks to the proliferation of tools like GitHub, Bitbucket, GitLab, Travis CI, CircleCI, and Jenkins, most organizations still rely on dated methods to spot and fix the errors plaguing their applications in production. By relying on customers to report problems, developers get only part of the story and have no means to close the loop with the user on the most important details.
The “Game of Telephone” that follows an error report usually begins only after crashes have affected a large enough portion of the customer base that some of those users contact customer or technical support. The support team’s first step is to document and categorize the problem, usually by asking the user for screenshots and a write-up.
From customer support, the report passes to the quality assurance team, which tries to recreate the error from the description in a shared doc and runs a battery of test scenarios to better understand its context, origins, breadth, and scope. Even then, many of the important details, not least of which is impact, cannot be recaptured by QA alone.
The next line of defense is to share the collected details with an engineering manager, whose best guess about priority and the likely fix leads to triage and a post-mortem investigation of log files. By this point in the Game of Telephone, crucial details have been lost, and there’s likely more concern about preventing collateral brand damage than interest in reengaging users to get firsthand insight and pinpoint the crash details.