How SmugMug continues to improve its customer experience with Sentry

About SmugMug

SmugMug is a premium online photo & video hosting service trusted by photographers across the globe with their memories, their passion, and their businesses. With SmugMug, photographers have a place to securely store, globally access, beautifully share, and effortlessly sell their high-quality photos.

Using Sentry, SmugMug benefits from:

  • Addressing latency issues faster, and improving metrics around number of issues, time to resolve, customer churn, and trial conversion
  • Having full visibility into how its frontend code is performing and how code health is affecting user experience
  • Getting alerts if the performance of its site falls below a defined threshold

Challenge

SmugMug had a home-grown logging solution that was integrated with an off-the-shelf tool. However, monitoring errors and performance remained a persistent challenge. The SmugMug engineering team found it tough to debug issues just using logs, and often had to rely on customer support to surface issues to the product team.

Waiting for a customer to raise an issue was too disruptive for the development team and made it impossible to get ahead of errors, so the team needed a better solution.

We could not understand what issues our users were experiencing. This led to wasted cycles on recreating the problem instead of spending time fixing and deploying the solution. In addition, we were concerned that these invisible issues were impacting page load times which could hurt our page rank score.

Mike Diaz, Staff Software Engineer

Requirements

Mike’s team was looking to solve three core problems:

  • They needed a better way to monitor their JavaScript code in order to understand what errors users were seeing. The team also needed to communicate with product managers to monitor the success of any new features.

  • The infrastructure team was working on the server-side rendering of their application, which was CPU-intensive and had the potential to slow down the app. This added complexity is why SmugMug needed a solution to help them identify how to improve page load speed and give actionable insights into performance issues affecting the end-user. Additionally, the team needed visibility into their legacy code and the performance of existing customer sites.

  • SmugMug has to work well for a large contingent of users, regardless of their location or device type; photographers from across the world need to upload images and have them render fast with full resolution. Because photographers rely on SmugMug to provide a great hosting experience, page load times are critical: slow website performance can impact PageRank scores, which may hurt the photographer’s business.

Solution

Using Sentry, SmugMug was able to streamline both their backend and frontend. With a simple-to-install SDK, SmugMug could instrument its tech stack supporting web and mobile (JavaScript, React, and Node.js). Being able to ingest events in real time meant the team could see exactly what users were experiencing. If a page was loading too slowly, they could jump into action.

Immediately, they saw several benefits. With Issue Grouping, they were able to see common traits between errors and exceptions, such as browser version and breadcrumbs. These insights helped Mike’s team prioritize which issues to fix according to which would have the most significant impact.

By integrating Sentry notifications in Slack, the team was able to get immediate alerting so that they could triage and deal with customer-facing issues quickly. Around the time SmugMug set up Sentry Performance, they had also recently migrated to a new search backend. They used Sentry Alerting as an additional signal for user-facing impacts due to any degradation in the search infrastructure. Search was a critical part of SmugMug’s user experience, and it needed to continuously scale to billions of photos that needed to be stored and indexed. The team now received alerts when searches failed, where they were failing, for how many users, and in what environments. As a result, they could catch production issues even before customers reported the problem or before other tools had spotted the issue.
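To make the alert contents described above concrete (what failed, where, for how many users, and in which environments), here is a minimal sketch of that kind of aggregation. It is not Sentry's API or SmugMug's code: the `SearchFailure` shape, the `summarizeFailures` and `shouldAlert` helpers, and the user-count threshold are all hypothetical names chosen for illustration.

```typescript
// Hypothetical event shape: one record per failed search request.
interface SearchFailure {
  environment: string; // e.g. "production" or "staging"
  endpoint: string;    // where the search failed
  userId: string;      // who was affected
}

interface FailureSummary {
  environment: string;
  endpoint: string;
  failures: number;      // total failed requests in the group
  affectedUsers: number; // distinct users in the group
}

// Group failures by (environment, endpoint) and count distinct users.
function summarizeFailures(events: SearchFailure[]): FailureSummary[] {
  const groups = new Map<
    string,
    { environment: string; endpoint: string; failures: number; users: Set<string> }
  >();

  for (const event of events) {
    // JSON-encode the pair so the key cannot collide across groups.
    const key = JSON.stringify([event.environment, event.endpoint]);
    let group = groups.get(key);
    if (!group) {
      group = {
        environment: event.environment,
        endpoint: event.endpoint,
        failures: 0,
        users: new Set<string>(),
      };
      groups.set(key, group);
    }
    group.failures += 1;
    group.users.add(event.userId);
  }

  // Map preserves insertion order, so groups appear in first-seen order.
  return Array.from(groups.values()).map((group) => ({
    environment: group.environment,
    endpoint: group.endpoint,
    failures: group.failures,
    affectedUsers: group.users.size,
  }));
}

// Assumed alert rule: notify once enough distinct users are affected.
function shouldAlert(summary: FailureSummary, minAffectedUsers: number): boolean {
  return summary.affectedUsers >= minAffectedUsers;
}
```

Counting distinct users rather than raw failures keeps a single user retrying a broken search from looking like a widespread outage.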

Monitoring Core Web Vitals was also critical to the SmugMug team, as these metrics (defined by Google) impact their users’ website search rankings. SmugMug used Sentry to monitor metrics related to rendering and response times, including Largest Contentful Paint (LCP), First Input Delay (FID), and Time To First Byte (TTFB). These data points helped provide insights about overall application performance to their team.
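As a rough illustration of how these metrics are usually read, the sketch below rates a measurement against Google's published "good" and "poor" boundaries and evaluates a set of page loads at the 75th percentile, which is the percentile Google recommends. The function and type names are hypothetical; this is not how Sentry computes or displays the values.

```typescript
type Metric = "LCP" | "FID" | "TTFB";
type Rating = "good" | "needs-improvement" | "poor";

// Google's published boundaries, in milliseconds.
const THRESHOLDS_MS: Record<Metric, { good: number; poor: number }> = {
  LCP: { good: 2500, poor: 4000 },
  FID: { good: 100, poor: 300 },
  TTFB: { good: 800, poor: 1800 },
};

// Rate a single value: at or below `good` is good, above `poor` is poor.
function rateMetric(metric: Metric, valueMs: number): Rating {
  const { good, poor } = THRESHOLDS_MS[metric];
  if (valueMs <= good) return "good";
  if (valueMs <= poor) return "needs-improvement";
  return "poor";
}

// 75th percentile using the nearest-rank method.
function percentile75(samplesMs: number[]): number {
  if (samplesMs.length === 0) {
    throw new Error("percentile75 requires at least one sample");
  }
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil(0.75 * sorted.length); // 1-based rank
  return sorted[rank - 1];
}

// Rate a page by the 75th percentile of its observed samples.
function ratePage(metric: Metric, samplesMs: number[]): Rating {
  return rateMetric(metric, percentile75(samplesMs));
}
```

Rating by the 75th percentile rather than the average means a page only counts as "good" when most visits, including those on slower devices and networks, meet the threshold.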

SmugMug chose Sentry for its technical capabilities (including support for multiple projects and cross-project views), usability, and intuitive setup. Staff Software Engineer Mike Diaz notes: “Installing SDKs for Sentry was very easy. Within minutes we were able to ingest events in real-time to start seeing errors and performance issues. Combining errors and performance helped our engineers better grasp the problem, which streamlined the troubleshooting process. As a result, we were able to resolve latency issues much faster.”

Now, with stack trace visualization (including source map support for their modern frontend), the team could immediately click into the line of code and start diagnosing the issue. Knowing exactly where in the code the issue originated meant that the right engineer was working on the right problem at the right time. And finally, using the GitHub integration meant commit messages would automatically resolve issues detected by Sentry, closing the loop between issue alerting, triaging, troubleshooting, and resolution.

Results

With Sentry, SmugMug gained full visibility into its user experience and was able to resolve errors proactively and quickly address latency issues. This streamlined process resulted in reducing the number of issues while simultaneously reducing time to resolve. In tandem, these improvements reduced customer churn and increased trial conversions.

“Sentry has been a force multiplier in being able to ship high-quality code,” says Mike Diaz. “It’s a key piece of our toolchain in helping our team create the best possible experience for our users.”

What’s next?

As SmugMug’s engineering team continues to migrate its legacy code and add new features, they plan to use Release Health in conjunction with Performance Budgets. This way, any release that results in a regression automatically fails the build. Additionally, since 70% of issues are caused by new code rollouts, using Sentry to monitor their releases will ensure that every release continues to improve the experience for SmugMug’s customers.
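The idea of a performance budget that fails the build can be sketched as a simple comparison between measured values and agreed limits. The example below is an assumption about how such a gate might look, not a Sentry feature or SmugMug's pipeline: `Budget`, `checkBudgets`, `buildShouldFail`, and the sample limits are hypothetical.

```typescript
// Hypothetical budget: the largest acceptable value for one metric.
interface Budget {
  metric: string;
  maxMs: number;
}

interface Violation {
  metric: string;
  measuredMs: number | undefined; // undefined when no measurement was reported
  maxMs: number;
}

// Compare measured values against budgets and collect every violation.
function checkBudgets(
  measured: Record<string, number | undefined>,
  budgets: Budget[],
): Violation[] {
  const violations: Violation[] = [];
  for (const { metric, maxMs } of budgets) {
    const measuredMs = measured[metric];
    // A missing measurement counts as a violation so gaps do not pass silently.
    if (measuredMs === undefined || measuredMs > maxMs) {
      violations.push({ metric, measuredMs, maxMs });
    }
  }
  return violations;
}

// The release gate: any violation means the build should fail.
function buildShouldFail(violations: Violation[]): boolean {
  return violations.length > 0;
}
```

In a CI job, a script built on this sketch would exit with a non-zero status whenever `buildShouldFail` returns true, which is what stops a regressing release from shipping.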

© 2021 • Sentry is a registered Trademark
of Functional Software, Inc.