Key Results

Reduced tech debt by improving both server latency and database load
Reduced average total latency by 2x
API response time in their largest bottleneck became 13x faster (from 17.5 seconds to 1.2 seconds)
Identified critical bugs in order to continue meeting predefined customer SLAs

SDK

PHP, Go, Python, TypeScript, React

Solutions

Error Monitoring, Performance Monitoring

How Intelligence Fusion made API response time 13x faster by finding performance bottlenecks with Sentry

About Intelligence Fusion

Intelligence Fusion specializes in providing intelligence, risk assessment, and situational awareness solutions to businesses and government agencies. Their threat intelligence platform manages geographical information and relies on assurance and prediction to help keep people safe; they also provide a threat intelligence REST API. Intelligence Fusion gathers and analyzes various types of data, including open-source information, social media content, and news reports to create actionable insights and reports for their clients.

Their Tech Stack

8 REST APIs written in PHP
Background processes written using Go and Python
Web platform written using TypeScript and React
Sentry is instrumented across 8 PHP services, 3 Go services, and 1 TypeScript/JavaScript service (the web platform). The Go and JavaScript services are a work in progress as the team adds more error details, full releases, and tracing.

Moving away from reactive debugging and making performance a priority

"We started with Sentry 3 years ago when we were building infrastructure and didn’t have a lot of test coverage. Sentry became essential to us to know when things were going wrong."
Thomas Hockaday, Lead Engineer @ Intelligence Fusion

As a long-time Sentry customer for error and exception monitoring, Intelligence Fusion was ready to invest in making their services more performant for customers. As their tech team grew from 7 to 16 and started scaling its databases, performance also grew to be more of a priority.

With Sentry already integrated into their services, and with a legacy API to maintain, the Intelligence Fusion engineers knew from experience that some of their endpoints were slower than others. From historical internal and client feedback, they also knew of slow rendering on their threat data heatmap, the main feature users see upon logging into the Intelligence Fusion platform. The team needed a quick and painless way to find app slowdowns, which led them to Sentry Performance.

Improving developer productivity with Performance Issues

Pre-Sentry, Lead Engineer Thomas Hockaday of Intelligence Fusion recalls the laborious process his team used to diagnose application performance issues. Typically, the engineering team needed to figure out:

How urgent an issue was, which they often determined based on whether a customer was complaining
What was causing it, which usually required their engineers to dig around in AWS logs for the culprit and to attempt to reproduce the error while communicating with the user

With Performance Monitoring from Sentry, Thomas’s team was able to troubleshoot their performance issues faster (which was reflected in existing metrics on overall uptime and service latency that the team tracked in Grafana). In Sentry, they could immediately:

Know which of their issues were critical and demanded attention
Find the root cause of issues using Sentry’s latest Profiling feature

Here’s the exact workflow Thomas’s engineering team takes to quickly diagnose a performance issue in Sentry. With Sentry’s Slack integration, the team gets alerted via Slack about any critical app performance issues in their project, like N+1 database query issues and slow database queries.

Then from the transaction summary in Sentry, the Intelligence Fusion engineers can identify exactly where the performance issue lies — whether it’s in the database, the response builder, or somewhere else. The engineers may click into the span waterfall associated with the transaction to see how long a query is running. Or, they may look at how many database queries are running in an endpoint to see if they can be reduced.
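To make the N+1 pattern mentioned above concrete, here is a minimal sketch in Python using an in-memory SQLite database. The table names (incidents, tags) and helper functions are hypothetical and are not taken from Intelligence Fusion's codebase, which is written in PHP. The sketch only illustrates why counting the queries per endpoint is a useful signal.

```python
import sqlite3


def build_db():
    """Create a tiny in-memory database with hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE incidents (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE tags (incident_id INTEGER, label TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO incidents VALUES (?, ?)",
        [(i, f"incident-{i}") for i in range(1, 6)],
    )
    conn.executemany(
        "INSERT INTO tags VALUES (?, ?)",
        [(i, f"tag-{i}") for i in range(1, 6)],
    )
    return conn


def fetch_n_plus_one(conn):
    """One query for the parent rows, then one extra query per row."""
    queries = 0
    result = {}
    rows = conn.execute("SELECT id, title FROM incidents ORDER BY id").fetchall()
    queries += 1
    for incident_id, title in rows:
        tags = conn.execute(
            "SELECT label FROM tags WHERE incident_id = ? ORDER BY label",
            (incident_id,),
        ).fetchall()
        queries += 1
        result[title] = [label for (label,) in tags]
    return result, queries


def fetch_joined(conn):
    """The same data fetched with a single JOIN."""
    rows = conn.execute(
        """
        SELECT i.title, t.label
        FROM incidents i
        LEFT JOIN tags t ON t.incident_id = i.id
        ORDER BY i.id, t.label
        """
    ).fetchall()
    result = {}
    for title, label in rows:
        result.setdefault(title, [])
        if label is not None:
            result[title].append(label)
    return result, 1


conn = build_db()
slow_result, slow_queries = fetch_n_plus_one(conn)
fast_result, fast_queries = fetch_joined(conn)
print(f"N+1 version: {slow_queries} queries, joined version: {fast_queries} query")
```

Both functions return the same data. The first issues one query per parent row plus one, so its cost grows with the size of the result set. The second stays at a single query.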

Using tracing with the Trace Navigator in Sentry helps you see what parts of the application are doing most of the heavy lifting. For PHP applications, tracing is particularly helpful to see which parts of the API call are slowest. The average API call goes through multiple stages:

First, the request to fetch data passes through the server into the PHP application.
The PHP application bootstraps relevant dependencies, then routes to identify which part of the application the request is trying to access.
Next, middleware is executed to authenticate and sanitize data on the request to ensure application security.
The validated data request moves through the main application layers to prepare a database query. Then, this data passes back into the application layers, is shaped into a response, and returns to the user.

Given this complexity, tracing helps the team decide whether the performance issue is in the server, the main PHP code, a specific part of the PHP code, or the database. If more detail is needed, the team delves in deeper and looks at the profiles.
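The idea behind the stages above can be sketched with a toy span recorder in Python. This is not Sentry's SDK or API. The Trace class, the stage names, and the simulated sleep times are all assumptions made for illustration. It shows how timing each stage separately points to the slowest part of a request.

```python
import time
from contextlib import contextmanager


class Trace:
    """Toy trace: records (stage name, duration in seconds) pairs."""

    def __init__(self):
        self.spans = []

    @contextmanager
    def span(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the span even if the stage raised an exception.
            self.spans.append((name, time.perf_counter() - start))

    def slowest(self):
        """Return the name of the stage with the longest duration."""
        return max(self.spans, key=lambda item: item[1])[0]


# Hypothetical stage names loosely following the request lifecycle above.
STAGES = ("bootstrap_and_routing", "middleware", "application", "database", "response")


def handle_request(trace, simulated_work):
    """Run every stage in order, sleeping to simulate its work."""
    for stage in STAGES:
        with trace.span(stage):
            time.sleep(simulated_work.get(stage, 0.0))


trace = Trace()
handle_request(trace, {"middleware": 0.05, "database": 0.005})
for name, seconds in trace.spans:
    print(f"{name:<22} {seconds * 1000:7.2f} ms")
print("slowest stage:", trace.slowest())
```

With these simulated timings the middleware span dominates while the database span stays short, which is the same kind of reading a span waterfall gives.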

This trace view shows the server, the middleware, and the database. The database bars are short (and hence, fast). The middleware band is long — so the performance issue is coming from PHP.

Speed up slowest endpoints for critical bottlenecks with Profiling

A few months ago, Intelligence Fusion implemented tracing on their main API and then steadily rolled it out to all of their other PHP services. Seeing all their tracing data in one place helped the team identify 1) areas that were particularly slow and 2) universal improvements they could make to reduce overall server latency.

Once tracing was enabled, the Intelligence Fusion team also easily set up Profiling to identify their largest performance bottlenecks. The main bottlenecks (apart from the issues sent to Slack) were transactions with the highest latency and user misery scores, clearly visible from the Performance dashboard.

From there, it was simply a matter of clicking into each profile to look at the breadcrumbs, the slowest functions widget, and the aggregated flame graph (see images below). These Profiling features helped reveal the functions with the longest execution times that needed optimization.

Within two weeks, the Intelligence Fusion team improved their overall application speed with several application-level changes. These included compressing large responses with gzip, enabling OPcache and JIT in PHP, and chunking larger dataset queries to reduce PHP memory usage.
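Two of these changes, response compression and chunked queries, can be sketched in Python with the standard library. The team's actual changes were made in PHP, so the function names, table name, and chunk size below are hypothetical, and the sketch only illustrates the general techniques.

```python
import gzip
import json
import sqlite3


def compress_response(payload):
    """Serialize a payload to JSON and return (raw bytes, gzip bytes)."""
    raw = json.dumps(payload).encode("utf-8")
    return raw, gzip.compress(raw)


def iter_rows_in_chunks(conn, query, chunk_size=1000):
    """Yield query results in fixed-size chunks instead of loading all rows."""
    cursor = conn.execute(query)
    while True:
        chunk = cursor.fetchmany(chunk_size)
        if not chunk:
            break
        yield chunk


# Repetitive JSON, as is typical for list endpoints, compresses well.
payload = [{"country": "GB", "risk": "low"} for _ in range(500)]
raw, packed = compress_response(payload)
print(f"raw: {len(raw)} bytes, gzip: {len(packed)} bytes")

# Hypothetical table with 2500 rows, read 1000 rows at a time.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO events VALUES (?)", [(i,) for i in range(2500)])
sizes = [len(chunk) for chunk in iter_rows_in_chunks(conn, "SELECT id FROM events")]
print("chunk sizes:", sizes)
```

Chunking keeps at most chunk_size rows in memory at once, which trades a little extra iteration for a bounded memory footprint.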

Using Sentry, the team monitored these improvements by comparing recent traces against historical ones as the optimizations were gradually deployed. The two screenshots below show how the slowest endpoint improved between May/June and today.

The Trends page also provided a helpful graph showing how their endpoint performance had improved over time.

As a result of these optimizations, Intelligence Fusion reduced their average latency across all of their APIs (as tracked in Grafana). For example:

API 1 (threat actors) average latency decreased from 1820ms to 23.8ms
API 2 (static assets) average latency decreased from 60.7ms to 24.1ms
API 3 (authorization) average latency decreased from 105ms to 24.9ms
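For readers who want these figures as speedup factors, the arithmetic is simply the old latency divided by the new one. The short Python sketch below uses only the numbers listed above. The helper names are illustrative.

```python
def speedup(before_ms, after_ms):
    """How many times faster the new latency is than the old one."""
    return before_ms / after_ms


def percent_reduction(before_ms, after_ms):
    """Percentage by which latency dropped."""
    return (1 - after_ms / before_ms) * 100


# Average latencies in milliseconds, as reported in the list above.
latencies_ms = {
    "API 1 (threat actors)": (1820, 23.8),
    "API 2 (static assets)": (60.7, 24.1),
    "API 3 (authorization)": (105, 24.9),
}

for name, (before, after) in latencies_ms.items():
    print(
        f"{name}: {speedup(before, after):.1f}x faster, "
        f"{percent_reduction(before, after):.1f}% lower latency"
    )
```

This works out to roughly 76x, 2.5x, and 4.2x for the three APIs respectively.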

The slowest endpoint, their country data endpoint, initially took 17.5 seconds to return. Having identified the database query as the biggest bottleneck from the flame graph, they optimized the endpoint in several ways (e.g. applied a GIN index to one of the columns and made sensible reductions to the accuracy of coordinate data). Sentry also helped them see where they were doing unnecessary column selection, further enhancing endpoint speed. Ultimately, average time decreased from 17.5 seconds to 1.2 seconds – making their API response time 13x faster.
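One of the optimizations named above, reducing the accuracy of coordinate data, can be illustrated with a small Python sketch. The source does not say how many decimal places the team kept, so the value of 5 (roughly one metre of latitude) is an assumption, as are the sample coordinates and the function name.

```python
import json


def round_coordinates(points, decimals=5):
    """Round (latitude, longitude) pairs to a fixed number of decimal places."""
    return [(round(lat, decimals), round(lon, decimals)) for lat, lon in points]


# Hypothetical high-precision coordinates, repeated to mimic a large response.
points = [
    (54.978252123456789, -1.617780123456789),
    (51.507351234567891, -0.127758234567891),
] * 1000

full = json.dumps(points)
reduced = json.dumps(round_coordinates(points))
print(f"full precision: {len(full)} bytes, rounded: {len(reduced)} bytes")
```

Fewer digits per coordinate means less data to read, serialize, and send. Whether the lost precision is acceptable depends on how the data is used, which is presumably what the team meant by sensible reductions.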

Joe Sweeny, VP of Engineering at Intelligence Fusion, speaks to the impact of these application performance improvements within the company:

"Sentry has always been our go-to tool for critical error handling to ensure we continue to deliver reliable and robust software. As Sentry has evolved its offering, it has allowed us to not only scale our microservice architecture horizontally but vertically, as Profiling and performance metrics let us dive deeper into our applications to ensure we provide an optimal level of performance for our customers."

What’s Next for the Customer?

As Intelligence Fusion continues growing their customer base, they aim to deepen their Sentry engagement. They plan to add Sentry release integration as part of their CI processes and improve the Sentry integration in their Golang services to see more detailed errors. To achieve frontend-to-backend visibility across their services, Intelligence Fusion will also introduce tracing in their JavaScript web platform.

"When we inherited a legacy API, Sentry was essential in helping us know exactly which parts of the tech needed the most care and attention. In the past 3 years, we’ve taken that codebase from 400 unit tests to 3400, and still growing, thanks to the information we got from Sentry errors. I’m looking forward to expanding our insights now that we’re starting to really dig into the performance aspect of it too."
Thomas Hockaday, Lead Engineer
