About Factory
Factory is the enterprise platform for autonomous software engineering. They have built a 24/7 system that humans configure and run continuously, producing software that adheres to enterprise standards and governance, and giving engineering teams a fleet of autonomous AI agents called Droids. Each Droid is designed to handle a specific stage of the development lifecycle and to work with any LLM, in any IDE, and in any interface. Droids clear the bottlenecks that slow down large organizations, such as migrations, refactors, testing, documentation, code review, and incident response. Factory connects every stage of the software development lifecycle into a continuous feedback loop the company calls the software factory.
Sentry is woven into Factory’s workflow in two distinct ways: as the monitoring platform that helps their developers find and fix issues faster, and as production context for their Droids.
The Challenge
Running an agent-native development platform introduces a category of reliability challenges that didn’t exist before. When a Droid takes action on production code (opening a pull request or applying a fix) it needs to understand not just that something broke, but why, in what context, and whether the error is new or recurring.
For Factory, that meant capturing the full production context of a failure: the user action that triggered it, the services it touched, and the exact line of code responsible.
“Sentry’s broad SDK coverage empowers Factory to track errors across every customer touchpoint. With their APIs, MCP server, and CLI, errors are easily accessible to our Droids, which can autonomously handle them end-to-end.”
— Alvin Sng, Member of Technical Staff
Before Sentry
Factory came to Sentry at a familiar inflection point for fast-growing companies: close to a new product launch and needing a cross-platform observability solution immediately.
Previously, the team relied on server-side monitoring, but they were launching a new web interface built on a new codebase with new primitives. It was the right moment to rethink their approach.
Several of Factory’s engineers had used Sentry before and knew it could deliver the simple instrumentation and multi-platform support they needed.
How Factory Uses Sentry
Monitoring the platform
Factory runs Sentry across its own infrastructure to monitor the reliability of the platform that Droids run on. Every error and release is tracked, giving the engineering team an immediate line of sight when something breaks.
Factory processes hundreds of millions of events per month in Sentry (spanning errors and performance transactions) across 8 projects that cover their full platform, including:
- The Factory desktop agent, built on Electron
- Their Next.js-powered web frontend and internal tooling
- Node.js backend services handling core API and AI orchestration
- AWS Lambda functions for serverless infrastructure
GitHub and Slack integrations are wired directly into Factory’s incident response workflow, so when an issue occurs, the right person or agent is notified immediately through the right channel.
Production context for Droids
The Factory team connects Sentry directly to its internal software factory. When an error fires in Sentry, a Droid can pull the issue details, check how many users are affected, replay the session, and suggest a code fix, all without a human opening a browser.
The same capability extends to Factory’s customers. Factory’s Droids can integrate directly with a customer’s Sentry environment – reading error events, stack traces, breadcrumbs, and issue metadata to trace the root cause and generate a fix. In many cases, a pull request is opened autonomously.
The structured nature of Sentry’s error grouping is what makes this possible at scale.
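As a rough illustration of what "pulling the issue details" might look like on the agent side, the sketch below condenses an issue payload into a few triage facts: what broke, how many users are affected, and whether the issue is new or recurring. The field names (`title`, `culprit`, `count`, `userCount`, `firstSeen`) are assumptions modeled on Sentry's issue JSON, the `summarize_issue` helper and the sample payload are hypothetical, and none of this is Factory's actual implementation.

```python
from datetime import datetime, timedelta, timezone


def summarize_issue(issue: dict, now: datetime, new_window_hours: int = 24) -> dict:
    """Condense an issue payload into the facts an agent needs for triage.

    Field names are assumed, modeled on Sentry's issue JSON.
    """
    # Normalize a trailing "Z" so fromisoformat accepts the timestamp.
    first_seen = datetime.fromisoformat(issue["firstSeen"].replace("Z", "+00:00"))
    # Treat anything first seen inside the window as a new issue.
    is_new = now - first_seen <= timedelta(hours=new_window_hours)
    return {
        "title": issue["title"],
        "culprit": issue.get("culprit"),
        "events": int(issue["count"]),
        "users_affected": int(issue["userCount"]),
        "status": "new" if is_new else "recurring",
    }


# Hypothetical payload, for illustration only.
example_issue = {
    "title": "Token refresh failed",
    "culprit": "auth/refresh.ts",
    "count": "1284",
    "userCount": 97,
    "firstSeen": "2024-05-01T12:00:00Z",
}

summary = summarize_issue(
    example_issue, now=datetime(2024, 5, 1, 18, 0, tzinfo=timezone.utc)
)
```

A compact summary like this gives an agent a stable, structured input to reason over, rather than a raw event stream.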
Catching production-only errors the instant they surface
After a production release, Sentry surfaced a "FAILED_PRECONDITION: query requires an index" error from a newly activated code path. A recently enabled feature was running a database query that depended on a composite index, one that only became necessary once the new code path was live, a condition that by its nature couldn’t surface beforehand.
Sentry caught the error, and Factory’s incident response agent remediated the issue within minutes of the release. But it wasn’t just the detection that made resolution fast. The required index definition was embedded directly in the error payload, giving Factory’s agents everything needed to diagnose the issue and create the missing index immediately.
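The source does not describe the exact shape of that payload. As a minimal sketch, assuming the error message carries a link to the required index definition, an agent could pull it out as shown below. The message text, the URL, and the `extract_remediation_link` helper are all hypothetical.

```python
import re
from typing import Optional

# Matches the first http(s) URL in a block of text.
URL_PATTERN = re.compile(r"https?://\S+")


def extract_remediation_link(error_message: str) -> Optional[str]:
    """Return the index-creation link from a missing-index error, if present."""
    if "FAILED_PRECONDITION" not in error_message:
        return None
    if "requires an index" not in error_message:
        return None
    match = URL_PATTERN.search(error_message)
    if match is None:
        return None
    # Drop trailing punctuation that belongs to the sentence, not the URL.
    return match.group(0).rstrip(".,)")


# Hypothetical error message, for illustration only.
message = (
    "FAILED_PRECONDITION: query requires an index. "
    "Create it here: https://console.example.com/indexes?create=abc123"
)
link = extract_remediation_link(message)
```

Returning `None` for anything that is not a missing-index error keeps the helper from acting on unrelated failures.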
Confident triage during an upstream outage
In another incident, Factory’s on-call engineer was paged about a sudden spike in API errors on an integration service. Because Sentry groups errors by root cause and surfaces volume and stack trace together, the engineer could see within seconds that the spike was overwhelmingly a single issue: a surge of "Token refresh failed" events, all traced back to a single file in the authentication path returning the same invalid_grant response from an upstream auth provider.
This gave the team an immediate, confident diagnosis: the surge came from a third-party provider, not a regression in Factory’s own code, and the application was degrading gracefully throughout.
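The check behind that diagnosis, whether a spike is many unrelated failures or one dominant issue, can be sketched as a simple tally over grouped events. The event shape, the sample data, and the `dominant_issue` helper are hypothetical; in practice Sentry performs the grouping itself.

```python
from collections import Counter


def dominant_issue(events: list[dict]) -> tuple[str, float]:
    """Return the most frequent issue and its share of all events."""
    if not events:
        raise ValueError("no events to analyze")
    counts = Counter(event["issue"] for event in events)
    issue, occurrences = counts.most_common(1)[0]
    return issue, occurrences / len(events)


# Hypothetical spike: 47 events from one issue, 3 from another.
events = (
    [{"issue": "Token refresh failed", "file": "auth/refresh.ts"}] * 47
    + [{"issue": "Request timeout", "file": "api/client.ts"}] * 3
)
issue, share = dominant_issue(events)
```

A high share for one issue points to a single cause, while a flat distribution would suggest a broader regression worth deeper investigation.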
In both cases, Sentry didn’t just flag that something broke; it pinpointed exactly what broke and where, letting the team avoid unnecessary investigation cycles and resolve issues quickly.
The Future is Autonomous
Factory’s exponential growth shows that reliability is a competitive advantage. With a growing enterprise customer base across financial services, healthcare, and technology, the team is building toward a future where error monitoring doesn’t live in a dashboard that engineers open. It’s infrastructure that agents consume: a continuous signal that keeps autonomous systems honest, and gives the humans supervising them the confidence to let the Droids do their work.