Django and the Notorious N+1


Welcome back to another episode of Sentry’s Snack of the Week.

In today’s episode we’re going to join Adam as he talks a little bit about the common N+1 problem.

How’s it going, everyone? My name is Adam, and by day I’m one of the Engineering Managers here at Sentry working on our performance tooling. By night, I like to pretend to be a developer. Today I’m going to quickly walk you through a performance issue I discovered while building a way to manage your personal expenses. I also wrote about this in a blog post.

In one part of my application, I needed to pull up a bunch of expense reports, loop through them, and show each expense to the user. Since Django is my framework of choice, I knew I could use its ORM to pull the data, send it down to a template, and then loop through it and use a template tag to display the data.

After releasing the code, I went to Sentry to see how things were going and I noticed a performance issue. Now, this is a project that I work on… a lot… so I went straight to the code to see if I could find any obvious performance issues. I looked at my code for the normal issues:

  • Debug statements
  • Random sleeps left in for performance tutorials
  • Gremlins

I couldn’t find anything.

Luckily, I could go back into Sentry and check out that specific transaction and see what was going on. I was a little surprised by what I saw because I was only fetching the data once and, yet, I saw a bunch of different database calls. But I guess I wasn’t that surprised because I had just encountered an N+1 problem.
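To make the pattern concrete, here is a minimal sketch of what an N+1 looks like at the database level. It uses Python's built-in sqlite3 module rather than Django, and the `expense` and `category` tables, the `run` helper, and the sample rows are all assumptions made up for illustration, not the schema from the actual application.

```python
import sqlite3

# Hypothetical schema: each expense points at a category via a foreign key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE expense (
        id INTEGER PRIMARY KEY,
        amount REAL,
        category_id INTEGER REFERENCES category(id)
    );
    INSERT INTO category VALUES (1, 'Food'), (2, 'Travel');
    INSERT INTO expense VALUES (1, 12.5, 1), (2, 80.0, 2), (3, 4.25, 1);
""")

executed = []  # log of every SQL statement we send


def run(sql, params=()):
    """Execute a query, record it in the log, and return all rows."""
    executed.append(sql)
    return conn.execute(sql, params).fetchall()


# One query fetches the list of expenses...
expenses = run("SELECT id, amount, category_id FROM expense ORDER BY id")

# ...then the loop issues one more query per row, much like touching a
# foreign key attribute on each object inside a template loop.
lines = []
for expense_id, amount, category_id in expenses:
    (name,) = run("SELECT name FROM category WHERE id = ?", (category_id,))[0]
    lines.append(f"{name}: {amount}")

print(len(executed))  # 1 list query + N per-row queries
```

With three expenses the log holds four statements: the "1" is the list query and the "N" is the per-row lookups, which is why the count grows with the size of the dataset.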

That’s when I realized I had forgotten two things:

  • Pagination
  • To use select_related

It’s always a good idea to paginate your results when you’re looping through a large dataset. This is good both because it gives your user a digestible amount of information and because it caps the number of database calls a page can make if you do have an N+1 problem.
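As a rough illustration of how pagination caps the damage, the sketch below pages through the list with `LIMIT` and `OFFSET` while leaving the per-row lookup in place. It again uses sqlite3 instead of Django, and the tables, the `render_page` function, and the page size of 10 are assumptions for the example only.

```python
import sqlite3

# Hypothetical data: 100 expenses that all belong to one category.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE expense (
        id INTEGER PRIMARY KEY,
        amount REAL,
        category_id INTEGER REFERENCES category(id)
    );
    INSERT INTO category VALUES (1, 'Food');
""")
conn.executemany(
    "INSERT INTO expense (amount, category_id) VALUES (?, 1)",
    [(float(i),) for i in range(1, 101)],
)


def render_page(page, page_size=10):
    """Return (rows, query_count) for one page of the still-unfixed N+1 view."""
    offset = (page - 1) * page_size
    expenses = conn.execute(
        "SELECT id, amount, category_id FROM expense "
        "ORDER BY id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()
    query_count = 1  # the list query above

    rows = []
    for expense_id, amount, category_id in expenses:
        # The per-row lookup is still here, but it runs at most page_size times.
        name = conn.execute(
            "SELECT name FROM category WHERE id = ?", (category_id,)
        ).fetchone()[0]
        query_count += 1
        rows.append((expense_id, name, amount))
    return rows, query_count


rows, query_count = render_page(page=2)
print(len(rows), query_count)
```

Each page now costs at most `page_size + 1` queries no matter how many expenses exist. The N+1 is still there, it is just bounded.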

The second, and more useful, solution is to use Django’s select_related. select_related follows foreign key relationships and pulls the related data in the first query, instead of doing one query to list all of the data and then separate queries to pull each of the individual items.
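The sketch below shows roughly what that single query looks like in plain SQL, since select_related works by adding a SQL join. It uses sqlite3 so it runs without a Django project. The `expense` and `category` tables are hypothetical, and the Django call in the comment assumes an `Expense` model with a `category` foreign key, which is an illustration rather than the real application's code.

```python
import sqlite3

# Hypothetical schema: each expense points at a category via a foreign key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE expense (
        id INTEGER PRIMARY KEY,
        amount REAL,
        category_id INTEGER REFERENCES category(id)
    );
    INSERT INTO category VALUES (1, 'Food'), (2, 'Travel');
    INSERT INTO expense VALUES (1, 12.5, 1), (2, 80.0, 2), (3, 4.25, 1);
""")

# Rough Django equivalent, with assumed model and field names:
#     Expense.objects.select_related("category")
# One joined query returns each expense together with its category name,
# so the loop below never has to go back to the database.
joined = conn.execute("""
    SELECT expense.id, expense.amount, category.name
    FROM expense
    JOIN category ON category.id = expense.category_id
    ORDER BY expense.id
""").fetchall()

lines = [f"{name}: {amount}" for _expense_id, amount, name in joined]
print(lines)
```

The output matches what the per-row version would produce, but the number of queries stays at one regardless of how many expenses are listed.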

After updating my code, I deployed it and went back to Sentry to see how things were going. I went to that transaction and noticed that the duration had dropped from 3.5 seconds back down to 300 milliseconds. So that’s a win in my book.

N+1 problems are practically unavoidable. If you’re using Django, you will eventually encounter one of these. So the things to think about are to paginate your results, which limits the maximum number of queries a page can run, and to use things like select_related or prefetch_related while thinking about how your queries are built.

After that, I closed my laptop, poured myself a cold drink, and patted myself on the back. I still got this.

And don’t forget to like, subscribe, and follow us on YouTube. You don’t want to miss any more of these Sentry Snacks of the Week.


Featuring

  • Sarah Guthals

    Director of DevRel, Sentry

  • Adam McKerlie

© 2024 • Sentry is a registered Trademark
of Functional Software, Inc.