I had many time the need to optimize some performance issues with Django ORM queries, then many times the prefetch_related and select_related were my best tool for this problem.
But when calling some business logic which reside in some model's methods which try to calculate or to fetch some related model data the optimization will not have any effect as these method are using
filter or some queryset aggregations
The solution to this problem is either to redefine the logic to use prefetched data or more clean to make these methods now if there already prefetched objects, that's the best solution in my cases:
A basic models example
class Teacher(models.Model): name = models.CharField(max_length=255) def students_count(self): return self.student_set.filter(age__gte=20).count() class Student(models.Model): name = models.CharField(max_length=255) teacher = models.ForeignKey(Teacher) age = models.IntegerField(default=20)
for t in Teacher.objects.all(): print(teacher.students_count()) # This will results into multiple DB queries equal to the number of teachers + 1 # If we have 100 teachers in the DB, this will be causing 101 queries.
for t in Teacher.objects.all().prefetch_related('student_set'): print(teacher.students_count()) # Our prefetch have no effect as the method students_count is using filter which # will ignore the prefetched objects
We have to fix the method
students_count to consider checking for prefetched objects:
class Teacher(models.Model): name = models.CharField(max_length=255) def students_count(self): if hasattr(self, '_prefetched_objects_cache') and 'student' in self._prefetched_objects_cache: return len([x for x in self.student_set.all() if x.age >= 20]) return self.student_set.filter(age__gte=20).count()
This way even calling
students_count in a single instance without calling
prefetch_related on the queryset will use the filter instead of the prefetched.