I had many time the need to optimise some performance issues with some ORM queries, then many times the prefetch_related and select_related were my best friends for this topic. But when calling some business logic which reside in some model's methods which try to calculate or to fetch some related model data the optimisation will not have any effect as these method are using filter
or some queryset aggregations Sum
or Count
...
The solution to this problem is either to redefine the logic to use prefetched data or more clean to make these methods now if there already pre-feteched objects, that's the best solution in my cases:
A basic models example models.py
:
class Teacher(models.Model):
name = models.CharField(max_length=255)
def students_count(self):
return self.student_set.filter(age__gte=20).count()
class Student(models.Model):
name = models.CharField(max_length=255)
teacher = models.ForeignKey(Teacher)
age = models.IntegerField(default=20)
for t in Teacher.objects.all():
print(teacher.students_count())
# This will results to the number of teachers + 1 SQL queries
Let's use prefetch_related
:
for t in Teacher.objects.all().prefetch_related('student_set'):
print(teacher.students_count())
# Our prefetch have no effect as the method students_count is using filter which
# will ignore the prefetched objects
We have to fix the method students_count
to consider checking for prefetched objects:
class Teacher(models.Model):
name = models.CharField(max_length=255)
def students_count(self):
if hasattr(self, '_prefetched_objects_cache') and 'student' in self._prefetched_objects_cache:
return len([x for x in self.student_set.all() if x.age >= 20])
return self.student_set.filter(age__gte=20).count()
This way even calling students_count
in a single instance without calling prefetch_related
on the queryset will use the filter instead of the prefetched, this can be useful in many cases but I better to use this only when you need to.
Thanks for reading