Aggregation & grouping¶
Aggregation lets you compute values across many rows instead of fetching them one by one. With yara_orm you build aggregates such as Count, Sum, Avg, Min and Max, attach them to a query as annotations, and optionally group by one or more columns. Everything stays lazy and chainable until you await it, keeping aggregation idiomatic in an async Python ORM.
The models used throughout this guide:
from yara_orm import Model, fields
class Author(Model):
id = fields.IntField(pk=True)
name = fields.CharField(max_length=120)
class Book(Model):
id = fields.IntField(pk=True)
title = fields.CharField(max_length=200)
rating = fields.DecimalField(max_digits=3, decimal_places=1, default=0)
author = fields.ForeignKeyField("Author", related_name="books")
Aggregate functions¶
The aggregate expressions are imported directly from yara_orm:
| Aggregate | SQL | Typical target |
|---|---|---|
Count |
COUNT | a relation (reverse FK / M2M) or column |
Sum |
SUM | a numeric column |
Avg |
AVG | a numeric column |
Min |
MIN | any orderable column |
Max |
MAX | any orderable column |
Every aggregate shares the same constructor:
field— the name of a column (e.g."rating") or a relation (e.g."books", a reverse foreign key). When the target is a relation, the query compiler adds the necessaryJOINfor you.distinct— whenTrue, the aggregate counts/aggregates distinct values only, compiling toCOUNT(DISTINCT ...).
Annotating a query¶
.annotate(**annotations) adds computed columns to a query. Each keyword becomes the output name; each value is an aggregate expression.
# Count over a relation (reverse FK): how many books each author has
qs = Author.annotate(book_count=Count("books"))
# Aggregate over a column: the average rating of all books
Book.annotate(avg_rating=Avg("rating"))
You can attach several annotations at once:
Reading annotated results¶
Awaiting an annotated queryset returns ordinary model instances, with each annotation set as an attribute named after its keyword:
for author in await Author.annotate(book_count=Count("books")):
print(author.name, author.book_count)
Projecting with .values()
When you only need the computed numbers (not full model instances), project with .values(...) to get plain dicts, or .values_list(...) for tuples:
Grouping with group_by¶
.group_by(*fields) groups the result rows by the given columns. Combine it with an annotation and a projection to produce one aggregated row per group:
rows = (
await Book.annotate(total=Sum("rating"))
.group_by("author_id")
.values("author_id", "total")
)
# [{"author_id": 1, "total": Decimal("8.0")}, {"author_id": 2, "total": Decimal("4.0")}]
Aggregating over the whole table
Calling .group_by() with no arguments collapses every row into a single group, which is handy for table-wide statistics:
Filtering annotations → HAVING¶
yara_orm decides between WHERE and HAVING by what you filter on:
- Filtering by a normal field adds a
WHEREcondition (applied before grouping). - Filtering by an annotation name adds a
HAVINGcondition (applied after the aggregate is computed).
# HAVING COUNT(...) >= 1 — keep only authors that have at least one book
Author.annotate(books=Count("books")).filter(books__gte=1)
Because books is an annotation, the books__gte=1 lookup compiles to HAVING. If you had filtered on a column such as name__startswith="A", that would compile to WHERE instead. The two can be mixed freely in a single .filter(...) call.
Putting it together¶
A realistic report: group authors, count their books, average the ratings, keep only authors with at least one book, and order by the busiest author first.
from yara_orm import Count, Avg
authors = (
await Author.annotate(
book_count=Count("books"),
avg_rating=Avg("books__rating"),
)
.filter(book_count__gte=1) # HAVING COUNT(books) >= 1
.order_by("-book_count") # busiest authors first
)
for author in authors:
print(author.name, author.book_count, author.avg_rating)
Here Avg("books__rating") reaches across the books relation to the related rating column using the relation__column path, filter(book_count__gte=1) becomes a HAVING clause, and order_by("-book_count") sorts by the annotation in descending order.