Implementing Event Average Rating with SQLAlchemy

While implementing Open Event Server version 2, we wanted a better way of ranking events by their quality. To define the “quality” of an event, we decided to collect the feedback submitted for it and take the average of the ratings involved. The average rating of an event thus serves as a good (enough) measure of its quality. While there are many ways to implement aggregate relationships in an app, here I demonstrate an approach that stores such aggregates once they’re computed.

Software development always involves a tradeoff between space and computation, and this task was no exception. The first idea that came to my mind was to query the Postgres database every time a request for the average rating was made. This sounds simple, but with hundreds of events stored on a server and potentially thousands of users querying for events, it is computationally expensive: the aggregate would be recomputed for each request, and there could be thousands of such requests at once. A better idea is to compute the aggregate once, store it in the database (giving up some space, but saving a large amount of computation), and update it only when something changes. In our case, the update should happen only when a new rating is added, a rating is deleted, or an existing rating is modified. Since the advantages outweighed the disadvantages, this was the strategy we implemented.
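To make the tradeoff concrete, here is a minimal sketch that uses Python's built-in sqlite3 module instead of Postgres. The table layout and the helper names (average_on_demand, refresh_average, add_feedback, stored_average) are hypothetical and are not taken from Open Event Server; they only contrast recomputing the aggregate on every read with storing it and refreshing it on every write.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (
        id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        average_rating REAL          -- stored aggregate, NULL until rated
    );
    CREATE TABLE feedbacks (
        id INTEGER PRIMARY KEY,
        event_id INTEGER NOT NULL REFERENCES events(id),
        rating REAL NOT NULL
    );
""")


def average_on_demand(event_id):
    """Approach 1: recompute the aggregate on every read."""
    row = conn.execute(
        "SELECT avg(rating) FROM feedbacks WHERE event_id = ?", (event_id,)
    ).fetchone()
    return row[0]


def refresh_average(event_id):
    """Approach 2, write side: recompute once and store the result."""
    conn.execute(
        "UPDATE events SET average_rating = "
        "(SELECT avg(rating) FROM feedbacks WHERE event_id = ?) "
        "WHERE id = ?",
        (event_id, event_id),
    )


def add_feedback(event_id, rating):
    """Insert a rating, then refresh the stored aggregate of its event."""
    conn.execute(
        "INSERT INTO feedbacks (event_id, rating) VALUES (?, ?)",
        (event_id, rating),
    )
    refresh_average(event_id)


def stored_average(event_id):
    """Approach 2, read side: a plain column lookup, no aggregation."""
    return conn.execute(
        "SELECT average_rating FROM events WHERE id = ?", (event_id,)
    ).fetchone()[0]


conn.execute("INSERT INTO events (id, name) VALUES (1, 'Sample event')")
add_feedback(1, 4.0)
add_feedback(1, 5.0)
```

Both approaches return the same number; the difference is that stored_average never scans the feedbacks table, so the cost of aggregation is paid once per write rather than once per read.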

The first step in implementing average rating was to modify the database model of events accordingly. For this, I performed the necessary imports in the events’ database model file:

from sqlalchemy_utils import aggregated
from app.models.feedback import Feedback

Now comes the tricky part. We want an average_rating column in the events table that contains the mean rating of each event. The values in this column should be updated every time a change is made to the feedbacks table. The most direct, low-level tool for this kind of functionality is a Postgres trigger. The trigger should fire after every change to the feedbacks table and update the average rating value in the events table. Here’s what the raw code of such a trigger looks like:

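CREATE OR REPLACE FUNCTION UpdateAverageRating() RETURNS trigger AS
$$
BEGIN
    -- Recompute the mean rating of the event whose feedback changed
    UPDATE events SET average_rating = (
        SELECT avg(rating) FROM feedbacks
        WHERE event_id = NEW.event_id
    )
    WHERE id = NEW.event_id;
    RETURN NEW;
END;
$$
LANGUAGE plpgsql;

-- Attach the function to the feedbacks table.
-- NEW is not available for DELETE; a delete trigger would use OLD.event_id instead.
CREATE TRIGGER update_average_rating
AFTER INSERT OR UPDATE ON feedbacks
FOR EACH ROW EXECUTE PROCEDURE UpdateAverageRating();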
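
The trigger approach can be tried out without a Postgres server. The following is a minimal sketch using Python's built-in sqlite3 module. SQLite's trigger syntax differs from PL/pgSQL (there is no separate function, and each trigger handles one kind of change), and the trigger names and the stored_average helper are hypothetical. It also assumes that a feedback row never moves to a different event.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER PRIMARY KEY, average_rating REAL);
    CREATE TABLE feedbacks (
        id INTEGER PRIMARY KEY,
        event_id INTEGER NOT NULL,
        rating REAL NOT NULL
    );

    -- One trigger per kind of change: insert, update, delete.
    CREATE TRIGGER feedback_inserted AFTER INSERT ON feedbacks
    BEGIN
        UPDATE events SET average_rating =
            (SELECT avg(rating) FROM feedbacks WHERE event_id = NEW.event_id)
        WHERE id = NEW.event_id;
    END;

    CREATE TRIGGER feedback_updated AFTER UPDATE ON feedbacks
    BEGIN
        UPDATE events SET average_rating =
            (SELECT avg(rating) FROM feedbacks WHERE event_id = NEW.event_id)
        WHERE id = NEW.event_id;
    END;

    -- A deleted row is only visible as OLD, not NEW.
    CREATE TRIGGER feedback_deleted AFTER DELETE ON feedbacks
    BEGIN
        UPDATE events SET average_rating =
            (SELECT avg(rating) FROM feedbacks WHERE event_id = OLD.event_id)
        WHERE id = OLD.event_id;
    END;
""")


def stored_average(event_id):
    """Read the stored aggregate; no avg() is computed at read time."""
    return conn.execute(
        "SELECT average_rating FROM events WHERE id = ?", (event_id,)
    ).fetchone()[0]


conn.execute("INSERT INTO events (id) VALUES (1)")
conn.execute("INSERT INTO feedbacks (id, event_id, rating) VALUES (1, 1, 2.0)")
conn.execute("INSERT INTO feedbacks (id, event_id, rating) VALUES (2, 1, 4.0)")
after_insert = stored_average(1)

conn.execute("UPDATE feedbacks SET rating = 5.0 WHERE id = 1")
after_update = stored_average(1)

conn.execute("DELETE FROM feedbacks WHERE id = 2")
after_delete = stored_average(1)
```

The application code here never touches average_rating directly; every write to feedbacks keeps the stored value current, which is the behaviour we want to reproduce from the Python side.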

Fortunately, the translation of such a trigger into SQLAlchemy-speak is not only easy, but also very elegant. The imports I showed above already set the context for this translation.

The event model class looks like the following:

class Event(db.Model):
    """Event object table"""
    __tablename__ = 'events'
    __versioned__ = {
        'exclude': ['schedule_published_on', 'created_at']
    }
    id = db.Column(db.Integer, primary_key=True)
    identifier = db.Column(db.String)
    name = db.Column(db.String, nullable=False)
    external_event_url = db.Column(db.String)

    …

    …

    …

The list of attributes continues, and at the end of this list, we now add a decorated method:

    …
    …
    …
    xcal_url = db.Column(db.String)
    is_sponsors_enabled = db.Column(db.Boolean, default=False)
    discount_code_id = db.Column(db.Integer, db.ForeignKey(
        'discount_codes.id', ondelete='CASCADE'))

    # 'feedbacks' names the relationship to aggregate over; the second
    # argument is the column that stores the computed value.
    @aggregated('feedbacks', db.Column(db.Float))
    def average_rating(self):
        return db.func.avg(Feedback.rating)

That’s all there is to the translation. This slick, decorated method can be thought of as a bridge between Python and the trigger shown earlier, which would usually be implemented in the database itself. Once this method is added, we save the model file and perform a database migration:

$ python manage.py db migrate

This generates a migration file associated with our changes. The file contains the following Alembic migration code:

"""empty message

Revision ID: 1471fe0d04ee
Revises: 49f3a33f5437
Create Date: 2018-06-08 19:32:47.485543

"""

from alembic import op
import sqlalchemy as sa


# revision identifiers, used by Alembic.
revision = '1471fe0d04ee'
down_revision = '49f3a33f5437'


def upgrade():
    op.add_column('events', sa.Column('average_rating', sa.Float(), nullable=True))
    op.add_column('events_version', sa.Column('average_rating', sa.Float(), autoincrement=False, nullable=True))

def downgrade():
    op.drop_column('events_version', 'average_rating')
    op.drop_column('events', 'average_rating')

Now that the file is generated, we upgrade our database state by utilizing this migration file:

$ python manage.py db upgrade

And here are the successful migration logs that immediately follow the upgrade command:

INFO  [alembic.runtime.migration] Context impl PostgresqlImpl.
INFO  [alembic.runtime.migration] Will assume transactional DDL.
INFO  [alembic.runtime.migration] Running upgrade 49f3a33f5437 -> 1471fe0d04ee, empty message

This completes the implementation of the average rating attribute of events. We can use the same technique to implement other aggregate attributes in our server, such as sum and minimum, to name a couple. We saw how elegantly SQLAlchemy handles the mapping from Python code to database commands. This is just one of the many advantages of using object-relational mappers (ORMs), and Open Event Server uses them to the full extent.
