Recently, it was discovered by a contributor that the rating attribute for event feedbacks in Open Event was of type String. The type was incorrect, indeed. After a discussion, developers came concluded that it should be of type Float. In this post, I explain how to perform this simple migration task of changing a data type across a typical Flask app’s stack.
To begin this change, we first, we modify the database model. The model file for feedbacks (feedback.py) looks like the following:
from app.models import db class Feedback(db.Model): """Feedback model class""" __tablename__ = 'feedback' id = db.Column(db.Integer, primary_key=True) rating = db.Column(db.String, nullable=False) # ←-- should be float comment = db.Column(db.String, nullable=True) user_id = db.Column(db.Integer, db.ForeignKey('users.id', ondelete='CASCADE')) event_id = db.Column(db.Integer, db.ForeignKey('events.id', ondelete='CASCADE')) def __init__(self, rating=None, comment=None, event_id=None, user_id=None): self.rating = rating # ←-- cast here for safety self.comment = comment self.event_id = event_id self.user_id = user_id … … …
The change here is quite straightforward, and spans just 2 lines:
rating = db.Column(db.Float, nullable=False)
and
self.rating = float(rating)
We now perform the database migration using a couple of manage.py commands on the terminal. This file is different for different projects, but the migration commands essentially look the same. For Open Event Server, the manage.py file is at the root of the project directory (as is conventional). After cd’ing to the root, we execute the following commands:
$ python manage.py db migrate
and then
$ python manage.py db upgrade
These commands update our Open Event database so that the rating is now stored as a Float. However, if we execute these commands one after the other, we note that an exception is thrown:
sqlalchemy.exc.ProgrammingError: column "rating" cannot be cast automatically to type float HINT: Specify a USING expression to perform the conversion. 'ALTER TABLE feedback ALTER COLUMN rating TYPE FLOAT USING rating::double precision'
This happens because the migration code is ambiguous about what precision to use after converting to type float. It hints us to utilize the USING clause of PostgreSQL to do that. We accomplish this manually by using the psql client to connect to our database and command it the type change:
$ psql oevent psql (10.1) Type "help" for help. oevent=# ALTER TABLE feedback ALTER COLUMN rating TYPE FLOAT USING rating::double precision
We now exit the psql shell and run the above migration commands again. We see that the migration commands pass successfully this time, and a migration file is generated. For our migration, the file looks like the following:
from alembic import op import sqlalchemy as sa # These values would be different for your migrations. revision = '194a5a2a44ef' down_revision = '4cac94c86047' def upgrade(): op.alter_column('feedback', 'rating', existing_type=sa.VARCHAR(), type_=sa.Float(), existing_nullable=False) def downgrade(): op.alter_column('feedback', 'rating', existing_type=sa.Float(), type_=sa.VARCHAR(), existing_nullable=False)
This is an auto-generated file (built by the database migration tool Alembic) and we need to specify the extra commands we used while migrating our database. Since we did use an extra command to specify the precision, we need to add it here. PostgreSQL USING clause can be added to alembic migration files via the postgresql_using keyword argument. Thus, the edited version of the upgrade function looks like the following:
def upgrade(): op.alter_column('feedback', 'rating', existing_type=sa.VARCHAR(), type_=sa.Float(), existing_nullable=False, postgresql_using='rating::double precision')
This completes our work on database migration. Migration files are useful for a variety of purposes – they allow us to easily get to a previous database state, or a new database state as suggested by a project collaborator. Just like git, they allow for easy version control and collaboration.
We didn’t finish this work after database migration. We also decided to impose limits on the rating value. We concluded that 0-5 would be a good range for rating. Furthermore, we also decided to round off the rating value to the “nearest 0.5”, so if the input rating is 2.3, it becomes 2.5. Also, if it is 4.1, it becomes 4.0. This was decided because such values are conventional for ratings across numerous web and mobile apps. So this will hopefully enable easier adoption for new users.
For the validation part, marshmallow came to rescue. It is a simple object serialization and deserialization tool for Python. So it basically allows to convert complex Python objects to JSON data for communicating over HTTP and vice-versa, among other things. It also facilitates pre-processing input data and therefore, allows clean validation of payloads. In our case, marshmallow was specifically used to validate the range of the rating attribute of feedbacks. The original feedbacks schema file looked like the following:
from marshmallow_jsonapi import fields from marshmallow_jsonapi.flask import Schema, Relationship from app.api.helpers.utilities import dasherize class FeedbackSchema(Schema): """ Api schema for Feedback Model """ class Meta: """ Meta class for Feedback Api Schema """ type_ = 'feedback' self_view = 'v1.feedback_detail' self_view_kwargs = {'id': '<id>'} inflect = dasherize id = fields.Str(dump_only=True) rating = fields.Str(required=True) # ← need to validate this comment = fields.Str(required=False) event = Relationship(attribute='event', self_view='v1.feedback_event', self_view_kwargs={'id': '<id>'}, related_view='v1.event_detail', related_view_kwargs={'feedback_id': '<id>'}, schema='EventSchemaPublic', type_='event') … …
To validate the rating attribute, we use marshmallow’s Range class:
from marshmallow.validate import Range
Now we change the line
rating = fields.Str(required=True)
to
rating = fields.Float(required=True, validate=Range(min=0, max=5))
So with marshmallow, just about 2 lines of work implements rating validation for us!
After the validation part, what’s remaining is the rounding-off business logic. This is simple mathematics, and for getting to the “nearest 0.5” number, the formula goes as follows:
rating * 2 --> round off --> divide the result by 2
We will use Python’s built-in function (BIF) to accomplish this. To implement the business logic, we go back to the feedback model class and modify its constructor. Before this type change, the constructor looked like the following:
def __init__(self, rating=None, comment=None, event_id=None, user_id=None): self.rating = rating self.comment = comment self.event_id = event_id self.user_id = user_id
We change this by first converting the input rating to float, rounding it off and then finally assigning the result to feedback’s rating attribute. The new constructor is shown below:
def __init__(self, rating=None, comment=None, event_id=None, user_id=None): rating = float(rating) self.rating = round(rating*2, 0) / 2 # Rounds to nearest 0.5 self.comment = comment self.event_id = event_id self.user_id = user_id
This completes the rounding-off part and ultimately concludes rating’s type change from String to Float. We saw how a simple high-level type change requires editing code across multiple files and the use of different tools in between. In doing so, we thus also learned the utility of alembic and marshmallow in database migration and data validation, respectively.
Resources