Open Event Server – Change a Column from NULL to NOT NULL

FOSSASIA‘s Open Event Server uses alembic migration files to handle all database operations and updating. Whenever the database is changed a corresponding migration python script is made so that the database will migrate accordingly for other developers as well. But often we forget that the automatically generated script usually just add/deletes columns or alters the column properties. It does not handle the migration of existing data in that column. This can lead to huge data loss or error in migration as well.

For example :

def upgrade():
    # ### commands auto generated by Alembic - please adjust! ###
    op.alter_column('ticket_holders', 'lastname',
                    existing_type=sa.VARCHAR(),
                    nullable=False)
    # ### end Alembic commands ###

Here, the goal was to change the column “ticket_holders” from nullable to not nullable. The script that alembic autogenerated just uses op.alter_column().

It does not count for the already existing data. So, if the column has any entries which are null, this migration will lead to an error saying that the column contains null entries and hence cannot be “NOT NULL”.

How to Handle This?

Before altering the column definition we can follow the following steps :

  1. Look for all the null entries in the column
  2. Give some arbitrary default value to those
  3. Now we can safely alter the column definition

Let’s see how we can achieve this. For connecting with the database we will use SQLAlchemy. First, we get a reference to the table and the corresponding column that we wish to alter.

ticket_holders_table = sa.sql.table('ticket_holders',
                                        sa.Column('lastname', sa.VARCHAR()))

 

Since we need the “last_name” column from the table “ticket_holders”, we specify it in the method argument.

Now, we will give an arbitrary default value to all the originally null entries in the column. In this case, I chose to use a space character.

op.execute(ticket_holders_table.update()
               .where(ticket_holders_table.c.lastname.is_(None))
               .values({'lastname': op.inline_literal(' ')}))

op.execute() can execute direct SQL commands as well but we chose to go with SQLAlchemy which builds an optimal SQL command from our modular input. One such example of a complex SQL command being directly executed is :

op.execute('INSERT INTO event_types(name, slug) SELECT DISTINCT event_type_id, lower(replace(regexp_replace(event_type_id, \'& |,\', \'\', \'g\'), \' \', \'-\')) FROM events where not exists (SELECT 1 FROM event_types where event_types.name=events.event_type_id) and event_type_id is not null;'))

Now that we have handled all the null data, it is safe to alter the column definition. So we proceed to execute the final statement –

op.alter_column('ticket_holders', 'lastname',
                    existing_type=sa.VARCHAR(),
                    nullable=False)

Now the entire migration script will run without any error. The final outcome would be –

  1. All the null “last_name” entries would be replaced by a space character
  2. The “last_name” column would now be a NOT NULL column.

References

Continue Reading

Working With Inter-related Resource Endpoints In Open Event API Server

In this blogpost I will discuss how the resource inter-related endpoints work. These are the endpoints which involve two resource objects which are also related to another same resource object.


The discussion in this post is of the endpoints related to Sessions Model. In the API server, there exists a relationship between event and sessions. Apart from these, session also has relationships with microlocations, tracks, speakers and session-types. Let’s take a look at the endpoints related with the above.

`/v1/tracks/<int:track_id>/sessions` is a list endpoint which can be used to list and create the sessions related to a particular track of an event. To get the list we define the query() method in ResourceList class as such:

The query method is executed for GET requests, so this if clause looks for track_id in view_kwargs dict. When the request is made to `/v1/tracks/<int:track_id>/sessions`, track id will be present as ‘track_id in the view_kwargs. The tracks are filtered based on the id passed here and then joined on the query with all sessions object from database.

For the POST method, we need to add the track_id from view_kwargs to pass into the track_id field of database model. This is achieved by using flask-rest-jsonapi’s before_create_object() method. The implementation for track_id is the following:

When a POST request is made to `/v1/tracks/<int:track_id>/sessions`, the view_kwargs dict will have ‘track_id’ in it. So if track_id is present in the url params, we first ensure that a track with the passed id exists, then only proceed to create a sessions object under the given track. Now the safe_query() method is a generic custom method written to check for such things. The model is passed along with the filter attribute, and a field to include in the error message. This method throws an ObjectNotFound exception if no such object exists for a given id.

We also need to take care of the permissions for these endpoints. As the decorators are called even before schema validation, it was difficult to get the event_id for permissions unless adding highly endpoint-specific code in the permission manager core, leading to loss of generality.  So the leave_if parameter of permission check was used to overcome this issue. Since the permissions manager isn’t fully developed yet, this is to be changed in the improved implementation of the permissions manager.

Similar implementations for micro locations and session types was done. All the same is not explained in this blogpost. For extended code, take a look at the source code of this schema.

For speaker relation, a few things were different. This is because speakers-sessions is a many-to-many relationship. Let’s take a look at this:

As it is a many-to-many relationship, a association_table was used with flask-sqlalchemy. So for the query() method, the same association table is queried after extracting the speaker_id from view_kwargs dict.

For the POST request on `/v1/speakers/<int:speaker_id>/sessions` , flask-rest-jsonapi’s after_create_object() method is used to insert the request in association_table. In this method the parameters are the following: self, obj, data, view_kwargs

Now view_kwargs contains the url parameters, so we make a check for speaker_id in view_wargs. If it is present, then before proceeding to insert data, we ensure that a speaker exists with that id using the safe_query() method as described above. After that, the obj argument of the method is used. This contains the object that was created in  previous method. So now once the sessions object has been created and we are sure that a speaker exists with given speaker_id, this is just to be appended to obj.speakers  so that this relationship tuple is inserted into the association table.

This updates the association table ‘speakers_session’ in this case. The other such endpoints are being worked upon in a similar fashion and will be consolidated as part of a set of robust APIs, along with the improved permissions manager for the Open Event API Server.

Additional Resources

Code, Issues and Pull Request involved

Continue Reading
Close Menu
%d bloggers like this: