Using S3 for Cloud storage

In this post, I will talk about how we can use the Amazon S3 (Simple Storage Service) for cloud storage. As you may know, S3 is a no-fuss, super easy cloud storage service based on the IaaS model. There is no limit on the size of file or the amount of files you can keep on S3, you are only charged for the amount of bandwidth you use. This makes S3 very popular among enterprises of all sizes and individuals.

Now let’s see how to use S3 in Python. Luckily we have a very nice library called Boto for it. Boto is a library developed by the AWS team to provide a Python SDK for the amazon web services. Using it is very simple and straight-forward. Here is a basic example of uploading a file on S3 using Boto –

import boto
from boto.s3.key import Key
# connect to the bucket
conn = boto.connect_s3(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
bucket = conn.get_bucket(BUCKET_NAME)
# set the key
key = 'key/for/file'
file = '/full/path/to/file'
# create a key to keep track of our file in the storage
k = Key(bucket)
k.key = key
k.set_contents_from_filename(file)

The above example uploads a file to s3 bucket BUCKET_NAME.

Buckets are containers which store data. The key here is the unique key for an item in the bucket. Every item in the bucket is identified by a unique key assigned to it. The file can be downloaded from the urlBUCKET_NAME.s3.amazonaws.com/{key}. It is therefore essential to choose the key name smartly so that you don’t end up overwriting an existing item on the server.

In the Open Event project, I thought of a scheme that will allows us to avoid conflicts. It relies on using IDs of items for distinguishing them and goes as follows –

  • When uploading user avatar, key should be ‘users/{userId}/avatar’
  • When uploading event logo, key should be ‘events/{eventId}/logo’
  • When uploading audio of session, key should be ‘events/{eventId}/sessions/{sessionId}/audio’

Note that to store user ‘avatar’, I am setting the key as /avatar and not /avatar.extension. This is because if user uploads pictures in different formats, we will end up storing different copies of avatars for the same user. This is nice but it’s limitation is that downloading file from the url will give the file without an extension. So to solve this issue, we can use the Content-Disposition header.

k.set_contents_from_filename(
	file,
	headers={
		'Content-Disposition': 'attachment; filename=filename.extension'
	}
)

So now when someone tries to download the file from that link, they will get the file with an extension instead of a no-extension “Choose what you want to do” file.

This covers up the basics of using S3 for your Python project. You may explore Boto’s S3 documentation to find other interesting functions like deleting a folder, copy one folder to another and so.

Also don’t forget to have a look at the awesome documentation we wrote for the Open Event project. It provides a more pictorial and detailed guide on how to setup S3 for your project.

 

{{ Repost from my personal blog http://aviaryan.in/blog/gsoc/s3-for-storage.html }}

Continue ReadingUsing S3 for Cloud storage

Paginated APIs in Flask

Week 2 of GSoC I had the task of implementing paginated APIs in Open Event project. I was aware that DRF provided such feature in Django so I looked through the Internet to find some library for Flask. Luckily, I didn’t find any so I decided to make my own.

A paginated API is page-based API. This approach is used as the API data can be very large sometimes and pagination can help to break it into small chunks. The Paginated API built in the Open Event project looks like this –

{
    "start": 41,
    "limit": 20,
    "count": 128,
    "next": "/api/v2/events/page?start=61&limit=20",
    "previous": "/api/v2/events/page?start=21&limit=20",
    "results": [
    	{
    		"data": "data"
    	},
    	{
    		"data": "data"
    	}
    ]
}

Let me explain what the keys in this JSON mean –

  1. start – It is the position from which we want the data to be returned.
  2. limit – It is the max number of items to return from that position.
  3. next – It is the url for the next page of the query assuming current value of limit
  4. previous – It is the url for the previous page of the query assuming current value of limit
  5. count – It is the total count of results available in the dataset. Here as the ‘count’ is 128, that means you can go maximum till start=121 keeping limit as 20. Also when you get the page with start=121 and limit=20, 8 items will be returned.
  6. results – This is the list of results whose position lies within the bounds specified by the request.

Now let’s see how to implement it. I have simplified the code to make it easier to understand.

from flask import Flask, abort, request, jsonify
from models import Event

app = Flask(__name__)

@app.route('/api/v2/events/page')
def view():
	return jsonify(get_paginated_list(
		Event, 
		'/api/v2/events/page', 
		start=request.args.get('start', 1), 
		limit=request.args.get('limit', 20)
	))

def get_paginated_list(klass, url, start, limit):
    # check if page exists
    results = klass.query.all()
    count = len(results)
    if (count < start):
        abort(404)
    # make response
    obj = {}
    obj['start'] = start
    obj['limit'] = limit
    obj['count'] = count
    # make URLs
    # make previous url
    if start == 1:
        obj['previous'] = ''
    else:
        start_copy = max(1, start - limit)
        limit_copy = start - 1
        obj['previous'] = url + '?start=%d&limit=%d' % (start_copy, limit_copy)
    # make next url
    if start + limit > count:
        obj['next'] = ''
    else:
        start_copy = start + limit
        obj['next'] = url + '?start=%d&limit=%d' % (start_copy, limit)
    # finally extract result according to bounds
    obj['results'] = results[(start - 1):(start - 1 + limit)]
    return obj

Just to be clear, here I am assuming you are using SQLAlchemy for the database. The klass parameter in the above code is the SqlAlchemy db.Model class on which you want to query upon for the results. The url is the base url of the request, here ‘/api/v2/events/page’ and it used in setting the previous and next urls. Other things should be clear from the code.

So this was how to implement your very own Paginated API framework in Flask (should say Python). I hope you found this post interesting.

Until next time.

Ciao

 

{{ Repost from my personal blog http://aviaryan.in/blog/gsoc/paginated-apis-flask.html }}

Continue ReadingPaginated APIs in Flask

Better fields and validation in Flask Restplus

We at Open Event Server project are using flask-restplus for API. Apart from auto-generating of Swagger specification, another great plus point of restplus is how easily we can set input and output models and the same is automatically shown in Swagger UI. We can also auto-validate the input in POST/PUT requests to make sure that we get what we want.

@api.expect(EVENT_POST, validate=True)
def put(self, id):
    """Modify object at id"""
    pass

As can be seen above, the validate param for namespace.expect decorator allows us to auto-validate the input payloads. This used to work well until one day I realized there were a few problems.

  1. When a field was defined as say for example field.Integer, then it will accept only Integer values, not even null.
  2. If there is a string field and it has required param set to True, then also it is possible to set empty string as its value and the in-built validator won’t catch it.
  3. Even if I somehow managed to hack my way to support null in field, it will also support null even if required=True.
  4. We had no control on what error message was returned.
EVENT = api.model('Event', {
    'id': fields.Integer,
    'name': fields.String(required=True)
})

Specially problem #1 was a huge one as it questioned the whole foundation of the API. So we realized it will be better if we don’t use namespace.expect and use a custom validator. For custom validator, we first had to create custom fields that this validator can benefit from. Luckily flask-restplus comes with a great API for creating custom fields. So we quickly created custom fields for all common fields (Integer, String) and more specific fields like Email, Uri and Color. Creating these specific fields were a huge advantage as now we can show proper example for each field types in the Swagger UI.

class Email(fields.String):
    """
    Email field
    """
    __schema_type__ = 'string'
    __schema_format__ = 'email'
    __schema_example__ = 'email@domain.com'

Consider the above code; now when we use Email as a field for a value, then the example shown for it in Swagger UI will be ‘email@domain.com’. Quite cool, right?

Now we needed a way to validate these fields. For that, what we did was to create a validate method in each of the field-classes. This validate method would get the value and check if it was valid. Consider the following code –

import re
EMAIL_REGEX = re.compile(r'S+@S+.S+')

class Email():
	def validate(self, value):
		if not value:
		    return False if self.required else True
		if not EMAIL_REGEX.match(value):
		    return False
		return True

Once each of the field had their validate methods, we created a validate_payload() function that uses the API model and compares it with the payload. It will first check if all required keys are present in the payload or not. When that is true, it finally validates each field’s value using their field’s classvalidate method.

from flask import abort
from flask_restplus import fields
from custom_fields import CustomField

def validate_payload(payload, api_model):
    # check if any reqd fields are missing in payload
    for key in api_model:
        if api_model[key].required and key not in payload:
            abort(400, 'Required field '%s' missing' % key)
    # check payload
    for key in payload:
        field = api_model[key]
        if isinstance(field, fields.List):
            field = field.container
            data = payload[key]
        else:
            data = [payload[key]]
        if isinstance(field, CustomField) and hasattr(field, 'validate'):
            for i in data:
                if not field.validate(i):
                    abort(400, 'Validation of '%s' field failed' % key)

The CustomField is the base class that each of the custom fields mentioned above inherit. So checking if field was an instance of CustomField is enough to know if it is a custom field or not. Other thing that may look weird in the above code is use of fields.List. If you look closely, I have added this to support custom fields inside lists. So if you have used a custom field in a list, it will also work too. But obviously, this only supports single level lists for now. The thing is we didn’t needed more than that so I let it go. :stuck_out_tongue_winking_eye:

This basically sums up how we are validating input payloads at Open Event. Of course this is very basic but we will keep on improving it as the project progresses. Stay tuned to opev blog if you want to be in touch with the progress of the project.

Links to full code at the time of writing this post are –

  1. Custom Fields
  2. Validate Payload

I hope you found this post useful. Thanks for reading.

 

{{ Repost from my personal blog http://aviaryan.in/blog/gsoc/restplus-validation-custom-fields.html }}

Continue ReadingBetter fields and validation in Flask Restplus

REST API Authentication in Flask

Recently I had the challenge of restricting unauthorized personnel from accessing some views in Flask. Sure the naive way will be asking the username and password in the json itself and checking the records in the database. The request will be something like this-

{
	"username": "open_event_user",
	"password": "password"
}

But I wanted to do something better. So I looked up around the Internet and found that it is possible to accept Basic authorization credentials in Flask (sadly it isn’t documented). For those who don’t know what Basic authorization is a way to send plain username:password combo as header in a request after obscuring them with base64 encoding. So for the above username and password, the corresponding header will be –

{
	"Authorization": "Basic b3Blbl9ldmVudF91c2VyOnBhc3N3b3Jk"
}

where the hashed string is base64 encoded form of string “open_event_user:password”.

Now back to the topic, so the next job is to validate the views by checking the Basic auth credentials in header and call abort() if credentials are missing or wrong. For this, we can easily create a helper function that aborts a view if there is something wrong with the credentials.

from flask import request, Flask, abort
from models import UserModel

app = Flask(__name__)

def validate_auth():
	auth = request.authorization
	if not auth:  # no header set
		abort(401)
	user = UserModel.query.filter_by(username=auth.username).first()
	if user is None or user.password != auth.password:
		abort(401)

@app.route('/view')
def my_view():
	validate_auth()
	# stuff on success
	# more stuff

This works but wouldn’t it be nice if we could specify validate_auth function as a decorator. This will give us the advantage of only having to set it once in a model view with all auth-required methods. Right ? So here we go

def requires_auth(f):
	@wraps(f)
	def decorated(*args, **kwargs):
		auth = request.authorization
		if not auth:  # no header set
			abort(401)
		user = UserModel.query.filter_by(username=auth.username).first()
		if user is None or user.password != auth.password:
			abort(401)
		return f(*args, **kwargs)
	return decorated

@app.route('/view')
@requires_auth
def my_view():
	# stuff on success
	# more stuff

I renamed the function from validate_auth to requires_auth because it suits the context better.

At this point, the above code may look perfect but it doesn’t work when you are accessing the API through Swagger web UI. This is because it is not possible to set base64 encoded authorization header from the swagger UI. For those who are wondering “what the hell is swagger”, I will define Swagger as a tool for API based projects which creates a nice web UI to live-test the API and also exports a schema of the API that can be used to understand API definitions.

Now how do we get requires_auth to work when a request is sent through swagger UI ? It was a little tricky and took me a couple of hours but I finally got it. The trick therefore is to check for active sessions when there are no authorization headers set (as in the case of swagger UI). If an active session is found, it means that the user is authenticated. Here I would like to suggest using Flask-Login extension which makes session and login management a child’s play. Always use it if your flask project deals with login, user accounts and stuff.

Now back to the task in hand, here is how we can set the requires_auth function to check for existing sessions.

from flask import request, abort, g
from flask.ext import login

def requires_auth(f):
	@wraps(f)
	def decorated(*args, **kwargs):
		auth = request.authorization
		if not auth:  # no header set
			if login.current_user.is_authenticated:  # check active session
				g.user = login.current_user
				return f(*args, **kwargs)
			else:
				abort(401)
		user = UserModel.query.filter_by(username=auth.username).first()
		if user is None or user.password != auth.password:
			abort(401)
		g.user = user
		return f(*args, **kwargs)
	return decorated

Pretty easy right !! Also notice that I am saving the user who was currently authenticated in flask’s global variable g. Now the authenticated user can be accessed from views as g.user. Cool, isn’t it ? Now if there is a need to add a more secure form of authorization like ‘Token’ based, you can easily update therequires_auth decorator to get the same results.

I hope this article provided valuable insight into managing REST API authorizations in Flask. I will keep posting more awesome things I learn in my GSoC journey.

That’s it. Sayonara.

 

{{ Repost from my personal blog http://aviaryan.in/blog/gsoc/auth-flask-done-right.html }}

Continue ReadingREST API Authentication in Flask

Rest-API implementation in the flask Open Event Application

These days are getting so worked up. For the first time in my life, I have to attend to around 40 emails a day. If you count the replies with them, it will be well over 150. Things have got so busy.

To start off, our project is called Open Event. Open Event is a software suite to help people manage events, summits, conferences etc with relative ease. The organizers will have the power to manage sessions, speakers, tracks and schedule from a clean and user-friendly interface.

They can auto-tweet the sessions, generate a google calendar, post to social networks and do other fancy things with just a click of the button. Then there are android apps for both attendees and organizers. The organizers can use the app to manage the event whereas the attendees can use it to get details about upcoming events, give reviews, vote on stuff and so on. From the web interface, rich data can be shown about the event like the number of sessions, distinct speakers, schedule and other statistics. In other words, a website can be easily generated for the event. Apart from that, the project will feature a rich API which can used by other services to build up on it.

My part

Me and @shivamMg are responsible for the REST API part of the project. The plan is to build a full-proof API that will cover the entire scope of the application.

We have started the work using Flask-Restplus as the framework. The GET API part has been done. Now what is needed is to add the PUT, POST and DELETE verbs.

We will soon get to it and plan to have a basic version ready before the midterm (26th June). Once midterm is done, we will work on enhancing the API, finding and fixing bug cases, writing docs and improving overall project quality.

This is getting so much fun. The next 3 months will be full of heavy coding. Probably my GitHub streak will cross 100 days :stuck_out_tongue_winking_eye:. I am learning so much with every day that passes.

As many as 10 people are working on the Open Event project this summer. Working in such a big group project will be a new experience for me and I am so glad to be a part of it.

That’s all for now. See ya !!

{{ Repost from my personal blog http://aviaryan.in/blog/gsoc/so-it-begins.html }}

Continue ReadingRest-API implementation in the flask Open Event Application