Science Hack Day India 2016

Announcing Science Hack Day India – 2016

We are excited to announce the first Science Hack Day India. The event will take place on 22-23 October 2016 in Belgaum, a small city surrounded by splendid nature in the state of Karnataka, India.

We welcome you all to join us at SHD India. Let’s collaborate, learn, hack, build cool stuff and have lots of fun.

Registration is now open on Eventbrite.

For more announcements, follow us on Facebook and Twitter.

What is Science Hack Day?

Science Hack Day is a two-day event where anyone excited about making weird, silly or serious things with science comes together in the same physical space to see what they can prototype within 30 consecutive hours. Designers, developers, scientists and anyone who is excited about making things with science are welcome to attend – no experience in science or hacking is necessary, just an insatiable curiosity.

The mission of Science Hack Day is to get excited and make things with science! People organically form multidisciplinary teams over the course of a weekend: particle physicists team up with designers, marketers join forces with open source rocket scientists, writers collaborate with molecular biologists, and developers partner with school kids. By collaborating on focused tasks during this short period, small groups of hackers are capable of producing remarkable results.

Venue

We have an amazing venue called Sankalp Bhumi Farm Resort for this event. It was once an abandoned quarry; today, nature's glory has been restored. The resort resembles an enchanting oasis, with a thick stand of trees, sprawling lawns, and a large lagoon surrounded by picturesque, expansive rock walls as a backdrop.


Tentative Program

Day 1:

09:00 Arrive, check-in, eat breakfast (provided)
10:00 Welcome, introductions
10:30 Hacking begins!
10:45 Lightning talks
12:00 Lunch (provided)
13:00 Hacking continues
18:00 Doors close

Day 2:

09:00 Doors open, breakfast (provided)
12:00 Lunch (provided)
13:30 Hacking stops
14:00 Hack demos begin! (Typically 2-3 minutes per demo)
16:00 Winning teams announced & given awards/medals

Science Workshops

Along with the hacking, we will also run science workshops for kids, in parallel with SHD. We will be making amazing science toys and solar lanterns 🙂

Organisers:

FOSSASIA India Team.

Praveen Patil, Hong Phuc Dang, Rahul Khanolkar

FOSSASIA Code-In Grand Prize Winners Gathering at Google Headquarter

FOSSASIA was thrilled to be selected once again as a mentor organisation of Google Code-In (GCI) 2015 – a contest to introduce pre-university students (ages 13-17) to open source software development. Together with 13 other organisations, we reached out to 980 students from 65 countries who completed a total of 4,776 tasks. As a part of our participation, I got the chance to present FOSSASIA at the Grand Prize Winners trip.

FOSSASIA Team, photo by Jeremy Allison

GCI 2015 Awards Ceremony

28 grand prize winners and their parents, along with one mentor from each participating organisation, were invited on a trip to the Bay Area as a reward for their hard work during the last GCI program. The students had a chance to meet their mentors and interact with fellow students from other projects, enjoyed a few days in San Francisco, and received many cool gifts and swag from Google.

Chris DiBona and Jason Wong, photo by Jeremy Allison

Chris DiBona – Director of Open Source at Google and a super busy man – was so kind as to spend his morning personally congratulating every single student in front of their parents. I do believe that enjoying what you do and getting recognition for your work is the best gift ever, and being able to share it with your family is even better. Thanks, Google, for celebrating the open source culture.

Meet, learn and share

I was very impressed by the level of knowledge and abilities of all the 28 students. They are young, enthusiastic and inspiring. Thanks to all the parents for believing and supporting the kids in pursuing their open source journey.

Group photo by Jeremy Allison

It was wonderful to meet our two FOSSASIA GCI students for the first time. Jason, who grew up in the States, seemed a bit reserved, while Yathannsh from India was very outspoken. Both were very new to open source when they joined the program and have since become active contributors, eager to learn more. The three of us gave a team presentation on FOSSASIA Labs and our achievements from GCI 2015. Jason expressed his wish to continue as a mentor for the next GCI.

Jason and Yathannsh

I had several interesting conversations with the parents, who finally understood why their kids were on their computers all the time. About 14% of the parents work in IT and are well aware of open technology. The rest were super excited to learn about the various open source projects. Many told me they would love to have their second son or daughter join the program as well.

The mentor group had a few discussions on pros and cons, how to improve and maximize the outcome of the program, and ways to keep students engaged afterwards. I learned a lot from the other orgs and also shared the FOSSASIA workflow and guidelines with them. The 7 weeks of GCI were an amazing experience for me and my team. I must give our FOSSASIA mentors credit for their incredible efforts. It was truly a pleasure to work with Mario, Sean, Mohit, Praveen, Nikunj, Abhishek, Jigyasa, Dukeleto, Manan, Saptaks, Aruna, Rohit, Arnav, Diwanshi, Martin, Nicco, Sudheesh, Samarjeet, Harsh, Luther, Jung and many more.

GCI 2015 – 14 Org Mentors, photo by Jeremy Allison

The Fun

It was the best field trip ever! The program was carefully planned: a meeting with Google engineers, a tour of the Google campus, a day of sightseeing around San Francisco and much more.

Segway Tour

It was my first time on a Segway and I loved it – so cool! Thanks, Stephanie, for encouraging me to try it. It is never too late to learn something new.

Segway by the bay

Afternoon walk over the Golden Gate Bridge

Sanya and I could have completed the entire bridge, but because of our slow male mentors we only made it halfway. To all my geek friends out there: please do more exercise!

On Golden Gate Bridge
Walking on the bridge, photo by Florian Schmidt

Yacht Dinner Cruise

This was the highlight for many of us: sailing along the bay, relaxing time on the water, beautiful landscape, nice chats and yummy food.

Photo by Jeremy Allison
Ladies pose, photo by Jeremy Allison
With Cat Allman, photo by Jeremy Allison

Thank you organisers!

We just couldn't thank Stephanie enough for her hard work and the energy she devotes not only to GCI but to the whole open source community, and especially for her care for us all during the trip. I was amazed by the level of detail that was put in: additional medication, sunscreen, chocolate tips, gift cards, travel guides, luggage storage, special diets, etc.

Stephanie Taylor, GCI program manager, photo by Jeremy Allison

Last but not least, thanks to the lovely Mary, kind Helen, cool Eric, friendly Josh, the awesome photographer Jeremy and of course my favorite, Cat Allman, for another unforgettable experience!

Links

Photos: https://goo.gl/photos/htCKY4yJooX9ZSNBA

Google Code-In: developers.google.com/open-source/gci/

Google Summer of Code: developers.google.com/open-source/gsoc/

FOSSASIA Labs: http://labs.fossasia.org/

Using Cron Scheduling to automatically run background jobs

Cron scheduling is nothing new. It is a concept in which the system executes lines of code automatically every few seconds, minutes, or hours. It is required in many large applications that need some automation in their operation. I had to use it for two purposes in our Open-Event application:

  • To automatically delete items in the trash after a period of 30 days.
  • To automatically send after-event emails to the speakers and organizers when an event has ended.

1. Delete items in Trash system

Deleted items are stored in the admin's trash. If no action is performed on them, they should be deleted automatically once they have been in the trash for more than 30 days. I have used the APScheduler (Advanced Python Scheduler) framework for this. The code looks like this:

from datetime import datetime, timedelta

from apscheduler.schedulers.background import BackgroundScheduler
from pytz import utc

def empty_trash():
    with app.app_context():
        events = Event.query.filter_by(in_trash=True)
        users = User.query.filter_by(in_trash=True)
        sessions = Session.query.filter_by(in_trash=True)
        for event in events:
            if datetime.now() - event.trash_date >= timedelta(days=30):
                DataManager.delete_event(event.id)

        for user in users:
            if datetime.now() - user.trash_date >= timedelta(days=30):
                transaction = transaction_class(Event)
                transaction.query.filter_by(user_id=user.id).delete()
                delete_from_db(user, "User deleted permanently")

        for session in sessions:
            if datetime.now() - session.trash_date >= timedelta(days=30):
                delete_from_db(session, "Session deleted permanently")


trash_sched = BackgroundScheduler(timezone=utc)
trash_sched.add_job(empty_trash, 'cron', day_of_week='mon-fri', hour=5,
                    minute=30)
trash_sched.start()

The trash_sched scheduler is initialized first. It is an instance of BackgroundScheduler, meaning it will always run in the background as long as the application server is running.

The next line defines the trigger given to the scheduler, i.e. the times at which we want the scheduler to execute the function:

trash_sched.add_job(empty_trash, 'cron', day_of_week='mon-fri', hour=5, 
                    minute=30)

Here the scheduler adds the function to the job list. The function is empty_trash and the trigger given here is ‘cron’. The following line:

day_of_week='mon-fri', hour=5, minute=30

sets the times at which the scheduler executes the job. The above line means that the job will be executed every day from Monday to Friday at 5:30 AM. Now coming to the function which is to be executed:

def empty_trash():
    with app.app_context():
        events = Event.query.filter_by(in_trash=True)
        users = User.query.filter_by(in_trash=True)
        sessions = Session.query.filter_by(in_trash=True)
        for event in events:
            if datetime.now() - event.trash_date >= timedelta(days=30):
                DataManager.delete_event(event.id)

        for user in users:
            if datetime.now() - user.trash_date >= timedelta(days=30):
                transaction = transaction_class(Event)
                transaction.query.filter_by(user_id=user.id).delete()
                delete_from_db(user, "User deleted permanently")

        for session in sessions:
            if datetime.now() - session.trash_date >= timedelta(days=30):
                delete_from_db(session, "Session deleted permanently")

There are three kinds of items in the trash: events, users and sessions. We get all the trashed items with the filter_by(in_trash=True) queries shown above. Each model – Event, User and Session – has a column

trash_date = db.Column(db.DateTime)

The date on which the item is moved to the trash is stored in the trash_date column. And the following line:

 if datetime.now() - event.trash_date >= timedelta(days=30)

checks whether the item has been in the trash for more than 30 days. If so, it is deleted automatically. This function is executed repeatedly by APScheduler according to the time settings we give it. Thus we do not need to empty the trash manually after 30 days.
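For quick local testing of such a job, APScheduler also provides an interval trigger, which runs the job at a fixed period instead of at wall-clock times. A minimal sketch (the one-minute period is only an example, not what we use in production):

from apscheduler.schedulers.background import BackgroundScheduler

test_sched = BackgroundScheduler()
# Run empty_trash every minute instead of every weekday at 05:30.
test_sched.add_job(empty_trash, 'interval', minutes=1)
test_sched.start()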

2. Send After Event Mails

This is similar to the trash emptying function.

from apscheduler.schedulers.background import BackgroundScheduler

def send_after_event_mail():
    with app.app_context():
        events = Event.query.all()
        for event in events:
            upcoming_events = DataGetter.get_upcoming_events(event.id)
            organizers = DataGetter.get_user_event_roles_by_role_name(event.id, 'organizer')
            speakers = DataGetter.get_user_event_roles_by_role_name(event.id, 'speaker')
            if datetime.now() > event.end_time:
                for speaker in speakers:
                    send_after_event(speaker.user.email, event.id, upcoming_events)
                for organizer in organizers:
                    send_after_event(organizer.user.email, event.id, upcoming_events)

#logging.basicConfig()
sched = BackgroundScheduler(timezone=utc)
sched.add_job(send_after_event_mail, 'cron', day_of_week='mon-fri', hour=5, minute=30)
#sched.start()

The scheduler settings are the same as for the trash scheduler. The function fetches all events and checks whether each event's end_time has passed. The check is performed against the present date by the following line:

if datetime.now() > event.end_time

If it has, the speakers and organizers for that event are fetched, and after-event emails are sent to them automatically.

In this way the whole system is automated. 🙂

Import/Export feature of Open Event – Challenges

We have developed a nice import/export feature as a part of our GSoC project Open Event. It allows a user to export an event and later import it back.

An event contains data like tracks, sessions, microlocations, etc. When I was developing the basic part of this feature, the challenge was how to export and then import the same data back. I needed a format that stores the data completely and is recognized by the current system. That is when I decided to use the APIs.

API documentation of Open Event project is at http://open-event.herokuapp.com/api/v2. We have a considerably rich API covering most aspects of the system. For the export, I adopted this very simple technique.

  1. Call the corresponding GET APIs (tracks, sessions etc) for a database model internally.
  2. Save the data in separate json files.
  3. Zip them all and done.

This was very simple and convenient. Then came the real challenge: importing the event from the exported data. Since the exported data is just JSON, we could have re-created the event by sending the data back in POST requests. But this was not that easy, because the data formats are not exactly the same for GET and POST requests.

Example –

Sessions GET –

{
	"speakers": [
		{
			"id": 1,
			"name": "Jay Sean"
		}
	],
	"track": {
		"id": 1,
		"name": "Warmups"
	}
}

Sessions POST –

{
	"speaker_ids": [1],
	"track_id": 1
}

So the exported data can only be imported once it has been converted to the POST form. Luckily, the only difference between the POST and GET APIs is in the related attributes, where a dictionary in GET is replaced with just the ID in POST/PUT. So when importing, I had to convert the dicts to their POST counterparts. All I had to do was list all dict-type keys and extract the id key from them. I defined a global variable listing all dict keys and then wrote a function to extract the IDs and convert the keys, as sketched below.

RELATED_FIELDS = {
    'sessions': [
        ('track', 'track_id', 'tracks'),
        ('speakers', 'speaker_ids', 'speakers'),
    ]
}
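A minimal sketch of that conversion function (the name convert_to_post_form and the exact dict handling are assumptions, not the actual Open Event code):

def convert_to_post_form(resource_type, data):
    # Replace related dicts from a GET response with the IDs a POST expects.
    for get_key, post_key, _ in RELATED_FIELDS.get(resource_type, []):
        value = data.pop(get_key, None)
        if value is None:
            continue
        if isinstance(value, list):
            # e.g. 'speakers' -> 'speaker_ids': [1, 2]
            data[post_key] = [item['id'] for item in value]
        else:
            # e.g. 'track' -> 'track_id': 1
            data[post_key] = value['id']
    return data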

Second challenge

Then I realized there was an even tougher problem: how to re-create the relations. In the above JSON, you may have noticed that a session can be related to speaker(s) and a track. These relations are managed using the IDs of the items. When an event is imported, the IDs are bound to change, so the old IDs become outdated; for example, a track that was at ID 62 when exported can be at ID 92 when it is imported. This would break the relationships. To counter this problem, I did the following –

  1. Import items in a specific order, independent first
  2. Store a map of old IDs v/s new IDs.
  3. When dependent items are to be created, get new ID from the map and relate with it.

Let me explain the above –

The first step is to re-create the independent items. Here the independent items are tracks and speakers, and the dependent item is the session. While creating the independent items, store their new IDs after creation: build a map of old IDs vs new IDs. This map holds the clue to what became what after re-creation from the JSON. The key final step is that when dependent items are to be created, find the independent related keys in their JSON using the RELATED_FIELDS listing defined above. Once they are found, extract their IDs and look up the new ID corresponding to each old ID. Link the new ID with the dependent item, and that is all.
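As a minimal sketch of these steps (create_track, create_session and the exported dict are hypothetical stand-ins for the actual import code):

id_map = {}  # maps (resource_type, old_id) -> new_id

# 1. Re-create independent items first and remember their new IDs.
for old in exported['tracks']:
    new = create_track(old)  # hypothetical create helper
    id_map[('tracks', old['id'])] = new['id']

# 2. When creating dependent items, translate old IDs to new ones.
for session in exported['sessions']:
    session['track_id'] = id_map[('tracks', session['track']['id'])]
    create_session(session)  # hypothetical create helper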

This post covers the main challenges I faced when developing the import/export feature and how I overcame them. I hope it will provide some help when you are dealing with similar problems.

 

{{ Repost from my personal blog http://aviaryan.in/blog/gsoc/open-event-import-export-algo.html }}

Creating Dynamic Footer with Popover

In the Open-Event Webapp generator, the track page height varies according to the popover that appears on hovering over the tracks. The problem with this design was that the footer of the page always remained static, producing a bad UI for the user.


So I decided to make the footer dynamic, varying its position according to the popover that appears on hover. The approach is a bit tricky, but the diagram below makes it easy to understand.

Dynamic footer

The following code runs when a track is hovered:

//popover.js

var outerContheight = $('.main').offset().top + $('.main').outerHeight();
var tracknext = $(track).next();
var tracktocheck = track.offset().top + track.outerHeight() +
    tracknext.outerHeight() + 15;
var shift = tracktocheck - outerContheight;

if (shift > 0) {
    $('.footer').css({
        'position': 'absolute',
        'top': outerContheight + shift,
        'width': '100%',
        'z-index': '999'
    });
}

If shift > 0, calculated as shown above, the footer needs to be shifted, so we shift it by setting position: absolute in CSS. Otherwise, we set position: static for the footer:

$('.footer').css({
    'position': 'static'
});

After following the above approach, the footer position changes according to the popover. Here is the screencast of the approach.

 

Unit Testing

There are many stories about unit testing. Developers sometimes say that they don't write tests because they write good quality code. Does that make sense, when no one is infallible?

At university, only a few teachers talk about unit testing, and they only show basic examples. They require a few tests to be written to finish a final project, but nobody really teaches us the importance of unit testing.

I have also always wondered what benefits testing can bring. As time is a really important factor in our work, it often happens that we simply drop this part of the development process to get “more time”, rather than spending it on writing “pointless” tests. But now I know that this is a vicious circle.

Customers' requirements do not help us. They put high pressure on us to show visible results, not statistics about coverage status. None of them cares about some strange numbers. So, as I mentioned above, we usually focus on building new features and drop the tests. It may seem to save time, but it doesn't.

In reality, tests save us a lot of time, because we can identify and fix bugs very quickly. If a bug occurs because of someone's change, we don't have to spend long hours trying to figure out what is going on. That's why we need tests.

This is especially visible in huge open source projects. The FOSSASIA organization has about 200 contributors. In the OpenEvent project we have about 20 active developers, who generate many lines of code every single day. Those lines change over and over again and interfere with each other.

Let me give you a simple example. In our team we have about 7 pull requests per day. As I mentioned above, we want to make our code high quality and free of bugs, but without testing, identifying whether a pull request causes a bug is a very difficult task. Fortunately, Travis CI does this boring job for us. It is a great tool which runs our tests on every PR to check whether bugs occur. It helps us to notice bugs quickly and maintain our project very well.

What is unit testing?

Unit testing is a software development method in which the smallest testable parts of an application are tested.

Why do we need writing unit tests?

Let me list the reasons why unit testing is really important while developing a project.

  • To prove that our code works properly

If a developer adds another condition, a test checks whether the method still returns correct results. You simply don't need to wonder whether something is wrong with your code.

  • To reduce amount of bugs

Tests let you know what input parameters a function should receive and what results it should return. You simply don't write unused code.

  • To save development time

Developers don't waste time checking after every change whether the code still works correctly.

  • Unit tests help to understand the software design
  • To provide quick feedback about the method you are testing
  • To help document the code

How to write a unit test in Python

In my work I write tests in Python. I am going to share some sample code with you now.

  • Import module unittest
  • Choose function to test
  • Write unit test
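As a minimal sketch of these three steps (the add function is just an illustration):

import unittest

def add(a, b):
    return a + b

class TestAdd(unittest.TestCase):
    def test_add_returns_sum(self):
        # The assertion fails, and the test reports an error, if add() is broken.
        self.assertEqual(add(2, 3), 5)

if __name__ == '__main__':
    unittest.main()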

Example OpenEvent test in Python

class TestPagesUrls(OpenEventTestCase):

    def setUp(self):
        self.app = Setup.create_app()

    def test_if_urls_exist(self):
        """Test all urls via GET method"""
        with app.test_request_context():
            for rule in app.url_map.iter_rules():
                if excluded_paths(rule):
                    status_code = self.app.get(request.url[:-1] + str(rule).replace('//', '/'),
                                               follow_redirects=True).status_code
                    self.assertTrue(status_code in [200, 302, 401])

 

I wanted to check that all views exist, but doing that one test at a time would require a lot of time. That's why I wondered how to avoid writing many similar tests. Finally, based on our list of routes, I was able to write a single test which checks the status code of every page.

If any response returns a status code other than 200, 302 or 401, the test fails. That result means that something is wrong. Simple, isn't it? Try testing it manually… This one short test covers about 40 use cases…

This example shows the incredible value of unit tests! If a developer introduces a bug in a response, he receives an error that something is wrong with a view. Travis CI allows us to reject all broken pull requests and merge only those which fulfill our quality requirements.

Fixing an error is one part, but finding the bug is the even harder task. The ability to detect bugs at an early stage of the development process reduces the cost of the software.

 

Working with GCM Task Service on Android

The GCM Network Manager enables apps to register services that perform network-oriented tasks.
The API helps with scheduling these tasks, allowing Google Play services to batch network operations across the system.

Hence, multiple tasks get executed in one go, rather than each being executed individually.
This in turn saves battery, as the mobile radio is woken up only once rather than the device being woken multiple times.

Using GCM Network Manager in our app

Add the dependency to build.gradle

dependencies {
    ...
    compile 'com.google.android.gms:play-services-gcm:8.4.0'
    ...
}

Next, declare a new Service in the Android Manifest which will make the network calls:

<service android:name=".GCMService"
        android:permission="com.google.android.gms.permission.BIND_NETWORK_TASK_SERVICE"
        android:exported="true">
   <intent-filter>
       <action android:name="com.google.android.gms.gcm.ACTION_TASK_READY"/>
   </intent-filter>
</service>

The name of the service (GCMService) is the name of the class that will extend GcmTaskService, which is the core class for dealing with GCM Network Manager. This service will handle the running of a task.

Next we will define the GCMService class:

public class GCMService extends GcmTaskService {
        ...
}

On implementing the methods from GcmTaskService, we have something like this:

@Override
public int onRunTask(TaskParams taskParams) {
        switch (taskParams.getTag()) {
                case TAG_TASK_ONEOFF_LOG:
                        //bundle with other tasks and then execute
                        return GcmNetworkManager.RESULT_SUCCESS;
                case TAG_TASK_PERIODIC_LOG:
                        //run this periodically 
                        return GcmNetworkManager.RESULT_SUCCESS;
                default:
                        return GcmNetworkManager.RESULT_FAILURE;
        }
}

Once this is set up, we need to schedule tasks from our Activity.

This can be done by first creating a GcmNetworkManager object and hooking up tasks to be executed by the service defined above:

private GcmNetworkManager mGcmNetworkManager;

@Override
protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.activity_main);
        ...
        mGcmNetworkManager = GcmNetworkManager.getInstance(this);
}

The GcmNetworkManager object will be used to schedule tasks, so one way to do this is to hold a reference as a member variable set in onCreate. Alternatively, we can just get an instance of the GcmNetworkManager whenever we need it for scheduling.

Scheduling Tasks

Scheduling a OneOff Task

We provide the task with a window of execution, and the scheduler will determine the actual execution time. Since tasks are non-immediate, the scheduler can batch together several network calls to preserve battery life.

The scheduler will consider network availability, network activity and network load. If none of these matter, the scheduler will always wait until the end of the specified window.

Now, here’s how we would schedule a one-off task:

Task task = new OneoffTask.Builder()
              .setService(GCMService.class)
              .setExecutionWindow(0, 30)
              .setTag(LogService.TAG_TASK_ONEOFF_LOG)
              .setUpdateCurrent(false)
              .setRequiredNetwork(Task.NETWORK_STATE_CONNECTED)
              .setRequiresCharging(false)
              .build();

mGcmNetworkManager.schedule(task);

Using the builder pattern, we define all the aspects of our Task:

  • Service: The specific GcmTaskService that will control the task. This will allow us to cancel it later.
  • Execution window: The time period in which the task will execute. First param is the lower bound and the second is the upper bound (both are in seconds).
  • Tag: We’ll use the tag to identify in the onRunTask method which task is currently being run. Each tag should be unique, and the max length is 100.
  • Update Current: This determines whether this task should override any pre-existing tasks with the same tag. By default, this is false.
  • Required Network: Sets a specific network state to run on. If that network state is unavailable, then the task won’t be executed until it becomes available.
  • Requires Charging: Whether the task requires the device to be connected to power in order to execute.

All together, the task is built, scheduled on the GcmNetworkManager instance, and then eventually runs when the time is right.

Scheduling a periodic task

Here’s what a periodic task looks like:

Task task = new PeriodicTask.Builder()
                        .setService(GCMService.class)
                        .setPeriod(30)
                        .setFlex(10)
                        .setTag(LogService.TAG_TASK_PERIODIC_LOG)
                        .setPersisted(true)
                        .build();

mGcmNetworkManager.schedule(task);

It seems pretty similar, but there are a few key differences:

  • Period: Specifies that the task should recur once every interval at most, where the interval is the input param in seconds.
  • Flex: Specifies how close to the end of the period (set above) the task may execute. With a period of 30 seconds and a flex of 10, the scheduler will execute the task between the 20-30 second range.
  • Persisted: Determines whether the task should be persisted across reboots. Defaults to true for periodic tasks, and is not supported for one-off tasks. Requires “Receive Boot Completed” permission, or the setter will be ignored.

Cancelling Tasks

We saw how to schedule tasks, so now we should also take a look at how to cancel them. There’s no way to cancel a task that is currently being executed, but we can cancel any task that hasn’t yet run.

We can cancel all tasks for a given GcmTaskService:

mGcmNetworkManager.cancelAllTasks(GCMService.class);

And we can also cancel a specific task by providing its tag and GcmTaskService:

mGcmNetworkManager.cancelTask(
        GCMService.TAG_TASK_PERIODIC_LOG,
        GCMService.class
);

We took a deep look at the GCM Network Manager and how to use it to conserve battery life, optimize network performance, and perform batched work using Tasks. I would urge you to try it out in your app for a streamlined yet robust approach to making network calls.

Configuring Codacy : Use Your Own Conventions


All developers agree on at least one thing – writing clean code is necessary. As someone anonymous said, always write your code as if the developer who comes after you is a homicidal maniac who knows your address. So, yeah, writing clean code is very important. Codacy helps with code review and code quality monitoring. You can set up Codacy in any of your GitHub projects. It automatically identifies new static analysis issues, code coverage changes, code duplication and code complexity evolution in every commit and pull request.


ETag based caching for GET APIs

Many client applications require caching of data to work with low-bandwidth connections, and many do it to provide faster loading times to the user. The Webapp and the Android app had similar requirements. Previously they provided caching using a versions API that kept track of any modifications made to events or services. The response of the API looked something like this:

[{
  "event_id": 6,
  "event_ver": 1,
  "id": 27,
  "microlocations_ver": 0,
  "session_ver": 4,
  "speakers_ver": 3,
  "sponsors_ver": 2,
  "tracks_ver": 3
}]

The number corresponding to "*_ver" indicates the number of modifications made to that resource list. For instance, "tracks_ver": 3 means there have been three revisions of tracks inside the event (/events/:event_id/tracks). So when the client user starts the app, the app makes a request to the versions API, checks whether the response corresponds to the local cache, and updates accordingly. This approach had some shortcomings, such as not being able to check modifications of an individual resource. Also, whenever a particular service (microlocation, track, etc.) resource list inside an event needed to be checked for updates, a call to the versions API was needed.

ETag based caching for GET APIs

The concept of ETag (Entity Tag) based caching is simple. When a client requests (GET) a resource or a resource list, a hash of the resource/resource list is calculated at the server. This hash, called the ETag, is sent with the response to the client, preferably as a header. The client then caches the response data and the ETag alongside the resource. The next time the client makes a request at the same endpoint to fetch the resource, it sets an If-None-Match header in the request containing the ETag value it saved before. The server grabs the requested resource, calculates its hash and checks whether it is equal to the value of If-None-Match. If the hashes are equal, the resource has not changed, so a response with the resource data is not needed. If they differ, the server returns the response with the resource data and a new ETag associated with that resource.
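From the client's perspective, the flow looks roughly like this sketch using the requests library (the URL and the caching variables are illustrative, not part of the Open Event code):

import requests

url = 'http://127.0.0.1:8001/api/v2/events/1/tracks/1'

# First request: no cached ETag yet.
resp = requests.get(url)
etag = resp.headers.get('ETag')
cached_body = resp.json()

# Later request: send the cached ETag back in If-None-Match.
resp = requests.get(url, headers={'If-None-Match': etag})
if resp.status_code == 304:
    data = cached_body            # unchanged, reuse the cache
else:
    data = resp.json()            # changed, refresh cache and ETag
    etag = resp.headers.get('ETag')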

Only small modifications were needed to deal with ETags for GET requests. Flask-Restplus includes a Resource class that defines a resource. It is a pluggable view, and pluggable views need to define a dispatch_request method that returns the response.

import json
from hashlib import md5

from flask import request
from flask.ext.restplus import Resource as RestplusResource

# Custom Resource Class
class Resource(RestplusResource):
    def dispatch_request(self, *args, **kwargs):
        resp = super(Resource, self).dispatch_request(*args, **kwargs)

        # ETag checking.
        # Check only for GET requests, for now.
        if request.method == 'GET':
            old_etag = request.headers.get('If-None-Match', '')
            # Generate hash
            data = json.dumps(resp)
            new_etag = md5(data).hexdigest()

            if new_etag == old_etag:
                # Resource has not changed
                return '', 304
            else:
                # Resource has changed, send new ETag value
                return resp, 200, {'ETag': new_etag}

        return resp

To add support for ETags, I sub-classed the Resource class to extend the dispatch_request method. First, I grab the response for the arguments provided to RestplusResource's dispatch_request method. old_etag contains the value of the ETag set in the If-None-Match header. Then the hash of the resp response data is calculated. If both ETags are equal, an empty response is returned with the 304 HTTP status (Not Modified). If they are not equal, a normal response is sent along with the new value of the ETag.

[smg:~] $ curl -i http://127.0.0.1:8001/api/v2/events/1/tracks/1 
HTTP/1.0 200 OK
Content-Type: application/json
Content-Length: 1061
ETag: ada4d057f76c54ce027aaf95a3dd436b
Server: Werkzeug/0.11.9 Python/2.7.6
Date: Thu, 21 Jul 2016 09:01:01 GMT

{"description": "string", "sessions": [{"id": 1, "title": "Fantastische Hardware Bauen & L\u00f6ten Lernen mit Mitch (TV-B-Gone) // Learn to Solder with Cool Kits!"}, {"id": 2, "title": "Postapokalyptischer Schmuck / Postapocalyptic Jewellery"}, {"id": 3, "title": "Query Service Wikidata "}, {"id": 4, "title": "Unabh\u00e4ngige eigene Internet-Suche in wenigen Schritten auf dem PC installieren"}, {"id": 5, "title": "Nitrokey Sicherheits-USB-Stick"}, {"id": 6, "title": "Heart Of Code - a hackspace for women* in Berlin"}, {"id": 7, "title": "Free Software Foundation Europe e.V."}, {"id": 8, "title": "TinyBoy Project - a 3D printer for education"}, {"id": 9, "title": "LED Matrix Display"}, {"id": 11, "title": "Schnittmuster am PC erstellen mit Valentina / Valentina Digital Pattern Design"}, {"id": 12, "title": "PC mit Gedanken steuern - Brain-Computer Interfaces"}, {"id": 14, "title": "Functional package management with GNU Guix"}], "color": "GREEN", "track_image_url": "http://website.com/item.ext", "location": "string", "id": 1, "name": "string"}

[smg:~] $ curl -i --header 'If-None-Match: ada4d057f76c54ce027aaf95a3dd436b' http://127.0.0.1:8001/api/v2/events/1/tracks/1 
HTTP/1.0 304 NOT MODIFIED
Connection: close
Server: Werkzeug/0.11.9 Python/2.7.6
Date: Thu, 21 Jul 2016 09:01:27 GMT

ETag based caching has a drawback: since the hash is calculated for every GET request, it increases the load on the server. If four clients request the same resource, the server calculates the hash four times. This can be solved by calculating and saving the ETag during creation and modification of resources, and then fetching and sending this stored ETag directly.
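A rough sketch of that optimization (the storage here is a plain dict for illustration; in Open Event it would be a column on the model):

import json
from hashlib import md5

etag_store = {}  # assumed storage: maps a resource key to its precomputed ETag

def save_resource(key, data):
    # Compute the hash once, when the resource is created or modified.
    etag_store[key] = md5(json.dumps(data).encode()).hexdigest()

def get_etag(key):
    # GET requests return the stored ETag instead of re-hashing the resource.
    return etag_store.get(key)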

Transcript from the Python Toolbox 101

At the Python User Group Berlin, I led a talk and discussion about free-of-charge tools for open-source development, based on what we use in GSoC. The whole content was in an Etherpad where people could add their ideas.

Because there are a lot of tools, I thought I would share the pad with you. Maybe it is of use. Here is the talk:


Python Users Berlin 2016/07/14 Talk & Discussion

 

START: 19:15
Agenda 1min END: 19:15
======
– Example library
– What is code
– Version Control
  – Python Package Index
– …, see headings
– discussion: write down, what does not fit into my structure
Example Library (2min)  19:17
======================
What is Code (2min) 19:19
===================
.. note:: This frames our discussion
– Source files .py, .pyw
– tests
– documentation
– quality
– readability
– bugs and problems
– <3
Configuration files: plain text for editing
Version Control (2min) 19:21
======================
.. note:: Sharing and Collaboration
– no Version Control:
  – Dropbox
  – Google drive
  – Telekom cloud
  – ftp, windows share
– Version Control Tools:
  – git
    – gitweb own server
    – 
  – mercurial
  – svn
  – perforce (proprietary)
Python Package Index (3min) 19:24
—————————
.. note:: Shipping to the users
hosts python packages you develop.
Example: “knittingpattern” package
pip
Installation from PyPI:
    $ python3 -m pip install knittingpattern # Linux
    > py -3.4 -m pip install knittingpattern # Windows
Documentation upload included!
Documentation (3min) 19:27
====================
.. note:: Inform users
I came across a talk:
Documentation can be:
– tutorials
– how to
– introduction to the community/development process
– code documentation!!!
– chat
– 
Building the documentation (3min)  19:30
———————————
Formats:
– HTML
– PDF
– reST
– EPUB
– doc strings in source code
– test?
Tools:
– Sphinx
– doxygen
– doc strings
  – standard how to put in docstrings in Python
    – 
Example: Sphinx  3min 19:33
~~~~~~~~~~~~~~~
– Used for Python
– Used for knittingpattern
Python file:
Documentation file with sphinx.ext.autodoc:
Built documentation:
    See the return type str, Intersphinx can reference across documentations.
    Intersphinx uses objects inventory, hosted with the documentation:
Testing the documentation:
    – TODO: link
      – everything is included in the docs
      – everything that is public is documented
      
      syntax
      – numpy 
      – google 
      – sphinx
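For illustration, a docstring in Sphinx style (one of the syntaxes listed above) might look like this sketch:

    def fib(n):
        """Return the n-th Fibonacci number.

        :param int n: index in the Fibonacci sequence
        :return: the n-th Fibonacci number
        :rtype: int
        """
        a, b = 0, 1
        for _ in range(n):
            a, b = b, a + b
        return a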
Hosting the Documentation (3min) 19:36
——————————–
Tools:
– pythonhosted
  only latest version
– readthedocs.io
  several branches, versions, languages
– wiki pages
– 
Code Testing 2min 19:38
============
.. note:: Tests show the presence of mistakes, not their absence.
What can be tested:
– features
– style: pep8, pylint, 
– documentation
– complexity
– 
Testing Features with unit tests 4min 19:42
——————————–
code:
    def fib(i): …
Tools with different styles
– unittest
  
    import unittest
    from fibonacci import fib
    class FibonacciTest(unittest.TestCase):
        def testCalculation(self):
            self.assertEqual(fib(0), 0)
            self.assertEqual(fib(1), 1)
            self.assertEqual(fib(5), 5)
            self.assertEqual(fib(10), 55)
            self.assertEqual(fib(20), 6765)
    if __name__ == "__main__":
        unittest.main()
 
– doctest
    import doctest
    def fib(n):
        """
        Calculates the n-th Fibonacci number iteratively
        >>> fib(0)
        0
        >>> fib(1)
        1
        >>> fib(10)
        55
        >>> fib(15)
        610
        >>>
        """
        a, b = 0, 1
        for i in range(n):
            a, b = b, a + b
        return a
    if __name__ == "__main__":
        doctest.testmod()
– pytest (works with unittest)
    import pytest
    from fibonacci import fib
    
    @pytest.mark.parametrize("parameter,value", [(0, 0), (1, 1), (10, 55), (15, 610)])
    def test_fibonacci(parameter, value):
        assert fib(parameter) == value
– nose tests?
– …
– pyhumber
– assert in code,  PyHamcrest
– Behaviour driven development
  – human test
Automated Test Run & Continuous Integration 2min 19:44
===========================================
.. note:: 
Several branches:
– production branch always works
– feature branches
– automated test before feature is put into production
Tools running tests 6min 19:50
——————-
– Travis CI for Mac, Ubuntu
– Appveyor for Windows
Host yourself:
– buildbot
– Hudson
– Jenkins
– Teamcity
– circle CI
  + selenium for website test
– 
– …?????!!!!!!
Tools for code quality 4min 19:54
———————-
– landscape
  complexity, style, documentation
  – libraries are available separately
    – flake8
    – destinate
    – pep257
– codeclimate
  code duplication, code coverage
  – libraries are available separately
– PyCharm
  – integrated what landscape has 
  – + complexity
Bugs, Issues, Pull Requests, Milestones 4min 19:58
=======================================
.. note:: this is also a way to get people into the project
1. find bug
2. open issue if big bug, discuss
3. create pull request
4. merge
5. deploy
– github
  issue tracker
– waffle.io – scrumboard
  merge several github issues tracker
– Redmine
JIRA
– trac 
– github issues + zenhub integrated in github
– gitlab
– gerrit framework that does alternative checking https://www.gerritcodereview.com/
  1. propose change
  2. test
  3. someone reviews the code
      – X people needed
  QT company uses it
Localization 2min 20:00
============
crowdin.com
    Crowdsourced translation tool:
    
Discussion
– spellchecker is integrated in PyCharm
  – character set
  – new vocabulary
  – not for continuous integration (CI)
– Emacs
  – 
– pylint plugin 
   – not all languages?
– readthedocs
  – add github project, 
  – hosts docs
– sphinx-plugin?
– PyCon testing talk:
    – Hypothesis package
      – tries to break your code
      – throws in a lot of edge cases (huge number, nothing, …)
      -> find obscure edge cases
      
Did someone create a Pylint plugin
– question:
    – cyclomatic code complexity
    – which metrics tools do you know?
    –
Virtual Environment:
    nobody should install everything in the system
    -> switch between different python versions
    – python3-venv
      – slightly different than virtual-env(more mature)
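For example, creating and activating an environment with the built-in venv module (standard commands):
    $ python3 -m venv myenv
    $ source myenv/bin/activate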
Beginners:
    Windows:
        install Anaconda

The new AYABInterface module

One can create knitwork with knitting machines and the AYAB shield. To do this, the computer communicates with the machine. In the future, this communication shall be done with this new library, the AYABInterface.

Here are some design decisions:

Complete vs. Incomplete

The idea is to keep the AYAB separated from the knittingpattern format. The knittingpattern format is an incomplete format that can be extended for any use case. In contrast, the AYAB machine has a complete instruction set. The knittingpattern format is a means to transform such formats into different complete instruction sets. They should be convertible but not mixed.

Descriptive vs. Imperative

The idea is to be able to pass the format to the AYABInterface as a description. As much knowledge about the behavior as possible is encapsulated in the AYABInterface module. With this approach, we are less prone to intermixing concerns across the applications.

Responsibility Driven Design

I see these separated responsibilities:

  • A communication part focusing on the protocol to talk and the messages sent across the wire. It is an interpreter of the protocol, transforming it from bytes to objects.
  • A configuration that is passed to the interface.
  • The different machine types supported.
  • Actions the user shall perform.

Different Representations

I see these representations:

  • Commands are transferred across the wire. (PySerial)
  • For each movement of a carriage, the needles are used and put into a new position, B or D. (communication)
  • We would like to knit a list of rows with different colors. (interface)
    • Holes can be described by a list of orders in which meshes are moved to other locations, i.e. on needle 1 we can find mesh 1, on needle 2 we find mesh 2 first and then mesh 3, so mesh 2 and mesh 3 are knit together in the following step
  • The knitting pattern format.

Actions and Information for the User

The user should be informed about actions to take. These actions should not be in the form of text but rather in the form of an object that represents the action, e.g. [“move”, “this carriage”, “from right to left”]. This way, they can be adequately represented in the UI and translated somewhere central in the UI.
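A minimal sketch of such an action object (the class and attribute names are assumptions, not the actual AYABInterface API):

class MoveCarriage:
    # Represents an instruction like: move this carriage from right to left.
    def __init__(self, carriage, direction):
        self.carriage = carriage
        self.direction = direction

The UI can then decide how to display or translate the action based on its type and attributes, instead of parsing text.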

Summary

The new design separates concerns and allows testing. The bridge between the machine and the knittingpattern format consists of primitive, descriptive objects such as lists and integers.