Downloading Files from URLs in Python

This post is about how to efficiently and correctly download files from URLs using Python. I will be using the god-send library requests for it. I will cover how to download binaries from URLs and how to set their filenames.

Let’s start with baby steps on how to download a file using requests –

import requests

url = 'http://google.com/favicon.ico'
r = requests.get(url, allow_redirects=True)
open('google.ico', 'wb').write(r.content)

The above code will download the media at http://google.com/favicon.ico and save it as google.ico.

Now let’s take another example where the URL is https://www.youtube.com/watch?v=9bZkp7q19f0. What do you think will happen if the above code is used to download it? If you said that an HTML page will be downloaded, you are spot on. This was one of the problems I faced in the Import module of Open Event, where I had to download media from certain links. When a URL linked to a webpage rather than a binary, I had to skip the download and just keep the link as is. To solve this, I inspected the headers of the URL. Headers usually contain a Content-Type parameter which tells us about the type of data the URL is linking to. A naive way to do it would be –

r = requests.get(url, allow_redirects=True)
print r.headers.get('content-type')

It works, but it is not the optimal way to do so, as it downloads the entire file just to check the headers. So if the file is large, this does nothing but waste bandwidth. I looked into the requests documentation and found a better way: fetching only the headers of a URL before deciding to download it. This allows us to skip downloading files which weren’t meant to be downloaded.

import requests

def is_downloadable(url):
    """
    Does the url contain a downloadable resource
    """
    h = requests.head(url, allow_redirects=True)
    header = h.headers
    content_type = header.get('content-type', '')  # default to '' so a missing header doesn't crash
    if 'text' in content_type.lower():
        return False
    if 'html' in content_type.lower():
        return False
    return True

print is_downloadable('https://www.youtube.com/watch?v=9bZkp7q19f0')
# >> False
print is_downloadable('http://google.com/favicon.ico')
# >> True

To restrict download by file size, we can get the filesize from the Content-Length header and then do suitable comparisons.

content_length = header.get('content-length', None)
if content_length and int(content_length) > 2e8:  # 200 MB approx; the header value is a string
    return False

So using the above function, we can skip downloading URLs which don’t link to media.
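
For large files that do pass the checks, requests can also stream the body in chunks instead of holding it all in memory. A minimal sketch using the documented stream=True mode (the chunk size here is an arbitrary choice):

import requests

def download_file(url, filename):
    # stream=True defers fetching the response body until it is iterated over
    r = requests.get(url, allow_redirects=True, stream=True)
    with open(filename, 'wb') as f:
        for chunk in r.iter_content(chunk_size=8192):
            if chunk:  # skip keep-alive chunks
                f.write(chunk)

download_file('http://google.com/favicon.ico', 'google.ico')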

Getting filename from URL

We can parse the url to get the filename. Example – http://aviaryan.in/images/profile.png.

To extract the filename from the above URL, we can write a routine which fetches the last segment after the forward slash (/).

url = 'http://aviaryan.in/images/profile.png'
if '/' in url:
    print url.rsplit('/', 1)[1]

This gives the correct filename in many cases. However, there are times when the filename information is not present in the URL, for example, something like http://url.com/download. In that case, the Content-Disposition header will contain the filename information. Here is how to fetch it.

import requests
import re

def get_filename_from_cd(cd):
    """
    Get filename from content-disposition
    """
    if not cd:
        return None
    fname = re.findall('filename=(.+)', cd)
    if len(fname) == 0:
        return None
    return fname[0]


url = 'http://google.com/favicon.ico'
r = requests.get(url, allow_redirects=True)
filename = get_filename_from_cd(r.headers.get('content-disposition'))
open(filename, 'wb').write(r.content)

The url-parsing code in conjunction with the above method to get the filename from the Content-Disposition header will work for most cases. Use them and test the results.
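
Putting the two together, a small helper can try the Content-Disposition header first and fall back to the URL (a sketch; the fallback order is my choice, not from the original code):

def get_filename(url, cd):
    # prefer the Content-Disposition header, fall back to the URL path
    filename = get_filename_from_cd(cd)
    if filename:
        return filename
    return url.rsplit('/', 1)[-1]

r = requests.get(url, allow_redirects=True)
filename = get_filename(url, r.headers.get('content-disposition'))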

These are my 2 cents on downloading files using requests in Python. Let me know of other tricks I might have overlooked.

{{ Repost from my personal blog http://aviaryan.in/blog/gsoc/downloading-files-from-urls.html }}



Building a logger interface for FlightGear using Python: Part One

{ Repost from my personal blog @ https://blog.codezero.xyz/python-logger-interface-for-flightgear-part-one/ }

The FlightGear flight simulator is an open-source, multi-platform, cooperative flight simulator developed as a part of the FlightGear project. I have been using this flight simulator for a year for virtual flight testing, running simulations and measuring flight parameters during various types of maneuvers. I have noticed that logging the data (and figuring out how to log it in the first place) has been quite difficult for users with little technical knowledge of such software.

Also, the Property Tree of FlightGear is pretty extensive, making it difficult to properly traverse the huge tree to get the parameters that are actually required.

That’s when I got the idea of making a simple, easy to use, user friendly logging interface for FlightGear. I gave it the name ‘FlightGear Command Center’ :wink: and the project was born at github.com/niranjan94/flightgear-cc.

After 44 commits, this is what I have now.

1. A simple dashboard to connect to FlightGear, open FlightGear with a default plane, get individual parameter values, or log a set of parameters continuously

2. An interface to choose the parameters to log and the interval

  1. The User interface is a web application written in HTML/javascript.
  2. The Web application communicates with a python bridge using WebSockets.
  3. The python bridge communicates with FlightGear via telnet.
  4. The data is logged to a csv file continuously (until the user presses stop) by the bridge once the web application requests it.

The interface with FlightGear

FlightGear has an internal “telnet” command server which provides a “remote shell” into the running FlightGear process, which we can use to interactively view or modify any property/variable of the simulation.

FlightGear can be instructed to start the server and listen for commands by passing the --telnet=socket,out,60,localhost,5555,tcp command line argument while starting FlightGear. (The argument is of the format --telnet=medium,direction,speed_in_hertz,hostname,PORT,style; note that the telnet server speaks TCP, so tcp is the appropriate style here.)

Communication with that server can be done using any simple telnet interface. But FlightGear also provides us with a small wrapper class that makes retrieving and setting properties via the telnet server even easier.

The wrapper can be obtained from the official repository at sourceforge.net/p/flightgear/flightgear/ci/master/tree/scripts/python/FlightGear.py

Using the wrapper is straightforward. Initialize an instance of the class with the hostname and port. The class will then make a connection to the telnet server.

from FlightGear import FlightGear

flightgear_server = 'localhost'  
flightgear_server_port = 5555  
fg = FlightGear(flightgear_server, flightgear_server_port)

The wrapper makes use of python’s magic methods __setitem__ and __getitem__ to make it easy for us to read or manipulate the property tree.

For example, getting the current altitude of the airplane is as easy as

print fg['/position[0]/altitude-ft']

and setting the altitude is as simple as

fg['/position[0]/altitude-ft'] = 345.2

The important thing here is knowing the path to the data you want in the FlightGear property tree. Most of the commonly used properties are listed at Aircraft properties reference – FlightGear Wiki.

Now that we have a basic interface between Python and FlightGear in place, the next step is to set up a link between the user interface (a small web app) and the Python bridge. We will be using WebSockets for that, so as to have a real-time, always-on link to the bridge, which in turn lets us communicate with FlightGear in real time.

We need a WebSocket server in place. So, I used the SimpleWebSocketServer.py class from github.com/dpallot/simple-websocket-server.

A WebSocket server can be created like this:

from SimpleWebSocketServer import SimpleWebSocketServer, WebSocket

hostname = 'localhost'  
websocket_server_port = 8888

class SocketHandler(WebSocket):

    def handleMessage(self):
        # print the message when received 
        print self.data

    def handleConnected(self):
        print self.address, 'connected'

    def handleClose(self):
        print self.address, 'closed'

server = SimpleWebSocketServer(hostname, websocket_server_port, SocketHandler)  
server.serveforever()

  • handleMessage is called whenever a client sends a message to the server
  • handleConnected is called when a new client connects to the server
  • handleClose is called when a client disconnects from the server

A message can be sent to the clients by using the sendMessage method from within the SocketHandler.

class SocketHandler(WebSocket):

    def handleMessage(self):
        # send a hello whenever a message is received  
        print self.data
        self.sendMessage('Hello')

    def handleConnected(self):
        print self.address, 'connected'

    def handleClose(self):
        print self.address, 'closed'
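
A quick way to exercise the server without a browser is a small Python client (a sketch; it assumes the third-party websocket-client package, which is not part of this project):

# pip install websocket-client  (assumed third-party package)
from websocket import create_connection

ws = create_connection('ws://localhost:8888/')
ws.send('Hello from a test client')
print ws.recv()  # should print the 'Hello' reply from SocketHandler
ws.close()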

We now have a WebSocket server in place. Now the web app can easily talk to this server using the JavaScript WebSockets API. That will be covered in upcoming blog articles.


Unit Testing

There are many stories about unit testing. Developers sometimes say that they don’t write tests because they write good quality code. But does that make sense, when no one is infallible?

At university only a few teachers talk about unit testing, and they only show basic examples. They require a few tests to finish the final project, but nobody really teaches us the importance of unit testing.

I have also always wondered what benefits it can bring. As time is a really important factor in our work, it often happens that we simply drop this part of the development process to get “more time”, rather than spend it writing “stupid” tests. But now I know that this is a vicious circle.

Customers’ requirements do not help us. They put high pressure on seeing visible results, not statistics about coverage status. None of them cares about some strange numbers. So, as I mentioned above, we usually focus on building new features and get rid of tests. It may seem to save time, but it doesn’t.

In reality, tests save us a lot of time because we can identify and fix bugs very quickly. If a bug occurs because of someone’s change, we don’t have to spend long hours trying to figure out what is going on. That’s why we need tests.

It is especially visible in huge open source projects. The FOSSASIA organization has about 200 contributors. In the OpenEvent project we have about 20 active developers, who generate many lines of code every single day. Many of these lines change over and over again and interfere with each other.

Let me provide you with a simple example. In our team we have about 7 pull requests per day. As I mentioned above, we want to keep our code high quality and free of bugs, but without testing, identifying whether a pull request causes a bug is a very difficult task. Fortunately, Travis CI does this boring job for us. It is a great tool which runs our tests on every PR to check if bugs occur. It helps us notice bugs quickly and maintain our project very well.

What is unit testing?

Unit testing is a software development method in which the smallest testable parts of an application are tested.

Why do we need writing unit tests?

Let me list the arguments why unit testing is really important while developing a project.

  • To prove that our code works properly

If a developer adds another condition, the tests check whether the method still returns correct results. You simply don’t need to wonder if something is wrong with your code.

  • To reduce amount of bugs

They let you know what input parameters a function should get and what results it should return. You simply don’t write unused code.

  • To save development time

Developers don’t waste time manually checking after every change whether the code still works correctly.

  • Unit tests help to understand software design
  • To provide quick feedback about method which you are testing
  • To help document the code

How to write a unit test in Python

In my work I write tests in Python. I am going to share some sample code with you now.

  • Import the unittest module
  • Choose a function to test
  • Write the unit test (see the minimal sketch below)
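
A minimal sketch of these three steps (the add function and the test are illustrative, not from the OpenEvent code base):

import unittest

def add(a, b):
    return a + b

class TestAdd(unittest.TestCase):
    def test_add_two_numbers(self):
        self.assertEqual(add(2, 3), 5)

if __name__ == '__main__':
    unittest.main()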

Example OpenEvent test in Python

class TestPagesUrls(OpenEventTestCase):

    def setUp(self):
        self.app = Setup.create_app()

    def test_if_urls_exist(self):
        """Test all urls via GET method"""
        with app.test_request_context():
            for rule in app.url_map.iter_rules():
                if excluded_paths(rule):
                    status_code = self.app.get(request.url[:-1] + str(rule).replace('//', '/'),
                                               follow_redirects=True).status_code
                    self.assertTrue(status_code in [200, 302, 401])

I wanted to check if all views exist, but testing each one separately would take a lot of time. That’s why I wondered how to avoid writing many similar tests. Finally, based on our list of routes, I was able to write one test which checks the status code on every page.

If any response returns a status code other than 200, 302 or 401, the test fails. This result means that something is wrong. Simple, isn’t it? Try to test it manually… This one short test covers about 40 use cases…

This example shows the incredible value of unit tests! If a developer introduces a bug, he receives an error that something is wrong with a view. Travis CI allows us to reject all wrong pull requests and merge only those which fulfill our quality requirements.

Fixing an error is one thing, but finding a bug is an even harder task. The ability to detect bugs at an early stage of the development process reduces the cost of the software.

 


Features and Controls of Pocket Science Lab


PSLab is equipped with an array of useful control and measurement tools. This tiny but powerful Pocket Science Lab enables you to perform various experiments and study a wide range of phenomena.

Some of the important applications of PSLab include a 4-channel oscilloscope, sine/triangle/square waveform generators, a frequency counter, a logic analyser and also several programmable current and voltage sources.

Add-on boards, both wired as well as wireless (NRF+MCU), enable measurement of physical parameters ranging from acceleration and angular velocity to luminous intensity and passive infra-red. (Work in progress…)

A 12 MHz crystal is chosen as the reference for the digital instruments, and a 3.3 V voltage regulator for the analogue instruments. The device is then calibrated against professional instruments in order to squeeze out maximum performance.

A Python based communication library and experiment specific PyQt4 based GUIs make PSLab a must-have tool for programmers, hobbyists, science and engineering teachers and also students.

PSLab is interfaced and powered by the USB port of the computer. For connecting external signals it has several input/output terminals, as shown in the figure.

New panel design for PSLab

Feature list for acquisition and control:

  • The most important feature of PSLab is a 4-channel oscilloscope which can monitor analog inputs at a maximum of 2 million samples per second. It includes the usual controls such as triggering and gain selection, and uses Python-Scipy for curve fitting.
PSLab Oscilloscope

Waveform Generators

  • W1 : 5 Hz – 5 kHz arbitrary waveform generator. Manual amplitude control up to +/-3 V
  • W2 : 5 Hz – 5 kHz arbitrary waveform generator. Amplitude of +/-3 V, attenuable via software
  • PWM : Four phase-correlated PWM outputs with a maximum frequency of 32 MHz, 15 ns duty-cycle resolution, and phase difference control.

Measurement Functions

  • Frequency counter tested up to 16 MHz.
  • Capacitance measurement, pF to uF range
  • PSLab has several 12-bit Analog inputs (function as voltmeters) with programmable gains, and maximum ranges varying from +/-5mV to +/-16V.

Voltage and Current Sources

  • 12-bit Constant Current source. Maximum current 3.3mA [subject to load resistance].
  • PSLab has three 12-bit programmable voltage sources: +/-3.3V, +/-5V, 0-3V (PV1, PV2, PV3).
Main Control Panel

Other useful tools

  • 4 MHz, 4-channel logic analyzer with 15 ns resolution.
  • SPI, I2C, UART outputs that can be configured and controlled entirely through Python functions. (Work in progress…)
  • On-board 2.4GHz transceiver for wireless data acquisition. (Work in progress..)
  • Graphical Interfaces for Oscilloscope, Logic Analyser, streaming data, wireless acquisition, and several experiments developed that use a common framework which drastically reduces code required to incorporate control and plotting widgets.
  • PSLab also has space for an ESP-12 module for WiFi access with access point / station mode.

Screen-shots of GUI apps.

Advanced Controls with Oscilloscope
Wireless Sensors (work in progress…)
Logic Analyzer

With all these features PSLab is taking good shape, and I see it as a potential tool that can change the way we teach and learn science. 🙂 🙂

 


Getting fired up with Firebase Database

As you might’ve noticed, in my Open Event Android Project, we are asking the user to enter his/her details and then using these details at the backend for generating the app according to his/her needs.

One thing to wonder is how we transmit the details from the webpage to the server.

Well, this is where Firebase comes to the rescue one more time!

If you’ve read my previous post on Firebase Storage, you might have started to appreciate what an awesometastic service Firebase is.

So without any further ado, let’s get started with this.

Step 1 :

Add your project to Firebase from the console.

Click on the Blue button

Step 2 :

Add Firebase to your webapp

Open the project you’ve just created and click on the bright red button that says “Add Firebase to your web app”.


Copy the contents shown and paste them after your HTML code.

Step 3 :

Next up, navigate to the Database section in your console and move to the Rules tab.


For now, let us edit the rules to allow anyone to read and write to the database.
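
The rules I used looked roughly like this (a sketch of the standard Firebase Realtime Database rules format; opening read/write to everyone is only acceptable while testing):

{
  "rules": {
    ".read": true,
    ".write": true
  }
}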


Almost all set up now.

Step 4 :

Modify the HTML to allow entering data by the user

This looks something like this :

<form name="htmlform" id="form" enctype="multipart/form-data">
<p align="center"><b><big>FOSSASIA's App Generator</big></b></p>
<table align="center"
width = "900px"
height="200px">
<tr>
<td valign="top">
<label for="Email">Email</label>
</td>
<td valign="top">
<input id="email" type="email" name="Email" size="30">
</td>
<td valign="top">
<label for="appName">App's Name</label>
</td>
<td valign="top">
<input id="appName" type="text" name="App_Name" maxlength="50" size="30">
</td>
</tr>
<tr>
<td valign="top">
<label for="link">Api Link</label>
</td>
<td valign="top">
<input id="apiLink" type="url" name="Api_Link" maxlength="90" size="30">
</td>
</tr>
<tr>
<td valign="top">
<label for="sessions">Zip containing .json files</label>
</td>
<td valign="top">
<input accept=".zip" type="file" id="uploadZip" name="sessions">
</td>
</tr>
<tr>
<td colspan="5" style="text-align:center">
<button type="submit">Generate and Download app</button>
</td>
</tr>
</table>
</form>

Now let us set up our JavaScript to extract this data and store it in the Firebase Database.

<script src="https://www.gstatic.com/firebasejs/live/3.0/firebase.js"></script>
<script src="https://code.jquery.com/jquery-1.10.2.js"></script>
<script src="https://code.jquery.com/ui/1.11.2/jquery-ui.js"></script>
<script>
var $ = jQuery;
var timestamp = Number(new Date()); // this will serve as a unique ID for each user
var form = document.querySelector("form");
var config = {
apiKey: "API_KEY",
authDomain: "app-id.firebaseapp.com",
databaseURL: "https://app-id.firebaseio.com",
storageBucket: "app-id.appspot.com",
};
firebase.initializeApp(config);
var database = firebase.database();
form.addEventListener("submit", function(event) {
event.preventDefault();
var ary = $(form).serializeArray();
var obj = {};
for (var a = 0; a < ary.length; a++) obj[ary[a].name] = ary[a].value;
console.log("JSON",obj);
var file_data = $('#uploadZip').prop('files')[0];
var storageRef = firebase.storage().ref(timestamp.toString());
storageRef.put(file_data);
var form_data = new FormData();
form_data.append('file', file_data);
firebase.database().ref('users/' + timestamp).set(obj);
database.ref('users/' + timestamp).once('value').then(function(snapshot) {
console.log("Received value", snapshot.val());
});
});
</script>

We are almost finished with uploading the data to the database.

Enter data inside the fields and press submit.

If everything went well, you will be able to see the newly entered data inside your database.

screenshot-area-2016-07-18-205651.png

Now on to retrieving this data on the server.

Our backend runs on a python script, so we have a library known as python-firebase which helps us easily fetch the data stored in the Firebase database.

The code for it goes like this

from firebase import firebase  # the python-firebase package
import json

firebase = firebase.FirebaseApplication('https://app-id.firebaseio.com', None)
result = firebase.get('/users', str(arg))  # arg is the unique user ID (the timestamp) from earlier
jsonData = json.dumps(result)
email = json.dumps(result['Email'])
email = email.replace('"', '')
app_name = json.dumps(result['App_Name'])
app_name = app_name.replace('"', '')
print app_name
print email

The data will be returned in JSON format, so you can manipulate and store it as you wish.

Well, that’s it!

You now know how to store and retrieve data to and from Firebase.
It makes the work a lot simpler, as there is no database schema or tables that need to be defined; Firebase handles this on its own.

I hope that you found this tutorial helpful, and if you have any doubts regarding this feel free to comment down below, I would love to help you out.

Cheers.


Transcript from the Python Toolbox 101

At the Python User Group Berlin, I led a talk/discussion about free-of-charge tools for open-source development, based on what we use in GSoC. The whole content was in an Etherpad and people could add their ideas.

Because there are a lot of tools, I thought I would share it with you. Maybe it is of use. Here is the talk:


Python Users Berlin 2016/07/14 Talk & Discussion

 

START: 19:15
Agenda 1min END: 19:15
======
– Example library
– What is code
– Version Control
  – Python Package Index
– …, see headings
– discussion: write down, what does not fit into my structure
Example Library (2min)  19:17
======================
What is Code (2min) 19:19
===================
.. note:: This frames our discussion
– Source files .py, .pyw
– tests
– documentation
– quality
– readability
– bugs and problems
– <3
Configuration files: plain text for editing
Version Control (2min) 19:21
======================
.. note:: Sharing and Collaboration
– no Version Control:
  – Dropbox
  – Google drive
  – Telekom cloud
  – ftp, windows share
– Version Control Tools:
  – git
    – gitweb own server
    – 
  – mercurial
  – svn
  – perforce (proprietary)
  
  
  
  
  
  
Python Package Index (3min) 19:24
—————————
.. note:: Shipping to the users
hosts python packages you develop.
Example: “knittingpattern” package
pip
Installation from Pypi:
    $ python3 -m pip install knittingpattern # Linux
    > py -3.4 -m pip install knittingpattern # Windows
Documentation upload included!
Documentation (3min) 19:27
====================
.. note:: Inform users
I came across a talk:
Documentation can be:
– tutorials
– how to
– introduction to the community/development process
– code documentation!!!
– chat
– 
Building the documentation (3min)  19:30
———————————
Formats:
– HTML
– PDF
– reST
– EPUB
– doc strings in source code
– test?
Tools:
– Sphinx
– doxygen
– doc strings
  – standard how to put in docstrings in Python
    – 
Example: Sphinx  3min 19:33
~~~~~~~~~~~~~~~
– Used for Python
– Used for knittingpattern
Python file:
Documentation file with sphinx.ext.autodoc:
Built documentation:
    See the return type str, Intersphinx can reference across documentations.
    Intersphinx uses objects inventory, hosted with the documentation:
Testing the documentation:
    – TODO: link
      – everything is included in the docs
      – everything that is public is documented
      
      syntax
      – numpy 
      – google 
      – sphinx
Hosting the Documentation (3min) 19:36
——————————–
Tools:
– pythonhosted
  only latest version
– readthedocs.io
  several branches, versions, languages
– wiki pages
– 
Code Testing 2min 19:38
============
.. note:: Tests show the presence of mistakes, not their absence.
What can be tested:
– features
– style: pep8, pylint, 
– documentation
– complexity
– 
Testing Features with unit tests 4min 19:42
——————————–
code:
    def fib(i): …
Tools with different styles
– unittest
  
    import unittest
    from fibonacci import fib
    class FibonacciTest(unittest.TestCase):
        def testCalculation(self):
            self.assertEqual(fib(0), 0)
            self.assertEqual(fib(1), 1)
            self.assertEqual(fib(5), 5)
            self.assertEqual(fib(10), 55)
            self.assertEqual(fib(20), 6765)
    if __name__ == "__main__":
        unittest.main()
 
– doctest
    import doctest
    def fib(n):
        """
        Calculates the n-th Fibonacci number iteratively
        >>> fib(0)
        0
        >>> fib(1)
        1
        >>> fib(10)
        55
        >>> fib(15)
        610
        """
        a, b = 0, 1
        for i in range(n):
            a, b = b, a + b
        return a
    if __name__ == "__main__":
        doctest.testmod()
– pytest (works with unittest)
    import pytest
    from fibonacci import fib
    
    @pytest.mark.parametrize("parameter,value", [(0, 0), (1, 1), (10, 55), (15, 610)])
    def test_fibonacci(parameter, value):
        assert fib(parameter) == value
– nose tests?
– …
– pyhumber
– assert in code,  PyHamcrest
– Behaviour driven development
  – human test
Automated Test Run & Continuous Integration 2min 19:44
===========================================
.. note:: 
Several branches:
– production branch always works
– feature branches
– automated test before feature is put into production
Tools running tests 6min 19:50
——————-
– Travis CI for Mac, Ubuntu
– Appveyor for Windows
Host yourself:
– buildbot
– Hudson
– Jenkins
– Teamcity
– circle CI
  + selenium for website test
– 
– …?????!!!!!!
Tools for code quality 4min 19:54
———————-
– landscape
  complexity, style, documentation
  – libraries are available separately
    – flake8
    – destinate
    – pep257
– codeclimate
  code duplication, code coverage
  – libraries are available separately
– PyCharm
  – integrated what landscape has 
  – + complexity
Bugs, Issues, Pull Requests, Milestones 4min 19:58
=======================================
.. note:: this is also a way to get people into the project
1. find bug
2. open issue if big bug, discuss
3. create pull request
4. merge
5. deploy
– github
  issue tracker
– waffle.io – scrumboard
  merge several github issues tracker
– Redmine
JIRA
– trac 
– github issues + zenhub integrated in github
– gitlab
– gerrit framework that does alternative checking https://www.gerritcodereview.com/
  1. propose change
  2. test
  3. someone reviews the code
      – X people needed
  QT company uses it
Localization 2min 20:00
============
crowdin.com
    Crowdsourced translation tool:
    
Discussion
– spellchecker is integrated in PyCharm
  – character set
  – new vocabulary
  – not for continuous integration (CI)
– Emacs
  – 
– pylint plugin 
   – not all languages?
– readthedocs
  – add github project, 
  – hosts docs
– sphinx-plugin?
– PyCon testing talk:
    – Hypothesis package
      – tries to break your code
      – throws in a lot of edge cases (huge number, nothing, …)
      -> find obscure edge cases
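      A small sketch (my addition, assuming the hypothesis package and the fib function from above):
          from hypothesis import given
          import hypothesis.strategies as st
          from fibonacci import fib
          @given(st.integers(min_value=0, max_value=200))
          def test_fib_is_monotonic(n):
              # hypothesis generates many values of n, including edge cases
              assert fib(n + 1) >= fib(n)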
      
Did someone create a Pylint plugin
– question:
    – cyclomatic code complexity
    – which metrics tools do you know?
    –
Virtual Environment:
    nobody should install everything in the system
    -> switch between different python versions
    – python3-venv
      – slightly different from virtualenv (more mature)
Beginners:
    Windows:
        install Anaconda

How to create a Windows Installer from tagged commits

I am working on an open-source Python project, an editor for knit work called the “KnitEditor”. Development takes place at Github. Here, I would like to give some insight into how we automated deployment of the application to a Windows installer.

The process works like this:

  1. Create a tag with git and push it to Github.
  2. AppVeyor builds the application.
  3. AppVeyor pushes the application to the Github release.

(1) Create a tag and push it

Tags should reflect the version of the software. Version “0.0.1” is in tag “v0.0.1”. We automated the tagging with the “setup.py” in the repository. Now, you can run

py -3.4 setup.py tag_and_deploy

This checks that no such tag exists already. Several commits can share the same version, so we like to make sure that we do not create two tags with the same name. Also, this can only be executed on the master branch; this way, the software has gone through all the automated quality assurance. Here is the code from the setup.py:

import os
import subprocess

from distutils.core import Command
# ...
class TagAndDeployCommand(Command):

    description = "Create a git tag for this version and push it to origin."\
                  "To trigger a travis-ci build and and deploy."
    user_options = []
    name = "tag_and_deploy"
    remote = "origin"
    branch = "master"

    def initialize_options(self):
        pass

    def finalize_options(self):
        pass

    def run(self):
        if subprocess.call(["git", "--version"]) != 0:
            print("ERROR:\n\tPlease install git.")
            exit(1)
        status_lines = subprocess.check_output(
            ["git", "status"]).splitlines()
        current_branch = status_lines[0].strip().split()[-1].decode()
        print("On branch {}.".format(current_branch))
        if current_branch != self.branch:
            print("ERROR:\n\tNew tags can only be made from branch"
                  " \"{}\".".format(self.branch))
            print("\tYou can use \"git checkout {}\" to switch"
                  " the branch.".format(self.branch))
            exit(1)
        tags_output = subprocess.check_output(["git", "tag"])
        tags = [tag.strip().decode() for tag in tags_output.splitlines()]
        tag = "v" + __version__
        if tag in tags:
            print("Warning: \n\tTag {} already exists.".format(tag))
            print("\tEdit the version information in {}".format(
                    os.path.join(HERE, PACKAGE_NAME, "__init__.py")
                ))
        else:
            print("Creating tag \"{}\".".format(tag))
            subprocess.check_call(["git", "tag", tag])
        print("Pushing tag \"{}\" to remote \"{}\"."
              "".format(tag, self.remote))
        subprocess.check_call(["git", "push", self.remote, tag])
# ...
SETUPTOOLS_METADATA = dict(
# ...
    cmdclass={
# ...
        TagAndDeployCommand.name: TagAndDeployCommand
    },
)
# ...
if __name__ == "__main__":
    import setuptools
    METADATA.update(SETUPTOOLS_METADATA)
    setuptools.setup(**METADATA) # METADATA can be found in several other 

Above, you can see a “distutils” command that executes git through the command-line interface.

(2) AppVeyor builds the application

As mentioned above, you can configure AppVeyor to build your application. Here are some parts of the “appveyor.yml” file, that I comment on inline:

# see https://packaging.python.org/appveyor/#adding-appveyor-support-to-your-project
environment:
  PYPI_USERNAME: niccokunzmann3
  PYPI_PASSWORD:
    secure: Gxrd9WI60wyczr9mHtiQHvJ45Oq0UyQZNrvUtKs2D5w=

  # For Python versions available on Appveyor, see
  # http://www.appveyor.com/docs/installed-software#python
  # The list here is complete (excluding Python 2.6, which
  # isn't covered by this document) at the time of writing.

  # we only need Python 3.4 for kivy
  PYTHON: "C:\\Python34"


install:
  - "%PYTHON%\\python.exe -m pip install docutils pygments pypiwin32 kivy.deps.sdl2 kivy.deps.glew"
  - "%PYTHON%\\python.exe -m pip install -r requirements.txt"
  - "%PYTHON%\\python.exe -m pip install -r test-requirements.txt"
  - "%PYTHON%\\python.exe setup.py install"
  
build_script:
- cmd: cmd /c windows-build\build.bat

test_script:
  # Put your test command here.
  # If you don't need to build C extensions on 64-bit Python 3.3 or 3.4,
  # you can remove "build.cmd" from the front of the command, as it's
  # only needed to support those cases.
  # Note that you must use the environment variable %PYTHON% to refer to
  # the interpreter you're using - Appveyor does not do anything special
  # to put the Python version you want to use on PATH.
  - windows-build\dist\KnitEditor\KnitEditor.exe /test
  - "%PYTHON%\\python.exe -m pytest --pep8 kniteditor"

artifacts:
  # bdist_wheel puts your built wheel in the dist directory
- path: dist/*
  name: distribution
- path: windows-build/dist/Installer/KnitEditorInstaller.exe
  name: installer
- path: windows-build/dist/KnitEditor
  name: standalone

deploy:
- provider: GitHub
  # http://www.appveyor.com/docs/deployment/github
  tag: $(APPVEYOR_REPO_TAG_NAME)
  description: "Release $(APPVEYOR_REPO_TAG_NAME)"
  auth_token:
    secure: j1EbCI55pgsetM/QyptIM/QDZC3SR1i4Xno6jjJt9MNQRHsBrFiod0dsuS9lpcC7
  artifact: installer
  force_update: true
  draft: false
  prerelease: false
  on:
    branch: master                 # release from master branch only
    appveyor_repo_tag: true        # deploy on tag push only

Note that the line

  - windows-build\dist\KnitEditor\KnitEditor.exe /test

executes the tests in the built application.

The application is built by the following step, which runs the commands from windows-build\build.bat shown below:

build_script:
- cmd: cmd /c windows-build\build.bat
"%PYTHON%\python.exe" -m pip install pyinstaller

The line above installs pyinstaller

"%PYTHON%\python.exe" -m PyInstaller KnitEditor.spec

The line above uses pyinstaller to create an executable from the specification.

"Inno Setup 5\ISCC.exe" KnitEditor.iss

The line above uses Inno Setup to create the Installer for the built application.

(3) Deploy to Github

As you can see in the “appveyor.yml” file, the resulting executable is listed as an artifact. Artifacts can be downloaded directly from appveyor or used to deploy. In this case, we use the github deploy, which can be customized via the UI of appveyor.

- path: windows-build/dist/Installer/KnitEditorInstaller.exe
  name: installer
deploy:
- provider: GitHub
  # http://www.appveyor.com/docs/deployment/github
  tag: $(APPVEYOR_REPO_TAG_NAME)
  description: "Release $(APPVEYOR_REPO_TAG_NAME)"
  auth_token:
    secure: j1EbCI55pgsetM/QyptIM/QDZC3SR1i4Xno6jjJt9MNQRHsBrFiod0dsuS9lpcC7
  artifact: installer
  force_update: true
  draft: false
  prerelease: false
  on:
    branch: master                 # release from master branch only
    appveyor_repo_tag: true        # deploy on tag push only

Summary

Now, every time we push a tag to Github, AppVeyor builds a new installer for our application.


Implementing revisioning feature in Open Event

{ Repost from my personal blog @ https://blog.codezero.xyz/implementing-revisioning-feature-in-open-event }

As I said in my previous blog post about Adding revisioning to SQLAlchemy Models,

In an application like Open Event, where a single piece of information can be edited by multiple users, it’s always good to know who changed what. One should also be able to revert to a previous version if needed.

Let’s have a quick run through on how we can enable SQLAlchemy-Continuum on our project.

  1. Install the library SQLAlchemy-Continuum with pip
  2. Add __versioned__ = {} to all the models that need to be versioned.
  3. Call make_versioned() before the models are defined
  4. Call configure_mappers from SQLAlchemy after declaring all the models.

Example:

import sqlalchemy as sa  
from sqlalchemy_continuum import make_versioned

# Must be called before defining all the models
make_versioned()

class Event(Base):

    __tablename__ = 'events'
    __versioned__ = {}  # Must be added to all models that are to be versioned

    id = sa.Column(sa.Integer, primary_key=True, autoincrement=True)
    name = sa.Column(sa.String)
    start_time = sa.Column(sa.DateTime, nullable=False)
    end_time = sa.Column(sa.DateTime, nullable=False)
    description = sa.Column(sa.Text)
    schedule_published_on = sa.Column(sa.DateTime)

# Must be called after defining all the models
sa.orm.configure_mappers()

We have SQLAlchemy-Continuum enabled now. You can do all the read/write operations as usual. (No change there).

Now, for the part where we give the users an option to view/restore revisions. The inspiration for this comes from WordPress’s wonderful revisioning functionality.

The layout is well designed. The differences are shown in an easy-to-read form. The slider on top makes it intuitive to move between revisions. We have a Restore this Revision button on the top-right to switch to that revision.

A similar layout is what we would like to achieve in Open Event.

  1. A slider to switch between revisions
  2. A pop-over infobox on the slider to show who made that change
  3. A button to switch to that selected revision.
  4. The colored-differences shown in side-by-side manner.

To make all this a bit easier, SQLAlchemy-Continuum provides us with some nifty methods.

count_versions is a method that allows us to know the number of revisions a record has.

event = session.query(Event).get(1)  
count = count_versions(event)  # number of versions of that event

The next one is pretty cool. All version objects have a property called changeset which holds a dict of the changed fields in that version.

event = Event(name=u'FOSSASIA 2016', description=u'FOSS Conference in Asia')  
session.add(event)  
session.commit()

version = event.versions[0]  # first version  
version.changeset  
# {
#   'id': [None, 1],
#   'name': [None, u'FOSSASIA 2016'],
#   'description': [None, u'FOSS Conference in Asia']
# }

event.name = u'FOSSASIA 2017'  
session.commit()

version = event.versions[1]  # second version  
version.changeset  
# {
#   'name': [u'FOSSASIA 2016', u'FOSSASIA 2017'],
# }

As you can see, the dict holds the fields that changed and the content that changed (before and after). This is what we’ll be using to generate pretty diffs like the ones the guys and girls over at wordpress.com have made. For this we’ll be using two things.

  1. A library named diff-match-patch. It is a library from Google which offers robust algorithms to perform the operations required for synchronizing plain text.
  2. A small recipe from code.activestate.com for line-based diffs, with the necessary HTML markup for styling insertions and deletions.

import itertools
import re

import diff_match_patch

def side_by_side_diff(old_text, new_text):  
    """
    Calculates a side-by-side line-based difference view.

    Wraps insertions in <ins></ins> and deletions in <del></del>.
    """
    def yield_open_entry(open_entry):
        """ Yield all open changes. """
        ls, rs = open_entry
        # Get unchanged parts onto the right line
        if ls[0] == rs[0]:
            yield (False, ls[0], rs[0])
            for l, r in itertools.izip_longest(ls[1:], rs[1:]):
                yield (True, l, r)
        elif ls[-1] == rs[-1]:
            for l, r in itertools.izip_longest(ls[:-1], rs[:-1]):
                yield (l != r, l, r)
            yield (False, ls[-1], rs[-1])
        else:
            for l, r in itertools.izip_longest(ls, rs):
                yield (True, l, r)

    line_split = re.compile(r'(?:\r?\n)')
    dmp = diff_match_patch.diff_match_patch()

    diff = dmp.diff_main(old_text, new_text)
    dmp.diff_cleanupSemantic(diff)

    open_entry = ([None], [None])
    for change_type, entry in diff:
        assert change_type in [-1, 0, 1]

        entry = (entry.replace('&', '&amp;')
                      .replace('<', '&lt;')
                      .replace('>', '&gt;'))

        lines = line_split.split(entry)

        # Merge with previous entry if still open
        ls, rs = open_entry

        line = lines[0]
        if line:
            if change_type == 0:
                ls[-1] = ls[-1] or ''
                rs[-1] = rs[-1] or ''
                ls[-1] = ls[-1] + line
                rs[-1] = rs[-1] + line
            elif change_type == 1:
                rs[-1] = rs[-1] or ''
                rs[-1] += '<ins>%s</ins>' % line if line else ''
            elif change_type == -1:
                ls[-1] = ls[-1] or ''
                ls[-1] += '<del>%s</del>' % line if line else ''

        lines = lines[1:]

        if lines:
            if change_type == 0:
                # Push out open entry
                for entry in yield_open_entry(open_entry):
                    yield entry

                # Directly push out lines until last
                for line in lines[:-1]:
                    yield (False, line, line)

                # Keep last line open
                open_entry = ([lines[-1]], [lines[-1]])
            elif change_type == 1:
                ls, rs = open_entry

                for line in lines:
                    rs.append('<ins>%s</ins>' % line if line else '')

                open_entry = (ls, rs)
            elif change_type == -1:
                ls, rs = open_entry

                for line in lines:
                    ls.append('<del>%s</del>' % line if line else '')

                open_entry = (ls, rs)

    # Push out open entry
    for entry in yield_open_entry(open_entry):
        yield entry
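
As a quick illustration, the generator yields (changed, left, right) tuples, one per output line; for example (my own sample strings, not from the project):

old = u'FOSS Conference in Asia'
new = u'FOSS Conference in Asia, 2016 edition'
for changed, left, right in side_by_side_diff(old, new):
    print changed, left, right
# the right column carries the <ins>...</ins> markup for the inserted text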

So, what we have to do is,

  1. Get the changeset from a version
  2. Run each field’s array containing the old and new text through the side_by_side_diff method.
  3. Display the output on screen.
  4. Use the markups <ins/> and <del/> to style changes.

So, we do the same for each version by looping through the versions array accessible from an event record.

For the slider, the noUiSlider JavaScript library was used. Implementation is simple.

<div id="slider"></div>

<script type="text/javascript">  
    $(function () {
        var slider = document.getElementById('slider');
        noUiSlider.create(slider, {
            start: [0],
            step: 1,
            range: {
                'min': 0,
                'max': 5
            }
        });
    });
</script>

This would create a slider that can go from 0 to 5 and will start at position 0.

By listening to the update event of the slider, we’re able to change which revision is displayed.

slider.noUiSlider.on('update', function (values, handle) {  
    var value = Math.round(values[handle]);
    // the current position of the slider
    // do what you have to do to change the displayed revision
});

And to get the user who caused a revision, you have to access the user_id parameter of the transaction record of a particular version.

event = session.query(Event).get(1)  
version_one = event.versions[0]  
transaction = version_one.transaction  # each version exposes its transaction record  
user_id = transaction.user_id

So, with the user ID, you can query the user database to get the user who made that revision.
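
For example (a sketch; the User model name is an assumption about the application code):

user = session.query(User).get(user_id)  # User is the app's user model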

The user_id is automatically populated if you’re using Flask, Flask-login and SQLAlchemy-Continuum’s Flask Plugin. Enabling the plugin is easy.

from sqlalchemy_continuum.plugins import FlaskPlugin  
from sqlalchemy_continuum import make_versioned

make_versioned(plugins=[FlaskPlugin()])

This is not a very detailed blog post. If you would like to see the actual implementation, you can check out the Open Event repository over at GitHub. Specifically, the file browse_revisions.html.

The result is,

Still needs some refinements in the UI. But, it gets the job done :wink:


PSLab Code Repository and Installation

PSLab is a new addition to the FOSSASIA Science Lab. This tiny pocket science lab provides an array of necessary equipment for doing science and engineering experiments. It can function as an oscilloscope, waveform generator, frequency counter, programmable voltage and current source and also as a data logger.

New Front Panel Design
Size: 62 mm x 78 mm x 13 mm

The control and measurement functions are written in the Python programming language. Pyqtgraph is used as the plotting library. We are now working on Qt based GUI applications for various experiments.

The following are the code repositories of PSLab: pslab-apps and pslab.

Installation

To install PSLab on a Debian based GNU/Linux system, the following dependencies must be installed.

Dependencies
============
PyQt 4.7+, PySide, or PyQt5
python 2.6, 2.7, or 3.x
NumPy, Scipy
pyqt4-dev-tools          #for pyuic4
Pyqtgraph                #Plotting library
pyopengl and qt-opengl   #for 3D graphics
iPython-qtconsole        #optional
Now clone both of the repositories, pslab-apps and pslab.
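
For example (assuming the repositories live under the fossasia organization on GitHub):

$ git clone https://github.com/fossasia/pslab-apps.git
$ git clone https://github.com/fossasia/pslab.git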

Libraries must be installed in the following order

1. pslab-apps

2. pslab

To install, cd into the directories

$ cd <SOURCE_DIR>

and run the following (for both the repos)

$ sudo make clean
$ sudo make 

$ sudo make install

Now you are ready with the PSLab software on your machine 🙂

For the main GUI (Control panel), you can run Experiments from the terminal.

$ Experiments

If the device is not connected the following splash screen will be displayed.

Device not connected

After clicking OK, you will get the control panel with menus for Experiments, Controls, Advanced Controls and Help etc. (Experiments cannot be accessed unless the device is connected.)


The splash screen and the control panel, when PSLab is connected to the pc.

PSLab connected
Control Panel – Main GUI

From this control panel one can access controls, help files and various experiments through independent GUIs written for each experiment.

You can help
------------

Please report a bug/install errors here 
Your suggestions to improve PSLab are welcome :)

What Next:

We are now working on a general purpose experiment designer. This will allow selecting controls and channels and then generating a spreadsheet. The columns from this spreadsheet can be selected and plotted.

 


Code Quality in the knittingpattern Python Library

In our Google Summer of Code project, a part of our work is to bring knitting to the digital age. We are Kirstin Heidler and Nicco Kunzmann. Our knittingpattern library aims at being the exchange and conversion format between different types of knit work representations: hand knitting instructions, machine commands for different machines and SVG schemata.

The generated schema from the knittingpattern library.
The original pattern schema Cafe.

The image above was generated by this Python code:

import knittingpattern, webbrowser
example = knittingpattern.load_from().example("Cafe.json")
webbrowser.open(example.to_svg(25).temporary_path(".svg"))

So far about the context. Now about the Quality tools we use:


Continuous integration

We use Travis CI [FOSSASIA] to upload packages of a specific git tag  automatically. The Travis build runs under Python 3.3 to 3.5. It first builds the package and then installs it with its dependencies. To upload tags automatically, one can configure Travis, preferably with the command line interface, to save username and password for the Python Package Index (Pypi).[TravisDocs] Our process of releasing a new version is the following:

  1. Increase the version in the knitting pattern library and create a new pull request for it.
  2. Merge the pull request after the tests passed.
  3. Pull and create a new release with a git tag using
    setup.py tag_and_deploy

Travis then builds the new tag and uploads it to Pypi.

With this we have basic quality assurance. Pull requests need to pass all tests before they can be merged. Travis can be configured to automatically reject a request with errors.
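
A minimal sketch of such a Travis configuration (illustrative only, not the project's actual .travis.yml):

language: python
python:
  - "3.3"
  - "3.4"
  - "3.5"
install:
  - pip install .
script:
  - py.test
deploy:
  provider: pypi
  user: <pypi-username>            # placeholder
  password:
    secure: <encrypted-password>   # created with the travis command line interface
  on:
    tags: true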

Documentation Driven Development

As mentioned in a blog post, documentation-driven development is something worth checking out. In our case that means writing the documentation first, then the tests and then the code.

Writing the documentation first means thinking in the space of the mental model you have for the code. It defines the interfaces you would be happy to use. A lot of edge cases can be thought of at this point.

When writing the tests, they are often split up and no longer represent the flow of thought you had when writing down your wishes. Tests can be seen as the glue between the code and the documentation. As with writing code to pass the tests, in the conversation between the tests and the documentation I find out some things I have forgotten.

When writing the code in a test-driven way, another conversation starts. I call implementing the tests a conversation because the tests tell the code that it should be different, and the code tells the tests about their inconsistencies, like misspellings and bloated interfaces.

With writing documentation first, we have the chance to have two conversations about our code, in spoken language and in code. I like it when the code hears my wishes, so I prefer to talk a bit more.

Testing the Documentation

Our documentation is hosted on Read the Docs. It should have these properties:

  1. Every module is documented.
  2. Everything that is public is documented.
  3. The documentation is syntactically correct.

These are qualities that can be tested, so they are tested. The code cannot be deployed if it does not meet these standards. We use Sphinx for building the docs. That makes it possible to test these properties in this way:

  1. For every module there exists a .rst file which automatically documents the module with autodoc.
  2. A Sphinx build outputs a list of objects that should be covered by documentation but are not.
  3. Sphinx outputs warnings throughout the build.

Testing our documentation allows us to keep it at a higher quality. Many more tests could be imagined, but the basic ones already help.
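
For example, a per-module .rst file is mostly an autodoc stub, and a coverage build with sphinx.ext.coverage lists undocumented objects (a sketch; the docs/ paths are assumptions, not necessarily the project's layout):

.. automodule:: knittingpattern
    :members:

$ sphinx-build -b coverage docs docs/_build/coverage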

Code Coverage

We can test our code coverage and see how well we do using Codeclimate.com. It gives us the files we need to work on when we want to improve the quality of the package.

Landscape

Landscape is also free for open source projects. It can give hints about where to improve next. Also it is possible to fail pull requests if the quality decreases. It shows code duplication and can run pylint. Currently, most of the style problems arise from undocumented tests.

Summary

When starting with the stricter quality assurance, the question arose whether it would only slow us down. Now, we have learned to write properly styled pep8 code and begin to automatically do what pylint demands. High test coverage allows us to change the underlying functionality without changing the interface and without fear that we may break something irrecoverably. I feel like having a burden taken from me with all those free tools for open-source software that spare my time to set quality assurance up.

Future Work

In the future we would also like to create a user interface. User interfaces are sometimes hard to test. So, we plan not to put the UI into the package but to build it on top of the package.
