Image Uploading in Open Event API Server

Open Event API Server manages image uploading in a very simple way. There are many APIs such as “Event API” in API Server provides you data pointer in request body to send the image URL. Since you can send only URLs here if you want to upload any image you can use our Image Uploading API. Now, this uploading API provides you a temporary URL of your uploaded file. This is not the permanent storage but the good thing is that developers do not have to do anything else. Just send this temporary URL to the different APIs like the event one and rest of the work is done by APIs.
API Endpoints which receives the image URLs have their simple mechanism.

  • Create a copy of an uploaded image
  • Create different sizes of the uploaded image
  • Save all images to preferred storage. The Super Admin can set this storage in admin preferences

To better understand this, consider this sample request object to create an event

{
  "data": {
    "attributes": {
      "name": "New Event",
      "starts-at": "2002-05-30T09:30:10+05:30",
      "ends-at": "2022-05-30T09:30:10+05:30",
      "email": "example@example.com",
      "timezone": "Asia/Kolkata",
      "original-image-url": "https://cdn.pixabay.com/photo/2013/11/23/16/25/birds-216412_1280.jpg"
    },
    "type": "event"
  }
}

I have provided one attribute as “original-image-url”, server will open the image and create different images of different sizes as

      "is-map-shown": false,
      "original-image-url": "http://example.com/media/events/3/original/eUpxSmdCMj/43c6d4d2-db2b-460b-b891-1ceeba792cab.jpg",
      "onsite-details": null,
      "organizer-name": null,
      "can-pay-by-stripe": false,
      "large-image-url": "http://example.com/media/events/3/large/WEV4YUJCeF/f819f1d2-29bf-4acc-9af5-8052b6ab65b3.jpg",
      "timezone": "Asia/Kolkata",
      "can-pay-onsite": false,
      "deleted-at": null,
      "ticket-url": null,
      "can-pay-by-paypal": false,
      "location-name": null,
      "is-sponsors-enabled": false,
      "is-sessions-speakers-enabled": false,
      "privacy": "public",
      "has-organizer-info": false,
      "state": "Draft",
      "latitude": null,
      "starts-at": "2002-05-30T04:00:10+00:00",
      "searchable-location-name": null,
      "is-ticketing-enabled": true,
      "can-pay-by-cheque": false,
      "description": "",
      "pentabarf-url": null,
      "xcal-url": null,
      "logo-url": null,
      "can-pay-by-bank": false,
      "is-tax-enabled": false,
      "ical-url": null,
      "name": "New Event",
      "icon-image-url": "http://example.com/media/events/3/icon/N01BcTRUN2/65f25497-a079-4515-8359-ce5212e9669f.jpg",
      "thumbnail-image-url": "http://example.com/media/events/3/thumbnail/U2ZpSU1IK2/4fa07a9a-ef72-45f8-993b-037b0ad6dd6e.jpg",

We can clearly see that server is generating three other images on permanent storage as well as creating the copy of original-image-url into permanent storage.
Since we already have our Storage class, all we need to do is to make the little bit changes in it due to the decoupling of the Open Event. Also, I had to work on these points below

  • Fix upload module, provide support to generate url of locally uploaded file based on static_domain defined in settings
  • Using PIL create a method to generate new image by converting first it to jpeg(lower size than png) and resize it according to the aspect ratio
  • Create a helper method to create different sizes
  • Store all images in preferred storage.
  • Update APIs to incorporate this feature, drop any URLs in image pointers except original_image_url

Support for generating locally uploaded file’s URL
Here I worked on adding support to check if any static_domain is set by a user and used the request.url as the fallback.

if get_settings()['static_domain']:
        return get_settings()['static_domain'] + \
            file_relative_path.replace('/static', '')
    url = urlparse(request.url)
    return url.scheme + '://' + url.host + file_relative_path

Using PIL create a method to create image

This method is created to create the image based on any size passed it to as a parameter. The important role of this is to convert the image into jpg and then resize it on the basis of size and aspect ratio provided.
Earlier, in Orga Server, we were directly using the “open” method to open Image files but since they are no longer needed to be on the local server, a user can provide the link to any direct image. To add this support, all we needed is to use StringIO to turn the read string into a file-like object

image_file = cStringIO.StringIO(urllib.urlopen(image_file).read())

Next, I have to work on clearing the temporary images from the cloud which was created using temporary APIs. I believe that will be a cakewalk for locally stored images since I already had this support in this method.

if remove_after_upload:
        os.remove(image_file)

Update APIs to incorporate this feature
Below is an example how this works in an API.

if data.get('original_image_url') and data['original_image_url'] != event.original_image_url:
            uploaded_images = create_save_image_sizes(data['original_image_url'], 'event', event.id)
            data['original_image_url'] = uploaded_images['original_image_url']
            data['large_image_url'] = uploaded_images['large_image_url']
            data['thumbnail_image_url'] = uploaded_images['thumbnail_image_url']
            data['icon_image_url'] = uploaded_images['icon_image_url']
        else:
            if data.get('large_image_url'):
                del data['large_image_url']
            if data.get('thumbnail_image_url'):
                del data['thumbnail_image_url']
            if data.get('icon_image_url'):
                del data['icon_image_url']

Here the method “create_save_image_sizes” provides the different URL of different images of different sizes and we clearly dropping any other images of different sizes is provided by the user.

General Suggestion
Sometimes when we work on such issues there are some of the things to take care of for example, if you checked the first snippet, I tried to ensure that you will get the URL although it is sure that static_domain will not be blank, because even if the user (admin) doesn’t fill that field then it will be filled by server hostname
A similar situation is the one where there is no record in Image Sizes table, may be server admin didn’t add one. In that case, it will use the standard sizes stored in the codebase to create different images of different sizes.

Resources:

Continue ReadingImage Uploading in Open Event API Server

Permission Manager in Open Event API Server

Open Event API Server uses different decorators to control permissions for different access levels as discussed here. Next challenging thing for permissions was reducing redundancy and ensuring permission decorators are independent of different API views. They should not look to the view for which they are checking the permission or some different logic for different views.

In API Server, we have different endpoints that leads to same Resource this way we maintain relationships between different entities but this leads to a problem where permission decorators has to work on different API endpoints that points to different or same resource and but to check a permission some attributes are required and one or more endpoints may not provide all attributes required to check a permission.

For instance, PATCH /session/id` request requires permissions of a Co-Organizer and permission decorator for this requires two things, user detail and event details. It is easy to fetch user_id from logged in user while it was challenging to get “event_id”. Therefore to solve this purpose I worked on a module named “permission_manager.py” situated at “app/api/helpers/permission_manager.py” in the codebase

Basic Idea of Permission Manager

Permission manager basically works to serve the required attributes/view_kwargs to permission decorators so that these decorators do not break

Its logic can be described as:

    1. It first sits in the middle of a request and permission decorator
    2. Evaluates the arguments passed to it and ensure the current method of the request (POST, GET, etc ) is the part of permission check or not.
    3. Uses two important things, fetch and fetch_as
      fetch => value of this argument is the URL parameter key which will be fetched from URL or the database ( if not present in URL )
      fetch_as => the value received from fetch will be sent to permission decorator by the name as the value of this option.
    4. If the fetch key is not there in URL, It uses third parameter model which is Model if the table from where this key can be fetched and then passes it to permission decorator
    5. Returns the requested view on passing access level and Forbidden error if fails

This way it ensures that if looks for the only specific type of requests allowing us to set different rules for different methods.

if 'methods' in kwargs:
        methods = kwargs['methods']

    if request.method not in methods:
        return view(*view_args, **view_kwargs)

Implementing Permission Manager

Implementing it was a simple thing,

  1. Firstly, registration of JSON API app is shifted from app/api/__init__.py to app/api/bootstrap.py so that this module can be imported anywhere
  2. Added permission manager to the app_v1 module
  3. Created permission_manager.py in app/api/helpers
  4. Added it’s usage in different APIs

An example Usage:

decorators = (api.has_permission('is_coorganizer', fetch='event_id', fetch_as="event_id", methods="POST",
                                     check=lambda a: a.get('event_id') or a.get('event_identifier')),)

Here we are checking if the request has the permission of a Co-Organizer and for this, we need to fetch event_id  from request URI. Since no model is provided here so it is required for event_id in URL this also ensures no other endpoints can leak the resource. Also here we are checking for only POST requests thus it will pass the GET requests as it is no checking.

What’s next in permission manager?

Permission has various scopes for improving, I’m still working on a module as part of permission manager which can be used directly in the middle of views and resources so that we can check for permission for specific requests in the middle of any process.

The ability to add logic so that we can leave the check on the basis of some logic may be adding some lambda attributes will work.

Resources

Continue ReadingPermission Manager in Open Event API Server

Working With Inter-related Resource Endpoints In Open Event API Server

In this blogpost I will discuss how the resource inter-related endpoints work. These are the endpoints which involve two resource objects which are also related to another same resource object.


The discussion in this post is of the endpoints related to Sessions Model. In the API server, there exists a relationship between event and sessions. Apart from these, session also has relationships with microlocations, tracks, speakers and session-types. Let’s take a look at the endpoints related with the above.

`/v1/tracks/<int:track_id>/sessions` is a list endpoint which can be used to list and create the sessions related to a particular track of an event. To get the list we define the query() method in ResourceList class as such:

The query method is executed for GET requests, so this if clause looks for track_id in view_kwargs dict. When the request is made to `/v1/tracks/<int:track_id>/sessions`, track id will be present as ‘track_id in the view_kwargs. The tracks are filtered based on the id passed here and then joined on the query with all sessions object from database.

For the POST method, we need to add the track_id from view_kwargs to pass into the track_id field of database model. This is achieved by using flask-rest-jsonapi’s before_create_object() method. The implementation for track_id is the following:

When a POST request is made to `/v1/tracks/<int:track_id>/sessions`, the view_kwargs dict will have ‘track_id’ in it. So if track_id is present in the url params, we first ensure that a track with the passed id exists, then only proceed to create a sessions object under the given track. Now the safe_query() method is a generic custom method written to check for such things. The model is passed along with the filter attribute, and a field to include in the error message. This method throws an ObjectNotFound exception if no such object exists for a given id.

We also need to take care of the permissions for these endpoints. As the decorators are called even before schema validation, it was difficult to get the event_id for permissions unless adding highly endpoint-specific code in the permission manager core, leading to loss of generality.  So the leave_if parameter of permission check was used to overcome this issue. Since the permissions manager isn’t fully developed yet, this is to be changed in the improved implementation of the permissions manager.

Similar implementations for micro locations and session types was done. All the same is not explained in this blogpost. For extended code, take a look at the source code of this schema.

For speaker relation, a few things were different. This is because speakers-sessions is a many-to-many relationship. Let’s take a look at this:

As it is a many-to-many relationship, a association_table was used with flask-sqlalchemy. So for the query() method, the same association table is queried after extracting the speaker_id from view_kwargs dict.

For the POST request on `/v1/speakers/<int:speaker_id>/sessions` , flask-rest-jsonapi’s after_create_object() method is used to insert the request in association_table. In this method the parameters are the following: self, obj, data, view_kwargs

Now view_kwargs contains the url parameters, so we make a check for speaker_id in view_wargs. If it is present, then before proceeding to insert data, we ensure that a speaker exists with that id using the safe_query() method as described above. After that, the obj argument of the method is used. This contains the object that was created in  previous method. So now once the sessions object has been created and we are sure that a speaker exists with given speaker_id, this is just to be appended to obj.speakers  so that this relationship tuple is inserted into the association table.

This updates the association table ‘speakers_session’ in this case. The other such endpoints are being worked upon in a similar fashion and will be consolidated as part of a set of robust APIs, along with the improved permissions manager for the Open Event API Server.

Additional Resources

Code, Issues and Pull Request involved

Continue ReadingWorking With Inter-related Resource Endpoints In Open Event API Server

How Meilix Generator sends Email Notifications with SendGrid

We wanted to notify the users once the build was ready for download. To solve this we attempted making an email server on Meilix Generator but that can send email when it starts but it would take around 20 minutes to get the build ready so we thought of checking the deploy link status and send email whenever the link status was available (200) but the problem with this method was that the link can be pre available if ISO is rebuilt for same event.

Then, we attempted sending mail by Travis CI but the problem in that was closed SMTP ports (they have a strict policy about that) then we thought that Travis CI can trigger the Sendgrid which can send email to the user with the help of API.

We will use this code so that once the deployment of ISO by Travis CI is done it can execute the email script which requests Sendgrid to send email to the user.

after_deploy:
  - ./mail.py

 

We can create code using code generation service of Sendgrid we are going to choose python as it is easier to manipulate strings in python and we are going to use email as an environment variable.

After generation of python 3 code from the sendgrid website we are going to edit the message and email and hide the API key as an environment variable and create an authorization string to be used there too.

The URL will be generated by the below script as the body of url remains same only two things will change the TRAVIS_TAG which is event name and date.

date = datetime.datetime.now().strftime('%Y%m%d')
url="https://github.com/xeon-zolt/meilix/releases/download/"+os.environ["TRAVIS_TAG"]+"/meilix-zesty-"+date+"-i386.iso"

 

We can use this to hide the api key and use it as an environment variable because if the api key is visible in logs anyone can use it to exploit it and use it for spamming purpose.

authorization = "Bearer " + os.environ["mail_api_key"]
headers = {
    'authorization': authorization,

 

The main thing left to edit in the script is the message which is in the payload and is a string type so we are going to use the email received by Meilix generator as an environment variable and concatenate it with the payload string the message sent is in the value which is in the HTML format and we add the generated URL in similar way we added email variable to string.

payload = "{\"personalizations\":[{\"to\":[{\"email\":\"" + os.environ["email"] + "\"}],\"subject\":\"Your ISO is Ready\"}],\"from\":{\"email\":\"xeon.harsh@gmail.com\",\"name\":\"Meilix Generator\"},\"reply_to\":{\"email\":\"xeon.harsh@gmail.com\",\"name\":\"Meilix Generator\"},\"subject\":\"Your ISO is ready\",\"content\":[{\"type\":\"text/html\",\"value\":\"<html><p>Hi,<br>Your ISO is ready<br>URL : "+url+"<br><br>Thank You,<br>Meilix Generator Team</p></html>\"}]}"

 

The sent email looks like this

References

Continue ReadingHow Meilix Generator sends Email Notifications with SendGrid

Designing A Virtual Laboratory With PSLab

What is a virtual laboratory

A virtual lab interface gives students remote access to equipment in laboratories via the Internet without having to be physically present near the equipment. The idea is that lab experiments can be made accessible to a larger audience which may not have the resources to set up the experiment at their place. Another use-case scenario is that the experiment setup must be placed at a specific location which may not be habitable.

The PSLab’s capabilities can be increased significantly by setting up a framework that allows remote data acquisition and control. It can then be deployed in various test and measurement scenarios such as an interactive environment monitoring station.

What resources will be needed for such a setup

The proposed virtual lab will be platform independent, and should be able to run in web-browsers. This necessitates the presence of a lightweight web-server software running on the hardware to which the PSLab is connected. The web-server must have a framework that must handle multiple connections, and allow control access to only authenticated users.

Proposed design for the backend

The backend framework must be able to handle the following tasks:

  • Communicate with the PSLab hardware attached to the server
  • Host lightweight web-pages with various visual aids
  • Support an authentication framework via a database that contains user credentials
  • Reply with JSON data after executing single commands on the PSLab
  • Execute remotely received python scripts, and relay the HTML formatted output. This should include plots

Proposed design for the frontend

  • Responsive, aesthetic layouts and widget styles.
  • Essential utilities such as Sign-up and Sign-in pages.
  • Embedded plots with basic zooming and panning facilities.
  • Embedded code-editor with syntax highlighting
  • WIdgets to submit the code to the server for execution, and subsequent display of received response.

A selection of tools that can assist with this project, and the purpose they will serve:

Backend

  • The Python communication library for the PSLab
  • FLASK: ‘Flask is a BSD Licensed microframework for Python based on Werkzeug, Jinja 2 and good intentions.’   . It can handle concurrent requests, and will be well suited to serve as our web server
  • MySQL: This is a database management utility that can be used to store user credentials, user scripts, queues etc
  • WerkZeug: The utilities to create and check password hashes are essential for exchanging passwords via the database
  • Json: For relaying measurement results to the client
  • Gunicorn + Nginx: Will be used when more scalable deployment is needed, and the built-in webserver of Flask is unable to handle the load.

Frontend

  • Bootstrap-css: For neatly formatted, responsive UIs
  • Jqplot: A versatile and expandable js based plotting library
  • Ace code editor: A browser based code editor with syntax highlighting, automatic indentation and other user-friendly features. Written in JS
  • Display documentation:  These can be generated server side from Markdown files using Jekyll. Several documentation files are already available from the pslab-desktop-apps, and can be reused after replacing the screenshot images only.

Flow Diagram

Recommended Reading

[1]: Tutorial series  for creating a web-app using python-flask and mysql. This tutorial will be extensively followed for creating the virtual-lab setup.

[2]: Introduction to the Virtual Labs initiative by the Govt of India

[3]: Virtual labs at IIT Kanpur

Continue ReadingDesigning A Virtual Laboratory With PSLab

Rendering Open Event Server’s API-Blueprint document

After writing the FOSSASIA‘s Open Event Server project API- Blueprint Document manually, we wanted to know how we could render the document, how to check it in an HTML-client friendly format and how to make it change the look as we go. In order to do that, we found two rendering ways.

They are:

1) The apiary editor:

This editor helps us to render API blueprints and print them in user readable API documented format. When we create the API blueprint manually, we always follow the pattern write an api blueprint i.e the name and metadata, then followed by resource groups and actions, which was already discussed in the last blog. In order to use the apiary editor, we start off by creating our first project. Initially during the our first use of this editor, we will get a default “polls and vote” example api project. This is a template we can use as guide. The pole/vote api looks something like this in the editor mode:

 

Apiary has a facility to test an API, document an API, inspect an API or simply edit an API. We first start off by creating a project “open-event-api”. Next, in the editor mode of the apiary, we add the contents of our api-blueprint documents.
Here is an example of how USERS API is rendered. If we get our request and response correctly, on clicking List All Users we will get a good 200 response like this in the editor:

However, if we tend to go off format with the api-blueprint, we get an invalid error:

The final rendering and how the API result can be seen in the document mode with the respective API’s request and response.
The document mode request and response look like this:

This rendered doc can be viewed publicly with the link got in the document mode. Similarly, we test it out in the editor for the rest of the ap. This is a simple way to render your api blueprint.

2)  The aglio renderer:

Since API blueprint is presented in the form of .apib format, the con side of it is it is not easily viewable by viewers. Even though we use apiary, view the rendered docs along with getting a shareable link, we would surely like the docs for our API server to be hosted in our server as well. So, we use Aglio exactly to do that .

It is an API Blueprint renderer which supports multiple themes. It converts the apib file into user readable formats such as pdf, html, etc. Here since we want to host it as a webpage, we render it in the form of .html.  It outputs static HTML of the result and can be served by any web host. Since API Blueprint is a Markdown-based document format, this lets us write API descriptions and documentation in a simple and straightforward way.
An example of how aglio rendered document in a three column format looks like:

The best thing about Aglio is not only does it support a lot many theme and templates, but it also allows you to provide your own custom theme and template to render the html file from the api blueprint.

How to use aglio renderer:

  • We first follow up with installation:
npm install -g aglio
  • After installation, we go to the folder the .apib file is stored and generate the HTML. There are 5 built in themes available with two column and three column layout. They are:
# Default theme
aglio -i input.apib -o output.html

-> This command takes as input the input.apib file as API Blueprint and creates a rendered output file named output.html.

 

# Use three-column layout
aglio -i input.apib --theme-template triple -o output.html

-> This command takes as input the input.apib file as API Blueprint and creates a rendered output file named output.html. However it uses the theme-template flag. The theme-template flag is used to define whether the layout of the rendered html is two column or three column. In this command, it is set as triple which means three column.

# Built-in color scheme
aglio --theme-variables slate -i input.apib -o output.html

-> Aglio has different color schemes that you can use while rendering the docs html file. Some of them are Olio, Streak, Slate, etc.

# Customize a built-in style
aglio --theme-style default --theme-style ./my-style.less -i input.apib -o output.html

-> Suppose you want to provide a syntactical style sheet such as SASS, LESS, etc. so as to define your own styling. You can do that as given in the above example. The my-style.less is a user defined syntactical stylesheet. This is then used to provide styling for the output file rendered.

# Custom layout template
aglio --theme-template /path/to/template.jade -i input.apib -o output.html

-> You can write your own custom layout template in a template.jade file and use that for generating the output.html instead of two or three column layout.

We run the build-in color scheme: aglio –theme-variables slate -i api_blueprint.apib -o output.html to generate our Open Event Server api document which we have something like this:

You can visit the live version of FOSSASIA‘s Open Event Server API Document right here: https://api.eventyay.com/

Continue ReadingRendering Open Event Server’s API-Blueprint document

Open Event Server Ticket PDF: Where and Where-not to use static frame in xhtml2pdf

One among the very important features of Open Event Server project is the tickets sales feature, where a user can buy a number of different tickets for a number of people after which he is given a link to download the ticket pdf. However, an issue concerning our Open Event Server project was that if a buyer bought different tickets at a time with different individual ticket holders, all tickets contained the same name, type and QR-Code, which in no way was acceptable since tickets and holders were different.

We use the xhtml2pdf facility in order to convert an html to pdf. An xhtml2pdf facility helps us in converting HTML contents into PDF without the use of browser ‘print’ facility. In order to do this, we use the help of pages and frames, pages being the page of the PDF document, while a frame being that part of area within the page where the contents get stored.

What is a PDF and HTML?

The basic understanding of a PDF and an HTML is that, a PDF or Portable Document Format has layout such that it is measured in terms of specific width and height. However, for an HTML or Hyper Text Markup Language, we do no not have those specific widths and heights. Rather an HTML’s width depends on a person’s device of view and height can be infinitely long as desired.

In terms of xhtml2pdf relation with pages and frames layout, we can identify with this diagram:

+-page———————+
|                                    |
|  +-content_frame-+  |
|  |                          |    |
|  |                          |    |
|  |                          |    |
|  |                          |    |
|  +———————+   |
|                                    |
+—————————-+


The stated issue was due to static frame in xhtml2pdf.

What is that you ask?

Static frames vs Content frames
xhtml2pdf uses the concept of Static Frames to define content that remains the same across different pages (like headers and footers), and uses Content Frames to position the to-be-converted HTML content.

Static Frames are defined through use of the @frame property -pdf-frame-content. Regular HTML content will not flow through Static Frames.

Content Frames are @frame objects without this property defined. Regular HTML content will flow through Content Frames.(xhtml2pdf documentation)

So, the basic idea of the use of  -pdf-frame-content  was to make the contents of the pdf static i.e. without having to continuously change alignments of the page. It made the whole pdf stay static.

This actually caused the whole pdf content to stay constant, event while the loop was going over different name, QR-code and ticket-name values. That means the first loop content stayed over even with changes in values.

Just a simple fix of not making the frame static was the solution.

Here is a few insight of what we had before that was causing the issue:

 <style>
        @page {
            size: a4 portrait;
            background-image: url('{{ base_dir }}/static/data/ticket-trans-
                                  notext.png');
            margin: 0;

            @frame col1_frame {             /* Content frame 1 */
                left: 80pt;  top: 30pt;
                height: 250pt; width: 300pt;
                -pdf-frame-content: main_content;
            }

            @frame qrcode {
                left: 455pt;  top: 50pt;
                width: 100pt; height: 120pt;
                -pdf-frame-content: qr_code;
            }
            @frame number_frame {
                 left: 455pt;  top: 25pt;
                width: 100pt; height: 30pt;
                -pdf-frame-content: number_content;
            }


             @frame col2_frame {
                left: 440pt; top: 170pt;
                -pdf-frame-content: personal_content;
            }
        }


Here,
main_content, qrcode, personal_content and number_content? are simply the class you want to make static.
The following was done to fix it:

 <style>
        @page {
            size: a4 portrait;
            background-image: url('{{ base_dir }}/static/data/ticket-trans-
                                  notext.png');
            margin: 0;

            @frame col1_frame {             /* Content frame 1 */
                left: 80pt;  top: 30pt;
                height: 250pt; width: 300pt;
            }

            @frame qrcode {
                left: 455pt;  top: 50pt;
                width: 100pt; height: 120pt;
            }
            @frame number_frame {
                 left: 455pt;  top: 25pt;
                width: 100pt; height: 30pt;
                -pdf-frame-content: number_content;
            }


             @frame col2_frame {
                left: 440pt; top: 170pt;
            }
        }



But the main question was, ‘where and where-not to use static frame’.
So the simple answer to that is, when you want the contents to stay exactly the same over the different pages of the pdf (for example the company logo or the administrator signature, etc), then you can happily get on with -pdf-frame-content. i.e:

          -pdf-frame-content: <class_name>


If not, or you need to loop over different values, then use Content frames.

Also, for more detailed understanding of PDF and HTML pages and frames, we have this good documentation written about it here: https://github.com/xhtml2pdf/xhtml2pdf/blob/master/doc/source/format_html.rst#static-frames-vs-content-frames

Continue ReadingOpen Event Server Ticket PDF: Where and Where-not to use static frame in xhtml2pdf

Documenting APIs with Yaydoc

API Documentation is a quick and concise way to tell a user about how to use a library or work with a program. It details classes, functions, parameters, return types and more. Courtesy of Sphinx, Yaydoc had build in support for Documenting APIs for Python based projects right from it’s inception. Sphinx has a built in tool autodoc which provides certain directives such as autoclass, automodule, etc which can be used to automatically extract docstrings from all specified Python packages and modules and use it to generate API documentation. As a user of Yaydoc you could add ReST sources files with appropriate directives provided by autodoc and we would handle the rest. As part of enhancing this feature we wanted to do three things.

  • Enhance support for Python
  • Extend API documentation to other languages apart from Python
  • Automate the process of generating ReST source files

For Enhancing support for python projects, we implemented a few things.

Since autodoc imports the modules it needs to document, There could be import errors if a dependency was not met. To fix this issue, Now a user can specify certain modules to be mocked. This would really come in handy with projects depending on packages with third party C extensions such as numpy, scipy, etc.

{% if mock_modules %}
mock_modules = [name.strip() for name in '{{ mock_modules }}'.split(',')]
sys.modules.update((mod_name, mock.Mock()) for mod_name in mock_modules)
{% endif %}

Apart from this, if we detect a setup.py in the repository or a requirements.txt, we automatically try to install from it to meet dependencies.

# autodoc imports the module while building source files. To avoid
# ImportError, install any packages in requirements.txt of the project
# if available
if [ -f $ROOT_DIR/setup.py ]; then
  pip install $ROOT_DIR/
elif [ -f $ROOT_DIR/requirements.txt ]; then
  pip install -q -r $ROOT_DIR/requirements.txt
fi

We also crawl the repository to detect any packages and add them to sys.path. With these changes, a user can expected generated API docs without having to extend conf.py.

{% if autoapi_python == 'true' %}
for (dirpath, dirnames, filenames) in os.walk('{{ root_dir }}'):
    # Directory contains __init__.py. It should be a python package
    if '__init__.py' in filenames:
        # appending instead of inserting at front so that user
        # cannot overwrite some of our own modules.
        sys.path.append(os.path.abspath(os.path.dirname(dirpath)))
{% endif %}

The second goal is a no brainer. We would like to support as many languages as we can. With this week’s update, Java has been added to the officially supported list of languages for which Yaydoc can generate full API documentation without any manual intervention. To extract API documentation for java source files, we used a sphinx extension named javasphinx. From the official javasphinx docs,

javasphinx is a Sphinx extension that provides a Sphinx domain for documenting Java projects and a javasphinx-apidoc command line tool for automatically generating API documentation from existing Java source code and Javadoc documentation.

javasphinx-apidoc -o source/ $ROOT_DIR/$AUTOAPI_JAVA_PATH/
sphinx-apidoc -o source/ $ROOT_DIR/$AUTOAPI_PYTHON_PATH/

For the third goal, we use the tools sphinx-apidoc and javasphinx-apidoc to generate source files.

Resources

Continue ReadingDocumenting APIs with Yaydoc

Save Server Response to File Using Python in Open Event Android App Generator

The Open Event Android project helps event organizers to generate Apps (apk format) using Open Event App generator for their events/conferences by providing API endpoint or zip generated using Open Event server.

The Open Event Android project has an assets folder which contains sample JSON files for the event, which are used to load the data when there is no Internet connection. When an organizer generates an app with the help of an API endpoint, then all the assets, e.g. images, are removed and because of it generated don’t have sample JSON files. So if the device is offline then it will not load the data from the assets (issue #1606). The app should contain sample event JSON files and should be able to load the data without the Internet connection.

One solution to this problem is to fetch all the event data (JSON files) from the server and save it to the assets folder while generating the app using Open Event App generator. So in this blog post, I explain how to make a simple request using Python and save the response in a file.

Making a simple request

To fetch the data we need to make a request to the server and the server will return the response for that request. Here’s step to make a simple request,

Import Requests module using,

import requests

The requests module has get, post, put, delete, head, options to make a request. Example:

requests.get(self.api_link + '/' + end_point)

Here all method returns requests.Response object. We can get all the information we need from this object.

So we can make response object like this,

response = requests.get(self.api_link + '/' + end_point)

Saving response in a file

For saving response in a file we need to open/create a file for writing.

Create or open file

The open() method is used for opening file. It takes two arguments one is file name and second in access mode which determines the mode in which the file has to be opened, i.e., read, write, append, etc.

file = open(path/to/the/file, "w+")

Here “w+” overwrites the existing file if the file exists. If the file does not exist, it creates a new file for reading and writing.

Write and close file

Now write the content of the response using the write() method of file object then close the file object using the close()  method. The close() method flushes any unwritten information and closes the file object.

file.write(response.text)
file.close()

Here response.text gives the content of the response in Unicode format.

Conclusion

The requests.Response object contains much more data like encoding, status_code, headers etc. For more info about request and response refer additional resources given below.

Additional Resources

Continue ReadingSave Server Response to File Using Python in Open Event Android App Generator

Create Scraper in Javascript for Loklak Scraper JS

Loklak Scraper JS is the latest repository in Loklak project. It is one of the interesting projects because of expected benefits of Javascript in web scraping. It has a Node Javascript engine and is used in Loklak Wok project as bundled package. It has potential to be used in different repositories and enhance them.

Scraping in Python is easy (at least for Pythonistas) as one needs to just import Request library and BeautifulSoup library (lxml as better option), write some lines of code using Request library to get webpage and some lines of bs4 to walk through html and scrape data. This sums up to about less than a hundred lines of coding, where as Javascript coding isn’t easily readable (at least to me) as compared to Python. But it has an advantage, it can easily deal with Javascript in the pages we are scraping. This is one of the motive, Loklak Scraper JS repository was created and we contributed and worked on it.

I recently coded a Javascript scraper in loklak_scraper_js repository. While coding, I found it’s libraries similar to the libraries, I use to code in Python. Therefore, this blog is for Pythonistas how they can start scraping in Javascript as they finish reading and also contribute to Loklak Scraper JS.

First, replace Python interpreter, Request and Beautifulsoup library with Node JS interpreter, Request and Cheerio JS library.

1) Node JS Interpreter: Node JS Interpreter is used to interpret Javascript files. This is different from Python as it deals with the project instead of a module in case of Python. The most compatible Node for most of the libraries is 6.0.0 , where as latest version available(as I checked) is 8.0.0

TIP: use `–save` with npm like here while installing a library.

2) Request Library :- This is used to load webpage to be processed. Similar to one in Python.

Request-promise library, a wrapper around Request with implementation of Bluebird library, improves readability and makes code cleaner (how?).

 

3) Cheerio Library:- A Pythonista (a rookie one) can call it twin of BeautifulSoup Library. But this is faster and is Javascript. It’s selector implementation is nearly identical to jQuery’s.

Let us code a basic Javascript scraper. I will take TimeAndDate scraper from loklak_scraper_js as example here. It inputs place and outputs its local time.

Step#1: fetching HTML from webpage with the help of Request library.

We input url to Request function to fetch the webpage and is saved to `html` variable. This scrapeTimeAndDate() function scrapes data from html

url = "http://www.timeanddate.com/worldclock/results.html?query=London";

request(url, function(error, response, body) {

 if(error) {

    console.log("Error: " + error);

    process.exit(-1);

 }

 html = body;

 scrapeTimeAndDate()

});

 

Step#2: To scrape important data from html using Cheerio JS

list of date and time of locations is embedded in table tag, So we will iterate through <td> and extract text.

  1. a) Load html to Cheerio as we do in beautifulsoup

In Python

soup = BeautifulSoup(html,'html5lib')

 

In Cheerio JS

$ = cheerio.load(html);

 

  1. b) This line finds first tr tag in table tag.

var htmlTime = $("table").find('tr');

 

  1. c) Iterate through td tags data by using each() function. This function acts as loop (in Python) iterating through list of elements in which data will be extracted.

htmlTime.each(function (index, element) {      

  // in python, we will use loop, `for element from elements:`

  tag = $(element).find("td");    // in python, `tag = soup.find_all('td')`

  if( tag.text() != "") {

    .

    .

    //EXTRACT DATA

    .

    .

  } else {

    //go to next td tag

    tag = tag.next();

  }

}

 

  1. d) To extract data

Cheerio JS loads html and uses DOM model traverse through. DOM model considers html is tree. So, go to the tag, and scrape data you want.

//extract location(text) enclosed in tag

location = tag.text();

//go to next tag

tag = tag.next();

//extract time(text) enclosed in tag

time = tag.text();

//save in dictionary like in python

loc_list["location"] = location;

loc_list["time"] = time;

 

Some other useful functions:-

1) $(selector, [context], [root])

returns object of selector(any tag) with class or id inside root

2) $(“table”).attr(name, value)

To get tag object having attribute having `value`

3) obj.html()

To get html enclosed in tags

For more just drop in here

Step#3: Execute scraper using command

node <scrapername>.js

 

Hoping that this blog is able to  how to scrape in Javascript by finding similarities with Python.

Resources:

Continue ReadingCreate Scraper in Javascript for Loklak Scraper JS