Create Event by Importing JSON files in Open Event Server

Apart from the usual way of creating an event in FOSSASIA’s Orga Server project, that is, by sending POST requests to the Events API, you can also create an event by importing a zip file that contains multiple JSON files. This lets you create a large event like FOSSASIA, with lots of data related to sessions, speakers, microlocations and sponsors, just by uploading JSON files to the system. Sample JSON files can be found in the open-event project of FOSSASIA. The basic workflow of importing an event is as follows:

  • The first step is similar to uploading files to the server. We send a POST request with multipart form data containing the zipped archive of JSON files.
  • The POST request starts a celery task that imports the data from the JSON files and stores it in the database.
  • The celery task URL is returned as the response to the POST request. You can poll this URL to get the status of the task. If the status is FAILURE, we get the error text along with it. If the status is SUCCESS, we get the resulting event data.
  • In the celery task, each JSON file is read separately and the data is stored in the db with the proper relations.
  • Sending a GET request to the above-mentioned celery task URL after the task has completed returns the event id along with the event URL.

Let’s see how each of these points works in the background.

Uploading ZIP containing JSON Files

To upload a zip archive, we send multipart form data in the POST request instead of JSON data. The multipart/form-data format allows an entire file to be sent as data in the POST request along with the relevant file information.

An example cURL request looks something like this:

curl -H "Authorization: JWT <access token>" -X POST -F 'file=@event1.zip' http://localhost:5000/v1/events/import/json

The above cURL request uploads the file event1.zip from your current directory, with the key ‘file’, to the endpoint /v1/events/import/json. The user uploading the file needs a JWT access token, in other words needs to be logged in to the system, as this is necessary to create an event.
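To make the multipart/form-data format more concrete, here is a minimal Python sketch of the request body that a client such as cURL builds for the upload above. The helper name build_multipart_body and the fixed demo boundary are assumptions made for illustration; they are not part of Open Event Server.

```python
import uuid


def build_multipart_body(field_name, filename, file_bytes, boundary=None):
    """Encode a single file as a multipart/form-data body.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        "Content-Type: application/zip\r\n"
        "\r\n"
    ).encode("utf-8")
    # The closing delimiter ends with two extra dashes.
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + file_bytes + tail, f"multipart/form-data; boundary={boundary}"


# 22 bytes that form an empty zip archive stand in for event1.zip here.
empty_zip = b"PK\x05\x06" + b"\x00" * 18
body, content_type = build_multipart_body(
    "file", "event1.zip", empty_zip, boundary="demo-boundary"
)
print(content_type)
```

The resulting body and Content-Type value could then be sent with urllib.request.Request(url, data=body, headers={...}) together with the Authorization header shown in the cURL example.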

@import_routes.route('/events/import/<string:source_type>', methods=['POST'])
@jwt_required()
def import_event(source_type):
    if source_type == 'json':
        file_path = get_file_from_request(['zip'])
    else:
        file_path = None
        abort(404)
    from helpers.tasks import import_event_task
    task = import_event_task.delay(email=current_identity.email, file=file_path,
                                   source_type=source_type, creator_id=current_identity.id)
    # create import job
    create_import_job(task.id)

    # if testing
    if current_app.config.get('CELERY_ALWAYS_EAGER'):
        TASK_RESULTS[task.id] = {
            'result': task.get(),
            'state': task.state
        }
    return jsonify(
        task_url=url_for('tasks.celery_task', task_id=task.id)
    )


After the request is received we check if a file exists in the key ‘file’ of the form-data. If it is there, we save the file and get the path to the saved file. Then we send this path over to the celery task and run the task with the .delay() function of celery. After the celery task is started, the corresponding data about the import job is stored in the database for future debugging and logging purposes. After this we return the task url for the celery task that we started.

Celery Task to Import Data

Just like exporting an event, importing is a time-consuming task, and we don’t want other application requests to be paused because of it. Hence, we use a celery queue to execute this task. Whenever an import task is started, it is added to the celery queue. When it reaches the front of the queue, it is executed.

For importing, we have created a celery task, import.event which calls the import_event_task_base() function that uses the import helper functions to get the data from JSON files imported and saved in the DB. After the task is completed, we update the import job data in the table with the status as either SUCCESS or FAILURE depending on the outcome of the celery task.

The celery task returns the newly created event’s id and the frontend URL at which the event can be visited. This, along with the status of the celery task, is returned as the response to a GET request on the celery task URL. If the celery task fails, the state is changed to FAILURE and the error that the task encountered is returned as the error message in the result key. We also print an error traceback in the celery worker.

@celery.task(base=RequestContextTask, name='import.event', bind=True, throws=(BaseError,))
def import_event_task(self, file, source_type, creator_id):
    """Import Event Task"""
    task_id = self.request.id.__str__()  # str(async result)
    try:
        result = import_event_task_base(self, file, source_type, creator_id)
        update_import_job(task_id, result['id'], 'SUCCESS')
        # return item
    except BaseError as e:
        print(traceback.format_exc())
        update_import_job(task_id, e.message, e.status if hasattr(e, 'status') else 'failure')
        result = {'__error': True, 'result': e.to_dict()}
    except Exception as e:
        print(traceback.format_exc())
        update_import_job(task_id, e.message, e.status if hasattr(e, 'status') else 'failure')
        result = {'__error': True, 'result': ServerError().to_dict()}
    # send email
    send_import_mail(task_id, result)
    # return result
    return result

 

Save Data from JSON

In the import helpers, we have the functions which perform the main task of reading the JSON files, creating sqlalchemy model objects from them and saving them in the database. There are a few global dictionaries which maintain the order in which the files are to be imported and saved, as well as the mapping from file to model. The first JSON file to be imported is the event JSON file, since all the other tables to be imported are related to the event table. After that, the files are read in the following order:

  1. SocialLink
  2. CustomForms
  3. Microlocation
  4. Sponsor
  5. Speaker
  6. Track
  7. SessionType
  8. Session

This order helps maintain the foreign key constraints. To import data from these files we use the function create_service_from_json(). It sorts the elements in the data list by the key “id” and then loops over all the elements (dictionaries) in the list. In each iteration it deletes the unnecessary key-value pairs from the dictionary and sets the event_id of that element to the id of the event newly created by the import, instead of the old id present in the data. After this, it creates a model object from the dictionary, using the model mapped to the filename, and saves that object in the database.

def create_service_from_json(task_handle, data, srv, event_id, service_ids=None):
    """
    Given :data as json, create the service on server
    :service_ids are the mapping of ids of already created services.
        Used for mapping old ids to new
    """
    if service_ids is None:
        service_ids = {}
    global CUR_ID
    # sort by id
    data.sort(key=lambda k: k['id'])
    ids = {}
    ct = 0
    total = len(data)
    # start creating
    for obj in data:
        # update status
        ct += 1
        update_state(task_handle, 'Importing %s (%d/%d)' % (srv[0], ct, total))
        # trim id field
        old_id, obj = _trim_id(obj)
        CUR_ID = old_id
        # delete not needed fields
        obj = _delete_fields(srv, obj)
        # related
        obj = _fix_related_fields(srv, obj, service_ids)
        obj['event_id'] = event_id
        # create object
        new_obj = srv[1](**obj)
        db.session.add(new_obj)
        db.session.commit()
        ids[old_id] = new_obj.id
        # add uploads to queue
        _upload_media_queue(srv, new_obj)

    return ids
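The body of _fix_related_fields() is not shown in this post. The following is a hypothetical sketch of how old foreign keys could be remapped using the {old_id: new_id} dictionaries that create_service_from_json() returns. The function name remap_related_ids, the related_fields argument and the shape of service_ids are assumptions made for illustration.

```python
def remap_related_ids(obj, related_fields, service_ids):
    """Return a copy of obj with old foreign keys replaced by new ids.

    related_fields maps a field name to the service it refers to,
    e.g. {"track_id": "tracks"}. service_ids maps a service name to the
    {old_id: new_id} dictionary produced when that service was imported.
    """
    fixed = dict(obj)
    for field, service in related_fields.items():
        old_id = fixed.get(field)
        if old_id is None:
            continue
        # Becomes None when the related object was never imported.
        fixed[field] = service_ids.get(service, {}).get(old_id)
    return fixed


# Hypothetical id mappings collected from earlier imports.
service_ids = {"tracks": {7: 101}, "microlocations": {3: 55}}
session = {"title": "Opening", "track_id": 7, "microlocation_id": 3}
fixed = remap_related_ids(
    session,
    {"track_id": "tracks", "microlocation_id": "microlocations"},
    service_ids,
)
print(fixed)
```

This also illustrates why the import order matters: the mapping for tracks and microlocations has to exist before sessions that refer to them are created.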


After the data has been saved, the next step is to upload all the media files to the server. We do this using the _upload_media_queue() function. It takes the upload paths from the storage.py helper file for APIs and then uploads the files, using the various helper functions, to static storage services such as AWS S3 or Google Storage.

Other than this, the import helpers also contain the function to create an import job, which keeps a record of all the imports along with the task url and the user id of the user who started the import. It also stores the status of the task. Then there is the get_file_from_request() function, which saves the file uploaded through the POST request and returns the path to that file.

Get Response about Event Imported

The POST request returns a task url of the form /v1/tasks/ebe07632-392b-4ae9-8501-87ac27258ce5. To get the final result, you need to keep polling this URL. To know more about polling, read my previous blog post about exporting events. When the task is completed you get a “result” key along with the state, which can be either SUCCESS or FAILURE. On FAILURE you get the error message explaining why the celery task failed. On SUCCESS you get data about the event that was created by the import: the event id, the event name and the event url, which you can use to visit the event from the frontend. This data is also sent to the user as an email and a notification.

An example response looks something like this:

{
    "result": {
        "event_name": "FOSSASIA 2016",
        "id": "24",
        "url": "https://eventyay.com/events/ab3de6"
    },
    "state": "SUCCESS"
}
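The polling described in this section can be sketched as a small loop. This is only an illustration: poll_task and its parameters are hypothetical names, the intermediate PENDING state is an assumption based on Celery's default task state, and the HTTP call is replaced by canned responses so that the example runs without a server.

```python
import time


def poll_task(fetch_status, interval=1.0, max_attempts=30, sleep=time.sleep):
    """Call fetch_status() until the task reaches SUCCESS or FAILURE.

    fetch_status is any callable returning a dict such as
    {"state": "SUCCESS", "result": {...}}. Raises TimeoutError if the
    task is still running after max_attempts polls.
    """
    for _ in range(max_attempts):
        status = fetch_status()
        if status.get("state") in ("SUCCESS", "FAILURE"):
            return status
        sleep(interval)
    raise TimeoutError("task did not finish in time")


# Canned responses standing in for GET requests to the task URL.
responses = iter([
    {"state": "PENDING"},
    {"state": "PENDING"},
    {"state": "SUCCESS", "result": {"id": "24", "event_name": "FOSSASIA 2016"}},
])
final = poll_task(lambda: next(responses), interval=0, sleep=lambda seconds: None)
print(final["state"])
```

In a real client, fetch_status would send an authenticated GET request to the task url returned by the import endpoint and return the decoded JSON.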

The event name and the url are also sent to the user who started the import task. From the frontend, one can use the result object to show the name of the imported event and to provide the event url. Since the id and identifier are both present in the returned result, one can also use them to send GET, PATCH and other API requests to the events/ endpoint and get the corresponding relationship urls from it to query the other APIs. Thus, all the imported data becomes available to the frontend as well.

 


Using Custom Forms In Open Event API Server

One feature of the Open Event management system is the ability to add a custom form for an event. The nextgen API Server exposes endpoints to view, edit and delete forms and form-fields. This blogpost describes how to use a custom form in Open Event API Server.

Custom forms allow the event organizer to make personalized forms for their event. The form object includes an identifier set by the user, and the form itself in the form of a string. The user can also set the type of the form, which can be either text or checkbox depending on the user’s needs. There are other fields as well, which are abstracted. These fields include:

  • id : auto generated unique identifier for the form
  • event_id : id of the event with which the form is associated
  • is_required : If the form is required
  • is_included : if the form is to be included
  • is_fixed : if the form is fixed

The last three of these fields are boolean fields and provide the user with better control over forms use-cases in the event management.

Only the event organizer has permissions to edit or delete these forms, while any user who is logged in to eventyay.com can see the fields available for a custom form for an event.

To create a custom-form for event with id=1, the following request is to be made:
POST  https://api.eventyay.com/v1/events/1/custom-forms?sort=type&filter=[]

with all the above described fields to be included in the request body.  For example:

{
 "data": {
   "type": "custom_form",
   "attributes": {
     "form": "form",
     "type": "text",
     "field-identifier": "abc123",
     "is-required": "true",
     "is-included": "false",
     "is-fixed": "false"
   }
 }
}

The API returns the custom form object along with the event relationships and other self and related links. To see what the response looks like exactly, please check the sample here.

Now that we have created a form, any user can get its fields. If the event organiser wants to update a field or some other attribute of the form, they can make the following request with the custom-form id.

PATCH https://api.eventyay.com/v1/custom-forms/1

(Note: custom-form id must be included in both the URL as well as request body)

Similarly, to delete the form,
DELETE https://api.eventyay.com/v1/custom-forms/1     can be used.
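The request bodies for creating and updating a custom form can be sketched in Python as follows. This is a minimal illustration: the helper custom_form_payload is hypothetical, the attribute names are copied from the example body above, and sending the flags as JSON booleans rather than strings is an assumption about what the API accepts.

```python
import json


def custom_form_payload(form, field_type, identifier, form_id=None,
                        is_required=False, is_included=False, is_fixed=False):
    """Build the request body for a custom form.

    Pass form_id for PATCH requests, where the id must appear in the
    body as well as in the URL.
    """
    data = {
        "type": "custom_form",
        "attributes": {
            "form": form,
            "type": field_type,
            "field-identifier": identifier,
            "is-required": is_required,
            "is-included": is_included,
            "is-fixed": is_fixed,
        },
    }
    if form_id is not None:
        data["id"] = str(form_id)
    return {"data": data}


create_body = custom_form_payload("form", "text", "abc123", is_required=True)
patch_body = custom_form_payload("form", "checkbox", "abc123", form_id=1)
print(json.dumps(patch_body, indent=1))
```

The create_body would go with the POST request to events/1/custom-forms, and patch_body with the PATCH request to custom-forms/1.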


Skill Editor in SUSI Skill CMS

SUSI Skill CMS is a web application built on the ReactJS framework for creating and editing SUSI skills easily. It follows an API centric approach where the SUSI Server acts as an API server. In this blogpost we will see how to add a component which can be used to create a new skill in SUSI Skill CMS.

To create any skill in SUSI we need four parameters: model, group, language and skill name. So we need to ask the user for these four parameters. For input we have a common card component which has dropdowns for selecting the model, group and language, and a text field for the skill name.

<SelectField
    floatingLabelText="Model"
    value={this.state.modelValue}
    onChange={this.handleModelChange}
>
    {models}
</SelectField>
<SelectField
    floatingLabelText="Group"
    value={this.state.groupValue}
    onChange={this.handleGroupChange}
>
    {groups}
</SelectField>
<SelectField
    floatingLabelText="Language"
    value={this.state.languageValue}
    onChange={this.handleLanguageChange}
>
    {languages}
</SelectField>
<TextField
    floatingLabelText="Enter Skill name"
    floatingLabelFixed={true}
    hintText="My SUSI Skill"
    onChange={this.handleExpertChange}
/>
<RaisedButton label="Save" backgroundColor="#4285f4" labelColor="#fff" style={{marginLeft:10}} onTouchTap={this.saveClick} />

This is the card component where we get the user input. We have API endpoints on SUSI Server for getting the list of models, groups and languages. Using those APIs we populate the dropdowns.
Then the user needs to edit the skill. For editing skills we have used Ace Editor. Ace is a code editor written in pure JavaScript. It matches the features of native editors like Sublime and TextMate.

To use Ace we need to install the component.

npm install react-ace --save                        

This command will install the dependency and update the package.json file in our project with this dependency.

To use this editor we need to import AceEditor and place it in the render function of our react class.

<AceEditor
    mode=" markup"
    theme={this.state.editorTheme}
    width="100%"
    fontSize={this.state.fontSizeCode}
    height= "400px"
    value={this.state.code}
    name="skill_code_editor"
    onChange={this.onChange}
    editorProps={{$blockScrolling: true}}
/>

Now we have a page that looks something like this

Now we need to handle the click event when a user clicks on the save button.

First we check if the user is logged in or not. For this we check if we have the required cookies and the access token of the user.

if (!cookies.get('loggedIn')) {
    notification.open({
        message: 'Not logged In',
        description: 'Please login and then try to create/edit a skill',
        icon: <Icon type="close-circle" style={{ color: '#f44336' }} />,
    });
    return 0;
}

If the user is not logged in then we show an error notification and ask them to log in.

Then we check that the user has filled in all the required fields, such as the name of the skill, and after that we call an API endpoint on SUSI Server that finally stores the skill in the skill_data_repo.
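The required-field check can be sketched independently of the React component. The snippet below is a Python illustration only; the field names in REQUIRED_FIELDS and the helper missing_fields are hypothetical and do not come from the Skill CMS source.

```python
# Hypothetical names for the four parameters a skill needs.
REQUIRED_FIELDS = ("model", "group", "language", "skill_name")


def missing_fields(form_values):
    """Return the names of required fields that are absent or blank."""
    return [
        name for name in REQUIRED_FIELDS
        if not str(form_values.get(name) or "").strip()
    ]


form_values = {
    "model": "general",
    "group": "Knowledge",
    "language": "en",
    "skill_name": "   ",  # whitespace only counts as empty
}
print(missing_fields(form_values))
```

Only when the returned list is empty would the save handler go on to call the server.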

let url = "http://api.susi.ai/cms/modifySkill.json";
$.ajax({
    url:url,
    dataType: 'jsonp',
    jsonp: 'callback',
    crossDomain: true,
    success: function (data) {
        console.log(data);
        if(data.accepted===true){
            notification.open({
                message: 'Accepted',
                description: 'Your Skill has been uploaded to the server',
                icon: <Icon type="check-circle" style={{ color: '#00C853' }} />,
            });
           }
    }
});

In the success function of ajax call we check if accepted parameter is true from the server or not. If accepted is true then we show user a notification with a message that “Your Skill has been uploaded to the server”.

To see this component running please visit http://skills.susi.ai/skillEditor.

Resources

Material-UI: http://www.material-ui.com/

Ace Editor: https://github.com/securingsincity/react-ace

Ajax: http://api.jquery.com/jquery.ajax/

Universal Cookies: https://www.npmjs.com/package/universal-cookie


Uploading Images to SUSI Server

SUSI Skill CMS is a web app to create and modify SUSI Skills. It needs API Endpoints to function and SUSI Server makes it possible. In this blogpost, we will see how to add a servlet to SUSI Server to upload images and files.

The CreateSkillService.java file is the servlet which handles the process of creating new Skills. It requires different user roles to be implemented and hence it extends the AbstractAPIHandler.

Image upload is only possible via a POST request so we will first override the doPost method in this servlet.

  @Override
  protected void doPost(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
  resp.setHeader("Access-Control-Allow-Origin", "*"); // enable CORS

resp.setHeader enables CORS for the servlet. This is required because browsers, as a security measure, restrict cross-origin requests made by web pages unless the server explicitly allows them.

        Part file = req.getPart("image");
        if (file == null) {
            json.put("accepted", false);
            json.put("message", "Image not given");
        }

Image upload to servers is usually a Multipart Request. So we get the part which is named as “image” in the form data.

When we receive the image file, we check whether an image with the same name already exists on the server.

Path p = Paths.get(language + File.separator + "images/" + image_name);

if (image_name == null || Files.exists(p)) {
    json.put("accepted", false);
    json.put("message", "The Image name not given or Image with same name is already present ");
}

If the same file is present on the server then we return an error to the user requesting to give a unique filename to upload.
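The same check can be sketched in Python to show the logic in isolation. This is an illustration under assumptions, not the servlet's code: the function check_image_name and the success message are hypothetical, and a temporary directory stands in for the skill's language folder.

```python
import tempfile
from pathlib import Path


def check_image_name(language_dir, image_name):
    """Return a response dict saying whether image_name can be stored."""
    if not image_name:
        return {"accepted": False, "message": "Image name not given"}
    target = Path(language_dir) / "images" / image_name
    if target.exists():
        return {"accepted": False,
                "message": "Image with same name is already present"}
    return {"accepted": True, "message": "Image name is available"}


with tempfile.TemporaryDirectory() as tmp:
    images = Path(tmp) / "images"
    images.mkdir()
    (images / "logo.jpg").write_bytes(b"")  # pretend an upload already exists
    taken = check_image_name(tmp, "logo.jpg")
    free = check_image_name(tmp, "new.jpg")
    missing = check_image_name(tmp, None)

print(taken["message"])
```

Returning the same accepted/message shape in every branch keeps the client-side handling uniform.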

Image image = ImageIO.read(filecontent);
BufferedImage bi = this.createResizedCopy(image, 512, 512, true);
if(!Files.exists(Paths.get(language.getPath() + File.separator + "images"))){
   new File(language.getPath() + File.separator + "images").mkdirs();
           }
ImageIO.write(bi, "jpg", new File(language.getPath() + File.separator + "images/" + image_name));

Then we read the content for the image in an Image object. Then we check if images directory exists or not. If there is no image directory in the skill path specified then create a folder named “images”.

We usually prefer square images at the Skill CMS. So we create a resized copy of the image of 512×512 dimensions and save that copy to the directory we created above.

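BufferedImage createResizedCopy(Image originalImage, int scaledWidth, int scaledHeight, boolean preserveAlpha) {
        // Note: as written, preserveAlpha == true selects TYPE_INT_RGB, which has no alpha channel.
        // That suits the JPEG output used here; to actually keep transparency, swap the two
        // constants and write a format that supports alpha, such as PNG.
        int imageType = preserveAlpha ? BufferedImage.TYPE_INT_RGB : BufferedImage.TYPE_INT_ARGB;
        BufferedImage scaledBI = new BufferedImage(scaledWidth, scaledHeight, imageType);
        Graphics2D g = scaledBI.createGraphics();
        if (preserveAlpha) {
            g.setComposite(AlphaComposite.Src);
        }
        // Draw the original image scaled to the target size
        g.drawImage(originalImage, 0, 0, scaledWidth, scaledHeight, null);
        g.dispose();
        return scaledBI;
    }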

The function above creates a resized copy of the image with the specified dimensions. The preserveAlpha flag is meant to keep the transparency of PNG images in the copy.

Since the SUSI server follows an API centric approach, all servlets respond in JSON.

       resp.setContentType("application/json");
       resp.setCharacterEncoding("UTF-8");
       resp.getWriter().write(json.toString());

Finally, we set the content type and the character encoding of the output. This helps clients parse the data easily.

To see this endpoint live, send a POST request to http://api.susi.ai/cms/createSkill.json.

Resources

Apache Docs: https://commons.apache.org/proper/commons-fileupload/using.html

Multipart POST Request Tutorial: http://www.codejava.net/java-se/networking/upload-files-by-sending-multipart-request-programmatically

Java File Upload tutorial: https://ursaj.com/upload-files-in-java-with-servlet-api

Jetty Project: https://github.com/jetty-project/


Enhancing LoklakWordCloud app present on Loklak apps site

The LoklakWordCloud app is presently hosted on the loklak apps site. Before moving into the content of this blog, let us get a brief overview of the app. What does the app do? The app generates a word cloud using twitter data returned by loklak based on the query word provided by the user. The user enters a word in the input field and presses the search button. After that a word cloud is created using the content (text body, hashtags and mentions) of the various tweets which contain the query word.

In my previous post I wrote about creating the basic functional app. In this post I will be describing the next steps that have been implemented in the app.

Making the word cloud clickable

This is one of the most important and interesting features added to the app. The words in the cloud are now clickable. Whenever a user clicks on a word in the cloud, the cloud is replaced by the word cloud of the selected word. How do we achieve this behaviour? For this we use Jqcloud’s handler feature. While creating the list of objects for each word and its frequency, we also specify a handler for each word. The handler handles the click event: whenever a click occurs, we set the value of $scope.tweet to the selected word and invoke the search function, which calls the loklak API and regenerates the word cloud.

for (var word in $scope.wordFreq) {
            $scope.wordCloudData.push({
                text: word,
                weight: $scope.wordFreq[word],
                handlers: {
                    click: function(e) {
                        $scope.tweet = e.target.textContent;
                        $scope.search();
                    }
                }
            });
        }

As can be seen in the above snippet, handlers is simply a JavaScript object which takes a function for the click event. In the function we assign the selected word to the tweet variable and call the search method.

Adding filters to the app

Previously the app generated the word cloud using the entire tweet content, that is, hashtags, mentions and tweet body. Thus the app was not flexible: users could not decide which fields the word cloud should be generated from. A user might want to generate the word cloud using only the hashtags, only the mentions or simply the tweet body. To make this possible, filters have been introduced. Now we have filters for hashtags, mentions, tweet body and date.

<div class="col-md-6 tweet-filters">
              <strong>Filters</strong>
              <hr>
              <div class="filters">
                <label class="checkbox-inline"><input type="checkbox" value="" ng-model="hashtags">Hashtags</label>
                <label class="checkbox-inline"><input type="checkbox" value="" ng-model="mentions">Mentions</label>
                <label class="checkbox-inline"><input type="checkbox" value="" ng-model="tweetbody">Tweet body</label>
              </div>
              <div class="filter-all">
                <span class="select-all" ng-click="selectAll()"> Select all </span>
              </div>
            </div>

We have used checkboxes for the individual filters and have kept an option to select all the filters at once. Next we need to hook this HTML up to the AngularJS code to make the filters functional.

if ($scope.hashtags) {
                tweet.hashtags.forEach(function (hashtag) {
                    $scope.filteredWords.push("#" + hashtag);
                });
            }

            if ($scope.mentions) {
                tweet.mentions.forEach(function (mention) {
                    $scope.filteredWords.push("@" + mention);
                });
            }

In the above snippet, before adding the hashtags to the list of filtered words, we first make sure that the checkbox for hashtags is selected. Once we find that the variable bound to the hashtags checkbox is true, we add the hashtags associated with a given tweet to the list of filteredWords. The same strategy is applied to mentions (shown in the snippet) and tweet bodies.
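The filter logic can be sketched outside AngularJS as a plain function. The following Python version is an illustration under assumptions: the function name filtered_words and the shape of the tweet dictionaries (text, hashtags, mentions keys) are hypothetical, and the tweet body is split on whitespace without any stop-word removal.

```python
from collections import Counter


def filtered_words(tweets, hashtags=True, mentions=True, tweet_body=True):
    """Collect words from tweets, honouring the three filter flags."""
    words = []
    for tweet in tweets:
        if hashtags:
            words.extend("#" + tag for tag in tweet.get("hashtags", []))
        if mentions:
            words.extend("@" + user for user in tweet.get("mentions", []))
        if tweet_body:
            words.extend(tweet.get("text", "").split())
    return words


# Hypothetical tweets shaped like a simplified search result.
tweets = [
    {"text": "loklak rocks", "hashtags": ["fossasia"], "mentions": ["loklak_app"]},
    {"text": "hello loklak", "hashtags": ["fossasia", "opensource"], "mentions": []},
]
only_tags = Counter(filtered_words(tweets, mentions=False, tweet_body=False))
print(only_tags)
```

The resulting counts play the role of the word weights passed to the word cloud.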

Adding error notification

Next, we handle certain errors to notify users that there is a problem with their input. Such cases include empty input: if the user provides empty input, we notify them and stop the search. We also check whether the From date is before the To date. If the From date is after the To date, we notify the user about the problem.

if ($scope.tweet === "" || $scope.tweet === undefined) {
            $scope.error = "Please enter a valid query word";
            $scope.showError();
            return;
}

In the above snippet we check for empty or undefined input and display snackbar along with error accordingly.

if ((sinceDate !== "" && sinceDate !== undefined) && (endDate !== "" && endDate !== undefined)) {
            var date1 = new Date(sinceDate);
            var date2 = new Date(endDate);
            if (date1 > date2) {
                $scope.error = "To date should be after From date";
                $scope.showError();
                return;
            }
        }

The above snippet compares dates. To compare them, we first fetch the values entered (via the jQuery date widget) into the respective input fields and then create JavaScript Date objects out of them. Finally we compare those Date objects to find out whether there is an error.
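For comparison, the same date-range check can be sketched in Python. This assumes the dates arrive as ISO formatted strings such as 2017-07-01; the helper name date_range_error is hypothetical.

```python
from datetime import date


def date_range_error(since, until):
    """Return an error message if both dates are given and out of order."""
    if not since or not until:
        return None  # nothing to compare
    if date.fromisoformat(since) > date.fromisoformat(until):
        return "To date should be after From date"
    return None


print(date_range_error("2017-07-10", "2017-07-01"))
```

As in the original snippet, the check is skipped when either field is empty, so a search with only one date set is still allowed.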

Now it might happen that a particular search takes a long time (perhaps due to a network problem) and the user becomes impatient and tries to search again. In that case we need to inform the user that the previous search is still going on. For this purpose we use a boolean variable to keep track of whether the previous search has completed. If the previous search is still running and the user tries to start a new one, we show a notification and prevent further searches.

Finally we need to make sure that the user is online and has an active internet connection before the search can take place and the Loklak API can be called. For this we check the onLine property of navigator. If the user is offline, we inform them that we cannot start a search due to an internet connectivity problem.

if ($scope.isLoading === true) {
            $scope.error = "Previous search not completed. Please wait...";
            $scope.showError();
            return;
        }
        if (!navigator.onLine) {
            $scope.error = "You are currently offline. Please check your internet connection!";
            $scope.showError();
            return;
        }


Writing Dredd Test for Event Topic-Event Endpoint in Open Event API Server

The API Server exposes a large set of endpoints which are documented using apiary’s API Blueprint. To ensure that this documentation describes exactly what the API does, that is, the response made to each request, testing it is crucial. This testing is done through Dredd documentation testing, with the help of FactoryBoy for faking objects.

In this blogpost I describe how to use FactoryBoy to write Dredd tests for the Event Topic- Event endpoint of Open Event API Server.

The endpoint for which tests are described here is GET event-topics/1/events. To test this endpoint, we need to simulate the API GET request by making a call to our database and then compare the response received with the expected response written in the api_blueprint.apib file. For GET to return some data, we need to insert an event with some event topic into the database.

The documentation for this endpoint is the following:

To add the event topic and event objects for GET event-topics/1/events, we use a hook. This hook is written in the hook_main.py file and is run before the request is made.

We add this decorator to the function that adds the objects to the database. The decorator locates the request in the APIB docs by following the heading levels: each level marked by ‘#’ in the documentation corresponds to a segment separated by ‘>’ in the decorator.

Now let’s write the method itself. In the method here, we first add the event topic object using EventTopic Factory defined in the factories/event-topic.py file, the code for which can be found here.

Since the endpoint also requires some event to be created in order to fetch events related to an event topic, we add an event object too based on the EventFactoryBasic class in factories/event.py file.

To fetch the event related to a topic, the event must reference that particular event topic. This is achieved by passing event_topic_id=1 when creating the event object, so that the event created by the constructor has its event topic set to id = 1.
event = EventFactoryBasic(event_topic_id=1)
In the EventFactoryBasic class, event_topic_id is set to ‘None’, so that we don’t have to create an event topic when creating events for tests of other endpoints. This also lets us avoid adding event-topic as a related factory. To set event_topic_id=1 as the event’s attribute, an event topic with id = 1 must already be present, hence the event_topic object is added first.
After adding the event object as well, we commit both of them to the database. Now that we have an event topic object with id = 1, an event object with id = 1, and the event is related to that event topic, we can make a call to GET event-topics/1/events and get the correct response.
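The relation between the ‘#’ heading levels in the API Blueprint and the ‘>’ separated name given to the hook decorator can be sketched as follows. The heading titles used here are hypothetical placeholders, and the helper transaction_name is only an illustration of how such a name is composed; it is not part of Dredd or of the server code.

```python
def transaction_name(*headings):
    """Join API Blueprint heading titles into a '>' separated name."""
    return " > ".join(title.strip() for title in headings)


# Hypothetical titles, one per '#' level in the .apib file.
name = transaction_name(
    "Event Topics",
    "Event Topic Events",
    "List All Events of an Event Topic",
)
print(name)
```

A string of this form is what the hook decorator receives, so the hook runs only for the request documented under those headings.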


Developing LoklakWordCloud app for Loklak apps site

LoklakWordCloud app is an app to visualise data returned by loklak in form of a word cloud.

The app is presently hosted on Loklak apps site.

Word clouds provide a very simple, easy, yet interesting and effective way to analyse and visualise data. This app will allow users to create word cloud out of twitter data via Loklak API.

Presently the app is at a very early stage of development and more work is left to be done. The app consists of an input field where the user can enter a query word; on pressing the search button, a word cloud is generated using the words related to that query word.

Loklak API is used to fetch all the tweets which contain the query word entered by the user.

These tweets are processed to generate the word cloud.

Related issue: https://github.com/fossasia/apps.loklak.org/pull/279

Live app: http://apps.loklak.org/LoklakWordCloud/

Developing the app

The main challenge in developing this app is implementing its prime feature, that is, generating the word cloud. How do we get a dynamic word cloud which can be easily generated based on the word the user has entered? This is where Jqcloud comes in: an awesome lightweight jQuery plugin for generating word clouds. All we need to do is provide a list of words along with their weights.

Let us see step by step how this app (first version) works. First we require all the tweets which contain the entered word. For this we use Loklak search service. Once we get all the tweets, then we can parse the tweet body to create a list of words along with their frequency.

var url = "http://35.184.151.104/api/search.json?callback=JSON_CALLBACK&count=100&q=" + query;
        $http.jsonp(url)
            .then(function (response) {
                $scope.createWordCloudData(response.data.statuses);
                $scope.tweet = null;
            });

Once we have all the tweets, we need to extract the tweet texts and create a list of valid words. What are valid words? Words like ‘the’, ‘is’, ‘a’, ‘for’, ‘of’ and ‘then’ do not provide us with any important information and will not help us in any kind of analysis, so there is no use in including them in our word cloud. Such words are called stop words and we need to get rid of them. For this we are using a list of commonly used stop words. Such lists can be found very easily on the internet. Here is the list which we are using. Once we have extracted the text from the tweets, we need to filter out the stop words and insert the valid words into a list.

 tweet = data[i];
            tweetWords = tweet.text.replace(", ", " ").split(" ");

            for (var j = 0; j < tweetWords.length; j++) {
                word = tweetWords[j];
                word = word.trim();
                if (word.startsWith("'") || word.startsWith('"') || word.startsWith("(") || word.startsWith("[")) {
                    word = word.substring(1);
                }
                if (word.endsWith("'") || word.endsWith('"') || word.endsWith(")") || word.endsWith("]") ||
                    word.endsWith("?") || word.endsWith(".")) {
                    word = word.substring(0, word.length - 1);
                }
                if (stopwords.indexOf(word.toLowerCase()) !== -1) {
                    continue;
                }
                if (word.startsWith("#") || word.startsWith("@")) {
                    continue;
                }
                if (word.startsWith("http") || word.startsWith("https")) {
                    continue;
                }
                $scope.filteredWords.push(word);
            }

What are we actually doing in the above snippet? We are simply iterating over each of the statuses returned by the Loklak API. For each tweet, we first split the text into words and then iterate over those words. For a given word we do a number of checks. First we check whether the word begins or ends with a special character, for example a quotation mark or a bracket. If so, we remove that character, as it would cause trouble in calculating frequencies. Next we check whether the word is a stop word; if it is, we discard it. We then check whether the word begins with ‘#’ or ‘@’ and discard such words, as we are handling hashtags and mentions separately. Finally, words that start with ‘http’, that is, links, are discarded as well. If a word passes all the checks, we add it to our list of valid words.

Once we are done with the tweet bodies, next we need to handle hashtags and mentions.

tweet.hashtags.forEach(function (hashtag) {
                $scope.filteredWords.push("#" + hashtag);
            });

            tweet.mentions.forEach(function (mention) {
                $scope.filteredWords.push("@" + mention);
            });

The above code simply iterates over the hashtags and mentions and inserts them into the filteredWords list. We have handled hashtags and mentions separately so that we can apply filters in future.

Once we are done generating the list of valid words, we need to calculate a weight for each word. Here the weight is nothing but the number of times a particular word is present in the list. We calculate this using a JavaScript object. We iterate over the list of valid words. If a word is not present in the object (or dictionary, if you wish to call it that), we create a new key with the name of that word and set its value to one. If a word is already present as a key, we simply increment its value by one.

for (var word in $scope.wordFreq) {
            $scope.wordCloudData.push({
                text: word,
                weight: $scope.wordFreq[word]
            });
        }

The above code snippet takes the frequency object built by the process mentioned above and turns it into the list of text/weight entries that Jqcloud expects, one entry per word.
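
As a language-neutral illustration of the counting step and of the conversion into text/weight entries, here is a minimal Python sketch. The app itself does this in JavaScript on $scope.wordFreq; the function name and the sample words below are made up.

```python
from collections import Counter


def build_word_cloud_data(filtered_words):
    """Count how often each word occurs and return text/weight entries."""
    # Counter starts every unseen word at zero and adds one per occurrence,
    # which is the "create the key or increment it" logic described above.
    word_freq = Counter(filtered_words)
    return [{"text": word, "weight": count} for word, count in word_freq.items()]


data = build_word_cloud_data(["loklak", "#fossasia", "loklak", "search"])
```

Each entry has the same shape as the objects pushed to $scope.wordCloudData in the snippet above.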

Now we are all set to generate our word cloud. We simply use Jqcloud’s interface to configure it with the words and their respective frequencies, provide a list of color codes for a color gradient, and set autoResize to true so that our word cloud resizes itself when the screen size changes.

$scope.generateWordCloud = function() {
        if ($scope.wordCloud === null) {
            $scope.wordCloud = $('.wordcloud').jQCloud($scope.wordCloudData, {
                colors: ["#D50000", "#FF5722", "#FF9800", "#4CAF50", "#8BC34A", "#4DB6AC", "#7986CB", "#5C6BC0", "#64B5F6"],
                fontSize: {
                    from: 0.06,
                    to: 0.01
                },
                autoResize: true
            });
        } else {
            $scope.wordCloud = $(".wordcloud").jQCloud('update', $scope.wordCloudData);
        }
    }

Whenever the user searches for a new word, we simply update the existing word cloud with the cloud of the new word.

Future roadmap

  • Make the words in the cloud clickable. On clicking a word, the cloud should get replaced by the selected word’s cloud.
  • Add filters for hashtags, mentions, date.
  • Add an option for exporting the cloud to an image, so that users can also use this app as a tool to generate word clouds as images and save them.
  • Add a loader and error notification for invalid or empty input.

Important resources

  • View the app source code here.
  • Learn more about Loklak API here.
  • Learn more about Jqcloud here.
  • Learn more about AngularJS here.

Export an Event using APIs of Open Event Server

In FOSSASIA’s Open Event Server project, we allow the organizer, co-organizer and admins to export all the data related to an event in the form of an archive of JSON files. This way the data can be reused elsewhere for different purposes. The basic workflow is something like this:

  • Send a POST request to /events/{event_id}/export/json with a payload stating whether you require the various media files.
  • The POST request starts a celery task in the background, which extracts the data related to the event and jsonifies it.
  • The celery task URL is returned as the response. Sending a GET request to this URL gives the status of the task. If the status is either FAILED or SUCCESS, the response also carries the corresponding error message or result.
  • Separate JSON files for events, speakers, sessions, micro-locations, tracks, session types and custom forms are created.
  • All these files are then archived, and the zip is served on the endpoint /events/{event_id}/exports/{path}
  • Sending a GET request to the above mentioned endpoint downloads a zip containing all the data related to the event.

Let’s dive into each of these points one by one.

POST request ( /events/{event_id}/export/json)

To make a POST request you first need JWT authentication, like most of the other API endpoints. You need to send a payload containing the settings for whether you want the media files related to the event to be downloaded along with the JSON files. An example payload looks like this:

{
   "image": true,
   "video": true,
   "document": true,
   "audio": true
 }

def export_event(event_id):
    from helpers.tasks import export_event_task

    settings = EXPORT_SETTING
    settings['image'] = request.json.get('image', False)
    settings['video'] = request.json.get('video', False)
    settings['document'] = request.json.get('document', False)
    settings['audio'] = request.json.get('audio', False)
    # queue task
    task = export_event_task.delay(
        current_identity.email, event_id, settings)
    # create Job
    create_export_job(task.id, event_id)

    # in case of testing
    if current_app.config.get('CELERY_ALWAYS_EAGER'):
        # send_export_mail(event_id, task.get())
        TASK_RESULTS[task.id] = {
            'result': task.get(),
            'state': task.state
        }
    return jsonify(
        task_url=url_for('tasks.celery_task', task_id=task.id)
    )


Taking the settings about the media files and the event id, we pass them, together with the email of the current user, as parameters to the export event celery task and queue up the task. We then create an entry in the database with the task URL, the event id and the user who triggered the export, to keep a record of the activity. After that we return the URL of the celery task to the user as the response.
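
From the client side, the request could be put together as in the following minimal sketch. It only uses Python's standard library. The helper name is made up, and the base URL http://localhost:5000/v1 is an assumption borrowed from the cURL examples elsewhere in these posts; adjust both to your deployment.

```python
import json
import urllib.request


def build_export_request(base_url, event_id, token, media_settings):
    """Build (but do not send) the POST request that starts an export."""
    url = "{}/events/{}/export/json".format(base_url.rstrip("/"), event_id)
    body = json.dumps(media_settings).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": "JWT " + token,     # same header as in the cURL examples
            "Content-Type": "application/json",  # the view reads request.json
        },
    )


req = build_export_request(
    "http://localhost:5000/v1",  # assumed base URL
    1,
    "<access token>",
    {"image": True, "video": True, "document": True, "audio": True},
)
# urllib.request.urlopen(req) would send it; the JSON reply carries "task_url".
```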

If the celery task is still underway, it shows a response with ‘state’:’WAITING’. Once the task is completed, the value of ‘state’ is either ‘FAILED’ or ‘SUCCESS’. If it is SUCCESS, it returns the result of the task, in this case the download URL for the zip.

Celery Task to Export Event

Exporting an event is a very time-consuming process, and we don’t want it to get in the way of the user’s interaction with other services. So we needed a queueing system that would queue the tasks and execute them in the background without disturbing the main worker, which keeps serving the other user requests. We use celery for this.

We have created a celery task named “export.event” which calls event_export_task_base(), which in turn calls export_event_json(), where all the jsonification is carried out. To start the celery task, all we do is call export_event_task.delay(email, event_id, settings). It returns a celery task object with a task id that can be used to check the status of the task.

@celery.task(base=RequestContextTask, name='export.event', bind=True)
def export_event_task(self, email, event_id, settings):
    event = safe_query(db, Event, 'id', event_id, 'event_id')
    try:
        logging.info('Exporting started')
        path = event_export_task_base(event_id, settings)
        # task_id = self.request.id.__str__()  # str(async result)
        download_url = path

        result = {
            'download_url': download_url
        }
        logging.info('Exporting done.. sending email')
        send_export_mail(email=email, event_name=event.name, download_url=download_url)
    except Exception as e:
        print(traceback.format_exc())
        result = {'__error': True, 'result': str(e)}
        logging.info('Error in exporting.. sending email')
        send_export_mail(email=email, event_name=event.name, error_text=str(e))

    return result


After exporting, the path to the export zip is returned. We use it as the download URL and return it as the result of the celery task. In case there is an error in the celery task, we print the entire traceback in the celery worker and return the error as the result.

Make the Exported Zip Ready

We have a separate export_helpers.py file in the helpers module of the API for performing various tasks related to exporting all the data of the event. The most important function in this file is export_event_json(). This function accepts the event_id and the settings dictionary. In the export helpers we have global constant dictionaries which contain the order in which the fields are to appear in the JSON files created while exporting.

Firstly, we create the directory for storing the exported JSON files and, finally, the archive of all of them. Then we have a global dictionary named EXPORTS which contains all the tables, and their corresponding Models, that we want to extract from the database and store as JSON. From the EXPORTS dict we get the Model names. We use these Models to make queries with the given event_id and retrieve the data from the database. After retrieving the data, we use another helper function named _order_json, which jsonifies the sqlalchemy data in the order that is mentioned in the dictionary. After this we download the media data, i.e. the slides, images, videos etc. related to that particular Model, depending on the settings.

def export_event_json(event_id, settings):
    """
    Exports the event as a zip on the server and return its path
    """
    # make directory
    exports_dir = app.config['BASE_DIR'] + '/static/uploads/exports/'
    if not os.path.isdir(exports_dir):
        os.mkdir(exports_dir)
    dir_path = exports_dir + 'event%d' % int(event_id)
    if os.path.isdir(dir_path):
        shutil.rmtree(dir_path, ignore_errors=True)
    os.mkdir(dir_path)
    # save to directory
    for e in EXPORTS:
        if e[0] == 'event':
            query_obj = db.session.query(e[1]).filter(
                e[1].id == event_id).first()
            data = _order_json(dict(query_obj.__dict__), e)
            _download_media(data, 'event', dir_path, settings)
        else:
            query_objs = db.session.query(e[1]).filter(
                e[1].event_id == event_id).all()
            data = [_order_json(dict(query_obj.__dict__), e) for query_obj in query_objs]
            for count in range(len(data)):
                data[count] = _order_json(data[count], e)
                _download_media(data[count], e[0], dir_path, settings)
        data_str = json.dumps(data, indent=4, ensure_ascii=False).encode('utf-8')
        fp = open(dir_path + '/' + e[0], 'w')
        fp.write(data_str)
        fp.close()
    # add meta
    data_str = json.dumps(
        _generate_meta(), sort_keys=True,
        indent=4, ensure_ascii=False
    ).encode('utf-8')
    fp = open(dir_path + '/meta', 'w')
    fp.write(data_str)
    fp.close()
    # make zip
    shutil.make_archive(dir_path, 'zip', dir_path)
    dir_path = dir_path + ".zip"

    storage_path = UPLOAD_PATHS['exports']['zip'].format(
        event_id=event_id
    )
    uploaded_file = UploadedFile(dir_path, dir_path.rsplit('/', 1)[1])
    storage_url = upload(uploaded_file, storage_path)

    return storage_url


After we receive the json data from the _order_json() function, we create a dump of the json using json.dumps with an indentation of 4 spaces and utf-8 encoding. Then we save this dump in a file named according to the model from which the data was retrieved. This process is repeated for all the models that are mentioned in the EXPORTS dictionary. After all the JSON files are created and all the media is downloaded, we make a zip of the folder.

To do this we use shutil.make_archive, which creates the zip. The upload() helper then uploads the zip to the storage service used by the server, such as S3 or Google Cloud Storage, and returns the URL through which the zip can be accessed.
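
The file-writing and archiving steps can be tried in isolation with the standard library alone. The sketch below is not the project's export_event_json(): there is no database query, media download or upload, and the function name, table names and sample rows are made up for the illustration.

```python
import json
import os
import shutil
import tempfile
import zipfile


def export_to_zip(tables, work_dir):
    """Write one JSON file per table into a folder, then zip that folder."""
    dir_path = os.path.join(work_dir, "event1")
    os.makedirs(dir_path)
    for name, rows in tables.items():
        # indent=4 and ensure_ascii=False mirror the json.dumps call in the excerpt
        with open(os.path.join(dir_path, name), "w", encoding="utf-8") as fp:
            json.dump(rows, fp, indent=4, ensure_ascii=False)
    # make_archive appends ".zip" to the base name and returns the archive path
    return shutil.make_archive(dir_path, "zip", dir_path)


work_dir = tempfile.mkdtemp()
zip_path = export_to_zip(
    {"event": {"id": 1, "name": "Sample"}, "speakers": [{"name": "Ada"}]},
    work_dir,
)
with zipfile.ZipFile(zip_path) as archive:
    names = sorted(archive.namelist())
shutil.rmtree(work_dir)  # clean up the temporary folder and the zip
```

The archive is written next to the folder rather than inside it, so the zip never ends up containing itself.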

Apart from this function, the other major function in this file creates an export job entry in the database, so that we can keep track of which user started a task for which event. This helps with debugging and security.

Downloading the Zip File

After the exporting is completed, if you send a GET request to the task url, you get a response similar to this:

{
   "result": {
     "download_url": "http://localhost:5000/static/media/exports/1/zip/OGpMM0w2RH/event1.zip"
   },
   "state": "SUCCESS"
 }

So on opening the download url in the browser or using any other tool, you can download the zip file.

One question remains: after sending the POST request, how do you know that the task has completed and the export is ready for download? One way of solving this is a technique known as polling, in which we send a GET request repeatedly at a fixed interval. From the POST request we get the URL of the export task; you keep polling this task URL until the state is either “FAILED” or “SUCCESS”. If it is SUCCESS, you can place the download URL somewhere on your website, where it can be clicked to download the archived export of the event.
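
A polling loop of this kind could look like the following minimal Python sketch. The function name, the interval and the attempt limit are assumptions, and the HTTP call is passed in as a callable so that the loop can be tried without a running server; here it is fed simulated responses. Both “FAILED” (as used in this post) and celery's own “FAILURE” are treated as final states.

```python
import time

FINAL_STATES = ("SUCCESS", "FAILED", "FAILURE")


def poll_task(fetch_status, interval=2.0, max_attempts=30, sleep=time.sleep):
    """Call fetch_status() at a fixed interval until the task reaches a final state."""
    for _ in range(max_attempts):
        status = fetch_status()  # in real use: GET the task URL and parse the JSON
        if status.get("state") in FINAL_STATES:
            return status
        sleep(interval)
    raise TimeoutError("task did not finish within the allowed attempts")


# Simulated replies from the task URL: two pending polls, then the final result.
responses = iter([
    {"state": "WAITING"},
    {"state": "WAITING"},
    {
        "state": "SUCCESS",
        "result": {
            "download_url": "http://localhost:5000/static/media/exports/1/zip/OGpMM0w2RH/event1.zip"
        },
    },
])
final = poll_task(lambda: next(responses), interval=0, sleep=lambda seconds: None)
```

The attempt limit keeps a client from polling forever if the task never reaches a final state.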

 


Uploading Files via APIs in the Open Event Server

There are two file upload endpoints. One is the endpoint for image uploads and the other is for all other files. The latter is to be used for uploading files such as slides, videos and other presentation materials for a session. So, in FOSSASIA’s Orga Server project, when we need to upload a file, we make an API request to this endpoint, which in turn uploads the file to the server and returns the URL of the uploaded file. We then store this URL in the database with the corresponding row entry.

Sending Data

The endpoint /upload/file accepts a POST request containing a multipart/form-data payload. If a single file is uploaded, it is sent under the key “file”; otherwise an array of files is sent under the key “files[]”.

A typical single file upload cURL request would look like this:

curl -H "Authorization: JWT <key>" -F file=@file.pdf -X POST http://localhost:5000/v1/upload/file

A typical multi-file upload cURL request would look something like this:

curl -H "Authorization: JWT <key>" -F 'files[]=@file1.pdf' -F 'files[]=@file2.pdf' -X POST http://localhost:5000/v1/upload/file

Thus, unlike other endpoints in open event orga server project, we don’t send a json encoded request. Instead it is a form data request.

Saving Files

We use different services such as S3, Google Cloud Storage and so on for storing the files, depending on the settings chosen by the admin of the project. One can even ask to save the files locally by passing the GET parameter force_local=true. In the backend we therefore have two cases to handle: single file upload and multiple file upload.

Single File Upload

if 'file' in request.files:
        files = request.files['file']
        file_uploaded = uploaded_file(files=files)
        if force_local == 'true':
            files_url = upload_local(
                file_uploaded,
                UPLOAD_PATHS['temp']['event'].format(uuid=uuid.uuid4())
            )
        else:
            files_url = upload(
                file_uploaded,
                UPLOAD_PATHS['temp']['event'].format(uuid=uuid.uuid4())
            )


We get the file that is to be uploaded using request.files[‘file’], where ‘file’ is the key that was used in the payload. Then we use the uploaded_file() helper function to convert the file data received as payload into a proper file and store it in temporary storage. After this, if force_local is set to true, we use the upload_local helper function to upload it to the local storage, i.e. the server where the application is hosted; otherwise we use whatever service is set by the admin in the admin settings.

In the uploaded_file() function of the helpers module, we extract the filename and the extension of the file from the form-data payload. Then we check whether a suitable directory already exists. If it doesn’t, we create a new directory, and then we save the file in the directory.

extension = files.filename.rsplit('.', 1)[-1]  # text after the last dot, so "a.b.pdf" gives "pdf"
        filename = get_file_name() + '.' + extension
        filedir = current_app.config.get('BASE_DIR') + '/static/uploads/'
        if not os.path.isdir(filedir):
            os.makedirs(filedir)
        file_path = filedir + filename
        files.save(file_path)


After that, the upload function gets the settings key for either S3 or Google Cloud Storage and then uses the corresponding function to upload this temporary file to the storage.

Multiple File Upload

 elif 'files[]' in request.files:
        files = request.files.getlist('files[]')
        files_uploaded = uploaded_file(files=files, multiple=True)
        files_url = []
        for file_uploaded in files_uploaded:
            if force_local == 'true':
                files_url.append(upload_local(
                    file_uploaded,
                    UPLOAD_PATHS['temp']['event'].format(uuid=uuid.uuid4())
                ))
            else:
                files_url.append(upload(
                    file_uploaded,
                    UPLOAD_PATHS['temp']['event'].format(uuid=uuid.uuid4())
                ))


In case of a multiple file upload, we get a list of files instead of a single file. Hence we get the list of files sent as form data using request.files.getlist(‘files[]’). Here ‘files’ is the key that is used, and since it is an array of file content, it is written as files[]. We again use the uploaded_file() function to get back a list of temporary files from the content that has been uploaded as form-data. After that we loop over all the temporary files that are stored in the variable files_uploaded in the above code. For every file in the list of temporary files, we use the upload() helper function to save the file in the storage system of the application.

In the uploaded_file() function of the helpers module, things work a little differently this time, since multiple files and their content are sent. We loop over all the files that are received, and for each of them we find the filename and extension. Then we create the directory to save these files in, if needed, and save the content of each file with the corresponding filename and extension. After a file has been saved, we append it to a list, and finally we return the entire list so that we get a list of all files.

if multiple:
        files_uploaded = []
        for file in files:
            extension = file.filename.rsplit('.', 1)[-1]  # text after the last dot
            filename = get_file_name() + '.' + extension
            filedir = current_app.config.get('BASE_DIR') + '/static/uploads/'
            if not os.path.isdir(filedir):
                os.makedirs(filedir)
            file_path = filedir + filename
            file.save(file_path)
            files_uploaded.append(UploadedFile(file_path, filename))


The upload() function then finally returns the URLs for the files after saving them.

API Response

The file upload endpoint either returns a single url or a list of urls depending on whether a single file was uploaded or multiple files were uploaded. The url for the file depends on the storage system that has been used. After the url or list of urls is received, we jsonify the entire response so that we can send a proper JSON response that can be parsed properly in the frontend and used for saving corresponding information to the database using the other API services.

A typical single file upload response looks like this:

{
     "url": "https://xyz.storage.com/asd/fgh/hjk/12332233.docx"
 }

Multiple file upload response looks like this:

{
     "url": [
         "https://xyz.storage.com/asd/fgh/hjk/12332233.docx",
         "https://xyz.storage.com/asd/fgh/hjk/66777777.ppt"
     ]
 }

You can find the related documentations and example payloads on how to use this endpoint to upload files here: http://open-event-api.herokuapp.com/#upload-file-upload.
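
Because the url field holds a string for a single upload and a list for a multiple upload, a client may want to normalise both shapes before using them. A minimal Python sketch, with a made-up helper name and the sample responses from above:

```python
import json


def extract_urls(response_text):
    """Return the uploaded file URLs as a list, whichever shape the response has."""
    payload = json.loads(response_text)
    urls = payload["url"]
    # Single upload: one string. Multiple upload: a list of strings.
    return [urls] if isinstance(urls, str) else list(urls)


single = extract_urls('{"url": "https://xyz.storage.com/asd/fgh/hjk/12332233.docx"}')
multiple = extract_urls(
    '{"url": ["https://xyz.storage.com/asd/fgh/hjk/12332233.docx", '
    '"https://xyz.storage.com/asd/fgh/hjk/66777777.ppt"]}'
)
```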

 
