Adding download feature to LoklakWordCloud app on Loklak apps site

One of the most important and useful feature that has recently been added to LoklakWordCloud app is enabling the user to download the generated word cloud as a png/jpeg image. This feature will allow the user to actually use this app as a tool to generate a word cloud using twitter data and save it on their disks for future use.

All that the user needs to do is generate the word cloud, choose an image type (png or jpeg) and click on export as image, a preview of the image to be downloaded will be displayed. Just hit enter and the word cloud will be saved on your disk. Thus users will not have to use any alternative process like taking a screenshot of the word cloud generated, etc.

Presently the complete app is hosted on Loklak apps site.

How does it work?

What we are doing is, we are exporting a part of the page (a div) as image and saving it. Apparently it might seem that we are taking a screenshot of a particular portion of a page and generating a download link. But actually it is not like that. The word cloud that is being generated by this app via Jqcloud is actually a collection of HTML nodes. Each node contains a word (part of the cloud) as a text content with some CSS styles to specify the size and color of that word. As user clicks on export to image option, the app traverses the div containing the cloud. It collects information about all the HTML nodes present under that div and creates a canvas representation of the entire div. So rather than taking a screenshot of the div, the app recreates the entire div and presents it to us. This entire process is accomplished by a lightweight JS library called html2canvas.

Let us have a look into the code that implements the download feature. At first we need to create the UI for the export and download option. User should be able to choose between png and jpeg before exporting to image. For this we have provided a dropdown containing the two options.

<div class="dropdown type" ng-if="download">
                <div class="dropdown-toggle select-type" data-toggle="dropdown">
                  {{imageType}}
                <span class="caret"></span></div>
                <ul class="dropdown-menu">
                  <li ng-click="changeType('png', 'png')"><a href="">png</a></li>
                  <li ng-click="changeType('jpeg', 'jpg')"><a href="">jpeg</a></li>
                </ul>
              </div>
              <a class="export" ng-click="export()" ng-if="download">Export as image</a>

In the above code snippet, firstly we create a dropdown menu with two list items, png and jpeg. With each each list item we attach a ng-click event which calls changeType function and passes two parameters, image type and extension.

The changeType function simply updates the current image type and extension with the selected ones.

$scope.changeType = function(type, ext) {
        $scope.imageType = type;
        $scope.imageExt = ext;
    }

The ‘export as image’ on clicking calls the export function. The export function uses html2canvas library’s interface to generate the canvas representation of the word cloud and also generates the download link and attaches it to the modal’s save button (described below). After everything is done it finally opens a modal with preview image and save option.

$scope.export = function() {
        html2canvas($(".wordcloud"), {
          onrendered: function(canvas) {
            var imgageData = canvas.toDataURL("image/" + $scope.imageType);
            var regex = /^data:image\/jpeg/;
            if ($scope.imageType === "png") {
                regex = /^data:image\/png/;
            }
            var newData = imgageData.replace(regex, "data:application/octet-stream");
            canvas.style.width = "80%";
            $(".wordcloud-canvas").html(canvas);
            $(".save-btn").attr("download", "Wordcloud." + $scope.imageExt).attr("href", newData);
            $("#preview").modal('show');
          },
          background: "#ffffff"
        });
    }

At the very beginning of this function, a call is made to html2canvas module and the div containing the word cloud is passed as a parameter. An object is also passed which contains a callback function defined for onrendered key. Inside the callback function we check the current image type and generate the corresponding url from the canvas. We display this canvas in the modal and set this download url as the href value of the modal’s save button.

Finally we display the modal.

The modal simply contains the preview image and a button to save the image on disk.

A sample image produced by the app is shown below.

Important resources

  • Know more about html2canvas here.
  • Know more about Jqcloud here.
  • View the app source here.
  • View loklak apps site source here.
  • View Loklak API documentation here
  • Learn more about AngularJS here.

Developing LoklakWordCloud app for Loklak apps site

LoklakWordCloud app is an app to visualise data returned by loklak in form of a word cloud.

The app is presently hosted on Loklak apps site.

Word clouds provide a very simple, easy, yet interesting and effective way to analyse and visualise data. This app will allow users to create word cloud out of twitter data via Loklak API.

Presently the app is at its very early stage of development and more work is left to be done. The app consists of a input field where user can enter a query word and on pressing search button a word cloud will be generated using the words related to the query word entered.

Loklak API is used to fetch all the tweets which contain the query word entered by the user.

These tweets are processed to generate the word cloud.

Related issue: https://github.com/fossasia/apps.loklak.org/pull/279

Live app: http://apps.loklak.org/LoklakWordCloud/

Developing the app

The main challenge in developing this app is implementing its prime feature, that is, generating the word cloud. How do we get a dynamic word cloud which can be easily generated by the user based on the word he has entered? Well, here comes in Jqcloud. An awesome lightweight Jquery plugin for generating word clouds. All we need to do is provide list of words along with their weights.

Let us see step by step how this app (first version) works. First we require all the tweets which contain the entered word. For this we use Loklak search service. Once we get all the tweets, then we can parse the tweet body to create a list of words along with their frequency.

var url = "http://35.184.151.104/api/search.json?callback=JSON_CALLBACK&count=100&q=" + query;
        $http.jsonp(url)
            .then(function (response) {
                $scope.createWordCloudData(response.data.statuses);
                $scope.tweet = null;
            });

Once we have all the tweets, we need to extract the tweet texts and create a list of valid words. What are valid words? Well words like ‘the’, ‘is’, ‘a’, ‘for’, ‘of’, ‘then’, does not provide us with any important information and will not help us in doing any kind of analysis. So there is no use of including them in our word cloud. Such words are called stop words and we need to get rid of them. For this we are using a list of commonly used stop words. Such lists can be very easily found over the internet. Here is the list which we are using. Once we are able to extract the text from the tweets, we need to filter stop words and insert the valid words into a list.

 tweet = data[i];
            tweetWords = tweet.text.replace(", ", " ").split(" ");

            for (var j = 0; j < tweetWords.length; j++) {
                word = tweetWords[j];
                word = word.trim();
                if (word.startsWith("'") || word.startsWith('"') || word.startsWith("(") || word.startsWith("[")) {
                    word = word.substring(1);
                }
                if (word.endsWith("'") || word.endsWith('"') || word.endsWith(")") || word.endsWith("]") ||
                    word.endsWith("?") || word.endsWith(".")) {
                    word = word.substring(0, word.length - 1);
                }
                if (stopwords.indexOf(word.toLowerCase()) !== -1) {
                    continue;
                }
                if (word.startsWith("#") || word.startsWith("@")) {
                    continue;
                }
                if (word.startsWith("http") || word.startsWith("https")) {
                    continue;
                }
                $scope.filteredWords.push(word);
            }

What are we actually doing in the above snippet? We are simply iterating over each of the statuses returned by Loklak API. For each tweet, first we are splitting the text into words and then we are iterating over those words. For a given word we do a number of checks. First we check if the word begins or ends with a special character, for example quotation marks or brackets. If so we remove those character as it will cause trouble in calculating frequencies. Next we also check if the word is beginning with ‘#’ or ‘@’. If it is true, then we discard such words as we are handling hashtags and mentions separately. Finally we check whether the word is a stop word or not. If it is a stop word then we discard it. If a word passes all the checks, we add it to our list of valid words.

Once we are done with the tweet bodies, next we need to handle hashtags and mentions.

tweet.hashtags.forEach(function (hashtag) {
                $scope.filteredWords.push("#" + hashtag);
            });

            tweet.mentions.forEach(function (mention) {
                $scope.filteredWords.push("@" + mention);
            });

The above code simply iterates over the hashtags and mentions and inserts them into the filteredWords list. We have handled hashtags and mentions separately so that we can apply filters in future.

Once we are done with generating list of valid words, we need to calculate weight for each of the word. Here weight is nothing but the number of times a particular word is present in the list. We calculate this using JavaScript object. We iterate over the list of valid words. If word is not present in the object (or dictionary as you wish to call it), we create a new key by the name of that word and set its value to one. If a word is already present as a key, then we simply increment its value by one.

for (var word in $scope.wordFreq) {
            $scope.wordCloudData.push({
                text: word,
                weight: $scope.wordFreq[word]
            });
        }

The above code snippet calculates the frequency of each word by the process mentioned above.

Now we are all set to generate our word cloud. We simply use Jqcloud’s interface to configure it with the words and their respective frequencies, provide a list of color codes for a color gradient, and set autoResize to true so that our word cloud resizes itself when the screen size changes.

$scope.generateWordCloud = function() {
        if ($scope.wordCloud === null) {
            $scope.wordCloud = $('.wordcloud').jQCloud($scope.wordCloudData, {
                colors: ["#D50000", "#FF5722", "#FF9800", "#4CAF50", "#8BC34A", "#4DB6AC", "#7986CB", "#5C6BC0", "#64B5F6"],
                fontSize: {
                    from: 0.06,
                    to: 0.01
                },
                autoResize: true
            });
        } else {
            $scope.wordCloud = $(".wordcloud").jQCloud('update', $scope.wordCloudData);
        }
    }

Whenever the user searches for a new word, we simply update the existing word cloud with the cloud of the new word.

Future roadmap

  • Make the words in the cloud clickable. On clicking a word, the cloud should get replaced by the selected word’s cloud.
  • Add filters for hashtags, mentions, date.
  • Add option for exporting the cloud to an image, so that user’s can also use this app as a tool to generate word clouds as images and save them.
  • Add a loader and error notification for invalid or empty input.

Important resources

  • View the app source code here.
  • Learn more about Loklak API here.
  • Learn more about Jqcloud here.
  • Learn more about AngularJS here.

Visualising Tweet Statistics in MultiLinePlotter App for Loklak Apps

MultiLinePlotter app is now a part of Loklak apps site. This app can be used to compare aggregations of tweets containing a particular query word and visualise the data for better comparison. Recently there has been a new addition to the app. A feature for showing tweet statistics like the maximum number of tweets (along with date) containing the given query word and the average number of tweets over a period of time. Such statistics is visualised for all the query words for better comparison.

Related issue: https://github.com/fossasia/apps.loklak.org/issues/236

Obtaining Maximum number of tweets and average number of tweets

Before visualising the statistics we need to obtain them. For this we simply need to process the aggregations returned by the Loklak API. Let us start with maximum number of tweets containing the given keyword. What we actually require is what is the maximum number of tweets that were posted and contained the user given keyword and on which date the number was maximum. For this we can use a function which will iterate over all the aggregations and return the largest along with date.

$scope.getMaxTweetNumAndDate = function(aggregations) {
        var maxTweetDate = null;
        var maxTweetNum = -1;

        for (date in aggregations) {
            if (aggregations[date] > maxTweetNum) {
                maxTweetNum = aggregations[date];
                maxTweetDate = date;
            }
        }

        return {date: maxTweetDate, count: maxTweetNum};
    }

The above function maintains two variables, one for maximum number of tweets and another for date. We iterate over all the aggregations and for each aggregation we compare the number of tweets with the value stored in the maxTweetNum variable. If the current value is more than the value stored in that variable then we simply update it and keep track of the date. Finally we return an object containing both maximum number of tweets and the corresponding date.Next we need to obtain average number of tweets. We can do this by summing up all the tweet frequencies and dividing it by number of aggregations.

$scope.getAverageTweetNum = function(aggregations) {
        var avg = 0;
        var sum = 0;

        for (date in aggregations) {
            sum += aggregations[date];
        }

        return parseInt(sum / Object.keys(aggregations).length);
    }

The above function calculates average number of tweets in the way mentioned before the snippet.

Next for every tweet we need to store these values in a format which can easily be understood by morris.js. For this we use a list and store the statistics values for individual query words as objects and later pass it as a parameter to morris.

var maxStat = $scope.getMaxTweetNumAndDate(aggregations);
        var avg = $scope.getAverageTweetNum(aggregations);

        $scope.tweetStat.push({
            tweet: $scope.tweet,
            maxTweetCount: maxStat.count,
            maxTweetOn: maxStat.date,
            averageTweetsPerDay: avg,
            aggregationsLength: Object.keys(aggregations).length
        });

We maintain a list called tweetStat and the list contains objects which stores the query word and the corresponding values.

Apart from plotting these statistics, the app also displays the statistics when user clicks on an individual treat present in the search record section. For this we filter tweetStat list mentioned above and get the required object corresponding to the query word the user selected bind it to angular scope. Next we display it using HTML.

<div class="tweet-stat max-tweet">
                  <div class="stat-label"> <h4>Maximum number of tweets containing '{{modalHeading}}' :</h4></div>
                  <div class="stat-value"> <strong>{{selectedTweetStat.maxTweetCount}}</strong> tweets on
                    <strong>{{selectedTweetStat.maxTweetOn}}</strong>
                  </div>
                </div>

Finally we need to plot the statistics. For this we use a function called plotStatGraph dedicated only for plotting statistics graph. We pass the tweetStat list as a parameter to morris and configure all the other parameters.

$scope.plotStatGraph = function() {
        $scope.plotStat = new Morris.Bar({
            element: 'graph',
            data: $scope.tweetStat,
            xkey: 'tweet',
            ykeys: ['maxTweetCount', 'averageTweetsPerDay'],
            labels: ['Maximum no. of tweets : ', 'Average no. of tweets/day'],
            parseTime: false,
            hideHover: 'auto',
            resize: true,
            stacked: true,
            barSizeRatio: 0.40
        });
        $scope.graphLoading = false;
    }

But now we have two graphs. One for showing variations in aggregation and the other for showing statistics. How do we manage them? Somehow we need to show them in the same page as this is a single page app. Also we need to avoid vertical scrolling as it would degrade both UI and UX. So we need to implement a switching mechanism. The user should be able to switch between the two graph views as per their wish. How to achieve that? Well, for this we maintain a global variable which will keep track of the current plot type. If the current graph type is aggregations then we call the function to plot aggregations otherwise we call the above mentioned function to plot statistics.

$scope.plotData = function() {
        $(".plot-data").html("");
        if ($scope.currentGraphType === "aggregations") {
            $scope.plotAggregationGraph();
        } else {
            $scope.plotStatGraph();
        }
    }

Lastly we integrate this state variable (currentGraphType) with the UI so that users can easily toggle between graph views with just a click.

<div class="switch" ng-click="toggle()">
                <span ng-if="queryRecords.length !== 0" class="glyphicon glyphicon-stats"></span>
              </div>

Important resources

Developing MultiLinePlotter App for Loklak

MultiLinePlotter is a web application which uses Loklak API under the hood to plot multiple tweet aggregations related to different user provided query words in the same graph. The user can give several query words and multiple lines for different queries will be plotted in the same graph. In this way, users will be able to compare tweet distribution for various keywords and visualise the comparison. All the searched queries are shown under the search record section. Clicking on a record causes a dialogue box to pop up where the individual tweets related to the query word is displayed. Users can also remove a series from the plot dynamically by just pressing the Remove button beside the query word in record section. The app is presently hosted on Loklak apps site.

Related issue – https://github.com/fossasia/apps.loklak.org/issues/225

Getting started with the app

Let us delve into the working of the app. The app uses Loklak aggregation API to get the data.

A call to the API looks something like this:

http://api.loklak.org/api/search.json?q=fossasia&source=cache&count=0&fields=created_at

A small snippet of the aggregation returned by the above API request is shown below.

"aggregations": {"created_at": {
    "2017-07-03": 3,
    "2017-07-04": 9,
    "2017-07-05": 12,
    "2017-07-06": 8,
}}

The API provides a nice date v/s number of tweets aggregation. Now we need to plot this. For plotting Morris.js has been used. It is a lightweight javascript library for visualising data.

One of the main features of this app is addition and removal of multiple series from the graph dynamically. How do we achieve that? Well, this can be achieved by manipulating the morris.js data list whenever a new query is made. Let us understand this in steps.

At first, the data is fetched using angular HTTP service.

$http.jsonp('http://api.loklak.org/api/search.json?callback=JSON_CALLBACK',
            {params: {q: $scope.tweet, source: 'cache', count: '0', fields: 'created_at'}})
                .then(function (response) {
                    $scope.getData(response.data.aggregations.created_at);
                    $scope.plotData();
                    $scope.queryRecords.push($scope.tweet);
                });

Once we get the data, getData function is called and the aggregation data is passed to it. The query word is also stored in queryRecords list for future use.

In order to plot a line graph morris.js requires a data object which will contain the required values for a series. Given below is an example of such a data object.

data: [
    { x: '2006', a: 100, b: 90 },
    { x: '2007', a: 75,  b: 65 },
    { x: '2008', a: 50,  b: 40 },
    { x: '2009', a: 75,  b: 65 },
],

For every ‘x’, ‘a’ and ‘b’ will be plotted. Thus two lines will be drawn. Our app will also maintain a data list like the one shown above, however, in our case, the data objects will have a variable number of keys. One key will determine the ‘x’ value and other keys will determine the ordinates (number of tweets).

All the data objects present in the data list needs to be updated whenever a new search is done.

The getData function does this for us.

var value = $scope.tweet;
        for (date in aggregations) {
            var present = false;
            for (var i = 0; i < $scope.data.length; i++) {
                var item = $scope.data[i];
                if (item['day'] === date) {
                    item[[value]] = aggregations[date];
                    $scope.data[i] = item
                    present = true;
                    break;
                }
            }
            if (!present) {
                $scope.data.push({day: date, [value]: aggregations[date]});
            }
        }


The for loop in the above code snippet updates the global data list used by morris.js. It simply iterates over the dates in the aggregation, extracts the object corresponding to a particular date, adds the new query word as a key and, the number of tweets on that date as the value.If a date is not already present in the list, then it inserts a new object corresponding to the date and query word. Once our data list is updated, we are ready to redraw the graph with the updated data. This is done using plotData function. The plotData function simply checks the user selected graph type. If the selected type is aggregations then it calls plotAggregationGraph() to redraw the aggregations plot.

$scope.remove = function(record) {
        $scope.queryRecords = $scope.queryRecords.filter(function(e) {
            return e !== record });

        $scope.data.forEach(function(item) {
            delete item[record];
        });

        $scope.data = $scope.data.filter(function(item) {
            return Object.keys(item).length !== 1;
        });

        $scope.ykeys = $scope.ykeys.filter(function(item) {
            return item !== record;
        });

        $scope.labels = $scope.labels.filter(function(item) {
            return item !== record;
        });

        $scope.plotData();
}

The above function simply scans the data list, filters the objects which contains selected record as a key and removes them using filter method of javascript arrays. It also removes the corresponding labels and entries from labels and ykeys arrays. Finally, it once again calls plotData function to redraw the plot.

Given below is a sample plot generated by this app with the query words – google, android, microsoft, samsung.

 

Conclusion

This blog post explained how multiple series are plotted dynamically in the MultiLinePlotter app. Apart from aggregations plot it also plots tweet statistics like maximum tweets and average tweets containing a query word and visualises them using stacked bar chart. I will be discussing about them in my subsequent blogs.

Important resources