Iterating the Loklak Server data

Loklak Server is amazing for what it does, but it is more impressive how it does the tasks. Iterators are used for and how to use them, but this project has a customized iterator that iterates Twitter data objects. This iterator is Timeline.java .

Timeline implements an interface iterable (isn’t it iterator?). This interface helps in using Timeline as an iterator and add methods to modify, use or create the data objects. At present, it only iterates Twitter data objects. I am working on it to modify it to iterate data objects from all web scrapers.

The following is a simple example of how an iterator is used.

// Initializing arraylist
List<String> stringsList = Arrays.asList("foo", "bar", "baz");

// Using iterator to display contents of stringsList
System.out.print("Contents of stringsList: ");

Iterator iter = al.iterator();
while(iter.hasNext()) {
    System.out.print(iter.next() + " ");
}

 

This iterator can only iterate data the way array does. (Then why do we need it?) It does the task of iterating objects perfectly, but we can add more functionality to the iterator.

 

Timeline iterator iterates the MessageEntry objects i.e. superclass of TwitterTweet objects. According to Javadocs, “Timeline is a structure which holds tweet for the purpose of presentation, There is no tweet retrieval method here, just an iterator which returns the tweets in reverse appearing order.”

Following are some of the tasks it does:

  1. As an iterator:

This basic use of Timeline is to iterate the MessageEntry objects. It not only iterates the data objects, but also fetches them (See here).

// Declare Timeline object according to order the data object has been created
Timeline tline = new Timeline(Timeline.parseOrder("created_at"));

// Adding data objects to the timeline
tline.add(me1);
tline.add(me2);
.
.
.
// Outputing all data objects as array of JSON objects
for (MessageEntry me: tline) {
    JSONArray postArray = new JSONArray();
    for (MessageEntry post : this) {
        postArray.put(post.toJSON());
    }
}

 

  1. The order of iterating the data objects

Timeline can arrange and iterate the data objects according to the date of creation of the twitter post, number of retweets or number of favourite counts. For this there is an Enum declaration of Order in the Timeline class which is initialized during creation of Timeline object. [link]

    Timeline tline = new Timeline(Timeline.parseOrder("created_at"));

 

  1. Pagination of data objects

There is an object cursor, some methods, including getter and setters to support pagination of the data objects. It is only internally implemented, but can also be used to return a section of the result.

  1. writeToIndex method

This method can be used to write all data fetched by Timeline iterator to ElasticSearch for indexing and to dump that can be used for testing. Thus, indexing of data can concurrently be done while it is iterated. It is implemented here.

  1. Other methods

It also has methods to output all data as JSON and customized method to add data to Timeline keeping user object and Data separate, etc. There are a bit more things in this iterable class which shall be explored instead.

 

Resources:

Published by

Vibhor Verma

Enthusiast | Passionate | on my way