Dispatching a search action to get result for a query in Loklak Search

For a new comer in Loklak, it’s not really easy to understand the complete codebase to just get results for a query without explicitly using navigation route i.e. ‘/search’. In order to help newcomers to easily understand a simple way to get results for a query in the codebase, I am writing this blog post.

Setting up a query

A sample searchQuery will be created of type Query. And following discussion will provide a way to get the results for a sample query – ‘from:Fossasia’ (Last 30 days tweets from FOSSASIA).

searchQuery: Query = {
    displayString: ‘from:FOSSASIA,
    queryString: ‘from:FOSSASIA’,
    routerString: ‘from:FOSSASIA’,
    filter: { video: false, image: false },
    location: null,
    timeBound: { since: null, until: null },
    from: true
}

 

Process discussed here can be used to get results in any class file within Loklak Search.

Using necessary imports

First step would be to add all necessary imports.

import { OnInit } from ‘@angular/core’;
import { Store } from ‘@ngrx/store’;
import { Observable } from ‘rxjs/Observable’;
import * as fromRoot from ‘../../reducers’;
import { Query } from ‘../../models/query’;
import * as searchAction from ‘../../actions/api’;
import { ApiResponseResult } from ‘../../models/api-response’;

 

Note: No need to import OnInit, if the class already implements it (OnInit is one of the Angular’s Lifecycle Hooks which controls the changes when the component containing it gets loaded initially).

The Query is used to define the type of searchQuery variable created above. And ApiResponseResult is the type of result which will be obtained on searching the searchQuery.

Declaring response variables and store object

Next step would be to declare the variables with the type of response and create store object of type Store with the current State of the application.

public isSearching$: Observable<boolean>;
public apiResponseResults$: Observable<ApiResponseResult[]>;
constructor(
    private store: Store<fromRoot.State>
) { }

 

isSearching$ holds an Observable which can be subscribed to get a boolean value which states the current status of searching of query. apiResponseResults$ holds the actual response of query.

Dispatching a SearchAction

The most crucial part comes here to dispatch a SearchAction with the created searchQuery as payload. A new method would be created with void as return type to dispatch the SearchAction.

private queryFromURL(): void {
    this.store.dispatch(
        new searchAction.SearchAction(this.searchQuery)
    );
}

Selecting and storing results from store

A new method would be created to store the current searching status along with the response of input query in isSearching$ and apiResponseResults$ respectively. Now the selector for the reducer function corresponding to the search and api response entities would be called.

private getDataFromStore(): void {
    this.isSearching$ = 
        this.store.select(fromRoot.getSearchLoading);
    this.apiResponseResults$ =   
        this.store.select(
            fromRoot.getApiResponseEntities
    );
}

Dispatching and Selecting on view Init

create ngOnInit() method, If you have not already done and call the respective dispatching and select method inside it respectively.

ngOnInit() {
    this.queryFromURL();
    this.getDataFromStore();
}

How should one test the results?

Call these lines inside any method and look for the results on browser console window.

console.log(this.isSearching$);
console.log(this.apiResponseResults$);

Resources

Continue Reading

Adding ‘Scroll to Top’ in Loklak Search

It is really important for a Progressive Web Application like Loklak to be user friendly. To enhance the user experience and help user scroll up through the long list of results, a new ‘Scroll to Top’ feature has been added to Loklak.

Integration Process

The feature is built using HostListener and Inject property of Angular, which allows to use Window and Document functionalities similar to Javascript.

Adding HostListener

First step would be to import HostListener and Inject, and create property of document using with Inject.

import { HostListener, Inject } from ‘@angular/core’;
navIsFixed: boolean;
constructor(
    @Inject(DOCUMENT) private document: Document
) { }

 

Next step would be to create method to detect the current state of window, and scroll to the top of window.

@HostListener(‘window:scroll’, [])
onWindowScroll() {
    if (window.pageYOffset
        || document.documentElement.scrollTop
        || document.body.scrollTop > 100) {
                this.navIsFixed = true;
    } else if (this.navIsFixed && window.pageYOffset
        || document.documentElement.scrollTop
            || document.body.scrollTop < 10) {
                    this.navIsFixed = false;
            }
        } scrollToTop() {
            (function smoothscroll() {
                const currentScroll =
                    document.documentElement.scrollTop
                        || document.body.scrollTop;
            if (currentScroll > 0) {
            window.requestAnimationFrame(smoothscroll);
            window.scrollTo(0,
                currentScroll  (currentScroll / 5));
        }
    })();
}

 

The last step would be to add user interface part i.e. HTML and CSS component for the Scroll to Top feature.

Creating user interface

HTML

HTML portion contains a div tag with class set to CSS created and a button with click event attached to it to call the scrollToTop() method created above.

CSS

The CSS class corresponding to the div tag provides the layout of the button.

button {
   display: block;
   margin: 32px auto;
   font-size: 1.1em;
   padding: 5px;
   font-weight: 700;
   background: transparent;
   border: none;
   cursor: pointer;
   height: 40px;
   width: 40px;
   font-size: 30px;
   color: white;
   background-color: #d23e42;
   border-radius: 76px;
   &:focus {
       border: none;
       outline: none;
   }
}
.scroll-to-top {
  position: fixed;
  bottom: 15px;
  right: 15px;
  opacity: 0;
  transition: all .2s ease-in-out;
}
.show-scroll {
  opacity: 1;
  transition: all .2s ease-in-out;
}

 

The button style creates a round button with an interactive css properties. scroll-to-top class defines the behaviour of the div tag when user scrolls on the results page.

How to use?

Goto Loklak and search a query. On results page, scroll down the window to see upward arrow contained in a red circle like:

On clicking the button, user should be moved to the top of results page.

Resources

Continue Reading

Adding Speech Component in Loklak Search

Speech recognition service for voice search is already embedded in Loklak. Now the idea is to use this service and create a new separate component for voice recognition with an interactive and user friendly interface. This blog will cover every single portion of an Angular’s redux based component from writing actions and reducers to the use of created component in other required components.

Creating Action and Reducer Function

The main idea to create an Action is to control the flow of use of Speech Component in Loklak Search. The Speech Component will be called on and off based on this Action.

Here, the first step is to create speech.ts file in actions folder with the following code:

import { Action } from '@ngrx/store';

export const ActionTypes = {
   MODE_CHANGE: '[Speech] Change',
};

export class SearchAction implements Action {
   type = ActionTypes.MODE_CHANGE;

   constructor(public payload: any) {}
}

export type Actions
   = SearchAction;

 

In the above segment, only one action (MODE_CHANGE) has been created which is like a boolean value which is being returned as true or false i.e. whether the speech component is currently in use or not. This is a basic format to be followed in creating an Action which is being followed in Loklak Search. The next step would be to create speech.ts file in reducers folder with the following code:

import { Action } from '@ngrx/store';
import * as speech from '../actions/speech';
export const MODE_CHANGE = 'MODE_CHANGE';

export interface State {
   speechStatus: boolean;
}
export const initialState: State = {
   speechStatus: false
};
export function reducer(state: State = initialState,
   action: speech.Actions): State {
        switch (action.type) {
            case speech.ActionTypes.MODE_CHANGE: {
                const response = action.payload;
                return Object.assign({}, state,
                {speechStatus: response});
       }
       default: {
           return state;
       }
   }
}
export const getspeechStatus = (state: State) =>
    state.speechStatus;

 

It follows the format of reducer functions created in Loklak Search. Here, the main key point is the state creation and type of value it is storing i.e. State is containing a speechStatus of type boolean. Defining an initial state with speechStatus value false (Considering initially the Speech Component will not be in use). The reducer function a new state by toggling the input state based on the type of Action created above and it returns the input state by default. At last wrapping the state as a function and returning the state’s speechStatus value.

Third and last step in this section would be to create a selector for the above reducer function in the root reducer index file.

Import and add speech from speech reducer file into the general state in root reducer file. And at last export the created selector function for speech reducer.

import * as fromSpeech from './speech';
export interface State {
   ...
   speech: fromSpeech.State;
}
export const getSpeechState = (state: State) =>
    state.speech;
export const getspeechStatus = createSelector(
    getSpeechState, fromSpeech.getspeechStatus);

Creating Speech Component

Now comes the main part to create and define the functioning of Speech Component. For creating the basic Speech Component, following command is used:

ng generate component app/speech --module=app

 

It will automatically create and provide Speech Component in app.module.ts. The working structure of Speech Component has been followed as of Google’s voice recognition feature for voice searching. Rather than providing description of each single line of code the following portion will cover the main code responsible for the functioning of Speech Component.

Importing and defining speech service in constructor:

import {
    SpeechService
} from '../services/speech.service';
constructor(
       private speech: SpeechService,
       private store: Store<fromRoot.State>,
       private router: Router
   ) {
       this.resultspage =this.router.url
       .toString().includes('/search');
       if (this.resultspage) {
           this.shadowleft = '-103px';
           this.shadowtop = '-102px';
       }
       this.speechRecognition();
}
speechRecognition() {
    this.speech.record('en_US').subscribe(voice =>
        this.onquery(voice));
}

 

When the Speech Component is called, speechRecognition() method will start recording speech (It will use the record() method from speech service to record the user voice).

For fluctuating border height and color of voice search icon, a resettimer() method is created.

randomize(min, max) {
    let x;
    x = (Math.random() * (max - min) + min);
    return x;
}
resettimer(recheck: boolean = false) {
    this.subscription.unsubscribe();
    this.timer = Observable.timer(0, 100);
    this.subscription = this.timer.subscribe(t => {
    this.ticks = t;
        if (t % 10 === 0 && t <= 20) {
            this.buttoncolor = '#f44';
            this.miccolor = '#fff';
            this.borderheight =
                this.randomize(0.7, 1);
            if (this.resultspage) {
                this.borderheight =
                    this.randomize(0.35, 0.5);
            }
            if (!recheck) {
                this.resettimer(true);
            }
        }
        if (t === 20) {
            this.borderheight = 0;
        }
        if (t === 30) {
            this.subscription.unsubscribe();
            this.store.dispatch(new speechactions
            .SearchAction(false));
        }
    });
}

 

The randomize() method provides a random number between min and max value.

To put on check and display status as message on things like whether microphone is working, or user has spoken something, or if the speech is being recorded, based on the time elapsed in calling of speech component and actual voice recording, the following portion of code is written in ngOnInit() method.

ngOnInit() {
    this.timer = Observable.timer(1500, 2000);
    this.subscription = this.timer.subscribe(t => {
        this.ticks = t;
        if (t === 1) {
            this.message = 'Listening...';
        }
        if (t === 4) {
            this.message = 'Please check your 
            microphone and volume levels.';
            this.miccolor = '#C2C2C2';
        }
        if (t === 6) {
            this.subscription.unsubscribe();
            this.store.dispatch(new speechactions
                .SearchAction(false));
        }
    });
}

 

The logic can be understood as if the elapsed time is 1 sec, it means it is listening to the speaker’s voice. And if the elapsed time is 4 sec, it means there is something wrong and user will be asked to check for the microphone and volume levels. At last if it tends to 6 seconds, then the Speech Component will be called off with the dispatched Action as false which is defined above (That means it is no longer in use).

Embed Speech Component in main App Component

Now comes the last part to use the created featured component in the required place. Code below describes embedding Speech Component in App Component.

Import SpeechService and required modules.

import {
    SpeechService
} from './services/speech.service';
import { Observable } from 'rxjs/Observable';

 

hidespeech will be used to store the current status of Speech Component (whether its in use or not), and completeQuery$ and searchData store the voice recorded in form Observable and String. completeQuery$ is optional (If the Speech Component is unable to track voice of speaker by any means, then it will not contain any value and hence searchData will be empty).

hidespeech: Observable<any>;
completeQuery$: Observable<any>;
searchData: String;

 

Creating speech parameter in constructor and store the current status of speech and store it into hidespeech. Based on the subscribed value of hidespeech, speech service’s stoprecord() will be called (To stop recording when the speech recognition completes). After recording stops, store the whole query in completeQuery$.

constructor (
    private speech: SpeechService
) {
    this.hidespeech = store.select(
        fromRoot.getspeechStatus);
    this.hidespeech.subscribe(hidespeech => {
        if (!hidespeech) {
            this.speech.stoprecord();
        }
    });
    this.completeQuery$ = store.select(
        fromRoot.getQuery);
    this.completeQuery$.subscribe(data => {
        this.searchData = data;
    });
}

 

Add the Speech Component in app.component.html. Now the main logic of calling Speech Component will be based on the subscribed observable value of hidespeech (If false then call Speech Component else not).

<app-speech *ngIf="hidespeech|async"></app-speech>

Using Speech Component in Home and FeedHeader Component

Import Speech Service and speech Action created above, and create hidespeech to store the current status of Speech Component.

import * as speechactions from '../../actions/speech';
import {
    SpeechService
} from '../../services/speech.service';
hidespeech: Observable<boolean>;

 

Create speech parameter of type SpeechService and store the current status of Speech Component in hidespeech. Dispatch speechactions.SearchAction (payload as true) for inferring that the Speech Component is currently in use.

constructor(
    private speech: SpeechService
) {
    this.hidespeech = store
        .select(fromRoot.getspeechStatus);
}
speechRecognition() {
    this.store.dispatch(
        new speechactions.SearchAction(true));
}

How to use the Speech Component?

Goto Loklak and click on Voice Input Icon. It will popup a screen as below.

Now, speak something to search. E.g. Google, the screen will turn into something like below with the spelled value displayed on screen.

If something goes wrong (Microphone did not work, low volume levels or unrecognisable voice), then screen will show something like:

On successful recognition of speech, the query will be set and the results will be shown as

Similar process is being followed on results page to make a search query using voice.

Resources

Continue Reading

Adding new test cases for increasing test coverage of Loklak Server

It is a good practice to have test cases covering major portion of actual code base. The idea was same to add new test cases in Loklak Server to increase its test coverage. The results were quite amazing with a significant increase of about 3% in total test coverage of the overall project. And about 80-100% increase in the test coverage of individual files for which tests have been written.

Integration Process

For integration, a total of 6 new test cases have been written:

  1. ASCIITest
  2. GeoLocationTest
  3. CommonPatternTest
  4. InstagramProfileScraperTest
  5. EventBriteCrawlerServiceTest
  6. LocationWiseTimeServiceTest

For increasing code coverage, Java docs have been written which increase the lines of code being covered.

Implementation

Basic implementation of adding a new test case is same. Here’s is an example of EventBriteCrawlerServiceTest implementation. This can be used as a reference for adding a new test case in the project.

Prerequisite: If the test file being written tests any additional external service (e.g. EventBriteCrawlerServiceTest tests any event being created on EventBrite platform) then, a corresponding new test service or test event should be written beforehand on the given platform. For EventBriteCrawlerServiceTest, a test-event has been created.

The given below steps will be followed for creating a new test case (EventBriteCrawlerServiceTest), assuming the prerequisite step has been done:

  • A new java file is generated in test/org/loklak/api/search/ as EventBriteCrawlerServiceTest.java.
  • Define package for the test file and import EventBriteCrawlerService.java file which has to be tested along with necessary packages and methods.

package org.loklak.api.search;

import org.loklak.api.search.EventBriteCrawlerService;
import org.loklak.susi.SusiThought;
import org.junit.Test;
import org.json.JSONArray;
import org.json.JSONObject;
import static org.junit.Assert.assertEquals;
import static org.junit.Assert.assertTrue;
import static org.junit.Assert.assertNotNull;

 

  • Define the class and methods that need to be tested. It is a good idea to break the testing into several test methods rather than testing more features in a single method.

public class EventBriteCrawlerServiceTest {
  @Test
  public void apiPathTest() { }

  @Test
  public void eventBriteEventTest() { }

  @Test
  public void eventBriteOrganizerTest() { }

  @Test
  public void eventExtraDetailsTest() { }
}

 

  • Create an object of EventBriteCrawlerService in each test method.

EventBriteCrawlerService eventBriteCrawlerService = new EventBriteCrawlerService();

 

  • Call specific method of EventBriteCrawlerService that needs to be tested in each of the defined methods of test file and assert the actualresult to the expectedresult.

@Test
public void apiPathTest() {
    EventBriteCrawlerService eventBriteCrawlerService = 
        new EventBriteCrawlerService();

     assertEquals("/api/eventbritecrawler.json",
        eventBriteCrawlerService.getAPIPath());
}

 

  • For methods fetching an actual result (Integration tests) of event page, define and initialise an expected set of results and assert the expected result with the actual result by parsing the json result.

@Test
public void eventBriteOrganizerTest() {
    EventBriteCrawlerService eventBriteCrawlerService =
        new EventBriteCrawlerService();
    String eventTestUrl = "https://www.eventbrite.com/e/
        testevent-tickets-46017636991";
    SusiThought resultPage = eventBriteCrawlerService
        .crawlEventBrite(eventTestUrl);
    JSONArray jsonArray = resultPage.getData();
    JSONObject organizer_details =
        jsonArray.getJSONObject(1);
    String organizer_contact_info = organizer_details
        .getString("organizer_contact_info");
    String organizer_link = organizer_details
        .getString("organizer_link");
    String organizer_profile_link = organizer_details
        .getString("organizer_profile_link");
    String organizer_name = organizer_details
        .getString("organizer_name");

    assertEquals("https://www.eventbrite.com
        /e/testevent-tickets-46017636991
        #lightbox_contact",
        organizer_contact_info);
    assertEquals("https://www.eventbrite.com
        /e/testevent-tickets-46017636991
        #listing-organizer", organizer_link);
    assertEquals("", organizer_profile_link);
    assertEquals("aurabh Srivastava", organizer_name);
}

 

  • If the test file is testing the harvester, then import and add the test class in TestRunner.java file. e.g.

import org.loklak.harvester.TwitterScraperTest;

@RunWith(Suite.class)
@Suite.SuiteClasses({
  TwitterScraperTest.class
})

Testing

The last step will be to check if the test case passes with integration of other tests. The gradle will test the test case along with other tests in the project.

./gradlew build

Resources

Continue Reading

Adding ‘Voice Search’ Feature in Loklak Search

It is beneficial to have a voice search feature embedded in a search engine like loklak.org based on PWA (Progressive Web Application) architecture. This will allow users to make query using voice on phone or computer. For integrating voice search, JavaScript Web Speech API (also known as webkitSpeechRecognition) is used which allows us to add speech recognition feature in any website or a web application.

Integration Process

For using webkitSpeechRecognition, a separate typescript service is being created along with its relevant unit test

  1. speech.service.ts
  2. speech.service.spec.ts

The main idea to implement the voice search was

  1. To create an injectable speech service.
  2. Write methods for starting and stopping voice record in it.
  3. Import and inject the created speech service into root app module.
  4. Use the created speech service into required HomeComponent and FeedHeaderComponent.

Structure of Speech Service

The speech service mainly consists of two methods

  • record() method which accepts lang as string input which specifies the language code for speech recording and returns a string Observable.

   record(lang: string): Observable<string> {
           return Observable.create(observe => {
                 const webkitSpeechRecognition }: IWindow = 
                        <IWindow>window;
                 this.recognition = new webkitSpeechRecognition();
                 this.recognition.continuous = true;
                 this.recognition.interimResults = true;
                 this.recognition.onresult = take => 
                 this.zone.run(()                            
                 observe.next(take
                 .results.item
                 (take.results.
                 length - 1).
                 item(0).
                 transcript);
                 this.recognition
                 .onerror = 
                 err => observe
                 .error(err);
                 this.recognition.onend 
                 = () => observe
                 .complete();
                 this.recognition.lang = 
                 lang;
                 this.recognition.start( 
             );
        });
     }

 

Using observe.complete() allows speech recognition action to stop when the user stops speaking.

  • stoprecord() method which is used to stop the current instance of recording.

 stoprecord() {
       if (this.recognition) {
           this.recognition.stop();
       }
   }

 

stop() method is used to stop the current instance of speech recognition.

Using Speech Service in required Component

In Loklak Search, the speech service have been included in two main components i.e. HomeComponent and FeedHeaderComponent. The basic idea of using the created speech service in these components is same.

In the TypeScript (not spec.ts) file of the two components, firstly import the SpeechService and create its object in constructor.

constructor( private speech: SpeechService ) { }

 

Secondly, define a speechRecognition() method and use the created instance or object of SpeechService in it to record speech and set the query as the recorded speech input. Here, default language has been set up as ‘en_US’ i.e. English.

 stoprecord() {
       if (this.recognition) {
           this.recognition.stop();
       }
   }

 

After user clicks on speech icon, this method will be called and the recorded speech will be set as the query and there will be router navigation to the /search page where the result will be displayed.

Resources

Continue Reading

Installing the Loklak Search and Deploying it to Surge

The Loklak search creates a website using the Loklak server as a data source. The goal is to get a search site, that offers timeline search as well as custom media search, account and geolocation search.

In order to run the service, you can use the API of http://api.loklak.org or install your own Loklak server data storage engine. Loklak_server is a server application which collects messages from various social media tweet sources, including Twitter. The server contains a search index and a peer-to-peer index sharing interface. All messages are stored in an elasticsearch index.

The site of this repo is deployed on the GitHub gh-pages branch and automatically deployed here: http://loklak.org

In this blog, we will talk about how to install Loklak_Search locally and deploying it to Surge (Static web publishing for Front-End Developers).

How to clone the repository

Sign up / Login to GitHub and head over to the Loklak_Search repository. Then follow these steps.

  1. Go ahead and fork the repository
https://github.com/fossasia/loklak_search
  1.   Get the clone of the forked version on your local machine using
git clone https://github.com/<username>/loklak_search.git
  1.   Add upstream to synchronize repository using
git remote add upstream https://github.com/fossasia/loklak_search.git

Getting Started

The Loklak search application basically consists of the following :

  1. First, we will need to install angular-cli by using the following command:
npm install -g @angular/[email protected]

2. After installing angular-cli we need to install our required node modules, so we will do that by using the following command:

npm install

3. Deploy locally by running this

ng serve

Go to localhost:4200 where the application will be running locally.

How to Deploy Loklak Search on Surge :

Surge is the technology which publishes or generates the static web-page demo link, which makes it easier for the developer to deploy their web-app. There are a lot of benefits of using surge over generating demo link using GitHub pages.

  1. We need to install surge on our machine. Type the following in your Linux terminal:
npm install –global surge

This installs the Surge on your machine to access Surge from the command line.

  1. In your project directory just run
surge
  1. After this, it will ask you three parameters, namely
Email
Password
Domain

After specifying all these three parameters, the deployment link with the respective domain is generated.

Auto deployment of Pull Requests using Surge :

To implement the feature of auto-deployment of pull request using surge, one can follow up these steps:

  • Create a pr_deploy.sh file
  • The pr_deploy.sh file will be executed only after success of Travis CI i.e. when Travis CI passes by using command bash pr_deploy.sh
#!/usr/bin/env bash
if [ “$TRAVIS_PULL_REQUEST” == “false” ]; then
echo “Not a PR. Skipping surge deployment.”
exit 0
fi
npm i -g surge
export [email protected]
# Token of a dummy account
export SURGE_TOKEN=d1c28a7a75967cc2b4c852cca0d12206
export DEPLOY_DOMAIN=https://pr-${TRAVIS_PULL_REQUEST}-fossasia-LoklakSearch.surge.sh
surge –project ./dist –domain $DEPLOY_DOMAIN;

Here, Travis CI is first installing surge locally by npm i -g surge  and then we are exporting the environment variables SURGE_LOGIN , SURGE_TOKEN and DEPLOY_DOMAIN.

Now, execute pr_deploy.sh file from .travis.yml by using command bash pr_deploy.sh

Resources

Continue Reading

Join Codeheat Coding Contest 2017

Codeheat is a coding contest for developers interested in contributing to Open Source software and hardware projects at FOSSASIA.  Join development of real world software applications, build up your developer profile, learn new new coding skills, collaborate with the community and make new friends from around the world! Sign up for #CodeHeat here now and follow Codeheat on Twitter.

The contest runs until 1st February 2018. All FOSSASIA projects take part in Codeheat including:

Grand prize winners will be invited to present their work at the FOSSASIA OpenTechSummit in Singapore from March 23rd -25th 2018 and will get 600 SGD in travel funding to attend, plus a free speaker ticket and beautiful Swag.

Our jury will choose three winners from the top 10 contributors according to code quality and relevance of commits for the project. The jury also takes other contributions like submitted weekly scrum reports and monthly technical blog posts into account, but of course awesome code is the most important item on the list.

Other participants will have the chance to win Tshirts, Swag and vouchers to attend Open Tech events in the region and will get certificates of participation.

codeheat-logo

Team mentors and jury members from 10 different countries support participants of the contest.

Participants should take the time to read through the contest FAQ and familiarize themselves with the introductory information and Readme.md of each project before starting to work on an issue.

Developers interested in the contest can also contact mentors through project channels on the FOSSASIA gitter.

Additional Links

Website: codeheat.org

Codeheat Twitter: twitter.com/codeheat_

Codeheat Facebook: facebook.com/codeheat.org

Participating Projects: All FOSSASIA Repositories on GitHub at github.com/fossasia

Continue Reading

Autolinker Component in Loklak Search

In Loklak Search the post items contain links, which are either internal or external. These links include the hashtags, mentions, and URLs. From the backend server we just received the message in the plain text format, and thus there is need to parse the plain text and render it as clickable links. These clickable links can be either internal or external. Thus we need an auto-linker component, which takes the text and render it as links.

The API of the Component

The component takes as the input the plain text, then four arrays of strings. Each containing the text to be linked. These are hashtags, mentions, links and the unshorten attribute which is used to unshorten the shortened URLs in the post. These attributes are used by the component to render the text in the appropriate format.

export class FeedLinkerComponent implements OnInit {
@Input() text: string;
@Input() hashtags: string[] = new Array<string>();
@Input() mentions: string[] = new Array<string>();
@Input() links: string[] = new Array<string>();
@Input() unshorten: Object = {};
}

The Logic of the Component

The basic logic of the component works as the following, we divide the text into chunks known as shards, we have three basic data structures for the component to work

  • The ShardType which is the type of the chunk it specifies whether it is plain, hashtags, mentions, and links.
  • The Shard which is the simple object containing the text to show, its type and the link it refers to

The StringIndexdChunks, they are utilized to index the chunks in the order in which they appear in the text.

const enum ShardType {
plain, // 0
link, // 1
hashtag, // 2
mention // 3
}

class Shard {
constructor (
public type: ShardType = ShardType.plain,
public text: String = '',
public linkTo: any = null,
public queryParams: any = null
) { }
}

interface StringIndexedChunks {
index: number;
str: string;
type: ShardType;
}

First we have a private method of the component which searches for all the elements (strings) in the text. Here we have an array which maintains the index of those chunks in the text.

private generateShards() {
const indexedChunks: StringIndexedChunks[] = [];

this.hashtags.forEach(hashtag => {
const indices = getIndicesOf(this.text, `#${hashtag}`, false);
indices.forEach(idx => {
indexedChunks.push({index: idx, str: `#${hashtag}`, type: ShardType.hashtag});
});
});

this.mentions.forEach(mention => {
const indices = getIndicesOf(this.text, `@${mention}`, false);
indices.forEach(idx => {
indexedChunks.push({index: idx, str: `@${mention}`, type: ShardType.mention});
});
});
}

Then we sort the chunks according to their indexes in the text. This gives us sorted array which consists of all the chunks sorted according to the indexes as they appear in the text.

indexedChunks.sort((a, b) => { return (a.index > b.index) ? 1 : (a.index < b.index) ? -1 : 0; });

The next part of the logic is to generate the shard array, an array which contains each chunk, once. To do this we iterate over the Sorted Indexed array created in the previous step and use it split the text into chunks. We iterate over the text and take substrings using the indexes of each element.

let startIndex = 0;
const endIndex = this.text.length;

indexedChunks.forEach(element => {
if (startIndex !== element.index) {
const shard = new Shard(ShardType.plain, this.text.substring(startIndex, element.index));
this.shardArray.push(shard);
startIndex = element.index;
}
if (startIndex === element.index) {
const str = this.text.substring(startIndex, element.index + element.str.length);
const shard = new Shard(element.type, str);
switch (element.type) {
case ShardType.link: {
if (this.unshorten[element.str]) {
shard.linkTo = str;
shard.text = this.unshorten[element.str];
}
else {
shard.linkTo = str;
}
break;
}

case ShardType.hashtag: {
shard.linkTo = ['/search'];
shard.queryParams = { query : str };
break;
}

case ShardType.mention: {
shard.linkTo = ['/search'];
shard.queryParams = { query : `from:${str.substring(1)}` };
break;
}
}
this.shardArray.push(shard);
startIndex += element.str.length;
}
});

if (startIndex !== endIndex) {
const shard = new Shard(ShardType.plain, this.text.substring(startIndex));
this.shardArray.push(shard);
}

After this we have generated the chunks of the text, now the only task is to write the view of the component which uses this Shard Array to render the linked elements.

<div class="textWrapper">
<span *ngFor="let shard of shardArray">
<span *ngIf="shard.type === 0"> <!-- Plain -->
{{shard.text}}
</span>
<span *ngIf="shard.type === 1"> <!-- URL Links -->
<a>{{shard.text}}</a>
</span>
<span *ngIf="shard.type === 2"> <!-- Hashtag -->
<a [routerLink]="shard.linkTo" [queryParams]="shard.queryParams">{{shard.text}}</a>
</span>
<span *ngIf="shard.type === 3"> <!-- Mention -->
<a [routerLink]="shard.linkTo" [queryParams]="shard.queryParams">{{shard.text}}</a>
</span>
</span>
</div>
  • This renders the chunks and handles the links of both internal and external type.
  • It also also makes sure that the links get unshortened properly using the unshorten API property.
  • Uses routerLink, angular property to link in application URLs, for asynchronous reloading while clicking links.

Resources and Links

This component is inspired from the two main open source libraries.

Earlier these libraries were used in the project, but as the need of unshortening and asynchronous linking appeared in the application, a custom implementation was needed to be implemented.

Continue Reading

Indexing for multiscrapers in Loklak Server

I recently added multiscraper system which can scrape data from web-scrapers like YoutubeScraper, QuoraScraper, GithubScraper, etc. As scraping is a costly task, it is important to improve it’s efficiency. One of the approach is to index data in cache. TwitterScraper uses multiple sources to optimize the efficiency.

This system uses Post message holder object to store data and PostTimeline (a specialized iterator) to iterate the data objects. This difference in data structures from TwitterScraper leads to the need of different approach to implement indexing of data to ElasticSearch (currently in review process).

These are the following changes I made while implementing ‘indexing of data’ in the project.

1) Writing of data is invoked only using PostTimeline iterator

In TwitterScraper, the data is written in message holder TwitterTweet. So all the tweets are written to index as they are created. Here, when the data is scraped, Writing of the posts is initiated. Scraping of data is considered a heavy process. This approach keeps lower resource usage in average traffic on the server.

protected Post putData(Post typeArray, String key, Timeline2 postList) {
   if(!"cache".equals(this.source)) {
       postList.writeToIndex();
   }
   return this.putData(typeArray, key, postList.toArray());
}

2) One object for holding a message

During the implementation, I kept the same message holder Post and post-iterator PostTimeline from scraping to indexing of data. This helps to keep the structure uniform. Earlier approach involves different types of message wrappers in the way. This approach cuts the processes for looping and transitioning of data structures.

3) Index a list, not a message

In TwitterScraper, as the messages are enqueued in the bulk to be indexed. But in this approach, I have enqueued the complete lists. This approach delays the indexing till the scraper is done with processing the html.

Creating the queue of postlists:

// Add post-lists to queue to be indexed
queueClients.incrementAndGet();
try {
    postQueue.put(postList);
} catch (InterruptedException e) {
DAO.severe(e);
}
queueClients.decrementAndGet();

 

Indexing of the posts in postlists:

// Start indexing of data in post-lists
for (Timeline2 postList: postBulk) {
    if (postList.size() < 1) continue;
    if(postList.dump) {
        // Dumping of data in a file
        writeMessageBulkDump(postList);
    }
    // Indexing of data to ElasticSearch
    writeMessageBulkNoDump(postList);
}

 

4) Categorizing the input parameters

While searching the index, I have divided the query parameters from scraper into 3 categories. The input parameters are added to those categories (implemented using map data structure) and thus data fetched are according to them. These categories are:

// Declaring the QueryBuilder
BoolQueryBuilder query = new BoolQueryBuilder();

 

a) Get the parameter– Get the results for the input fields in map getMap.

// Result must have these fields. Acts as AND operator
if(getMap != null) {
    for(Map.Entry<String, String> field : getMap.entrySet()) {
        query.must(QueryBuilders.termQuery(
field.getKey(), field.getValue()));
    }
}

 

b) Don’t get the parameter- Don’t get the results for the input fields in map notGetMap.

// Result must not have these fields.
if(notGetMap != null) {
    for(Map.Entry<String, String> field : notGetMap.entrySet()) {
        query.mustNot(QueryBuilders.termQuery(
                field.getKey(), field.getValue()));
    }
}

 

c) Get if possible- Get the results with the input fields if they are present in the index.

// Result may preferably also get these fields. Acts as OR operator
if(mayAlsoGetMap != null) {
    for(Map.Entry<String, String> field : mayAlsoGetMap.entrySet()) {
        query.should(QueryBuilders.termQuery(
                field.getKey(), field.getValue()));

    }
}

 

By applying these changes, the scrapers are shifted from a message indexing to list of messages indexing. This way we are keeping load on RAM low, but the aggregation of latest scraped data may be affected. So there will be a need to workaround to solve this issue while scraping itself.

References

Continue Reading

Restoring State after Orientation Change in Loklak Wok Android

During orientation change i.e. from portrait to landscape mode in Android, the current activity restarts again. As the activity restarts again, all the defined variables loose their previous value, for example the scroll position of a RecyclerView, or the data in the rows of RecyclerView etc. Just imagine a user searched some tweets in Loklak Wok Android, and as the user’s phone is in “Auto rotation” mode, the orientation changes from portrait to landscape. As a result of this, the user loses the search result and has to do the search again. This leads to a bad UX.

Saving state in onSavedInstanceState

The state of the app can be saved by inserting values in a Bundle object in onSavedInstanceState callback. Inserting values is same as adding elements to a Map in Java. Methods like putDouble, putFloat, putChar etc. are used where the first parameter is a key and the second parameter is the value we want to insert.

@Override
public void onSaveInstanceState(Bundle outState) {
   if (mLatitude != null && mLongitude != null) {
       outState.putDouble(PARCELABLE_LATITUDE, mLatitude);
       outState.putDouble(PARCELABLE_LONGITUDE, mLongitude);
   }
...
}

 

The values can be retrieved back when onCreate or onCreateView of the Activity or Fragment is called. Bundle object in the callback parameter is checked, whether it is null or not, if not the values are retrieved back using the keys provided at the time of inserting. The latitude and longitude of a location in TweetPostingFragment are retrieved in the same fashion

public void onViewCreated(View view, @Nullable Bundle savedInstanceState) {
   ...
   if (savedInstanceState != null) { // checking if bundle is null
       // extracting from bundle
       mLatitude = savedInstanceState.getDouble(PARCELABLE_LATITUDE);
       mLongitude = savedInstanceState.getDouble(PARCELABLE_LONGITUDE);
       // use extracted value
   }
}

Restoring Custom Objects, using Parcelable

But what if we want to restore custom object(s). A simple option can be serializing the objects using the native Java Serialization or libraries like Gson. The problem in these cases is performance, they are quite slow. Parcelable can be used, which leads the pack in performance and moreover it is provided by Android SDK, on top of that, it is simple to use.

The objects of class which needs to be restored implements Parcelable interface and the class must provide a static final object called CREATOR which implements Parcelable.Creator interface.

writeToParcel and describeContents method need to be override to implement Parcelable interface. In writeToParcel method the member variables are put inside the parcel, in our case describeContents method is not used, so, simply 0 is returned. Status class which stores the data of a searched tweet implements parcelable.

@Override
public int describeContents() {
   return 0;
}

@Override
public void writeToParcel(Parcel dest, int flags) {
   dest.writeString(mText);
   dest.writeInt(mRetweetCount);
   dest.writeInt(mFavouritesCount);
   dest.writeStringList(mImages);
   dest.writeParcelable(mUser, flags);
}

 

NOTE: The order in which variables are pushed into Parcel needs to be maintained while variables are extracted from the parcel to recreate the object. This is the reason why no “key” is required to push data into a parcel as we do in bundle.

The CREATOR object implements the creation of object from a Parcel. The CREATOR object overrides two methods createFromParcel and newArray. createFromParcel is the method in which we implement the way an object is created from a parcel.

public static final Parcelable.Creator<Status> CREATOR = new Creator<Status>() {
   @Override
   public Status createFromParcel(Parcel source) {
       return new Status(source); // a private constructor to create object from parcel
   }

   @Override
   public Status[] newArray(int size) {
       return new Status[size];
   }
};

 

The private constructor, note that the order in which variables were pushed is maintained while retrieving the values.

private Status(Parcel source) {
   mText = source.readString();
   mRetweetCount = source.readInt();
   mFavouritesCount = source.readInt();
   mImages = source.createStringArrayList();
   mUser = source.readParcelable(User.class.getClassLoader());
}

 

The status objects are restored the same way, latitude and longitude were restored. putParcelableArrayList in onSaveInstaceState and getParcelableArrayList in onCreateView methods are used to push into Bundle object and retrieve from Bundle object respectively.

@Override
public void onSaveInstanceState(Bundle outState) {
   super.onSaveInstanceState(outState);
   ArrayList<Status> searchedTweets = mSearchCategoryAdapter.getStatuses();
   outState.putParcelableArrayList(PARCELABLE_SEARCHED_TWEETS, searchedTweets);
   ...
}


// retrieval of the pushed values in bundle
@Override
public View onCreateView(LayoutInflater inflater, ViewGroup container,
                            Bundle savedInstanceState) {
   ...
   if (savedInstanceState != null) {
       ...
       List<Status> searchedTweets =
               savedInstanceState.getParcelableArrayList(PARCELABLE_SEARCHED_TWEETS);
       mSearchCategoryAdapter.setStatuses(searchedTweets);
   }
   ...
   return view;
}

Resources:

Continue Reading
Close Menu
%d bloggers like this: