Configurable Services in Loklak Search

Loklak Search, being an Angular application, organizes shared code into a special kind of class called a service. Services have important characteristics which make them a powerful feature of Angular:

  • Services are shared, common objects wired together by Dependency Injection (DI).
  • Services are lazily instantiated at runtime.

 

DI and instantiation are handled by Angular itself, so we don’t have to bother about them. The part we are always concerned with is the logic of the service. Because services are shared code, when writing one we must be sure that it contains exactly the code we want to share across our components; otherwise we end up with a poor architecture which makes the application harder to debug.

The next question is how services differ from something like Redux state. The difference lies in the word itself: services don’t hold a persistent state of their own. They are just a set of methods which separate a common piece of code from all the components into one class. Their functions take an input, process it and return an output.

Services in Loklak Search

In Loklak Search, the main services are the ones which, on request, fetch data from the backend API and return it to the requester. Every service has a fixed, well-defined task: to use the API and get the data. This is how all services should be designed, with a specific set of goals. A service should never try to do more than is necessary; in other words, each service should have one and only one aim, and it should do it well.

In Loklak Search, the services are classified by the API endpoints they hit to retrieve data. These services receive the query to be searched from the requester, send the AJAX request to the correct API endpoint and return the fetched data. This is the common structure of all the Loklak services: each has a fetchQuery() method which takes a string argument query, requests the API for that query and, on completion, either returns the response from the API or throws an error if something goes wrong.

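import { Injectable } from '@angular/core';

@Injectable()
class SearchService {
  // Requests the API for the given query.
  public fetchQuery(query: string) { }
  // Extracts the data from the API response.
  private extractData(response) { }
  // Handles errors raised while requesting the API.
  private handleError(error) { }
}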
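To make the skeleton above more concrete, here is a minimal sketch of how such a service could be filled in. It is not the actual Loklak implementation: it uses plain Promises and an injected HTTP function instead of Angular's HTTP client, and the endpoint URL, the ApiResponse shape and the HttpGet type are assumptions made only for this illustration.

```typescript
// Hypothetical response shape; a real API response carries more fields.
interface ApiResponse {
  statuses: unknown[];
}

// The small part of an HTTP response that this sketch relies on.
interface HttpResponse {
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}

// Injected HTTP function, so the service does not depend on a specific client.
type HttpGet = (url: string) => Promise<HttpResponse>;

class SearchService {
  // Assumed endpoint, used only for this sketch.
  private static readonly endpoint = 'https://api.loklak.org/api/search.json';
  private readonly httpGet: HttpGet;

  constructor(httpGet: HttpGet) {
    this.httpGet = httpGet;
  }

  // Requests the API for the query; resolves with the data or rejects with an error.
  public fetchQuery(query: string): Promise<ApiResponse> {
    const url = `${SearchService.endpoint}?q=${encodeURIComponent(query)}`;
    return this.httpGet(url)
      .then((response) => this.extractData(response))
      .catch((error: unknown) => this.handleError(error));
  }

  // Rejects on a non-success status, otherwise parses the JSON body.
  private extractData(response: HttpResponse): Promise<ApiResponse> {
    if (!response.ok) {
      throw new Error(`Request failed with status ${response.status}`);
    }
    return response.json().then((body) => body as ApiResponse);
  }

  // Wraps any failure in a single, predictable error type.
  private handleError(error: unknown): never {
    const message = error instanceof Error ? error.message : String(error);
    throw new Error(`SearchService: ${message}`);
  }
}
```

In a browser, the service could be created with new SearchService((url) => fetch(url)). Note that the service keeps no state between calls, which is exactly the property discussed earlier.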

Problems faced in this structure

This simple structure was good enough in the early stages of the application, but as the number of features increased, our simple service became less and less flexible. The fetchQuery() method takes only a query string as an argument and requests the API for that query, along with some query parameters. These query parameters are additional information given to the server so that it processes and responds to our request in a particular way, such as the number of results to be fetched, the aggregations to be carried out, and more. In this implementation the parameters were set up solely by the service itself, so they were fixed inside the service and there was no easy way to modify them. All requesters were bound to a fixed set of parameters, which made the service hard to reuse in other places of the application.

 

Solution – Service Configs

The solution to this problem of service customizability is the Service Config classes. Objects of these classes contain the query parameters, which each requester can configure according to its specific needs; our services then simply set up the query params accordingly. A shared structure for service configuration fits our scenario well, since we want extra control over the parameters which our service sends.

@Injectable()
class SearchService {
  // The config carries the query parameters chosen by the requester.
  public fetchQuery(query: string, config: SearchServiceConfig) { }
  private extractData(response) { }
  private handleError(error) { }
}

This small modification to our service structure gives us the amount of control we required. The config class is a fairly simple one.

export class SearchServiceConfig {
  private count: number;
  private source: Source;
  private fields: Set<AggregationFields>;
  private aggregationLimit: number;
  private maximumRecords: number;
  private startRecord: number;
  private timezoneOffset: string;
  private filters: Set<Filter>;

  // Other methods to get/set these attributes
}

Any requester now instantiates a new object of this class, sets the attributes according to its needs and passes the object to the fetchQuery() method of our service, which builds the request accordingly.
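As a rough sketch of this flow, the snippet below models a cut-down config with only a few of the attributes and a helper which turns a query plus a config into request parameters. The default values, the Source values, the parameter names and the buildQueryParams helper are all assumptions for illustration, not the actual Loklak code.

```typescript
// Hypothetical set of sources; the real Source type is not shown in this post.
type Source = 'all' | 'cache' | 'backend' | 'twitter';

// Cut-down stand-in for SearchServiceConfig with chainable setters.
class SearchServiceConfig {
  private count = 30; // assumed default
  private source: Source = 'all'; // assumed default
  private fields = new Set<string>();
  private timezoneOffset = '0';

  getCount(): number { return this.count; }
  setCount(count: number): this { this.count = count; return this; }

  getSource(): Source { return this.source; }
  setSource(source: Source): this { this.source = source; return this; }

  getFields(): string[] { return Array.from(this.fields); }
  addField(field: string): this { this.fields.add(field); return this; }

  getTimezoneOffset(): string { return this.timezoneOffset; }
  setTimezoneOffset(offset: string): this { this.timezoneOffset = offset; return this; }
}

// Builds the URL query part from the query and whatever the requester configured.
function buildQueryParams(query: string, config: SearchServiceConfig): string {
  const pairs: [string, string][] = [
    ['q', query],
    ['count', String(config.getCount())],
    ['source', config.getSource()],
    ['timezoneOffset', config.getTimezoneOffset()],
  ];
  const fields = config.getFields();
  if (fields.length > 0) {
    // Aggregation fields are only sent when the requester asked for some.
    pairs.push(['fields', fields.join(',')]);
  }
  return pairs
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(value)}`)
    .join('&');
}

// One requester wants few results with hashtag aggregations ...
const sidebarConfig = new SearchServiceConfig().setCount(10).addField('hashtags');
// ... another one is happy with the defaults.
const feedConfig = new SearchServiceConfig();

const sidebarParams = buildQueryParams('fossasia', sidebarConfig);
const feedParams = buildQueryParams('fossasia', feedConfig);
```

Both requesters share the same service code; only the config object they pass in differs.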

Conclusion

In conclusion, I would like to mention how these attributes were chosen to be part of the config rather than the query string. Our API endpoints accept the query string along with some attributes which filter the results or run aggregations on various fields. All these attributes belong in our config, as they may vary according to the requester’s needs. Configurable services thus let us not only reuse the existing models and services in multiple situations but also write more predictable code.


Query Model Structure of Loklak Search

Need to restructure

The earlier versions of the Loklak Search application suffered from breaking changes whenever a new feature was added. The main reason for these unintended bugs was identified to be the existing query structure, which consisted of a single entity: a string named queryString.

export interface Query {
 queryString: string;
}

This simple query string property was good enough for the simple string based searches which were the goal of the application in its initial phases, but as the application progressed we realized that this string based implementation was not going to be feasible in the long term, as there are only a limited number of things we can do with strings. It becomes extremely difficult to set and reset portions of the string according to the requirements. The main vision for the alternate architecture was therefore to make it easy to enable and disable features on the fly.

Application Structure

Therefore, to overcome the difficulties faced with the simple string based structure, we introduced an attribute based structure for queries. The attribute based structure is simpler to understand, which makes the state of the query easier to maintain in the application.

export interface Query {
 displayString: string;
 queryString: string;
 routerString: string;
 filter: FilterList;
 location: string;
 timeBound: TimeBound;
 from: boolean;
}

The reason this is called an attribute based structure is that each property of the interface is independent of the others. Each one can be thought of as a separate little key placed on the query, and these keys do not interfere with one another. This means that if I want to set one attribute of the query, it does not matter which other attributes are already present on it. The query will eventually be processed and sent to the server, and the corresponding valid results, if any exist, will be shown to the user.

Now the question arises: how do we modify the values of these attributes? This interface is actually instantiated in the Redux state, which answers the question: the query structure in the Redux state is modified by specific reducers, one for each attribute. These reducers are in turn triggered by the corresponding actions.

export const ActionTypes = {
 VALUE_CHANGE: '[Query] Value Change',
 FILTER_CHANGE: '[Query] Filter Change',
 LOCATION_CHANGE: '[Query] Location Change',
 TIME_BOUND_CHANGE: '[Query] Time Bound Change',
};

This ActionTypes object contains the corresponding actions which are used to trigger the reducers. These actions can be dispatched by any of the components in response to any user interaction, thus modifying a particular state attribute via the reducer.
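To illustrate how an action changes exactly one attribute and leaves the rest of the query alone, here is a minimal reducer sketch. The payload shapes, the reduced Query interface and the queryReducer name are assumptions for this example; the real reducers in the project may look different.

```typescript
// Reduced versions of the structures from this post; field types are assumed.
interface TimeBound {
  since: Date | null;
  until: Date | null;
}

interface Query {
  displayString: string;
  location: string | null;
  timeBound: TimeBound;
}

const ActionTypes = {
  VALUE_CHANGE: '[Query] Value Change',
  LOCATION_CHANGE: '[Query] Location Change',
  TIME_BOUND_CHANGE: '[Query] Time Bound Change',
} as const;

// Each action carries only the attribute it is responsible for.
type QueryAction =
  | { type: typeof ActionTypes.VALUE_CHANGE; payload: string }
  | { type: typeof ActionTypes.LOCATION_CHANGE; payload: string | null }
  | { type: typeof ActionTypes.TIME_BOUND_CHANGE; payload: TimeBound };

// Pure reducer: returns a new state object and never mutates the old one.
function queryReducer(state: Query, action: QueryAction): Query {
  switch (action.type) {
    case ActionTypes.VALUE_CHANGE:
      return { ...state, displayString: action.payload };
    case ActionTypes.LOCATION_CHANGE:
      return { ...state, location: action.payload };
    case ActionTypes.TIME_BOUND_CHANGE:
      return { ...state, timeBound: action.payload };
    default:
      return state;
  }
}

const initialQuery: Query = {
  displayString: 'fossasia',
  location: null,
  timeBound: { since: null, until: null },
};

// Setting the location does not touch displayString or timeBound.
const withLocation = queryReducer(initialQuery, {
  type: ActionTypes.LOCATION_CHANGE,
  payload: 'Singapore',
});
```

Resetting an attribute is just another action with a null payload, which is much simpler than cutting a substring out of a query string.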

Converting from object to string

For our API endpoint to understand our query, we need to send a string in the format the API accepts. This requires converting dynamically from query state to query string, so we need a simple function which takes the query state as input and returns the query string as output.

export function parseQueryToQueryString(query: Query): string {
  // Start from the text typed by the user.
  let qs: string = query.displayString;
  // Append each attribute only if it is set on the query.
  if (query.location) {
    qs += ` near:${query.location}`;
  }
  if (query.timeBound.since) {
    qs += ` since:${parseDateToApiAcceptedFormat(query.timeBound.since)}`;
  }
  if (query.timeBound.until) {
    qs += ` until:${parseDateToApiAcceptedFormat(query.timeBound.until)}`;
  }
  return qs;
}

In this function we just check the various attributes set in the structure, update the query string accordingly and return it. So if we eventually have to convert to a string anyway, what is the advantage of this approach? The main advantage is that we know the query structure beforehand and use it to build the string, instead of selecting and removing pieces of information from a string ad hoc. Whenever we update any attribute of the query state, the query string is generated afresh rather than by modifying the old string.
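The following self-contained sketch shows that regeneration in action. It repeats the conversion function with a reduced Query type so that it can run on its own; the YYYY-MM-DD date format inside parseDateToApiAcceptedFormat is an assumption, since the helper’s real implementation is not shown here.

```typescript
interface TimeBound {
  since: Date | null;
  until: Date | null;
}

interface Query {
  displayString: string;
  location: string;
  timeBound: TimeBound;
}

// Assumed format: the UTC date as YYYY-MM-DD.
function parseDateToApiAcceptedFormat(date: Date): string {
  return date.toISOString().slice(0, 10);
}

function parseQueryToQueryString(query: Query): string {
  let qs: string = query.displayString;
  if (query.location) {
    qs += ` near:${query.location}`;
  }
  if (query.timeBound.since) {
    qs += ` since:${parseDateToApiAcceptedFormat(query.timeBound.since)}`;
  }
  if (query.timeBound.until) {
    qs += ` until:${parseDateToApiAcceptedFormat(query.timeBound.until)}`;
  }
  return qs;
}

const fullQuery: Query = {
  displayString: 'fossasia',
  location: 'Singapore',
  timeBound: { since: new Date(Date.UTC(2017, 0, 15)), until: null },
};

// 'fossasia near:Singapore since:2017-01-15'
const fullString = parseQueryToQueryString(fullQuery);

// Disabling the location is a state change; the string is simply rebuilt.
const withoutLocation: Query = { ...fullQuery, location: '' };

// 'fossasia since:2017-01-15'
const reducedString = parseQueryToQueryString(withoutLocation);
```

No part of the old string is searched or cut; removing the location filter never risks damaging the since: part.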

Conclusion

This approach lets the application modify the search queries sent to the server in a streamlined and logical way, using just a simple data structure. The query model has given us many advantages, visible in application stability and performance. It also cuts out dirty regex matching of typed queries, which again helps us make simpler queries.


Using JS Profiler to Solve App Slowness in Loklak Search

The Loklak Search web application had an issue of application slowdown in the development environment. There was a highly prominent lag while typing a search query in the search box. The issue had been observed, but the reason for the lag and slowdown was unknown. This blog explains how the issue was identified, what the reasons behind it were, and how it was solved.

The most important aspect of fixing any bug is to first identify it. In our case, what helped us identify the issue was proper use of the JS profiler to find the hidden cause underneath.

Reporting of the issue

The issue was observed and reported in issue #365. On discussion it turned out that the issue was not observed by all users but only by some of them. This made it really interesting: the application as a whole was being shipped uniformly to all users, yet only a fraction of them faced this issue, and all the people who experienced it were developers of the application.

Detection

To detect the issue, it was important to have a starting point for the search. The only thing we knew was that the issue occurred only when typing in the application’s search box, so this was our starting ground. The most important tool for solving such issues is the browser itself. The dev tools in most modern browsers have a built-in JavaScript profiler. A profiler is a tool which records, at run time, the activities happening in the various threads of the application, most importantly the main thread. The JS profiler can be found in the browser dev tools; in Chrome 59 it is in the Performance tab.

Every profiler has a record button; pressing it starts monitoring all the activities on the various JS threads and other pipelines. This profiler, and an understanding of its logs, is the key to solving any performance related issue in a web application.

So we started by hitting the record button and then typing in the search box, and looked at the profiling output. This output contains a lot of numbers, and since the output as a whole can be overwhelming, it is important to understand the important bits of it.

Observations

We will go piece by piece and will try to make sense of the output and if possible identify the reason for our original issue.

 

  • This is the timeline strip which represents the overall view of the main JS thread. The yellow portions show the percentage of CPU utilization over time; here we can clearly see consistent, blocking CPU utilization by the main thread over a span of 10 seconds (during which we typed). The high CPU utilization starts with our first keystroke.
  • The red rectangles at the top represent dropped frames, i.e. frames which are never rendered due to heavy load, and the green line at the bottom represents the frames per second, which is practically zero. This implies that the CPU is entirely hogged by the main thread and there is no room left for rendering frames.
  • This is the Interactions panel, which shows all the user generated events that are triggered. It clearly shows that the key up and key character events are being triggered but take a lot of processing time. The first key press, key character and key up events run normally, but each subsequent event hogs the CPU.
  • The above observation indicates that whatever the issue is, it lies between the registering of an event and its final processing.
  • So the natural question is what we do during this processing. Looking at the code, we dispatch a Redux action saying that the query string has been updated, and the reducer updates the state according to that action. This would imply that our reducer is taking the time to update the state. But this does not make sense, as reducers are pure functions, i.e. they don’t make any blocking calls, so they can’t hog all the CPU cycles for 10 seconds.
  • So we need to investigate further to identify what happens after the action is dispatched to the store. This is the function call stack trace on the main thread. What we see here is each nested function call as we move down the tree.
  • The above trace shows everything that happens when a key is pressed. A key point to observe is that the time to process the events increases per keystroke: it starts at 112ms for the first keystroke and jumps to 400ms for the subsequent key presses, which causes the slowdown.
  • In the above stack trace we can find the rate determining step (RDS) of each event’s processing. The RDS is the slowest step, the one which causes the whole process to execute slowly.
  • As we move down the tree we find that most steps take very nominal time, but some function further down is causing the issue. So we move down the tree to identify the rate determining function call.
  • As we move down the tree we first find our action dispatch call, as expected. The processing of this call itself does not take much time, but some of the functions called in that process are causing our issue, so we continue to move down the tree.
  • Here we find our culprit: the slowest step is a method which calls our StoreDevtools module and notifies it about the latest store update. This notifying action takes the most time.
  • So the issue was caused by the import of the dev-tools module in our app module:
    StoreDevtoolsModule.instrumentOnlyWithExtension();
  • Another important observation is why only a few people were facing this issue.
    • The only people who faced this issue were the developers working on the project because, as the name “instrumentOnlyWithExtension” in the import statement says, this module only runs when the Devtools Extension is installed.
    • And this is the case with the application developers only.
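The reasoning in the observations above, a cheap reducer but an expensive step when the store notifies its listeners, can be reproduced with a toy store. The sketch below is not ngrx and does not claim to show how StoreDevtools works internally: TinyStore and the heavy listener which re-serializes a growing history are hypothetical stand-ins, used only to show how timing the reducer and the notifications separately points at the slow part.

```typescript
type Reducer<S> = (state: S, action: string) => S;
type Listener<S> = (state: S, action: string) => void;

interface DispatchTiming {
  reducerMs: number;
  notifyMs: number;
}

// Toy store which times the reducer and the listener notifications separately.
class TinyStore<S> {
  private state: S;
  private readonly reducer: Reducer<S>;
  private readonly listeners: Listener<S>[] = [];

  constructor(reducer: Reducer<S>, initialState: S) {
    this.reducer = reducer;
    this.state = initialState;
  }

  getState(): S {
    return this.state;
  }

  subscribe(listener: Listener<S>): void {
    this.listeners.push(listener);
  }

  dispatch(action: string): DispatchTiming {
    const start = Date.now();
    this.state = this.reducer(this.state, action);
    const afterReducer = Date.now();
    for (const listener of this.listeners) {
      listener(this.state, action);
    }
    const afterNotify = Date.now();
    return { reducerMs: afterReducer - start, notifyMs: afterNotify - afterReducer };
  }
}

// Pure reducer: appends the typed character to the query string.
const store = new TinyStore<string>((query, typedChar) => query + typedChar, '');

// Hypothetical heavy listener: keeps every state and re-serializes all of them
// on each update, so its cost grows with every keystroke.
const snapshots: string[] = [];
store.subscribe((state) => {
  snapshots.push(state);
  JSON.stringify(snapshots);
});

// Simulate typing a query and collect the timings per keystroke.
const timings: DispatchTiming[] = 'loklak'
  .split('')
  .map((typedChar) => store.dispatch(typedChar));
```

With such a split it becomes visible that the reducer time stays flat while the notification time depends entirely on what the listeners do, which is the same conclusion the profiler’s call tree led us to.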

Fixing the issue

  • To fix the issue, it was decided that for best performance the StoreDevTools module will not be included in the package by default.
  • The module can be selectively imported for testing purposes as and when required by the developers.
  • It will always be removed before merging into the main branch.

Results

After all these efforts, it is worth mentioning the speed gains achieved by removing the store dev-tools module.

  • This is the profiler output after running the profiling on similar queries as before. Without even going into detail we can see it is much cleaner now.
  • These are the main thread and frame rates achieved after the removal of the module. We can clearly see that the main thread is completely free, and we thus achieve high frame rates.
  • The last important observation is that the action dispatch, which was taking about 350ms earlier, now takes only 2.5ms. That is a huge performance boost of about 140x at each key press.

Conclusion

When a web application responds slowly or lags, users leave the application altogether, which is a huge setback to the efforts we developers make to increase user interaction. Thus, any issue on the client side must be solved as quickly as possible. The most important tool we use to test and diagnose what is going wrong with the application is the browser itself. The browser and its dev-tools provide the tools required to optimize the application’s performance.


Routing in Loklak Search SPA Hosted on GitHub-Pages

Single page applications like Loklak Search have only a single physical HTML page but many different logical pages. This is achieved with the help of JavaScript. Angular provides strong abstractions to help developers write the logical pages, and the Angular Router manages which page to display when a particular view is demanded. The Angular Router manages the URL state and the state of the corresponding components.

Overview:

Now it’s important to understand how the routing works. In every web application structure there is an index.html file; for an SPA this is the one and only physical HTML page. This page is served when the base URL is requested, i.e. the web server gives this file to the client when the client requests the base URL of the application, in our case http://loklak.net/. This index.html page references all the JS and CSS required by the application, which are loaded as normal HTTP requests. Once the JavaScript is loaded, control of the application passes to Angular, which manages all the components, views and routes. When a user demands some route, Angular updates the view with the components required to show the contents of that route.

 

It’s important to note here that the client makes no further plain page requests; Angular is the one which controls all such navigation. When we define a route in Angular, say /search, we tell it which component should be shown to the user when the route is /search. Whenever the user navigates to such a route, Angular shows the appropriate component; no request is made by the client to the server for the route.

 

In fact, the server doesn’t even know of such routes, as they are defined in Angular itself, not on the server side. The important question is what happens when the client refreshes the page while the URL is /search. Now the request is sent to the server, which doesn’t know which page to serve for such a request, so it responds with a 404: Page Not Found error.

Solution:

The solution to this situation is pretty straightforward: instead of responding with the 404 page, the server returns the same index.html page. This makes the problem go away entirely. The server always responds with the same page; this page loads the JavaScript and gives control to Angular, which then resolves which page to show according to the URL. This is a clean, simple solution to the SPA routing problem. But it only works when you have access to the server code and can make the server respond with the same page whatever the URL.
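The server side decision can be sketched as a small pure function, assuming the server knows which physical files exist. The function name, the file list and the paths are hypothetical and only illustrate the fallback rule; a real server would plug this into its own request handling.

```typescript
// Decides which physical file to serve for a requested URL.
// Known files are served as they are; everything else falls back to index.html,
// so that Angular can resolve the route on the client.
function resolveFileToServe(requestUrl: string, existingFiles: ReadonlySet<string>): string {
  // Ignore the query part of the URL when looking for a file.
  const queryStart = requestUrl.indexOf('?');
  const path = queryStart === -1 ? requestUrl : requestUrl.slice(0, queryStart);
  const candidate = path === '/' ? '/index.html' : path;
  return existingFiles.has(candidate) ? candidate : '/index.html';
}

// Hypothetical build output of a single page application.
const builtFiles = new Set<string>(['/index.html', '/main.js', '/styles.css']);

const forRoot = resolveFileToServe('/', builtFiles); // '/index.html'
const forAsset = resolveFileToServe('/main.js', builtFiles); // '/main.js'
const forRoute = resolveFileToServe('/search?query=fossasia', builtFiles); // '/index.html'
```

The browser still sees the URL /search in its address bar, so once Angular has loaded from index.html it can show the search page.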

The problem with gh-pages:

But Loklak Search is deployed on gh-pages, so we don’t have any access to the server side code to make it serve the page we require. GitHub Pages is a plain, simple server: it just serves the HTML file which matches the requested URL. If we request the /search route on gh-pages, it responds with a 404 error, as it is unable to find the search.html file in the directory.

How it is achieved in Loklak Search?

As a workaround to this problem, we take advantage of the page served by gh-pages when the requested page is not found. When a 404 error occurs, gh-pages responds with the 404.html file if it exists in the repository; otherwise it responds with a default 404 page. We use this 404.html page to load our main index.html page.

 

Here we have our 404.html file, which is served by gh-pages whenever an unknown URL is requested:

  • Here we first store the actual location which was requested by the user in the session storage.
  • Then we do a redirecting refresh to the root page using the meta tag, i.e. we load index.html, which is known to gh-pages and is served.

https://gist.github.com/hemantjadon/9581dd0b4907c567b2a90eb949c5cbbc.js

<!-- 404.html -->
<!doctype html>
<html>
   <head>
      <!-- This stores the URL the user was attempting to go to in sessionStorage, and then redirects all 404 responses to the app’s index.html page -->
      <script>
         sessionStorage.redirect = location.href;
      </script>
      <meta http-equiv="refresh" content="0;URL='http://loklak.net/'">
   </head>
   <body>
   </body>
</html>

Then in our index.html, after loading Angular and all the resources, we run a function which takes the redirect key out of session storage and compares it to the current location.

If the locations don’t match, we replace the history state with the redirect, and this time, as Angular is already loaded, it knows which page to show to the user.

<!-- index.html -->

...

...

<script>

   (function(){
     var redirect = sessionStorage.redirect;
     delete sessionStorage.redirect;
     if (redirect && redirect != location.href) {
        history.replaceState(null, null, redirect);
     }
   })();

</script>

...

...
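The two halves of this trick can be followed end to end in a small sketch which replaces the browser’s sessionStorage with an in-memory stand-in. The function names, the KeyValueStore interface and the MemoryStore class are made up for this illustration; in the real pages the same logic lives in the two inline scripts shown above.

```typescript
// The part of the Storage API which the trick relies on.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

// In-memory stand-in for sessionStorage, so the flow can run outside a browser.
class MemoryStore implements KeyValueStore {
  private readonly data = new Map<string, string>();

  getItem(key: string): string | null {
    const value = this.data.get(key);
    return value === undefined ? null : value;
  }

  setItem(key: string, value: string): void {
    this.data.set(key, value);
  }

  removeItem(key: string): void {
    this.data.delete(key);
  }
}

// Step done in 404.html: remember where the user actually wanted to go.
function rememberRequestedUrl(storage: KeyValueStore, requestedHref: string): void {
  storage.setItem('redirect', requestedHref);
}

// Step done in index.html: read the stored URL once and decide whether the
// history state has to be replaced. Returns the URL to restore, or null.
function consumeRedirect(storage: KeyValueStore, currentHref: string): string | null {
  const redirect = storage.getItem('redirect');
  storage.removeItem('redirect');
  if (redirect && redirect !== currentHref) {
    return redirect;
  }
  return null;
}

const storage = new MemoryStore();

// The user refreshes /search, gh-pages answers with 404.html ...
rememberRequestedUrl(storage, 'http://loklak.net/search?query=fossasia');
// ... the meta refresh loads the root page, where index.html restores the URL.
const urlToRestore = consumeRedirect(storage, 'http://loklak.net/');
// A normal visit to the root page afterwards finds nothing to restore.
const nothingToRestore = consumeRedirect(storage, 'http://loklak.net/');
```

In the browser, the value returned by consumeRedirect is what gets passed to history.replaceState; the key is deleted right away so that a later plain visit to the root page is not redirected again.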


Continuous Deployment Implementation in Loklak Search

At the current pace of web technology, quick response times and low downtime are core goals of any project. To achieve a continuous deployment scheme, the most important factor is how efficiently contributors and maintainers are able to test and deploy the code with every PR. We faced this question when we started building Loklak Search.

As Loklak Search is a data driven client side web app, GitHub Pages is the simplest way to set it up. At FOSSASIA, apps are developed by many developers working together on different features. This makes it all the more important to have a unified flow of control and simple integration with GitHub Pages as the continuous deployment pipeline.

So the broad concept of continuous deployment boils down to three basic requirements:

  1. Automatic unit testing.
  2. The automatic build of the applications on the successful merge of PR and deployment on the gh-pages branch.
  3. Easy provision of demo links for the developers to test and share the features they are working on before the PR is actually merged.

Automatic Unit Testing

At Loklak Search we use Karma unit tests. We get major help from angular/cli, which helps in running the unit tests. The main part of automatic unit testing is Travis CI, which we use as the CI solution. All these things are pretty easy to set up and use.

Travis CI has a particular advantage which is the ability to run custom shell scripts at different stages of the build process, and we use this capability for our Continuous Deployment.

Automatic Builds of PR’s and Deploy on Merge

This is the main requirement of our CD scheme, and we meet it with a shell script: the deploy.sh file in the project repository root.

There are a few critical sections in the deploy script. The script starts with initialisation instructions which set up the appropriate variables and decrypt the SSH key which Travis uses for pushing to the gh-pages branch (we will set up this key later).

  • Here we also make sure that the deploy script runs only when the build is for the master branch, by exiting early from the script if it is not.
#!/bin/bash

SOURCE_BRANCH="master"
TARGET_BRANCH="gh-pages"

# Pull requests and commits to other branches shouldn't try to deploy.
if [ "$TRAVIS_PULL_REQUEST" != "false" -o "$TRAVIS_BRANCH" != "$SOURCE_BRANCH" ]; then
echo "Skipping deploy; The request or commit is not on master"
exit 0
fi
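The condition in this guard is easy to misread, so here is the same decision written as a small TypeScript function. This is only an illustration of the logic, not part of deploy.sh; the TravisEnv interface and the shouldDeploy name are made up for this sketch. It relies on Travis setting TRAVIS_PULL_REQUEST to "false" for ordinary push builds and, for pull request builds, setting TRAVIS_BRANCH to the branch the PR targets, which is why both checks are needed.

```typescript
// The two Travis CI environment variables which the guard looks at.
interface TravisEnv {
  TRAVIS_PULL_REQUEST: string; // "false" for push builds, otherwise the PR number
  TRAVIS_BRANCH: string;
}

const SOURCE_BRANCH = 'master';

// Mirrors the shell test: the script skips the deploy when the build is a pull
// request OR the branch is not master, so it deploys only when neither holds.
function shouldDeploy(env: TravisEnv): boolean {
  const isPullRequest = env.TRAVIS_PULL_REQUEST !== 'false';
  const isSourceBranch = env.TRAVIS_BRANCH === SOURCE_BRANCH;
  return !isPullRequest && isSourceBranch;
}

// Push to master: deploy.
const pushToMaster = shouldDeploy({ TRAVIS_PULL_REQUEST: 'false', TRAVIS_BRANCH: 'master' });
// Pull request against master: the branch matches, but it is still skipped.
const prAgainstMaster = shouldDeploy({ TRAVIS_PULL_REQUEST: '42', TRAVIS_BRANCH: 'master' });
// Push to some other branch: skipped.
const pushToFeature = shouldDeploy({ TRAVIS_PULL_REQUEST: 'false', TRAVIS_BRANCH: 'feature' });
```

Only the first of the three cases results in a deployment, which is what keeps unmerged work out of the gh-pages branch.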

 

  • We also store important information regarding the deploy keys which are generated manually and are encrypted using travis.
# Save some useful information
REPO=`git config remote.origin.url`
SSH_REPO=${REPO/https:\/\/github.com\//git@github.com:}
SHA=`git rev-parse --verify HEAD`

# Decryption of the deploy_key.enc
ENCRYPTED_KEY_VAR="encrypted_${ENCRYPTION_LABEL}_key"
ENCRYPTED_IV_VAR="encrypted_${ENCRYPTION_LABEL}_iv"
ENCRYPTED_KEY=${!ENCRYPTED_KEY_VAR}
ENCRYPTED_IV=${!ENCRYPTED_IV_VAR}
openssl aes-256-cbc -K $ENCRYPTED_KEY -iv $ENCRYPTED_IV -in deploy_key.enc -out deploy_key -d

chmod 600 deploy_key
eval `ssh-agent -s`
ssh-add deploy_key

 

  • We clone our repo from GitHub and then go to the Target Branch which is gh-pages in our case.
# Cloning the repository to repo/ directory,
# Creating gh-pages branch if it doesn't exist, else moving to that branch
git clone $REPO repo
cd repo
git checkout $TARGET_BRANCH || git checkout --orphan $TARGET_BRANCH
cd ..

# Setting up the username and email.
git config user.name "Travis CI"
git config user.email "$COMMIT_AUTHOR_EMAIL"

 

  • Now we clean up the directory so that a fresh build is done every time. We protect the static files which are not generated by the build process: CNAME and 404.html.
# Cleaning up the old repo's gh-pages branch except CNAME file and 404.html
find repo/* ! -name "CNAME" ! -name "404.html" -maxdepth 1  -exec rm -rf {} \; 2> /dev/null
cd repo

git add --all
git commit -m "Travis CI Clean Deploy : ${SHA}"

 

  • After checking out the master branch we run npm install to install all our dependencies and then build the project. Then we move the files generated by ng build to our gh-pages branch and make a commit to this branch.
git checkout $SOURCE_BRANCH
# Actual building and setup of current push or PR.
npm install
ng build --prod --aot
git checkout $TARGET_BRANCH
mv dist/* .
# Staging the new build for commit; and then committing the latest build
git add .
git commit --amend --no-edit --allow-empty

 

  • The final step is to push our build files to the gh-pages branch. As we only want to push the build if it has actually changed, we add a check for that.
# Deploying only if the build has changed
if [ -z "$(git diff --name-only HEAD HEAD~1)" ]; then

echo "No Changes in the Build; exiting"
exit 0

else
# There are changes in the Build; push the changes to gh-pages
echo "There are changes in the Build; pushing the changes to gh-pages"

# Actual push to gh-pages branch via Travis
git push --force $SSH_REPO $TARGET_BRANCH
fi

 

These 70 lines of code do all our heavy lifting and automate a large part of our CD. The script makes sure that no incorrect builds enter the gh-pages branch and gives a smoother experience to both developers and maintainers.

An important aspect of this script is making sure Travis is able to push to gh-pages. This requires proper setup of keys, and it is definitely the trickiest part of the whole setup.

  • The first step is to generate the SSH key. This is done easily using terminal and ssh-keygen.
$ ssh-keygen -t rsa -b 4096 -C "your_email@example.com"

 

  • I would recommend not using a passphrase, as it would then be required by Travis and would be tricky to set up.
  • This generates the RSA public/private key pair.
  • We now add the public key as a deploy key in the settings of the repository.
  • After setting up the public key on GitHub, we give the private key to Travis so that Travis is able to push to GitHub.
  • For this we use the Travis client, which encrypts the key properly and sends the key and iv to Travis. Travis is then able to decrypt the private key using these values.
$ travis encrypt-file deploy_key
encrypting deploy_key for domenic/travis-encrypt-file-example
storing result as deploy_key.enc
storing secure env variables for decryption

Please add the following to your build script (before_install stage in your .travis.yml, for instance):

    openssl aes-256-cbc -K $encrypted_0a6446eb3ae3_key -iv $encrypted_0a6446eb3ae3_key -in super_secret.txt.enc -out super_secret.txt -d

Pro Tip: You can add it automatically by running with --add.

Make sure to add deploy_key.enc to the git repository.
Make sure not to add deploy_key to the git repository.
Commit all changes to your .travis.yml.

 

  • Make sure to add deploy_key.enc to git repository and not to add deploy_key to git.

After all these steps everything is done: our client-side web application will deploy on every push to the master branch.

These steps are required only one time in project life cycle. At loklak search, we haven’t touched the deploy.sh since it was written, it’s a simple script but it does all the work of Continuous Deployment we want to achieve.

Generation of Demo Links and Test Deployments

An essential part of continuous agile development is that developers are able to share what they have built and maintainers are able to review those features and fixes. This is difficult in a web application, as fixes and features are more often than not visual, and attaching screenshots to every PR becomes a hassle. If developers are able to deploy their changes on their own gh-pages and share the demo links with the PR, it’s a big win for development at a faster pace.

This step is highly specific to Angular projects, although there are similar approaches for React and other frameworks as well; failing that, we can build the page ourselves and push our changes to the gh-pages branch of our fork.

We use @angular/cli to build the project and then the angular-cli-ghpages npm package to actually push to the gh-pages branch of the fork. These commands are combined and provided as the npm script npm run deploy. This makes our CD scheme complete.

Conclusion

Clearly, the continuous deployment scheme has a lot of advantages over other methods, especially for client side web apps where there are a lot of PRs. It eliminates the deployment hassles in a simple way, so that no deployment requires manual intervention. Developers can concentrate on coding the application, and maintainers can review the PRs by looking at the demo links and merge when they feel the PR is in good shape. The deployment itself is done entirely by the shell script, without requiring any commands from a developer or a maintainer.

Links

Loklak Search GitHub Repository: https://github.com/fossasia/loklak_search

Loklak Search Application: http://loklak.net/

Loklak Search TravisCI: https://travis-ci.org/fossasia/loklak_search/

Deploy Script: https://github.com/fossasia/loklak_search/blob/master/deploy.sh
