Adding Yarn as a new Dependency Manager along with NPM in Susper

Dependency managers are software modules that coordinate the integration of external libraries or packages into a larger application stack. They use configuration files such as composer.json, package.json, build.gradle or pom.xml to determine which dependencies to get, which version of each dependency to get, and which repository to get them from.

Currently SUSPER has only NPM as a dependency manager, which is used to install all dependencies. In this blog, I will describe how we added Facebook's Yarn as a new dependency manager in Susper.

Let's check out Yarn in detail:

Yarn is a fast alternative to NPM. One of its great advantages is that it replaces the workflow of the npm client or other package managers while remaining compatible with the npm registry. Yarn was created by Facebook to solve particular problems faced while using NPM: inconsistent dependency installation at scale, and slow installs.

What are the advantages of using Yarn?

Improving network performance: Queuing up requests and avoiding request waterfalls helps to maximize network utilization.
Checking package integrity: Package integrity is checked after each install to avoid installing corrupt packages.
Caching: Yarn can install dependencies without an internet connection if the dependency has previously been installed on the system. This is done by caching.
Lock file: Lock files are used to make sure that the node_modules directory has the exact same structure on all development environments.

Source: https://yarnpkg.com/en/

How is Yarn installed along with NPM in SUSPER?

Installing Yarn is easy. Here are the steps to set up Yarn along with NPM and begin using it as a dependency manager. On Debian or Ubuntu Linux, we can install Yarn via Yarn's Debian package repository.
We will first need to configure the repository:

    curl -sS https://dl.yarnpkg.com/debian/pubkey.gpg | sudo apt-key add -
    echo "deb https://dl.yarnpkg.com/debian/ stable main" | sudo tee /etc/apt/sources.list.d/yarn.list

Then simply use:

    sudo apt-get update && sudo apt-get install yarn

Note: Ubuntu 17.04 comes with cmdtest installed by default. If you get errors while installing Yarn, remove cmdtest first with sudo apt remove cmdtest.

If you are using nvm, you can avoid the node installation by doing:

    sudo apt-get install --no-install-recommends yarn

Test that Yarn is installed by running:

    yarn --version

Now delete the node_modules folder so that all dependencies installed by npm are removed. Then run the yarn command in the project's repository:

    yarn

Wait while the dependencies are installed, and then we are done.

What is happening?

Yarn has created a lock file, yarn.lock. The file is updated after each operation (installing, updating or removing packages) to keep track of the exact package versions. If the file is kept in our Git repository, the exact same result in node_modules is made available on all systems.

Resources

Yarn: https://yarnpkg.com/en/
Announcement of Yarn: https://code.facebook.com/posts/1840075619545360
Yarn vs NPM: https://stackoverflow.com/questions/40027819/when-to-use-yarn-over-npm-what-are-the-differences
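To make the role of yarn.lock more concrete, here is a minimal Python sketch that reads a lock file in the version 1 text layout and lists the exact version each requested range is pinned to. The sample entries and the function name pinned_versions are hypothetical and only for illustration; this is not how Yarn itself reads the file.

```python
import re

# Hypothetical excerpt in the yarn.lock v1 layout (entries chosen for illustration only).
SAMPLE_LOCK = '''\
# yarn lockfile v1

left-pad@^1.1.0:
  version "1.1.3"

"@angular/core@^4.0.0":
  version "4.4.6"
  dependencies:
    tslib "^1.7.1"
'''


def pinned_versions(lock_text):
    """Map each requested range (e.g. 'left-pad@^1.1.0') to its locked version."""
    pinned = {}
    current_keys = []
    for line in lock_text.splitlines():
        if not line.strip() or line.startswith('#'):
            continue  # skip blank lines and comments
        if not line.startswith(' '):
            # Entry header: one or more comma-separated "name@range" specs ending in ':'
            current_keys = [k.strip().strip('"') for k in line.rstrip(':').split(',')]
        else:
            # Only the two-space-indented "version" field belongs to the entry itself.
            match = re.match(r'^  version "(.+)"$', line)
            if match:
                for key in current_keys:
                    pinned[key] = match.group(1)
    return pinned


if __name__ == '__main__':
    for spec, version in sorted(pinned_versions(SAMPLE_LOCK).items()):
        print(spec, '->', version)
```

Because every machine resolves the same range to the same pinned version, committing the lock file is what keeps node_modules identical across systems.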


Using Wikipedia API for knowledge graph in SUSPER

A Knowledge Graph is a way to give a brief description of a search query by connecting it to a real-world entity. This helps users get exactly the information they want. Previously Susper had a Knowledge Graph implemented using the DBpedia API. But since DBpedia does not provide content over HTTPS, the content was blocked on susper.com, and the Knowledge Graph had to be reimplemented using an API that serves content over HTTPS. In this blog, I will describe how getting a knowledge graph was made possible using the Wikipedia API.

What is the Wikipedia API?

The MediaWiki action API is a web service that provides convenient access to wiki features, data, and metadata over HTTP, via a URL usually at api.php. Clients request particular "actions" by specifying an action parameter, mainly action=query to get information.

The endpoint: https://en.wikipedia.org/w/api.php

The format: format=json
This tells the API that we want data to be returned in JSON format.

The action: action=query
The MediaWiki web service API implements dozens of actions, and extensions implement many more; the dynamically generated API help documents all available actions on a wiki. In this case, we're using the "query" action to get some information.

The complete API call used in SUSPER to extract information about a query is:

https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro=&explaintext=&titles=japan

where titles is the search query, here japan.

How is it implemented in SUSPER?

To implement it, a service has been created which fetches the information by setting the various URL parameters. The result can be fetched by creating an instance of the service and passing the search query to the getsearchresults(searchquery) function.
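    import { Injectable } from '@angular/core';
    import { Http, Jsonp, Headers, RequestOptions, URLSearchParams } from '@angular/http';
    import { Store } from '@ngrx/store';
    import 'rxjs/add/operator/map';
    import 'rxjs/add/operator/catch';
    // fromRoot is the app's root reducer module; its import path is not shown in this excerpt.

    @Injectable()
    export class KnowledgeapiService {
      server = 'https://en.wikipedia.org';
      searchURL = this.server + '/w/api.php?';
      homepage = 'http://susper.com';
      logo = '../images/susper.svg';

      constructor(private http: Http, private jsonp: Jsonp, private store: Store<fromRoot.State>) { }

      getsearchresults(searchquery) {
        // Query-string parameters of the MediaWiki "query" action.
        let params = new URLSearchParams();
        params.set('origin', '*');
        params.set('format', 'json');
        params.set('action', 'query');
        params.set('prop', 'extracts');
        params.set('exintro', '');
        params.set('explaintext', '');
        params.set('titles', searchquery);
        let headers = new Headers({ 'Accept': 'application/json' });
        let options = new RequestOptions({ headers: headers, search: params });
        // Emit only the "pages" object of the response.
        return this.http
          .get(this.searchURL, options)
          .map(res => res.json().query.pages)
          .catch(this.handleError);
      }
      // handleError is defined elsewhere in the service (not shown in this excerpt).
    }

Since the result is an observable, we have to subscribe to it and then extract the information into local variables in the infobox.component.ts file.

    import { OnInit, ChangeDetectorRef } from '@angular/core';
    import { Router, ActivatedRoute } from '@angular/router';
    // The @Component decorator and the remaining imports are omitted in this excerpt.

    export class InfoboxComponent implements OnInit {
      public title: string;
      public description: string;
      query$: any;
      resultsearch = '/search';

      constructor(private knowledgeservice: KnowledgeapiService,
                  private route: Router,
                  private activatedroute: ActivatedRoute,
                  private store: Store<fromRoot.State>,
                  private ref: ChangeDetectorRef) {
        this.query$ = store.select(fromRoot.getquery);
        this.query$.subscribe(query => {
          if (query) {
            this.knowledgeservice.getsearchresults(query).subscribe(res => {
              // "pages" is keyed by page id; take the first (and only) page.
              const pageId = Object.keys(res)[0];
              if (res[pageId].extract) {
                this.title = res[pageId].title;
                this.description = res[pageId].extract;
              } else {
                this.title = '';
                this.description = '';
              }
            });
          }
        });
      }
      // ngOnInit and the rest of the component are not shown in this excerpt.
    }

The variables title and description are used to display the result on the results page.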
    <div *ngIf="this.description" class="card">
      <div>
        <h2><b>{{this.title}}</b></h2>
        <p>{{this.description | slice:0:600}}<a href='https://en.wikipedia.org/wiki/{{this.title}}'>..more at Wikipedia</a></p>
      </div>
    </div>

Resources

1. MediaWiki API: https://www.mediawiki.org/wiki/API:Main_page
2. Stack Overflow: https://stackoverflow.com/questions/8555320/is-there-a-clean-wikipedia-api-just-for-retrieve-content-summary
3. Angular Docs: https://angular.io/tutorial/toh-pt4
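The request and the response handling described in this post can also be tried outside Angular. The following is a minimal Python sketch, not Susper's code: the function names build_extract_url and first_extract and the sample response are hypothetical, while the parameters are the ones listed above (the CORS-only origin parameter is left out because it is not needed outside a browser).

```python
from urllib.parse import urlencode

API_ENDPOINT = 'https://en.wikipedia.org/w/api.php'


def build_extract_url(title):
    """Build the query URL with the same parameters the post describes."""
    params = [
        ('format', 'json'),
        ('action', 'query'),
        ('prop', 'extracts'),
        ('exintro', ''),       # only the introduction section
        ('explaintext', ''),   # plain text instead of HTML
        ('titles', title),
    ]
    return API_ENDPOINT + '?' + urlencode(params)


def first_extract(pages):
    """Return (title, extract) for the first page in `query.pages`, or ('', '')."""
    if not pages:
        return ('', '')
    page = next(iter(pages.values()))
    extract = page.get('extract')
    if extract:
        return (page.get('title', ''), extract)
    return ('', '')


if __name__ == '__main__':
    print(build_extract_url('japan'))
    # Hypothetical, shortened response body for illustration.
    sample_pages = {'12345': {'pageid': 12345, 'title': 'Japan', 'extract': 'Japan is ...'}}
    print(first_extract(sample_pages))
```

Like the component above, the sketch falls back to empty strings when the page has no extract, so the info box can simply be hidden.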


Integrating YaCy Grid Locally with Susper

The YaCy Grid is the second-generation implementation of YaCy, a peer-to-peer search engine. The search results can be improved to a great extent by using YaCy Grid as the new backend for SUSPER. YaCy Grid is well suited to a distributed search topology. Legacy YaCy is made for a network that is both decentralised and distributed. While both networks are distributed, YaCy Grid is centralized and legacy YaCy is decentralized. YaCy Grid makes scaling much easier: scaling is in our hands and can be done in all aspects (loading, parsing, indexing) with the computing power we choose. In YaCy, Solr is embedded, but with YaCy Grid we get an Elasticsearch cluster. Both are built around the same underlying search library, Lucene, but Elasticsearch will help us scale almost indefinitely. In this blog, I will show you how to integrate YaCy Grid with Susper locally and how to use it to fetch results.

Implementing YaCy Grid with Susper:

Before using YaCy Grid, we first need to set it up and crawl a URL using the crawl start API. More information about that can be found in the posts "Implementing YaCy Grid with Susper" and "Setting up YaCy Grid locally". Once we are done with setup and crawling, we can begin using its APIs in Susper. The following are some easy steps to show results from YaCy Grid in a separate tab in Susper.

Step 1: Creating a service to fetch results:

To fetch results from the local YaCy Grid server, we need to create a service. Here is the class in grid-service.ts which fetches results for us.
    export class GridSearchService {
      server = 'http://127.0.0.1:8100';
      searchURL = this.server + '/yacy/grid/mcp/index/yacysearch.json?query=';

      constructor(private http: Http, private jsonp: Jsonp, private store: Store<fromRoot.State>) { }

      getSearchResults(searchquery) {
        // Encode the query so that characters such as '&' or spaces do not break the URL.
        return this.http
          .get(this.searchURL + encodeURIComponent(searchquery))
          .map(res => res.json())
          .catch(this.handleError);
      }
      // handleError is defined elsewhere in the service (not shown in this excerpt).
    }

Step 2: Modifying results.component.ts file

To get results from grid-service.ts in results.component.ts, we need to create an instance of the service, use it to fetch the results, store them in variables in results.component.ts, and then use these variables to show the results in the results template. The following code does this for us:

    ngOnInit() {
      this.grid.getSearchResults(this.searchdata.query).subscribe(res => {
        this.gridResult = res.channels;
      });
    }

    gridClick() {
      this.getPresentPage(1);
      this.resultDisplay = 'grid';
      // Nothing to show yet if the tab is clicked before the response has arrived.
      if (!this.gridResult || !this.gridResult.length) {
        return;
      }
      this.totalgridresults = this.gridResult[0].totalResults;
      this.gridmessage = 'About ' + this.totalgridresults + ' results';
      this.gridItems = this.gridResult[0].items;
      console.log(this.gridItems);
    }

Step 3: Creating a new tab to show results from YaCy Grid:

Now we need to create a tab in the template where we can use the local variables from results.component.ts to show the results, following the current design pattern. Here is the code for that:

    <li [class.active_view]="Display('grid')" (click)="gridClick()">YaCy_Grid</li>

    <!--YaCy Grid-->
    <div class="container-fluid">
      <div class="result message-bar" *ngIf="totalgridresults > 0 && Display('grid')">
        {{gridmessage}}
      </div>
      <div class="autocorrect">
        <app-auto-correct [hidden]="hideAutoCorrect"></app-auto-correct>
      </div>
    </div>
    <div class="grid-result" *ngIf="Display('grid')">
      <div class="feed container">
        <div *ngFor="let item of gridItems" class="result">
          <div class="title">
            <a class="title-pointer" href="{{item.link}}" [style.color]="themeService.titleColor">{{item.title}}</a>
          </div>
          <div class="link">
            <p [style.color]="themeService.linkColor">{{item.link}}</p>
          </div>
          <div class="description">
            <p [style.color]="themeService.descriptionColor">{{item.pubDate|date:'MMMM d, yyyy'}} - {{item.description}}</p>
          </div>
        </div>
      </div>
    </div>
    <!-- END -->

Step 4: Starting YaCy Grid Locally:…


Removing vulnerable dependencies from SUSPER

A vulnerability is a problem in a project's code that could be exploited to damage the confidentiality, integrity, or availability of the project or other projects that use its code. Depending on the severity level and the way your project uses the dependency, vulnerabilities can cause a range of problems for your project or the people who use it. GitHub tracks public vulnerabilities in Ruby gems and NPM packages on MITRE's Common Vulnerabilities and Exposures (CVE) List.

What were the vulnerabilities in SUSPER?

SUSPER had vulnerabilities in Gemfile.lock. Gemfile.lock makes our application a single package of both our own code and the third-party code it ran the last time we knew for sure that everything worked. Specifying exact versions of the third-party code we depend on in the Gemfile would not provide the same guarantee, because gems usually declare a range of versions for their dependencies.

What were the vulnerable dependencies in Gemfile.lock?

Two dependencies, Nokogiri and Yajl-Ruby, had security vulnerabilities. Nokogiri is an HTML, XML, SAX, and Reader parser. Among Nokogiri's many features is the ability to search documents via XPath or CSS3 selectors. The Yajl-Ruby gem is a C binding to the YAJL JSON parsing and generation library. Older versions of both dependencies had security vulnerabilities. Security alerts for a vulnerable dependency in our repository include a severity level and a link to the affected file in our project. When available, the alerts also include a link to the CVE record and a suggested fix.

What was the suggested fix?

One way to fix this problem was to update the vulnerable dependencies to their latest versions. The versions of Nokogiri and Yajl-Ruby which were used in SUSPER were:

Nokogiri (~>1.5)
Yajl-Ruby (1.1.0)

What are the best ways to update dependencies without breaking the project?
The best way to update a dependency is to check where it is used in the project and which breaking changes were introduced in the dependency.

How were the vulnerable dependencies updated?

First we updated Bundler, the tool we use to update the gems in Gemfile.lock, from version 1.13.6 to 1.16.0. We then updated the Nokogiri dependency and its sub-dependencies using bundle update nokogiri, i.e.:

mini_portile2 (2.1.0) -> mini_portile2 (2.3.0)
nokogiri (1.6.8.1) -> nokogiri (1.8.2)

Then we checked the project for integrity, and the project was working well. We then tried to update Yajl-Ruby, but there was a problem in updating it. We later found that Yajl-Ruby was replaced by many other dependencies. We therefore updated the whole Gemfile.lock. The following two simple steps update Gemfile.lock:

    bundle update
    bundle install

We later checked whether the new dependencies break the current project, and we found that there were no breaking changes involved in the updated dependencies. Security alerts for vulnerable dependencies list the affected dependency and, in some cases, use machine learning to suggest a fix from the GitHub community. By default, we receive a weekly email summarizing security alerts for up to…


Setting up YaCy Grid locally

SUSPER is a search interface that uses the P2P search engine YaCy. Search results are served by the Solr server which is embedded into YaCy. The retrieval of search results is done using the YaCy search API. When a search request is made in one of the search templates, an HTTP request is made to YaCy and the response is returned as JSON. In this blog post I will show how to set up YaCy Grid locally.

What is YaCy Grid?

The YaCy Grid is the second-generation implementation of YaCy, a peer-to-peer search engine. The required storage functions of the YaCy Grid are:

An asset storage, basically a file sharing environment for YaCy components. An ftp server is used for asset storage.
A message system providing an Enterprise Integration Framework using a message-oriented middleware. RabbitMQ message queues are used for the message system.
A database system providing search-engine related retrieval functions. Elasticsearch is used for database operations.

How to set up YaCy Grid locally?

YaCy Grid has 4 components: MCP (Master Connect Program), Loader, Crawler and Parser. Clone all the components using the --recursive flag.

    git clone --recursive https://github.com/yacy/yacy_grid_mcp.git
    git clone --recursive https://github.com/yacy/yacy_grid_parser.git
    git clone --recursive https://github.com/yacy/yacy_grid_crawler.git
    git clone --recursive https://github.com/yacy/yacy_grid_loader.git

Starting YaCy Grid requires starting Elasticsearch, RabbitMQ with username `anonymous` and password `yacy`, and an ftp server (it can be omitted, as MCP can take over).
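Before starting the components by hand, it can help to see which of these services are already listening. The following is a minimal Python 3 sketch under stated assumptions: the names check_port_open and missing_services are hypothetical, and the ports (9200 for Elasticsearch, 15672 for RabbitMQ, 2121 for the ftp server) are the ones probed by the run_all.py script discussed in this post, so adjust them if your setup differs.

```python
import socket

# Ports probed by run_all.py in this post; change them if your services use others.
SERVICES = {
    'Elasticsearch': 9200,
    'RabbitMQ': 15672,
    'ftp server': 2121,
}


def check_port_open(port, host='127.0.0.1', timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


def missing_services(is_open=check_port_open):
    """List the names of the services whose port is not reachable."""
    return [name for name, port in SERVICES.items() if not is_open(port)]


if __name__ == '__main__':
    missing = missing_services()
    if missing:
        print('Not running: ' + ', '.join(missing))
    else:
        print('All required services are reachable')
```

Passing the port check in as a parameter keeps missing_services easy to try out without any of the services actually running.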
All the above steps can also be done in a single step by running the Python script `run_all.py` in the `bin` folder.

Working of `run_all.py` in yacy_grid_mcp:

    if not checkportopen(9200):
        print "Elasticsearch is not running"
        mkapps()
        elasticversion = 'elasticsearch-5.6.5'
        if not os.path.isfile(path_apphome + '/data/mcp-8100/apps/' + elasticversion + '.tar.gz'):
            print('Downloading ' + elasticversion)
            urllib.urlretrieve('https://artifacts.elastic.co/downloads/elasticsearch/' + elasticversion + '.tar.gz', path_apphome + '/data/mcp-8100/apps/' + elasticversion + '.tar.gz')
        if not os.path.isdir(path_apphome + '/data/mcp-8100/apps/elasticsearch'):
            print('Decompressing ' + elasticversion)
            os.system('tar xfz ' + path_apphome + '/data/mcp-8100/apps/' + elasticversion + '.tar.gz -C ' + path_apphome + '/data/mcp-8100/apps/')
            os.rename(path_apphome + '/data/mcp-8100/apps/' + elasticversion, path_apphome + '/data/mcp-8100/apps/elasticsearch')
        # run elasticsearch
        print('Running Elasticsearch')
        os.chdir(path_apphome + '/data/mcp-8100/apps/elasticsearch/bin')
        os.system('nohup ./elasticsearch &')

This checks whether Elasticsearch is running; if not, it downloads Elasticsearch if necessary and starts it.

    if checkportopen(15672):
        print "RabbitMQ is Running"
        print "If you have configured it according to YaCy setup press N"
        print "If you have not configured it according to YaCy setup or Do not know what to do press Y"
        n = raw_input()
        if (n == 'Y' or n == 'y'):
            os.system('service rabbitmq-server stop')

    if not checkportopen(15672):
        print "rabbitmq is not running"
        os.system('python bin/start_rabbitmq.py')

This checks whether RabbitMQ is running. If it is, the user is asked whether it is configured according to the YaCy Grid setup: pressing Y stops it so that it can be restarted with the required configuration, anything else leaves it as it is. If it is not running, RabbitMQ is started with the required configuration.

    subprocess.call('bin/update_all.sh')

This updates all the Grid components, including MCP.
    if not checkportopen(2121):
        print "ftp server is not Running"

This checks for an ftp server and prints a message accordingly.

    # os.system('cd ...') only changes directory inside a short-lived child shell,
    # so the target directory is passed to subprocess.call via cwd instead.
    def run_mcp():
        subprocess.call(['gnome-terminal', '-e', "gradle run"])

    def run_loader():
        subprocess.call(['gnome-terminal', '-e', "gradle run"], cwd='../yacy_grid_loader')

    def run_crawler():
        subprocess.call(['gnome-terminal', '-e', "gradle run"], cwd='../yacy_grid_crawler')

    def run_parser():
        subprocess.call(['gnome-terminal', '-e', "gradle run"], cwd='../yacy_grid_parser')

These functions run all components of YaCy Grid in separate terminals. Once user starts it,…
