Removing vulnerable dependencies from SUSPER

A vulnerability is a problem in a project’s code that could be exploited to damage the confidentiality, integrity, or availability of the project or other projects that use its code. Depending on the severity level and the way your project uses the dependency, vulnerabilities can cause a range of problems for your project or the people who use it.GitHub tracks public vulnerabilities in Ruby gems and NPM packages on MITRE’s Common Vulnerabilities and Exposures (CVE) List.

What were  vulnerabilities in SUSPER ?

SUSPER was having vulnerability in Gemfile.lock, Gemfile.lock makes our application a single package of both your own code and the third-party code it ran the last time you know for sure that everything worked. Specifying exact versions of the third-party code you depend on in your Gemfile would not provide the same guarantee, because gems usually declare a range of versions for their dependencies.

What were vulnerable dependencies in Gemfile.lock ?

Two dependency namely Nokogiri and Yajl-Ruby were having security vulnerability.

Nokogiri is an HTML, XML, SAX, and Reader parser. Among Nokogiri’s many features is the ability to search documents via XPath or CSS3 selectors whereas

Yajl-Ruby gem is a C binding to the excellent YAJL JSON parsing and generation library. Older versions of both the dependencies were having security vulnerability.

Security alerts for a vulnerable dependency in our repository include a severity level and a link to the affected file in our project. When available, the alerts also include a link to the CVE record and a suggested fix.

What was the suggested fix ?

One way to fix this problem was to update the vulnerable dependencies to latest versions.

The versions of Nokogiri and Yajl-Ruby which were used in SUSPER are:

Nokogiri (~>1.5)

Yajl-Ruby (1.1.0)

What are the best ways to update dependencies without breaking

the project ?

The best way to update a dependency is to check where those dependencies are used in project and what are breaking changes which are introduced within the dependencies.

How vulnerable dependencies were updated ?

Firstly we updated the Bundler the tool we use to update our gems in Gemfile.lock,from version 1.13.6 to 1.16.0.

We then updated Nokogiri dependency and other sub dependencies using  bundle update nokogiri i.e:

mini_portile2 (2.1.0) -> mini_portile2 (2.3.0)

nokogiri (1.6.8.1) ->nokogiri (1.8.2)

Then we checked the project for integrity , and the project was working well.

We then tried to update Yajl-Ruby, but there was a problem in updating Yajl-Ruby,

We later found that Yajl-Ruby was replaced by many other dependencies.

We therefore updated whole Gemfile.lock . Following are two simple steps to update Gemfile.lock

bundle update

bundle install

 

We later checked that whether the new dependencies do not break the current project and we found that there were no breaking changes involved in updated dependencies.

Security alerts for vulnerable dependencies list the affected dependency and, in some cases, use machine learning to suggest a fix from the GitHub community. By default, we receive a weekly email summarizing security alerts for up to 10 of our repositories. We can choose to receive security alerts individually by email, in a daily digest email, in our web notifications, or in the GitHub user interface.

Resources:

Setting up YaCy Grid locally

SUSPER is a search interface that uses P2P search engine YaCy . Search results are displayed using Solr server which is embedded into YaCy. The retrieval of search results is done using YaCy search API. When a search request is made in one of the search templates, an HTTP request is made to YaCy and the response is done in JSON. In this blog post I will show how to setup YaCy Grid locally.

What is YaCy Grid ?

The YaCy Grid is the second-generation implementation of YaCy, a peer-to-peer search engine. The required storage functions of the YaCy Grid are:

  1.  An asset storage, basically a file sharing environment for YaCy components,an ftp server is used for asset storage.
  2.  A message system providing an Enterprise Integration Framework using a message-oriented middleware,RabbitMQ message queues for the message system.
  3.  A database system providing search-engine related retrieval functions.It uses Elasticsearch for database operations.

How to setup YaCy Grid locally ?

YaCy Grid have 4 components MCP(Master Connect Program), Loader, Crawler and  Parser.

  1. Clone all the components using –recursive flag.

git clone --recursive https://github.com/yacy/yacy_grid_mcp.git
git clone --recursive https://github.com/yacy/yacy_grid_parser.git
git clone --recursive https://github.com/yacy/yacy_grid_crawler.git
git clone --recursive https://github.com/yacy/yacy_grid_loader.git
  1.  Now to starting YaCy Grid requires starting Elasticsearch, RabbitMQ with Username `anonymous` and Password `yacy` and an ftp server(it can be omitted as MCP can take over).
  2.  All the above steps can also be done in a single step by running a python script in `bin` folder `run_all.py`
  3.  Working of `run_all.py` in yacy_grid_mcp:

if not checkportopen(9200):
   print "Elasticsearch is not running"
   mkapps()
   elasticversion = 'elasticsearch-5.6.5'
   if not os.path.isfile(path_apphome + '/data/mcp-8100/apps/' + elasticversion + '.tar.gz'):
       print('Downloading ' + elasticversion)
       urllib.urlretrieve ('https://artifacts.elastic.co/downloads/elasticsearch/' + elasticversion + '.tar.gz', path_apphome + '/data/mcp-8100/apps/' + elasticversion + '.tar.gz')
   if not os.path.isdir(path_apphome + '/data/mcp-8100/apps/elasticsearch'):
       print('Decompressing' + elasticversion)
       os.system('tar xfz ' + path_apphome + '/data/mcp-8100/apps/' + elasticversion + '.tar.gz -C ' + path_apphome + '/data/mcp-8100/apps/')
       os.rename(path_apphome + '/data/mcp-8100/apps/' + elasticversion, path_apphome + '/data/mcp-8100/apps/elasticsearch')
   # run elasticsearch
   print('Running Elasticsearch')
   os.chdir(path_apphome + '/data/mcp-8100/apps/elasticsearch/bin')
   os.system('nohup ./elasticsearch &')

 

  • Checks whether Elasticsearch is running or not, if not then runs Elasticsearch.
if checkportopen(15672):
   print "RabbitMQ is Running"
   print "If you have configured it according to YaCy setup press N"
   print "If you have not configured it according to YaCy setup or Do not know what to do press Y"
   n=raw_input()
   if(n=='Y' or n=='y'):
       os.system('service rabbitmq-server stop')
       
if not checkportopen(15672):
   print "rabbitmq is not running"
   os.system('python bin/start_rabbitmq.py')
  • Checks whether RabbitMQ is running or not, if yes then asks user to configure it according to YaCy Grid setup by pressing Y or else ignore,if not then starts RabbitMQ according to required configuration.
subprocess.call('bin/update_all.sh')
  • .Updates all the Grid components including MCP.

if not checkportopen(2121):
   print "ftp server is not Running"
  • Checks for an ftp server and prints message accordingly.

def run_mcp():
   subprocess.call(['gnome-terminal', '-e', "gradle run"])

def run_loader():
   os.system('cd ../yacy_grid_loader')
   subprocess.call(['gnome-terminal', '-e', "gradle run"])

def run_crawler():
   os.system('cd ../yacy_grid_crawler')
   subprocess.call(['gnome-terminal', '-e', "gradle run"])

def run_parser():
   os.system('cd ../yacy_grid_parser')
   subprocess.call(['gnome-terminal', '-e', "gradle run"])

 

  • Runs all components of YaCy Grid in separate terminal.

Once user starts it, then he can start using YaCy Grid through terminal.

If a YaCy Grid service has used the MCP once, it learns from the MCP to connect to the infrastructure itself. For example:

  • a YaCy Grid service starts up and connects to the MCP
  • the Grid service pushes a message to the message queue using the MCP
  • the MCP fulfils the message send operation and response with the actual address of the message broker
  • the YaCy Grid service learns the direct connection information
  • whenever the YaCy Grid service wants to connect to the message broker again, it can do so using a direct broker connection. This process is done transparently, the Grid service does not need to handle such communication details itself. The routing is done automatically. To use the MCP inside other grid components the git submodule functionality is used.

Resources: