Creating A Dockerfile For Yacy Grid MCP

The YaCy Grid is the second-generation implementation of YaCy, a peer-to-peer search engine. A YaCy Grid installation consists of a set of micro-services which communicate with each other using a common infrastructure for data persistence. The task was to deploy the second-generation of YaCy Grid. To do so, we first had created a Dockerfile. This dockerfile should start the micro services such as rabbitmq, Apache ftp and elasticsearch in one docker instance along with MCP. The microservices perform following tasks:

  • Apache ftp server for asset storage.
  • RabbitMQ message queues for the message system.
  • Elasticsearch for database operations.

To launch these microservices using Dockerfile, we referred to following documentations regarding running these services locally: https://github.com/yacy/yacy_grid_mcp/blob/master/README.md

For creating a Dockerfile we proceeded as follows:

FROM ubuntu:latest
MAINTAINER Harshit Prasad# Update
RUN apt-get update
RUN apt-get upgrade -y# add packages
# install jdk package for java
RUN apt-get install -y git openjdk-8-jdk

#install gradle required for build
RUN apt-get update && apt-get install -y software-properties-common
RUN add-apt-repository ppa:cwchien/gradle
RUN apt-get update
RUN apt-get install -y wget
RUN wget https://services.gradle.org/distributions/gradle-3.4.1-bin.zip
RUN mkdir /opt/gradle
RUN apt-get install -y unzip
RUN unzip -d /opt/gradle gradle-3.4.1-bin.zip
RUN PATH=$PATH:/opt/gradle/gradle-3.4.1/bin
ENV GRADLE_HOME=/opt/gradle/gradle-3.4.1
ENV PATH=$PATH:$GRADLE_HOME/bin
RUN gradle -v

# install apache ftp server 1.1.0
RUN wget http://www-eu.apache.org/dist/mina/ftpserver/1.1.0/dist/apache-ftpserver-1.1.0.tar.gz
RUN tar xfz apache-ftpserver-1.1.0.tar.gz

# install RabbitMQ server
RUN wget https://www.rabbitmq.com/releases/rabbitmq-server/v3.6.6/rabbitmq-server-generic-unix-3.6.6.tar.xz
RUN tar xf rabbitmq-server-generic-unix-3.6.6.tar.xz

# install erlang language for RabbitMQ
RUN apt-get install -y erlang

# install elasticsearch
RUN wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-5.5.0.tar.gz
RUN sha1sum elasticsearch-5.5.0.tar.gz
RUN tar -xzf elasticsearch-5.5.0.tar.gz

# clone yacy_grid_mcp repository
RUN git clone https://github.com/nikhilrayaprolu/yacy_grid_mcp.git
WORKDIR /yacy_grid_mcp

RUN cat docker/configftp.properties > ../apacheftpserver1.1.0/res/conf/users.properties

# compile
RUN gradle build
RUN mkdir data/mcp-8100/conf/ -p
RUN cp docker/config-mcp.properties data/mcp-8100/conf/config.properties
RUN chmod +x ./docker/start.sh

# Expose web interface ports
# 2121: ftp, a FTP server to be used for mass data / file storage
# 5672: rabbitmq, a rabbitmq message queue server to be used for global messages, queues and stacks
# 9300: elastic, an elasticsearch server or main cluster address for global database storage
EXPOSE 2121 5672 9300 9200 15672 8100

# Define default command.
ENTRYPOINT [“/bin/bash”, “./docker/start.sh”]

 

We have created a start.sh file to start RabbitMQ and Apache FTP services. At the end, for compilation gradle run will be executed.

adduser –disabled-password –gecos ” r
adduser r sudo
echo ‘%sudo ALL=(ALL) NOPASSWD:ALL’ >> /etc/sudoers
chmod a+rwx /elasticsearch-5.5.0 -R
su -m r -c ‘/elasticsearch-5.5.0/bin/elasticsearch -Ecluster.name=yacygrid &’
cd /apacheftpserver1.1.0
./bin/ftpd.sh res/conf/ftpdtypical.xml &
/rabbitmq_server-3.6.6/sbin/rabbitmq-server -detached
sleep 5s;
/rabbitmq_server-3.6.6/sbin/rabbitmq-plugins enable rabbitmq_management
/rabbitmq_server3.6.6/sbin/rabbitmqctl add_user yacygrid password4account
echo [{rabbit, [{loopback_users, []}]}]. >> /rabbitmq_server-3.6.6/etc/rabbitmq/rabbitmq.config
/rabbitmq_server-3.6.6/sbin/rabbitmqctl set_permissions -p / yacygrid “.*” “.*” “.*”
cd /yacy_grid_mcp
sleep 5s;
gradle run

 

start.sh will first add username and then password. Then it will start RabbitMQ along with Apache FTP.  For username and password, we have created a separate files to configure their properties during Docker run which can be found here:

The logic behind running all the microservices in one docker instance was: creating each container for microservice and then link those containers with the help of docker-compose.yml file.

The Dockerfile which we have created was corresponding to one image. Another image was elasticsearch which was linked to this Dockerfile. The latest version of elasticsearch image was already available on their site: https://www.elastic.co/guide/en/elasticsearch/reference/current/docker.html

We configured the docker-compose.yml file according to the reference link provided above. The docker-compose file can be found here: https://github.com/yacy/yacy_grid_mcp/blob/master/docker/docker-compose.yml

The source code for the implementation of whole structure can be found here: https://github.com/yacy/yacy_grid_mcp/tree/master/docker

Resources

 

Continue Reading

Optimising Docker Images for loklak Server

The loklak server is in a process of moving to Kubernetes. In order to do so, we needed to have different Docker images that suit these deployments. In this blog post, I will be discussing the process through which I optimised the size of Docker image for these deployments.

Initial Image

The image that I started with used Ubuntu as base. It installed all the components needed and then modified the configurations as required –

FROM ubuntu:latest

# Env Vars
ENV LANG=en_US.UTF-8
ENV JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF8
ENV DEBIAN_FRONTEND noninteractive

WORKDIR /loklak_server

RUN apt-get update
RUN apt-get upgrade -y
RUN apt-get install -y git openjdk-8-jdk
RUN git clone https://github.com/loklak/loklak_server.git /loklak_server
RUN git checkout development
RUN ./gradlew build -x test -x checkstyleTest -x checkstyleMain -x jacocoTestReport
RUN sed -i.bak 's/^\(port.http=\).*/\180/' conf/config.properties
... # More configurations
RUN echo "while true; do sleep 10;done" >> bin/start.sh

# Start
CMD ["bin/start.sh", "-Idn"]

The size of images built using this Dockerfile was quite huge –

REPOSITORY          TAG                 IMAGE ID            CREATED              SIZE

loklak_server       latest              a92f506b360d        About a minute ago   1.114 GB

ubuntu              latest              ccc7a11d65b1        3 days ago           120.1 MB

But since this size is not acceptable, we needed to reduce it.

Moving to Apline

Alpine Linux is an extremely lightweight Linux distro, built mainly for the container environment. Its size is so tiny that it hardly puts any impact on the overall size of images. So, I replaced Ubuntu with Alpine –

FROM alpine:latest

...
RUN apk update
RUN apk add git openjdk8 bash
...

And now we had much smaller images –

REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE

loklak_server       latest              54b507ee9187        17 seconds ago      668.8 MB

alpine              latest              7328f6f8b418        6 weeks ago         3.966 MB

As we can see that due to no caching and small size of Alpine, the image size is reduced to almost half the original.

Reducing Content Size

There are many things in a project which are no longer needed while running the project, like the .git folder (which is huge in case of loklak) –

$ du -sh loklak_server/.git
236M loklak_server/.git

We can remove such files from the Docker image and save a lot of space –

rm -rf .[^.] .??*

Optimizing Number of Layers

The number of layers also affect the size of the image. More the number of layers, more will be the size of image. In the Dockerfile, we can club together the RUN commands for lower number of images.

RUN apk update && apk add openjdk8 git bash && \
  git clone https://github.com/loklak/loklak_server.git /loklak_server && \
  ...

After this, the effective size is again reduced by a major factor –

REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE

loklak_server       latest              54b507ee9187        17 seconds ago      422.3 MB

alpine              latest              7328f6f8b418        6 weeks ago         3.966 MB

Conclusion

In this blog post, I discussed the process of optimising the size of Dockerfile for Kubernetes deployments of loklak server. The size was reduced to 426 MB from 1.234 GB and this provided much faster push/pull time for Docker images, and therefore, faster updates for Kubernetes deployments.

Resources

Continue Reading
Close Menu