Uploading Badges To Google Cloud Badgeyay

Badgeyay is an open-source project developed by the FOSSASIA community. It aims at generating badges for technical conferences and events. The project is divided into two main parts: the backend, developed in Flask, and the frontend, developed in Ember.js. The problem is that after badge generation, the Flask server stores and serves those files. In practice this is not a good convention; serving files should be handled by a dedicated server program like Gunicorn or Nginx. A better approach is to use Firebase Storage and the Admin SDK to store the badges on Google Cloud. This offloads storage needs from the Flask server and also provides a public link for accessing the badges over the network.

Procedure

  1. Get the file path of the temporary badge generated on the Flask server. Currently, badges are saved in the directory of the uploaded image file, and the final badge is written to `all-badges.pdf`.
badgePath = os.getcwd() + '/static/temporary/' + badgeFolder
badgePath + '/all-badges.pdf'

 

  2. Create the blob path for the storage. A blob can be understood as the final reference to the location where the contents are saved on the server. This can be a nested directory structure or simply a filename in the root directory.
'badges/' + badge_created.id + '.pdf'

 

In our case, it is the ID of the generated badge, placed inside the badges directory.

  3. Write a function for uploading the file generated in temporary storage to Google Cloud Storage.
from firebase_admin import storage

def fileUploader(file_path, blob_path):
    # Reference to the default storage bucket (configured at app initialization)
    bucket = storage.bucket()
    fileUploaderBlob = bucket.blob(blob_path)
    try:
        # Stream the local file into the blob on Google Cloud Storage
        with open(file_path, 'rb') as file_:
            fileUploaderBlob.upload_from_file(file_)
    except Exception as e:
        print(e)
    # Expose the blob publicly and return its public URL
    fileUploaderBlob.make_public()
    return fileUploaderBlob.public_url

 

It gets a reference to the storage bucket using the Firebase Admin SDK and then opens the file from the given file path. After opening the file, it writes the data to cloud storage. Once the data is written, the blob is made public and its public access link is fetched, which is then returned and saved in the local database.
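Putting it together, a minimal usage sketch (the service-account key file and bucket name are assumptions for illustration; badgeFolder and badge_created come from the surrounding badgeyay code):

import os

import firebase_admin
from firebase_admin import credentials

# Hypothetical key file and bucket name, for illustration only
cred = credentials.Certificate('firebase-admin-key.json')
firebase_admin.initialize_app(cred, {'storageBucket': 'badgeyay.appspot.com'})

badgePath = os.getcwd() + '/static/temporary/' + badgeFolder
publicURL = fileUploader(badgePath + '/all-badges.pdf',
                         'badges/' + badge_created.id + '.pdf')
# publicURL is then saved in the local database alongside the badge record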

Topics Involved

  • Firebase Admin SDK for storage
  • Google Cloud Storage SDK

Resources

  • Firebase Admin SDK documentation – Link
  • Google Cloud Storage SDK for Python – Link
  • Blob management – Link

 


Deploying a Postgres-Based Open Event Server to Kubernetes

In this post, I will walk you through deploying the Open Event Server on Kubernetes, hosted on Google Cloud Platform’s Compute Engine. You’ll need a Google account for this, so create one if you don’t have one.

First, I cd into the root of our project’s Git repository. Now I need to create a Dockerfile. I will use Docker to package our project into a nice image which can then be “pushed” to Google Cloud Platform. A Dockerfile is essentially a text document which contains the commands required to assemble an image. For more details on how to write one for your project specifically, check out the Docker docs. For Open Event Server, the Dockerfile looks like the following:

FROM python:3-slim

ENV INSTALL_PATH /open_event
RUN mkdir -p $INSTALL_PATH

WORKDIR $INSTALL_PATH

# update package lists and install some base packages
RUN apt-get update && apt-get install -y wget git ca-certificates curl && update-ca-certificates && apt-get clean -y


# install deps
RUN apt-get install -y --no-install-recommends build-essential python-dev libpq-dev libevent-dev libmagic-dev && apt-get clean -y

# copy just requirements
COPY requirements.txt requirements.txt
COPY requirements requirements

# install requirements
RUN pip install --no-cache-dir -r requirements.txt 
RUN pip install eventlet

# copy remaining files
COPY . .

CMD bash scripts/docker_run.sh

These commands simply install the dependencies and set up the environment for our project. The final CMD command is for running our project, which, in our case, is a server.

After our Dockerfile is configured, I go to Google Cloud Platform’s console and create a new project.


Once I enter the project name and other details, I enable billing in order to use Google’s cloud resources. A credit card is required to set up a billing account, but Google doesn’t charge any money for that. Also, one of the perks of being a part of FOSSASIA was that I had about $3000 in Google Cloud credits! Once billing is enabled, I enable the Container Engine API, which is required to support Kubernetes on Google Compute Engine. The next step is to install the Google Cloud SDK. Once that is done, I run the following command to install the Kubernetes CLI tool:

gcloud components install kubectl

Then I configure the Google Cloud Project Zone via the following command:

gcloud config set compute/zone us-west1-a

Now I will create a disk (for storing our code and data) as well as a temporary instance for formatting that disk:

gcloud compute disks create pg-data-disk --size 1GB
gcloud compute instances create pg-disk-formatter
gcloud compute instances attach-disk pg-disk-formatter --disk pg-data-disk

Once the disk is attached to our instance, I SSH into it:

gcloud compute ssh "pg-disk-formatter"

Now, ls the available disks:

ls /dev/disk/by-id

This will list multiple disks, but the one I want to format is “google-persistent-disk-1”.


Now I format that disk via the following command:

sudo mkfs.ext4 -F -E lazy_itable_init=0,lazy_journal_init=0,discard /dev/disk/by-id/google-persistent-disk-1

Finally, after the formatting is done, I exit the SSH session and detach the disk from the instance:

gcloud compute instances detach-disk pg-disk-formatter --disk pg-data-disk

Now I create a Kubernetes cluster and get its credentials (for later use) via gcloud CLI:

gcloud container clusters create opev-cluster
gcloud container clusters get-credentials opev-cluster

I can additionally bind our deployment to a domain name! If you already have a domain name, you can use the IP reserved below as an A record of your domain’s DNS Zone. Otherwise, get a free domain at freenom.com and do the same with that domain. This static external IP address is reserved in the following manner:

gcloud compute addresses create testip --region us-west1

It’ll be useful to note down this IP for future reference. I now add this IP (and our domain name, if I used one) to the Kubernetes configuration files (for Open Event Server, these were specific files for services like nginx and PostgreSQL):

#
# Deployment of the API Server and celery worker
#
kind: Deployment
apiVersion: apps/v1beta1
metadata:
  name: api-server
  namespace: web
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: api-server
    spec:
      volumes:
        - name: data-store
          emptyDir: {}
      containers:
        - name: api-server
          image: eventyay/nextgen-api-server:latest
          command: ["/bin/sh", "-c"]
          args: ["./kubernetes/run.sh"]
          livenessProbe:
            httpGet:
              path: /health-check/
              port: 8080
            initialDelaySeconds: 30
            timeoutSeconds: 3
          ports:
            - containerPort: 8080
              protocol: TCP
          envFrom:
            - configMapRef:
                name: api-server
          env:
            - name: C_FORCE_ROOT
              value: 'true'
            - name: PYTHONUNBUFFERED
              value: 'TRUE'
            - name: DEPLOYMENT
              value: 'api'
            - name: FORCE_SSL
              value: 'yes'
          volumeMounts:
            - mountPath: /opev/open_event/static/uploads/
              name: data-store
        - name: celery
          image: eventyay/nextgen-api-server:latest
          command: ["/bin/sh", "-c"]
          args: ["./kubernetes/run.sh"]
          envFrom:
            - configMapRef:
                name: api-server
          env:
            - name: C_FORCE_ROOT
              value: 'true'
            - name: DEPLOYMENT
              value: 'celery'
          volumeMounts:
            - mountPath: /opev/open_event/static/uploads/
              name: data-store
      restartPolicy: Always
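The reserved static IP goes into the Service definition of the externally facing component. A minimal sketch of what that part might look like (the actual nginx Service file in the repository may differ; the IP is a placeholder):

kind: Service
apiVersion: v1
metadata:
  name: nginx
  namespace: web
spec:
  type: LoadBalancer
  loadBalancerIP: <YOUR_RESERVED_IP>  # the static IP reserved earlier
  selector:
    app: nginx
  ports:
    - port: 80
      targetPort: 80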

These files will look similar for your specific project as well. Finally, I run the deployment script provided for Open Event Server to deploy with the defined configuration:

./kubernetes/deploy.sh create all

This is the specific deployment script for this project; for your app, this will be a bunch of simple kubectl create * commands specific to your app’s context. Once this script completes, our app is deployed! After a few minutes, I can see it live at the domain I provided above!
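For a project without such a script, the equivalent is a handful of plain kubectl commands over your configuration files, roughly along these lines (the paths are illustrative):

kubectl create namespace web
kubectl create -f ./kubernetes/yamls/postgres/
kubectl create -f ./kubernetes/yamls/nginx/
kubectl create -f ./kubernetes/yamls/api-server/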

I can also see our project’s statistics and set many other attributes at Google Cloud Platform’s console.


And that’s it for this post! For more details on deploying Docker-based containers on Kubernetes clusters hosted by Google, visit Google Cloud Platform’s documentation.


Continuous Integration and Deployment of Yacy Grid

We recently deployed Yacy Grid on Google Cloud, using Kubernetes and Travis CI for automatic deployment.

How we have deployed it:

Firstly, it is advised to have a separate container for each service your application requires, i.e. to follow a multi-container architecture. With a multi-container architecture you can allocate a fixed amount of resources to each application and also replicate individual services as required. Presently, Yacy has two main applications which need to be deployed in separate containers: yacy_grid_mcp and Elasticsearch.

We took the official Kubernetes YAML files for Elasticsearch and followed the instructions at https://github.com/kubernetes/examples/blob/master/staging/elasticsearch/README.md for deploying Elasticsearch on Google Cloud.

With this we are able to run the pods and volumes required for Elasticsearch, and the services for connecting Yacy with Elasticsearch.

The pull request regarding deployment of the separate Elasticsearch component is at https://github.com/yacy/yacy_grid_mcp/pull/27/files

This setup creates different services with external endpoints which the pods use for Elasticsearch.

Now Elasticsearch can be accessed at 35.202.154.219:9300 (transport) and http://35.193.124.253:9200/ (HTTP).
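A quick way to verify the HTTP endpoint is a plain curl request; Elasticsearch answers with its cluster metadata (the health API shown is standard Elasticsearch, nothing project-specific):

curl http://35.193.124.253:9200/
curl http://35.193.124.253:9200/_cluster/health?pretty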

Continuous deployment of Yacy_grid_mcp:

Please make sure that you have created a cluster on Google Container Engine to deploy our containers on. Regarding starting a project and a cluster, please read https://cloud.google.com/container-engine/docs/
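Creating the cluster itself is a single gcloud command (the name and size here are illustrative):

gcloud container clusters create yacy-cluster --num-nodes=3 --zone=us-central1-a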

1. Initially, .travis.yml sets up the required environment for the Yacy deployment by installing the Google Cloud CLI and kubectl components.

The source code for the Travis setup can be found at https://github.com/yacy/yacy_grid_mcp/blob/master/.travis.yml

2. Later, Travis runs the deploy_staging.sh file, which builds the Docker image of Yacy for the present build and pushes it to hub.docker.com:

if [ "$TRAVIS_PULL_REQUEST" != "false" -o "$TRAVIS_BRANCH" != "$SOURCE_BRANCH" ]; then
    echo "Skipping deploy; The request or commit is not on master"
    exit 0
fi

set -e

docker build -t nikhilrayaprolu/yacygridmcp:$TRAVIS_COMMIT ./docker
docker login -u="$DOCKER_USERNAME" -p="$DOCKER_PASSWORD"
docker tag nikhilrayaprolu/yacygridmcp:$TRAVIS_COMMIT nikhilrayaprolu/yacygridmcp:latest
docker push nikhilrayaprolu/yacygridmcp

Later, with a service key, we authenticate with Google Cloud and set the required environment and variables:

echo $GCLOUD_SERVICE | base64 --decode -i > ${HOME}/gcloud-service-key.json
gcloud auth activate-service-account --key-file ${HOME}/gcloud-service-key.json

gcloud --quiet config set project $PROJECT_NAME_STG
gcloud --quiet config set container/cluster $CLUSTER_NAME_STG
gcloud --quiet config set compute/zone ${CLOUDSDK_COMPUTE_ZONE}
gcloud --quiet container clusters get-credentials $CLUSTER_NAME_STG

And later we update the deployment on Google Cloud to use the newly built image:

kubectl config view
kubectl config current-context

kubectl set image deployment/${KUBE_DEPLOYMENT_NAME} ${KUBE_DEPLOYMENT_CONTAINER_NAME}=nikhilrayaprolu/yacygridmcp:$TRAVIS_COMMIT

Presently, Yacy runs on 5 vCPUs, with a set of pods and services for its components.

One can also use the kubectl CLI to get information about the cluster and its pods.
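A few standard commands for this (nothing project-specific assumed):

kubectl get nodes
kubectl get pods
kubectl get services
kubectl get deployments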

The pull request regarding the deployment of Yacy on Google Cloud is available at: https://github.com/yacy/yacy_grid_mcp/pull/16/files

References:

1. A Medium blog on CD to Google Container Engine: https://medium.com/google-cloud/continuous-delivery-in-a-microservice-infrastructure-with-google-container-engine-docker-and-fb9772e81da7

2. Another blog on CD to Google Container Engine: https://engineering.hexacta.com/automatic-deployment-of-multiple-docker-containers-to-google-container-engine-using-travis-e5d9e191d5ad

3. Deploying Elasticsearch to the cloud using Kubernetes: https://github.com/kubernetes/examples/blob/master/staging/elasticsearch/README.md


How to Get Secure Webhook for SUSI Bots in Kubernetes Deployment

A webhook is a user-defined callback which gets triggered by an event in code; receiving a message from a user is one such event in a SUSI bot. Some bots need a webhook URI for callbacks, e.g. in the SUSI Viber bot we need to define a webhook URI in the code to receive callbacks and make our Viber bot work. In this blog, we will learn how to get an SSL-enabled webhook when deploying a bot to Google Container Engine with Kubernetes. We will generate the SSL certificate using the kube-lego service, which runs inside the Kubernetes cluster and is defined in the YAML files below. We could also obtain an SSL certificate through a third-party service like CloudFlare, but that would make us dependent on CloudFlare, so we will use kube-lego.

We will start off by registering a domain on which we will activate the SSL certificate, and use that domain as the webhook. Go to Freenom and register your account. After logging in, register a free domain of any name and check out that order. Next, you have to set the IP for the DNS of this domain. To do so, we will reserve an IP address in our Google Cloud project with this command:

gcloud compute addresses create IPname --region us-central1

You will get a “created” message. To see your IP, go to VPC Network -> External IP addresses. Add this IP to the DNS zone of your domain and save it for later use in the YAML files that we will use for deployment. Now we will deploy our bot using YAML files, but before deployment, we will create a cluster:

gcloud container clusters create clusterName

After creating the cluster, add these YAML files to your bot repository and set the IP address you saved above as the “loadBalancerIP” parameter in the yamls/nginx/service.yaml file. Replace the domain name in yamls/application/ingress-notls.yaml and yamls/application/ingress-tls.yaml with the domain you registered, as sketched below. Add your email ID to yamls/lego/configmap.yaml as the “lego.email” parameter. Replace the “image” and “env” parameters in yamls/application/deployment.yaml with your Docker image and the environment variables used by your code. After changing the YAML files, we will use the deploy script to create a deployment; change the paths to the YAML files in the script according to where yours live.
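For orientation, the TLS ingress would look roughly like this (a sketch based on the standard kube-lego setup of that time; the names, paths, and domain are placeholders, and the actual files in the repository may differ):

# yamls/application/ingress-tls.yaml (illustrative fragment)
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: bot-ingress
  annotations:
    kubernetes.io/tls-acme: "true"   # tells kube-lego to obtain a certificate
spec:
  tls:
    - hosts:
        - yourbotdomain.tk           # the domain registered on Freenom
      secretName: bot-tls
  rules:
    - host: yourbotdomain.tk
      http:
        paths:
          - path: /
            backend:
              serviceName: bot-service
              servicePort: 80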

In gcloud shell run the following command to deploy an application using given configurations.

bash ./path-to-deploy-script/deploy.sh create all

This will create the deployment as we have defined in the script. The Kubernetes master creates the load balancer and related Compute Engine forwarding rules, target pools, and firewall rules to make the service fully accessible from outside of Google Cloud Platform. Wait for a few minutes for all the containers to be created and the SSL Certificates to be generated and loaded.
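Progress can be watched with standard kubectl commands until everything is ready:

kubectl get pods --all-namespaces   # wait until all pods are Running
kubectl get ingress                 # shows the host and its external address
kubectl get services                # the nginx service gets the reserved IP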

You have successfully created a secure webhook. Test it by opening the domain that you have registered at the start.

Resources

  • Enabling SSL using CloudFlare: https://jonnyjordan.com/blog/how-to-setup-cloudflare-flexible-ssl-for-wordpress/
  • https://www.youtube.com/watch?v=qFvwEVkl5gk


Persistently Storing loklak Server Dumps on Kubernetes

In an earlier blog post, I discussed loklak setup on Kubernetes. The deployment mentioned in the post was to test the development branch. Next, we needed to have a deployment where all the messages are collected and dumped in text files that can be reused.

In this blog post, I will be discussing the challenges with such deployment and the approach to tackle them.

Volatile Disk in Kubernetes

The pods that hold deployments in Kubernetes have their own disk storage. Any data written by the application stays there only as long as the same version of the deployment is running. As soon as the deployment is updated or relocated, the data stored by the application is cleaned up.


Due to this, dumps are written while loklak is running, but they get wiped out when the deployment image is updated. In other words, all dumps are lost whenever the image updates. We needed to find a solution to this, as we required permanent storage when collecting dumps.

Persistent Disk

In order to have storage which can hold data permanently, we can mount persistent disk(s) on a pod at the appropriate location. This ensures that the data that is important to us stays with us, even when the deployment goes down.


In order to add persistent disks, we first need to create a persistent disk. On Google Cloud Platform, we can use the gcloud CLI to create disks in a given region –

gcloud compute disks create --size=<required size> --zone=<same as cluster zone> <unique disk name>
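For instance, the 100 GB disk referenced by the configurations below could be created like this (the zone is assumed to match the cluster's):

gcloud compute disks create --size=100GB --zone=us-central1-a data-dump-disk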

After this, we can mount it on a Docker volume defined in Kubernetes configurations –

      ...
      volumeMounts:
        - mountPath: /path/to/mount
          name: volume-name
  volumes:
    - name: volume-name
      gcePersistentDisk:
        pdName: disk-name
        fsType: fileSystemType

But this setup can’t be used for storing loklak dumps. Let’s see “why” in the next section.

Rolling Updates and Persistent Disk

The Kubernetes deployment needs to be updated when the master branch of loklak server is updated. This update of master deployment would create a new pod and try to start loklak server on it. During all this, the older deployment would also be running and serving the requests.


The control will not be transferred to the newer pod until it is ready and all the probes are passing. The newer deployment will now try to mount the disk which is mentioned in the configuration, but it would fail to do so. This would happen because the older pod has already mounted the disk.


Therefore, all new deployments would simply fail to start due to insufficient resources. To overcome such issues, Kubernetes allows persistent volume claims. Let’s see how we used them for loklak deployment.

Persistent Volume Claims

Kubernetes provides Persistent Volume Claims which claim resources (storage) from a Persistent Volume (just like a pod does from a node). The higher level APIs are provided by Kubernetes (configurations and kubectl command line). In the loklak deployment, the persistent volume is a Google Compute Engine disk –

apiVersion: v1
kind: PersistentVolume
metadata:
  name: dump
  namespace: web
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: slow
  gcePersistentDisk:
    pdName: "data-dump-disk"
    fsType: "ext4"

[SOURCE]

It must be noted here that a persistent disk by the name of data-dump-disk has already been created in the same region.


The storage class defines the way in which the PV should be handled, along with the provisioner for the service –

kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: slow
  namespace: web
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-standard
  zone: us-central1-a

[SOURCE]

After having the StorageClass and PersistentVolume, we can create a claim for the volume by using appropriate configurations –

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: dump
  namespace: web
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: slow

[SOURCE]

After this, we can mount this claim on our Deployment –

  ...
  volumeMounts:
    - name: dump
      mountPath: /loklak_server/data
volumes:
  - name: dump
    persistentVolumeClaim:
      claimName: dump

[SOURCE]

Verifying Persistence of Dumps

To verify this, we can redeploy the cluster using the same persistent disk and check if the earlier dumps are still present there –

$ http http://link.to.deployment/dump/
HTTP/1.1 200 OK
Cache-Control: public, max-age=60
Content-Type: text/html;charset=utf-8
...


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">

<h1>Index of /dump</h1>
<pre>      Name 
[gz ] <a href="messages_20170802_71562040.txt.gz">messages_20170802_71562040.txt.gz</a>   Thu Aug 03 00:07:21 GMT 2017   132M
[gz ] <a href="messages_20170803_69925009.txt.gz">messages_20170803_69925009.txt.gz</a>   Mon Aug 07 15:40:04 GMT 2017   532M
[gz ] <a href="messages_20170807_36357603.txt.gz">messages_20170807_36357603.txt.gz</a>   Wed Aug 09 10:26:24 GMT 2017   377M
[txt] <a href="messages_20170809_27974404.txt">messages_20170809_27974404.txt</a>      Thu Aug 10 08:51:49 GMT 2017  1564M
<hr></pre>
...

Conclusion

In this blog post, I discussed the process of deploying loklak with persistent dumps on Kubernetes. This deployment is intended to work as root.loklak.org in the near future. The changes were proposed in loklak/loklak_server#1377 by @singhpratyush (me).


Deploying Yacy with Docker on Different Cloud Platforms

To make deploying Yacy easier, we now support a Docker-based installation.

By following the steps below, one can run Yacy on Docker.

  1. You can pull the image of Yacy from https://hub.docker.com/r/nikhilrayaprolu/yacygridmcp/ or build it on your own with the Dockerfile present at https://github.com/yacy/yacy_grid_mcp/blob/master/docker/Dockerfile

You can pull the Docker image using the command:

docker pull nikhilrayaprolu/yacygridmcp

 

  2. Once you have an image of yacygridmcp, you can run it by typing

docker run <image_name>

 

You can access the yacygridmcp endpoint at localhost:8100.
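Note that for the endpoint to be reachable on localhost, the container's port normally has to be published when starting it; a typical invocation would be (the port is taken from the endpoint above):

docker run -d -p 8100:8100 nikhilrayaprolu/yacygridmcp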

Installation of Yacy on cloud servers:

Installing Yacy and all microservices with just one command:

  • One can also download, build, and run Yacy and all its microservices (presently supported are yacy_grid_crawler, yacy_grid_loader, yacy_grid_ui, yacy_grid_parser, and yacy_grid_mcp).
  • To build all these microservices with one command, use the bash script productiondeployment.sh:
    • `bash productiondeployment.sh build` installs all required dependencies and builds the microservices by cloning them from their GitHub repositories.
    • `bash productiondeployment.sh run` starts all the services.
    • Right now all repositories are cloned into ~/yacy, so you can make your own changes to this code and build your own customised Yacy.

The related PRs for this work are https://github.com/yacy/yacy_grid_mcp/pull/21, https://github.com/yacy/yacy_grid_mcp/pull/20 and https://github.com/yacy/yacy_grid_mcp/pull/13
