aws – blog.fossasia.org

Uploading Files via APIs in the Open Event Server

Post author:magdalenesuo
Post published:August 10, 2017
Post category:API Event Event Management FOSSASIA GSoC Open Event
Post comments:0 Comments

There are two file upload endpoints. One is endpoint for image upload and the other is for all other files being uploaded. The latter endpoint is to be used for uploading files such as slides, videos and other presentation materials for a session. So, in FOSSASIA’s Orga Server project, when we need to upload a file, we make an API request to this endpoint which is turn uploads the file to the server and returns back the url for the uploaded file. We then store this url for the uploaded file to the database with the corresponding row entry.

Sending Data

The endpoint /upload/file accepts a POST request, containing a multipart/form-data payload. If there is a single file that is uploaded, then it is uploaded under the key “file” else an array of file is sent under the key “files”.

A typical single file upload cURL request would look like this:

curl -H “Authorization: JWT <key>” -F file=@file.pdf -x POST http://localhost:5000/v1/upload/file

A typical multi-file upload cURL request would look something like this:

curl -H “Authorization: JWT <key>” -F files=@file1.pdf -F files=@file2.pdf -x POST http://localhost:5000/v1/upload/file

Thus, unlike other endpoints in open event orga server project, we don’t send a json encoded request. Instead it is a form data request.

Saving Files

We use different services such as S3, google cloud storage and so on for storing the files depending on the admin settings as decided by the admin of the project. One can even ask to save the files locally by passing a GET parameter force_local=true. So, in the backend we have 2 cases to tackle- Single File Upload and Multiple Files Upload.

Single File Upload

if 'file' in request.files:
        files = request.files['file']
        file_uploaded = uploaded_file(files=files)
        if force_local == 'true':
            files_url = upload_local(
                file_uploaded,
                UPLOAD_PATHS['temp']['event'].format(uuid=uuid.uuid4())
            )
        else:
            files_url = upload(
                file_uploaded,
                UPLOAD_PATHS['temp']['event'].format(uuid=uuid.uuid4())
            )

We get the file, that is to be uploaded using request.files[‘file’] with the key as ‘file’ which was used in the payload. Then we use the uploaded_file() helper function to convert the file data received as payload into a proper file and store it in a temporary storage. After this, if force_local is set as true, we use the upload_local helper function to upload it to the local storage, i.e. the server where the application is hosted, else we use whatever service is set by the admin in the admin settings.

In uploaded_file() function of helpers module, we extract the filename and the extension of the file from the form-data payload. Then we check if the suitable directory already exists. If it doesn’t exist, we create a new directory and then save the file in the directory

extension = files.filename.split('.')[1]
        filename = get_file_name() + '.' + extension
        filedir = current_app.config.get('BASE_DIR') + '/static/uploads/'
        if not os.path.isdir(filedir):
            os.makedirs(filedir)
        file_path = filedir + filename
        files.save(file_path)

After that the upload function gets the settings key for either s3 or google storage and then uses the corresponding functions to upload this temporary file to the storage.

Multiple File Upload

 elif 'files[]' in request.files:
        files = request.files.getlist('files[]')
        files_uploaded = uploaded_file(files=files, multiple=True)
        files_url = []
        for file_uploaded in files_uploaded:
            if force_local == 'true':
                files_url.append(upload_local(
                    file_uploaded,
                    UPLOAD_PATHS['temp']['event'].format(uuid=uuid.uuid4())
                ))
            else:
                files_url.append(upload(
                    file_uploaded,
                    UPLOAD_PATHS['temp']['event'].format(uuid=uuid.uuid4())
                ))

In case of multiple files upload, we get a list of files instead of a single file. Hence we get the list of files sent as form data using request.files.getlist(‘files[]’). Here ‘files’ is the key that is used and since it is an array of file content, hence it is written as files[]. We again use the uploaded_file() function to get back a list of temporary files from the content that has been uploaded as form-data. After that we loop over all the temporary files that are stored in the variable files_uploaded in the above code. Next, for every file in the list of temporary files, we use the upload() helper function to save these files in the storage system of the application.

In the uploaded_file() function of the helpers module, since this time there are multiple files and their content sent, so things work differently. We loop over all the files that are received and for each of these files we find their filename and extension. Then we create directories to save these files in and then save the content of the file with the corresponding filename and extension. After the file has been saved, we append it to a list and finally return the entire list so that we can get a list of all files.

if multiple:
        files_uploaded = []
        for file in files:
            extension = file.filename.split('.')[1]
            filename = get_file_name() + '.' + extension
            filedir = current_app.config.get('BASE_DIR') + '/static/uploads/'
            if not os.path.isdir(filedir):
                os.makedirs(filedir)
            file_path = filedir + filename
            file.save(file_path)
            files_uploaded.append(UploadedFile(file_path, filename))

The upload() function then finally returns us the urls for the files after saving them.

API Response

The file upload endpoint either returns a single url or a list of urls depending on whether a single file was uploaded or multiple files were uploaded. The url for the file depends on the storage system that has been used. After the url or list of urls is received, we jsonify the entire response so that we can send a proper JSON response that can be parsed properly in the frontend and used for saving corresponding information to the database using the other API services.

A typical single file upload response looks like this:

{
     "url": "https://xyz.storage.com/asd/fgh/hjk/12332233.docx"
 }

Multiple file upload response looks like this:

{
     "url": [
         "https://xyz.storage.com/asd/fgh/hjk/12332233.docx",
         "https://xyz.storage.com/asd/fgh/hjk/66777777.ppt"
     ]
 }

You can find the related documentations and example payloads on how to use this endpoint to upload files here: http://open-event-api.herokuapp.com/#upload-file-upload.

Reference:

More about the uploaded_file() helper function: https://github.com/fossasia/open-event-orga-server/blob/nextgen/app/api/helpers/files.py
Source code for all storage related functions used in open event organizer server: https://github.com/fossasia/open-event-orga-server/blob/nextgen/app/api/helpers/storage.py
Check how to send multipart/form-data as curl request: https://ec.haxx.se/http-multipart.html

Deploying Yacy with Docker on Different Cloud Platforms

Post author:nikhilrayaprolu
Post published:August 7, 2017
Post category:FOSSASIA
Post comments:0 Comments

To make deploying of yacy easier we are now supporting Docker based installation.

Following the steps below one could successfully run Yacy on docker.

You can pull the image of Yacy from https://hub.docker.com/r/nikhilrayaprolu/yacygridmcp/ or buid it on your own with the docker file present at https://github.com/yacy/yacy_grid_mcp/blob/master/docker/Dockerfile

One could pull the docker image using command:

docker pull nikhilrayaprolu/yacygridmcp

2) Once you have an image of yacygridmcp you can run it by typing

docker run <image_name>

You can access the yacygridmcp endpoint at localhost:8100

Installation of Yacy on cloud servers:

Right now installation yacy on cloud servers is documented at https://github.com/nikhilrayaprolu/yacy_grid_mcp/tree/documentation/docs/installation
We have documentation provided for hosting yacy on Google Cloud, AWS, Bluemix and digital Ocean and Heroku.

Installing Yacy and all microservices with just one command:

One can also download,build and run Yacy and all its microservices (presently supported are yacy_grid_crawler, yacy_grid_loader, yacy_grid_ui, yacy_grid_parser, and yacy_grid_mcp )
To build all these microservices in one command, run this bash script productiondeployment.sh
- `bash productiondeployment.sh build` will install all required dependencies and build microservices by cloning them from github repositories.
- `bash productiondeployment.sh run` will run all services and starts them.
- Right now all repositories are cloned into ~/yacy and you can make customisations and your own changes to this code and build your own customised yacy.

Resources:

Docker documentation: https://docs.docker.com/
Deployment to Google Cloud: https://engineering.hexacta.com/automatic-deployment-of-multiple-docker-containers-to-google-container-engine-using-travis-e5d9e191d5ad
Writing bash script http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO.html