Until now, static files stored locally on Open Event server did not have port specification in their URLs. This opened the door for problems while consuming local APIs. This would have created inconsistencies, if two server processes were being served on the same machine but at different ports. In this blog post, I will explain my approach towards solving this problem, and describe code snippets to demonstrate the changes I made in the Open Event Server codebase.
The first part in this process involved finding the source of the bug. For this, my open-source integrated development environment, Microsoft Visual Studio Code turned out to be especially useful. It allowed me to jump from function calls to function definitions quickly:
I started at events.py and jumped all the way to storage.py, where I finally found out the source of this bug, in upload_local() function:
def upload_local(uploaded_file, key, **kwargs): """ Uploads file locally. Base dir - static/media/ """ filename = secure_filename(uploaded_file.filename) file_relative_path = 'static/media/' + key + '/' + generate_hash(key) + '/' + filename file_path = app.config['BASE_DIR'] + '/' + file_relative_path dir_path = file_path.rsplit('/', 1)[0] # delete current try: rmtree(dir_path) except OSError: pass # create dirs if not os.path.isdir(dir_path): os.makedirs(dir_path) uploaded_file.save(file_path) file_relative_path = '/' + file_relative_path if get_settings()['static_domain']: return get_settings()['static_domain'] + \ file_relative_path.replace('/static', '') url = urlparse(request.url) return url.scheme + '://' + url.hostname + file_relative_path
Look closely at the return statement:
return url.scheme + '://' + url.hostname + file_relative_path
Bingo! This is the source of our bug. A straightforward solution is to simply concatenate the port number in between, but that will make this one-liner look clumsy – unreadable and un-pythonic. We therefore use Python string formatting:
return '{scheme}://{hostname}:{port}{file_relative_path}'.format( scheme=url.scheme, hostname=url.hostname, port=url.port, file_relative_path=file_relative_path)
But this statement isn’t perfect. There’s an edge case that might give unexpected URL. If the port isn’t originally specified, Python’s string formatting heuristic will substitute url.port with None. This will result in a URL like http://localhost:None/some/file_path.jpg, which is obviously something we don’t desire. We therefore append a call to Python’s string replace() method: replace(‘:None’, ”)
The resulting return statement now looks like the following:
return '{scheme}://{hostname}:{port}{file_relative_path}'.format( scheme=url.scheme, hostname=url.hostname, port=url.port, file_relative_path=file_relative_path).replace(':None', '')
This should fix the problem. But that’s not enough. We need to ensure that our project adapts well with the change we made. We check this by running the project tests locally:
$ nosetests tests/unittests
Unfortunately, the tests fail with the following traceback:
====================================================================== ERROR: test_create_save_image_sizes (tests.unittests.api.helpers.test_files.TestFilesHelperValidation) ---------------------------------------------------------------------- Traceback (most recent call last): File "/open-event-server/tests/unittests/api/helpers/test_files.py", line 138, in test_create_save_image_sizes resized_width_large, _ = self.getsizes(resized_image_file_large) File "/open-event-server/tests/unittests/api/helpers/test_files.py", line 22, in getsizes im = Image.open(file) File "/usr/local/lib/python3.6/site-packages/PIL/Image.py", line 2312, in open fp = builtins.open(filename, "rb") FileNotFoundError: [Errno 2] No such file or directory: '/open-event-server:5000/static/media/events/53b8f572-5408-40bf-af97-6e9b3922631d/large/UFNNeW5FRF/5980ede1-d79b-4907-bbd5-17511eee5903.jpg'
It’s evident from this traceback that the code in our test framework is not converting the image url to file path correctly. The port specification part is working fine, but it should not affect file names, they should be independent of port number. The files saved originally do not have port specified in their name, but the code in test framework is expecting port to be involved, hence the above error.
Using the traceback, I went to the code in the test framework where this problem occurred:
def test_create_save_image_sizes(self): with app.test_request_context(): image_url_test = 'https://cdn.pixabay.com/photo/2014/09/08/17/08/hot-air-balloons-439331_960_720.jpg' image_sizes_type = "event" width_large = 1300 width_thumbnail = 500 width_icon = 75 image_sizes = create_save_image_sizes(image_url_test, image_sizes_type) resized_image_url = image_sizes['original_image_url'] resized_image_url_large = image_sizes['large_image_url'] resized_image_url_thumbnail = image_sizes['thumbnail_image_url'] resized_image_url_icon = image_sizes['icon_image_url'] resized_image_file = app.config.get('BASE_DIR') + resized_image_url.split('/localhost')[1] resized_image_file_large = app.config.get('BASE_DIR') + resized_image_url_large.split('/localhost')[1] resized_image_file_thumbnail = app.config.get('BASE_DIR') + resized_image_url_thumbnail.split('/localhost')[1] resized_image_file_icon = app.config.get('BASE_DIR') + resized_image_url_icon.split('/localhost')[1] resized_width_large, _ = self.getsizes(resized_image_file_large) resized_width_thumbnail, _ = self.getsizes(resized_image_file_thumbnail) resized_width_icon, _ = self.getsizes(resized_image_file_icon) self.assertTrue(os.path.exists(resized_image_file)) self.assertEqual(resized_width_large, width_large) self.assertEqual(resized_width_thumbnail, width_thumbnail) self.assertEqual(resized_width_icon, width_icon)
Obviously, resized_image_url.split(‘/localhost’)[1] will involve port number. So we have to change this line. But this means we also have to change the subsequent lines involving thumbnail, icon and large images. Instead of stripping the port for each of these, we can simply do this collectively at an earlier stage. So we redefine the image_sizes dictionary after the create_save_image_sizes() function call:
image_sizes = { url_name: urlparse(image_sizes[url_name]).path for url_name in image_sizes } # Now file names don't contain port (this gives relative urls).
Now we can simplify the lines each of which earlier required port-stripping code:
resized_image_file = app.config.get('BASE_DIR') + resized_image_url resized_image_file_large = app.config.get('BASE_DIR') + resized_image_url_large resized_image_file_thumbnail = app.config.get('BASE_DIR') + resized_image_url_thumbnail resized_image_file_icon = app.config.get('BASE_DIR') + resized_image_url_icon
We now do a similar modification in test_create_save_resized_image() test method as it also involves URL to file path conversion. We break the line
resized_image_file = app.config.get('BASE_DIR') + resized_image_url.split('/localhost')[1]
to 2 lines:
resized_image_path = urlparse(resized_image_url).path resized_image_file = app.config.get('BASE_DIR') + resized_image_path
Now let’s run the tests (which failed earlier) again:
Finally, the tests pass without errors! Now, we can add some extra convenience functionality: we can also strip the port when it corresponds with the protocol we’re using. For example, if we’re using https protocol, then we need not specify the port if it is 443, as 443 corresponds to that protocol. We can add this functionality by creating a mapping of such correspondence and checking for it before generating the URL. To do this, we now go back to storage.py and add the following:
SCHEMES = {80: 'http', 443: 'https'}
And add
# No need to specify scheme-corresponding port port = url.port if port and url.scheme == SCHEMES.get(url.port, None): port = None
just before
return '{scheme}://{hostname}:{port}{file_relative_path}'.format( scheme=url.scheme, hostname=url.hostname, port=port, file_relative_path=file_relative_path).replace(':None', '')
And that finishes our work! The tests again pass successfully, plus on top of that we have this new functionality of mentioning ports only when they don’t correspond with the URL scheme!
Resources: