Editing files and piped data with “sed” in Loklak server

What is sed ?

“sed” is used in “Loklak Server” one of the most popular projects of FOSSASIA. “sed” is acronym for “Stream Editor” used for filtering and transforming text, as mentioned in the manual page of sed. Stream can be a file or input from a pipeline or even standard input. Regular expressions are used to filter the text and transformation are carried out using sed commands, either inline or from a file. So, most of the time writing a single line does the work of text substitution, removal or to obtaining a value from a text file.

Basic Syntax of “sed”

$sed [options]... {inline commands or file having sed commands} [input_file]...

Loklak Server uses a config.properties file – contains key-value pairs – which is the basis of the server as it contains configuration values, used by the server during runtime for various operations.

Let’s go through a simple sed example that prints line containing word “https” at the beginning in the config.properties file.

$sed -n '/^https/p' config.properties

Here “-n” option suppresses automatically printing of pattern space (pattern space is where each line is put that is to be processed by sed). Without “-n” option sed will print the whole file.

Now, the regular expression part,  “/^https” matches all the lines that has “https” at the start of line and “/p” is print command to print the output in console. Finally we provide the filename i.e. config.properties. If filename is not provided then sed waits for input from standard input.

Use cases of “sed” in Loklak Server

  • Displaying proper port number in message while starting or installing Loklak Server

The default port of loklak server is port number 9000, but it can be started in any non-occupied port by using “-p” flag with bin/start.sh and bin/installation.sh like

$ bin/installation.sh -p 8888

starts installation of Loklak Server in port 8888. To display the proper localhost address so that user can open it in a browser the port number in shortlink.urlstub parameter in config.properties needs to be changed. This is carried out by the function change_shortlink_urlstub in bin/utility.sh. The function is defined as

Now let’s try to understand what the sed command is doing.

“-i” option is used for in-place editing of the specified file i.e. config.properties in conf directory.

s” is substitute command of sed. The regular expression can be divided into two parts, between “/”:

  1. \(shortlink\.urlstub=http:.*:\)\(.*\) this is used to find the match in a line.
  2. \1′”$1″ is used to substitute the matched string in part 1.

The regular expressions can be split into groups so that operations can be performed on each group separately. A group is enclosed between “\(“ and “\)”. In our 1st part of regular expressions, there are two groups.

Dissecting the first group i.e. \(shortlink\.urlstub=http:.*:\):

  • shortlink\.urlstub=http:” will match the expression “shortlink.urlstub=http:”, here “\” is used as an escape sequence as “.” in regex represents any character.
  • “.*:”, “.” represents any character and “*” represents 0 or more characters of the previous character. So, it will basically match any expression. But, it ends with a “:”, which means any expression that ends with a “:”. Thus it matches the expression “//localhost:”.

So, the first group collectively matches the expression “shortlink.urlstub=http://localhost:”.

As described above second group i.e. \(.*\) will match any expression, here matches “9000”.

Now coming to the 2nd part of regular expression i.e. \1′”$1″:

  • “\1” represents the match of the first group in 1st part i.e. “shortlink.urlstub=http://localhost:”.
  • “$1” is the value of the first parameter provided to our function “change_shortlink_urlstub” which is an unused port where we want to run Loklak Server.

So 2nd part picks up the match from the first group and concatenates with the first parameter of the function. Assuming the first parameter to the function is “8888”, the expression for 2nd part becomes “shortlink.urlstub=http://localhost:8888” which replaces “shortlink.urlstub=http://localhost:9000”.

So a correct localhost address is displayed in the console while starting or installing Loklak Server.

  • Extracting value of a key from config.properties

“grep” and “sed” are used to extract the values of key from config.properties in bash scripts e.g. extracting default port, value of “port.http” in bin/start.sh, value of “shortlink.urlstub” in bin/start.sh and bin/installation.sh.

$ grep -iw 'port.http' conf/config.properties | sed 's/^[^=]*=//'

Here grep is used to filter a line and pass the filtered line to sed by piping the output. “i” flag is for ignoring case sensitivity and “w” flag is used for matching of word only. The output of

$ grep -iw 'port.http' conf/config.properties
port.http=9000

The aim is to get “9000”, the value of “port.http”. The approach used here is to substitute “port.http=” in the output of grep command above with a string of zero characters, that way only “9000” remains.

Let’s deconstruct the “sed” part, s/^[^=]*=//:

s” command of sed is used for substitution.  Here 1st part is “^[^=]*=” and 2nd part is nothing, as no characters are enclosed within “//”.

  • “^” means to match at the start.
  • [^characters] represents not to consider a set of characters. For matching a set of characters “^” is not used inside square brackets, [characters]. Here [^=] means not to include equal to – “=” symbol – and “*” after that makes it to match characters that are not “=”.

So “^[^=]*=” matches a sequence of characters that doesn’t start with “=” and followed by “=”. Thus the matched expression in this case is “http.port=” (“h” is not “=” and it ends with “=”), which is substituted by a string with zero characters which leaves “9000”.  Finally, “9000” is assigned to the required variable is bash script.

Continue ReadingEditing files and piped data with “sed” in Loklak server