In the Susper knowledge graph, we faced two problems with the Wikipedia API: multi-word queries returned no results, and we could not decide how much of the retrieved information to show. For example, the query donald trump returns no information (https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro=&explaintext=&titles=donald%20trump). On the other hand, a query such as india (https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts&exintro=&explaintext=&titles=india) returns a lot of information. Earlier, we used the Angular slice pipe to display only the first 600 characters in the knowledge graph (Infobox), but for almost all queries this cut the last sentence off before it was complete. In this blog, I will describe how we solved both problems.
Getting results for multi-word queries from the Wikipedia API:
After a lot of searching, we found that Wikipedia does have results for multi-word queries, but each word in the query must start with a capital letter. For example, we do not get results for a query such as donald trump, but we do get results when we query for Donald Trump.
However, since users mostly type search queries in lowercase, we need to convert the query so that each word starts with a capital letter. Since the Knowledge Graph is implemented using the ngrx pattern, this logic was easy to add in the knowledge effects. We select each word in the query with a regular expression and then use JavaScript's toUpperCase() and toLowerCase() methods to capitalize it.
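toTitleCase(str) {
  // Capitalize the first letter of every word and lowercase the rest.
  return str.replace(/\w\S*/g, function(txt) {
    return txt.charAt(0).toUpperCase() + txt.substr(1).toLowerCase();
  });
}

// In the knowledge effect, title-case the query before requesting results.
this.knowledgeservice.getSearchResults(this.toTitleCase(querypay.query))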
With this change, we now get results for almost all queries.
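The capitalization step can also be tried outside Angular. The following is a minimal TypeScript sketch, not the exact Susper code: it reuses the same regular expression, swaps the deprecated substr() for slice(), and adds a hypothetical helper, buildExtractUrl, that assembles the same Wikipedia request URL shown earlier in this post.

```typescript
// Capitalize the first letter of every word and lowercase the rest.
// Same regular expression as in the post; slice(1) replaces substr(1).
function toTitleCase(query: string): string {
  return query.replace(/\w\S*/g, (word) =>
    word.charAt(0).toUpperCase() + word.slice(1).toLowerCase()
  );
}

// Hypothetical helper (not part of Susper): builds the extracts request URL
// with the query parameters used in the examples in this post.
function buildExtractUrl(query: string): string {
  const base = "https://en.wikipedia.org/w/api.php";
  const params = "format=json&action=query&prop=extracts&exintro=&explaintext=";
  return `${base}?${params}&titles=${encodeURIComponent(toTitleCase(query))}`;
}

console.log(toTitleCase("donald trump")); // prints "Donald Trump"
console.log(buildExtractUrl("donald trump"));
```

Note that this approach capitalizes every word, so a title that contains lowercase words (for example "of" or "the") may still not match. That is consistent with getting results for almost all queries rather than all of them.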
Limiting information in Knowledge Graph without terminating the sentences:
To solve this issue, we decided to show only the first four sentences of the data retrieved from the Wikipedia API.
For this we implemented a getPosition function which takes three parameters: a string, a substring and an index. Here string is the whole text, and the function returns the position in it of the occurrence of the substring given by index (for example, index 4 gives the position of the fourth occurrence).
getPosition(string, subString, index) {
  return string.split(subString, index).join(subString).length;
}

this.description = this.description.slice(0, this.getPosition(this.description, '.', 4) + 1);
Here, we get the index of the 4th full stop ('.') from the getPosition() function and pass it to the JavaScript slice method to slice the string up to and including that position.
Using this, we limited the results to four sentences without cutting any sentence off in the middle.
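To see how the truncation behaves on concrete input, here is a small standalone TypeScript sketch of the same idea. The wrapper name limitSentences and the sample text are assumptions made for illustration; they are not part of the Susper code.

```typescript
// Index of the n-th occurrence of `subString` in `text`.
// If there are fewer than n occurrences, this returns text.length.
function getPosition(text: string, subString: string, n: number): number {
  return text.split(subString, n).join(subString).length;
}

// Hypothetical wrapper: keep everything up to and including the
// maxSentences-th full stop.
function limitSentences(text: string, maxSentences: number = 4): string {
  return text.slice(0, getPosition(text, ".", maxSentences) + 1);
}

const sample = "One. Two. Three. Four. Five. Six.";
console.log(limitSentences(sample));        // prints "One. Two. Three. Four."
console.log(limitSentences("Short text.")); // prints "Short text."
```

Because the split is on every full stop, periods inside abbreviations or decimal numbers (such as "U.S." or "3.5") are counted as sentence ends too, so some extracts may come out shorter than four real sentences.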
References
1. W3Schools Regular Expression: https://www.w3schools.com/jsref/jsref_obj_regexp.asp
2. JavaScript slice method: https://www.w3schools.com/jsref/jsref_slice_array.asp
3. Wikipedia API: https://www.mediawiki.org/wiki/API:Main_page