My first two weeks of the Google Summer of Code

Sahan Dilshan
4 min read · Jun 26, 2021

In this post, I discuss my first two weeks of GSoC’21 at DBpedia. I’m working on the Lifecycle Management of the DBpedia Neural QA Models project, where my task is to implement a framework that manages the lifecycle of DBpedia’s neural QA models. Before the coding period officially began, I had my first meeting with my mentors, Edgard Marx and Lahiru Oshara. We introduced ourselves, discussed the project, and created a road map for completing it from beginning to end. The road map for the first month is as follows:

  1. Port command-line interfaces to JSON output

First of all, let me explain what KBox is. KBox is short for Knowledge Box. It is a data management framework designed to facilitate data deployment, whether on the cloud, personal computers, or smart devices. You can read more about it in its GitHub repo and this research paper. We interact with the KBox library (.jar) from the terminal using command-line arguments such as -list, -install, -info, etc. For each argument, KBox prints the relevant information to the terminal. My task was to convert this printed information into a JSON message so that a user can parse the terminal output as a JSON object.

What did I do in my first week?

First of all, as every beginner does, I went through the KBox documentation and got familiar with it. Then I started examining the implementation of KBox. KBox is written in Java, and since I have a lot of experience with Java, I was able to understand its core implementation easily. Along the way, I identified some bugs in KBox. So at the next meeting with my mentors, I discussed these bugs and the parts of KBox that I did not fully understand. This is basically how I spent my first week of GSoC’21.

My second week of the GSoC’21

This is the week I actually started coding. After the last meeting with my mentors, I had a really good understanding of what I needed to implement and why KBox needs the JSON output feature. As my mentor Edgard Marx said, once we add this JSON serialization feature to KBox, we can use it directly with the airML project. The current version of airML has a major problem: the output of KBox cannot be parsed into any useful information. With JSON serialization, we can parse the output of KBox into a JSON object and extract the information we need from it.

Before the next meeting with my mentors, I was able to add the JSON serialization feature to KBox. To get the output of KBox as a JSON string, the user passes the additional -o json option along with the regular KBox command. As an example, take the following command:

java -jar KBox.jar -list

The above command lists all available knowledge bases of KBox as follows:

KBox KNS Resource table list
##############################
name,format,version
##############################
http://purl.org/pcp-on-web/dbpedia,kibe,c9a618a875c5d46add88de4f00b538962f9359ad
http://purl.org/pcp-on-web/ontology,kibe,c9a618a875c5d46add88de4f00b538962f9359ad
http://purl.org/pcp-on-web/dataset,kibe,dd240892384222f91255b0a94fd772c5d540f38b
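To see why this plain-text form is awkward for a consumer such as airML, here is a minimal Python sketch of what hand-parsing it might involve. The function name `parse_plain_list` is hypothetical, and the sketch assumes exactly the layout shown above (a title line, two `#` rulers, then comma-separated rows). It would break as soon as that layout changed, and every other command would need its own parser.

```python
# Sample text copied from the plain `-list` output shown above.
plain_output = """KBox KNS Resource table list
##############################
name,format,version
##############################
http://purl.org/pcp-on-web/dbpedia,kibe,c9a618a875c5d46add88de4f00b538962f9359ad
http://purl.org/pcp-on-web/ontology,kibe,c9a618a875c5d46add88de4f00b538962f9359ad
http://purl.org/pcp-on-web/dataset,kibe,dd240892384222f91255b0a94fd772c5d540f38b
"""


def parse_plain_list(text):
    """Hypothetical helper: turn the plain `-list` output into a list of dicts.

    Assumes the first comma-separated line is the header row.
    """
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, '#' rulers, and the title line (which has no comma).
        if not line or line.startswith("#") or "," not in line:
            continue
        fields = line.split(",")
        if header is None:
            header = fields
            continue
        rows.append(dict(zip(header, fields)))
    return rows


rows = parse_plain_list(plain_output)
print(len(rows), "knowledge bases found")
```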

Now if we pass the -o json option along with the -list command, the output looks like this:

java -jar KBox.jar -list -o json

{
"success": true,
"results": [
{
"name": "http://purl.org/pcp-on-web/dbpedia",
"format": "kibe",
"version": "c9a618a875c5d46add88de4f00b538962f9359ad"
},
{
"name": "http://purl.org/pcp-on-web/ontology",
"format": "kibe",
"version": "c9a618a875c5d46add88de4f00b538962f9359ad"
},
{
"name": "http://purl.org/pcp-on-web/dataset",
"format": "kibe",
"version": "dd240892384222f91255b0a94fd772c5d540f38b"
}
]
}
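With the JSON form, a consumer only needs a standard JSON parser. The following is a small Python sketch, not the airML implementation: the helper name `list_knowledge_bases` is hypothetical, and it assumes the output has the `success` and `results` keys shown in the example above.

```python
import json

# Sample text copied from the `-list -o json` output shown above.
sample_output = """
{
  "success": true,
  "results": [
    {"name": "http://purl.org/pcp-on-web/dbpedia", "format": "kibe",
     "version": "c9a618a875c5d46add88de4f00b538962f9359ad"},
    {"name": "http://purl.org/pcp-on-web/ontology", "format": "kibe",
     "version": "c9a618a875c5d46add88de4f00b538962f9359ad"},
    {"name": "http://purl.org/pcp-on-web/dataset", "format": "kibe",
     "version": "dd240892384222f91255b0a94fd772c5d540f38b"}
  ]
}
"""


def list_knowledge_bases(raw_output):
    """Hypothetical helper: return the entries under "results"."""
    payload = json.loads(raw_output)
    if not payload.get("success", False):
        raise RuntimeError("KBox reported a failure")
    return payload.get("results", [])


entries = list_knowledge_bases(sample_output)
names = [entry["name"] for entry in entries]
print(names)
```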

I was able to convert the output of the -list, -install, -remove, -info, -locate, -search, -r-dir, and -version commands into JSON, as requested by my mentors. The output of those commands is as follows:

java -jar kbox.jar -list -o json

{
"success": true,
"results": [
{
"name": "http://purl.org/pcp-on-web/dbpedia",
"format": "kibe",
"version": "c9a618a875c5d46add88de4f00b538962f9359ad"
},
{
"name": "http://purl.org/pcp-on-web/ontology",
"format": "kibe",
"version": "c9a618a875c5d46add88de4f00b538962f9359ad"
},
{
"name": "http://purl.org/pcp-on-web/dataset",
"format": "kibe",
"version": "dd240892384222f91255b0a94fd772c5d540f38b"
}
]
}

java -jar kbox.jar -install http://purl.org/pcp-on-web/ontology -o json

{"install": true}

java -jar kbox.jar -remove -kns https://github.com/AKSW/KBox/blob/master -o json

{"remove": false}

java -jar kbox.jar -info http://purl.org/pcp-on-web/ontology -o json

{
"success": true,
"info": {
"License:": "[\"http://en.wikipedia.org/wiki/Wikipedia:Text_of_the_GNU_Free_Documentation_License\",\"http://en.wikipedia.org/wiki/Wikipedia:Text_of_Creative_Commons_Attribution-ShareAlike_3.0_Unported_License\"]",
"version Tags:": ["latest"],
"Format:": "kibe",
"KN:": "http://purl.org/pcp-on-web/ontology",
"label:": "PCP-on-Web ontology",
"Owner:": "PCP-on-Web",
"KNS:": "https://raw.githubusercontent.com/AKSW/kbox/master/kns/2.0",
"Version:": "c9a618a875c5d46add88de4f00b538962f9359ad",
"Publisher:": "KBox team"
}
}

java -jar kbox.jar -locate https://schema.org/ -o json

{"path": "/home/sahan/.kbox/https/org/schema"}

java -jar kbox.jar -search ontology -o json

{
"success": true,
"results": [{
"name": "http://purl.org/pcp-on-web/ontology",
"format": "kibe",
"version": "c9a618a875c5d46add88de4f00b538962f9359ad"
}]
}

java -jar kbox.jar -r-dir -o json

{"path": "/home/sahan/.kbox"}

java -jar kbox.jar -version -o json

{"version": "v0.0.2-alpha1"}
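Since all of these commands take the same -o json option, a caller written in Python could wrap the invocation in one small helper. This is only a sketch under assumptions: `build_command` and `run_kbox` are hypothetical names, the jar is assumed to sit at `kbox.jar` in the working directory, and KBox is assumed to print nothing but the JSON document to standard output.

```python
import json
import subprocess


def build_command(*args, jar_path="kbox.jar"):
    """Hypothetical helper: build the argument list for a KBox call
    and append the `-o json` option."""
    return ["java", "-jar", jar_path, *args, "-o", "json"]


def run_kbox(*args, jar_path="kbox.jar"):
    """Hypothetical helper: run KBox and decode its stdout as JSON.

    Assumes Java is installed and that stdout holds only the JSON text.
    """
    completed = subprocess.run(
        build_command(*args, jar_path=jar_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(completed.stdout)


# Building the command does not need Java, so it is safe to show here.
command = build_command("-info", "http://purl.org/pcp-on-web/ontology")
print(" ".join(command))
```

With Java and the jar available, `run_kbox("-version")` would be expected to return a dictionary like the `-version` output shown above.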

After finishing the implementation of JSON serialization, we had our next meeting. There, we decided to use the same JSON message structure for every KBox command. As you can see above, the JSON output of each command has different key-value pairs. This is not a good design, since the user has to write separate logic for each command to retrieve the output from the JSON message. Hence, we decided to adopt the following JSON message format for every KBox command:

{
"status_code": HTTP_CODE,
"message": "information or error message",
"result": [expected results as a JSON array]
}
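The benefit of a single format is that one piece of client logic can handle every command. Here is a minimal Python sketch of such a handler. The function name `unpack_response`, the sample responses, the concrete status codes (200, 404), and the rule that any 2xx code means success are all assumptions made for illustration; only the three field names come from the format above.

```python
import json


def unpack_response(raw_output):
    """Hypothetical helper: return the "result" array of a unified response,
    raising an error when the status code is not in the 2xx range."""
    payload = json.loads(raw_output)
    status = payload["status_code"]
    if not 200 <= status < 300:
        message = payload.get("message", "")
        raise RuntimeError("KBox call failed (%s): %s" % (status, message))
    return payload["result"]


# Illustrative responses only; real codes and messages may differ.
ok_response = json.dumps({
    "status_code": 200,
    "message": "search completed",
    "result": [{"name": "http://purl.org/pcp-on-web/ontology", "format": "kibe"}],
})
error_response = json.dumps({
    "status_code": 404,
    "message": "knowledge base not found",
    "result": [],
})

results = unpack_response(ok_response)
print(results[0]["name"])

try:
    unpack_response(error_response)
except RuntimeError as exc:
    error_text = str(exc)
    print(error_text)
```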

This is what happened during my second week of GSoC’21. Next week, I have to make sure that every JSON message follows the same structure and integrate this JSON serialization feature with the airML project.
