Wikidata for everyone as a local service

HDT Query Service

Introduction

In a previous blog post we talked about Wikidata, one of the largest existing knowledge graphs. Today we are glad to announce that we are making Wikidata more accessible to the whole community and to everyone interested in open linked data, from researchers to engineers.

Contribution

As many of you know, Wikidata provides a public query service that receives millions of queries every day. To avoid overloading the public service with numerous requests, one can download the dataset and load it into the triple store they provide. However, indexing the data alone may take up to 12 days before one can start running queries! And you would probably need an enormous machine with 200GB of memory.

What we are offering today is a Docker image that you just have to pull and start on a small machine (16GB of memory could suffice). It downloads a compressed version of Wikidata (~65GB), and you can run SPARQL queries as soon as the download is finished. This service is built on top of HDT, a compact data structure and binary serialization format for RDF that keeps big datasets compressed to save space while still supporting search and browse operations without prior decompression.
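To give an idea of what querying the local service could look like once the download is finished, here is a minimal Python sketch. It assumes the container exposes an endpoint that follows the standard SPARQL 1.1 Protocol. The URL in `ENDPOINT` is a hypothetical placeholder, not the real address of the service, and the helper names (`build_request`, `run_query`, `bindings_to_rows`) are ours, chosen for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: hypothetical address. Replace it with the URL your local
# container actually exposes.
ENDPOINT = "http://localhost:8080/sparql"

# Example query: five items that are an instance of (P31) house cat (Q146).
QUERY = """
SELECT ?item WHERE {
  ?item <http://www.wikidata.org/prop/direct/P31> <http://www.wikidata.org/entity/Q146> .
} LIMIT 5
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a SPARQL 1.1 Protocol POST request with a URL-encoded query."""
    data = urllib.parse.urlencode({"query": query}).encode("utf-8")
    headers = {
        "Accept": "application/sparql-results+json",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    return urllib.request.Request(endpoint, data=data, headers=headers, method="POST")


def run_query(query, endpoint=ENDPOINT):
    """Send the query and return the parsed SPARQL JSON results."""
    with urllib.request.urlopen(build_request(query, endpoint), timeout=60) as resp:
        return json.load(resp)


def bindings_to_rows(results):
    """Flatten SPARQL JSON results into a list of {variable: value} dicts."""
    return [
        {var: cell["value"] for var, cell in binding.items()}
        for binding in results["results"]["bindings"]
    ]
```

With the container running, `bindings_to_rows(run_query(QUERY))` would return one dictionary per result row. Only the standard library is used here, so nothing extra needs to be installed on the small machine.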

This project is fully maintained by The QA Company.

If you run into any problems or issues, we would be more than happy to hear about them and try to fix them (contact-us).

Conclusion

We have used Wikidata for a long time, and we feel that it is time to give back. So here we are, providing its community with a nice alternative to the public query service.

Thanks, see you again!
