Datasets

Temporal Wikipedia

We distinguish three main Wikipedia dataset derivatives. All three operate on the revision history of Wikipedia. This release consists of Wikipedia revisions until 16/11/2016. Due to the large size of the Wikipedia datasets, we provide only a small sample; however, on request to the ALEXANDRIA team, we can provide the complete dataset repository. …


Semantic Layers

What is a Semantic Layer? A Semantic Layer is an RDF repository (RDF graph) of structured data about a collection of archived documents. Structured data includes not only metadata information about a document (like publication date), but also entity annotations, i.e., disambiguated entities mentioned in each document extracted using an entity linking system. The following figure …


TweetsKB

TweetsKB is an RDF corpus of anonymized data for a large collection of annotated tweets. The dataset currently contains data for more than 1.5 billion tweets, spanning almost 5 years (January 2013 – November 2017). Metadata information about the tweets as well as extracted entities, sentiments, hashtags and user mentions are exposed in RDF using …


EventKG

EventKG is a novel multilingual resource incorporating event-centric information extracted from several large-scale knowledge graphs such as Wikidata, DBpedia and YAGO, as well as less structured sources such as the Wikipedia Current Events Portal and Wikipedia event lists in five languages. EventKG is an extensible event-centric resource modeled in RDF. It relies on Open …
