ArchEE Among the Top 3 Startups in Lower Saxony

When exploring news archives, a key requirement of historians is to get an overview of their search results initially. To address this problem we developed a novel retrieval model – HistDiv – which ranks articles according to historical relevance. The Archive Exploration Engine (ArchEE) system was built to showcase how HistDiv and various other state-of-the-art retrieval models coupled with time-lines and entity filters can help users better explore large news archives.

ArchEE has also been selected as one of the top 3 startups in Lower Saxony for the 2016 Going Global competition organized by Hannover Impuls.

A demo of ArchEE can be found here: http://bit.ly/archive-search

ALEXANDRIA Internet Archive Search Prototype

We are delighted to announce the first public release of our ALEXANDRIA Internet Archive Search Prototype:

http://alexandria-project.eu/archivesearch/

ArchiveSearch provides, for the first time, entity-based search and exploration of the Internet Archive's Web archive, allowing you to use (most of) the 1.9 million concepts of the German Wikipedia or (most of) the 5 million concepts of the English Wikipedia as search terms.

For these search terms, the current version provides the most important results from the Internet Archive, ranking resources based on Bing search, with more sophisticated re-ranking in future releases, as well as related entity suggestions for most of the queries.

Successful 2nd International Alexandria Workshop

The second Alexandria Workshop took place at the L3S Research Center on 2-3 November 2015. The workshop aimed to bring together communities involved in web archiving, digital preservation, digital humanities and information retrieval to encourage a closer dialogue between researchers from computer science, digital humanities and cultural heritage institutions. It was well attended, with participants ranging from national libraries and the humanities to computer scientists from disciplines such as information retrieval, natural language processing, database systems and distributed systems. The workshop, spanning two days, included two keynotes, several research talks, system demonstrations and a panel discussion on shortcomings, research infrastructures, and future directions.

2nd International Alexandria Workshop

Foundations for Temporal Retrieval, Exploration and Analytics in Web Archives

2-3 November 2015

L3S Research Center, Hannover, Germany

Significant parts of our cultural heritage are produced on the Web, yet only insufficient opportunities exist for accessing and exploring the past of the Web. While the easy accessibility to the current Web is a good baseline, optimal access to Web archives requires new models and algorithms for retrieval, exploration, and analytics which go far beyond what is needed to access the current state of the Web. This includes taking into account the unique temporal dimension of Web archives, structured semantic information already available on the Web, as well as social media and network information.

The workshop aims at bringing together communities involved in Web Archiving, Digital Preservation, Digital Humanities and Information Retrieval to encourage a closer dialogue between researchers from computer science, digital humanities and cultural heritage institutions.

Click here for more detailed information, agenda, venue, etc.

Alexandria @ WWW 2015

The ALEXANDRIA project team participated in the 24th International Conference on World Wide Web (WWW ’15) in Florence, Italy, in May 2015. We contributed one full paper to the main conference and two full papers to the workshops.

Contribution to the main conference:

Markus Rokicki, Sergej Zerr, Stefan Siersdorfer. “Groupsourcing: Team Competition Designs for Crowdsourcing”

Many data processing tasks such as semantic annotation of images, translation of texts in foreign languages, and labeling of training data for machine learning models require human input, and, on a large scale, can only be accurately solved using crowd based online work. Recent work shows that frameworks where crowd workers compete against each other can drastically reduce crowdsourcing costs, and outperform conventional reward schemes where the payment of online workers is proportional to the number of accomplished tasks (“pay-per-task”). In this paper, we investigate how team mechanisms can be leveraged to further improve the cost efficiency of crowdsourcing competitions. To this end, we introduce strategies for team based crowdsourcing, ranging from team formation processes where workers are randomly assigned to competing teams, over strategies involving self-organization where workers actively participate in team building, to combinations of team and individual competitions. Our large-scale experimental evaluation with more than 1,100 participants and overall 5,400 hours of work spent by crowd workers demonstrates that our team based crowdsourcing mechanisms are well accepted by online workers and lead to substantial performance boosts.

Markus Rokicki during his presentation
