Geo Tagger Evaluation Framework

The Geo Tagger Evaluation Framework (geoTEF) provides an open framework for evaluating Geo-Taggers. The following graph shows the conceptual evaluation framework:

Currently, only proof-of-concept scoring based on HierarchyLocationReference and SetLocationReference is implemented. More complex implementations of the ILocationReference interface that support OntologyBasedScoring are under development.
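To make the idea of hierarchy-based and set-based scoring more concrete, here is a minimal sketch. It is not geoTEF's implementation: the names HierarchyRef, hierarchy_score and set_score, as well as the two scoring formulas (common-prefix ratio and Jaccard overlap), are assumptions chosen purely for illustration. Please consult the paper and the source code for the scoring actually used.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class HierarchyRef:
    """Hypothetical stand-in for a hierarchy-based location reference."""
    # Path from the root of the hierarchy, e.g. ("Europe", "Austria", "Vienna").
    path: Tuple[str, ...]


def hierarchy_score(tagged: HierarchyRef, gold: HierarchyRef) -> float:
    """Length of the common path prefix, normalised by the longer path.

    Assumed formula for illustration only, not taken from geoTEF.
    """
    if not gold.path:
        return 0.0
    shared = 0
    for tagged_node, gold_node in zip(tagged.path, gold.path):
        if tagged_node != gold_node:
            break
        shared += 1
    # Dividing by the longer path gives partial credit to answers that are
    # too coarse and reduces the score of answers that diverge or overshoot.
    return shared / max(len(tagged.path), len(gold.path))


def set_score(tagged: FrozenSet[str], gold: FrozenSet[str]) -> float:
    """Jaccard overlap between tagged and gold location sets (assumed formula)."""
    if not tagged and not gold:
        return 1.0
    return len(tagged & gold) / len(tagged | gold)


if __name__ == "__main__":
    vienna = HierarchyRef(("Europe", "Austria", "Vienna"))
    austria = HierarchyRef(("Europe", "Austria"))
    print(hierarchy_score(austria, vienna))
    print(set_score(frozenset({"Vienna", "Graz"}), frozenset({"Vienna"})))
```

Under these assumptions, a tagger that answers "Austria" for a document about Vienna receives partial credit (two of three hierarchy levels match) instead of being counted as plainly wrong, which is the intuition behind utility-based scoring.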

Software

This framework is used to evaluate geo-taggers using utility scoring. Please refer to the corresponding paper for a detailed description of the underlying concepts and ideas.

Download

The following tar file contains the geoTEF framework, the extensible Web Retrieval Toolkit, and the data and cache files required to run the experiments:

geoTEF-0.1.tar.bz2

Code Repository

The most recent code is available on GitHub:

git clone https://github.com/AlbertWeichselbraun/geoTEF.git

Installation Instructions

Dependencies:

Installation:

  • Download and unpack the software.
  • Adjust env.sh to reflect your installation settings and set the environment variables using

   source env.sh

  • Copy geoTEFconfig.py-sample.py to geoTEFconfig.py. If you plan to evaluate your own geo-taggers, you will have to set up the gazetteer database and adjust the database settings in the configuration file accordingly.
  • Run ./evaluation.py to start the evaluation.

Database Setup

  • Download the gazetteer database dump from here.
  • Load it into your database using

   bzcat geoTEF_gazetteer_20090207.sql.bz2 | psql dbname

Evaluation Data Sets

We currently use the Reuters corpus to perform the evaluations. For legal reasons, we cannot publish this dataset on the project page. Instead, the framework's /data directory contains comma-separated text files with the tagging results that are used for the evaluation.

The gazetteer used by the evaluation framework uses the following database schema:

A compressed PostgreSQL dump of the gazetteer is available here.

Remarks

The framework implements caching using the extensible Web Retrieval Toolkit.

Bibliography
