The Geo Tagger Evaluation Framework (geoTEF) provides an open framework for evaluating Geo-Taggers. The following graph shows the conceptual evaluation framework:
Currently only a proof-of-concept scoring based on HierarchyLocationReference and SetLocationReference are implemented; More complex implementations of the ILocationReference interface supporting OntologyBasedScoring are under development.
This framework is used to evalute geo-taggers using utility scoring. Please refer to the corresponding paper for a detailed description of the underlying concepts and ideas.
A tar file containing the geoTEF framework, the extensible Web Retrieval Toolkit, data files, and cache files required to run the experiments.
The most recent code is available at github.
git clone https://github.com/AlbertWeichselbraun/geoTEF.git
- Python2.4 or higher
- The extensible Web Retrieval Toolkit (eWRT) is required to run this program (only necessary if you use the most current version from subversion).
- download & unpack the software
- adjust env.sh to reflect your installation settings and set the environment variables using
geoTEFconfig.py. If you plan to evaluate your own geo-tagger’s you will have to set up the gazatteer database and adjust the database settings in the configuration file accordingly.
- ./evaluation.py starts the evaluation.
Database set up
- Download the gazetteer database dump from here.
- Dump it into your database using
bzcat geoTEF_gazetteer_20090207.sql.bz2 |psql dbname
Evaluation Data Sets
We currently use the Reuters corpus to perform
the evaluations. For legal reasons we cannot publish this dataset on the
project page, but have instead included comma separated text files with
the tagging results (which are used for the evaluation) in the
The gazetteer used by the evaluation framework uses the following database schema:
A compressed postgres data dump of the gazetteer is available here.
The Framework implements caching using the extensible Web Retrieval Toolkit.
- Reuters. Reuters Corpus, Volume 1, English language, 1996-08-20 to 1997-08-19
- Weichselbraun, Albert. (2009). A Utility Centered Approach for Evaluating and Optimizing Geo-Tagging. First International Conference on Knowledge Discovery and Information Retrieval (KDIR 2009), 134-139, Madeira, Portugal