Optimizing Geographic Tagging

Introduction

The vision of the Geospatial Web combines geographic data, Internet technology and social change. Geospatial applications like the IDIOM Media Watch on Climate Change use geo-annotation services to enrich Web pages and media articles with geographic tags.

Identifying a document’s target geographies is a complex task, complicated by geo/geo ambiguities (e.g. Vienna/at versus Vienna/Virginia/us) and geo/non-geo ambiguities such as turkey/bird versus Turkey/country. Most approaches to tagging the target geography therefore use machine learning technologies, gazetteers, or a combination of both to identify geo-tags. The gazetteer’s size and many internal tuning parameters determine the geo-tagger’s performance and its bias towards identifying smaller geographic entities or higher-level units. Designing a geo-tagger and choosing these parameters often involves trade-offs; improvements in one area do not necessarily yield better results in others.
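The two kinds of ambiguity and the effect of a tuning parameter can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy gazetteer, its rough population figures, the capitalization heuristic and the `min_population` parameter are hypothetical and do not describe any particular geo-tagger.

```python
# Hypothetical toy gazetteer: lower-cased surface form -> candidate entries.
# The population figures are rough placeholders, not real gazetteer data.
GAZETTEER = {
    "vienna": [
        {"id": "Vienna/at", "population": 1_900_000},
        {"id": "Vienna/Virginia/us", "population": 16_000},
    ],
    "turkey": [
        {"id": "Turkey/country", "population": 85_000_000},
    ],
}


def candidates(token):
    """Return all gazetteer entries for a token (source of geo/geo ambiguity)."""
    return GAZETTEER.get(token.lower(), [])


def looks_like_place(token):
    """Crude geo/non-geo filter: accept only capitalized tokens.

    This heuristic fails for sentence-initial words; it is only a placeholder.
    """
    return token[:1].isupper() and bool(candidates(token))


def resolve(token, min_population=0):
    """Pick the most populous candidate at or above a tunable threshold."""
    if not looks_like_place(token):
        return None
    eligible = [c for c in candidates(token) if c["population"] >= min_population]
    if not eligible:
        return None
    return max(eligible, key=lambda c: c["population"])["id"]
```

Raising `min_population` biases this sketch towards larger entities and makes it drop smaller ones entirely, which is the kind of trade-off described above.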

The goals of this thesis are:

designing a test case for evaluating geo-taggers,
implementing this test case as a unit test, and
applying the resulting test framework to different approaches to geo-tagging.
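As a rough idea of what such a unit test could look like, here is a minimal sketch based on Python’s standard `unittest` module. The function `toy_geo_tagger`, the sample sentences and the expected tags are hypothetical stand-ins; a real test case would call the geo-tagger under evaluation and use an annotated corpus.

```python
import unittest


def toy_geo_tagger(text):
    """Hypothetical stand-in for the geo-tagger under test."""
    known = {"Vienna": "Vienna/at", "Graz": "Graz/at"}
    words = text.replace(",", " ").replace(".", " ").split()
    return {known[w] for w in words if w in known}


class GeoTaggerTestCase(unittest.TestCase):
    # Each case pairs an input text with the expected set of geo-tags.
    CASES = [
        ("The conference takes place in Vienna.", {"Vienna/at"}),
        ("Graz and Vienna are cities in Austria.", {"Graz/at", "Vienna/at"}),
        ("The turkey was served with potatoes.", set()),
    ]

    def test_expected_tags(self):
        for text, expected in self.CASES:
            # subTest reports every failing case instead of stopping at the first.
            with self.subTest(text=text):
                self.assertEqual(toy_geo_tagger(text), expected)


def run_suite():
    """Run the test case programmatically and return the TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(GeoTaggerTestCase)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Keeping the cases in a plain list makes it easy to swap in test sets for different gazetteer sizes or regions without touching the test logic.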

Todo

Literature research
- geo-tagging algorithms
- public geocoding APIs (evaluation)
Design geo test cases (different gazetteer sizes, different regions)
Implement geo unit tests
Modify different geo-tagging algorithms and measure their performance
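One possible way to measure performance is to compare the predicted tags with a gold standard using precision, recall and F1. This is an assumption for illustration, not necessarily the measure the thesis will adopt; the function name and the sample tags are hypothetical.

```python
def precision_recall_f1(predicted, gold):
    """Compare a set of predicted geo-tags with a set of gold-standard tags."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: one of two predicted tags is correct,
# and one of two gold tags is found.
scores = precision_recall_f1({"Vienna/at", "Graz/at"}, {"Vienna/at", "Linz/at"})
```

Computing such scores separately per gazetteer size or region would show where a parameter change helps and where it hurts.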

Literature

E. Amitay, N. Har’El, R. Sivan, and A. Soffer. “Web-a-where: geotagging web content”. In SIGIR ’04: Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval, pages 273–280, New York, NY, USA, 2004. ACM.
R. Beierheimer. “Geo Tagging of Web Resources”. Bachelor’s thesis, Graz University of Technology, Sept. 2006.
A. Weichselbraun. “A Utility-Testing Centered Approach for Optimizing Geo-Tagging”. Draft.

Albert Weichselbraun
