Evaluating popular Information Retrieval Models


Introduction

Retrieval models rank documents by their relevance to a search query q and return them in that order. Popular retrieval models such as the Vector Space Model (VSM) and Explicit Semantic Analysis (ESA) have numerous applications in information retrieval, text mining, and natural language processing.
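To make the idea of ranking by relevance concrete, here is a minimal sketch of VSM-style ranking in Java. It rests on several simplifying assumptions that are not part of the thesis description: whitespace tokenization, raw term frequencies as weights (weighting schemes such as tf-idf are left out), and cosine similarity as the relevance score. The class and method names (`VsmSketch`, `termFrequencies`, `cosine`, `rank`) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of VSM-style ranking; names and weighting are illustrative assumptions.
final class VsmSketch {

    // Builds a raw term-frequency vector from lowercased, whitespace-separated tokens.
    static Map<String, Double> termFrequencies(String text) {
        Map<String, Double> tf = new HashMap<>();
        for (String token : text.toLowerCase().split("\\s+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1.0, Double::sum);
            }
        }
        return tf;
    }

    // Euclidean length of a sparse vector.
    static double norm(Map<String, Double> v) {
        double sum = 0.0;
        for (double x : v.values()) {
            sum += x * x;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity of two sparse vectors; 0.0 if either vector is empty.
    static double cosine(Map<String, Double> a, Map<String, Double> b) {
        double dot = 0.0;
        for (Map.Entry<String, Double> e : a.entrySet()) {
            dot += e.getValue() * b.getOrDefault(e.getKey(), 0.0);
        }
        double normA = norm(a);
        double normB = norm(b);
        return (normA == 0.0 || normB == 0.0) ? 0.0 : dot / (normA * normB);
    }

    // Returns document indices sorted by descending similarity to the query.
    static List<Integer> rank(String query, List<String> documents) {
        Map<String, Double> q = termFrequencies(query);
        double[] scores = new double[documents.size()];
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < documents.size(); i++) {
            scores[i] = cosine(q, termFrequencies(documents.get(i)));
            order.add(i);
        }
        order.sort((i, j) -> Double.compare(scores[j], scores[i]));
        return order;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "vector space model for retrieval",
                "cooking pasta at home",
                "semantic analysis of text");
        System.out.println(rank("vector space retrieval", docs));
    }
}
```

Only the first document shares terms with the query in this toy collection, so it is ranked first; the other two score zero and keep their original order because the sort is stable.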

This thesis focuses on

- the creation of a Java library that implements popular retrieval models, and
- the design of a framework for evaluating and comparing these models.
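One way such a library and framework could fit together is a shared interface that every model implements, so that the framework can run all models on the same input. The following sketch is only an illustration of that design idea. The interface `RetrievalModel`, the toy `TermOverlapModel`, and the `ComparisonSketch` harness are hypothetical names and not part of the thesis description.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical common interface; the thesis description does not prescribe this API.
interface RetrievalModel {
    String name();

    // Returns indices into `documents`, most relevant first.
    List<Integer> rank(String query, List<String> documents);
}

// Toy model used only to exercise the interface:
// scores a document by counting its tokens that also occur in the query.
final class TermOverlapModel implements RetrievalModel {
    public String name() {
        return "term-overlap";
    }

    public List<Integer> rank(String query, List<String> documents) {
        List<String> queryTerms = Arrays.asList(query.toLowerCase().split("\\s+"));
        int[] scores = new int[documents.size()];
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < documents.size(); i++) {
            for (String term : documents.get(i).toLowerCase().split("\\s+")) {
                if (queryTerms.contains(term)) {
                    scores[i]++;
                }
            }
            order.add(i);
        }
        order.sort((a, b) -> Integer.compare(scores[b], scores[a]));
        return order;
    }
}

final class ComparisonSketch {
    // Runs every model on the same query and prints its ranking and wall-clock time.
    static void compare(List<RetrievalModel> models, String query, List<String> documents) {
        for (RetrievalModel model : models) {
            long start = System.nanoTime();
            List<Integer> ranking = model.rank(query, documents);
            long elapsedMicros = (System.nanoTime() - start) / 1_000;
            System.out.println(model.name() + " -> " + ranking + " (" + elapsedMicros + " us)");
        }
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "cooking pasta",
                "information retrieval models",
                "retrieval of images");
        List<RetrievalModel> models = new ArrayList<>();
        models.add(new TermOverlapModel());
        compare(models, "information retrieval", docs);
    }
}
```

Because the harness depends only on the interface, adding another model to the comparison would mean adding one more implementation to the list. A single `System.nanoTime()` measurement is noisy, so a real evaluation would likely repeat runs and average them.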

Table of Contents

Theory
- IR Models (Definition, Classifications)
- Popular Models (VSM, ESA, LSI, …)
- Computational and Memory Complexity
Implementation
- Retrieval Models (VSM, ESA, …)
- Evaluation Framework (precision, recall, F1, processing time, computational complexity, memory complexity, data storage, …)
Evaluation
Outlook and Conclusions
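The quality measures named for the evaluation framework (precision, recall, F1) could be computed along the following lines. This is a minimal sketch under assumptions not stated in the thesis description: results and relevance judgments are treated as unordered sets of document IDs, and each measure is defined as 0 when its denominator is 0. The class `EvalSketch` and the document IDs are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of set-based evaluation measures; names are illustrative.
final class EvalSketch {

    // Number of documents that are both retrieved and relevant.
    static int overlap(Set<String> retrieved, Set<String> relevant) {
        Set<String> common = new HashSet<>(retrieved);
        common.retainAll(relevant);
        return common.size();
    }

    // Precision: share of retrieved documents that are relevant.
    static double precision(Set<String> retrieved, Set<String> relevant) {
        if (retrieved.isEmpty()) {
            return 0.0;
        }
        return (double) overlap(retrieved, relevant) / retrieved.size();
    }

    // Recall: share of relevant documents that were retrieved.
    static double recall(Set<String> retrieved, Set<String> relevant) {
        if (relevant.isEmpty()) {
            return 0.0;
        }
        return (double) overlap(retrieved, relevant) / relevant.size();
    }

    // F1: harmonic mean of precision and recall.
    static double f1(double precision, double recall) {
        if (precision + recall == 0.0) {
            return 0.0;
        }
        return 2.0 * precision * recall / (precision + recall);
    }

    public static void main(String[] args) {
        Set<String> retrieved = new HashSet<>(Arrays.asList("d1", "d2", "d3", "d4"));
        Set<String> relevant = new HashSet<>(Arrays.asList("d1", "d3", "d5"));
        double p = precision(retrieved, relevant); // 2 of 4 retrieved are relevant
        double r = recall(retrieved, relevant);    // 2 of 3 relevant were retrieved
        System.out.printf("P=%.3f R=%.3f F1=%.3f%n", p, r, f1(p, r));
    }
}
```

In the example, two of the four retrieved documents are relevant (precision 1/2) and two of the three relevant documents are found (recall 2/3), which gives F1 = 4/7.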

Student Profile

- An interest in information retrieval.
- Good Java skills. The implementation is an integral part of this thesis.

Literature

Stein, Benno and Anderka, Maik (2009). Collection-Relative Representations: A Unifying View to Retrieval Models, Twentieth International Workshop on Database and Expert Systems Application (DEXA 2009); Sixth International Workshop on Text-Based Information Retrieval (TIR 2009), pages 383–387
Salton, G., Wong, A. and Yang, C. S. (1975). A vector space model for automatic indexing, Communications of the ACM, 18(11), pages 613–620
Gabrilovich, Evgeniy and Markovitch, Shaul (2009). Wikipedia-based Semantic Interpretation for Natural Language Processing, Journal of Artificial Intelligence Research, pages 443–498

Albert Weichselbraun
