Tilahun Abedissa Taffa


2024

Low Resource Question Answering: An Amharic Benchmarking Dataset
Tilahun Abedissa Taffa | Ricardo Usbeck | Yaregal Assabie
Proceedings of the Fifth Workshop on Resources for African Indigenous Languages @ LREC-COLING 2024

Question Answering (QA) systems return concise answers or answer lists from natural language text, given a context document. Many resources go into curating QA datasets to advance the development of robust QA models. There is a surge in QA datasets for high-resource languages such as English; the situation is different for low-resource languages like Amharic. Indeed, there is no published or publicly available Amharic QA dataset. Hence, to foster further research in low-resource QA, we present the first publicly available benchmarking Amharic Question Answering Dataset (Amh-QuAD). We crowdsource 2,628 question-answer pairs from over 378 Amharic Wikipedia articles. Using the training set, we fine-tune an XLM-R-based language model and introduce a new reader model. Leveraging our newly fine-tuned reader, we run a baseline model to spark interest in open-domain Amharic QA research. The best-performing baseline QA achieves F-scores of 80.3 and 81.34 in the retriever-reader and reading comprehension settings, respectively.
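A minimal sketch of the extractive (reader) step described above, using the Hugging Face transformers pipeline. The checkpoint name is a publicly available multilingual XLM-R stand-in, not the Amh-QuAD fine-tuned model from the paper, and the Amharic question-context pair is a hypothetical example:

```python
# Sketch of extractive QA with an XLM-R-based reader (assumed stand-in
# checkpoint, not the authors' Amh-QuAD fine-tuned model).
from transformers import pipeline

reader = pipeline(
    "question-answering",
    model="deepset/xlm-roberta-base-squad2",  # assumed public multilingual checkpoint
)

# Hypothetical Amharic example ("What is the capital of Ethiopia?").
result = reader(
    question="የኢትዮጵያ ዋና ከተማ ማን ናት?",
    context="አዲስ አበባ የኢትዮጵያ ዋና ከተማ ናት።",
)
print(result["answer"], result["score"])
```

In a retriever-reader setting, a retriever would first select candidate Wikipedia passages and the reader above would then extract answer spans from each candidate.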

2019

Amharic Question Answering for Biography, Definition, and Description Questions
Tilahun Abedissa Taffa | Mulugeta Libsie
Proceedings of the 2019 Workshop on Widening NLP

A broad range of information needs can often be stated as a question. Question Answering (QA) systems attempt to provide users with concise answer(s) to natural language questions. Existing Amharic QA systems handle fact-based questions that usually take named entities as answers. To deal with more complex information needs, we developed an Amharic non-factoid QA system for biography, definition, and description questions. A hybrid approach is used for question classification. For document filtering and answer extraction, we use lexical patterns. To answer biography questions, we use a summarizer, and the generated summary is validated with a text classifier. Our QA system has been evaluated and shows promising results.
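A small illustrative sketch of lexical-pattern answer extraction for definition questions, in the spirit of the approach described above. The patterns are simplified English stand-ins; the paper's system uses Amharic-specific patterns that are not reproduced here:

```python
import re

# Illustrative definition patterns (assumed, English stand-ins).
DEFINITION_PATTERNS = [
    r"{term} is (an? [^.]+)\.",
    r"{term}, (an? [^.]+),",
]

def extract_definition(term, document):
    """Return the first definition-like snippet matching a lexical pattern."""
    for template in DEFINITION_PATTERNS:
        pattern = template.format(term=re.escape(term))
        match = re.search(pattern, document, flags=re.IGNORECASE)
        if match:
            return match.group(1)
    return None

doc = "Injera is a sourdough flatbread that is a staple food in Ethiopia."
print(extract_definition("Injera", doc))
# -> "a sourdough flatbread that is a staple food in Ethiopia"
```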