An NLP task in which a system receives a question and returns a precise answer, optionally grounded in a given context passage. Question answering is a core interaction pattern for multimodal retrieval systems, where users expect direct answers rather than lists of documents.
Extractive QA identifies the answer span within a given context passage by predicting start and end token positions. Abstractive QA generates the answer in natural language, potentially synthesizing information from multiple sources. Open-domain QA first retrieves relevant documents from a large corpus, then extracts or generates answers from the retrieved context.
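The start/end span prediction at the heart of extractive QA can be seen directly in the model outputs. Below is a minimal sketch using the Hugging Face `transformers` library, assuming the `distilbert-base-cased-distilled-squad` checkpoint (any SQuAD-fine-tuned model works the same way); the question and context strings are illustrative.

```python
# Extractive QA sketch: the model emits one logit per token for the
# answer start and one for the answer end; the predicted span runs
# from the argmax start position to the argmax end position.
import torch
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

model_name = "distilbert-base-cased-distilled-squad"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)

question = "Where was Marie Curie born?"
context = "Marie Curie was born in Warsaw and later moved to Paris."

inputs = tokenizer(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

start = outputs.start_logits.argmax()
end = outputs.end_logits.argmax()
answer_tokens = inputs["input_ids"][0][start : end + 1]
print(tokenizer.decode(answer_tokens))  # e.g. "Warsaw"
```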
Extractive systems fine-tune BERT or RoBERTa on SQuAD-style datasets to predict answer spans. Generative QA uses encoder-decoder models (T5, BART) or LLMs with RAG (Retrieval-Augmented Generation). Multi-hop QA requires reasoning across multiple documents. Evaluation uses Exact Match (EM) and token-level F1 overlap for extractive answers, and ROUGE or human evaluation for generative ones.
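For concreteness, here is a simplified sketch of the SQuAD-style EM and token F1 metrics. It normalizes only by lowercasing and whitespace tokenization; the official evaluation script additionally strips punctuation and articles.

```python
# Exact Match: 1.0 if the normalized prediction equals the reference.
# Token F1: harmonic mean of precision and recall over shared tokens.
from collections import Counter

def normalize(text: str) -> list[str]:
    # Simplified normalization (lowercase + whitespace split only).
    return text.lower().split()

def exact_match(prediction: str, reference: str) -> float:
    return float(normalize(prediction) == normalize(reference))

def token_f1(prediction: str, reference: str) -> float:
    pred, ref = normalize(prediction), normalize(reference)
    common = Counter(pred) & Counter(ref)  # multiset intersection
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

print(exact_match("in Warsaw", "Warsaw"))         # 0.0
print(round(token_f1("in Warsaw", "Warsaw"), 2))  # 0.67
```

The gap between the two scores above is why both are reported: EM penalizes any deviation from the gold span, while F1 gives partial credit for overlapping tokens.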