- FActScore: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation
- Language Models Hallucinate, but May Excel at Fact Verification
- RAGAS: Automated Evaluation of Retrieval Augmented Generation (sentence-level generation)
- FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios
- Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers
- Towards LLM-based Fact Verification on News Claims with a Hierarchical Step-by-Step Prompting Method
- Fine-tuning Language Models for Factuality
- Chain-of-Verification Reduces Hallucination in Large Language Models
- SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models
- RARR: Researching and Revising What Language Models Say, Using Language Models
- FELM: Benchmarking Factuality Evaluation of Large Language Models
- Improving Model Factuality with Fine-grained Critique-based Evaluator
- Molecular Facts: Desiderata for Decontextualization in LLM Fact Verification
- RAGBench: Explainable Benchmark for Retrieval-Augmented Generation Systems
- FactBench: A Dynamic Benchmark for In-the-Wild Language Model Factuality Evaluation
- FactAlign: Long-form Factuality Alignment of Large Language Models
- Counterfactual Generation from Language Models
- LongReward: Improving Long-context Large Language Models with AI Feedback
- MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents
- Looking Beyond the Surface: A Challenge Set for Reading Comprehension over Multiple Sentences
- LONG2RAG: Evaluating Long-Context & Long-Form Retrieval-Augmented Generation with Key Point Recall (based on ELI5)
- L-Eval: Instituting Standardized Evaluation for Long Context Language Models
- MultiDoc2Dial: Modeling Dialogues Grounded in Multiple Documents
- AQuaMuSe: Automatically Generating Datasets for Query-Based Multi-Document Summarization
- ExpertQA: Expert-Curated Questions and Attributed Answers
- WiCE: Real-World Entailment for Claims in Wikipedia