
Paper Reading

Reading lists for papers, including but not limited to Natural Language Processing (NLP).


Uncategorized

  • Thinking Fair and Slow: "“Thinking” Fair and Slow: On the Efficacy of Structured Prompts for Debiasing Language Models". EMNLP(2024) [PDF]
  • Regard Dataset: "The Woman Worked as a Babysitter: On Biases in Language Generation". EMNLP-IJCNLP(2019) [PDF][CODE]
  • Regard V3: "Towards Controllable Biases in Language Generation". EMNLP-Findings(2020) [PDF][CODE]
  • Bias in Bios: "Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting". FAT(2019) [PDF][CODE]
  • Tree Structure Reasoning: "TREA: Tree-Structure Reasoning Schema for Conversational Recommendation". ACL(2023) [PDF][CODE]
  • SEER: "SEER: Facilitating Structured Reasoning and Explanation via Reinforcement Learning". ACL(2024) [PDF][CODE]
  • TrustLLM: "TrustLLM: Trustworthiness in Large Language Models". ICML(2024) [PDF][CODE][website]
  • Entailment Trees: "Explaining Answers with Entailment Trees". EMNLP(2021) [PDF][CODE]
  • LLM Data Annotation: "Large Language Models for Data Annotation and Synthesis: A Survey". EMNLP(2024) [PDF]
  • Free-Text Explanation: "Reframing Human-AI Collaboration for Generating Free-Text Explanations". NAACL(2022) [PDF][CODE]
  • Bias-NLI: "On Measuring and Mitigating Biased Inferences of Word Embeddings". AAAI(2020) [PDF]
  • Gaps: "The Gaps between Pre-train and Downstream Settings in Bias Evaluation and Debiasing". COLING(2025) [PDF]
  • Self-Correction: "The Capacity for Moral Self-Correction in Large Language Models". ArXiv(2023) [PDF]
  • BBQ: "BBQ: A Hand-Built Bias Benchmark for Question Answering". ACL-Findings(2022) [PDF][CODE]
  • BNLI: "Evaluating Gender Bias of Pre-trained Language Models in Natural Language Inference by Considering All Labels". LREC-COLING(2024) [PDF]
  • Adversarial NLI: "Adversarial NLI: A New Benchmark for Natural Language Understanding". ACL(2020) [PDF]
  • SEAT: "On Measuring Social Biases in Sentence Encoders". NAACL(2019) [PDF]
  • Implicit Ranking: "A Study of Implicit Ranking Unfairness in Large Language Models". EMNLP-Findings(2024) [PDF] [CODE]
  • OCCUGENDER: "Causally Testing Gender Bias in LLMs: A Case Study on Occupational Bias". NeurIPS(2024) [PDF] [CODE]
  • Salmon Paper: "Stereotyping Norwegian Salmon: An Inventory of Pitfalls in Fairness Benchmark Datasets". ACL(2021) [PDF]
  • Romantic Relationship Prediction: "On the Influence of Gender and Race in Romantic Relationship Prediction from Large Language Models". EMNLP(2024) [PDF]
  • Theory-Grounded: "Theory-Grounded Measurement of U.S. Social Stereotypes in English Language Models". NAACL(2022) [PDF]
  • Marked Personas: "Marked Personas: Using Natural Language Prompts to Measure Stereotypes in Language Models". ACL(2023) [PDF]
  • SODAPOP: "SODAPOP: Open-Ended Discovery of Social Biases in Social Commonsense Reasoning Models". EACL(2023) [PDF]
  • DDRel Dataset: "DDRel: A New Dataset for Interpersonal Relation Classification in Dyadic Dialogues". AAAI(2021) [PDF]
  • Hiring Decisions: "Do Large Language Models Discriminate in Hiring Decisions on the Basis of Race, Ethnicity, and Gender?". ACL(2024) [PDF]
  • First Name Biases: "Nichelle and Nancy: The Influence of Demographic Attributes and Tokenization Length on First Name Biases". ACL(2023) [PDF]
  • ORPO: "ORPO: Monolithic Preference Optimization without Reference Model". EMNLP(2024) [PDF]
  • OPT: "OPT: Open Pre-trained Transformer Language Models". ArXiv(2022) [PDF]
  • Word Meanings Adapt: "Understanding the Semantic Space: How Word Meanings Dynamically Adapt in the Context of a Sentence". SemSpace(2021) [PDF]
  • Answer is All You Need: "Answer is All You Need: Instruction-following Text Embedding via Answering the Question". ACL(2024) [PDF]
  • Social IQa: "Social IQa: Commonsense Reasoning about Social Interactions". EMNLP-IJCNLP(2019) [PDF] [CODE]
  • Symbolic Knowledge Distillation: "Symbolic Knowledge Distillation: from General Language Models to Commonsense Models". NAACL(2022) [PDF]
  • Implicit User Intention Understanding: "Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents". ACL(2024) [PDF]
  • Low Frequency Names: "Low Frequency Names Exhibit Bias and Overfitting in Contextualizing Language Models". EMNLP(2021) [PDF]
  • Measuring Fairness in Generative Models: "On Measuring Fairness in Generative Models". NeurIPS(2023) [PDF]
  • Dependency-Based Semantic Space: "Dependency-Based Construction of Semantic Space Models". CL(2007) [PDF]
  • Sentence-BERT: "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks". EMNLP-IJCNLP(2019) [PDF]
  • Biased or Flawed: "Biased or Flawed? Mitigating Stereotypes in Generative Language Models by Addressing Task-Specific Flaws". ArXiv(2024) [PDF]
  • Bias Vector: "Bias Vector: Mitigating Biases in Language Models with Task Arithmetic Approach". ArXiv(2024) [PDF]
  • Multi-Objective Approach: "Mitigating Social Bias in Large Language Models: A Multi-Objective Approach within a Multi-Agent Framework". ArXiv(2024) [PDF]
  • Humans or LLMs as the Judge: "Humans or LLMs as the Judge? A Study on Judgement Bias". EMNLP(2024) [PDF]
  • Personas Is Not Helpful: "When "A Helpful Assistant" Is Not Really Helpful: Personas in System Prompts Do Not Improve Performances of Large Language Models" [PDF]
  • Measuring Social Bias: "Ask LLMs Directly, “What shapes your bias?”: Measuring Social Bias in Large Language Models"
  • MBTI Evaluation: "Do LLMs Possess a Personality? Making the MBTI Test an Amazing Evaluation for Large Language Models"
  • MBTI by LLMs: "Can ChatGPT Assess Human Personalities? A General Evaluation Framework"
  • MBTI Detection: "Can Large Language Models Understand You Better? An MBTI Personality Detection Dataset Aligned with Population Traits"
  • ProSA: "ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs"
  • Quantifying Sensitivity: "Quantifying Language Models' Sensitivity to Spurious Features in Prompt Design or: How I learned to start worrying about prompt formatting"
  • In-Context Impersonation: "In-Context Impersonation Reveals Large Language Models' Strengths and Biases"
  • VISBIAS: "VISBIAS: Measuring Explicit and Implicit Social Biases in Vision Language Models"
  • Two Hundred Sentiment: "Examining Gender and Race Bias in Two Hundred Sentiment Analysis Systems"
  • No LLM is Free: "No LLM is Free From Bias: A Comprehensive Study of Bias Evaluation in Large Language Models"
  • Clinical Bias: "CLIMB: A Benchmark of Clinical Bias in Large Language Models"
  • Persona Biases: "Are Personalized Stochastic Parrots More Dangerous? Evaluating Persona Biases in Dialogue Systems"
  • Effect of Compression: "Understanding the Effect of Model Compression on Social Bias in Large Language Models"
  • Uncertainty Estimation: "Uncertainty Estimation for Debiased Models: Does Fairness Hurt Reliability?"
  • Mitigating Bias: "Mitigating Word Bias in Zero-shot Prompt-based Classifiers"
  • BiasAlert: "BiasAlert: A Plug-and-play Tool for Social Bias Detection in LLMs"
  • Short-Term: "Evaluating Short-Term Temporal Fluctuations of Social Biases in Social Media Data and Masked Language Models"
  • Cross-Lingual Training: "Cross-Lingual Training for Automatic Question Generation"
  • BiasDPO: "BiasDPO: Mitigating Bias in Language Models through Direct Preference Optimization"
  • ReSS: "Your Stereotypical Mileage May Vary: Practical Challenges of Evaluating Biases in Multiple Languages and Cultural Contexts"
  • Cross-Lingual Transfer: "Cross-Lingual Transfer of Debiasing and Detoxification in Multilingual LLMs: An Extensive Investigation"
  • Response Biases: "Do LLMs Exhibit Human-like Response Biases? A Case Study in Survey Design"
  • MBBQ: "MBBQ: A Dataset for Cross-Lingual Comparison of Stereotypes in Generative LLMs"
  • DPO and Toxicity: "A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity"
  • Across Languages: "Preference Tuning For Toxicity Mitigation Generalizes Across Languages"

Deep Learning in NLP

  • Data Augmentation: "A Survey of Data Augmentation Approaches for NLP". ACL-Findings(2021) [PDF]
  • RAG: "Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks". NeurIPS(2020) [PDF]
  • DPR: "Dense Passage Retrieval for Open-Domain Question Answering". EMNLP(2020) [PDF]

Debiasing

Survey

  • Empirical Survey: "An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models". ACL(2022) [PDF]
  • Bias and Fairness in LLMs: "Bias and Fairness in Large Language Models: A Survey". CL(2024) [PDF]

Measure

  • Measure Biases & Harms: "On Measures of Biases and Harms in NLP". AACL(2022) [PDF][CODE]

Static Embeddings

  • Gender-Neutral: "Gender-preserving Debiasing for Pre-trained Word Embeddings". ACL(2019) [PDF] [CODE]

Pre-trained Language Models

  • Attention-Debiasing: "Debiasing Pretrained Text Encoders by Paying Attention to Paying Attention". EMNLP(2022) [PDF] [CODE]
  • Debiasing Masks: "Debiasing Masks: A New Framework for Shortcut Mitigation in NLU". EMNLP(2022) [PDF] [CODE]
  • DebiasGAN: "DebiasGAN: Eliminating Position Bias in News Recommendation with Adversarial Learning". EMNLP(2022) [PDF] [CODE]
  • DCLR: "Debiased Contrastive Learning of Unsupervised Sentence Representations". ACL(2022) [PDF] [CODE]
  • Self-Debiasing: "Self-Diagnosis and Self-Debiasing: A Proposal for Reducing Corpus-Based Bias in NLP". TACL(2021) [PDF] [CODE]
  • SENT-DEBIAS: "Towards Debiasing Sentence Representations". ACL(2020) [PDF] [CODE]
  • FairFil: "FairFil: Contrastive Neural Debiasing Method for Pretrained Text Encoders". ICLR(2021) [PDF]
  • Context-Debias: "Debiasing Pre-trained Contextualised Embeddings". EACL(2021) [PDF] [CODE]
  • Occupation Data: "Good Secretaries, Bad Truck Drivers? Occupational Gender Stereotypes in Sentiment Analysis". EACL(2021) [PDF] [CODE]

Large Language Models

  • Few-Shot Data Interventions: "Language Models Get a Gender Makeover: Mitigating Gender Bias with Few-Shot Data Interventions". ACL(2023) [PDF][CODE]
  • Label Bias in LLMs: "Beyond Performance: Quantifying and Mitigating Label Bias in LLMs". NAACL(2024) [PDF][CODE]
  • Conceptor-Aided Debiasing: "Conceptor-Aided Debiasing of Large Language Models". EMNLP(2023) [PDF]
  • Likelihood-based Mitigation: "Likelihood-based Mitigation of Evaluation Bias in Large Language Models". ACL(2024) [PDF][CODE]

Information Theory

  • Embedding-MI: "Estimating Mutual Information Between Dense Word Embeddings". ACL(2020) [PDF] [CODE]
  • MI-Max: "A Mutual Information Maximization Perspective of Language Representation Learning". ACL(2020) [PDF] [CODE]

Dataset Construction

  • Coscript: "Distilling Script Knowledge from Large Language Models for Constrained Language Planning". ACL(2023) [PDF] [CODE]
  • Role of Demonstrations: "Rethinking the Role of Demonstrations: What Makes In-Context Learning Work?". EMNLP(2022) [PDF] [CODE]
  • Order Sensitivity: "Fantastically Ordered Prompts and Where to Find Them: Overcoming Few-Shot Prompt Order Sensitivity". ACL(2022) [PDF]
  • Holistic Descriptor Dataset: "I'm sorry to hear that: Finding New Biases in Language Models with a Holistic Descriptor Dataset". EMNLP(2022) [PDF] [CODE]

Bias Suppression

  • Prompt Bias Suppression: "In-Contextual Gender Bias Suppression for Large Language Models". EACL-Findings(2023) [PDF] [CODE]

Implicit Bias

  • Explicit and Implicit Gender: "Probing Explicit and Implicit Gender Bias through LLM Conditional Text Generation". EACL-Findings(2023) [PDF]

Others

  • Wiki Bias: "Men Are Elected, Women Are Married: Events Gender Bias on Wikipedia". ACL(2021) [PDF] [CODE]
  • Annotator Demographics Matter: "When Do Annotator Demographics Matter? Measuring the Influence of Annotator Demographics with the POPQUORN Dataset". LAW-XVII@ACL(2023) [PDF] [CODE]
