# Papers

## Survey Papers

[[A Practical Survey on Faster and Lighter Transformers]] : Surveys popular approaches for making Transformers faster and lighter, explaining each method's strengths, limitations, and underlying assumptions.

## Information Retrieval

[[MS-Shift An Analysis of MS Marco Distribution Shifts on Neural Retrieval]] : Segments the MS Marco dataset to evaluate the three families of BERT-based neural retrievers: sparse, dense, and late interaction. A sketch of how the dense and late-interaction scoring styles differ follows below.
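
As a rough illustration of that taxonomy (not code from the paper), here is how dense scoring and ColBERT-style late-interaction scoring differ. Shapes and names are assumptions; token vectors are assumed L2-normalised:

```python
import torch

def dense_score(q_vec: torch.Tensor, d_vec: torch.Tensor) -> torch.Tensor:
    """Dense retrieval: one pooled vector per query/document, scored by dot product."""
    return q_vec @ d_vec

def late_interaction_score(q_toks: torch.Tensor, d_toks: torch.Tensor) -> torch.Tensor:
    """ColBERT-style late interaction: keep one vector per token, then sum,
    over query tokens, the max similarity to any document token (MaxSim)."""
    sim = q_toks @ d_toks.T             # (num_query_tokens, num_doc_tokens)
    return sim.max(dim=1).values.sum()  # best-matching doc token per query token
```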

[[Atomised Search Length Beyond User Models]] : Proposes a new IR metric, [[Atomised Search Length]], that better reflects the quality of retrieved results.

## LLM

### Foundation Models

Improving Language Understanding by Generative Pre-Training : How GPT-1 sparked a major shift in NLP by demonstrating effective transfer learning from generative pre-training.

Language Models Are Unsupervised Multitask Learners : How GPT-2 challenged the traditional pre-train -> fine-tune-per-task paradigm by performing tasks directly through auto-regressive prompting.

A Comprehensive Overview of Large Language Models : A walkthrough of the key ideas and concepts around large language models

### Training

[[ORPO Monolithic Preference Optimization without Reference Model]] : Fine-tunes a model directly on the odds ratio between chosen and rejected outputs, combining the SFT and preference-alignment (RLHF) stages into a single stage.
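
A minimal sketch of the ORPO objective: the standard SFT loss on the chosen response plus an odds-ratio penalty that pushes the chosen response above the rejected one. Tensor names, the length-normalisation of log-likelihoods, and the λ value are illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def orpo_loss(logp_chosen: torch.Tensor, logp_rejected: torch.Tensor, lam: float = 0.1):
    """ORPO sketch. logp_* are length-normalised log-likelihoods of the
    chosen/rejected responses under the model, shape (batch,), values < 0."""
    # odds(y) = P(y|x) / (1 - P(y|x)), computed stably in log space:
    # log odds = logp - log(1 - exp(logp))
    log_odds_chosen = logp_chosen - torch.log1p(-torch.exp(logp_chosen))
    log_odds_rejected = logp_rejected - torch.log1p(-torch.exp(logp_rejected))
    # Odds-ratio term: -log sigmoid(log of odds(chosen) / odds(rejected))
    l_or = -F.logsigmoid(log_odds_chosen - log_odds_rejected)
    # SFT term: negative log-likelihood of the chosen response
    l_sft = -logp_chosen
    return (l_sft + lam * l_or).mean()
```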

[[LoRA Learns Less and Forgets Less]] : An ablation study comparing LoRA's performance against full fine-tuning for instruction fine-tuning and continued pre-training.
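
For context, a minimal sketch of the LoRA adapter being ablated (the low-rank update itself, not the paper's experiments); module and parameter names are illustrative:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """LoRA sketch: frozen base weight W plus a trainable low-rank
    update scaled by alpha / r, i.e. W x + (alpha/r) * B A x."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False  # only the adapter is trained
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: no change at step 0
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)
```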

## Embeddings

Matryoshka Embeddings : Training embedding models whose representations remain useful when truncated to a variety of smaller dimensions, so one model can serve many accuracy/cost trade-offs.
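
A minimal sketch of the idea, assuming a contrastive (InfoNCE-style) objective and illustrative prefix dimensions; the paper's exact training recipe may differ:

```python
import torch
import torch.nn.functional as F

def truncate_and_normalise(emb: torch.Tensor, dim: int) -> torch.Tensor:
    """Keep only the first `dim` coordinates of a Matryoshka embedding,
    re-normalising so cosine similarity still behaves."""
    return F.normalize(emb[..., :dim], dim=-1)

def matryoshka_training_loss(emb_a, emb_b, dims=(64, 128, 256, 768)):
    """Sum a contrastive loss over nested prefix dimensions so every
    prefix of the embedding is trained to be useful on its own.
    emb_a / emb_b are paired embeddings, shape (batch, full_dim)."""
    total = 0.0
    for d in dims:
        a = truncate_and_normalise(emb_a, d)
        b = truncate_and_normalise(emb_b, d)
        logits = a @ b.T / 0.05  # temperature-scaled similarity matrix
        targets = torch.arange(len(a), device=a.device)  # matching pairs on the diagonal
        total = total + F.cross_entropy(logits, targets)
    return total / len(dims)
```

At inference time, `truncate_and_normalise` alone is enough: a single stored full-width embedding can be cut down to whichever dimension the latency or storage budget allows.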