MambaRALM: Analyzing RALMs with Selective State Space and Transformer Based Architectures for Long Sequence Modeling

Left: RAG Pipeline. Right: RALM Analysis Pipeline.

Sebastian Jaskowski, Austin T. Barton, Nolan Bridges

Abstract: This study examines the efficacy of Retrieval Augmented Language Models (RALMs), a recent paradigm that incorporates retrievers to enhance standalone language models at inference time. Most RALMs are built on the transformer architecture, whose scalability issues limit context window size. This project explores whether the Mamba architecture, known for its proficiency with long sequences and Long Range Dependencies (LRDs), can improve RALM performance. The study constructs a RALM based on the Mamba architecture and evaluates it alongside a transformer-based RALM on a subset of the TriviaQA dataset. Results show comparable performance for small to medium numbers of retrieved chunks (k ≤ 7), but the Mamba-based RALM is more resilient at larger retrieval depths (k > 7), suggesting it handles irrelevant retrieved information more effectively.

Paper

The MambaRALM project aims to construct and evaluate a Retrieval Augmented Generation (RAG) QA system built on an instruction-tuned language model with the Mamba architecture. In our case, we use the 2.8B-parameter instruction-tuned Mamba-Chat model.
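As a rough illustration of the RAG pipeline, the sketch below retrieves the top-k chunks for a question by cosine similarity and prepends them to the prompt before generating an answer. The embedding model, the Mamba-Chat Hugging Face id, the loading path, and the prompt template are illustrative assumptions, not the exact configuration used in this repository.

```python
# Minimal RALM sketch: retrieve top-k chunks, prepend them to the question,
# and generate an answer with an instruction-tuned causal LM.
# Model ids and the prompt template are assumptions for illustration only.
import numpy as np
from sentence_transformers import SentenceTransformer
from transformers import AutoModelForCausalLM, AutoTokenizer

EMBED_MODEL = "sentence-transformers/all-MiniLM-L6-v2"  # assumed embedder
LM_MODEL = "havenhq/mamba-chat"  # assumed HF id; loading Mamba-Chat may require extra dependencies

embedder = SentenceTransformer(EMBED_MODEL)
tokenizer = AutoTokenizer.from_pretrained(LM_MODEL)
model = AutoModelForCausalLM.from_pretrained(LM_MODEL)

def retrieve(question: str, chunks: list[str], k: int) -> list[str]:
    """Return the k chunks most similar to the question by cosine similarity."""
    q_emb = embedder.encode([question], normalize_embeddings=True)
    c_emb = embedder.encode(chunks, normalize_embeddings=True)
    scores = (c_emb @ q_emb.T).squeeze(-1)        # cosine similarity per chunk
    top = np.argsort(-scores)[:k]                 # indices of the k best chunks
    return [chunks[i] for i in top]

def answer(question: str, chunks: list[str], k: int = 3) -> str:
    """Build a context-augmented prompt from the top-k chunks and generate an answer."""
    context = "\n\n".join(retrieve(question, chunks, k))
    prompt = (
        "Answer the question using the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=64)
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
```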

Performance was compared between a Mamba-based RALM (Mamba-Chat) and a Transformer-based RALM (Dolly-v2-3B by Databricks). Both models were evaluated on a subset of the TriviaQA dataset.
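The evaluation varies the retrieval depth k and measures answer accuracy on the TriviaQA subset. Below is a hedged sketch of that loop, assuming the `answer` helper from the previous snippet, a substring-based exact-match heuristic, and the Hugging Face `trivia_qa` "rc.nocontext" config; the actual scoring procedure and split may differ.

```python
# Sweep the number of retrieved chunks k and report exact-match accuracy
# on a TriviaQA subset. Scoring heuristic and dataset split are assumptions.
from datasets import load_dataset

def evaluate_over_k(corpus_chunks: list[str],
                    k_values=(1, 3, 5, 7, 9, 11),
                    n: int = 200) -> None:
    """Answer each TriviaQA question with k retrieved chunks and print accuracy per k."""
    subset = load_dataset("trivia_qa", "rc.nocontext", split=f"validation[:{n}]")
    for k in k_values:
        correct = 0
        for ex in subset:
            pred = answer(ex["question"], corpus_chunks, k=k).lower()
            # Count a hit if any normalized answer alias appears in the prediction.
            correct += any(alias in pred for alias in ex["answer"]["normalized_aliases"])
        print(f"k={k}: exact-match accuracy = {correct / len(subset):.3f}")

# Usage (with a pre-chunked retrieval corpus of your own):
# evaluate_over_k(corpus_chunks=my_chunked_corpus)
```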
