Research project

Scalable Generative IR

About This Project

The Scalable Generative Information Retrieval (IR) project investigates how to apply generative models to information retrieval tasks scalably and efficiently. Our goal is to develop retrieval systems that leverage large language models (LLMs) while remaining practical for real-world applications, such as domain-specific search and systematic reviews.

Key Areas of Focus

  • Elastic & Scalable Modeling: Our models support dynamic adjustment of transformer depth and embedding dimensionality, enabling flexible trade-offs between effectiveness and efficiency. We also leverage techniques like 2D Matryoshka pruning to train adaptable models that operate efficiently across a range of deployment budgets.
  • Context Compression for RAG: We explore context compression techniques for retrieval-augmented generation (RAG), such as COCOM (Context Compression Model), to make RAG systems more efficient.
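
To make the effectiveness/efficiency trade-off in the elastic modeling item concrete, here is a minimal sketch of the embedding-dimensionality axis only: a Matryoshka-style embedding is cut to its first few coordinates, renormalised, and used for ranking. It is an illustration under stated assumptions, not the project's implementation. The function names, the toy vectors, and the document ids are all hypothetical, and the transformer-depth axis is not shown.

```python
import math


def truncate_embedding(vec, dim):
    """Keep the first `dim` coordinates and L2-normalise the result."""
    if not 1 <= dim <= len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)
    return [x / norm for x in head]


def cosine(a, b):
    # Both inputs are unit length, so the dot product equals the cosine.
    return sum(x * y for x, y in zip(a, b))


def rank(query, docs, dim):
    """Rank document ids by similarity using only the first `dim` dimensions."""
    q = truncate_embedding(query, dim)
    scored = [
        (cosine(q, truncate_embedding(vec, dim)), doc_id)
        for doc_id, vec in docs.items()
    ]
    return [doc_id for _, doc_id in sorted(scored, reverse=True)]


# Toy 4-dimensional embeddings (made-up numbers, for illustration only).
QUERY = [0.9, 0.1, 0.3, 0.2]
DOCS = {
    "doc_a": [0.8, 0.2, 0.1, 0.1],
    "doc_b": [0.1, 0.9, 0.2, 0.7],
}

full = rank(QUERY, DOCS, dim=4)   # full-size embeddings
cheap = rank(QUERY, DOCS, dim=2)  # half the storage and dot-product cost
print(full, cheap)
```

A model trained so that embedding prefixes remain useful lets a deployment pick `dim` to fit its budget. Whether the cheaper ranking matches the full one depends on the model and the data, so it should be measured rather than assumed.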