Scalable Generative IR
People involved:

About this project
The Scalable Generative Information Retrieval (IR) project investigates how generative models can be applied to information retrieval tasks in a scalable and efficient manner. Our goal is to develop retrieval systems that leverage large language models (LLMs) while remaining practical for real-world applications, such as domain-specific search and systematic reviews.
Key Areas of Focus
- Elastic & Scalable Modeling: Our models support dynamic adjustment of transformer depth and embedding dimensionality at inference time, enabling flexible trade-offs between retrieval effectiveness and efficiency. We also use techniques such as 2D Matryoshka pruning to train a single adaptable model that operates efficiently across a range of deployment budgets (a minimal sketch of the idea follows this list).
- Context Compression for RAG: We explore context compression techniques, such as COCOM (Context Compression Model), to reduce the prompt length and inference cost of retrieval-augmented generation (RAG) systems (a second sketch below illustrates the idea).
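
The sketch below illustrates the elastic-inference idea behind 2D Matryoshka-style encoders: a single model is queried at different transformer depths and embedding widths depending on the compute budget. It is only a minimal sketch, not our trained models; the checkpoint name and the `sub_layer`/`sub_dim` knobs are assumptions, and off-the-shelf encoders only benefit from this if they were trained with a 2D Matryoshka-style objective.

```python
# Minimal sketch of elastic inference with a 2D Matryoshka-style encoder.
# Assumptions: an arbitrary Hugging Face encoder checkpoint; `sub_layer` and
# `sub_dim` are hypothetical knobs, and real effectiveness at reduced settings
# requires training the model so that shallow layers and truncated dimensions
# remain useful.
import torch
from transformers import AutoTokenizer, AutoModel

MODEL = "sentence-transformers/all-MiniLM-L6-v2"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModel.from_pretrained(MODEL, output_hidden_states=True).eval()


def encode(texts, sub_layer=4, sub_dim=128):
    """Embed texts using the hidden state after `sub_layer` transformer layers,
    truncated to the first `sub_dim` dimensions (effectiveness/efficiency knob)."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        out = model(**batch)
    # hidden_states[0] is the embedding layer; index `sub_layer` is the output
    # after that many transformer layers. Note: this still runs the full model;
    # a deployed elastic encoder would drop the upper layers to save compute.
    hidden = out.hidden_states[sub_layer]
    # Mean-pool over non-padding tokens, then truncate the dimensionality.
    mask = batch["attention_mask"].unsqueeze(-1).float()
    pooled = (hidden * mask).sum(1) / mask.sum(1)
    pooled = pooled[:, :sub_dim]
    return torch.nn.functional.normalize(pooled, dim=-1)


# Cheap setting for large-scale first-stage retrieval vs. a fuller setting for re-ranking.
fast = encode(["what is dense retrieval?"], sub_layer=2, sub_dim=64)
full = encode(["what is dense retrieval?"], sub_layer=6, sub_dim=384)
print(fast.shape, full.shape)
```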
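
The second sketch conveys the context-compression idea behind COCOM: instead of feeding the full retrieved passage as tokens, the passage is squeezed into a handful of "context embeddings" that are prepended to the generator's input, shortening the prompt. This is not the COCOM implementation; GPT-2 as the generator and chunked mean-pooling as the compressor are stand-in assumptions (a real system trains the compressor end-to-end), and `generate(inputs_embeds=...)` assumes a reasonably recent transformers version.

```python
# Minimal sketch of COCOM-style context compression for RAG (not the official
# implementation): a retrieved passage is compressed into a few "context
# embeddings" that replace its full token sequence in the generator's prompt.
# Assumptions: GPT-2 as the generator; chunked mean-pooling of token embeddings
# as a placeholder for the learned compression encoder.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()
embed = lm.get_input_embeddings()  # token-embedding table of the generator


def compress(passage: str, num_ctx_embeddings: int = 4) -> torch.Tensor:
    """Squeeze a passage into `num_ctx_embeddings` vectors by chunked mean-pooling
    of its token embeddings (placeholder for a trained compression model)."""
    ids = tok(passage, return_tensors="pt").input_ids[0]
    token_embs = embed(ids)                               # (seq_len, hidden)
    chunks = token_embs.chunk(num_ctx_embeddings, dim=0)
    return torch.stack([c.mean(dim=0) for c in chunks])   # (num_ctx, hidden)


def generate(question: str, passage: str, max_new_tokens: int = 20) -> str:
    """Prepend the compressed context embeddings to the question and generate."""
    ctx = compress(passage).unsqueeze(0)                   # (1, num_ctx, hidden)
    q_ids = tok(question, return_tensors="pt").input_ids
    q_embs = embed(q_ids)                                  # (1, q_len, hidden)
    inputs = torch.cat([ctx, q_embs], dim=1)
    with torch.no_grad():
        # With inputs_embeds, the returned ids cover only the generated continuation.
        out = lm.generate(inputs_embeds=inputs, max_new_tokens=max_new_tokens,
                          pad_token_id=tok.eos_token_id)
    return tok.decode(out[0], skip_special_tokens=True)


print(generate("Question: Who wrote Hamlet? Answer:",
               "Hamlet is a tragedy written by William Shakespeare around 1600."))
```

The design point is that the generator attends over a handful of dense vectors rather than hundreds of context tokens, which is what makes the RAG prompt shorter and generation cheaper.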