After RAG, already adopted by many companies, RIG offers an innovative approach to generating more precise and contextualized responses.
A new approach promises to make LLM answers more reliable. After the massive adoption in companies of RAG systems, which provide an LLM with context drawn from a document base, a smarter approach is emerging. Having largely flown under the media radar, RIG, for Retrieval Interleaved Generation, allows large language models to produce detailed, well-documented answers. Technical explanation, advantages and drawbacks… Here are the keys to this innovative approach.
RAG vs RIG: fundamentally different mechanics
RIG was popularized by Google researchers as part of work on reducing hallucinations in LLMs. Their study, published in September 2024, details in particular the advantages of RIG over RAG. The two approaches differ fundamentally in how they interact with knowledge bases. In a traditional RAG system, the process is linear and takes place in three distinct steps. First, the user's question is converted into a numerical vector by an embedding model. Then, this vector is used to search a vector database for the most similar document fragments. Finally, these fragments are provided as context to the LLM, which generates its response in a single pass.
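To make that linearity concrete, here is a minimal, self-contained sketch of such a pipeline. The corpus, the word-overlap retrieval and the llm_generate placeholder are illustrative stand-ins for a real embedding model, vector database and LLM call.

```python
# A minimal sketch of the linear RAG pipeline described above. Real systems
# use an embedding model and a vector database; here retrieval is approximated
# by word overlap and the LLM call is a placeholder.

DOCUMENTS = [
    "The unemployment rate in France in 2023 was around 7.3%.",
    "The Eiffel Tower was completed in 1889 for the World's Fair.",
]

def retrieve(question: str, k: int = 1) -> list[str]:
    """Stand-in for steps 1 and 2: embed the query, then search the index."""
    q_words = set(question.lower().split())
    scored = sorted(DOCUMENTS, key=lambda d: -len(q_words & set(d.lower().split())))
    return scored[:k]

def llm_generate(prompt: str) -> str:
    """Placeholder for a real LLM call."""
    return f"[model answers in a single pass from]\n{prompt}"

def answer_with_rag(question: str) -> str:
    context = "\n".join(retrieve(question))
    # Step 3: the fragments are injected as context, and the answer is
    # generated once, with no further access to the knowledge base.
    return llm_generate(f"Context:\n{context}\n\nQuestion: {question}\nAnswer:")

print(answer_with_rag("What was the unemployment rate in France in 2023?"))
```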
RIG takes a more interactive and iterative approach. Instead of relying on a single vector similarity search, the LLM is trained to formulate structured queries itself (in the database's language, e.g. SQL) throughout its response generation. Concretely, while the LLM is generating text and needs to cite a fact or a statistic, it pauses to issue a precise query to the database. For example, if it writes “The unemployment rate in France in 2023 was”, the model will automatically generate a structured query to obtain that exact figure, then weave the result back into its text.
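The sketch below illustrates this interleaving, under the assumption that the fine-tuned model emits an inline marker such as [QUERY: <sql>] whenever it needs a fact. The marker syntax and the toy model are ours for illustration, not Google's published implementation.

```python
import re
import sqlite3

# RIG-style interleaving: whenever the model's draft contains a query marker,
# generation is paused, the structured query is executed against the database,
# and the retrieved value is spliced back into the text.

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE stats (country TEXT, year INTEGER, unemployment REAL)")
db.execute("INSERT INTO stats VALUES ('France', 2023, 7.3)")

def toy_model_generate(prompt: str) -> str:
    """Stand-in for a fine-tuned LLM that interrupts itself with a query."""
    return ("The unemployment rate in France in 2023 was "
            "[QUERY: SELECT unemployment FROM stats "
            "WHERE country='France' AND year=2023]%.")

def rig_generate(prompt: str) -> str:
    draft = toy_model_generate(prompt)

    def run(match: re.Match) -> str:
        (value,) = db.execute(match.group(1)).fetchone()
        return str(value)

    return re.sub(r"\[QUERY: (.*?)\]", run, draft)

print(rig_generate("What was the unemployment rate in France in 2023?"))
# -> The unemployment rate in France in 2023 was 7.3%.
```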
Many advantages, complex deployment
Thanks to its agile architecture, RIG allows the LLM to generate more relevant answers: the model learns to identify the information it needs as it constructs its response. With RAG, by contrast, the model cannot interact directly with the database and must make do with the initial context. With RIG, for a complex question on a historical topic, the LLM might first look up the general context of the era, then specific events, and finally details about the actors involved, as in the sketch below. This iterative method yields better-documented answers than RAG.
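A rough sketch of that iterative behaviour, with a hypothetical planner standing in for the model's own decision of what it still needs to retrieve, and a toy dictionary standing in for the knowledge base:

```python
# Iterative retrieval: each round is informed by what has been gathered so
# far, rather than a single up-front search. Planner and data are stand-ins.

KNOWLEDGE_BASE = {
    "general context of the era": "Late 18th century, France, fiscal crisis.",
    "specific events": "Storming of the Bastille, 14 July 1789.",
    "actors involved": "Louis XVI, Lafayette, Robespierre.",
}

def plan_next_query(question: str, gathered: list[str]) -> str | None:
    """Stand-in for the model deciding what it still needs to know."""
    steps = ["general context of the era", "specific events", "actors involved"]
    return steps[len(gathered)] if len(gathered) < len(steps) else None

def answer_iteratively(question: str) -> str:
    gathered: list[str] = []
    while (query := plan_next_query(question, gathered)) is not None:
        gathered.append(KNOWLEDGE_BASE[query])  # one retrieval round
    return " ".join(gathered)                   # final, well-documented answer

print(answer_iteratively("Explain the French Revolution."))
```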
Although RIG is promising, implementing it in production remains complex. The model must first be fine-tuned so that it can formulate structured queries in the appropriate database language. Furthermore, RIG issues several queries to the database and can therefore incur a higher computational cost than RAG. Finally, querying the database multiple times may add slight latency to the response perceived by the end user.
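As an illustration of the fine-tuning requirement, the training data might pair prompts with completions annotated with the queries the model should learn to emit. This format is an assumption for illustration, consistent with the marker used in the sketch above, not a published specification.

```python
# Hypothetical RIG fine-tuning example: the completion teaches the model to
# interrupt itself with a structured query instead of guessing the figure.
training_example = {
    "prompt": "What was the unemployment rate in France in 2023?",
    "completion": (
        "The unemployment rate in France in 2023 was "
        "[QUERY: SELECT unemployment FROM stats "
        "WHERE country='France' AND year=2023] percent."
    ),
}
```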
RAG, RIG: well-defined use cases
Although RIG represents a promising new direction, RAG will likely remain the simplest solution for most general use cases where the user needs a concise, factual answer. RAG shines when the query is straightforward and can be answered from plain textual documentation.
For its part, RIG excels when queries are complex and require iterative interaction with a structured database. Against an SQL database, for example, RIG can construct a precise response by navigating between different layers of information.
For companies, testing and experimenting with RIG will be essential to assess its potential in targeted use cases such as specialized agents or systems requiring responses grounded in dynamic, complex data. For most general-purpose chatbots, however, RAG remains the most relevant solution.