Permission-Based Retrieval Augmented Generation (RAG)

November 15, 2024

Every organization needs to make knowledge easily accessible. At the same time, it is paramount that colleagues can only retrieve information which they are also allowed to access in the respective content sources (content repositories). The same principle must therefore usually hold true for retrieval augmented generation (RAG).

In this blog post we explain how we implement secure retrieval augmented generation within the RheinInsights Retrieval Suite. This approach takes the user and group permissions of the content source into account at search time, so that RAG does not expose any confidential information.

Search With Early Binding Security Trimming

Enterprise search normally uses early binding security trimming to return only those results which a user is allowed to see, that is, it takes the user's permissions into account. To do so, two different indexes are created: one for the indexed documents (document index) and one for the user-group relationships (principal index).

Document Index

The document index contains all content from the indexed repository, along with the document metadata. Part of this metadata are the access control lists (ACLs), stored in two fields: one for allow ACLs and one for deny ACLs. The allow ACLs must contain exactly the user and group tokens which have access to the indexed document, and the deny ACLs exactly the tokens which are explicitly denied access.
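To make the two ACL fields concrete, here is a minimal sketch of what one entry in such a document index could look like. The field names allow_acl and deny_acl follow the queries used in this post; the class IndexedDocument and the helper is_visible are hypothetical names for illustration, not part of the RheinInsights Retrieval Suite. The sketch assumes that a deny entry overrides an allow entry.

```python
from dataclasses import dataclass, field


@dataclass
class IndexedDocument:
    """Hypothetical shape of one document index entry."""
    doc_id: str
    body: str
    allow_acl: set[str] = field(default_factory=set)  # tokens with access
    deny_acl: set[str] = field(default_factory=set)   # tokens explicitly denied


def is_visible(doc: IndexedDocument, tokens: set[str]) -> bool:
    """Return True if any token is allowed and no token is denied."""
    if doc.deny_acl & tokens:
        return False  # assumption: deny wins over allow
    return bool(doc.allow_acl & tokens)


doc = IndexedDocument(
    doc_id="doc-1",
    body="Quarterly planning notes",
    allow_acl={"group1"},
    deny_acl={"intern@company.org"},
)
print(is_visible(doc, {"user@company.org", "group1"}))    # member of group1
print(is_visible(doc, {"intern@company.org", "group1"}))  # explicitly denied
```

In a real search engine this check is not executed in application code but expressed as a filter on the two fields, as shown in the query examples of this post.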

Principal Index

The principal index hosts the user-group relationships. Here, our connectors index the user id as key and all groups of the user as values. The connectors take care of computing the transitive closure, that is, of flattening group-to-group relationships. Thus, you can use this index as a key-value store which you can easily query at search time to get the user and group tokens for a user.
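The flattening step can be sketched as follows. This is an illustration under assumptions, not the connectors' actual code: the input is assumed to be a dictionary of direct memberships (user or group to the groups it is directly a member of), and the function names flatten_groups and build_principal_index are hypothetical.

```python
def flatten_groups(direct_memberships: dict[str, set[str]], principal: str) -> set[str]:
    """Collect all groups reachable from a principal via nested memberships."""
    seen: set[str] = set()
    stack = list(direct_memberships.get(principal, ()))
    while stack:
        group = stack.pop()
        if group in seen:
            continue  # already handled, also protects against cyclic nesting
        seen.add(group)
        stack.extend(direct_memberships.get(group, ()))
    return seen


def build_principal_index(
    users: list[str], direct_memberships: dict[str, set[str]]
) -> dict[str, list[str]]:
    """Key-value view: user id -> sorted list of all (flattened) groups."""
    return {user: sorted(flatten_groups(direct_memberships, user)) for user in users}


memberships = {
    "user@company.org": {"group1"},
    "group1": {"group2"},   # group1 is nested in group2
    "group2": {"group3"},
}
index = build_principal_index(["user@company.org"], memberships)
print(index["user@company.org"])
```

Because nested groups are resolved at indexing time, a single lookup by user id is enough at search time.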

Powerful Search with Security Trimming at Query Time - Enterprise Search

At query time, a user must be authenticated against the search application. Prior to issuing a request against the document index, the search application then fetches the groups of this user from the principal index. The original search and filter query of the user is then enriched with filters which act on the allow and deny ACL fields. For instance,

query = rheininsights

becomes

query = (rheininsights) AND (allow_acl:"user@company.org" OR allow_acl:"group1" ...) AND NOT (deny_acl:"user@company.org" OR deny_acl:"group1" ...)
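The enrichment shown above can be sketched as a small helper. The function names quote and security_trimmed_query are hypothetical, and the output uses the simplified query notation of this post; the concrete filter syntax differs between search engines.

```python
def quote(token: str) -> str:
    """Wrap a token in double quotes, escaping backslashes and quotes."""
    escaped = token.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def security_trimmed_query(user_query: str, user_id: str, groups: list[str]) -> str:
    """Enrich a user query with allow and deny ACL filters."""
    tokens = [user_id, *groups]
    allow = " OR ".join(f"allow_acl:{quote(t)}" for t in tokens)
    deny = " OR ".join(f"deny_acl:{quote(t)}" for t in tokens)
    return f"({user_query}) AND ({allow}) AND NOT ({deny})"


print(security_trimmed_query("rheininsights", "user@company.org", ["group1"]))
```

Escaping the tokens matters because user and group names come from the content sources and may contain characters which have a meaning in the query language.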

Please note that our RheinInsights Retrieval Suite offers enterprise search for all supported search engines. At the time of writing, these are Microsoft Search, Azure AI Search, Apache Solr and Elasticsearch.

Permission-based Vector Search

The same approach to early binding security trimming can easily be extended to vector search. All modern vector search engines, such as Azure AI Search, Apache Solr and Elasticsearch, support filtering at query time. This means that instead of a keyword search, you perform a vector search and apply the same allow and deny ACL filters as described above:

query = (v:[...]) AND (allow_acl:"user@company.org" OR allow_acl:"group1" ...) AND NOT (deny_acl:"user@company.org" OR deny_acl:"group1" ...)
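The effect of such a filtered vector query can be illustrated with a small in-memory sketch. It is not how a search engine implements it internally and the function secure_vector_search is a hypothetical name; the sketch assumes cosine similarity as ranking and documents given as dictionaries with the fields v, allow_acl and deny_acl.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors (0.0 if one of them is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def secure_vector_search(
    query_vector: list[float], documents: list[dict], tokens: set[str], top_k: int = 3
) -> list[dict]:
    """Rank only those documents which pass the allow and deny ACL filters."""
    visible = [
        d for d in documents
        if set(d["allow_acl"]) & tokens and not set(d["deny_acl"]) & tokens
    ]
    ranked = sorted(visible, key=lambda d: cosine(query_vector, d["v"]), reverse=True)
    return ranked[:top_k]


docs = [
    {"id": "d1", "v": [1.0, 0.0], "allow_acl": ["group1"], "deny_acl": []},
    {"id": "d2", "v": [0.9, 0.1], "allow_acl": ["group2"], "deny_acl": []},
    {"id": "d3", "v": [0.0, 1.0], "allow_acl": ["group1"], "deny_acl": []},
]
hits = secure_vector_search([1.0, 0.0], docs, {"user@company.org", "group1"})
print([d["id"] for d in hits])
```

The important point is the order of operations: documents which the user may not see never enter the ranking, no matter how similar their vectors are to the query.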

Secure Retrieval Augmented Generation for Enterprises

Now we only need to glue these pieces together to implement secure retrieval augmented generation in bot or Q&A scenarios. If the underlying search is security trimmed, the bot will also generate secure answers, provided that you use LLMs which were trained without sensitive data.

The RheinInsights Retrieval Suite offers secure retrieval augmented generation setups for all connectors and for Azure AI Search, Apache Solr and Elasticsearch. It builds up an enterprise search index and a secure vector search index. At query time, the vector search is executed for the authenticated user as described above, and the answers are generated from the search results in relation to the input query. As the search results are security trimmed, the answers will not expose sensitive knowledge to the user.
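How these pieces fit together can be sketched as follows. This is a simplified illustration and not the Retrieval Suite's implementation: answer_securely is a hypothetical function, and both search (a security-trimmed vector search returning text passages) and generate (a call to an LLM) are assumed to be supplied by the caller as plain callables.

```python
from typing import Callable


def answer_securely(
    question: str,
    user_id: str,
    principal_index: dict[str, list[str]],
    search: Callable[[str, set[str], int], list[str]],  # assumed: trimmed search
    generate: Callable[[str], str],                      # assumed: LLM call
    top_k: int = 3,
) -> str:
    """Retrieve only passages the user may see, then let the LLM answer."""
    # 1. Look up the user and group tokens in the principal index.
    tokens = {user_id, *principal_index.get(user_id, [])}
    # 2. Run the security-trimmed search with these tokens.
    passages = search(question, tokens, top_k)
    if not passages:
        return "No accessible documents were found for this question."
    # 3. Build the prompt exclusively from the trimmed results.
    context = "\n\n".join(passages)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return generate(prompt)
```

The LLM only ever receives passages which passed the ACL filters, so the permission check stays in the search layer and does not depend on the model.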

Please note that at the time of writing, the authors do not know of an approach with which an LLM (large language model, such as GPT) can be trained to perform fully reliable security trimming on the LLM side. Thus, the described approach is an easy way to avoid exposing sensitive information at query time in RAG scenarios.
