Commit 5685e23 (parent 430efc0): add user-facing-documentation for ragretriever

File changed: genai_docs/RagRetriever/USER_GUIDE.md (+196, -0 lines)
# RagRetriever User Guide

## Summary

RagRetriever is a powerful service that enables intelligent search and retrieval from knowledge graphs created by the GraphRAG Importer. It offers two distinct search methods:

- **Global Search**: Analyzes the entire document to identify themes and patterns, perfect for high-level insights and comprehensive summaries.
- **Local Search**: Focuses on specific entities and their relationships, ideal for detailed queries about particular concepts.

The service supports both private (Triton Inference Server) and public (OpenAI) LLM deployments, making it flexible for various security and infrastructure requirements. With simple HTTP endpoints, you can easily query your knowledge graph and get contextually relevant responses.

Key features:

- Dual search methods for different query types
- Support for both private and public LLM deployments
- Simple REST API interface
- Integration with ArangoDB knowledge graphs
- Configurable community hierarchy levels

## Overview

The RagRetriever service enables intelligent search and retrieval of information from your knowledge graph. It provides two powerful search methods - Global Search and Local Search - that leverage the structured knowledge graph created by the GraphRAG Importer to deliver accurate and contextually relevant responses to your natural language queries.

## Search Methods

### Global Search

Global Search is designed for queries that require understanding and aggregation of information across your entire document. It's particularly effective for questions about overall themes, patterns, or high-level insights in your data.

#### How Global Search Works

1. **Community-Based Analysis**: Uses pre-generated community reports from your knowledge graph to understand the overall structure and themes of your data
2. **Map-Reduce Processing**:
   - **Map Stage**: Processes community reports in parallel, generating intermediate responses with rated points
   - **Reduce Stage**: Aggregates the most important points to create a comprehensive final response
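The map-reduce flow above can be sketched in a few lines. This is a toy illustration only: the report structure, the keyword-overlap "rating", and the function names are hypothetical stand-ins, since the real service rates points with an LLM.

```python
# Toy sketch of Global Search's map-reduce flow. Data shapes and the
# scoring heuristic are hypothetical, not the service's internals.

def map_stage(reports, query):
    """Map: score each community report's key points against the query."""
    intermediate = []
    query_words = set(query.lower().split())
    for report in reports:
        for point in report["points"]:
            # The real service asks the LLM to rate relevance; here we
            # fake a rating using simple keyword overlap.
            score = len(query_words & set(point.lower().split()))
            intermediate.append((score, point))
    return intermediate

def reduce_stage(intermediate, top_k=3):
    """Reduce: keep the highest-rated points for the final answer."""
    ranked = sorted(intermediate, key=lambda sp: sp[0], reverse=True)
    return [point for score, point in ranked[:top_k] if score > 0]

reports = [
    {"community": "c1", "points": ["drones dominate the dataset", "sensors matter"]},
    {"community": "c2", "points": ["the AR3 drone is a recurring theme"]},
]
top_points = reduce_stage(map_stage(reports, "drone themes in the data"))
```

In the actual service, the reduce stage would hand these top-rated points to the LLM to compose the final comprehensive response.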
#### Best Use Cases

- "What are the main themes in the dataset?"
- "Summarize the key findings across all documents"
- "What are the most important concepts discussed?"

### Local Search

Local Search focuses on specific entities and their relationships within your knowledge graph. It's ideal for detailed queries about particular concepts, entities, or relationships.

#### How Local Search Works

1. **Entity Identification**: Identifies relevant entities from the knowledge graph based on the query
2. **Context Gathering**: Collects:
   - Related text chunks from original documents
   - Connected entities and their strongest relationships
   - Entity descriptions and attributes
   - Context from the community each entity belongs to
3. **Prioritized Response**: Generates a response using the most relevant gathered information
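The context-gathering step can be pictured with a small in-memory stand-in for the graph. The entity fields and weights below are hypothetical; the real service reads these structures from ArangoDB.

```python
# Toy sketch of Local Search context gathering. The entity record and
# its field names are hypothetical stand-ins for the ArangoDB graph.

entities = {
    "AR3 Drone": {
        "description": "A quadcopter platform",
        "community": "aviation",
        "chunks": ["The AR3 Drone carries a lidar sensor."],
        # (source, relation, target, strength)
        "relations": [
            ("AR3 Drone", "manufactured_by", "AeroCorp", 0.9),
            ("AR3 Drone", "uses", "lidar", 0.7),
        ],
    }
}

def gather_context(entity_name, max_relations=2):
    """Collect text chunks, strongest relations, description, and community."""
    e = entities[entity_name]
    # Keep only the strongest relationships, as step 2 describes.
    strongest = sorted(e["relations"], key=lambda r: r[3], reverse=True)
    return {
        "description": e["description"],
        "community": e["community"],
        "chunks": e["chunks"],
        "relations": strongest[:max_relations],
    }

ctx = gather_context("AR3 Drone")
```

The prioritized-response step would then pass this assembled context to the LLM along with the original query.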
#### Best Use Cases

- "What are the properties of [specific entity]?"
- "How is [entity A] related to [entity B]?"
- "What are the key details about [specific concept]?"

## How to Install the RagRetriever Service

The RagRetriever service can be configured to use either Triton Inference Server (for private LLM deployments) or OpenAI / OpenRouter (for public LLM deployments).

To start the service, use the GenAI service endpoint `/v1/graphragretriever`. Please refer to the GenAI service documentation for more information on how to use it.

The configuration options for all three providers are described below.

### Using Triton Inference Server (Private LLM)

First, set up and install the LLM-Host service with the LLM and embedding models of your choice. The setup uses Triton Inference Server and MLflow on the backend. Please refer to the documentation below for more detail:

// @docs-team please insert reference to GenAI/Triton documentation here

Once the LLM-Host service is installed and running, you can start the retriever service using the following configuration:

```json
{
  "env": {
    "username": "your_username",
    "db_name": "your_database_name",
    "api_provider": "triton",
    "triton_url": "your-arangodb-llm-host-url",
    "triton_model": "mistral-nemo-instruct"
  }
}
```

- `username`: ArangoDB database user with permissions to access collections
- `db_name`: Name of the ArangoDB database where the knowledge graph is stored
- `api_provider`: Specifies which LLM provider to use
- `triton_url`: URL of your Triton Inference Server instance. This should be the URL where your LLM-host service is running
- `triton_model`: Name of the LLM model to use for text processing
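To start the service with this configuration, you can POST it to the GenAI endpoint mentioned above. The sketch below uses only the standard library; the base URL and any authentication headers are deployment-specific assumptions, and `build_install_request` is a hypothetical helper name.

```python
import json
import urllib.request

def build_install_request(base_url, env):
    """Build the HTTP request that starts the retriever service."""
    body = json.dumps({"env": env}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/graphragretriever",  # GenAI service endpoint
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

env = {
    "username": "your_username",
    "db_name": "your_database_name",
    "api_provider": "triton",
    "triton_url": "your-arangodb-llm-host-url",
    "triton_model": "mistral-nemo-instruct",
}
req = build_install_request("https://your-genai-host", env)
# urllib.request.urlopen(req)  # uncomment to actually send the request
```

The same helper works for the OpenAI and OpenRouter configurations shown in the following sections; only the `env` dictionary changes.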

### Using OpenAI

```json
{
  "env": {
    "openai_api_key": "your_openai_api_key",
    "username": "your_username",
    "db_name": "your_database_name",
    "api_provider": "openai"
  }
}
```

- `username`: ArangoDB database user with permissions to access collections
- `db_name`: Name of the ArangoDB database where the knowledge graph is stored
- `api_provider`: Specifies which LLM provider to use
- `openai_api_key`: Your OpenAI API key

Note: By default for the OpenAI API, we use `gpt-4o-mini` and `text-embedding-3-small` as the LLM and embedding model, respectively.

### Using OpenRouter (Gemini, Anthropic, etc.)

OpenRouter makes it possible to connect to a wide range of LLM API providers, including non-OpenAI LLMs like Gemini Flash, Anthropic Claude, and publicly hosted open-source models.

When using the OpenRouter option, the LLM responses are served via OpenRouter, while OpenAI is used for the embedding model.

```json
{
  "env": {
    "db_name": "your_database_name",
    "username": "your_username",
    "api_provider": "openrouter",
    "openai_api_key": "your_openai_api_key",
    "openrouter_api_key": "your_openrouter_api_key",
    "openrouter_model": "mistralai/mistral-nemo"
  }
}
```

- `username`: ArangoDB database user with permissions to access collections
- `db_name`: Name of the ArangoDB database where the knowledge graph is stored
- `api_provider`: Specifies which LLM provider to use
- `openai_api_key`: Your OpenAI API key (for the embedding model)
- `openrouter_api_key`: Your OpenRouter API key (for the LLM)
- `openrouter_model`: Desired LLM (optional; default is `mistral-nemo`)

> **Note**
> When using OpenRouter, we default to `mistral-nemo` for generation (via OpenRouter) and `text-embedding-3-small` for embeddings (via OpenAI).

## Using the Retriever Service

### Executing Queries

After the retriever service is installed successfully, you can interact with it using the following HTTP endpoint (replace `<service-url>` with your deployment's base URL):

#### Local Search

```bash
curl -X POST https://<service-url>/v1/graphrag-query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the AR3 Drone?",
    "query_type": 2,
    "provider": 0
  }'
```

#### Global Search

```bash
curl -X POST https://<service-url>/v1/graphrag-query \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the AR3 Drone?",
    "level": 1,
    "query_type": 1,
    "provider": 0
  }'
```

The request parameters are:

- `query`: Your search query text
- `level`: The community hierarchy level to use for the search (1 for top-level communities); used by Global Search
- `query_type`: The type of search to perform
  - `1`: Global Search
  - `2`: Local Search
- `provider`: The LLM provider to use
  - `0`: OpenAI (or OpenRouter)
  - `1`: Triton
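The two request shapes above can also be built programmatically. A minimal sketch using only the standard library; the base URL is a placeholder and the helper names are illustrative, not part of the service API:

```python
import json
import urllib.request

# Symbolic names for the numeric request parameters described above.
GLOBAL_SEARCH, LOCAL_SEARCH = 1, 2
PROVIDER_OPENAI, PROVIDER_TRITON = 0, 1

def build_query(query, query_type, provider=PROVIDER_OPENAI, level=None):
    """Build the JSON body for /v1/graphrag-query."""
    body = {"query": query, "query_type": query_type, "provider": provider}
    if query_type == GLOBAL_SEARCH:
        # `level` selects the community hierarchy level (1 = top-level).
        body["level"] = 1 if level is None else level
    return body

def build_request(base_url, body):
    """Wrap the body in a POST request to the retriever endpoint."""
    return urllib.request.Request(
        url=f"{base_url}/v1/graphrag-query",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

local_body = build_query("What is the AR3 Drone?", LOCAL_SEARCH)
global_body = build_query("What are the main themes?", GLOBAL_SEARCH)
req = build_request("https://your-service-host", local_body)
# urllib.request.urlopen(req)  # uncomment to actually send the request
```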

### Health Check

Monitor the service health:

```bash
curl https://<service-url>/v1/health
```

## Best Practices

1. **Choose the Right Search Method**:
   - Use Global Search for broad, thematic queries
   - Use Local Search for specific entity or relationship queries

2. **Performance Considerations**:
   - Global Search may take longer due to its map-reduce process
   - Local Search is typically faster for concrete queries
