✨️(ai) add deep search #294
base_url: str = os.getenv("AI_BASE_URL", "https://albert.api.etalab.gouv.fr/v1")
api_key: str = os.getenv("AI_API_KEY", "")
model: str = os.getenv("AI_MODEL", "albert-large")
You should retrieve these values from the Django settings.
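A minimal sketch of that suggestion, assuming the three values are declared in the Django settings under the same names as the environment variables (`AI_BASE_URL`, `AI_API_KEY`, `AI_MODEL`). The `AIConfig` class and its `from_settings` helper are hypothetical names, not part of the PR. The helper accepts any settings-like object, so in the project it would presumably be called with `django.conf.settings`; a `SimpleNamespace` stands in for it here.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass(frozen=True)
class AIConfig:
    """Hypothetical container for the AI client configuration."""

    base_url: str
    api_key: str
    model: str

    @classmethod
    def from_settings(cls, settings) -> "AIConfig":
        # Read from a settings-like object (e.g. django.conf.settings)
        # instead of calling os.getenv where the class is defined.
        return cls(
            base_url=getattr(
                settings, "AI_BASE_URL", "https://albert.api.etalab.gouv.fr/v1"
            ),
            api_key=getattr(settings, "AI_API_KEY", ""),
            model=getattr(settings, "AI_MODEL", "albert-large"),
        )


# Stand-in for django.conf.settings, for illustration only.
fake_settings = SimpleNamespace(AI_API_KEY="secret", AI_MODEL="another-model")
config = AIConfig.from_settings(fake_settings)
```

Keeping the defaults in one place also makes it easier to override them in tests.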
return True

# Reindex if it's been more than 2 hours since last indexing (to handle any API issues)
if self.last_index_time and (current_time - self.last_index_time) > 7200:
You could declare the duration as a class attribute:
- if self.last_index_time and (current_time - self.last_index_time) > 7200:
+ if self.last_index_time and (current_time - self.last_index_time) > self.INDEX_STALE_TIME:
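For context, a small sketch of where such a constant could live. `INDEX_STALE_TIME` comes from the suggestion; the `EmailIndexer` class name and the `is_index_stale` method are hypothetical stand-ins for the code under review.

```python
import time
from typing import Optional


class EmailIndexer:
    """Hypothetical stand-in for the class under review."""

    # Two hours, in seconds; replaces the bare 7200 literal.
    INDEX_STALE_TIME = 2 * 60 * 60

    def __init__(self) -> None:
        self.last_index_time: Optional[float] = None

    def is_index_stale(self, current_time: Optional[float] = None) -> bool:
        if current_time is None:
            current_time = time.time()
        # Same condition as in the diff, using the named constant.
        return bool(
            self.last_index_time
            and (current_time - self.last_index_time) > self.INDEX_STALE_TIME
        )
```

Writing the value as `2 * 60 * 60` keeps the unit visible, and a subclass or a test can override the attribute without touching the condition.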
# Use default embedding model as fallback
self.embeddings_model = "embeddings-small"

def collection_exists(self, user_id: str) -> bool:
This method is duplicated.
"content": email.get('body', ''),
"metadata": {
    "subject": email.get('subject', ''),
    "sender": email.get('sender', ''),
It could be interesting to index the recipients too (to and cc), no?
Maybe also whether the email has attachments, as a simple boolean value?
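A sketch of what the extended document could look like. The `content`, `subject`, and `sender` fields mirror the diff; the `to`, `cc`, and `attachments` keys on the email dict, the `has_attachments` metadata field, and the `build_document` function name are assumptions for illustration.

```python
from typing import Any, Dict


def build_document(email: Dict[str, Any]) -> Dict[str, Any]:
    """Build the indexed document for one email (hypothetical helper)."""
    return {
        "content": email.get("body", ""),
        "metadata": {
            "subject": email.get("subject", ""),
            "sender": email.get("sender", ""),
            # Assumed keys: the PR does not show how recipients are stored.
            "to": list(email.get("to", [])),
            "cc": list(email.get("cc", [])),
            # Only a flag is indexed, not the attachments themselves.
            "has_attachments": bool(email.get("attachments")),
        },
    }
```

With these fields in the metadata, a query such as "the PDF Alice sent to the whole team" has more to match against than the subject and sender alone.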
# Get user ID from authenticated user
if not hasattr(request, 'user') or not request.user.is_authenticated:
    return Response({
        'success': False,
        'error': 'Authentication required'
    }, status=status.HTTP_401_UNAUTHORIZED)
If the user is not authenticated, the IsAuthenticated permission
should already have returned a 401 response.
user_id=user_id,
user_query=query,
api_client=chatbot.api_client,
max_results=10
Could we let the API consumer set this value through a query param? Or is it a cap for performance reasons?
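Both concerns can be combined: accept the value from the query string, but clamp it. A minimal sketch, assuming a `max_results` query parameter and an upper bound of 50; neither the parameter name nor the cap comes from the PR, and `parse_max_results` is a hypothetical helper. In a DRF view it would be called with `request.query_params`.

```python
from typing import Mapping

DEFAULT_MAX_RESULTS = 10  # value currently hard-coded in the PR
MAX_RESULTS_CAP = 50  # assumed upper bound, not from the PR


def parse_max_results(params: Mapping[str, str]) -> int:
    """Read max_results from query params, falling back to the default."""
    raw = params.get("max_results")
    if raw is None:
        return DEFAULT_MAX_RESULTS
    try:
        value = int(raw)
    except (TypeError, ValueError):
        # Ignore malformed input rather than failing the search.
        return DEFAULT_MAX_RESULTS
    # Clamp to [1, MAX_RESULTS_CAP] to protect performance.
    return max(1, min(value, MAX_RESULTS_CAP))
```

Whether malformed input should fall back silently or produce a 400 response is a design choice; the fallback here is only one option.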
logger = logging.getLogger(__name__)


@api_view(['POST'])
IMO this should be a GET endpoint, not a POST, since it does not create a resource; the query can then easily be read from a query param.
You tried to standardize the response format, but as written it will be really hard to maintain. IMO a DRF Serializer could help factor that out.
aeba690 to 9851db8
Purpose
Add AI-powered intelligent email search functionality in the search bar using the Albert API from Etalab to enable semantic search capabilities beyond traditional keyword matching.
Proposal
This PR introduces a comprehensive deep search system that allows users to find relevant emails using natural language queries, using two different methods.
Key Features: