Welcome to the AI engineering challenge for the Jedi Team at GWI!
This task is designed to help us understand how you approach applied‐AI problems and build intelligent, production‑ready systems. The Jedi team owns and evolves the company’s AI infrastructure, so the exercise focuses on agentic patterns, tool orchestration, and reasoning quality. Creativity, thoughtful design, and clean code are all appreciated.
GWI’s market‑research data has been converted into natural‑language statements and stored in a relational database. A sample export lives in data.md.
Your mission is to implement an agentic chatbot that helps clients answer questions using that data—and, when needed, additional tools.
Your solution should:

- Retrieve answers from internal data (see the retrieval sketch below).
  - Exact match or similarity search is acceptable.
  - Cite the data fragment you used.
- Plan a course of action when the answer is not immediately available (see the agent-loop sketch below).
  - E.g. “I will check the database; if that fails, I’ll run a web search.”
- Call at least two external tools (choose any two of the examples below, or invent your own):
  - Web‑search API.
  - Vector‑database or RAG lookup.
  - Classifier to ensure the quality of the answer.
- Evaluate and score agent responses (see the evaluation sketch below).
  - Implement an evaluation process to assess the quality and relevance of the agent's answers.
  - Use the evaluation results to iteratively improve the agent's reasoning and tool selection.
- Finetune the agent based on feedback.
  - Incorporate user feedback and evaluation outcomes to continuously finetune the agent's models or decision logic.
  - This may involve retraining components, updating prompts, or refining tool-selection strategies to enhance performance over time.
- Persist conversations so a user can resume from where they left off (see the persistence sketch below).
  - Support multiple concurrent chats per user.
- Expose an HTTP interface (REST, SSE, or WebSocket) that allows a UI to stream the agent’s intermediate reasoning (“thoughts”) and final answers (see the SSE sketch below).
 
You may implement the service in Go 1.22+ or Python 3.11+. Mixing in other languages or frameworks for tooling is fine if it helps.
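The sketches that follow are in Python; equivalent Go would be just as acceptable. First, a minimal retrieval sketch for the internal-data requirement: it parses statements out of data.md (assuming roughly one statement per line or bullet, which may not match the real export) and ranks them with a simple Jaccard token overlap. The example question is made up, and an embedding model or vector database could replace the lexical scorer for genuine similarity search.

```python
"""Retrieval sketch: rank statements from data.md against a question and
return the best matches as citations. Jaccard token overlap stands in for a
real embedding or vector-DB similarity search."""
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Citation:
    statement: str  # the data fragment the answer is based on
    score: float    # similarity between the question and the statement


def _tokens(text: str) -> set[str]:
    return {t.strip(".,;:!?\"'").lower() for t in text.split()}


def load_statements(path: str = "data.md") -> list[str]:
    # Keep non-empty lines, dropping any leading markdown bullet markers.
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    return [line.lstrip("-* ").strip() for line in lines if line.strip()]


def retrieve(question: str, statements: list[str], k: int = 3) -> list[Citation]:
    q = _tokens(question)
    scored = [
        Citation(s, len(q & _tokens(s)) / max(len(q | _tokens(s)), 1))
        for s in statements
    ]
    return sorted(scored, key=lambda c: c.score, reverse=True)[:k]


if __name__ == "__main__":
    for hit in retrieve("Which social platform do Gen Z respondents use most?",
                        load_statements()):
        print(f"{hit.score:.2f}  {hit.statement}")
```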
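Next, a sketch of the plan-then-act loop behind the planning and tool-calling requirements. The tool callables (retrieve_internal, web_search, quality_score) are placeholders rather than a prescribed API; any two real external tools can be injected.

```python
"""Plan-then-act sketch: internal data first, web search as the fallback,
and a quality check before replying. Tool implementations are injected so
any two real external tools can be plugged in."""
from typing import Callable


def answer(
    question: str,
    retrieve_internal: Callable[[str], str | None],
    web_search: Callable[[str], str],
    quality_score: Callable[[str, str], float],
    threshold: float = 0.5,
) -> dict:
    thoughts = ["Plan: check the internal database; if that fails, run a web search; "
                "score the draft before answering."]

    draft = retrieve_internal(question)
    source = "internal-data"
    if not draft:
        thoughts.append("No internal match found; falling back to web search.")
        draft = web_search(question)
        source = "web-search"

    score = quality_score(question, draft)
    thoughts.append(f"Draft quality score {score:.2f} (threshold {threshold}).")
    if score < threshold:
        draft = "I could not find a confident answer to that question."

    return {"answer": draft, "source": source, "score": score, "thoughts": thoughts}
```

Wiring in, say, the retrieve function above plus any web-search client and classifier gives a complete (if naive) loop; the thoughts list is what the streaming endpoint later replays to the UI.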
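For the evaluation requirement, one low-cost starting point is to replay a small labelled question set through the agent and check whether the expected data fragment surfaces in the answer. The dataset shape below is an assumption; an LLM-as-judge or a trained relevance classifier can replace the substring check for a softer score.

```python
"""Evaluation harness sketch: replay labelled questions through the agent and
report how often the expected fragment is surfaced. Per-case results can feed
prompt or tool-selection tweaks."""
from statistics import mean
from typing import Callable


def evaluate(agent: Callable[[str], dict], dataset: list[dict]) -> dict:
    """Each dataset item looks like {"question": ..., "expected_fragment": ...}."""
    cases = []
    for item in dataset:
        result = agent(item["question"])
        hit = item["expected_fragment"].lower() in result["answer"].lower()
        cases.append({"question": item["question"], "expected_found": hit})
    return {"accuracy": mean(c["expected_found"] for c in cases), "cases": cases}
```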
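Persistence does not need anything heavier than the standard library to begin with. The sketch below uses sqlite3 with illustrative (not prescribed) table names; it keeps one row per chat and one per message, which also gives per-message thumbs-up / thumbs-down feedback somewhere to live and lets any chat be resumed.

```python
"""Persistence sketch with stdlib sqlite3: several chats per user, messages
(and per-message feedback) per chat, so any chat can be resumed later."""
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS chats (
    id         INTEGER PRIMARY KEY,
    user_id    TEXT NOT NULL,
    title      TEXT NOT NULL,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS messages (
    id         INTEGER PRIMARY KEY,
    chat_id    INTEGER NOT NULL REFERENCES chats(id),
    role       TEXT NOT NULL,      -- "user" | "assistant" | "thought"
    content    TEXT NOT NULL,
    feedback   INTEGER,            -- 1 = thumbs-up, -1 = thumbs-down, NULL = none
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
"""


def connect(path: str = "chats.db") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def resume(conn: sqlite3.Connection, chat_id: int) -> list[tuple[str, str]]:
    """Return (role, content) pairs in order so the agent can rebuild its context."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE chat_id = ? ORDER BY id",
        (chat_id,),
    )
    return list(rows)
```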
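Finally, for the HTTP interface, SSE keeps streaming thoughts simple. The sketch assumes FastAPI, which is only one of several valid choices (a WebSocket or chunked REST response satisfies the requirement equally well), and run_agent stands in for the real agent loop.

```python
"""SSE sketch (FastAPI assumed): each intermediate thought is sent as its own
event, followed by the final answer."""
import asyncio
import json

from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()


async def run_agent(question: str):
    # Placeholder for the real agent loop: yield thoughts, then the answer.
    for thought in ("Checking internal data...", "No match, running web search..."):
        yield {"type": "thought", "text": thought}
        await asyncio.sleep(0)  # let the event loop flush each event
    yield {"type": "answer", "text": f"Final answer to: {question}"}


@app.get("/chats/{chat_id}/ask")
async def ask(chat_id: str, q: str):
    # chat_id would be used to load persisted history before running the agent.
    async def events():
        async for item in run_agent(q):
            # SSE framing: each event is "data: <json>" followed by a blank line.
            yield f"data: {json.dumps(item)}\n\n"
    return StreamingResponse(events(), media_type="text/event-stream")
```

A browser EventSource or `curl -N` against the endpoint will show the thoughts arriving one event at a time before the final answer.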
In addition, the following will strengthen your submission:

- Auto‑generate a concise, human‑readable title for every new chat.
- Allow users to give thumbs‑up / thumbs‑down feedback on any agent message.
- Log tool invocations and surface simple analytics (e.g., % of questions resolved without web search).
- Containerisation (Docker) or a dev script (`make`, `task`, `invoke`, etc.) for one‑command setup.
- Deployment notes for a cloud environment of your choice.
- AI‑generated simple UI (100% optional).
 
Provide a short guide (README section or shell script) that covers:
- Startup – how to launch the service (and any supporting services such as a vector DB).
- Configuration – environment variables, API keys, or model endpoints needed.
- Testing – how to run unit and integration tests.
- Evaluation – how to evaluate the performance of the agent.
- Assumptions – anything non‑obvious you decided (e.g., which LLM provider you picked, why a specific vector DB).
 
Fork this repository, commit your solution to your fork, and send it back to us.
Good luck, potential colleague—may the (reinforcement) learning be with you!