31 changes: 29 additions & 2 deletions .env.example
@@ -46,10 +46,24 @@ MAX_WORKERS=30
# API Keys and External Services
# =============================================================================

# Serper API for web search and Google Scholar
# Web Search Providers (in order of quality/preference)
# The system will try each provider in order until one succeeds.
# You only need ONE provider configured, but having multiple provides fallback.

# Exa.ai - Best semantic/neural search ($10 free credits)
# Get your key from: https://exa.ai/
EXA_API_KEY=your_key

# Tavily - Purpose-built for RAG/LLMs (1,000 free requests/month)
# Get your key from: https://tavily.com/
TAVILY_API_KEY=your_key

# Serper API for Google search results (2,500 free queries)
# Get your key from: https://serper.dev/
SERPER_KEY_ID=your_key

# DuckDuckGo is always available as final fallback (FREE, no API key needed)

# Jina API for web page reading
# Get your key from: https://jina.ai/
JINA_API_KEYS=your_key
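The provider fallback described in the comments above (Exa → Tavily → Serper → DuckDuckGo) can be sketched as follows. This is a minimal illustration, not the repo's actual code; `search_with_fallback` and the per-provider search functions are hypothetical names.

```python
import os

def search_with_fallback(query, providers):
    """Try each (name, env_var, search_fn) provider in order; first success wins.

    env_var is None for keyless providers such as DuckDuckGo.
    """
    for name, env_var, search_fn in providers:
        # Skip providers whose API key is not configured.
        if env_var and not os.getenv(env_var):
            continue
        try:
            return name, search_fn(query)
        except Exception:
            continue  # provider failed; fall through to the next one
    raise RuntimeError("all search providers failed")
```

With this shape, the provider list mirrors the order above, e.g. `[("exa", "EXA_API_KEY", exa_search), ("tavily", "TAVILY_API_KEY", tavily_search), ("serper", "SERPER_KEY_ID", serper_search), ("duckduckgo", None, ddg_search)]`, so DuckDuckGo always remains as the final keyless fallback.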
@@ -95,4 +109,17 @@ IDP_KEY_SECRET=your_idp_key_secret

# These are typically set by distributed training frameworks
# WORLD_SIZE=1
# RANK=0

# =============================================================================
# llama.cpp Local Inference (Alternative for Mac/Local Users)
# =============================================================================
# If using the llama.cpp local inference option instead of vLLM:

# The llama.cpp server URL (default works if using start_llama_server.sh)
LLAMA_SERVER_URL=http://127.0.0.1:8080

# For llama.cpp mode:
# - Web search uses DuckDuckGo by default (FREE, no API key needed)
# - JINA_API_KEYS is optional but recommended for better page reading
# - See: python inference/interactive_llamacpp.py --help
49 changes: 49 additions & 0 deletions README.md
@@ -179,6 +179,55 @@ You need to modify the following in the file [inference/react_agent.py](https://
- Change the model name to alibaba/tongyi-deepresearch-30b-a3b.
- Adjust the way content is concatenated, as described in the comments on lines **88–90**.


---

### 7. Local Inference with llama.cpp (Optional)

> **For Mac users or anyone who wants 100% local inference without vLLM/CUDA dependencies.**

This repo includes support for running DeepResearch locally using [llama.cpp](https://github.com/ggerganov/llama.cpp) with Metal (Apple Silicon) or CUDA acceleration. Zero API costs, full privacy.

#### Requirements

- llama.cpp built with Metal or CUDA support
- GGUF model: [bartowski/Alibaba-NLP_Tongyi-DeepResearch-30B-A3B-GGUF](https://huggingface.co/bartowski/Alibaba-NLP_Tongyi-DeepResearch-30B-A3B-GGUF)
- 32GB+ RAM (for Q4_K_M quantization)

#### Quick Start

```bash
# Install minimal dependencies
pip install -r requirements-local.txt

# Build llama.cpp (Mac with Metal)
cd llama.cpp
cmake -B build -DLLAMA_METAL=ON -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release
cd ..

# Download model (~18GB)
mkdir -p models/gguf
curl -L -o models/gguf/Alibaba-NLP_Tongyi-DeepResearch-30B-A3B-Q4_K_M.gguf \
'https://huggingface.co/bartowski/Alibaba-NLP_Tongyi-DeepResearch-30B-A3B-GGUF/resolve/main/Alibaba-NLP_Tongyi-DeepResearch-30B-A3B-Q4_K_M.gguf'

# Terminal 1: Start the server
./scripts/start_llama_server.sh

# Terminal 2: Run research queries
python inference/interactive_llamacpp.py
```

The llama.cpp server provides both an API and a web UI at http://localhost:8080.
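The llama.cpp server speaks the OpenAI-compatible `/v1/chat/completions` protocol, so a minimal client can be sketched with only the standard library. This is an illustrative sketch, not part of the repo; the `model` value is a placeholder (llama.cpp serves whatever model it was started with).

```python
import json
import urllib.request

def chat_request(base_url, prompt, temperature=0.7):
    """Build an OpenAI-style chat completion request for the llama.cpp server."""
    payload = {
        "model": "local",  # placeholder; llama.cpp ignores it
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

# Sending it requires the server from Terminal 1 to be running:
# resp = urllib.request.urlopen(chat_request("http://localhost:8080", "hi"))
# print(json.loads(resp.read())["choices"][0]["message"]["content"])
```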

#### Features

- **Free web search**: Uses DuckDuckGo (no API key required)
- **Page visiting**: Uses Jina Reader (optional API key for better results)
- **Loop detection**: Prevents infinite tool call cycles
- **32K context**: Long research sessions supported
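The loop-detection feature above could work along these lines; a sketch under the assumption that a "loop" means the same tool called with the same arguments several times in a row (the repo's actual heuristic may differ):

```python
from collections import deque

class LoopDetector:
    """Flags when the same tool call repeats max_repeats times in a row."""

    def __init__(self, max_repeats=3):
        self.max_repeats = max_repeats
        self.recent = deque(maxlen=max_repeats)

    def check(self, tool_name, args):
        """Record a call; return True once it has repeated max_repeats times."""
        self.recent.append((tool_name, repr(args)))
        return (len(self.recent) == self.max_repeats
                and len(set(self.recent)) == 1)
```

When `check` returns True, the agent can break the cycle, e.g. by injecting a message telling the model to try a different tool or query.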

---
## Benchmark Evaluation

We provide benchmark evaluation scripts for various datasets. Please refer to the [evaluation scripts](./evaluation/) directory for more details.