---
alwaysApply: false
description: "Screenpipe SQLite: schema, search patterns, and best-practice queries for 24/7 screen + mic (OCR/STT) data"
---

### Screenpipe SQLite database: how to use it

This rule explains what the Screenpipe database contains and exactly how to query it to serve the user. The DB is a 24/7 capture of screen and microphone activity with OCR and speech-to-text, optimized for full‑text search across modalities.

- **Location**: `$HOME/.screenpipe/db.sqlite`
- **Core idea**: Use FTS5 tables to search text quickly, then JOIN to base tables for context (timestamps, app/window, file paths). Filter by time first for speed.
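
Quick sanity check before running anything heavier (assumes the `sqlite3` CLI is installed and the DB sits at the default path above):

```bash
# Sanity check: open the DB read-only and count frames captured in the last hour.
sqlite3 -readonly "$HOME/.screenpipe/db.sqlite" \
  "SELECT COUNT(*) AS frames_last_hour
   FROM frames
   WHERE julianday(timestamp) >= julianday('now','-1 hour');"
```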

### High-level schema map

- **Vision (screen video & frames)**
  - `video_chunks(id, file_path, device_name)`
  - `frames(id, video_chunk_id, offset_index, timestamp, name, device_name, browser_url, app_name, window_name, focused)`
  - FTS + triggers: `frames_fts(id, name, browser_url, app_name, window_name, focused)`
  - Tags: `vision_tags(vision_id=frames.id, tag_id)`

- **OCR (text read from frames)**
  - `ocr_text(frame_id, text, text_json, app_name, ocr_engine, window_name, focused, text_length)`
  - FTS + triggers: `ocr_text_fts(text, app_name, window_name, frame_id)`

- **Audio (microphone capture and transcripts)**
  - `audio_chunks(id, file_path, timestamp)`
  - `audio_transcriptions(id, audio_chunk_id, offset_index, timestamp, transcription, device, is_input_device, speaker_id, transcription_engine, start_time, end_time)`
  - FTS + triggers: `audio_transcriptions_fts(transcription, device, audio_chunk_id, speaker_id, start_time, end_time)`
  - Speakers & embeddings: `speakers(id, name, metadata, hallucination)`; `speaker_embeddings(id, embedding, speaker_id)`
  - Tags: `audio_tags(audio_chunk_id, tag_id)`

- **UI logs (optional)**
  - `ui_monitoring(id, text_output, timestamp, app, window, initial_traversal_at, text_length)`
  - FTS + triggers: `ui_monitoring_fts(text_output, app, window, ui_id)`
  - Tags: `ui_monitoring_tags(ui_monitoring_id, tag_id)`

- **Tags catalog**
  - `tags(id, name, created_at)`

- **Chunked text (optional internal index)**
  - `chunked_text_index(text_id, text)` and related `*_fts` tables
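
Column names can drift between Screenpipe versions; when in doubt, treat the live schema as authoritative and inspect it directly:

```sql
-- Inspect the live schema (the list above is a summary, not a contract).
PRAGMA table_info(frames);
SELECT name, sql FROM sqlite_master
WHERE name IN ('ocr_text', 'ocr_text_fts', 'audio_transcriptions');
```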

Notes:
- Do not write directly to any `*_fts*` table; triggers keep FTS in sync with the base tables.
- Timestamps are stored as ISO‑8601 strings; prefer `julianday()` for time filtering.
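
For example, a relative and an absolute time window written with `julianday()` (the dates are placeholders):

```sql
-- Relative window: last 24 hours.
SELECT COUNT(*) FROM frames
WHERE julianday(timestamp) >= julianday('now','-1 day');

-- Absolute window: a specific UTC day (dates are illustrative).
SELECT COUNT(*) FROM frames
WHERE julianday(timestamp) >= julianday('2024-01-01')
  AND julianday(timestamp) <  julianday('2024-01-02');
```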

### Querying principles (fast + robust)

1) Filter by time first, then text search, then JOIN context.
   - Use `julianday(timestamp) >= julianday('now','-N hours')`.
2) Use FTS5 (`MATCH`) for text search; avoid `LIKE '%…%'` scans on large text columns.
3) JOIN back to base tables for metadata (app/window/url, file paths, speakers).
4) Cap result size with `LIMIT`; rely on indexes (`timestamp`, `app_name`, `window_name`, `browser_url`, `text_length`).
5) Never update FTS tables directly; insert/update the base table instead.

### Recipes (drop-in queries)

- Recent screen activity (frames) in last N hours

```sql
SELECT timestamp, app_name, window_name, browser_url
FROM frames
WHERE julianday(timestamp) >= julianday('now','-3 hours')
ORDER BY timestamp DESC
LIMIT 200;
```

- Search OCR text in last N hours (with screen context)

```sql
SELECT f.timestamp, f.app_name, f.window_name,
       substr(replace(o.text, char(10), ' '),1,200) AS text_sample
FROM ocr_text_fts
JOIN ocr_text o ON o.frame_id = ocr_text_fts.frame_id
JOIN frames f ON f.id = o.frame_id
WHERE julianday(f.timestamp) >= julianday('now','-3 hours')
  AND ocr_text_fts MATCH 'NEAR(invoice total, 3)'
ORDER BY f.timestamp DESC
LIMIT 50;
```

- Search audio transcripts in last N hours

```sql
SELECT t.timestamp,
       substr(replace(t.transcription, char(10), ' '),1,200) AS line,
       t.speaker_id, t.device
FROM audio_transcriptions_fts
JOIN audio_transcriptions t ON t.audio_chunk_id = audio_transcriptions_fts.audio_chunk_id
WHERE julianday(t.timestamp) >= julianday('now','-3 hours')
  AND audio_transcriptions_fts MATCH 'plan OR decision OR blocker'
ORDER BY t.timestamp DESC
LIMIT 50;
```

- Join OCR to source video clip path

```sql
SELECT f.timestamp, v.file_path, f.offset_index,
       substr(replace(o.text, char(10), ' '),1,200) AS text_sample
FROM ocr_text o
JOIN frames f ON f.id = o.frame_id
JOIN video_chunks v ON v.id = f.video_chunk_id
WHERE julianday(f.timestamp) >= julianday('now','-1 day')
ORDER BY f.timestamp DESC
LIMIT 50;
```

- Top apps/windows/urls (last N hours)

```sql
-- Apps
SELECT app_name, COUNT(*) AS n
FROM frames
WHERE app_name IS NOT NULL AND app_name <> ''
  AND julianday(timestamp) >= julianday('now','-3 hours')
GROUP BY app_name ORDER BY n DESC LIMIT 10;

-- Windows
SELECT window_name, COUNT(*) AS n
FROM frames
WHERE window_name IS NOT NULL AND window_name <> ''
  AND julianday(timestamp) >= julianday('now','-3 hours')
GROUP BY window_name ORDER BY n DESC LIMIT 10;

-- URLs
SELECT browser_url, COUNT(*) AS n
FROM frames
WHERE browser_url IS NOT NULL AND browser_url <> ''
  AND julianday(timestamp) >= julianday('now','-3 hours')
GROUP BY browser_url ORDER BY n DESC LIMIT 10;
```

- Vision/UI/Audio tags in last N hours

```sql
-- Vision tags (on frames)
SELECT t.name, COUNT(*) AS n
FROM vision_tags vt
JOIN tags t ON t.id = vt.tag_id
JOIN frames f ON f.id = vt.vision_id
WHERE julianday(f.timestamp) >= julianday('now','-3 hours')
GROUP BY t.name ORDER BY n DESC LIMIT 10;

-- UI tags
SELECT t.name, COUNT(*) AS n
FROM ui_monitoring_tags ut
JOIN tags t ON t.id = ut.tag_id
JOIN ui_monitoring u ON u.id = ut.ui_monitoring_id
WHERE julianday(u.timestamp) >= julianday('now','-3 hours')
GROUP BY t.name ORDER BY n DESC LIMIT 10;

-- Audio tags (by audio chunk; audio_chunks has its own timestamp, so no join
-- through audio_transcriptions is needed and counts are not inflated per transcript row)
SELECT t.name, COUNT(*) AS n
FROM audio_chunks ac
JOIN audio_tags ag ON ag.audio_chunk_id = ac.id
JOIN tags t ON t.id = ag.tag_id
WHERE julianday(ac.timestamp) >= julianday('now','-3 hours')
GROUP BY t.name ORDER BY n DESC LIMIT 10;
```

- Rank OCR hits and show snippets

```sql
SELECT f.timestamp,
       snippet(ocr_text_fts, 0, '[', ']', '…', 8) AS snippet,
       bm25(ocr_text_fts) AS rank,  -- lower (more negative) = better match
       f.app_name, f.window_name
FROM ocr_text_fts
JOIN ocr_text o ON o.frame_id = ocr_text_fts.frame_id
JOIN frames f ON f.id = o.frame_id
WHERE ocr_text_fts MATCH 'NEAR(invoice total, 3)'
  AND julianday(f.timestamp) >= julianday('now','-1 day')
ORDER BY rank
LIMIT 20;
```

### Practical patterns for agents

- Recent context: pull the last 1–3 hours from `frames` and `ocr_text` to infer what the user is reading/doing; correlate with `audio_transcriptions` to detect meetings/conversations.
- Cross-modal search: start with FTS (`*_fts MATCH 'query'`), then JOIN to frames/audio for timestamps and to `video_chunks`/`audio_chunks` for the file paths needed to retrieve the exact clip.
- Summarize a session (see the first sketch after this list):
  1) Get active windows/apps via `frames` in the time window
  2) Extract `ocr_text` samples for high `text_length` rows
  3) Extract top transcript lines and speakers
  4) Produce a bullet summary of tasks, decisions, and links (from `browser_url`)
- Lightweight counts: use `EXISTS`/`LIMIT` for presence checks (second sketch below); avoid `COUNT(*)` over giant ranges unless filtered by time.
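
A sketch of steps 2–3 of the session summary; the 3-hour window and the 500-character `text_length` cutoff are illustrative choices, not Screenpipe defaults:

```sql
-- Step 2: densest OCR rows in the window (text_length cutoff is illustrative).
SELECT f.timestamp, f.app_name, f.window_name,
       substr(replace(o.text, char(10), ' '),1,300) AS text_sample
FROM ocr_text o
JOIN frames f ON f.id = o.frame_id
WHERE julianday(f.timestamp) >= julianday('now','-3 hours')
  AND o.text_length > 500
ORDER BY o.text_length DESC
LIMIT 20;

-- Step 3: transcript lines with speaker names where known.
SELECT t.timestamp,
       COALESCE(s.name, 'speaker ' || t.speaker_id, 'unknown') AS speaker,
       substr(replace(t.transcription, char(10), ' '),1,200) AS line
FROM audio_transcriptions t
LEFT JOIN speakers s ON s.id = t.speaker_id
WHERE julianday(t.timestamp) >= julianday('now','-3 hours')
ORDER BY t.timestamp DESC
LIMIT 50;
```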
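
And a cheap presence check for the last bullet (returns 0 or 1):

```sql
-- Presence check: was any OCR text captured in the last 10 minutes?
SELECT EXISTS (
  SELECT 1
  FROM frames f
  JOIN ocr_text o ON o.frame_id = f.id
  WHERE julianday(f.timestamp) >= julianday('now','-10 minutes')
) AS has_recent_ocr;
```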

### Safety & constraints

- The DB contains raw on-screen text and mic transcripts; treat it as sensitive. Do not export large dumps by default.
- Writes should only target base tables; FTS tables are maintained by triggers.
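
If the user explicitly asks for an export, keep it small, time-bounded, and read-only; a sketch (the output path and row cap are arbitrary):

```bash
# Bounded, read-only CSV export — only when explicitly requested.
# The /tmp path and 500-row cap are illustrative choices, not defaults.
sqlite3 -readonly -csv "$HOME/.screenpipe/db.sqlite" \
  "SELECT timestamp, app_name, window_name
   FROM frames
   WHERE julianday(timestamp) >= julianday('now','-1 hour')
   ORDER BY timestamp DESC
   LIMIT 500;" > /tmp/screenpipe_frames_last_hour.csv
```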

### Quick CLI snippets

```bash
sqlite3 -readonly "$HOME/.screenpipe/db.sqlite" \
  "SELECT type,name FROM sqlite_master WHERE name NOT LIKE 'sqlite_%' ORDER BY type,name;"

sqlite3 -readonly "$HOME/.screenpipe/db.sqlite" <<'SQL'
.headers on
.mode table
SELECT f.timestamp, f.app_name, f.window_name, substr(replace(o.text, char(10),' '),1,160) AS text
FROM ocr_text_fts
JOIN ocr_text o ON o.frame_id=ocr_text_fts.frame_id
JOIN frames f ON f.id=o.frame_id
WHERE julianday(f.timestamp) >= julianday('now','-3 hours') AND ocr_text_fts MATCH 'error OR crash'
ORDER BY f.timestamp DESC
LIMIT 30;
SQL
```