screenpipe-api

star 19.3k

Query the user's data via the local screenpipe REST API at localhost:3030 — screen recordings, audio, UI elements, usage analytics, and the user's persistent memory store. Use when the user asks about their screen activity, meetings, apps, productivity, media export, retranscription, connected services, OR when they ask to save / remember / store / note information so it can be retrieved later (POST /memories — survives across sessions and is queryable by Claude/external agents via the same API).

screenpipe By screenpipe schedule Updated 6/15/2026

name: screenpipe-api description: Query the user's data via the local screenpipe REST API at localhost:3030 — screen recordings, audio, UI elements, usage analytics, meetings, connected services, and the user's persistent memory store. Use for questions about screen activity, meetings, apps, productivity, media export, retranscription, connections, OR to save / remember / store information for later (POST /memories — survives across sessions, queryable by external agents).

Screenpipe API

Local REST API at http://localhost:3030.

Authentication

Every request needs auth. $SCREENPIPE_LOCAL_API_KEY is already in your env; without it you get 403.

curl -H "Authorization: Bearer $SCREENPIPE_LOCAL_API_KEY" "http://localhost:3030/..."

No-auth endpoints: /health, /ws/health, /audio/device/status, /connections/oauth/callback, /frames/*, /notify, /pipes/store/*.

Context Window Protection

Responses can be large. Write curl output to a file (-o /tmp/sp.json), check size (wc -c), and if over ~5KB read only the first 50-100 lines / extract with jq. Never dump full large responses into context.

Cut tokens at the source on list endpoints (/search, /elements): add &format=csv (or tsv) for a columnar table (column names written once instead of per-row keys — ~70% cheaper on uniform rows like elements), and &fields=a,b,c for only the columns you need (dotted paths like content.text). Text-heavy ocr/audio barely benefit — use fields + max_content_length there instead.


1. Activity Summary — GET /activity-summary

Default broad-context call. Bundles apps, windows, key_texts, audio, edited_files, recording health, top memories, deduped screen+audio snippets, and a data_status/query_status/guidance triple.

curl -H "Authorization: Bearer $SCREENPIPE_LOCAL_API_KEY" "http://localhost:3030/activity-summary?start_time=30m%20ago&end_time=now"

Required: start_time, end_time. Optional: app_name, q (filters memories+snippets, drives query_status); include_recording|memories|snippets|guidance=false to slim (each defaults true); max_snippets, max_snippet_chars, max_memories. For a lean time-tracking sweep also set include_key_texts=false (biggest win), include_apps=false, include_windows=falsetotal_active_minutes + per-app/window minutes + the status triple still return.

  • data_statusok|empty_but_recording|no_capture_in_range|not_recording — check before claiming "no activity".
  • query_statusnot_requested|matched|no_query_matches; guidance.next_best_query is a ready hint when empty.
  • Escalate to /search only for verbatim quotes / frame_ids.

2. Search — GET /search

Use when /activity-summary says ok but you need verbatim quotes, media paths, frame IDs, or a specific match.

curl -H "Authorization: Bearer $SCREENPIPE_LOCAL_API_KEY" "http://localhost:3030/search?q=QUERY&content_type=all&limit=10&start_time=1h%20ago"
Parameter Required Description
q No Keywords. Avoid for audio — transcriptions are noisy, q over-filters.
content_type No all (default), accessibility, audio, input, ocr, memory. Screen text is primarily the accessibility tree; OCR is the fallback for apps without it (videos, games, remote desktops).
limit No Default 20. Keep ≤20 to protect context.
offset No Pagination. Default 0.
start_time Yes ISO 8601 or relative (16h ago, 2d ago, 30m ago).
end_time No Defaults to now (now, 1h ago).
app_name No Substring, e.g. "Google Chrome", "Slack".
window_name No Window title substring.
speaker_name No Filter audio by speaker (case-insensitive partial).
focused No Only focused windows.
tags No Comma-separated; returns items carrying ALL of them (person:ada,project:atlas). Exact match.
include_related No With tags, also return a related map of co-occurring tags (people/projects/workflows), most-frequent first.
max_content_length No Middle-truncate each result's text.
format No json (default), csv, tsv/table. CSV is lossless; TSV collapses newlines.
fields No Column allowlist of dotted paths, e.g. type,content.app_name,content.text.

Critical rules: always include start_time (unbounded queries timeout) · "recent" = 30 min, "today" = since midnight, "yesterday" = yesterday's range · if /search is empty, fall back to /activity-summary and check data_status before saying "no data" · on timeout, narrow the range.

Tags link people/projects/topics across screen, audio, and memories under one namespace (person:ada, project:atlas, topic:pricing). Add to a frame/audio: POST /tags/vision/{frame_id} or POST /tags/audio/{chunk_id} body {"tags":["person:ada"]}; to a memory: tags in POST /memories. Retrieve: GET /search?tags=person:ada&start_time=30d%20ago (add content_type=memory for memories). Frames are pruned by retention — tag a memory for durable links (memories carry created_at + a frame_id back to the moment). include_related=true returns co-occurring tags grouped by namespace, replacing 2-3 follow-up calls.

Response: {"data": [{"type":"OCR","content":{"frame_id":...,"text":...,"app_name":...}}, {"type":"Audio","content":{"chunk_id":...,"transcription":...,"speaker":{"name":...}}}, {"type":"Input","content":{...}}], "pagination":{"limit":10,"offset":0,"total":42}}.


3. Elements — GET /elements

Lightweight FTS over UI elements (~100-500 bytes each vs 5-20KB from /search). Uniform rows, so format=csv pays off most.

curl -H "Authorization: Bearer $SCREENPIPE_LOCAL_API_KEY" "http://localhost:3030/elements?frame_id=12345&format=csv&fields=role,text,bounds.left,bounds.top"

Params: q, frame_id, source (accessibility|ocr), role, start_time, end_time, app_name, limit, offset, format, fields.

Frame context (accessibility text, parsed nodes, extracted URLs): GET /frames/{id}/context.

Roles are not normalized across platforms — use the right one for the user's OS:

Concept macOS Windows Linux
Button AXButton Button Button
Static text AXStaticText Text Label
Link AXLink Hyperlink Link
Text field AXTextField Edit Entry
Menu item AXMenuItem MenuItem MenuItem
Checkbox AXCheckBox CheckBox CheckBox
Web area AXWebArea Pane DocumentWeb
Heading AXHeading Header Heading
List item AXRow ListItem ListItem

OCR-only roles (accessibility-unavailable fallback): line, word, block, paragraph, page.


4. Frames (Screenshots) — GET /frames/{frame_id}

curl -o /tmp/frame.png "http://localhost:3030/frames/12345"

Raw PNG. Never fetch more than 2-3 frames per query (~1000-2000 tokens each).


5. Media Export — POST /export

Real-time MP4 (screen frames at true timestamps + synced mic audio). Duration matches the wall-clock span — NOT a timelapse.

curl -X POST http://localhost:3030/export -H "Content-Type: application/json" \
  -H "Authorization: Bearer $SCREENPIPE_LOCAL_API_KEY" -d '{"start": "5m ago", "end": "now"}'

Fields: start+end (ISO 8601 or relative; end defaults to now), OR meeting_id for a whole meeting. Optional output_path (absolute, e.g. ~/Downloads/clip.mp4); else lands in the data dir's exports/. Returns {output_path, frame_count, audio_chunk_count, duration_secs, file_size_bytes} — show output_path as inline code. Long ranges take minutes.

ffmpeg on audio file_path from search results (always -y, save to ~/.screenpipe/exports/):

ffmpeg -y -i audio.mp4 -q:a 2 out.mp3                              # convert
ffmpeg -y -i in.mp4 -ss 00:01:00 -to 00:05:00 -q:a 2 clip.mp3      # trim
ffmpeg -y -i in.mp4 -t 10 -vf "fps=10,scale=640:-1" out.gif        # GIF

6. Retranscribe — POST /audio/retranscribe

curl -X POST http://localhost:3030/audio/retranscribe -H "Content-Type: application/json" \
  -d '{"start": "1h ago", "end": "now"}'

Optional: engine (deepgram, screenpipe-cloud, whisper-large, whisper-large-v3-turbo, whisper-large-v3-turbo-quantized, qwen3-asr, parakeet, parakeet-mlx, openai-compatible), vocabulary (array of {"word","replacement"}), prompt (Whisper topic context). Keep ranges ≤1h. Show old vs new.


7. Raw SQL — POST /raw_sql

curl -X POST http://localhost:3030/raw_sql -H "Content-Type: application/json" \
  -d '{"query": "SELECT ... LIMIT 100"}'

Rules: every SELECT needs LIMIT · always filter by time (datetime('now','-24 hours')) · read-only. Never use frame counts for time estimates — frames are event-driven; use /activity-summary for screen time.

Table Key Columns Time Column
frames app_name, window_name, browser_url, focused timestamp
ocr_text text, app_name, window_name join via frame_id
elements source, role, text, bounds_* join via frame_id
audio_transcriptions transcription, device, speaker_id, is_input_device timestamp
audio_chunks file_path timestamp
speakers name, metadata
ui_events event_type, app_name, window_title, browser_url timestamp
accessibility app_name, window_name, text_content, browser_url timestamp
meetings meeting_app, title, attendees, detection_source meeting_start
memories content, source, tags, importance created_at
-- Most used apps (last 24h)
SELECT app_name, COUNT(*) AS frames FROM frames
WHERE timestamp > datetime('now','-24 hours') AND app_name IS NOT NULL
GROUP BY app_name ORDER BY frames DESC LIMIT 20;

-- Context switches per hour
SELECT strftime('%H:00', timestamp) AS hour, COUNT(*) AS switches
FROM ui_events WHERE event_type='app_switch' AND timestamp > datetime('now','-24 hours')
GROUP BY hour ORDER BY hour LIMIT 24;

Patterns: GROUP BY date(timestamp) (daily), GROUP BY strftime('%H:00', timestamp) (hourly), HAVING frames > 5 (filter noise).


8. Connections — GET /connections

curl http://localhost:3030/connections            # list all integrations (40+)
curl http://localhost:3030/connections/telegram   # saved creds for a webhook/token integration

Each entry's description is self-describing — for control surfaces (browsers, gateways, OAuth proxies) it includes the exact endpoint + body shape. Read it before guessing. If not connected, tell the user to set it up in Settings > Connections.

Credential integrationsGET /connections/<id> returns fields to call the service directly:

  • Telegram: bot_token+chat_idPOST https://api.telegram.org/bot{token}/sendMessage
  • Slack / Teams: webhook_urlPOST {webhook_url} with {"text":...}
  • Discord: webhook_urlPOST {webhook_url} with {"content":...}
  • Todoist: api_tokenPOST https://api.todoist.com/api/v1/tasks (Bearer)
  • Email: smtp_host, smtp_port, smtp_user, smtp_pass, from_address

OAuth/proxy integrations — tokens live in SecretStore, never exposed via GET. Call the local proxy; it injects auth and forwards upstream. There is no /connections/<id>/token endpoint.

# GitHub create issue (repo from pipe settings). Same shape for comments: .../issues/42/comments {"body":...}
curl -X POST http://localhost:3030/connections/github/proxy/repos/OWNER/REPO/issues \
  -H "Content-Type: application/json" -d '{"title":"Bug","body":"Steps..."}'

# Generic OAuth proxy (Zoom, Vercel, Google Docs, Microsoft 365, ...)
curl -X POST http://localhost:3030/connections/<id>/proxy/<upstream-api-path> \
  -H "Content-Type: application/json" -d '{...}'

Don't call https://api.github.com/... directly from a pipe — use the proxy.

Calendar — use calendar endpoints for appointments/upcoming events. If /connections shows ics-calendar.connected: true, include ICS results too before saying the calendar is empty:

curl -H "Authorization: Bearer $SCREENPIPE_LOCAL_API_KEY" \
  "http://localhost:3030/connections/calendar/events?hours_back=0&hours_ahead=72"
# also: /connections/google-calendar/events , /connections/ics-calendar/events

Browser control (owned-default) — an embedded browser, shown in the chat. Cookies persist (isolated profile); password fields are stripped from snapshots. Try snapshot first; reach for eval only when needed.

# Navigate → {"ok":true,"url":"<final>"}
curl -X POST -H "Authorization: Bearer $SCREENPIPE_LOCAL_API_KEY" -H "Content-Type: application/json" \
  -d '{"url":"https://en.wikipedia.org/wiki/Giraffe"}' \
  http://localhost:3030/connections/browsers/owned-default/navigate

# Snapshot (no JS) → {title, url, tree:"[h1] ...\n  [a] ... → /href", truncated}. Best for "what's on the page?".
curl -H "Authorization: Bearer $SCREENPIPE_LOCAL_API_KEY" \
  http://localhost:3030/connections/browsers/owned-default/snapshot

# Eval (escape hatch) — arbitrary JS return value, for clicks / values the snapshot tree omits.
curl -X POST -H "Authorization: Bearer $SCREENPIPE_LOCAL_API_KEY" -H "Content-Type: application/json" \
  -d '{"code":"return [...document.querySelectorAll(\".title>a\")].slice(0,5).map(a=>a.innerText)"}' \
  http://localhost:3030/connections/browsers/owned-default/eval

9. Meetings — GET /meetings, PUT /meetings/:id

curl -H "Authorization: Bearer $SCREENPIPE_LOCAL_API_KEY" "http://localhost:3030/meetings?start_time=1d%20ago&end_time=now&limit=10"
curl -H "Authorization: Bearer $SCREENPIPE_LOCAL_API_KEY" "http://localhost:3030/meetings/42"

# Partial update — omitted fields stay as-is. Read first and re-include existing `note` so user notes survive.
curl -X PUT http://localhost:3030/meetings/42 -H "Authorization: Bearer $SCREENPIPE_LOCAL_API_KEY" \
  -H "Content-Type: application/json" -d '{"title":"Q3 planning","note":"<existing>\n\n## Summary\n<summary>"}'

Detected from calendar, app detection, window titles, UI elements, multi-speaker audio. q is a case-insensitive substring over title/attendees/notes. Uses PUT, not PATCH. Fields: id, meeting_start, meeting_end (null if ongoing), meeting_app, title?, attendees?, note?, detection_source. Also queryable via raw SQL on the meetings table.


10. Speakers — POST /speakers/*

All POST with Content-Type: application/json unless noted:

  • GET /speakers/search?name=John — search by name
  • GET /speakers/unnamed?limit=20 — unnamed speakers (for labeling)
  • GET /speakers/similar?speaker_id=29&limit=5 — similar by voice embedding
  • /speakers/update {"id":29,"name":"Jordan"} — rename/metadata
  • /speakers/reassign {"audio_chunk_id":456,"new_speaker_name":"Jordan","propagate_similar":true} — returns new_speaker_id, transcriptions_updated, old_assignments (for undo)
  • /speakers/undo-reassign {"old_assignments":[{"transcription_id":1,"old_speaker_id":29}]}
  • /speakers/merge {"speaker_to_keep_id":5,"speaker_to_merge_id":29}
  • /speakers/hallucination {"speaker_id":29} — mark false detection
  • /speakers/delete {"id":29} — also removes audio chunk files

"That was actually Jordan, not Karishma": find the audio result's chunk_idPOST /speakers/reassign with audio_chunk_id + new_speaker_name; propagate_similar:true (default) also fixes similar chunks.


11. Memories — High-Signal Persistent Knowledge

Memories are the highest-signal source — curated facts, preferences, decisions, project context distilled from hours of data. If you're calling /search, also query /memories: search gives you what happened, memories give you what matters and why. Query memories first when answering about preferences/decisions/past context, building background on a project/person/workflow, or generating any summary/recommendation/plan.

curl -H "Authorization: Bearer $SCREENPIPE_LOCAL_API_KEY" "http://localhost:3030/memories?q=preference&limit=20"          # FTS search
curl -H "Authorization: Bearer $SCREENPIPE_LOCAL_API_KEY" "http://localhost:3030/memories?min_importance=0.5&limit=20"    # recent, high importance
curl -X POST http://localhost:3030/memories -H "Content-Type: application/json" \
  -d '{"content":"User prefers dark mode","source":"user","tags":["preference","ui"],"importance":0.7}'                   # create
curl -X PUT http://localhost:3030/memories/1 -H "Content-Type: application/json" -d '{"content":"...","importance":0.8}' # update
curl -X DELETE http://localhost:3030/memories/1                                                                          # delete

GET /memories params: q, source, tags, min_importance, start_time, end_time, limit, offset. Memories also come via GET /search?content_type=memory (NOT included in content_type=all — ask explicitly), which adds tags + include_related. When you learn a genuinely useful long-lived fact, store it with importance 0.0-1.0 — not transient observations.


12. Notifications — POST http://localhost:11435/notify

Notify the desktop UI. This is the Tauri sidecar (port 11435), not the main API. body supports markdown (**bold**, `code`, [text](url)).

curl -X POST http://localhost:11435/notify -H "Content-Type: application/json" \
  -d '{"title":"3 new voice memos","body":"found recordings from today"}'

# Markdown body + action buttons. action types: "link" (web), "deeplink" (screenpipe://), "dismiss".
curl -X POST http://localhost:11435/notify -H "Content-Type: application/json" \
  -d '{"title":"Meeting summary","body":"**Q3 Planning** saved\n\nopen [notes](~/Documents/q3.md)","actions":[{"id":"view","label":"view","type":"deeplink","url":"screenpipe://timeline"},{"id":"skip","label":"skip","type":"dismiss"}]}'

Fields: title* , body* (markdown), type (default "pipe"), timeout/autoDismissMs (ms, default 20000), actions (buttons). Body links: web URL → browser, file path (~/notes.md, /var/log/app.log) → default app, screenpipe://... → in-app. Returns {"success":true}.


13. Other Endpoints

curl http://localhost:3030/health        # health check
curl http://localhost:3030/audio/list    # audio devices
curl http://localhost:3030/vision/list   # monitors

Deep Links & Videos

Reference real moments with clickable links (only IDs/timestamps from actual results — never fabricate):

  • [10:30 AM — Chrome](screenpipe://frame/12345) — screen results (use frame_id)
  • [meeting at 3pm](screenpipe://timeline?timestamp=ISO8601) — audio results (use timestamp)

Show a search result's file_path as inline code to make it a playable video: `/Users/name/.screenpipe/data/monitor_1_..._10-30-00.mp4`.

Install via CLI
npx skills add https://github.com/screenpipe/screenpipe --skill screenpipe-api
Repository Details
star Stars 19,336
call_split Forks 1,838
navigation Branch main
article Path SKILL.md
More from Creator