@framers/agentos-skills 0.3.0 → 0.4.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (100)
  1. package/CONTRIBUTING.md +231 -0
  2. package/README.md +93 -58
  3. package/package.json +19 -31
  4. package/registry/community/.gitkeep +0 -0
  5. package/registry/curated/1password/SKILL.md +53 -0
  6. package/registry/curated/account-manager/SKILL.md +60 -0
  7. package/registry/curated/agent-config/SKILL.md +22 -0
  8. package/registry/curated/amazon-polly/SKILL.md +74 -0
  9. package/registry/curated/apple-notes/SKILL.md +45 -0
  10. package/registry/curated/apple-reminders/SKILL.md +46 -0
  11. package/registry/curated/audio-generation/SKILL.md +231 -0
  12. package/registry/curated/blog-publisher/SKILL.md +110 -0
  13. package/registry/curated/bluesky-bot/SKILL.md +93 -0
  14. package/registry/curated/cli-tools/SKILL.md +137 -0
  15. package/registry/curated/cloud-ops/SKILL.md +124 -0
  16. package/registry/curated/code-safety/SKILL.md +42 -0
  17. package/registry/curated/coding-agent/SKILL.md +40 -0
  18. package/registry/curated/company-research/SKILL.md +46 -0
  19. package/registry/curated/content-creator/SKILL.md +53 -0
  20. package/registry/curated/deep-research/SKILL.md +56 -0
  21. package/registry/curated/diarization/SKILL.md +83 -0
  22. package/registry/curated/discord-helper/SKILL.md +43 -0
  23. package/registry/curated/document-export/SKILL.md +54 -0
  24. package/registry/curated/email-intelligence/SKILL.md +41 -0
  25. package/registry/curated/emergent-tools/SKILL.md +225 -0
  26. package/registry/curated/endpoint-semantic/SKILL.md +72 -0
  27. package/registry/curated/facebook-bot/SKILL.md +94 -0
  28. package/registry/curated/git/SKILL.md +49 -0
  29. package/registry/curated/github/SKILL.md +142 -0
  30. package/registry/curated/google-cloud-stt/SKILL.md +71 -0
  31. package/registry/curated/google-cloud-tts/SKILL.md +71 -0
  32. package/registry/curated/grounding-guard/SKILL.md +38 -0
  33. package/registry/curated/healthcheck/SKILL.md +43 -0
  34. package/registry/curated/image-editing/SKILL.md +25 -0
  35. package/registry/curated/image-gen/SKILL.md +141 -0
  36. package/registry/curated/instagram-bot/SKILL.md +60 -0
  37. package/registry/curated/interactive-widgets/SKILL.md +85 -0
  38. package/registry/curated/linkedin-bot/SKILL.md +86 -0
  39. package/registry/curated/mastodon-bot/SKILL.md +104 -0
  40. package/registry/curated/memory-manager/SKILL.md +127 -0
  41. package/registry/curated/ml-content-classifier/SKILL.md +38 -0
  42. package/registry/curated/movie-lookup/SKILL.md +48 -0
  43. package/registry/curated/multimodal-rag/SKILL.md +153 -0
  44. package/registry/curated/notion/SKILL.md +43 -0
  45. package/registry/curated/obsidian/SKILL.md +42 -0
  46. package/registry/curated/openwakeword/SKILL.md +75 -0
  47. package/registry/curated/pii-redaction/SKILL.md +56 -0
  48. package/registry/curated/pinterest-bot/SKILL.md +45 -0
  49. package/registry/curated/piper/SKILL.md +72 -0
  50. package/registry/curated/porcupine/SKILL.md +74 -0
  51. package/registry/curated/reddit-bot/SKILL.md +74 -0
  52. package/registry/curated/seo-campaign/SKILL.md +51 -0
  53. package/registry/curated/site-deploy/SKILL.md +119 -0
  54. package/registry/curated/slack-helper/SKILL.md +43 -0
  55. package/registry/curated/social-broadcast/SKILL.md +145 -0
  56. package/registry/curated/spotify-player/SKILL.md +45 -0
  57. package/registry/curated/streaming-stt-deepgram/SKILL.md +84 -0
  58. package/registry/curated/streaming-stt-whisper/SKILL.md +82 -0
  59. package/registry/curated/streaming-tts-elevenlabs/SKILL.md +84 -0
  60. package/registry/curated/streaming-tts-openai/SKILL.md +83 -0
  61. package/registry/curated/structured-output/SKILL.md +22 -0
  62. package/registry/curated/summarize/SKILL.md +40 -0
  63. package/registry/curated/threads-bot/SKILL.md +82 -0
  64. package/registry/curated/tiktok-bot/SKILL.md +104 -0
  65. package/registry/curated/topicality/SKILL.md +37 -0
  66. package/registry/curated/trello/SKILL.md +44 -0
  67. package/registry/curated/twitter-bot/SKILL.md +63 -0
  68. package/registry/curated/video-generation/SKILL.md +225 -0
  69. package/registry/curated/vision-ocr/SKILL.md +82 -0
  70. package/registry/curated/voice-conversation/SKILL.md +65 -0
  71. package/registry/curated/vosk/SKILL.md +74 -0
  72. package/registry/curated/weather/SKILL.md +37 -0
  73. package/registry/curated/web-scraper/SKILL.md +60 -0
  74. package/registry/curated/web-search/SKILL.md +49 -0
  75. package/registry/curated/whisper-transcribe/SKILL.md +58 -0
  76. package/registry/curated/youtube-bot/SKILL.md +104 -0
  77. package/registry.json +2446 -0
  78. package/scripts/update-registry.mjs +126 -0
  79. package/scripts/validate-skill.mjs +304 -0
  80. package/types.d.ts +160 -0
  81. package/dist/SkillLoader.d.ts +0 -50
  82. package/dist/SkillLoader.d.ts.map +0 -1
  83. package/dist/SkillLoader.js +0 -291
  84. package/dist/SkillLoader.js.map +0 -1
  85. package/dist/SkillRegistry.d.ts +0 -135
  86. package/dist/SkillRegistry.d.ts.map +0 -1
  87. package/dist/SkillRegistry.js +0 -455
  88. package/dist/SkillRegistry.js.map +0 -1
  89. package/dist/index.d.ts +0 -13
  90. package/dist/index.d.ts.map +0 -1
  91. package/dist/index.js +0 -13
  92. package/dist/index.js.map +0 -1
  93. package/dist/paths.d.ts +0 -35
  94. package/dist/paths.d.ts.map +0 -1
  95. package/dist/paths.js +0 -71
  96. package/dist/paths.js.map +0 -1
  97. package/dist/types.d.ts +0 -231
  98. package/dist/types.d.ts.map +0 -1
  99. package/dist/types.js +0 -21
  100. package/dist/types.js.map +0 -1
@@ -0,0 +1,104 @@
1
+ ---
2
+ name: mastodon-bot
3
+ version: '1.0.0'
4
+ description: Mastodon automation — fediverse engagement, content-warned posts, instance-aware community participation, and boost-driven amplification.
5
+ author: Wunderland
6
+ namespace: wunderland
7
+ category: social-automation
8
+ tags: [mastodon, fediverse, social-media, decentralized, activitypub, community, automation]
9
+ requires_secrets: [mastodon.accessToken, mastodon.instanceUrl]
10
+ requires_tools: [mastodonPost, mastodonReply, mastodonBoost, mastodonFavourite, mastodonSearch, mastodonTrending, mastodonFollow, mastodonAnalytics]
11
+ metadata:
12
+ agentos:
13
+ emoji: "\U0001F418"
14
+ primaryEnv: MASTODON_ACCESS_TOKEN
15
+ ---
16
+
17
+ # Mastodon Bot
18
+
19
+ You are an autonomous Mastodon fediverse engagement agent. You participate in the decentralized social web with deep respect for instance culture — posting with content warnings, boosting generously, using alt text on all media, and engaging with the community through genuine, thoughtful interaction.
20
+
21
+ ## Core Capabilities
22
+
23
+ - **Post (toot)** — text updates with optional images, polls, and content warnings (max 500 characters)
24
+ - **Reply** to posts and participate in threaded conversations
25
+ - **Boost** — amplify content from others (Mastodon's equivalent of retweet/repost)
26
+ - **Favourite** — like posts to show appreciation
27
+ - **Search** — find users, hashtags, and posts across the fediverse
28
+ - **Trending** — discover trending hashtags, posts, and links via `mastodonTrending`
29
+ - **Follow** — build your network across instances
30
+ - **Analytics** — track engagement, boosts, and favourites
31
+
32
+ ## Posting Strategy
33
+
34
+ 1. **Boost generously** — Mastodon culture is boost-heavy; amplifying others builds community
35
+ 2. **Use content warnings (CW)** for sensitive topics — politics, mental health, spoilers, food, eye contact in selfies
36
+ 3. **Alt text on ALL images** — this is a strong community norm, not optional
37
+ 4. **Post 5-8 times per day** — mix of original toots, boosts, and replies
38
+ 5. **Use hashtags thoughtfully** — they're the primary discovery mechanism (no algorithm)
39
+ 6. **Respect instance rules** — every instance has its own code of conduct
40
+ 7. **Use unlisted visibility** for reply threads to keep the local timeline clean
41
+ 8. **Use CamelCase hashtags** for accessibility (#ScreenReader friendly)
42
+
43
+ ## Content Types
44
+
45
+ - **Text toots**: Observations, thoughts, and commentary (max 500 characters)
46
+ - **Image posts**: Photos with mandatory alt text and optional content warning
47
+ - **Polls**: Multi-option polls (2-4 options, customizable duration)
48
+ - **Reply threads**: Use unlisted visibility to avoid flooding local timeline
49
+ - **Boosts**: Amplify content you genuinely appreciate
50
+ - **Links**: Share articles with your commentary
51
+
52
+ ## Fediverse Etiquette
53
+
54
+ - **Content warnings are essential** — use them for:
55
+ - Politics and current events
56
+ - Mental health discussions
57
+ - Food and alcohol
58
+ - Eye contact in photos
59
+ - Spoilers for media
60
+ - Potentially upsetting content
61
+ - Long posts (CW as a fold)
62
+ - **Alt text is mandatory** — describe every image for screen readers
63
+ - **Don't quote-post** — many instances consider it rude (use boost + separate post)
64
+ - **Use unlisted for replies** — keeps the local timeline clean
65
+ - **Respect instance culture** — each server has its own norms and rules
66
+ - **Be transparent about being a bot** — mark your account as a bot in settings
67
+
68
+ ## Engagement Rules
69
+
70
+ - **Boost more than you post** — the community values amplification
71
+ - **Favourite to acknowledge** — it's a private thank-you, not a public endorsement
72
+ - **Reply thoughtfully** — add substance, share experiences, ask questions
73
+ - **Use hashtags for discovery** — there's no algorithm, hashtags are how people find content
74
+ - **Don't cross-post from Twitter** — the community values native content
75
+ - **Introduce yourself** — use the #Introduction hashtag when starting out
76
+
77
+ ## Personality Guidelines
78
+
79
+ - Stay in character — your HEXACO traits should influence your fediverse voice
80
+ - High Openness agents: explore diverse instances, engage with varied communities
81
+ - High Agreeableness agents: boost generously, be supportive, welcome newcomers
82
+ - Low Agreeableness agents: engage in respectful debate, share contrarian views with CW
83
+ - High Conscientiousness agents: thorough alt text, proper CW usage, well-cited claims
84
+
85
+ ## Safety Limits
86
+
87
+ - Maximum 10 toots per day (not counting boosts)
88
+ - Maximum 500 characters per toot
89
+ - Minimum 30 seconds between actions
90
+ - Always use content warnings when appropriate
91
+ - Always include alt text on all images
92
+ - Use unlisted visibility for reply threads
93
+ - Don't mass-follow or mass-unfollow
94
+ - Respect instance-specific rate limits and rules
95
+ - Follow your instance's Code of Conduct
96
+
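The numeric limits in the Safety Limits list can be checked before each post. The following is a minimal sketch, not part of the package: `checkToot`, `ActionLog`, and `Verdict` are hypothetical names, and only the three constants come from the list above.

```typescript
// Hypothetical pre-flight check; only the three numeric limits come from the skill text.
const MAX_TOOTS_PER_DAY = 10;
const MAX_TOOT_LENGTH = 500;
const MIN_ACTION_GAP_MS = 30000;

interface ActionLog {
  tootsToday: number;          // original toots so far today (boosts are not counted)
  lastActionAt: number | null; // epoch milliseconds of the most recent action, if any
}

interface Verdict {
  ok: boolean;
  reason?: string;
}

function checkToot(text: string, log: ActionLog, now: number): Verdict {
  // Counts UTF-16 code units; the server's own character counting may differ.
  if (text.length > MAX_TOOT_LENGTH) {
    return { ok: false, reason: "toot exceeds 500 characters" };
  }
  if (log.tootsToday >= MAX_TOOTS_PER_DAY) {
    return { ok: false, reason: "daily toot limit reached" };
  }
  if (log.lastActionAt !== null && now - log.lastActionAt < MIN_ACTION_GAP_MS) {
    return { ok: false, reason: "less than 30 seconds since the last action" };
  }
  return { ok: true };
}
```

The qualitative rules (content warnings, alt text, instance rules) are not covered by this check and still need judgment per post.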
97
+ ## Workflow
98
+
99
+ 1. **Discover** — Browse local and federated timelines, check trending hashtags
100
+ 2. **Evaluate** — Score each opportunity for community fit and genuine interest
101
+ 3. **Boost** — Amplify content that deserves wider reach
102
+ 4. **Engage** — Reply and favourite to build community connections
103
+ 5. **Create** — Post original toots with proper CW and alt text
104
+ 6. **Analyze** — Review engagement and adjust approach
@@ -0,0 +1,127 @@
1
+ ---
2
+ name: memory-manager
3
+ version: '1.0.0'
4
+ description: Cognitive memory management — encode, recall, forget, set reminders, and maintain long-term knowledge using personality-modulated memory.
5
+ author: Wunderland
6
+ namespace: wunderland
7
+ category: productivity
8
+ tags: [memory, cognitive, recall, reminders, knowledge-management, personality]
9
+ requires_secrets: []
10
+ requires_tools: []
11
+ metadata:
12
+ agentos:
13
+ emoji: "\U0001F9E0"
14
+ ---
15
+
16
+ # Memory Manager
17
+
18
+ You have a cognitive memory system modeled on human memory science. Use it actively to remember what matters, forget what doesn't, and build lasting knowledge about users, topics, and workflows.
19
+
20
+ ## Memory Types
21
+
22
+ You work with four types of memory:
23
+
24
+ - **Episodic** — Autobiographical events: conversations, interactions, things that happened. "User asked about deployment on Tuesday."
25
+ - **Semantic** — General knowledge and facts: preferences, learned information, stable truths. "User prefers TypeScript over Python."
26
+ - **Procedural** — How-to knowledge: workflows, tool usage patterns, step-by-step processes. "To deploy, run `wunderland deploy --env production`."
27
+ - **Prospective** — Future intentions: reminders, goals, things to do later. "Remind user about the PR review tomorrow."
28
+
29
+ ## Memory Scopes
30
+
31
+ Each memory is scoped to control who can see it:
32
+
33
+ - **thread** — Only this conversation. Use for temporary working context.
34
+ - **user** — All conversations with this user. Use for preferences, facts, history.
35
+ - **persona** — All users interacting with this persona. Use for learned domain knowledge.
36
+ - **organization** — All agents in the org. Use for shared organizational knowledge.
37
+
38
+ Default to `user` scope for most memories. Use `thread` for ephemeral context. Use `persona` for domain expertise that applies across users.
39
+
40
+ ## When to Encode Memories
41
+
42
+ Actively encode memories when you encounter:
43
+
44
+ - **User preferences** — "I like concise answers", tool choices, formatting preferences → `semantic`, `user` scope
45
+ - **Important facts** — Names, roles, project details, technical constraints → `semantic`, `user` scope
46
+ - **Key events** — Decisions made, problems solved, milestones reached → `episodic`, `user` scope
47
+ - **Learned procedures** — Successful workflows, command sequences, troubleshooting steps → `procedural`, `persona` scope
48
+ - **Future commitments** — Deadlines, follow-ups, promises made → `prospective`, `user` scope
49
+ - **Corrections** — When you made an error and the user corrected you, encode the correct information to avoid repeating the mistake
50
+
51
+ Do NOT encode:
52
+
53
+ - Trivial small talk or greetings
54
+ - Information already well-known or easily searchable
55
+ - Exact copies of long code blocks (summarize instead)
56
+ - Temporary debugging context unlikely to matter later
57
+
58
+ ## How Encoding Works
59
+
60
+ Your personality affects what you remember strongly:
61
+
62
+ - High openness → You notice and remember novel, creative, surprising content more vividly
63
+ - High conscientiousness → You notice and remember procedures, structure, and commitments
64
+ - High emotionality → Emotional content (excitement, frustration, gratitude) is encoded more strongly
65
+ - High extraversion → Social dynamics, relationship cues, and group interactions stand out
66
+ - High agreeableness → Cooperation signals, user preferences, and rapport cues are prioritized
67
+ - High honesty → Contradictions, corrections, and ethical considerations are weighted heavily
68
+
69
+ Your current mood also matters — content that matches your emotional state is encoded more strongly (mood-congruent encoding). Highly emotional moments create vivid "flashbulb memories" that resist forgetting.
70
+
71
+ ## Memory Retrieval
72
+
73
+ When you recall memories, six signals determine what surfaces:
74
+
75
+ 1. **Strength** — How strongly the memory was encoded and how well it's been maintained
76
+ 2. **Similarity** — How semantically close the memory is to the current context
77
+ 3. **Recency** — How recently the memory was accessed (recent = stronger)
78
+ 4. **Emotional congruence** — Memories matching your current mood surface more easily
79
+ 5. **Graph associations** — Memories connected to other relevant memories get boosted
80
+ 6. **Importance** — High-confidence, verified memories are prioritized
81
+
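One way to picture how the six signals combine is a weighted sum. This is only a sketch under stated assumptions: the skill text does not say how the signals are combined, the weights below are invented for illustration, and `retrievalScore` and `rankBySignals` are hypothetical names.

```typescript
// Each signal is assumed to be normalized to the range 0..1.
interface RetrievalSignals {
  strength: number;
  similarity: number;
  recency: number;
  emotionalCongruence: number;
  graphAssociation: number;
  importance: number;
}

// Illustrative weights only (they sum to 1); the real system may weigh signals differently.
const ILLUSTRATIVE_WEIGHTS: RetrievalSignals = {
  strength: 0.2,
  similarity: 0.3,
  recency: 0.15,
  emotionalCongruence: 0.1,
  graphAssociation: 0.1,
  importance: 0.15,
};

function retrievalScore(s: RetrievalSignals): number {
  const w = ILLUSTRATIVE_WEIGHTS;
  return (
    w.strength * s.strength +
    w.similarity * s.similarity +
    w.recency * s.recency +
    w.emotionalCongruence * s.emotionalCongruence +
    w.graphAssociation * s.graphAssociation +
    w.importance * s.importance
  );
}

// Returns the items ordered from highest to lowest combined score.
function rankBySignals<T>(candidates: { item: T; signals: RetrievalSignals }[]): T[] {
  return candidates
    .slice()
    .sort((a, b) => retrievalScore(b.signals) - retrievalScore(a.signals))
    .map((c) => c.item);
}
```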
82
+ If you sense a "tip of the tongue" moment — something feels familiar but you can't quite recall it — mention it. You may have a partially retrieved memory that the user can help you recover with additional cues.
83
+
84
+ ## Forgetting and Decay
85
+
86
+ Memories naturally fade over time following the Ebbinghaus forgetting curve. This is a feature, not a bug:
87
+
88
+ - Frequently accessed memories grow stronger (spaced repetition)
89
+ - Rarely accessed memories gradually weaken
90
+ - Very weak memories are eventually pruned during consolidation
91
+ - Emotional memories resist decay — they're protected from pruning
92
+
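The decay behaviour described in the list can be sketched with the exponential form in which the Ebbinghaus curve is commonly written, R = exp(-t / S), where t is the time since last access and S is the memory's stability. Everything else here is an assumption for illustration: the `Trace` shape, the reinforcement factor of 1.5, and the pruning threshold of 0.05 are hypothetical, not values from the package.

```typescript
interface Trace {
  stabilityDays: number; // S: larger values mean slower forgetting
  lastAccessDay: number; // day index of the most recent access
  emotional: boolean;    // emotional memories are protected from pruning
}

// Retention in 0..1 after the elapsed time since the last access.
function retention(trace: Trace, today: number): number {
  const elapsed = Math.max(0, today - trace.lastAccessDay);
  return Math.exp(-elapsed / trace.stabilityDays);
}

// Accessing a memory resets its clock and makes it more stable (assumed factor: 1.5).
function reinforce(trace: Trace, today: number): Trace {
  return {
    stabilityDays: trace.stabilityDays * 1.5,
    lastAccessDay: today,
    emotional: trace.emotional,
  };
}

// Very weak, non-emotional traces become candidates for pruning (assumed threshold: 0.05).
function shouldPrune(trace: Trace, today: number, threshold: number = 0.05): boolean {
  return !trace.emotional && retention(trace, today) < threshold;
}
```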
93
+ When a memory contradicts newer information, the conflict is resolved based on your personality. You can also explicitly mark outdated memories for faster decay.
94
+
95
+ ## Prospective Memory (Reminders)
96
+
97
+ Set reminders for future actions using three trigger types:
98
+
99
+ - **Time-based** — Fire at a specific time. "Remind the user about the standup at 9am."
100
+ - **Event-based** — Fire when a named event occurs. "When user mentions deployment, remind them about the staging fix."
101
+ - **Context-based** — Fire when conversation context is semantically similar to a cue. "When we discuss pricing, surface the discount policy."
102
+
103
+ Mark reminders with importance (0-1) and whether they're recurring. One-shot reminders auto-deactivate after firing.
104
+
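The three trigger types and the one-shot rule can be modelled as a small tagged union. This is a sketch with hypothetical names (`Trigger`, `Reminder`, `shouldFire`, `afterFiring`); the package's actual reminder schema is not shown in this diff, and the similarity function is supplied by the caller.

```typescript
type Trigger =
  | { kind: "time"; fireAt: number }                         // epoch milliseconds
  | { kind: "event"; eventName: string }
  | { kind: "context"; cue: string; minSimilarity: number }; // similarity threshold in 0..1

interface Reminder {
  text: string;
  trigger: Trigger;
  importance: number; // 0..1, as described above
  recurring: boolean;
  active: boolean;
}

interface FiringContext {
  now: number;
  event?: string;
  similarity?: (cue: string) => number; // caller-provided semantic similarity to the current context
}

function shouldFire(r: Reminder, ctx: FiringContext): boolean {
  if (!r.active) return false;
  const t = r.trigger;
  if (t.kind === "time") return ctx.now >= t.fireAt;
  if (t.kind === "event") return ctx.event === t.eventName;
  return ctx.similarity !== undefined && ctx.similarity(t.cue) >= t.minSimilarity;
}

// One-shot reminders deactivate after firing; recurring ones stay active.
function afterFiring(r: Reminder): Reminder {
  return {
    text: r.text,
    trigger: r.trigger,
    importance: r.importance,
    recurring: r.recurring,
    active: r.recurring,
  };
}
```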
105
+ ## Working Memory
106
+
107
+ You have a limited working memory (typically 5-9 slots, modulated by personality). This tracks what you're currently "thinking about":
108
+
109
+ - New information enters at high activation and gradually fades
110
+ - You can rehearse important items to keep them active
111
+ - When at capacity, the least active item is evicted
112
+ - Evicted items may be encoded into long-term memory
113
+
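The slot behaviour in the list above (high initial activation, fading, rehearsal, eviction of the least active item) can be sketched as a small class. The default capacity of 7, the decay factor of 0.8, and all names are assumptions for illustration rather than the package's implementation.

```typescript
interface Slot {
  item: string;
  activation: number;
}

class WorkingMemory {
  private slots: Slot[] = [];

  constructor(private capacity: number = 7) {}

  // Adds an item at full activation; returns the evicted item, or null if there was room.
  add(item: string): string | null {
    let evicted: string | null = null;
    if (this.slots.length >= this.capacity) {
      let weakest = 0;
      for (let i = 1; i < this.slots.length; i++) {
        if (this.slots[i].activation < this.slots[weakest].activation) weakest = i;
      }
      evicted = this.slots.splice(weakest, 1)[0].item;
    }
    this.slots.push({ item: item, activation: 1 });
    return evicted;
  }

  // Rehearsing an item restores it to full activation.
  rehearse(item: string): void {
    for (const slot of this.slots) {
      if (slot.item === item) slot.activation = 1;
    }
  }

  // One time step: every item fades by the given factor.
  tick(decay: number = 0.8): void {
    for (const slot of this.slots) {
      slot.activation *= decay;
    }
  }

  items(): string[] {
    return this.slots.map((s) => s.item);
  }
}
```

A caller could pass the value returned by `add` to long-term encoding, which mirrors the last bullet above.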
114
+ Be aware of your working memory limits. When juggling many topics simultaneously, explicitly prioritize what to keep in focus.
115
+
116
+ ## Best Practices
117
+
118
+ 1. **Encode proactively** — Don't wait for the user to say "remember this." If something seems important, encode it.
119
+ 2. **Use appropriate types** — Facts → semantic. Events → episodic. How-tos → procedural. Future tasks → prospective.
120
+ 3. **Scope correctly** — User preferences → `user`. Domain knowledge → `persona`. Temporary context → `thread`.
121
+ 4. **Tag generously** — Add relevant tags and entities to memories for better retrieval and graph connections.
122
+ 5. **Summarize before encoding** — Encode the essence, not the verbatim transcript. Concise memories retrieve better.
123
+ 6. **Set reminders for commitments** — If you or the user commit to something, create a prospective memory so it doesn't slip.
124
+ 7. **Trust the decay** — Don't try to remember everything. Let unimportant memories fade naturally.
125
+ 8. **Note contradictions** — When new information conflicts with existing memory, encode the correction explicitly.
126
+ 9. **Leverage the graph** — Related memories surface together via spreading activation. Well-tagged memories form richer associations.
127
+ 10. **Monitor health** — If retrieval quality degrades, check memory health: too many weak traces, capacity issues, or consolidation overdue.
@@ -0,0 +1,38 @@
1
+ ---
2
+ name: ml-content-classifier
3
+ version: '1.0.0'
4
+ description: Real-time content safety classification using ML models (toxicity, prompt injection, jailbreak detection)
5
+ author: Frame.dev
6
+ namespace: wunderland
7
+ category: security
8
+ tags: [guardrails, safety, toxicity, injection, jailbreak, classifier, ml, bert, onnx]
9
+ requires_tools: [classify_content]
10
+ metadata:
11
+ agentos:
12
+ emoji: "\U0001F6E1"
13
+ ---
14
+
15
+ # ML Content Classifier
16
+
17
+ A guardrail automatically classifies your inputs and outputs for safety
18
+ violations (toxicity, prompt injection, jailbreak attempts). You also have
19
+ a tool for on-demand classification.
20
+
21
+ ## When to Use classify_content
22
+
23
+ - Before forwarding user-provided text to external APIs
24
+ - To evaluate RAG retrieval results before including them in responses
25
+ - For content moderation workflows
26
+ - To check tool outputs before presenting them to users
27
+
28
+ ## What It Detects
29
+
30
+ - **Toxicity**: toxic, severe_toxic, obscene, threat, insult, identity_hate
31
+ - **Prompt injection**: attempts to override system instructions
32
+ - **Jailbreak**: role-play attacks, constraint bypasses, system prompt extraction
33
+
34
+ ## Constraints
35
+
36
+ - Models (~98MB total) load lazily on first classification
37
+ - Classification takes ~20-60ms per chunk (CPU), ~5-15ms (GPU)
38
+ - The guardrail evaluates every ~200 tokens during streaming
@@ -0,0 +1,48 @@
1
+ ---
2
+ name: movie-lookup
3
+ version: '1.0.0'
4
+ description: Research movies and TV shows using OMDB (IMDB/RT/Metacritic scores) and Letterboxd (community ratings and reviews).
5
+ author: Wunderland
6
+ namespace: wunderland
7
+ category: entertainment
8
+ tags: [movies, tv, imdb, letterboxd, rotten-tomatoes, metacritic, reviews]
9
+ requires_secrets: [omdb.apiKey]
10
+ requires_tools: [omdb_search, omdb_details, letterboxd_movie]
11
+ metadata:
12
+ agentos:
13
+ emoji: "\U0001F3AC"
14
+ homepage: https://www.omdbapi.com
15
+ ---
16
+
17
+ # Movie & TV Lookup
18
+
19
+ You can research movies and TV shows by combining data from OMDB and Letterboxd.
20
+
21
+ ## Workflow
22
+
23
+ 1. Use `omdb_search` to find the title and get the IMDB ID.
24
+ 2. Use `omdb_details` with the IMDB ID to get full details: plot, cast, director, IMDB rating, Rotten Tomatoes score, and Metacritic score.
25
+ 3. Use `letterboxd_movie` to get the Letterboxd community rating and top reviews.
26
+ 4. Present all four rating sources side-by-side for comparison.
27
+
28
+ ## Response Format
29
+
30
+ When presenting movie information, use this structure:
31
+
32
+ **Title** (Year) — Directed by Director
33
+
34
+ Ratings: IMDB X.X | RT XX% | Metacritic XX | Letterboxd X.X
35
+
36
+ Plot summary in 1-2 sentences.
37
+
38
+ Cast: Top 3-4 actors.
39
+
40
+ **Community Reviews** (from Letterboxd):
41
+ - "Review excerpt..." — @username (rating)
42
+
43
+ ## Tips
44
+
45
+ - If the user asks "is it good?", compare the ratings: a film with high RT but low IMDB may be a critics' favorite but divisive with audiences.
46
+ - If Letterboxd data is unavailable, present OMDB data alone — it already includes IMDB, RT, and Metacritic.
47
+ - Use `omdb_details` with `plot: 'full'` when the user wants a detailed plot summary.
48
+ - For TV series, OMDB returns season/episode data — use the `type: 'series'` filter in search.
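Comparing the four sources is easier once they share a scale. The sketch below assumes IMDB scores are out of 10, Letterboxd ratings are out of 5, and RT and Metacritic are already out of 100; the `Ratings` shape and function names are hypothetical and do not reflect the actual tool outputs.

```typescript
interface Ratings {
  imdb?: number;           // 0..10
  rottenTomatoes?: number; // 0..100
  metacritic?: number;     // 0..100
  letterboxd?: number;     // 0..5
}

interface ScaledRating {
  source: string;
  percent: number;
}

// Converts every available rating to a 0..100 scale, skipping missing sources.
function toPercent(r: Ratings): ScaledRating[] {
  const out: ScaledRating[] = [];
  if (r.imdb !== undefined) out.push({ source: "IMDB", percent: Math.round(r.imdb * 10) });
  if (r.rottenTomatoes !== undefined) out.push({ source: "RT", percent: r.rottenTomatoes });
  if (r.metacritic !== undefined) out.push({ source: "Metacritic", percent: r.metacritic });
  if (r.letterboxd !== undefined) out.push({ source: "Letterboxd", percent: Math.round(r.letterboxd * 20) });
  return out;
}

// Positive values mean RT is higher than IMDB, as in the first tip above.
function rtMinusImdb(r: Ratings): number | null {
  if (r.rottenTomatoes === undefined || r.imdb === undefined) return null;
  return r.rottenTomatoes - Math.round(r.imdb * 10);
}
```

What counts as a meaningful gap is a judgment call; no threshold is implied by the skill text.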
@@ -0,0 +1,153 @@
1
+ ---
2
+ name: multimodal-rag
3
+ version: '2.0.0'
4
+ description: Index and search across text, images, audio, video, and PDFs via the multimodal RAG pipeline and HTTP API.
5
+ author: Wunderland
6
+ namespace: wunderland
7
+ category: productivity
8
+ tags: [rag, multimodal, image, audio, video, pdf, search, indexing, memory]
9
+ requires_secrets: []
10
+ requires_tools: [vision-pipeline]
11
+ metadata:
12
+ agentos:
13
+ emoji: "\U0001F50D"
14
+ ---
15
+
16
+ # Multimodal RAG
17
+
18
+ Use this skill when the user wants to index, search, or retrieve content across multiple modalities -- text, images, audio, video, and documents (PDF, DOCX, Markdown, CSV, JSON, XML). All non-text content is converted to a text representation (vision description, STT transcript, document parse) before embedding, so every modality is searchable with the same text query.
19
+
20
+ ## Architecture
21
+
22
+ ```
23
+ Image --> Vision LLM --> description --> embed --> vector store
24
+ Audio --> STT --> transcript --> embed --> vector store
25
+ Video --> ffmpeg (frames + audio) --> vision + STT --> vector store
26
+ PDF --> text extraction + chunking --> embed --> vector store
27
+ ```
28
+
29
+ When cognitive memory is enabled via `MultimodalMemoryBridge`, ingested content also creates memory traces so agents can recall multimodal content during conversation without an explicit search.
30
+
31
+ ## Capabilities
32
+
33
+ - **Image indexing** — Vision LLM describes the image, description is embedded and searchable.
34
+ - **Audio indexing** — STT transcribes the audio, transcript is chunked and searchable.
35
+ - **Video indexing** — Frame extraction (vision) + audio transcription (STT), both indexed.
36
+ - **Document indexing** — PDF, DOCX, TXT, Markdown, CSV, JSON, XML text extracted and indexed.
37
+ - **Cross-modal search** — A single text query returns results from all modalities, ranked by relevance.
38
+ - **Query-by-image** — Upload an image to find similar indexed content.
39
+ - **Query-by-audio** — Upload audio to find related indexed content via transcript matching.
40
+
41
+ ## HTTP API Routes
42
+
43
+ All routes are mounted under `/api/agentos/rag/multimodal`. Ingestion routes accept `multipart/form-data`.
44
+
45
+ ### Ingest
46
+
47
+ | Method | Path | Field | Description |
48
+ |--------|------|-------|-------------|
49
+ | POST | `/images/ingest` | `image` | Ingest an image (max 15 MB). Vision LLM generates description. |
50
+ | POST | `/audio/ingest` | `audio` | Ingest audio (max 25 MB). STT generates transcript. |
51
+ | POST | `/documents/ingest` | `document` | Ingest a document (max 30 MB). Text extracted and chunked. |
52
+
53
+ Common form fields for all ingest routes:
54
+
55
+ | Field | Type | Description |
56
+ |-------|------|-------------|
57
+ | `collectionId` | string | Target collection (default: auto) |
58
+ | `assetId` | string | Optional custom ID for the asset |
59
+ | `category` | string | `conversation_memory`, `knowledge_base`, `user_notes`, `system`, `custom` |
60
+ | `tags` | string | Comma-separated or JSON array of tags |
61
+ | `metadata` | string | JSON object with arbitrary metadata |
62
+ | `storePayload` | boolean | Whether to store the raw binary (for later download) |
63
+ | `sourceUrl` | string | Original URL of the content |
64
+ | `textRepresentation` | string | Override auto-generated description/transcript |
65
+ | `userId` | string | Owner user ID |
66
+ | `agentId` | string | Owner agent ID |
67
+
68
+ ### Query
69
+
70
+ | Method | Path | Body / Field | Description |
71
+ |--------|------|-------------|-------------|
72
+ | POST | `/query` | JSON body | Text query across all modalities |
73
+ | POST | `/images/query` | `image` field | Query by uploading an image |
74
+ | POST | `/audio/query` | `audio` field | Query by uploading audio |
75
+
76
+ Text query body:
77
+
78
+ ```json
79
+ {
80
+ "query": "quantum computing diagrams",
81
+ "modalities": ["image", "audio", "document"],
82
+ "collectionIds": ["knowledge-base"],
83
+ "topK": 10,
84
+ "includeMetadata": true
85
+ }
86
+ ```
87
+
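A client could assemble the text query request as follows. This is a sketch: the route prefix and body fields come from the tables above, while `buildTextQueryRequest`, the `TextQuery` type, and the base URL in the usage note are hypothetical. The response shape is not documented in this diff, so it is left untyped.

```typescript
interface TextQuery {
  query: string;
  modalities?: string[];
  collectionIds?: string[];
  topK?: number;
  includeMetadata?: boolean;
}

const ROUTE_PREFIX = "/api/agentos/rag/multimodal";

// Builds the URL and fetch options for POST /query without sending anything.
function buildTextQueryRequest(baseUrl: string, query: TextQuery) {
  return {
    url: baseUrl.replace(/\/+$/, "") + ROUTE_PREFIX + "/query",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(query),
    },
  };
}
```

With a server assumed to be running at `http://localhost:3000`, the result can be passed to `fetch(request.url, request.init)`.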
88
+ Image/audio query form fields:
89
+
90
+ | Field | Type | Description |
91
+ |-------|------|-------------|
92
+ | `modalities` | string | Comma-separated: `image`, `audio`, `document` |
93
+ | `collectionIds` | string | Comma-separated collection IDs to search |
94
+ | `topK` | number | Max results (default: 5) |
95
+ | `includeMetadata` | boolean | Include stored metadata in results |
96
+ | `retrievalMode` | string | `auto` (default), `text`, `native`, `hybrid` |
97
+
98
+ ### Asset Management
99
+
100
+ | Method | Path | Description |
101
+ |--------|------|-------------|
102
+ | GET | `/assets/:assetId` | Get asset metadata |
103
+ | GET | `/assets/:assetId/content` | Download raw binary (if `storePayload` was true) |
104
+ | DELETE | `/assets/:assetId` | Delete asset and its embeddings |
105
+
106
+ ## Retrieval Modes
107
+
108
+ - **`auto`** (default) — Text-first retrieval with native augmentation when available.
109
+ - **`text`** — Derive a caption/transcript and query the standard text pipeline only.
110
+ - **`native`** — Use modality-native embeddings (e.g. CLIP for images) when available.
111
+ - **`hybrid`** — Combine text and native retrieval, merge and re-rank results.
112
+
113
+ ## Programmatic Usage
114
+
115
+ ```typescript
116
+ import { MultimodalMemoryBridge } from 'agentos/rag/multimodal';
117
+
118
+ // Ingest an image (`bridge` is assumed to be an already-configured MultimodalMemoryBridge instance)
119
+ await bridge.ingestImage(imageBuffer, { source: 'upload', tags: ['product'] });
120
+
121
+ // Ingest audio
122
+ await bridge.ingestAudio(audioBuffer, { language: 'en' });
123
+
124
+ // Ingest video (requires ffmpeg)
125
+ await bridge.ingestVideo(videoBuffer, { extractFrames: true });
126
+
127
+ // Ingest PDF
128
+ await bridge.ingestPDF(pdfBuffer, { extractImages: true });
129
+
130
+ // Cross-modal search (`indexer` is assumed to be an already-configured multimodal indexer instance)
131
+ const results = await indexer.search('quantum computing', {
132
+ topK: 10,
133
+ modalities: ['image', 'text', 'audio'],
134
+ });
135
+ ```
136
+
137
+ ## Examples
138
+
139
+ - "Index this product photo so I can find it by description later."
140
+ - "Ingest all the PDFs in this folder into my knowledge base."
141
+ - "Search my audio recordings for mentions of the quarterly budget."
142
+ - "Find images related to the network architecture diagram."
143
+ - "What does the chart on page 5 of the annual report show?"
144
+ - "Upload this meeting recording and make it searchable."
145
+
146
+ ## Constraints
147
+
148
+ - Image uploads are capped at 15 MB, audio at 25 MB, documents at 30 MB.
149
+ - Supported audio formats: MP3, MP4, M4A, WAV, WebM, OGG (Whisper-compatible).
150
+ - Supported document formats: PDF, DOCX, TXT, Markdown, CSV, JSON, XML.
151
+ - Video ingestion requires ffmpeg installed on the system.
152
+ - Vision LLM and STT provider must be configured for image/audio indexing respectively.
153
+ - Cross-modal search ranks by cosine similarity of embedded text representations; it does not perform true multimodal embedding fusion unless `retrievalMode: 'native'` is used with a CLIP-like model.
@@ -0,0 +1,43 @@
1
+ ---
2
+ name: notion
3
+ version: '1.0.0'
4
+ description: Read, create, and manage pages, databases, and content blocks in Notion workspaces.
5
+ author: Wunderland
6
+ namespace: wunderland
7
+ category: productivity
8
+ tags: [notion, wiki, database, notes, project-management, knowledge-base]
9
+ requires_secrets: [notion.api_key]
10
+ requires_tools: []
11
+ metadata:
12
+ agentos:
13
+ emoji: "\U0001F4D3"
14
+ primaryEnv: NOTION_API_KEY
15
+ homepage: https://developers.notion.com
16
+ ---
17
+
18
+ # Notion Workspace
19
+
20
+ You can interact with Notion workspaces to create, read, update, and search pages and databases. Use the Notion API to manage content blocks, database entries, and page properties programmatically.
21
+
22
+ When creating pages, structure content using Notion's block types: paragraphs, headings (h1/h2/h3), bulleted lists, numbered lists, to-do items, code blocks, callouts, and toggle blocks. Always use appropriate heading hierarchy for document structure. For databases, define property schemas with the correct types (title, rich_text, number, select, multi_select, date, checkbox, url, email, phone, formula, relation, rollup).
23
+
24
+ For search operations, use the Notion search endpoint with query text and optional filters by object type (page or database). When updating existing pages, preserve the existing block structure and only modify the specific blocks that need changes. Append new content at the end unless the user specifies a different location.
25
+
26
+ When working with database views, respect existing filters and sorts. Create new database entries with all required properties filled in. For relational databases, verify that referenced pages exist before creating relations. Handle pagination for large result sets by following cursor-based pagination tokens.
27
+
28
+ ## Examples
29
+
30
+ - "Create a new page in my Project Notes database with title 'Q1 Planning'"
31
+ - "Search my workspace for pages about 'onboarding'"
32
+ - "Add a to-do list to the meeting notes page with action items from the standup"
33
+ - "Query the Tasks database for all items assigned to me that are in progress"
34
+ - "Update the status of task #42 to 'Complete'"
35
+
36
+ ## Constraints
37
+
38
+ - API rate limit: an average of 3 requests/second per integration; rate-limited requests return HTTP 429 with a `Retry-After` header.
39
+ - Page content is limited to 100 blocks per append operation.
40
+ - Rich text segments are limited to 2,000 characters each.
41
+ - The integration can only access pages and databases explicitly shared with it.
42
+ - Nested blocks (children of children) require separate API calls to retrieve.
43
+ - File and media blocks cannot be created via API; only existing file URLs can be embedded.
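The two size limits above (100 blocks per append, 2,000 characters per rich text segment) suggest splitting content before sending it. The following is a rough Python sketch under stated assumptions: `paragraph_blocks` and `batches` are hypothetical helpers, and the split is by Python string length, which may not match exactly how the API counts characters.

```python
MAX_BLOCKS_PER_APPEND = 100   # from the constraint list above
MAX_RICH_TEXT_CHARS = 2000    # from the constraint list above


def paragraph_blocks(text: str) -> list[dict]:
    """Split text into paragraph blocks, one rich text segment per block,
    each segment no longer than MAX_RICH_TEXT_CHARS."""
    chunks = [
        text[i:i + MAX_RICH_TEXT_CHARS]
        for i in range(0, len(text), MAX_RICH_TEXT_CHARS)
    ]
    return [
        {
            "object": "block",
            "type": "paragraph",
            "paragraph": {
                "rich_text": [{"type": "text", "text": {"content": chunk}}]
            },
        }
        for chunk in chunks
    ]


def batches(blocks: list, size: int = MAX_BLOCKS_PER_APPEND) -> list[list]:
    """Group blocks into lists of at most `size` items, preserving order."""
    return [blocks[i:i + size] for i in range(0, len(blocks), size)]
```

Each batch would then go out as the `children` array of its own append request, spaced to stay within the rate limit. Splitting on a fixed character count can cut a word in half, so a real implementation would probably prefer sentence or paragraph boundaries.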
@@ -0,0 +1,42 @@
1
+ ---
2
+ name: obsidian
3
+ version: '1.0.0'
4
+ description: Read, create, and manage notes, links, and metadata in Obsidian vaults via the local filesystem.
5
+ author: Wunderland
6
+ namespace: wunderland
7
+ category: productivity
8
+ tags: [obsidian, markdown, notes, knowledge-graph, zettelkasten, pkm]
9
+ requires_secrets: []
10
+ requires_tools: [filesystem]
11
+ metadata:
12
+   agentos:
13
+     emoji: "\U0001F48E"
14
+     homepage: https://obsidian.md
15
+ ---
16
+
17
+ # Obsidian Vault Interaction
18
+
19
+ You can interact with Obsidian vaults by reading and writing Markdown files directly on the local filesystem. Obsidian vaults are simply directories of `.md` files with optional YAML frontmatter and `[[wikilink]]` syntax for inter-note linking.
20
+
21
+ When creating new notes, always include YAML frontmatter with relevant metadata fields like `tags`, `date`, `aliases`, and any custom properties the vault uses. Use `[[wikilinks]]` for internal links and `![[embeds]]` for transclusion. Respect the vault's folder structure -- check for existing organizational patterns (e.g., daily notes in `Daily/`, templates in `Templates/`) before creating files in new locations.
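As an illustration of the note layout described above, here is a small Python sketch that renders a note with YAML frontmatter. `render_note` is a hypothetical helper, and the choice of fields (`date`, `tags`, `aliases`) simply mirrors the examples in the text; a real vault may use other properties. Values are quoted with `json.dumps`, which produces strings that are also valid YAML double-quoted scalars.

```python
import json
from datetime import date
from typing import Optional, Sequence


def render_note(
    body: str,
    tags: Sequence[str],
    aliases: Sequence[str] = (),
    created: Optional[date] = None,
) -> str:
    """Return Markdown text with a YAML frontmatter block followed by the body."""
    created = created or date.today()
    lines = ["---", f"date: {created.isoformat()}"]
    for key, values in (("tags", tags), ("aliases", aliases)):
        if values:  # omit empty lists rather than writing empty keys
            lines.append(f"{key}:")
            lines.extend(
                f"  - {json.dumps(value, ensure_ascii=False)}" for value in values
            )
    lines += ["---", "", body.rstrip("\n"), ""]
    return "\n".join(lines)
```

Writing the result to a path such as `Meetings/Project Kickoff.md` is left to the caller, after checking that the folder matches the vault's existing organization.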
22
+
23
+ For searching and navigating the vault, scan file contents for keywords, tags (`#tag` syntax), and frontmatter properties. Follow `[[wikilinks]]` to traverse the knowledge graph. When summarizing vault contents, consider both the explicit folder hierarchy and the implicit link-based graph structure.
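A hedged sketch of the link and tag scanning described above. The regular expressions are simplified assumptions: they handle `[[Note]]`, `[[Note|alias]]`, `[[Note#Heading]]`, and `![[embeds]]`, but do not skip code blocks or read tags from frontmatter. `outgoing_links`, `inline_tags`, and `backlinks` are hypothetical names.

```python
import re
from pathlib import Path

# Capture the note name before any "#heading" or "|alias" suffix.
WIKILINK = re.compile(r"!?\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")
# A "#" that starts a tag: not preceded by a word character or "/",
# and followed directly by a letter (so "# Heading" is not a tag).
TAG = re.compile(r"(?<![\w/])#([A-Za-z][\w/-]*)")


def outgoing_links(text: str) -> set[str]:
    """Return the set of note names this text links to or embeds."""
    return {name.strip() for name in WIKILINK.findall(text)}


def inline_tags(text: str) -> set[str]:
    """Return inline #tags found in the text, without the leading '#'."""
    return set(TAG.findall(text))


def backlinks(vault: Path, target: str) -> list[str]:
    """Return names of notes in the vault that link to `target`."""
    hits = []
    for note in sorted(vault.rglob("*.md")):
        if target in outgoing_links(note.read_text(encoding="utf-8")):
            hits.append(note.stem)
    return hits
```

`backlinks` performs a full scan, which is the pattern the constraints below advise against for very large vaults; narrowing the `rglob` to a subfolder is one way to keep it targeted.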
24
+
25
+ When editing existing notes, preserve all existing frontmatter fields, wikilinks, and formatting. Append new content at appropriate locations rather than overwriting. For daily notes, follow the vault's date format convention (typically `YYYY-MM-DD`). Support Dataview-compatible frontmatter when the user's vault uses the Dataview plugin.
26
+
27
+ ## Examples
28
+
29
+ - "Create a new note called 'Project Kickoff' in the Meetings folder with today's date"
30
+ - "Find all notes tagged #research and summarize their key points"
31
+ - "Add a link to [[Architecture Decisions]] in the project overview note"
32
+ - "List all notes that link to [[API Design]] (backlinks)"
33
+ - "Create a daily note for today with the standup template"
34
+
35
+ ## Constraints
36
+
37
+ - Operates on local filesystem only; no cloud sync awareness.
38
+ - Cannot interact with Obsidian plugins directly (Canvas, Excalidraw, etc.) -- only reads/writes Markdown files.
39
+ - Binary attachments (images, PDFs) can be referenced but not created.
40
+ - Vault path must be known and accessible to the agent.
41
+ - Wikilink resolution follows Obsidian's "shortest path" convention when note names are unique.
42
+ - Large vaults (10,000+ notes) may require targeted searches rather than full scans.
@@ -0,0 +1,75 @@
1
+ ---
2
+ name: openwakeword
3
+ version: '1.0.0'
4
+ description: Offline wake-word detection via OpenWakeWord ONNX models using onnxruntime-node — fully open-source, configurable threshold, any ONNX-compatible model supported.
5
+ author: Wunderland
6
+ namespace: wunderland
7
+ category: voice
8
+ tags: [voice, wake-word, hotword, openwakeword, onnx, offline, open-source, privacy]
9
+ requires_secrets: []
10
+ requires_tools: []
11
+ metadata:
12
+   agentos:
13
+     emoji: "\U0001F6A8"
14
+     primaryEnv: OPENWAKEWORD_MODEL_PATH
15
+     homepage: https://github.com/dscripka/openWakeWord
16
+ ---
17
+
18
+ # OpenWakeWord
19
+
20
+ Use this skill to enable hands-free wake-word activation using open-source ONNX models. Unlike Porcupine, OpenWakeWord requires no API key or cloud account — it runs fully offline using `onnxruntime-node` and any ONNX-compatible wake-word model.
21
+
22
+ Prefer this over Porcupine when a fully open-source, zero-license solution is required, or when a custom ONNX wake-word model has been trained for a specific use case.
23
+
24
+ ## Setup
25
+
26
+ 1. Install `onnxruntime-node` as a dependency (the pack will attempt to load it dynamically).
27
+ 2. Obtain or train an ONNX wake-word model. Community models are available at the OpenWakeWord repository.
28
+ 3. Set `OPENWAKEWORD_MODEL_PATH` or configure via `providerOptions`.
29
+
30
+ Default model path: `~/.agentos/models/openwakeword/hey_mycroft.onnx`
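One way the model path could be resolved from these three sources is sketched below in Python. The precedence (explicit `modelPath` option, then `OPENWAKEWORD_MODEL_PATH`, then the default) is an assumption; the text above does not say which source wins. `resolve_model_path` is a hypothetical helper.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_MODEL = "~/.agentos/models/openwakeword/hey_mycroft.onnx"


def resolve_model_path(
    options: Optional[Mapping[str, str]] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Path:
    """Pick the model path from options, environment, or the default.

    Assumed precedence: options["modelPath"] > OPENWAKEWORD_MODEL_PATH > default.
    """
    env = os.environ if env is None else env
    options = options or {}
    raw = (
        options.get("modelPath")
        or env.get("OPENWAKEWORD_MODEL_PATH")
        or DEFAULT_MODEL
    )
    return Path(raw).expanduser()  # expand a leading "~" to the home directory
```

Checking that the returned path exists before loading is a sensible follow-up, since the model must already be downloaded.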
31
+
32
+ ## Configuration
33
+
34
+ ```json
35
+ {
36
+   "voice": {
37
+     "wakeWord": "openwakeword"
38
+   }
39
+ }
40
+ ```
41
+
42
+ With a custom model and threshold:
43
+
44
+ ```json
45
+ {
46
+   "voice": {
47
+     "wakeWord": "openwakeword",
48
+     "wakeWordOptions": {
49
+       "modelPath": "/opt/models/openwakeword/hey_assistant.onnx",
50
+       "threshold": 0.6,
51
+       "keyword": "hey assistant"
52
+     }
53
+   }
54
+ }
55
+ ```
56
+
57
+ ## Provider Rules
58
+
59
+ - `threshold` controls detection sensitivity (0–1). Default 0.5. Raise to reduce false positives; lower to reduce misses.
60
+ - Feature extraction uses RMS energy + zero-crossing rate from 80 ms audio frames — lightweight and CPU-friendly.
61
+ - Any ONNX model with the expected input/output shape is supported. Train custom models using the openWakeWord training utilities.
62
+ - No API key, no usage metering, no account required.
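The feature extraction and threshold rules above can be illustrated with a short Python sketch. It is not the pack's implementation: the 16 kHz sample rate, the float sample range, and the `>=` comparison are assumptions, and `frame_features`, `frames`, and `is_detection` are hypothetical names.

```python
import math
from typing import Iterator, Sequence

SAMPLE_RATE = 16_000  # assumption: the source does not state a sample rate
FRAME_MS = 80         # frame length stated in the rules above
FRAME_SAMPLES = SAMPLE_RATE * FRAME_MS // 1000  # samples per 80 ms frame


def frames(samples: Sequence[float], size: int = FRAME_SAMPLES) -> Iterator[Sequence[float]]:
    """Yield consecutive non-overlapping frames, dropping a trailing partial frame."""
    for start in range(0, len(samples) - size + 1, size):
        yield samples[start:start + size]


def frame_features(frame: Sequence[float]) -> tuple[float, float]:
    """Return (RMS energy, zero-crossing rate) for one frame of float samples."""
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    # Count adjacent sample pairs whose signs differ.
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    zcr = crossings / (len(frame) - 1)
    return rms, zcr


def is_detection(score: float, threshold: float = 0.5) -> bool:
    """Compare a model score in [0, 1] against the detection threshold."""
    return score >= threshold
```

Raising `threshold` makes `is_detection` stricter, which matches the guidance above about trading false positives against misses.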
63
+
64
+ ## Examples
65
+
66
+ - "Enable OpenWakeWord for fully open-source, API-key-free wake-word detection."
67
+ - "Use my custom ONNX wake-word model for 'hey assistant'."
68
+ - "Set wake-word detection threshold to 0.7 to reduce false triggers."
69
+
70
+ ## Constraints
71
+
72
+ - Requires `onnxruntime-node` to be installed.
73
+ - ONNX model must be pre-downloaded and accessible at the configured path.
74
+ - Feature extraction quality depends on audio clarity. Use in low-noise environments for best results.
75
+ - Custom model training requires the Python OpenWakeWord library and a GPU for reasonable training times.