opencode-skills-collection 1.0.186 → 1.0.187

Files changed (71)
  1. package/bundled-skills/.antigravity-install-manifest.json +5 -1
  2. package/bundled-skills/3d-web-experience/SKILL.md +152 -37
  3. package/bundled-skills/agent-evaluation/SKILL.md +1088 -26
  4. package/bundled-skills/agent-memory-systems/SKILL.md +1037 -25
  5. package/bundled-skills/agent-tool-builder/SKILL.md +668 -16
  6. package/bundled-skills/ai-agents-architect/SKILL.md +271 -31
  7. package/bundled-skills/ai-product/SKILL.md +716 -26
  8. package/bundled-skills/ai-wrapper-product/SKILL.md +450 -44
  9. package/bundled-skills/algolia-search/SKILL.md +867 -15
  10. package/bundled-skills/autonomous-agents/SKILL.md +1033 -26
  11. package/bundled-skills/aws-serverless/SKILL.md +1046 -35
  12. package/bundled-skills/azure-functions/SKILL.md +1318 -19
  13. package/bundled-skills/browser-automation/SKILL.md +1065 -28
  14. package/bundled-skills/browser-extension-builder/SKILL.md +159 -32
  15. package/bundled-skills/bullmq-specialist/SKILL.md +347 -16
  16. package/bundled-skills/clerk-auth/SKILL.md +796 -15
  17. package/bundled-skills/computer-use-agents/SKILL.md +1870 -28
  18. package/bundled-skills/context-window-management/SKILL.md +271 -18
  19. package/bundled-skills/conversation-memory/SKILL.md +453 -24
  20. package/bundled-skills/crewai/SKILL.md +252 -46
  21. package/bundled-skills/discord-bot-architect/SKILL.md +1207 -34
  22. package/bundled-skills/docs/integrations/jetski-cortex.md +3 -3
  23. package/bundled-skills/docs/integrations/jetski-gemini-loader/README.md +1 -1
  24. package/bundled-skills/docs/maintainers/repo-growth-seo.md +3 -3
  25. package/bundled-skills/docs/maintainers/skills-update-guide.md +1 -1
  26. package/bundled-skills/docs/users/bundles.md +1 -1
  27. package/bundled-skills/docs/users/claude-code-skills.md +1 -1
  28. package/bundled-skills/docs/users/gemini-cli-skills.md +1 -1
  29. package/bundled-skills/docs/users/getting-started.md +1 -1
  30. package/bundled-skills/docs/users/kiro-integration.md +1 -1
  31. package/bundled-skills/docs/users/usage.md +4 -4
  32. package/bundled-skills/docs/users/visual-guide.md +4 -4
  33. package/bundled-skills/email-systems/SKILL.md +646 -26
  34. package/bundled-skills/faf-expert/SKILL.md +221 -0
  35. package/bundled-skills/faf-wizard/SKILL.md +252 -0
  36. package/bundled-skills/file-uploads/SKILL.md +212 -11
  37. package/bundled-skills/firebase/SKILL.md +646 -16
  38. package/bundled-skills/gcp-cloud-run/SKILL.md +1117 -32
  39. package/bundled-skills/graphql/SKILL.md +1026 -27
  40. package/bundled-skills/hubspot-integration/SKILL.md +804 -19
  41. package/bundled-skills/idea-darwin/SKILL.md +120 -0
  42. package/bundled-skills/inngest/SKILL.md +431 -16
  43. package/bundled-skills/interactive-portfolio/SKILL.md +342 -44
  44. package/bundled-skills/langfuse/SKILL.md +296 -41
  45. package/bundled-skills/langgraph/SKILL.md +259 -50
  46. package/bundled-skills/micro-saas-launcher/SKILL.md +343 -44
  47. package/bundled-skills/neon-postgres/SKILL.md +572 -15
  48. package/bundled-skills/nextjs-supabase-auth/SKILL.md +269 -21
  49. package/bundled-skills/notion-template-business/SKILL.md +371 -44
  50. package/bundled-skills/personal-tool-builder/SKILL.md +537 -44
  51. package/bundled-skills/plaid-fintech/SKILL.md +825 -19
  52. package/bundled-skills/prompt-caching/SKILL.md +438 -25
  53. package/bundled-skills/rag-engineer/SKILL.md +271 -29
  54. package/bundled-skills/salesforce-development/SKILL.md +912 -19
  55. package/bundled-skills/satori/SKILL.md +54 -0
  56. package/bundled-skills/scroll-experience/SKILL.md +381 -44
  57. package/bundled-skills/segment-cdp/SKILL.md +817 -19
  58. package/bundled-skills/shopify-apps/SKILL.md +1475 -19
  59. package/bundled-skills/slack-bot-builder/SKILL.md +1162 -28
  60. package/bundled-skills/telegram-bot-builder/SKILL.md +152 -37
  61. package/bundled-skills/telegram-mini-app/SKILL.md +445 -44
  62. package/bundled-skills/trigger-dev/SKILL.md +916 -27
  63. package/bundled-skills/twilio-communications/SKILL.md +1310 -28
  64. package/bundled-skills/upstash-qstash/SKILL.md +898 -27
  65. package/bundled-skills/vercel-deployment/SKILL.md +637 -39
  66. package/bundled-skills/viral-generator-builder/SKILL.md +132 -37
  67. package/bundled-skills/voice-agents/SKILL.md +937 -27
  68. package/bundled-skills/voice-ai-development/SKILL.md +375 -46
  69. package/bundled-skills/workflow-automation/SKILL.md +982 -29
  70. package/bundled-skills/zapier-make-patterns/SKILL.md +772 -27
  71. package/package.json +1 -1
@@ -1,13 +1,21 @@
  ---
  name: voice-ai-development
- description: "You are an expert in building real-time voice applications. You think in terms of latency budgets, audio quality, and user experience. You know that voice apps feel magical when fast and broken when slow."
+ description: Expert in building voice AI applications - from real-time voice
+ agents to voice-enabled apps. Covers OpenAI Realtime API, Vapi for voice
+ agents, Deepgram for transcription, ElevenLabs for synthesis, LiveKit for
+ real-time infrastructure, and WebRTC fundamentals.
  risk: unknown
- source: "vibeship-spawner-skills (Apache 2.0)"
- date_added: "2026-02-27"
+ source: vibeship-spawner-skills (Apache 2.0)
+ date_added: 2026-02-27
  ---

  # Voice AI Development

+ Expert in building voice AI applications - from real-time voice agents to voice-enabled apps.
+ Covers OpenAI Realtime API, Vapi for voice agents, Deepgram for transcription, ElevenLabs
+ for synthesis, LiveKit for real-time infrastructure, and WebRTC fundamentals. Knows how to
+ build low-latency, production-ready voice experiences.
+
  **Role**: Voice AI Architect

  You are an expert in building real-time voice applications. You think in terms of
@@ -15,6 +23,14 @@ latency budgets, audio quality, and user experience. You know that voice apps fe
  magical when fast and broken when slow. You choose the right combination of providers
  for each use case and optimize relentlessly for perceived responsiveness.

+ ### Expertise
+
+ - Real-time audio streaming
+ - Voice agent architecture
+ - Provider selection
+ - Latency optimization
+ - Audio quality tuning
+
  ## Capabilities

  - OpenAI Realtime API
@@ -26,11 +42,47 @@ for each use case and optimize relentlessly for perceived responsiveness.
  - Voice agent design
  - Latency optimization

- ## Requirements
+ ## Prerequisites
+
+ - 0: Async programming
+ - 1: WebSocket basics
+ - 2: Audio concepts (sample rate, codec)
+ - Required skills: Python or Node.js, API keys for providers, Audio handling knowledge
+
+ ## Scope
+
+ - 0: Latency varies by provider
+ - 1: Cost per minute adds up
+ - 2: Quality depends on network
+ - 3: Complex debugging
+
+ ## Ecosystem
+
+ ### Primary
+
+ - OpenAI Realtime API
+ - Vapi
+ - Deepgram
+ - ElevenLabs
+
+ ### Infrastructure

- - Python or Node.js
- - API keys for providers
- - Audio handling knowledge
+ - LiveKit
+ - Daily.co
+ - Twilio
+
+ ### Common_integrations
+
+ - WebRTC
+ - WebSockets
+ - Telephony (SIP/PSTN)
+
+ ### Platforms
+
+ - Web applications
+ - Mobile apps
+ - Call centers
+ - Voice assistants

  ## Patterns

@@ -40,7 +92,6 @@ Native voice-to-voice with GPT-4o

  **When to use**: When you want integrated voice AI without separate STT/TTS

- ```python
  import asyncio
  import websockets
  import json
@@ -100,8 +151,30 @@ async def voice_session():
      async for message in ws:
          event = json.loads(message)

-         if event["type"] == "resp
- ```
+         if event["type"] == "response.audio.delta":
+             # Play audio chunk
+             audio = base64.b64decode(event["delta"])
+             play_audio(audio)
+
+         elif event["type"] == "response.audio_transcript.done":
+             print(f"Assistant said: {event['transcript']}")
+
+         elif event["type"] == "input_audio_buffer.speech_started":
+             print("User started speaking")
+
+         elif event["type"] == "response.function_call_arguments.done":
+             # Handle tool call
+             name = event["name"]
+             args = json.loads(event["arguments"])
+             result = call_function(name, args)
+             await ws.send(json.dumps({
+                 "type": "conversation.item.create",
+                 "item": {
+                     "type": "function_call_output",
+                     "call_id": event["call_id"],
+                     "output": json.dumps(result)
+                 }
+             }))

  ### Vapi Voice Agent

@@ -109,7 +182,6 @@ Build voice agents with Vapi platform

  **When to use**: Phone-based agents, quick deployment

- ```python
  # Vapi provides hosted voice agents with webhooks

  from flask import Flask, request, jsonify
@@ -180,7 +252,6 @@ web_call = client.calls.create(
      type="web"
  )
  # Returns URL for WebRTC connection
- ```

  ### Deepgram STT + ElevenLabs TTS

@@ -188,7 +259,6 @@ Best-in-class transcription and synthesis

  **When to use**: High quality voice, custom pipeline

- ```python
  import asyncio
  from deepgram import DeepgramClient, LiveTranscriptionEvents
  from elevenlabs import ElevenLabs
@@ -254,54 +324,313 @@ async def tts_websocket(text_stream):
      # Flush remaining audio
      final_audio = await tts.flush()
      yield final_audio
- ```

- ## Anti-Patterns
+ ### LiveKit Real-time Infrastructure
+
+ WebRTC infrastructure for voice apps
+
+ **When to use**: Building custom real-time voice apps
+
+ from livekit import api, rtc
+ import asyncio
+
+ # Server-side: Create room and tokens
+ lk_api = api.LiveKitAPI(
+     url="wss://your-livekit.livekit.cloud",
+     api_key="...",
+     api_secret="..."
+ )
+
+ async def create_room(room_name: str):
+     room = await lk_api.room.create_room(
+         api.CreateRoomRequest(name=room_name)
+     )
+     return room
+
+ def create_token(room_name: str, participant_name: str):
+     token = api.AccessToken(
+         api_key="...",
+         api_secret="..."
+     )
+     token.with_identity(participant_name)
+     token.with_grants(api.VideoGrants(
+         room_join=True,
+         room=room_name
+     ))
+     return token.to_jwt()
+
+ # Agent-side: Connect and process audio
+ async def voice_agent(room_name: str):
+     room = rtc.Room()
+
+     @room.on("track_subscribed")
+     def on_track(track, publication, participant):
+         if track.kind == rtc.TrackKind.KIND_AUDIO:
+             # Process incoming audio
+             audio_stream = rtc.AudioStream(track)
+             asyncio.create_task(process_audio(audio_stream))
+
+     token = create_token(room_name, "agent")
+     await room.connect("wss://your-livekit.livekit.cloud", token)
+
+     # Publish agent's audio
+     source = rtc.AudioSource(sample_rate=24000, num_channels=1)
+     track = rtc.LocalAudioTrack.create_audio_track("agent-voice", source)
+     await room.local_participant.publish_track(track)
+
+     # Send audio from TTS
+     async def speak(text: str):
+         for audio_chunk in text_to_speech(text):
+             await source.capture_frame(rtc.AudioFrame(
+                 data=audio_chunk,
+                 sample_rate=24000,
+                 num_channels=1,
+                 samples_per_channel=len(audio_chunk) // 2
+             ))
+
+     return room, speak
+
+ # Process audio with STT
+ async def process_audio(audio_stream):
+     async for frame in audio_stream:
+         # Send to Deepgram or other STT
+         await transcriber.send(frame.data)
+
+ ### Full Voice Agent Pipeline
+
+ Complete voice agent with all components
+
+ **When to use**: Custom production voice agent
+
+ import asyncio
+ from dataclasses import dataclass
+ from typing import AsyncIterator
+
+ @dataclass
+ class VoiceAgentConfig:
+     stt_provider: str = "deepgram"
+     tts_provider: str = "elevenlabs"
+     llm_provider: str = "openai"
+     vad_enabled: bool = True
+     interrupt_enabled: bool = True
+
+ class VoiceAgent:
+     def __init__(self, config: VoiceAgentConfig):
+         self.config = config
+         self.is_speaking = False
+         self.conversation_history = []
+
+     async def process_audio_stream(
+         self,
+         audio_in: AsyncIterator[bytes],
+         audio_out: asyncio.Queue
+     ):
+         """Main audio processing loop."""
+
+         # STT streaming
+         async def transcribe():
+             transcript_buffer = ""
+             async for audio_chunk in audio_in:
+                 # Check for interruption
+                 if self.is_speaking and self.config.interrupt_enabled:
+                     if await self.detect_speech(audio_chunk):
+                         await self.stop_speaking()
+
+                 result = await self.stt.transcribe(audio_chunk)
+                 if result.is_final:
+                     yield result.transcript
+
+         # Process transcripts
+         async for user_text in transcribe():
+             if not user_text.strip():
+                 continue
+
+             self.conversation_history.append({
+                 "role": "user",
+                 "content": user_text
+             })
+
+             # Generate response with streaming
+             self.is_speaking = True
+             async for audio_chunk in self.generate_response(user_text):
+                 await audio_out.put(audio_chunk)
+             self.is_speaking = False
+
+     async def generate_response(self, text: str) -> AsyncIterator[bytes]:
+         """Stream LLM response through TTS."""
+
+         # Stream LLM tokens
+         llm_stream = self.llm.stream_chat(self.conversation_history)
+
+         # Buffer for TTS (need ~50 chars for good prosody)
+         text_buffer = ""
+         full_response = ""
+
+         async for token in llm_stream:
+             text_buffer += token
+             full_response += token
+
+             # Send to TTS when we have enough text
+             if len(text_buffer) > 50 or token in ".!?":
+                 async for audio in self.tts.synthesize_stream(text_buffer):
+                     yield audio
+                 text_buffer = ""
+
+         # Flush remaining
+         if text_buffer:
+             async for audio in self.tts.synthesize_stream(text_buffer):
+                 yield audio
+
+         self.conversation_history.append({
+             "role": "assistant",
+             "content": full_response
+         })
+
+     async def detect_speech(self, audio: bytes) -> bool:
+         """Voice activity detection."""
+         # Use WebRTC VAD or Silero VAD
+         return self.vad.is_speech(audio)
+
+     async def stop_speaking(self):
+         """Handle interruption."""
+         self.is_speaking = False
+         # Clear audio queue
+         # Stop TTS generation

- ### Non-streaming Pipeline
+ # Latency optimization tips:
+ # 1. Use streaming everywhere (STT, LLM, TTS)
+ # 2. Start TTS before LLM finishes (~50 char buffer)
+ # 3. Use PCM audio format (no encoding overhead)
+ # 4. Keep WebSocket connections alive
+ # 5. Use regional endpoints close to users
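Tip 2 (start TTS before the LLM finishes) can be sketched as a small buffering step between the token stream and the synthesizer. This is a minimal sketch under stated assumptions, not the skill's own implementation: `chunk_for_tts` and `fake_llm_stream` are hypothetical names, and the 50-character threshold simply mirrors the pipeline's buffer size. It checks the end of the buffer rather than the raw token, so a multi-character token such as `" world."` also triggers a flush.

```python
import asyncio
from typing import AsyncIterator, List

SENTENCE_ENDINGS = (".", "!", "?")


async def chunk_for_tts(
    tokens: AsyncIterator[str], min_chars: int = 50
) -> AsyncIterator[str]:
    """Group streamed LLM tokens into chunks that are worth sending to TTS."""
    buffer = ""
    async for token in tokens:
        buffer += token
        # Flush on a sentence boundary or once the buffer is long enough.
        if len(buffer) > min_chars or buffer.rstrip().endswith(SENTENCE_ENDINGS):
            yield buffer
            buffer = ""
    # Flush whatever is left when the LLM stream ends.
    if buffer:
        yield buffer


async def fake_llm_stream() -> AsyncIterator[str]:
    """Hypothetical stand-in for a streaming LLM client."""
    for token in ["Hello", " there", ".", " How", " can", " I", " help", "?", " Bye"]:
        yield token


async def collect_chunks() -> List[str]:
    return [chunk async for chunk in chunk_for_tts(fake_llm_stream())]


chunks = asyncio.run(collect_chunks())
```

Each yielded chunk can be handed to a streaming synthesizer as soon as it is produced, so audio for the first sentence can start while later tokens are still arriving.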

- **Why bad**: Adds seconds of latency.
- User perceives as slow.
- Loses conversation flow.
+ ## Validation Checks

- **Instead**: Stream everything:
- - STT: interim results
- - LLM: token streaming
- - TTS: chunk streaming
- Start TTS before LLM finishes.
+ ### Non-Streaming TTS

- ### ❌ Ignoring Interruptions
+ Severity: HIGH

- **Why bad**: Frustrating user experience.
- Feels like talking to a machine.
- Wastes time.
+ Message: Non-streaming TTS adds significant latency.

- **Instead**: Implement barge-in detection.
- Use VAD to detect user speech.
- Stop TTS immediately.
- Clear audio queue.
+ Fix action: Use tts.synthesize_stream() or tts.convert_as_stream()

- ### Single Provider Lock-in
+ ### Hardcoded Sample Rate

- **Why bad**: May not be best quality.
- Single point of failure.
- Harder to optimize.
+ Severity: MEDIUM

- **Instead**: Mix best providers:
- - Deepgram for STT (speed + accuracy)
- - ElevenLabs for TTS (voice quality)
- - OpenAI/Anthropic for LLM
+ Message: Hardcoded sample rate may cause format mismatches.

- ## Limitations
+ Fix action: Define sample rates as constants, document expected formats
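One way to apply this fix is to keep the audio format in named constants and derive frame sizes from them. The sketch below assumes 16-bit signed mono PCM at 24 kHz, matching the `sample_rate=24000` and `len(audio_chunk) // 2` used in the LiveKit example; the constant and helper names are illustrative, not part of the skill.

```python
SAMPLE_RATE_HZ = 24_000   # assumed: same rate as the LiveKit AudioSource example
NUM_CHANNELS = 1          # assumed: mono
BYTES_PER_SAMPLE = 2      # assumed: 16-bit signed PCM


def samples_per_channel(
    pcm: bytes,
    num_channels: int = NUM_CHANNELS,
    bytes_per_sample: int = BYTES_PER_SAMPLE,
) -> int:
    """Number of samples per channel in a raw interleaved PCM buffer."""
    frame_size = num_channels * bytes_per_sample
    if len(pcm) % frame_size != 0:
        raise ValueError("PCM buffer length is not a whole number of frames")
    return len(pcm) // frame_size


def duration_ms(pcm: bytes, sample_rate: int = SAMPLE_RATE_HZ) -> float:
    """Playback duration of the buffer in milliseconds."""
    return 1000 * samples_per_channel(pcm) / sample_rate


# 480 samples of 16-bit mono audio: 20 ms at 24 kHz.
twenty_ms_chunk = bytes(480 * BYTES_PER_SAMPLE)
chunk_samples = samples_per_channel(twenty_ms_chunk)
chunk_duration = duration_ms(twenty_ms_chunk)
```

Deriving `samples_per_channel` from the constants keeps the divisor in step with the format if the pipeline later switches to stereo or a different sample width.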

- - Latency varies by provider
- - Cost per minute adds up
- - Quality depends on network
- - Complex debugging
+ ### WebSocket Without Reconnection
+
+ Severity: HIGH
+
+ Message: WebSocket connections need reconnection logic.
+
+ Fix action: Add retry loop with exponential backoff
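A retry loop with exponential backoff might look like the following. It is a library-agnostic sketch: `run_with_reconnect` and `flaky_session` are hypothetical names, and the default `retry_on` tuple only covers built-in connection errors. With the `websockets` package, its `ConnectionClosed` exception would need to be added to `retry_on`, since it does not derive from `ConnectionError`.

```python
import asyncio
import random
from typing import Awaitable, Callable, List, Tuple, Type


async def run_with_reconnect(
    connect_and_run: Callable[[], Awaitable[str]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    jitter: float = 0.1,
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError, OSError),
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> str:
    """Run a session coroutine, reconnecting with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return await connect_and_run()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # 0.5s, 1s, 2s, ... capped at max_delay, plus optional jitter.
            delay = min(max_delay, base_delay * 2 ** attempt)
            await sleep(delay + random.uniform(0, delay * jitter))
    raise RuntimeError("max_attempts must be at least 1")


# Demo with a fake session that drops twice before succeeding.
state = {"attempts": 0}
recorded_delays: List[float] = []


async def flaky_session() -> str:
    state["attempts"] += 1
    if state["attempts"] < 3:
        raise ConnectionError("connection dropped")
    return "connected"


async def fake_sleep(seconds: float) -> None:
    recorded_delays.append(seconds)  # record instead of actually waiting


result = asyncio.run(
    run_with_reconnect(flaky_session, jitter=0.0, sleep=fake_sleep)
)
```

The injectable `sleep` is only there to make the backoff schedule observable without waiting; production code would leave the default.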
+
+ ### Missing VAD Configuration
+
+ Severity: MEDIUM
+
+ Message: VAD needs tuning for good user experience.
+
+ Fix action: Configure threshold and silence_duration_ms
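For the OpenAI Realtime pattern, this tuning would typically be sent as a `session.update` event. The sketch below is an assumption-laden illustration: the `turn_detection` field names follow the server VAD settings of the Realtime API's beta session format, the default values are placeholders rather than recommendations, and `build_vad_session_update` is a hypothetical helper. Check the provider's current API reference before relying on any of it.

```python
import json


def build_vad_session_update(
    threshold: float = 0.5,
    silence_duration_ms: int = 500,
    prefix_padding_ms: int = 300,
) -> str:
    """Build a session.update payload that configures server-side VAD."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")
    if silence_duration_ms <= 0:
        raise ValueError("silence_duration_ms must be positive")
    return json.dumps({
        "type": "session.update",
        "session": {
            "turn_detection": {
                "type": "server_vad",
                # Higher threshold: less sensitive to background noise.
                "threshold": threshold,
                # Audio kept from just before speech was detected.
                "prefix_padding_ms": prefix_padding_ms,
                # Silence required before the turn is considered finished.
                "silence_duration_ms": silence_duration_ms,
            }
        },
    })


payload = json.loads(
    build_vad_session_update(threshold=0.6, silence_duration_ms=700)
)
```

A longer `silence_duration_ms` reduces the chance of cutting users off mid-sentence at the cost of slower turn-taking, so it is usually tuned together with the latency budget.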
+
+ ### Blocking Audio Processing
+
+ Severity: HIGH
+
+ Message: Audio processing should be async to avoid blocking.
+
+ Fix action: Use async def and await for audio operations
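Marking a function `async def` is not enough when the work inside it is a blocking call (a synchronous SDK, file I/O, or CPU-heavy resampling). One option, sketched here with the standard library's `asyncio.to_thread` (Python 3.9+), is to push that call onto a worker thread so the event loop keeps servicing other audio tasks. `blocking_resample` is a hypothetical stand-in for the blocking step.

```python
import asyncio
import time
from typing import Tuple


def blocking_resample(pcm: bytes) -> bytes:
    """Hypothetical blocking step; sleeps to simulate slow synchronous work."""
    time.sleep(0.05)
    return pcm


async def process_chunk(pcm: bytes) -> bytes:
    # Run the blocking call in a worker thread instead of on the event loop.
    return await asyncio.to_thread(blocking_resample, pcm)


async def demo() -> Tuple[bytes, int]:
    ticks = 0

    async def ticker() -> None:
        # Stands in for other audio tasks that must keep running.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.005)
            ticks += 1

    ticker_task = asyncio.create_task(ticker())
    processed = await process_chunk(b"\x00\x01" * 4)
    ticker_task.cancel()
    try:
        await ticker_task
    except asyncio.CancelledError:
        pass
    return processed, ticks


processed_chunk, ticks_during_work = asyncio.run(demo())
```

If `blocking_resample` were called directly inside the coroutine, the ticker would not advance at all during the 50 ms of work.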
+
+ ### Missing Interruption Handling
+
+ Severity: MEDIUM
+
+ Message: Voice agents should handle user interruptions.
+
+ Fix action: Add barge-in detection and cancel current response
+
+ ### Audio Queue Without Clear
+
+ Severity: LOW
+
+ Message: Audio queues should be clearable for interruptions.
+
+ Fix action: Add method to clear queue on interruption
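`asyncio.Queue` has no built-in clear method, so a thin wrapper is one way to make the outgoing audio queue clearable on barge-in. This is a minimal sketch; `ClearableAudioQueue` is a hypothetical name, and a real agent would also cancel the in-flight TTS task at the same time.

```python
import asyncio
from typing import Tuple


class ClearableAudioQueue:
    """Outgoing audio queue that can be emptied when the user interrupts."""

    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()

    async def put(self, chunk: bytes) -> None:
        await self._queue.put(chunk)

    async def get(self) -> bytes:
        return await self._queue.get()

    def clear(self) -> int:
        """Drop all queued chunks and return how many were discarded."""
        dropped = 0
        while True:
            try:
                self._queue.get_nowait()
            except asyncio.QueueEmpty:
                return dropped
            dropped += 1


async def demo() -> Tuple[int, bytes]:
    queue = ClearableAudioQueue()
    for i in range(3):
        await queue.put(bytes([i]))   # stale audio from the old response
    dropped = queue.clear()           # user started speaking: discard it
    await queue.put(b"new")           # audio for the next response
    return dropped, await queue.get()


dropped_count, next_chunk = asyncio.run(demo())
```

Calling `clear()` from `stop_speaking()` would fill in the "Clear audio queue" placeholder in the full pipeline example.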
+
+ ### WebSocket Without Error Handling
+
+ Severity: HIGH
+
+ Message: WebSocket operations need error handling.
+
+ Fix action: Wrap in try/except for ConnectionClosed
+
+ ## Collaboration
+
+ ### Delegation Triggers
+
+ - agent graph|workflow|state -> langgraph (Need complex agent logic behind voice)
+ - extract|structured|json -> structured-output (Need to extract structured data from voice)
+ - observability|tracing|monitoring -> langfuse (Need to monitor voice agent quality)
+ - frontend|web|react -> nextjs-app-router (Need web interface for voice agent)
+
+ ### Intelligent Voice Agent
+
+ Skills: voice-ai-development, langgraph, structured-output
+
+ Workflow:
+
+ ```
+ 1. Design agent graph with tools
+ 2. Add voice interface layer
+ 3. Use structured output for tool responses
+ 4. Optimize for voice latency
+ ```
+
+ ### Monitored Voice Agent
+
+ Skills: voice-ai-development, langfuse
+
+ Workflow:
+
+ ```
+ 1. Build voice agent with provider of choice
+ 2. Add Langfuse callbacks
+ 3. Track latency, quality, conversation flow
+ 4. Iterate based on metrics
+ ```
+
+ ### Phone-based Agent
+
+ Skills: voice-ai-development, twilio
+
+ Workflow:
+
+ ```
+ 1. Set up Vapi or custom agent
+ 2. Connect to Twilio for PSTN
+ 3. Handle inbound/outbound calls
+ 4. Implement call routing logic
+ ```

  ## Related Skills

  Works well with: `langgraph`, `structured-output`, `langfuse`

  ## When to Use
- This skill is applicable to execute the workflow or actions described in the overview.
+
+ - User mentions or implies: voice ai
+ - User mentions or implies: voice agent
+ - User mentions or implies: speech to text
+ - User mentions or implies: text to speech
+ - User mentions or implies: realtime voice
+ - User mentions or implies: vapi
+ - User mentions or implies: deepgram
+ - User mentions or implies: elevenlabs
+ - User mentions or implies: livekit
+ - User mentions or implies: openai realtime