@bgicli/bgicli 2.2.8 → 2.2.10
- package/data/skills/anthropic-algorithmic-art/SKILL.md +405 -0
- package/data/skills/anthropic-canvas-design/SKILL.md +130 -0
- package/data/skills/anthropic-claude-api/SKILL.md +243 -0
- package/data/skills/anthropic-doc-coauthoring/SKILL.md +375 -0
- package/data/skills/anthropic-docx/SKILL.md +590 -0
- package/data/skills/anthropic-frontend-design/SKILL.md +42 -0
- package/data/skills/anthropic-internal-comms/SKILL.md +32 -0
- package/data/skills/anthropic-mcp-builder/SKILL.md +236 -0
- package/data/skills/anthropic-pdf/SKILL.md +314 -0
- package/data/skills/anthropic-pptx/SKILL.md +232 -0
- package/data/skills/anthropic-skill-creator/SKILL.md +485 -0
- package/data/skills/anthropic-webapp-testing/SKILL.md +96 -0
- package/data/skills/anthropic-xlsx/SKILL.md +292 -0
- package/data/skills/arxiv-database/SKILL.md +362 -0
- package/data/skills/astropy/SKILL.md +329 -0
- package/data/skills/ctx-advanced-evaluation/SKILL.md +402 -0
- package/data/skills/ctx-bdi-mental-states/SKILL.md +311 -0
- package/data/skills/ctx-context-compression/SKILL.md +272 -0
- package/data/skills/ctx-context-degradation/SKILL.md +206 -0
- package/data/skills/ctx-context-fundamentals/SKILL.md +201 -0
- package/data/skills/ctx-context-optimization/SKILL.md +195 -0
- package/data/skills/ctx-evaluation/SKILL.md +251 -0
- package/data/skills/ctx-filesystem-context/SKILL.md +287 -0
- package/data/skills/ctx-hosted-agents/SKILL.md +260 -0
- package/data/skills/ctx-memory-systems/SKILL.md +225 -0
- package/data/skills/ctx-multi-agent-patterns/SKILL.md +257 -0
- package/data/skills/ctx-project-development/SKILL.md +291 -0
- package/data/skills/ctx-tool-design/SKILL.md +271 -0
- package/data/skills/dhdna-profiler/SKILL.md +162 -0
- package/data/skills/generate-image/SKILL.md +183 -0
- package/data/skills/geomaster/SKILL.md +365 -0
- package/data/skills/get-available-resources/SKILL.md +275 -0
- package/data/skills/hamelsmu-build-review-interface/SKILL.md +96 -0
- package/data/skills/hamelsmu-error-analysis/SKILL.md +164 -0
- package/data/skills/hamelsmu-eval-audit/SKILL.md +183 -0
- package/data/skills/hamelsmu-evaluate-rag/SKILL.md +177 -0
- package/data/skills/hamelsmu-generate-synthetic-data/SKILL.md +131 -0
- package/data/skills/hamelsmu-validate-evaluator/SKILL.md +212 -0
- package/data/skills/hamelsmu-write-judge-prompt/SKILL.md +144 -0
- package/data/skills/hf-cli/SKILL.md +174 -0
- package/data/skills/hf-mcp/SKILL.md +178 -0
- package/data/skills/hugging-face-dataset-viewer/SKILL.md +121 -0
- package/data/skills/hugging-face-datasets/SKILL.md +542 -0
- package/data/skills/hugging-face-evaluation/SKILL.md +651 -0
- package/data/skills/hugging-face-jobs/SKILL.md +1042 -0
- package/data/skills/hugging-face-model-trainer/SKILL.md +717 -0
- package/data/skills/hugging-face-paper-pages/SKILL.md +239 -0
- package/data/skills/hugging-face-paper-publisher/SKILL.md +624 -0
- package/data/skills/hugging-face-tool-builder/SKILL.md +110 -0
- package/data/skills/hugging-face-trackio/SKILL.md +115 -0
- package/data/skills/hugging-face-vision-trainer/SKILL.md +593 -0
- package/data/skills/huggingface-gradio/SKILL.md +245 -0
- package/data/skills/matlab/SKILL.md +376 -0
- package/data/skills/modal/SKILL.md +381 -0
- package/data/skills/openai-cloudflare-deploy/SKILL.md +224 -0
- package/data/skills/openai-develop-web-game/SKILL.md +149 -0
- package/data/skills/openai-doc/SKILL.md +80 -0
- package/data/skills/openai-figma/SKILL.md +42 -0
- package/data/skills/openai-figma-implement-design/SKILL.md +264 -0
- package/data/skills/openai-gh-address-comments/SKILL.md +25 -0
- package/data/skills/openai-gh-fix-ci/SKILL.md +69 -0
- package/data/skills/openai-imagegen/SKILL.md +174 -0
- package/data/skills/openai-jupyter-notebook/SKILL.md +107 -0
- package/data/skills/openai-linear/SKILL.md +87 -0
- package/data/skills/openai-netlify-deploy/SKILL.md +247 -0
- package/data/skills/openai-notion-knowledge-capture/SKILL.md +56 -0
- package/data/skills/openai-notion-meeting-intelligence/SKILL.md +60 -0
- package/data/skills/openai-notion-research-documentation/SKILL.md +59 -0
- package/data/skills/openai-notion-spec-to-implementation/SKILL.md +58 -0
- package/data/skills/openai-openai-docs/SKILL.md +69 -0
- package/data/skills/openai-pdf/SKILL.md +67 -0
- package/data/skills/openai-playwright/SKILL.md +147 -0
- package/data/skills/openai-render-deploy/SKILL.md +479 -0
- package/data/skills/openai-screenshot/SKILL.md +267 -0
- package/data/skills/openai-security-best-practices/SKILL.md +86 -0
- package/data/skills/openai-security-ownership-map/SKILL.md +206 -0
- package/data/skills/openai-security-threat-model/SKILL.md +81 -0
- package/data/skills/openai-sentry/SKILL.md +123 -0
- package/data/skills/openai-sora/SKILL.md +178 -0
- package/data/skills/openai-speech/SKILL.md +144 -0
- package/data/skills/openai-spreadsheet/SKILL.md +145 -0
- package/data/skills/openai-transcribe/SKILL.md +81 -0
- package/data/skills/openai-vercel-deploy/SKILL.md +77 -0
- package/data/skills/openai-yeet/SKILL.md +28 -0
- package/data/skills/pennylane/SKILL.md +224 -0
- package/data/skills/polars-bio/SKILL.md +374 -0
- package/data/skills/primekg/SKILL.md +97 -0
- package/data/skills/pymatgen/SKILL.md +689 -0
- package/data/skills/qiskit/SKILL.md +273 -0
- package/data/skills/qutip/SKILL.md +316 -0
- package/data/skills/recursive-decomposition/SKILL.md +185 -0
- package/data/skills/rowan/SKILL.md +427 -0
- package/data/skills/scholar-evaluation/SKILL.md +298 -0
- package/data/skills/sentry-create-alert/SKILL.md +210 -0
- package/data/skills/sentry-fix-issues/SKILL.md +126 -0
- package/data/skills/sentry-pr-code-review/SKILL.md +105 -0
- package/data/skills/sentry-python-sdk/SKILL.md +317 -0
- package/data/skills/sentry-setup-ai-monitoring/SKILL.md +217 -0
- package/data/skills/stable-baselines3/SKILL.md +297 -0
- package/data/skills/sympy/SKILL.md +498 -0
- package/data/skills/trailofbits-ask-questions-if-underspecified/SKILL.md +85 -0
- package/data/skills/trailofbits-audit-context-building/SKILL.md +302 -0
- package/data/skills/trailofbits-differential-review/SKILL.md +220 -0
- package/data/skills/trailofbits-insecure-defaults/SKILL.md +117 -0
- package/data/skills/trailofbits-modern-python/SKILL.md +333 -0
- package/data/skills/trailofbits-property-based-testing/SKILL.md +123 -0
- package/data/skills/trailofbits-semgrep-rule-creator/SKILL.md +172 -0
- package/data/skills/trailofbits-sharp-edges/SKILL.md +292 -0
- package/data/skills/trailofbits-variant-analysis/SKILL.md +142 -0
- package/data/skills/transformers.js/SKILL.md +637 -0
- package/data/skills/writing/SKILL.md +419 -0
- package/dist/bgi.js +66 -2
- package/package.json +1 -1
@@ -0,0 +1,123 @@
---
name: "sentry"
description: "Use when the user asks to inspect Sentry issues or events, summarize recent production errors, or pull basic Sentry health data via the Sentry API; perform read-only queries with the bundled script and require `SENTRY_AUTH_TOKEN`."
---


# Sentry (Read-only Observability)

## Quick start

- If not already authenticated, ask the user to provide a valid `SENTRY_AUTH_TOKEN` (read-only scopes such as `project:read`, `event:read`) or to log in and create one before running commands.
- Set `SENTRY_AUTH_TOKEN` as an env var.
- Optional defaults: `SENTRY_ORG`, `SENTRY_PROJECT`, `SENTRY_BASE_URL`.
- Defaults: org/project `{your-org}`/`{your-project}`, time range `24h`, environment `prod`, limit 20 (max 50).
- Always call the Sentry API (no heuristics, no caching).

If the token is missing, give the user these steps:
1. Open the Sentry auth tokens page: https://sentry.io/settings/account/api/auth-tokens/
2. Create a token with read-only scopes such as `project:read`, `event:read`, and `org:read`.
3. Set `SENTRY_AUTH_TOKEN` as an environment variable in their system.
4. Offer to guide them through setting the environment variable for their OS/shell if needed.
- Never ask the user to paste the full token in chat. Ask them to set it locally and confirm when ready.
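
The token check above can be sketched in Python. This is a minimal illustration, not part of the bundled script: the helper name `require_sentry_token` is hypothetical. It only reports whether `SENTRY_AUTH_TOKEN` is set and never returns or prints the value.

```python
import os


def require_sentry_token(env=None):
    """Return True if SENTRY_AUTH_TOKEN is set and non-empty.

    Hypothetical helper: the token value is never returned or printed.
    """
    env = os.environ if env is None else env
    return bool(env.get("SENTRY_AUTH_TOKEN", "").strip())
```

If it returns `False`, the steps above apply: ask the user to set the variable locally and confirm, rather than pasting the token into chat.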

## Core tasks (use bundled script)

Use `scripts/sentry_api.py` for deterministic API calls. It handles pagination and retries once on transient errors.

## Skill path (set once)

```bash
export CODEX_HOME="${CODEX_HOME:-$HOME/.codex}"
export SENTRY_API="$CODEX_HOME/skills/sentry/scripts/sentry_api.py"
```

User-scoped skills install under `$CODEX_HOME/skills` (default: `~/.codex/skills`).

### 1) List issues (ordered by most recent)

```bash
python3 "$SENTRY_API" \
  list-issues \
  --org {your-org} \
  --project {your-project} \
  --environment prod \
  --time-range 24h \
  --limit 20 \
  --query "is:unresolved"
```

### 2) Resolve an issue short ID to issue ID

```bash
python3 "$SENTRY_API" \
  list-issues \
  --org {your-org} \
  --project {your-project} \
  --query "ABC-123" \
  --limit 1
```

Use the returned `id` for issue detail or events.

### 3) Issue detail

```bash
python3 "$SENTRY_API" \
  issue-detail \
  1234567890
```

### 4) Issue events

```bash
python3 "$SENTRY_API" \
  issue-events \
  1234567890 \
  --limit 20
```

### 5) Event detail (no stack traces by default)

```bash
python3 "$SENTRY_API" \
  event-detail \
  --org {your-org} \
  --project {your-project} \
  abcdef1234567890
```

## API requirements

Always use these endpoints (GET only):

- List issues: `/api/0/projects/{org_slug}/{project_slug}/issues/`
- Issue detail: `/api/0/issues/{issue_id}/`
- Events for issue: `/api/0/issues/{issue_id}/events/`
- Event detail: `/api/0/projects/{org_slug}/{project_slug}/events/{event_id}/`

## Inputs and defaults

- `org_slug`, `project_slug`: default to `{your-org}`/`{your-project}` (avoid non-prod orgs).
- `time_range`: default `24h` (pass as `statsPeriod`).
- `environment`: default `prod`.
- `limit`: default 20, max 50 (paginate until limit reached).
- `search_query`: optional `query` parameter.
- `issue_short_id`: resolve via list-issues query first.
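
As a rough sketch of how the inputs above could map onto a list-issues request, the snippet below builds the GET URL and caps the limit at 50. It is not the bundled script's implementation: the function name is hypothetical, the `https://sentry.io` base URL is an assumed default for `SENTRY_BASE_URL`, and sending the environment as an `environment` query parameter is likewise an assumption.

```python
from urllib.parse import urlencode

DEFAULT_LIMIT = 20
MAX_LIMIT = 50


def build_list_issues_request(org_slug, project_slug, time_range="24h",
                              environment="prod", limit=DEFAULT_LIMIT,
                              search_query=None,
                              base_url="https://sentry.io"):  # assumed default
    """Return (url, capped_limit) for a read-only list-issues GET.

    Hypothetical helper. The caller paginates until capped_limit is reached.
    """
    capped = max(1, min(int(limit), MAX_LIMIT))
    # time_range is passed as statsPeriod; "environment" is an assumed name.
    params = {"statsPeriod": time_range, "environment": environment}
    if search_query:
        params["query"] = search_query
    path = f"/api/0/projects/{org_slug}/{project_slug}/issues/"
    return f"{base_url.rstrip('/')}{path}?{urlencode(params)}", capped
```

The limit is returned separately rather than sent as a query parameter, so the caller can stop paginating once enough issues have been collected.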

## Output formatting rules

- Issue list: show title, short_id, status, first_seen, last_seen, count, environments, top_tags; order by most recent.
- Event detail: include culprit, timestamp, environment, release, url.
- If no results, state explicitly.
- Redact PII in output (emails, IPs). Do not print raw stack traces.
- Never echo auth tokens.
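
The redaction rule above could look roughly like the following. This is a minimal sketch with a hypothetical helper name: the patterns are simple approximations that cover common email addresses and IPv4 addresses only (no IPv6), and they may also redact other dotted four-part numbers.

```python
import re

# Approximate patterns; assumptions for illustration, not a complete PII filter.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def redact_pii(text):
    """Replace email addresses and IPv4 addresses with placeholders."""
    text = EMAIL_RE.sub("[redacted-email]", text)
    return IPV4_RE.sub("[redacted-ip]", text)
```

Applying it to every string field before printing keeps the redaction in one place.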

## Golden test inputs

- Org: `{your-org}`
- Project: `{your-project}`
- Issue short ID: `{ABC-123}`

Example prompt: “List the top 10 open issues for prod in the last 24h.”
Expected: ordered list with titles, short IDs, counts, last seen.
@@ -0,0 +1,178 @@
---
name: "sora"
description: "Use when the user asks to generate, edit, extend, poll, list, download, or delete Sora videos, create reusable non-human Sora character references, or run local multi-video queues via the bundled CLI (`scripts/sora.py`); includes requests like: (i) generate AI video, (ii) edit this Sora clip, (iii) extend this video, (iv) create a character reference, (v) download video/thumbnail/spritesheet, and (vi) Sora batch planning; requires `OPENAI_API_KEY` and Sora API access."
---


# Sora Video Generation Skill

Creates or manages Sora video jobs for the current project (product demos, marketing spots, cinematic shots, social clips, UI mocks). Defaults to `sora-2` with structured prompt augmentation and prefers the bundled CLI for deterministic runs. Note: `$sora` is a skill tag in prompts, not a shell command.

## When to use
- Generate a new video clip from a prompt
- Create a reusable character reference from a short non-human source clip
- Edit an existing generated video with a targeted prompt change
- Extend a completed video with a continuation prompt
- Poll status, list jobs, or download assets (video/thumbnail/spritesheet)
- Run a local multi-job queue now, or plan a true Batch API submission for offline rendering

## Decision tree
- If the user has a short non-human reference clip they want to reuse across shots → `create-character`
- If the user has a completed video and wants the next beat/continuation → `extend`
- If the user has a completed video and wants a targeted change while preserving the shot → `edit`
- If the user has a video id and wants status or assets → `status`, `poll`, or `download`
- If the user needs many renders immediately inside Codex → `create-batch` (local fan-out, not the Batch API)
- If the user needs many renders for offline processing or a studio pipeline → use the official Batch API flow described in `references/video-api.md`
- Otherwise → `create` (or `create-and-poll` if they need a ready asset in one step)

## Workflow
1. Decide intent: create vs create-character vs edit vs extend vs status/download vs local queue vs official Batch API.
2. Collect inputs: prompt, model, size, seconds, any image reference, and any character IDs.
3. Prefer CLI augmentation flags (`--use-case`, `--scene`, `--camera`, etc.) instead of hand-writing a long structured prompt. If you already have a structured prompt file, pass `--no-augment`.
4. Run the bundled CLI (`scripts/sora.py`) with sensible defaults. For long prompts, prefer `--prompt-file` to avoid shell-escaping issues.
5. For async jobs, poll until terminal status (or use `create-and-poll`).
6. Download assets (video/thumbnail/spritesheet) and save them locally before URLs expire.
7. If the user wants continuity across many shots, create character assets first, then reference them in later `create` calls.
8. If the user wants to iterate on a completed shot, prefer `edit`; if they want the shot to continue in time, prefer `extend`.
9. Use one targeted change per iteration.

## Authentication
- `OPENAI_API_KEY` must be set for live API calls.

If the key is missing, give the user these steps:
1. Create an API key in the OpenAI platform UI: https://platform.openai.com/api-keys
2. Set `OPENAI_API_KEY` as an environment variable in their system.
3. Offer to guide them through setting the environment variable for their OS/shell if needed.
- Never ask the user to paste the full key in chat. Ask them to set it locally and confirm when ready.

## Defaults & rules
- Default model: `sora-2` (use `sora-2-pro` for higher fidelity).
- Default size: `1280x720`.
- Default seconds: `4` (allowed: `"4"`, `"8"`, `"12"`, `"16"`, `"20"`).
- Always set size and seconds via API params; prose will not change them.
- `sora-2-pro` is required for `1920x1080` and `1080x1920`.
- Use up to two characters per generation.
- Use the OpenAI Python SDK (`openai` package). If high-level SDK helpers lag the latest Sora guide, use low-level `client.post/get/delete` inside the official SDK rather than standalone HTTP code.
- Require `OPENAI_API_KEY` before any live API call.
- If uv cache permissions fail, set `UV_CACHE_DIR=/tmp/uv-cache`.
- Input reference images must be jpg/png/webp and should match target size.
- JSON `input_reference` objects use either `file_id` or `image_url`; uploaded file paths use multipart.
- Download URLs expire after about 1 hour; copy assets to your own storage.
- Batch-generated videos remain downloadable for up to 24 hours after the batch completes.
- `create-batch` in `scripts/sora.py` is a local concurrent queue, not the official Batch API.
- Prefer the bundled CLI and **never modify** `scripts/sora.py` unless the user asks.
- Sora can generate audio; if a user requests voiceover/audio, specify it explicitly in the `Audio:` and `Dialogue:` lines and keep it short.
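
A small pre-flight check can catch parameter combinations that the rules above rule out before any API call is made. The sketch below is illustrative only: the function name is hypothetical and it is not part of `scripts/sora.py`. It checks only the constraints stated in this list (model names, allowed `seconds` values, and the two sizes that need `sora-2-pro`); the full per-model size list lives in `references/video-api.md` and is not reproduced here.

```python
ALLOWED_MODELS = {"sora-2", "sora-2-pro"}
ALLOWED_SECONDS = {"4", "8", "12", "16", "20"}
PRO_ONLY_SIZES = {"1920x1080", "1080x1920"}


def validate_video_params(model="sora-2", size="1280x720", seconds="4"):
    """Return a list of problems with the requested parameters (empty if none)."""
    problems = []
    if model not in ALLOWED_MODELS:
        problems.append(f"unsupported model: {model}")
    if str(seconds) not in ALLOWED_SECONDS:
        problems.append(f"unsupported seconds: {seconds}")
    if size in PRO_ONLY_SIZES and model != "sora-2-pro":
        problems.append(f"size {size} requires sora-2-pro")
    return problems
```

Returning a list of problems rather than raising on the first one lets the caller report every issue to the user in a single message.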

## API limitations
- Models are limited to `sora-2` and `sora-2-pro`.
- API access to Sora models requires an organization-verified account.
- Duration must be set via the `seconds` parameter and currently supports `4`, `8`, `12`, `16`, and `20`.
- Character uploads currently work best with short `2`-`4` second non-human MP4s in `16:9` or `9:16`, at `720p`-`1080p`.
- Extensions can add up to `20` seconds each, up to six times per source video, for a maximum total length of `120` seconds.
- Extensions currently do not support characters or image references.
- This skill supports editing existing generated videos by ID.
- The official Batch API currently supports `POST /v1/videos` only, with JSON bodies rather than multipart uploads.
- Output sizes are limited by model (see `references/video-api.md` for the supported sizes).
- Video creation is async; you must poll for completion before downloading.
- Rate limits apply by usage tier (do not list specific limits).
- Content restrictions are enforced by the API (see Guardrails below).
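
The extension limits above combine three caps, which is easy to get wrong when planning a long sequence. The following sketch computes an upper bound on what the next extension could add. It assumes the 120-second total includes the length of the source clip, which the list does not state explicitly, and the function name is hypothetical.

```python
MAX_EXTENSION_SECONDS = 20   # per extension
MAX_EXTENSIONS = 6           # per source video
MAX_TOTAL_SECONDS = 120      # assumed to include the source clip


def remaining_extension_seconds(current_length, extensions_used):
    """Upper bound on seconds the next extension could add (0 if none)."""
    if extensions_used >= MAX_EXTENSIONS:
        return 0
    return max(0, min(MAX_EXTENSION_SECONDS, MAX_TOTAL_SECONDS - current_length))
```

The result is only a ceiling; the value actually requested still has to be one the API accepts for `seconds`.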

## Guardrails (must enforce)
- Only content suitable for audiences under 18.
- No copyrighted characters or copyrighted music.
- No real people (including public figures).
- Input images with human faces are rejected.
- Character uploads in this skill are for non-human subjects only.

## Prompt augmentation
Reformat prompts into a structured, production-oriented spec. Only make implicit details explicit; do not invent new creative requirements.

Template (include only relevant lines):
```
Use case: <where the clip will be used>
Primary request: <user's main prompt>
Scene/background: <location, time of day, atmosphere>
Subject: <main subject>
Action: <single clear action>
Camera: <shot type, angle, motion>
Lighting/mood: <lighting + mood>
Color palette: <3-5 color anchors>
Style/format: <film/animation/format cues>
Timing/beats: <counts or beats>
Audio: <ambient cue / music / voiceover if requested>
Text (verbatim): "<exact text>"
Dialogue:
<dialogue>
- Speaker: "Short line."
</dialogue>
Constraints: <must keep/must avoid>
Avoid: <negative constraints>
```

Augmentation rules:
- Keep it short; add only details the user already implied or provided elsewhere.
- For edits, explicitly list invariants ("same shot, change only X").
- For character-based shots, mention the character name verbatim in the prompt.
- If any critical detail is missing and blocks success, ask a question; otherwise proceed.
- If you pass a structured prompt file to the CLI, add `--no-augment` to avoid the tool re-wrapping it.
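
The "include only relevant lines" rule for the template above can be sketched as a small renderer. This is an illustration with a hypothetical function name, not how the bundled CLI builds prompts; for brevity it omits the multi-line `Dialogue:` block.

```python
# Labels in the same order as the template (Dialogue block omitted).
FIELD_ORDER = [
    "Use case", "Primary request", "Scene/background", "Subject", "Action",
    "Camera", "Lighting/mood", "Color palette", "Style/format",
    "Timing/beats", "Audio", "Text (verbatim)", "Constraints", "Avoid",
]


def build_structured_prompt(fields):
    """Render only the fields that were provided, in template order."""
    lines = []
    for label in FIELD_ORDER:
        value = fields.get(label)
        if value:
            lines.append(f"{label}: {value}")
    return "\n".join(lines)
```

A prompt built this way is already structured, so it would be passed to the CLI together with `--no-augment`.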

## Examples

### Generation example (single shot)
```
Use case: product teaser
Primary request: a close-up of a matte black camera on a pedestal
Action: slow 30-degree orbit over 4 seconds
Camera: 85mm, shallow depth of field, gentle handheld drift
Lighting/mood: soft key light, subtle rim, premium studio feel
Constraints: no logos, no text
```

### Edit example (invariants)
```
Primary request: same shot and framing, switch palette to teal/sand/rust with warmer backlight
Constraints: keep the subject and camera move unchanged
```

### Character consistency example
```
Primary request: Mossy, a moss-covered teapot mascot, hurries through a lantern-lit market at dusk
Camera: cinematic tracking shot, 35mm, shoulder height
Lighting/mood: warm dusk practicals, soft haze
Constraints: keep Mossy’s silhouette and moss texture consistent across the shot
```

## Prompting best practices (short list)
- One main action + one camera move per shot.
- Use counts or beats for timing ("two steps, pause, turn").
- Keep text short and the camera locked-off for UI or on-screen text.
- Add a brief avoid line when artifacts appear (flicker, jitter, fast motion).
- Shorter prompts are more creative; longer prompts are more controlled.
- Put dialogue in a dedicated block; keep lines short for 4-8s clips.
- Mention character names verbatim when using uploaded character IDs.
- State invariants explicitly for edits (same shot, same camera move).
- Prefer `edit` for targeted changes and `extend` for timeline continuation.
- Iterate with single-change follow-ups to preserve continuity.

## Guidance by asset type
Use these modules when the request is for a specific artifact. They provide targeted templates and defaults.
- Cinematic shots: `references/cinematic-shots.md`
- Social ads: `references/social-ads.md`

## CLI + environment notes
- CLI commands + examples: `references/cli.md`
- API parameter quick reference: `references/video-api.md`
- Prompting guidance: `references/prompting.md`
- Sample prompts: `references/sample-prompts.md`
- Troubleshooting: `references/troubleshooting.md`
- Network/sandbox tips: `references/codex-network.md`

## Reference map
- **`references/cli.md`**: how to run create/edit/extend/create-character/poll/download/local-queue flows via `scripts/sora.py`.
- **`references/video-api.md`**: API-level knobs (models, sizes, duration, characters, edits, extensions, official Batch API).
- **`references/prompting.md`**: prompt structure, character continuity, editing, and extension guidance.
- **`references/sample-prompts.md`**: copy/paste prompt recipes (examples only; no extra theory).
- **`references/cinematic-shots.md`**: templates for filmic shots.
- **`references/social-ads.md`**: templates for short social ad beats.
- **`references/troubleshooting.md`**: common errors and fixes.
- **`references/codex-network.md`**: network/approval troubleshooting.
@@ -0,0 +1,144 @@
---
name: "speech"
description: "Use when the user asks for text-to-speech narration or voiceover, accessibility reads, audio prompts, or batch speech generation via the OpenAI Audio API; run the bundled CLI (`scripts/text_to_speech.py`) with built-in voices and require `OPENAI_API_KEY` for live calls. Custom voice creation is out of scope."
---


# Speech Generation Skill

Generate spoken audio for the current project (narration, product demo voiceover, IVR prompts, accessibility reads). Defaults to `gpt-4o-mini-tts-2025-12-15` and built-in voices, and prefers the bundled CLI for deterministic, reproducible runs.

## When to use
- Generate a single spoken clip from text
- Generate a batch of prompts (many lines, many files)

## Decision tree (single vs batch)
- If the user provides multiple lines/prompts or wants many outputs -> **batch**
- Else -> **single**

## Workflow
1. Decide intent: single vs batch (see decision tree above).
2. Collect inputs up front: exact text (verbatim), desired voice, delivery style, format, and any constraints.
3. If batch: write a temporary JSONL under `tmp/speech/` (one job per line), run once, then delete the JSONL.
4. Augment instructions into a short labeled spec without rewriting the input text.
5. Run the bundled CLI (`scripts/text_to_speech.py`) with sensible defaults (see references/cli.md).
6. For important clips, validate: intelligibility, pacing, pronunciation, and adherence to constraints.
7. Iterate with a single targeted change (voice, speed, or instructions), then re-check.
8. Save/return final outputs and note the final text + instructions + flags used.

## Temp and output conventions
- Use `tmp/speech/` for intermediate files (for example JSONL batches); delete when done.
- Write final artifacts under `output/speech/` when working in this repo.
- Use `--out` or `--out-dir` to control output paths; keep filenames stable and descriptive.

## Dependencies (install if missing)
Prefer `uv` for dependency management.

Python packages:
```
uv pip install openai
```
If `uv` is unavailable:
```
python3 -m pip install openai
```

## Environment
- `OPENAI_API_KEY` must be set for live API calls.

If the key is missing, give the user these steps:
1. Create an API key in the OpenAI platform UI: https://platform.openai.com/api-keys
2. Set `OPENAI_API_KEY` as an environment variable in their system.
3. Offer to guide them through setting the environment variable for their OS/shell if needed.
- Never ask the user to paste the full key in chat. Ask them to set it locally and confirm when ready.

If installation isn't possible in this environment, tell the user which dependency is missing and how to install it locally.

## Defaults & rules
- Use `gpt-4o-mini-tts-2025-12-15` unless the user requests another model.
- Default voice: `cedar`. If the user wants a brighter tone, prefer `marin`.
- Built-in voices only. Custom voices are out of scope for this skill.
- `instructions` are supported for GPT-4o mini TTS models, but not for `tts-1` or `tts-1-hd`.
- Input length must be <= 4096 characters per request. Split longer text into chunks.
- Enforce 50 requests/minute. The CLI caps `--rpm` at 50.
- Require `OPENAI_API_KEY` before any live API call.
- Provide a clear disclosure to end users that the voice is AI-generated.
- Use the OpenAI Python SDK (`openai` package) for all API calls; do not use raw HTTP.
- Prefer the bundled CLI (`scripts/text_to_speech.py`) over writing new one-off scripts.
- Never modify `scripts/text_to_speech.py`. If something is missing, ask the user before doing anything else.
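
The 4096-character input limit above means long scripts have to be split before they are submitted. One way to do that is sketched below; the function name is hypothetical and this is not the bundled CLI's behavior. It breaks on spaces where it can and falls back to a hard cut when a single run of text has no spaces.

```python
MAX_INPUT_CHARS = 4096


def split_text(text, max_chars=MAX_INPUT_CHARS):
    """Split text into chunks of at most max_chars, breaking on spaces when possible."""
    chunks = []
    remaining = text.strip()
    while len(remaining) > max_chars:
        # Last space at or before the limit, so the chunk never exceeds max_chars.
        cut = remaining.rfind(" ", 0, max_chars + 1)
        if cut <= 0:
            cut = max_chars  # no space found: hard split
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

Each chunk then becomes its own request, so a long text also counts several times against the 50 requests/minute cap.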

## Instruction augmentation
Reformat user direction into a short, labeled spec. Only make implicit details explicit; do not invent new requirements.

Quick clarification (augmentation vs invention):
- If the user says "narration for a demo", you may add implied delivery constraints (clear, steady pacing, friendly tone).
- Do not introduce a new persona, accent, or emotional style the user did not request.

Template (include only relevant lines):
```
Voice Affect: <overall character and texture of the voice>
Tone: <attitude, formality, warmth>
Pacing: <slow, steady, brisk>
Emotion: <key emotions to convey>
Pronunciation: <words to enunciate or emphasize>
Pauses: <where to add intentional pauses>
Emphasis: <key words or phrases to stress>
Delivery: <cadence or rhythm notes>
```

Augmentation rules:
- Keep it short; add only details the user already implied or provided elsewhere.
- Do not rewrite the input text.
- If any critical detail is missing and blocks success, ask a question; otherwise proceed.

## Examples

### Single example (narration)
```
Input text: "Welcome to the demo. Today we'll show how it works."
Instructions:
Voice Affect: Warm and composed.
Tone: Friendly and confident.
Pacing: Steady and moderate.
Emphasis: Stress "demo" and "show".
```

### Batch example (IVR prompts)
```
{"input":"Thank you for calling. Please hold.","voice":"cedar","response_format":"mp3","out":"hold.mp3"}
{"input":"For sales, press 1. For support, press 2.","voice":"marin","instructions":"Tone: Clear and neutral. Pacing: Slow.","response_format":"wav"}
```
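
A batch file like the one above is plain JSONL, one job object per line, so it can be generated rather than hand-written. The sketch below shows one way to do that; the helper name and the `batch.jsonl` file name are hypothetical, and the job keys are simply those that appear in the example.

```python
import json
from pathlib import Path


def write_batch_jsonl(jobs, path="tmp/speech/batch.jsonl"):
    """Write one JSON object per line and return the file path."""
    out_path = Path(path)
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", encoding="utf-8") as handle:
        for job in jobs:
            # ensure_ascii=False keeps non-ASCII input text readable in the file.
            handle.write(json.dumps(job, ensure_ascii=False) + "\n")
    return out_path
```

Writing under `tmp/speech/` matches the temp-file convention, and the returned path makes it easy to delete the file once the batch run has finished.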

## Instructioning best practices (short list)
- Structure directions as: affect -> tone -> pacing -> emotion -> pronunciation/pauses -> emphasis.
- Keep 4 to 8 short lines; avoid conflicting guidance.
- For names/acronyms, add pronunciation hints (e.g., "enunciate A-I") or supply a phonetic spelling in the text.
- For edits/iterations, repeat invariants (e.g., "keep pacing steady") to reduce drift.
- Iterate with single-change follow-ups.

More principles: `references/prompting.md`. Copy/paste specs: `references/sample-prompts.md`.

## Guidance by use case
Use these modules when the request is for a specific delivery style. They provide targeted defaults and templates.
- Narration / explainer: `references/narration.md`
- Product demo / voiceover: `references/voiceover.md`
- IVR / phone prompts: `references/ivr.md`
- Accessibility reads: `references/accessibility.md`

## CLI + environment notes
- CLI commands + examples: `references/cli.md`
- API parameter quick reference: `references/audio-api.md`
- Instruction patterns + examples: `references/voice-directions.md`
- If network approvals / sandbox settings are getting in the way: `references/codex-network.md`

## Reference map
- **`references/cli.md`**: how to run speech generation/batches via `scripts/text_to_speech.py` (commands, flags, recipes).
- **`references/audio-api.md`**: API parameters, limits, voice list.
- **`references/voice-directions.md`**: instruction patterns and examples.
- **`references/prompting.md`**: instruction best practices (structure, constraints, iteration patterns).
- **`references/sample-prompts.md`**: copy/paste instruction recipes (examples only; no extra theory).
- **`references/narration.md`**: templates + defaults for narration and explainers.
- **`references/voiceover.md`**: templates + defaults for product demo voiceovers.
- **`references/ivr.md`**: templates + defaults for IVR/phone prompts.
- **`references/accessibility.md`**: templates + defaults for accessibility reads.
- **`references/codex-network.md`**: environment/sandbox/network-approval troubleshooting.
@@ -0,0 +1,145 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: "spreadsheet"
|
|
3
|
+
description: "Use when tasks involve creating, editing, analyzing, or formatting spreadsheets (`.xlsx`, `.csv`, `.tsv`) with formula-aware workflows, cached recalculation, and visual review."
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
# Spreadsheet Skill
|
|
7
|
+
|
|
8
|
+
## When to use
|
|
9
|
+
- Create new workbooks with formulas, formatting, and structured layouts.
|
|
10
|
+
- Read or analyze tabular data (filter, aggregate, pivot, compute metrics).
|
|
11
|
+
- Modify existing workbooks without breaking formulas, references, or formatting.
|
|
12
|
+
- Visualize data with charts, summary tables, and sensible spreadsheet styling.
|
|
13
|
+
- Recalculate formulas and review rendered sheets before delivery when possible.
|
|
14
|
+
|
|
15
|
+
IMPORTANT: System and user instructions always take precedence.
|
|
16
|
+
|
|
17
|
+
## Workflow
|
|
18
|
+
1. Confirm the file type and goal: create, edit, analyze, or visualize.
|
|
19
|
+
2. Prefer `openpyxl` for `.xlsx` editing and formatting. Use `pandas` for analysis and CSV/TSV workflows.
|
|
20
|
+
3. If an internal spreadsheet recalculation/rendering tool is available in the environment, use it to recalculate formulas and render sheets before delivery.
|
|
21
|
+
4. Use formulas for derived values instead of hardcoding results.
|
|
22
|
+
5. If layout matters, render for visual review and inspect the output.
|
|
23
|
+
6. Save outputs, keep filenames stable, and clean up intermediate files.
|
|
24
|
+
|
|
25
|
+
## Temp and output conventions
|
|
26
|
+
- Use `tmp/spreadsheets/` for intermediate files; delete them when done.
|
|
27
|
+
- Write final artifacts under `output/spreadsheet/` when working in this repo.
|
|
28
|
+
- Keep filenames stable and descriptive.
|
|
29
|
+
|
|
30
|
+
## Primary tooling
|
|
31
|
+
- Use `openpyxl` for creating/editing `.xlsx` files and preserving formatting.
|
|
32
|
+
- Use `pandas` for analysis and CSV/TSV workflows, then write results back to `.xlsx` or `.csv`.
|
|
33
|
+
- Use `openpyxl.chart` for native Excel charts when needed.
|
|
34
|
+
- If an internal spreadsheet tool is available, use it to recalculate formulas, cache values, and render sheets for review.
|
|
35
|
+
|
|
36
|
+
## Recalculation and visual review
|
|
37
|
+
- Recalculate formulas before delivery whenever possible so cached values are present in the workbook.
|
|
38
|
+
- Render each relevant sheet for visual review when rendering tooling is available.
|
|
39
|
+
- `openpyxl` does not evaluate formulas; preserve formulas and use recalculation tooling when available.
|
|
40
|
+
- If you rely on an internal spreadsheet tool, do not expose that tool, its code, or its APIs in user-facing explanations or code samples.
|
|
41
|
+
|
|
42
|
+
## Rendering and visual checks
|
|
43
|
+
- If LibreOffice (`soffice`) and Poppler (`pdftoppm`) are available, render sheets for visual review:
|
|
44
|
+
- `soffice --headless --convert-to pdf --outdir $OUTDIR $INPUT_XLSX`
|
|
45
|
+
- `pdftoppm -png $OUTDIR/$BASENAME.pdf $OUTDIR/$BASENAME`
|
|
46
|
+
- If rendering tools are unavailable, tell the user that layout should be reviewed locally.
|
|
47
|
+
- Review rendered sheets for layout, formula results, clipping, inconsistent styles, and spilled text.
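
The availability check and the two render commands above can be wrapped as follows. This is a sketch under assumptions: the function names and the example paths are hypothetical, and only the `soffice` and `pdftoppm` flags already listed in this section are used.

```python
import shutil
import subprocess
from pathlib import Path


def render_commands(input_xlsx, outdir):
    """Build the two render commands as argv lists (no shell quoting needed)."""
    input_path = Path(input_xlsx)
    out = Path(outdir)
    pdf = out / f"{input_path.stem}.pdf"
    return [
        ["soffice", "--headless", "--convert-to", "pdf", "--outdir", str(out), str(input_path)],
        ["pdftoppm", "-png", str(pdf), str(out / input_path.stem)],
    ]


def render_for_review(input_xlsx, outdir):
    """Run both commands if the tools are on PATH; return False when they are not."""
    if not (shutil.which("soffice") and shutil.which("pdftoppm")):
        return False  # caller should tell the user to review layout locally
    Path(outdir).mkdir(parents=True, exist_ok=True)
    for cmd in render_commands(input_xlsx, outdir):
        subprocess.run(cmd, check=True)
    return True


# Hypothetical paths, chosen to match the temp/output conventions in this skill.
commands = render_commands("output/spreadsheet/model.xlsx", "tmp/spreadsheets/render")
print(commands[0][0], commands[1][0])
```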
|
|
48
|
+
|
|
49
|
+
## Dependencies (install if missing)
|
|
50
|
+
Prefer `uv` for dependency management.
|
|
51
|
+
|
|
52
|
+
Python packages:
|
|
53
|
+
```
|
|
54
|
+
uv pip install openpyxl pandas
|
|
55
|
+
```
|
|
56
|
+
If `uv` is unavailable:
|
|
57
|
+
```
|
|
58
|
+
python3 -m pip install openpyxl pandas
|
|
59
|
+
```
|
|
60
|
+
Optional:
|
|
61
|
+
```
|
|
62
|
+
uv pip install matplotlib
|
|
63
|
+
```
|
|
64
|
+
If `uv` is unavailable:
|
|
65
|
+
```
|
|
66
|
+
python3 -m pip install matplotlib
|
|
67
|
+
```
|
|
68
|
+
System tools (for rendering):
|
|
69
|
+
```
|
|
70
|
+
# macOS (Homebrew)
|
|
71
|
+
brew install libreoffice poppler
|
|
72
|
+
|
|
73
|
+
# Ubuntu/Debian
|
|
74
|
+
sudo apt-get install -y libreoffice poppler-utils
|
|
75
|
+
```
|
|
76
|
+
|
|
77
|
+
If installation is not possible in this environment, tell the user which dependency is missing and how to install it locally.
|
|
78
|
+
|
|
79
|
+
## Environment
|
|
80
|
+
No required environment variables.
|
|
81
|
+
|
|
82
|
+
## Examples
|
|
83
|
+
- Runnable Codex examples (openpyxl): `references/examples/openpyxl/`
|
|
84
|
+
|
|
85
|
+
## Formula requirements
|
|
86
|
+
- Use formulas for derived values rather than hardcoding results.
|
|
87
|
+
- Do not use dynamic array functions like `FILTER`, `XLOOKUP`, `SORT`, or `SEQUENCE`.
|
|
88
|
+
- Keep formulas simple and legible; use helper cells for complex logic.
|
|
89
|
+
- Avoid volatile functions like `INDIRECT` and `OFFSET` unless required.
|
|
90
|
+
- Prefer cell references over magic numbers (for example, `=H6*(1+$B$3)` instead of `=H6*1.04`).
|
|
91
|
+
- Use absolute (`$B$4`) or relative (`B4`) references carefully so copied formulas behave correctly.
|
|
92
|
+
- If you need literal text that starts with `=`, prefix it with a single quote.
|
|
93
|
+
- Guard against `#REF!`, `#DIV/0!`, `#VALUE!`, `#N/A`, and `#NAME?` errors.
|
|
94
|
+
- Check for off-by-one mistakes, circular references, and incorrect ranges.
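
Some of these rules can be checked mechanically before delivery. The sketch below scans a formula string for the function names called out above and for error literals embedded in the formula text. It is a rough, regex-based illustration (`lint_formula` is a hypothetical helper, not part of `openpyxl`) and does not replace recalculating and reviewing the workbook.

```python
import re

DYNAMIC_ARRAY = {"FILTER", "XLOOKUP", "SORT", "SEQUENCE"}
VOLATILE = {"INDIRECT", "OFFSET"}
ERROR_LITERALS = ("#REF!", "#DIV/0!", "#VALUE!", "#N/A", "#NAME?")

# A function call is approximated as an identifier followed by "(".
_FUNC_CALL = re.compile(r"([A-Za-z][A-Za-z0-9.]*)\s*\(")


def lint_formula(formula):
    """Return a list of warnings for one formula string such as '=SUM(A1:A3)'."""
    if not formula.startswith("="):
        return []  # not a formula; nothing to check
    # Blank out string literals so function names inside quotes are not flagged.
    body = re.sub(r'"[^"]*"', '""', formula)
    # Keep only the last dotted segment so prefixed names (e.g. "_xlfn.") still match.
    names = {m.group(1).upper().split(".")[-1] for m in _FUNC_CALL.finditer(body)}
    warnings = []
    for name in sorted(names & DYNAMIC_ARRAY):
        warnings.append(f"dynamic array function: {name}")
    for name in sorted(names & VOLATILE):
        warnings.append(f"volatile function: {name}")
    for err in ERROR_LITERALS:
        if err in body.upper():
            warnings.append(f"error literal: {err}")
    return warnings


print(lint_formula("=H6*(1+$B$3)"))
print(lint_formula("=SORT(FILTER(A1:A9,B1:B9>0))"))
```

In practice the helper could be applied to every string cell value read from a workbook, with any warnings reviewed by hand.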
|
|
95
|
+
|
|
96
|
+
## Citation requirements
|
|
97
|
+
- Cite sources inside the spreadsheet using plain-text URLs.
|
|
98
|
+
- For financial models, cite model inputs in cell comments.
|
|
99
|
+
- For tabular data sourced externally, add a source column when each row represents a separate item.
|
|
100
|
+
|
|
101
|
+
## Formatting requirements (existing formatted spreadsheets)
|
|
102
|
+
- Render and inspect a provided spreadsheet before modifying it when possible.
|
|
103
|
+
- Preserve existing formatting and style exactly.
|
|
104
|
+
- Match styles for any newly filled cells that were previously blank.
|
|
105
|
+
- Never overwrite established formatting unless the user explicitly asks for a redesign.
|
|
106
|
+
|
|
107
|
+
## Formatting requirements (new or unstyled spreadsheets)
|
|
108
|
+
- Use appropriate number and date formats.
|
|
109
|
+
- Dates should render as dates, not plain numbers.
|
|
110
|
+
- Percentages should usually default to one decimal place unless the data calls for something else.
|
|
111
|
+
- Currencies should use the appropriate currency format.
|
|
112
|
+
- Headers should be visually distinct from raw inputs and derived cells.
|
|
113
|
+
- Use fill colors, borders, spacing, and merged cells sparingly and intentionally.
|
|
114
|
+
- Set row heights and column widths so content is readable without excessive whitespace.
|
|
115
|
+
- Do not apply borders around every filled cell.
|
|
116
|
+
- Group related calculations and make totals simple sums of the cells above them.
|
|
117
|
+
- Add whitespace to separate sections.
|
|
118
|
+
- Ensure text does not spill into adjacent cells.
|
|
119
|
+
- Avoid unsupported spreadsheet data-table features such as `=TABLE`.
|
|
120
|
+
|
|
121
|
+
## Color conventions (if no style guidance)
|
|
122
|
+
- Blue: user input
|
|
123
|
+
- Black: formulas and derived values
|
|
124
|
+
- Green: linked or imported values
|
|
125
|
+
- Gray: static constants
|
|
126
|
+
- Orange: review or caution
|
|
127
|
+
- Light red: error or flag
|
|
128
|
+
- Purple: control or logic
|
|
129
|
+
- Teal: visualization anchors and KPI highlights
|
|
130
|
+
|
|
131
|
+
## Finance-specific requirements
|
|
132
|
+
- Format zeros as `-`.
|
|
133
|
+
- Negative numbers should be red and in parentheses.
|
|
134
|
+
- Format multiples as `5.2x`.
|
|
135
|
+
- Always specify units in headers (for example, `Revenue ($mm)`).
|
|
136
|
+
- Cite sources for all raw inputs in cell comments.
|
|
137
|
+
- For new financial models with no user-specified style, use blue text for hardcoded inputs, black for formulas, green for internal workbook links, red for external links, and yellow fill for key assumptions that need attention.
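
The zero, negative, and multiple conventions above map naturally onto spreadsheet custom number formats. The sketch below lists two candidate format codes and a plain-text preview of what they display. The constant and function names are hypothetical, the exact codes are one reasonable choice rather than a required standard, and the preview ignores color.

```python
# Custom number formats use ';'-separated sections: positive;negative;zero.
FINANCE_NUMBER_FORMAT = '#,##0;[Red](#,##0);"-"'  # zeros as "-", negatives red in parentheses
MULTIPLE_FORMAT = '0.0"x"'                        # multiples such as 5.2x


def preview_finance(value):
    """Plain-text preview of FINANCE_NUMBER_FORMAT for a whole-number value."""
    if value == 0:
        return "-"
    text = f"{abs(value):,.0f}"
    return f"({text})" if value < 0 else text


def preview_multiple(value):
    """Plain-text preview of MULTIPLE_FORMAT."""
    return f"{value:.1f}x"


for sample in (1250000, -4200, 0):
    print(preview_finance(sample))
print(preview_multiple(5.2))
```

With `openpyxl`, a format code like these is applied by assigning it to a cell's `number_format` attribute; the formula stays in the cell and only the display changes.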
|
|
138
|
+
|
|
139
|
+
## Investment banking layouts
|
|
140
|
+
If the spreadsheet is an IB-style model (LBO, DCF, 3-statement, valuation):
|
|
141
|
+
- Totals should sum the range directly above.
|
|
142
|
+
- Hide gridlines and use horizontal borders above totals across relevant columns.
|
|
143
|
+
- Section headers should be merged cells with dark fill and white text.
|
|
144
|
+
- Column labels for numeric data should be right-aligned; row labels should be left-aligned.
|
|
145
|
+
- Indent submetrics under their parent line items.
|
|
@@ -0,0 +1,81 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: "transcribe"
|
|
3
|
+
description: "Transcribe audio files to text with optional diarization and known-speaker hints. Use when a user asks to transcribe speech from audio/video, extract text from recordings, or label speakers in interviews or meetings."
|
|
4
|
+
---
|
|
5
|
+
|
|
6
|
+
|
|
7
|
+
# Audio Transcribe
|
|
8
|
+
|
|
9
|
+
Transcribe audio using the OpenAI API, with optional speaker diarization when requested. Prefer the bundled CLI for deterministic, repeatable runs.
|
|
10
|
+
|
|
11
|
+
## Workflow
|
|
12
|
+
1. Collect inputs: audio file path(s), desired response format (text/json/diarized_json), optional language hint, and any known speaker references.
|
|
13
|
+
2. Verify `OPENAI_API_KEY` is set. If missing, ask the user to set it locally (do not ask them to paste the key).
|
|
14
|
+
3. Run the bundled `transcribe_diarize.py` CLI with sensible defaults (fast text transcription).
|
|
15
|
+
4. Validate the output: transcription quality, speaker labels, and segment boundaries; iterate with a single targeted change if needed.
|
|
16
|
+
5. Save outputs under `output/transcribe/` when working in this repo.
|
|
17
|
+
|
|
18
|
+
## Decision rules
|
|
19
|
+
- Default to `gpt-4o-mini-transcribe` with `--response-format text` for fast transcription.
|
|
20
|
+
- If the user wants speaker labels or diarization, use `--model gpt-4o-transcribe-diarize --response-format diarized_json`.
|
|
21
|
+
- If audio is longer than ~30 seconds, keep `--chunking-strategy auto`.
|
|
22
|
+
- Prompting is not supported for `gpt-4o-transcribe-diarize`.
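
The decision rules above can be expressed as a small argument builder. This is a sketch, not the bundled CLI's code: `build_transcribe_args` is a hypothetical helper, it uses only the model names and flags that appear in this skill, and rejecting known speakers without diarization is an assumption.

```python
def build_transcribe_args(audio_path, diarize=False, known_speakers=None):
    """Return CLI arguments (after the script path) for one audio file."""
    speakers = known_speakers or {}
    if speakers and not diarize:
        raise ValueError("known speakers only apply when diarization is requested")
    if len(speakers) > 4:
        raise ValueError("at most 4 known speakers are supported")

    args = [audio_path]
    if diarize:
        args += ["--model", "gpt-4o-transcribe-diarize", "--response-format", "diarized_json"]
        for name, ref_path in speakers.items():
            args += ["--known-speaker", f"{name}={ref_path}"]
    else:
        # Fast default: plain text from the smaller model.
        args += ["--model", "gpt-4o-mini-transcribe", "--response-format", "text"]
    # --chunking-strategy is deliberately not overridden, so the CLI keeps its own setting.
    return args


print(build_transcribe_args("interview.mp3"))
print(build_transcribe_args(
    "meeting.m4a",
    diarize=True,
    known_speakers={"Alice": "refs/alice.wav", "Bob": "refs/bob.wav"},
))
```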
|
|
23
|
+
|
|
24
|
+
## Output conventions
|
|
25
|
+
- Use `output/transcribe/<job-id>/` for evaluation runs.
|
|
26
|
+
- Use `--out-dir` for multiple files to avoid overwriting.
|
|
27
|
+
|
|
28
|
+
## Dependencies (install if missing)
|
|
29
|
+
Prefer `uv` for dependency management.
|
|
30
|
+
|
|
31
|
+
```
|
|
32
|
+
uv pip install openai
|
|
33
|
+
```
|
|
34
|
+
If `uv` is unavailable:
|
|
35
|
+
```
|
|
36
|
+
python3 -m pip install openai
|
|
37
|
+
```
|
|
38
|
+
|
|
39
|
+
## Environment
|
|
40
|
+
- `OPENAI_API_KEY` must be set for live API calls.
|
|
41
|
+
- If the key is missing, instruct the user to create one in the OpenAI platform UI and export it in their shell.
|
|
42
|
+
- Never ask the user to paste the full key in chat.
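
A pre-flight check along these lines might look like the sketch below. The function name and message wording are hypothetical; the point is that the check reads only the environment and never prompts for or prints the key itself.

```python
import os


def api_key_message(environ=None):
    """Return None when OPENAI_API_KEY is set, else guidance text for the user."""
    env = os.environ if environ is None else environ
    if env.get("OPENAI_API_KEY", "").strip():
        return None
    return (
        "OPENAI_API_KEY is not set. Create a key in the OpenAI platform UI and "
        "export it in your shell; do not paste the key into chat."
    )


# Passing a dict makes the check easy to exercise without touching the real environment.
print(api_key_message({}))
```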
|
|
43
|
+
|
|
44
|
+
## Skill path (set once)
|
|
45
|
+
|
|
46
|
+
```bash
|
|
47
|
+
export CODEX_HOME="${CODEX_HOME:-$HOME/.codex}"
|
|
48
|
+
export TRANSCRIBE_CLI="$CODEX_HOME/skills/transcribe/scripts/transcribe_diarize.py"
|
|
49
|
+
```
|
|
50
|
+
|
|
51
|
+
User-scoped skills install under `$CODEX_HOME/skills` (default: `~/.codex/skills`).
|
|
52
|
+
|
|
53
|
+
## CLI quick start
|
|
54
|
+
Single file (fast text default):
|
|
55
|
+
```bash
|
|
56
|
+
python3 "$TRANSCRIBE_CLI" \
|
|
57
|
+
path/to/audio.wav \
|
|
58
|
+
--out transcript.txt
|
|
59
|
+
```
|
|
60
|
+
|
|
61
|
+
Diarization with known speakers (up to 4):
|
|
62
|
+
```bash
|
|
63
|
+
python3 "$TRANSCRIBE_CLI" \
|
|
64
|
+
meeting.m4a \
|
|
65
|
+
--model gpt-4o-transcribe-diarize \
|
|
66
|
+
--known-speaker "Alice=refs/alice.wav" \
|
|
67
|
+
--known-speaker "Bob=refs/bob.wav" \
|
|
68
|
+
--response-format diarized_json \
|
|
69
|
+
--out-dir output/transcribe/meeting
|
|
70
|
+
```
|
|
71
|
+
|
|
72
|
+
Plain text output (explicit):
|
|
73
|
+
```bash
|
|
74
|
+
python3 "$TRANSCRIBE_CLI" \
|
|
75
|
+
interview.mp3 \
|
|
76
|
+
--response-format text \
|
|
77
|
+
--out interview.txt
|
|
78
|
+
```
|
|
79
|
+
|
|
80
|
+
## Reference map
|
|
81
|
+
- `references/api.md`: supported formats, limits, response formats, and known-speaker notes.
|