escribano 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (124)
  1. package/LICENSE +21 -0
  2. package/README.md +297 -0
  3. package/dist/0_types.js +279 -0
  4. package/dist/actions/classify-session.js +77 -0
  5. package/dist/actions/create-contexts.js +44 -0
  6. package/dist/actions/create-topic-blocks.js +68 -0
  7. package/dist/actions/extract-metadata.js +24 -0
  8. package/dist/actions/generate-artifact-v3.js +296 -0
  9. package/dist/actions/generate-artifact.js +61 -0
  10. package/dist/actions/generate-summary-v3.js +260 -0
  11. package/dist/actions/outline-index.js +204 -0
  12. package/dist/actions/process-recording-v2.js +494 -0
  13. package/dist/actions/process-recording-v3.js +412 -0
  14. package/dist/actions/process-session.js +183 -0
  15. package/dist/actions/publish-summary-v3.js +303 -0
  16. package/dist/actions/sync-to-outline.js +196 -0
  17. package/dist/adapters/audio.silero.adapter.js +69 -0
  18. package/dist/adapters/cap.adapter.js +94 -0
  19. package/dist/adapters/capture.cap.adapter.js +107 -0
  20. package/dist/adapters/capture.filesystem.adapter.js +124 -0
  21. package/dist/adapters/embedding.ollama.adapter.js +141 -0
  22. package/dist/adapters/intelligence.adapter.js +202 -0
  23. package/dist/adapters/intelligence.mlx.adapter.js +395 -0
  24. package/dist/adapters/intelligence.ollama.adapter.js +741 -0
  25. package/dist/adapters/publishing.outline.adapter.js +75 -0
  26. package/dist/adapters/storage.adapter.js +81 -0
  27. package/dist/adapters/storage.fs.adapter.js +83 -0
  28. package/dist/adapters/transcription.whisper.adapter.js +206 -0
  29. package/dist/adapters/video.ffmpeg.adapter.js +405 -0
  30. package/dist/adapters/whisper.adapter.js +168 -0
  31. package/dist/batch-context.js +329 -0
  32. package/dist/db/helpers.js +50 -0
  33. package/dist/db/index.js +95 -0
  34. package/dist/db/migrate.js +80 -0
  35. package/dist/db/repositories/artifact.sqlite.js +77 -0
  36. package/dist/db/repositories/cluster.sqlite.js +92 -0
  37. package/dist/db/repositories/context.sqlite.js +75 -0
  38. package/dist/db/repositories/index.js +10 -0
  39. package/dist/db/repositories/observation.sqlite.js +70 -0
  40. package/dist/db/repositories/recording.sqlite.js +56 -0
  41. package/dist/db/repositories/subject.sqlite.js +64 -0
  42. package/dist/db/repositories/topic-block.sqlite.js +45 -0
  43. package/dist/db/types.js +4 -0
  44. package/dist/domain/classification.js +60 -0
  45. package/dist/domain/context.js +97 -0
  46. package/dist/domain/index.js +2 -0
  47. package/dist/domain/observation.js +17 -0
  48. package/dist/domain/recording.js +41 -0
  49. package/dist/domain/segment.js +93 -0
  50. package/dist/domain/session.js +93 -0
  51. package/dist/domain/time-range.js +38 -0
  52. package/dist/domain/transcript.js +79 -0
  53. package/dist/index.js +173 -0
  54. package/dist/pipeline/context.js +162 -0
  55. package/dist/pipeline/events.js +2 -0
  56. package/dist/prerequisites.js +226 -0
  57. package/dist/scripts/rebuild-index.js +53 -0
  58. package/dist/scripts/seed-fixtures.js +290 -0
  59. package/dist/services/activity-segmentation.js +333 -0
  60. package/dist/services/activity-segmentation.test.js +191 -0
  61. package/dist/services/app-normalization.js +212 -0
  62. package/dist/services/cluster-merge.js +69 -0
  63. package/dist/services/clustering.js +237 -0
  64. package/dist/services/debug.js +58 -0
  65. package/dist/services/frame-sampling.js +318 -0
  66. package/dist/services/signal-extraction.js +106 -0
  67. package/dist/services/subject-grouping.js +342 -0
  68. package/dist/services/temporal-alignment.js +99 -0
  69. package/dist/services/vlm-enrichment.js +84 -0
  70. package/dist/services/vlm-service.js +130 -0
  71. package/dist/stats/index.js +3 -0
  72. package/dist/stats/observer.js +65 -0
  73. package/dist/stats/repository.js +36 -0
  74. package/dist/stats/resource-tracker.js +86 -0
  75. package/dist/stats/types.js +1 -0
  76. package/dist/test-classification-prompts.js +181 -0
  77. package/dist/tests/cap.adapter.test.js +75 -0
  78. package/dist/tests/capture.cap.adapter.test.js +69 -0
  79. package/dist/tests/classify-session.test.js +140 -0
  80. package/dist/tests/db/repositories.test.js +243 -0
  81. package/dist/tests/domain/time-range.test.js +31 -0
  82. package/dist/tests/integration.test.js +84 -0
  83. package/dist/tests/intelligence.adapter.test.js +102 -0
  84. package/dist/tests/intelligence.ollama.adapter.test.js +178 -0
  85. package/dist/tests/process-v2.test.js +90 -0
  86. package/dist/tests/services/clustering.test.js +112 -0
  87. package/dist/tests/services/frame-sampling.test.js +152 -0
  88. package/dist/tests/utils/ocr.test.js +76 -0
  89. package/dist/tests/utils/parallel.test.js +57 -0
  90. package/dist/tests/visual-observer.test.js +175 -0
  91. package/dist/utils/id-normalization.js +15 -0
  92. package/dist/utils/index.js +9 -0
  93. package/dist/utils/model-detector.js +154 -0
  94. package/dist/utils/ocr.js +80 -0
  95. package/dist/utils/parallel.js +32 -0
  96. package/migrations/001_initial.sql +109 -0
  97. package/migrations/002_clusters.sql +41 -0
  98. package/migrations/003_observations_vlm_fields.sql +14 -0
  99. package/migrations/004_observations_unique.sql +18 -0
  100. package/migrations/005_processing_stats.sql +29 -0
  101. package/migrations/006_vlm_raw_response.sql +6 -0
  102. package/migrations/007_subjects.sql +23 -0
  103. package/migrations/008_artifacts_recording.sql +6 -0
  104. package/migrations/009_artifact_subjects.sql +10 -0
  105. package/package.json +82 -0
  106. package/prompts/action-items.md +55 -0
  107. package/prompts/blog-draft.md +54 -0
  108. package/prompts/blog-research.md +87 -0
  109. package/prompts/card.md +54 -0
  110. package/prompts/classify-segment.md +38 -0
  111. package/prompts/classify.md +37 -0
  112. package/prompts/code-snippets.md +163 -0
  113. package/prompts/extract-metadata.md +149 -0
  114. package/prompts/notes.md +83 -0
  115. package/prompts/runbook.md +123 -0
  116. package/prompts/standup.md +50 -0
  117. package/prompts/step-by-step.md +125 -0
  118. package/prompts/subject-grouping.md +31 -0
  119. package/prompts/summary-v3.md +89 -0
  120. package/prompts/summary.md +77 -0
  121. package/prompts/topic-classifier.md +24 -0
  122. package/prompts/topic-extract.md +13 -0
  123. package/prompts/vlm-batch.md +21 -0
  124. package/prompts/vlm-single.md +19 -0
@@ -0,0 +1,55 @@
+ # Action Items
+ You are a project manager extracting action items from a work session. Your goal is to create clear, specific, and actionable tasks that can be executed without ambiguity.
+
+ ## Context
+ Metadata: {{METADATA}}
+ Visual Log: {{VISUAL_LOG}}
+ Detected Language: {{LANGUAGE}}
+
+ ## Instructions
+ 1. **Language Rule**: Use English for the document structure and headings. The task descriptions, names of responsible parties, and specific technical details must remain in the original language ({{LANGUAGE}}).
+
+ 2. **Extraction Scope**: Identify all tasks, assignments, decisions, and follow-ups mentioned in the transcript. Look for both explicit and implicit action items.
+
+ 3. **Action Item Standards**: Each action item must be:
+ - **Specific**: Clear enough that someone not in the meeting understands what to do
+ - **Action-oriented**: Begin with a strong verb (e.g., "Create," "Submit," "Review," "Fix," "Research")
+ - **Complete**: Include all necessary context, documents, or reference materials mentioned
+ - **Measurable**: Include success criteria or a deliverable that indicates completion
+
+ 4. **Handling Ambiguity**:
+ - If an item is vague or abstract, break it down into 2-3 concrete sub-tasks
+ - If no owner is explicitly stated, infer from context based on who raised the item, who has relevant expertise, or the role discussed
+ - Mark items with inferred assignments as [Inferred] and note the reasoning
+
+ 5. **Deadlines and Priority**:
+ - Extract specific deadlines mentioned. If none are mentioned, mark as "Deadline not specified"
+ - Identify priority levels from context:
+ - **High/Urgent**: Blocks other work, mentioned as critical, or has a firm deadline
+ - **Medium**: Important but not blocking
+ - **Low**: Nice-to-have or can be deferred
+ - If priority is unclear, mark as "Priority not specified"
+
+ 6. **Format**:
+ Use a numbered list format for each action item with the following structure:
+
+ ```
+ [ID]. [Action verb] [Specific task description]
+
+ • Owner: [Name/Role] [Mark [Inferred] if not explicit]
+ • Deadline: [Specific date/time OR "Not specified"]
+ • Priority: [High/Medium/Low OR "Not specified"]
+ • Success Criteria: [How completion will be verified OR "Not specified"]
+ • Context/Notes: [Relevant details, dependencies, reference materials]
+ ```
+
+ Group related items together under logical headers if helpful.
+
+ 7. **Quality Checks**:
+ - Ensure every item starts with an action verb
+ - Verify each item has at least one owner (explicit or inferred)
+ - Confirm no item is so vague it would require follow-up clarification
+ - If an item cannot be made specific from the transcript, mark it as [Requires Clarification] and include a brief note
+
+ ## Transcript
+ {{TRANSCRIPT_ALL}}
@@ -0,0 +1,54 @@
+ # Blog Narrative Draft
+
+ You are a creative writer transforming a technical or business session into a narrative-driven blog post or article.
+
+ ## Context
+ Metadata: {{METADATA}}
+ Detected Language: {{LANGUAGE}}
+
+ ## Instructions
+
+ 1. **Language Rule**: Use English for headings and structural elements. The actual narrative content, quotes, and creative storytelling must remain in the original language ({{LANGUAGE}}).
+
+ 2. **Adopt a Proven Narrative Structure** (choose one based on session content):
+ - **Problem-Solution Arc**: Start with a relatable problem → Show struggle/attempted solutions → Describe the breakthrough or key insight → Explain the resolution → Share lessons learned
+ - **Hero's Journey**: Introduce a protagonist (could be the speaker, team, or customer) → Present the challenge they face → Describe the guide/mentor or discovery → Show the transformation → Reflect on the journey
+ - **Case Study Format**: Set the context/background → Define the challenge → Detail the approach/process → Reveal results → Extract key takeaways
+
+ 3. **Engagement Techniques**:
+ - **Hook your reader immediately** with one of these opening techniques:
+ * A compelling anecdote from the session
+ * A startling statistic or counterintuitive fact
+ * A provocative question that addresses the reader's pain point
+ * A "what if" scenario
+ - **Create tension early** by establishing the stakes — what's at risk? What problem needs solving?
+ - **"Show, don't tell"** throughout — use specific details, examples, and vivid descriptions rather than abstract statements
+ - **Build characters** — make the people in the session relatable with specific details, motivations, and perspectives
+
+ 4. **Scannability for Web Readers**:
+ - Use **short paragraphs** (2-3 sentences maximum)
+ - Insert **subheadings** every 3-4 paragraphs to guide readers through the narrative
+ - Use **bulleted lists** for key points, lessons, or examples
+ - Include **a pull quote** or highlighted insight from the transcript that captures the essence
+ - Start each paragraph with a **topic sentence** that hints at what follows
+
+ 5. **Required Structure**:
+ - **Compelling Hook** (opening technique as above)
+ - **The Narrative Arc** — logically flowing from beginning to middle to end, with rising action
+ - **Key Takeaways** — 3-5 concrete lessons or insights, presented as a bulleted list
+ - **Call-to-Action** — give the reader a purposeful next step: "Try this approach," "Apply this insight," "Share your experience," or "Join the conversation"
+ - **Conclusion** — reflect on the session's broader impact or future outlook
+
+ 6. **Tone and Style**:
+ - Use **active voice** throughout for clarity and impact
+ - Make content **relatable** — connect technical concepts to everyday experiences
+ - Balance **authenticity** with polish — preserve genuine moments from the transcript while ensuring flow
+ - Aim to be **memorable** through surprising insights, emotional resonance, or practical value
+
+ 7. **Avoid**:
+ - Chronological transcript dumps (the post is not a transcript)
+ - Overly jargon-heavy explanations without context
+ - Abstract generalities — ground every point in specific examples from the session
+
+ ## Transcript
+ {{TRANSCRIPT_ALL}}
@@ -0,0 +1,87 @@
+ # Blog Research Synthesis
+
+ You are a content researcher synthesizing a deep-dive learning or research session into a structured knowledge base following systematic qualitative analysis methods.
+
+ ## Context
+ Metadata: {{METADATA}}
+ Detected Language: {{LANGUAGE}}
+
+ ## Instructions
+
+ ### 1. Language Rule
+ - **Structural elements**: Use English for all headings, section labels, bullet points, and organizational markers.
+ - **Research content**: Preserve the original language ({{LANGUAGE}}) for all research findings, quotes, examples, and conceptual explanations.
+
+ ### 2. Analysis Methodology
+ Follow this systematic process before writing your output:
+
+ **Step 1: Coding**
+ - Identify and label key concepts, claims, and arguments in the transcript
+ - Look for recurring terminology and ideas
+ - Note direct quotes that capture essential insights
+
+ **Step 2: Pattern Recognition**
+ - Group related codes into clusters
+ - Identify relationships between concepts (causality, comparison, contrast)
+ - Detect contradictions or tensions in the reasoning
+
+ **Step 3: Theme Development**
+ - Name each theme clearly and descriptively
+ - Define the scope of each theme
+ - Validate that themes accurately reflect the transcript content
+
+ **Step 4: Synthesis**
+ - Connect themes to higher-level insights
+ - Identify the overarching narrative or framework
+ - Distinguish between what was explored, what was concluded, and what remains uncertain
+
+ ### 3. Output Structure
+
+ #### A. Research Overview
+ - **Research Goal**: What specific question, problem, or topic was being explored?
+ - **Research Context**: Why this topic matters (background, motivation, stakes)
+ - **Research Scope**: What was in-scope vs. out-of-scope for this session
+
+ #### B. Thematic Breakdown
+ For each major theme, provide:
+ - **Theme Name**: Clear, descriptive label
+ - **Theme Definition**: What this theme encompasses
+ - **Key Insights**: 3-5 core findings related to this theme
+ - **Supporting Evidence**: Direct quotes or paraphrased evidence (preserve original language)
+ - **Connections**: How this theme relates to other themes
+
+ #### C. Comparative Analysis (if applicable)
+ When multiple options, approaches, or perspectives were discussed:
+ - **Options Compared**: List each option/approach
+ - **Criteria for Comparison**: What dimensions were evaluated
+ - **Pros and Cons**: Evidence-based advantages and disadvantages for each
+ - **Trade-offs**: What had to be given up for each choice
+
+ #### D. Key Findings
+ Synthesize the most important takeaways:
+ - **Primary Insights**: The central conclusions or discoveries
+ - **Evidence Base**: What evidence supports these insights (cite transcript segments)
+ - **Implications**: What these findings mean for the broader topic
+ - **Confidence Level**: How certain this finding is (High/Medium/Low) and why
+
+ #### E. Knowledge Graph
+ Create a network view of the session's concepts:
+ - **Core Concepts**: The fundamental ideas discussed
+ - **Relationships**: How concepts connect (e.g., "depends on", "contradicts", "builds on")
+ - **Hierarchy**: Which concepts are foundational vs. derived
+
+ #### F. Open Questions & Future Research
+ - **Unresolved Issues**: Questions that emerged but weren't answered
+ - **Knowledge Gaps**: Areas where more research is needed
+ - **Next Steps**: Concrete follow-up actions or investigations suggested
+
+ ### 4. Quality Standards
+ Your synthesis must:
+ - **Ground claims in evidence**: Every insight should reference transcript content
+ - **Preserve nuance**: Don't oversimplify complex or ambiguous discussions
+ - **Attribute sources**: When possible, indicate where claims came from (e.g., "according to the transcript at [topic/section]")
+ - **Distinguish evidence from interpretation**: Clearly separate what was said vs. what you infer
+ - **Maintain traceability**: Keep the output verifiable against the original transcript
+
+ ### 5. Transcript Data
+ {{TRANSCRIPT_ALL}}
@@ -0,0 +1,54 @@
+ # Card Format - Structured Per-Subject Output
+
+ You are generating a structured card summary of a work session. The session has been grouped into SUBJECTS (coherent work threads).
+
+ ## Session Metadata
+ - **Duration:** {{SESSION_DURATION}}
+ - **Date:** {{SESSION_DATE}}
+ - **Subjects:** {{SUBJECT_COUNT}}
+
+ ## Subjects
+
+ {{SUBJECTS_DATA}}
+
+ ---
+
+ ## Instructions
+
+ Generate a structured card summary with:
+
+ 1. **Per-subject sections** with:
+ - Subject label as header (## Subject Name)
+ - Duration and activity breakdown in bold: `**3h 12m** | coding 1h 45m, debugging 52m`
+ - 2-4 bullet points of key accomplishments/activities (extracted from the descriptions)
+
+ 2. **Personal subjects** should be shown as a single line at the end: `*Personal time: 47m (WhatsApp, Instagram)*`
+
+ 3. **Format example:**
+
+ ```markdown
+ # Session Card - Feb 25, 2026
+
+ ## Escribano Pipeline Optimization
+ **3h 12m** | coding 1h 45m, debugging 52m, terminal 35m
+
+ - Achieved 20.6x speedup in scene detection with skip-frame nokey strategy
+ - Resolved LLM truncation errors via raw response logging
+ - Benchmarked MLX vs Ollama VLM performance
+
+ ## Research & Exploration
+ **32m** | research 22m, other 10m
+
+ - Explored Screenpipe repository architecture for comparison
+ - Reviewed HuggingFace model options for VLM inference
+
+ ---
+ *Personal time: 47m (filtered)*
+ ```
+
+ **Rules:**
+ - Be concise - each subject gets 2-4 bullets max
+ - Extract specifics from descriptions (metrics, file names, error types)
+ - Use present tense, first person
+ - Total output should be 200-500 words
+ - DO NOT include raw descriptions or transcripts - synthesize into bullets
@@ -0,0 +1,38 @@
+ # Segment Classification Prompt
+
+ You are an expert session analyst. Your task is to classify a specific segment of a work session based on visual evidence and available audio.
+
+ ## Input Context
+
+ - **Time Range**: {{TIME_RANGE}}
+ - **Visual Context**: {{VISUAL_CONTEXT}}
+ - **OCR Evidence**: {{OCR_CONTEXT}}
+ - **Transcript Content**: {{TRANSCRIPT_CONTENT}}
+ - **Vision Model Analysis**: {{VLM_DESCRIPTION}}
+
+ ## Classification Types
+
+ 1. **meeting**: Conversations, interviews, or group discussions. Multiple speakers or Q&A.
+ 2. **debugging**: Troubleshooting errors, fixing bugs, investigating log outputs.
+ 3. **tutorial**: Teaching or demonstrating a process step-by-step.
+ 4. **learning**: Researching, studying documentation, reading articles, watching educational videos.
+ 5. **working**: Active building, coding (not debugging), writing documents, designing.
+
+ ## Task
+
+ Analyze the evidence and provide a multi-label classification score (0-100) for each type. The scores represent your confidence/degree of matching for that specific segment.
+
+ If the segment contains background music or is purely transitional/noise (e.g., browsing a music player), assign low scores to all categories or focus on the primary intent if visible.
+
+ ## Output Format
+
+ Return ONLY a JSON object with this structure:
+ ```json
+ {
+   "meeting": number,
+   "debugging": number,
+   "tutorial": number,
+   "learning": number,
+   "working": number
+ }
+ ```
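The JSON contract above is only useful if the consuming code defends against messy model output (prose around the object, out-of-range or missing scores). A minimal TypeScript sketch of such a parser — a hypothetical helper for illustration, not the package's actual adapter code:

```typescript
// Hypothetical parser for the segment-classification JSON above.
// Assumes the model may surround the object with extra prose.
type SegmentScores = {
  meeting: number;
  debugging: number;
  tutorial: number;
  learning: number;
  working: number;
};

const LABELS: (keyof SegmentScores)[] = [
  "meeting", "debugging", "tutorial", "learning", "working",
];

function parseSegmentScores(raw: string): SegmentScores {
  // Grab the first {...} span, ignoring any surrounding text.
  const match = raw.match(/\{[\s\S]*\}/);
  if (!match) throw new Error("no JSON object found in model response");
  const parsed = JSON.parse(match[0]) as Record<string, unknown>;
  const scores = {} as SegmentScores;
  for (const label of LABELS) {
    const v = parsed[label];
    // Missing or non-numeric labels default to 0; clamp into 0-100.
    const n = typeof v === "number" && Number.isFinite(v) ? v : 0;
    scores[label] = Math.min(100, Math.max(0, n));
  }
  return scores;
}
```

Clamping rather than rejecting keeps a single malformed score from discarding an otherwise usable classification.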
@@ -0,0 +1,37 @@
+ # Session Classification
+
+ Output ONLY JSON scores (0-100) for each session type.
+
+ ## Session Types:
+
+ **meeting** - Conversations, interviews, discussions
+ Examples: Team meetings, client calls, 1-on-1s, planning sessions
+ Look for: Multiple speakers, Q&A format, decisions being made
+
+ **debugging** - Fixing errors and troubleshooting
+ Examples: Finding bugs, fixing tests, resolving crashes
+ Look for: Error messages, "not working", investigation steps
+
+ **tutorial** - Teaching or demonstrating
+ Examples: How-to guides, walkthroughs, step-by-step explanations
+ Look for: Instructions, "first do this, then...", teaching tone
+
+ **learning** - Researching or studying
+ Examples: Reading docs, exploring frameworks, comparing options
+ Look for: "Let me understand", research, exploration
+
+ **working** - Building or creating (not fixing)
+ Examples: Writing features, refactoring, implementing new code
+ Look for: Creating files, "let's implement", productive coding
+
+ ## Input Context:
+
+ ### Visual Log (Screen Activity)
+ {{VISUAL_LOG}}
+
+ ### Transcript
+ {{TRANSCRIPT_ALL}}
+
+ ## Output Format:
+ Output ONLY JSON scores (0-100) for each session type.
+ {"meeting": 85, "debugging": 10, "tutorial": 0, "learning": 45, "working": 20}
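Because the scores are multi-label, downstream code still has to reduce them to one or more session types. One plausible reduction, sketched in TypeScript — the 50-point threshold is an assumption for illustration, not a value taken from the package:

```typescript
// Illustrative reduction of multi-label scores to primary labels.
// Returns every label at or above the threshold, highest score first.
function dominantLabels(
  scores: Record<string, number>,
  threshold = 50, // assumed cutoff, not escribano's actual setting
): string[] {
  return Object.entries(scores)
    .filter(([, score]) => score >= threshold)
    .sort(([, a], [, b]) => b - a)
    .map(([label]) => label);
}
```

Applied to the example output above, only `meeting` (85) clears the cutoff, while `learning` (45) narrowly misses it.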
@@ -0,0 +1,163 @@
+ # Code Snippets & Implementation Details
+
+ You are a developer documenting code changes and implementation details from a working session. Your goal is to create **literate documentation**: an explanatory narrative that embeds code as part of the story of how and why the solution was built.
+
+ ## Context
+ Metadata: {{METADATA}}
+ Visual Log: {{VISUAL_LOG}}
+ Detected Language: {{LANGUAGE}}
+
+ ## Visual Integration Rule
+ If a code snippet is demonstrated on screen but not fully captured in text, you can request a screenshot of the editor using `[SCREENSHOT: timestamp]`.
+
+ ## Core Principles
+ 1. **Literate Programming**: Organize code by human logic, not file order. Tell the story of implementation decisions.
+ 2. **Why Over How**: Focus on motivations, trade-offs, and reasoning. Let the code speak for itself when possible.
+ 3. **Complete Yet Concise**: Include necessary context, imports, and error handling, but avoid obvious explanations.
+
+ ## Language Rule
+ Use English for:
+ - Section headings
+ - Structure markers (e.g., "##", "**", lists)
+ - Technical terminology (e.g., "function", "class", "exception")
+
+ Use {{LANGUAGE}} for:
+ - Code content and variable names
+ - Descriptions of implementation logic
+ - Explanations of what code does in the original language
+
+ ## Output Structure
+
+ ### 1. Implementation Overview
+ **Purpose**: Summarize what was built and why it matters.
+
+ Include:
+ - Problem statement (what challenge did this solve?)
+ - High-level approach (what pattern/architecture?)
+ - Key components (modules, classes, major functions)
+ - Dependencies (external libraries, APIs, services)
+ - Known limitations or TODOs
+
+ **Format**: 3-5 paragraphs maximum.
+
+ ---
+
+ ### 2. Refined Code Snippets
+ **Purpose**: Present clean, documented, ready-to-use code.
+
+ Organize **hierarchically** by logical flow:
+ - By component/module (if multiple)
+ - By class or major function group
+ - By implementation phase (setup → core → helpers)
+
+ **For each snippet**:
+ 1. **Context** (2-3 sentences): What does this code do? Where does it fit?
+ 2. **Code block**:
+ - Include necessary imports
+ - Follow language conventions (PEP 8 for Python, Google Style for JS/TS, etc.)
+ - Add **docstrings** for all functions/classes with:
+ - Summary line (imperative: "Do X", not "Does X")
+ - Parameters (name, type, description)
+ - Return value (type, description)
+ - Exceptions raised (if any)
+ - Mark incomplete/placeholder code with `[TODO]` or `[FIXME]`
+ 3. **Notes** (if needed): Edge cases, assumptions, or important details
+
+ **Improvement Guidelines**:
+ - Fix formatting (indentation, line length < 80 chars where possible)
+ - Add missing imports
+ - Complete partial code where intent is clear from the transcript
+ - Remove commented-out dead code
+ - Standardize naming (snake_case for Python/other, camelCase for JS/TS)
+ - Add type hints (Python) or JSDoc (JavaScript/TypeScript) if clear from context
+
+ **Example format**:
+ ```python
+ # Helper function for processing user input
+
+ def validate_email(email: str) -> bool:
+     """Validate an email address using regex pattern.
+
+     Args:
+         email: The email string to validate.
+
+     Returns:
+         bool: True if valid, False otherwise.
+
+     Raises:
+         ValueError: If email is None or empty string.
+     """
+     import re
+     # Implementation...
+ ```
+
+ ---
+
+ ### 3. Technical Decisions
+ **Purpose**: Document the reasoning behind key choices.
+
+ For each significant decision (pattern, library, architecture choice):
+ 1. **Decision**: What was chosen?
+ 2. **Alternatives Considered**: What other options existed?
+ 3. **Rationale**: Why was this chosen? (trade-offs, requirements, constraints)
+ 4. **Implications**: What does this decision affect? (performance, maintainability, future work)
+
+ **Prioritize**: Architecture choices, algorithm selection, library dependencies, data structures.
+
+ **Format**: Bullet points or a short table.
+
+ ---
+
+ ### 4. Usage Examples
+ **Purpose**: Show how to use the implemented code.
+
+ Provide 1-3 **runnable examples** covering:
+ - Basic use case (primary functionality)
+ - Edge case or advanced use (if applicable)
+ - Integration with other components (if relevant)
+
+ **Each example should**:
+ - Be self-contained (setup, execution, expected output)
+ - Include comments explaining each step
+ - Show both success and error paths (if applicable)
+
+ **Format**: Code block with explanatory text before/after.
+
+ ---
+
+ ### 5. Testing & Validation (Optional but Recommended)
+ **Purpose**: Verify the implementation works as intended.
+
+ Include if mentioned in the transcript or implied by complexity:
+ - Test cases for critical functions
+ - Example inputs and expected outputs
+ - Known bugs or areas needing more testing
+
+ ---
+
+ ## Source Material
+ Use the following as input, prioritizing completeness and clarity:
+
+ - **Transcript**: {{TRANSCRIPT_ALL}}
+ - **Pre-extracted snippets**: {{CODE_SNIPPETS}}
+
+ When the transcript and snippets conflict:
+ - Use the transcript for context and intent
+ - Use the snippets for code structure
+ - Reconcile by favoring code that makes logical sense
+
+ ---
+
+ ## Quality Checklist
+ Before finalizing:
+ - [ ] All code blocks are syntactically valid (in the target language)
+ - [ ] Every function/class has a docstring
+ - [ ] All imports are included at the top of relevant blocks
+ - [ ] Decisions include alternatives and rationale
+ - [ ] Usage examples are runnable (or clearly marked as pseudocode)
+ - [ ] Incomplete code is marked with [TODO] or similar
+ - [ ] English is used for structure only, {{LANGUAGE}} for content
+ - [ ] No obvious code is explained (let the code speak for itself)
+
+ ## Transcript
+ {{TRANSCRIPT_ALL}}
@@ -0,0 +1,149 @@
+ # Transcript Metadata Extraction
+
+ Extract structured metadata from this session.
+
+ ## Visual Log (Significant Screen Changes)
+ {{VISUAL_LOG}}
+
+ ## Session Classification
+ {{CLASSIFICATION_SUMMARY}}
+
+ Example: "meeting: 85%, learning: 45%"
+
+ ## Metadata Types to Extract
+
+ ### 1. Speakers (extract if meeting/tutorial)
+ List all participants mentioned in the conversation with their roles if provided.
+
+ **Fields:**
+ - `name`: Participant's name
+ - `role`: Their role/title if mentioned (e.g., "Engineering Lead", "Product Manager")
+
+ **Example Output:**
+ ```json
+ {
+   "speakers": [
+     {"name": "Alice", "role": "Engineering Lead"},
+     {"name": "Bob", "role": "Product Manager"},
+     {"name": "Carol", "role": "Designer"}
+   ]
+ }
+ ```
+
+ ### 2. Key Moments (extract always)
+ Important timestamps with descriptions of significant events, decisions, or insights.
+
+ **CRITICAL:** If the transcript is silent, use the **Visual Log** to identify steps. A scene change at a specific timestamp usually indicates a new action or state.
+
+ **Fields:**
+ - `timestamp`: Time in seconds
+ - `description`: What happened
+ - `importance`: "high", "medium", or "low"
+
+ **Importance Guidelines:**
+ - **high**: Major decisions, critical issues, breakthrough insights
+ - **medium**: Important discussions, technical findings
+ - **low**: Minor details, background information
+
+ **Example Output:**
+ ```json
+ {
+   "keyMoments": [
+     {"timestamp": 120, "description": "Decided on Q1 priorities", "importance": "high"},
+     {"timestamp": 450, "description": "Identified root cause of authentication bug", "importance": "high"},
+     {"timestamp": 600, "description": "Reviewed database schema", "importance": "medium"}
+   ]
+ }
+ ```
+
+ ### 3. Action Items (extract if meeting/working)
+ Specific tasks that need to be completed, with owners and priorities.
+
+ **Fields:**
+ - `description`: What needs to be done
+ - `owner`: Who is responsible (use "Unknown" if unclear)
+ - `priority`: "high", "medium", or "low"
+
+ **Example Output:**
+ ```json
+ {
+   "actionItems": [
+     {"description": "Create technical spec for auth feature", "owner": "Alice", "priority": "high"},
+     {"description": "Schedule user research sessions", "owner": "Bob", "priority": "medium"},
+     {"description": "Update documentation", "owner": "Carol", "priority": "low"}
+   ]
+ }
+ ```
+
+ ### 4. Technical Terms (extract if debugging/working/learning)
+ Error messages, file paths, function names, variables, or other technical concepts mentioned.
+
+ **Fields:**
+ - `term`: The technical term
+ - `context`: Where it was mentioned or what it means
+ - `type`: One of: "error", "file", "function", "variable", "other"
+
+ **Type Guidelines:**
+ - **error**: Error messages, stack traces, exception names
+ - **file**: File paths, document names, configuration files
+ - **function**: Function/method names, API calls
+ - **variable**: Variable names, constants, configuration keys
+ - **other**: Other technical terms not fitting the above categories
+
+ **Example Output:**
+ ```json
+ {
+   "technicalTerms": [
+     {"term": "NullPointerException", "context": "User login flow error", "type": "error"},
+     {"term": "/api/auth/validate", "context": "Endpoint for validating JWT tokens", "type": "function"},
+     {"term": "config.yaml", "context": "Configuration file for auth service", "type": "file"},
+     {"term": "MAX_RETRIES", "context": "Environment variable for retry logic", "type": "variable"}
+   ]
+ }
+ ```
+
+ ### 5. Code Snippets (extract if working/tutorial/learning)
+ Code examples, commands, or technical explanations with code.
+
+ **Fields:**
+ - `language`: Programming language or command type (e.g., "typescript", "python", "bash")
+ - `code`: The actual code
+ - `description`: What the code does (optional)
+ - `timestamp`: Approximate time in seconds if mentioned (optional)
+
+ **Example Output:**
+ ```json
+ {
+   "codeSnippets": [
+     {
+       "language": "typescript",
+       "code": "if (user != null) {\n validateToken(user.token);\n}",
+       "description": "Null check before token validation"
+     },
+     {
+       "language": "bash",
+       "code": "npm install --save @auth/sdk",
+       "description": "Install authentication SDK"
+     }
+   ]
+ }
+ ```
+
+ ## Output Format
+
+ Output ONLY valid JSON with the following structure. If a metadata type doesn't apply to this session, include it as an empty array.
+
+ ```json
+ {
+   "speakers": [...],
+   "keyMoments": [...],
+   "actionItems": [...],
+   "technicalTerms": [...],
+   "codeSnippets": [...]
+ }
+ ```
+
+ ## Input Data
+
+ ### Transcript Segments
+ {{TRANSCRIPT_SEGMENTS}}
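A consumer of this prompt still has to enforce the empty-array rule when the model omits a section or emits the wrong type for it. A hypothetical TypeScript normalizer sketching that guarantee (the names are illustrative, not the package's actual API):

```typescript
// Hypothetical normalizer for the metadata JSON above: guarantees every
// section exists as an array even when the model omits or mistypes it.
type SessionMetadata = {
  speakers: unknown[];
  keyMoments: unknown[];
  actionItems: unknown[];
  technicalTerms: unknown[];
  codeSnippets: unknown[];
};

const SECTIONS: (keyof SessionMetadata)[] = [
  "speakers", "keyMoments", "actionItems", "technicalTerms", "codeSnippets",
];

function normalizeMetadata(parsed: Record<string, unknown>): SessionMetadata {
  const out = {} as SessionMetadata;
  for (const key of SECTIONS) {
    const v = parsed[key];
    out[key] = Array.isArray(v) ? v : []; // missing or invalid -> empty array
  }
  return out;
}
```

With this shape guaranteed, downstream code can iterate each section unconditionally instead of null-checking five fields.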