@houtini/gemini-mcp 1.2.2 → 1.3.0

package/README.md CHANGED
@@ -583,6 +583,160 @@ Use Gemini deep research with 3 iterations focusing on cost analysis and market
583
583
 
584
584
  **Note**: Deep research takes several minutes as it performs multiple iterations. Perfect for when you need comprehensive, well-researched analysis rather than quick answers.
585
585
 
586
+ ## Video Description for Accessibility
587
+
588
+ The server includes a specialized **video description tool** that generates detailed audio descriptions of YouTube video content for accessibility purposes. The tool analyzes visual content, including actions, spatial relationships, and text overlays, and produces time-stamped descriptions suitable for screen reader users.
589
+
590
+ ### How Video Description Works
591
+
592
+ The tool uses Gemini's multimodal understanding capabilities to:
593
+
594
+ 1. **Analyze Visual Content** - Processes the video frame by frame to understand what's happening
595
+ 2. **Describe Actions** - Details all movements, gestures, and procedures shown
596
+ 3. **Spatial Awareness** - Describes positions, movements, and relationships between elements
597
+ 4. **Text Recognition** - Transcribes all on-screen text, captions, and overlays verbatim
598
+ 5. **Structured Output** - Organizes information in screen reader-friendly formats
599
+
600
+ ### Using Video Description
601
+
602
+ **Basic Usage:**
603
+ ```
604
+ Describe this YouTube video for accessibility: https://www.youtube.com/watch?v=abc123
605
+ ```
606
+
607
+ **With Specific Detail Level:**
608
+ ```
609
+ Generate a comprehensive accessibility description for this video: https://www.youtube.com/watch?v=abc123
610
+ ```
611
+
612
+ **Focus on Specific Aspects:**
613
+ ```
614
+ Describe the actions shown in this instructional video: https://www.youtube.com/watch?v=abc123
615
+ ```
616
+
617
+ ### Description Parameters
618
+
619
+ | Parameter | Options | Default | Description |
620
+ |-----------|---------|---------|-------------|
621
+ | `detail_level` | `brief`, `standard`, `comprehensive` | `standard` | Controls description depth |
622
+ | `accessibility_mode` | `true`, `false` | `true` | Includes spatial, color, and text details |
623
+ | `include_timestamps` | `true`, `false` | `true` | Adds timestamps to descriptions |
624
+ | `output_format` | `narrative`, `structured` | `structured` | Output as prose or JSON |
625
+ | `focus` | `actions`, `visual_elements`, `text_on_screen`, `all` | `all` | What to emphasize |
626
+
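The defaults in the table above can be sketched as a small helper. This is a minimal illustration, not the package's own code: the type names and `withDefaults` are hypothetical, and only the option names, allowed values, and default values are taken from the table.

```typescript
// Hypothetical types mirroring the documented parameters.
type DetailLevel = "brief" | "standard" | "comprehensive";
type OutputFormat = "narrative" | "structured";
type Focus = "actions" | "visual_elements" | "text_on_screen" | "all";

interface DescribeVideoArgs {
  url: string;
  detail_level?: DetailLevel;
  accessibility_mode?: boolean;
  include_timestamps?: boolean;
  output_format?: OutputFormat;
  focus?: Focus;
}

// Fill in the documented default for every option the caller omitted.
// (Illustrative helper; the name is an assumption, not a package export.)
function withDefaults(args: DescribeVideoArgs): Required<DescribeVideoArgs> {
  return {
    url: args.url,
    detail_level: args.detail_level ?? "standard",
    accessibility_mode: args.accessibility_mode ?? true,
    include_timestamps: args.include_timestamps ?? true,
    output_format: args.output_format ?? "structured",
    focus: args.focus ?? "all",
  };
}

const resolved = withDefaults({
  url: "https://www.youtube.com/watch?v=abc123",
  detail_level: "brief",
});
console.log(resolved);
```

Only `url` has no default, so a caller who passes nothing else gets a standard-depth, structured, time-stamped description with accessibility mode on.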
627
+ ### Detail Levels Explained
628
+
629
+ **Brief** (2-3 paragraphs)
630
+ - High-level summary of video content
631
+ - Main actions and key points only
632
+ - Perfect for quick overview
633
+
634
+ **Standard** (5-8 scenes)
635
+ - Balanced description with key scenes
636
+ - Major actions and visual elements
637
+ - Recommended for most use cases
638
+
639
+ **Comprehensive** (10-15+ scenes)
640
+ - Extensive detail for every significant moment
641
+ - Complete coverage of all visual information
642
+ - Best for instructional or technical content
643
+
644
+ ### Accessibility Mode Features
645
+
646
+ When enabled (the default), the description includes:
647
+
648
+ - **Spatial Descriptions**: "on the left side of frame", "moving from right to left"
649
+ - **Color Information**: "red laser line", "blue indicator light"
650
+ - **Text Transcription**: All on-screen text verbatim
651
+ - **Facial Expressions**: When relevant to understanding
652
+ - **Sound Cues**: Visible indicators of sound, such as a person speaking or a tool making noise
653
+ - **Concrete Language**: Avoids vague references like "this" or "here"
654
+
655
+ ### Output Formats
656
+
657
+ **Structured JSON** (default):
658
+ ```json
659
+ {
660
+ "video_title": "How to Install Picture Mounts",
661
+ "duration": "4:32",
662
+ "scene_count": 8,
663
+ "scenes": [
664
+ {
665
+ "timestamp": "0:00-0:45",
666
+ "description": "Close-up of hands positioning metal cleat on frame back",
667
+ "actions": ["Marking screw holes", "Using pencil"],
668
+ "tools_visible": ["Metal cleat", "Pencil"],
669
+ "text_on_screen": null,
670
+ "spatial_notes": "Cleat positioned horizontally on upper portion of frame",
671
+ "colors": "Silver metal cleat against dark wood frame"
672
+ }
673
+ ],
674
+ "key_techniques": ["Using laser level", "Locating wall studs"],
675
+ "tools_list": ["Drill", "Laser level", "Stud finder"],
676
+ "summary": "Demonstrates professional picture hanging using interlocking cleats"
677
+ }
678
+ ```
679
+
680
+ **Narrative Prose**:
681
+ ```
682
+ [0:00] The video opens with a close-up view of hands positioning a silver metal
683
+ cleat horizontally on the back of a dark wooden picture frame...
684
+ ```
685
+
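To show how the two formats relate, here is a rough sketch that flattens a structured result into narrative-style lines. It assumes each scene's `timestamp` is a `start-end` range as in the JSON sample; `toNarrative` and the trimmed-down interfaces are hypothetical and are not something the tool itself provides.

```typescript
// Minimal subset of the structured output used by this sketch.
interface VideoScene {
  timestamp: string; // e.g. "0:00-0:45"
  description: string;
}

interface StructuredVideoDescription {
  scenes: VideoScene[];
  summary: string;
}

// Hypothetical converter: one "[start] description" line per scene,
// followed by the overall summary.
function toNarrative(desc: StructuredVideoDescription): string {
  const lines = desc.scenes.map((scene) => {
    // Keep only the start of the "start-end" range.
    const [start = ""] = scene.timestamp.split("-");
    return `[${start.trim()}] ${scene.description}`;
  });
  return [...lines, desc.summary].join("\n");
}

const narrative = toNarrative({
  scenes: [
    {
      timestamp: "0:00-0:45",
      description: "Close-up of hands positioning metal cleat on frame back",
    },
  ],
  summary: "Demonstrates professional picture hanging using interlocking cleats",
});
console.log(narrative);
```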
686
+ ### Perfect For
687
+
688
+ **Accessibility Needs**:
689
+ - Screen reader users
690
+ - Visually impaired individuals
691
+ - Audio description requirements
692
+ - WCAG compliance
693
+
694
+ **Content Analysis**:
695
+ - Tutorial documentation
696
+ - Training material transcription
697
+ - Video content indexing
698
+ - Instructional content breakdown
699
+
700
+ **Research & Education**:
701
+ - Video content study
702
+ - Procedural analysis
703
+ - Teaching material preparation
704
+
705
+ ### Best Practices
706
+
707
+ 1. **Choose appropriate detail level** - Brief for overviews, comprehensive for tutorials
708
+ 2. **Enable accessibility mode** - Keep it on for screen reader users (it is enabled by default)
709
+ 3. **Use structured format** - Better for navigation with assistive technology
710
+ 4. **Include timestamps** - Essential for synchronizing with video playback
711
+ 5. **Set the `focus` parameter** - Use it when specific aspects matter most
712
+
713
+ ### Technical Requirements
714
+
715
+ - **Supported Platforms**: YouTube videos only
716
+ - **Model Used**: Gemini 2.5 Flash (optimized for video understanding)
717
+ - **Response Time**: 30-60 seconds depending on detail level
718
+ - **Token Usage**: 8,000-16,000 tokens for comprehensive descriptions
719
+
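The YouTube-only restriction can be checked before calling the tool. The compiled `execute` method in this diff uses a substring test (`includes('youtube.com')` or `includes('youtu.be')`), which would also accept a URL such as `https://example.com/?next=youtube.com`. The sketch below is a stricter alternative based on hostname parsing; the `isYouTubeUrl` name and the host list are assumptions for illustration, not part of the package.

```typescript
// Assumed set of accepted hosts; adjust to whatever the deployment allows.
const YOUTUBE_HOSTS = new Set([
  "youtube.com",
  "www.youtube.com",
  "m.youtube.com",
  "youtu.be",
]);

// Hypothetical validator: parse the URL and compare the hostname exactly,
// instead of searching the whole string for "youtube.com".
function isYouTubeUrl(input: string): boolean {
  let parsed: URL;
  try {
    parsed = new URL(input);
  } catch {
    return false; // not a parseable absolute URL
  }
  if (parsed.protocol !== "https:" && parsed.protocol !== "http:") {
    return false;
  }
  return YOUTUBE_HOSTS.has(parsed.hostname);
}

console.log(isYouTubeUrl("https://www.youtube.com/watch?v=abc123"));
console.log(isYouTubeUrl("https://example.com/?next=youtube.com"));
```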
720
+ ### Example Use Cases
721
+
722
+ **Tutorial Documentation**:
723
+ ```
724
+ Generate a comprehensive accessibility description focusing on actions for this DIY tutorial:
725
+ https://www.youtube.com/watch?v=example
726
+ ```
727
+
728
+ **Content Indexing**:
729
+ ```
730
+ Create a brief structured description of this product demo:
731
+ https://www.youtube.com/watch?v=example
732
+ ```
733
+
734
+ **Training Materials**:
735
+ ```
736
+ Describe the visual elements and text shown in this training video:
737
+ https://www.youtube.com/watch?v=example
738
+ ```
739
+
586
740
  ## API Reference
587
741
 
588
742
  ### Available Tools
@@ -653,6 +807,58 @@ Conduct iterative multi-step research on complex topics.
653
807
  }
654
808
  ```
655
809
 
810
+ #### `gemini_describe_video`
811
+
812
+ Generate detailed audio descriptions of YouTube video content for accessibility.
813
+
814
+ **Parameters:**
815
+
816
+ | Parameter | Type | Required | Default | Description |
817
+ |-----------|------|----------|---------|-------------|
818
+ | `url` | string | Yes | - | YouTube URL to analyze |
819
+ | `detail_level` | string | No | `standard` | Level of detail: `brief`, `standard`, `comprehensive` |
820
+ | `accessibility_mode` | boolean | No | `true` | Include spatial descriptions, colors, text overlays |
821
+ | `include_timestamps` | boolean | No | `true` | Add timestamps for all major actions |
822
+ | `output_format` | string | No | `structured` | Format: `narrative` (prose) or `structured` (JSON) |
823
+ | `focus` | string | No | `all` | Focus: `actions`, `visual_elements`, `text_on_screen`, `all` |
824
+
825
+ **Example:**
826
+ ```json
827
+ {
828
+ "url": "https://www.youtube.com/watch?v=abc123",
829
+ "detail_level": "comprehensive",
830
+ "accessibility_mode": true,
831
+ "include_timestamps": true,
832
+ "output_format": "structured",
833
+ "focus": "all"
834
+ }
835
+ ```
836
+
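When the tool is invoked programmatically rather than through a chat prompt, the arguments above travel inside an MCP `tools/call` request. The following is a sketch of that envelope only; `ToolCallRequest` and `buildDescribeVideoCall` are hypothetical names, and transport details (stdio framing, client setup) are left out.

```typescript
// Shape of an MCP "tools/call" JSON-RPC request. The interface name is
// illustrative; the method name and params layout follow the MCP protocol.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: {
    name: string;
    arguments: Record<string, unknown>;
  };
}

// Hypothetical helper that wraps the example arguments in a request envelope.
function buildDescribeVideoCall(id: number, url: string): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "gemini_describe_video",
      arguments: {
        url,
        detail_level: "comprehensive",
        accessibility_mode: true,
        include_timestamps: true,
        output_format: "structured",
        focus: "all",
      },
    },
  };
}

const call = buildDescribeVideoCall(1, "https://www.youtube.com/watch?v=abc123");
console.log(JSON.stringify(call, null, 2));
```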
837
+ **Response Format (Structured)**:
838
+ ```json
839
+ {
840
+ "video_title": "How to Install Picture Mounts",
841
+ "duration": "4:32",
842
+ "scene_count": 8,
843
+ "scenes": [
844
+ {
845
+ "timestamp": "0:00-0:45",
846
+ "description": "Detailed scene description",
847
+ "actions": ["action1", "action2"],
848
+ "tools_visible": ["tool1", "tool2"],
849
+ "text_on_screen": "Exact text or null",
850
+ "spatial_notes": "Spatial relationships",
851
+ "colors": "Color descriptions"
852
+ }
853
+ ],
854
+ "key_techniques": ["technique1", "technique2"],
855
+ "tools_list": ["all tools shown"],
856
+ "materials_list": ["all materials"],
857
+ "safety_notes": ["safety considerations"],
858
+ "summary": "Overall summary"
859
+ }
860
+ ```
861
+
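A client consuming the structured response may want to guard against the model wrapping its JSON in a Markdown code fence, which the compiled tool in this diff also strips before parsing. The sketch below shows one way to do that on the client side and to check the fields that the type declarations mark as required (`scene_count`, `scenes`, `summary`). The function and interface names are hypothetical.

```typescript
// Fields declared as required in StructuredVideoDescription.
interface ParsedDescription {
  scene_count: number;
  scenes: unknown[];
  summary: string;
}

// Build the fence marker at runtime so this example contains no literal fence.
const FENCE = "`".repeat(3);
const FENCE_PATTERN = new RegExp(FENCE + "(?:json)?\\n?", "g");

// Hypothetical parser: strip code fences, parse JSON, check required fields.
function parseStructuredResponse(raw: string): ParsedDescription {
  const cleaned = raw.trim().replace(FENCE_PATTERN, "").trim();
  const value: unknown = JSON.parse(cleaned); // throws SyntaxError on bad JSON
  if (typeof value !== "object" || value === null) {
    throw new Error("Response is not a JSON object");
  }
  const candidate = value as Record<string, unknown>;
  const sceneCount = candidate["scene_count"];
  const scenes = candidate["scenes"];
  const summary = candidate["summary"];
  if (
    typeof sceneCount !== "number" ||
    !Array.isArray(scenes) ||
    typeof summary !== "string"
  ) {
    throw new Error("Response is missing required fields");
  }
  return { scene_count: sceneCount, scenes, summary };
}

const sample =
  FENCE + "json\n" + '{"scene_count":1,"scenes":[{}],"summary":"ok"}' + "\n" + FENCE;
const parsedSample = parseStructuredResponse(sample);
console.log(parsedSample.scene_count, parsedSample.summary);
```

Optional fields such as `materials_list` and `safety_notes` are deliberately not checked here, since the declarations allow them to be absent.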
656
862
  ### Available Models
657
863
 
658
864
  Models are **dynamically discovered** from Google's API. The exact list may vary, but typically includes:
@@ -883,6 +1089,19 @@ By using this software, you acknowledge that you have read this disclaimer and a
883
1089
 
884
1090
  ## Changelog
885
1091
 
1092
+ ### v1.3.0
1093
+
1094
+ **Accessibility Features**
1095
+ - Added `gemini_describe_video` tool for YouTube video accessibility descriptions
1096
+ - Comprehensive audio description generation with spatial awareness and text recognition
1097
+ - Screen reader-friendly structured JSON output format
1098
+ - Configurable detail levels (brief, standard, comprehensive)
1099
+ - Time-stamped scene descriptions for video synchronization
1100
+ - Focus parameters for emphasizing specific content aspects (actions, visual elements, text)
1101
+ - Accessibility mode with spatial descriptions, colors, and text overlays
1102
+ - Support for narrative prose and structured JSON output formats
1103
+ - Optimized for Gemini 2.5 Flash's multimodal video understanding
1104
+
886
1105
  ### v1.0.4
887
1106
 
888
1107
  **Security and Dependency Updates**
@@ -58,7 +58,7 @@ exports.config = {
58
58
  threshold: 'BLOCK_MEDIUM_AND_ABOVE'
59
59
  }
60
60
  ],
61
- defaultModel: 'gemini-2.5-flash',
61
+ defaultModel: 'gemini-2.5-flash-lite-preview-09-2025',
62
62
  maxTokens: 16384,
63
63
  temperature: 0.7,
64
64
  defaultGrounding: true,
@@ -1 +1 @@
1
- {"version":3,"file":"index.js","sourceRoot":"","sources":["../../src/config/index.ts"],"names":[],"mappings":";;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;AA0CA,wCAOC;AAhDD,+CAAiC;AAEjC,MAAM,CAAC,MAAM,EAAE,CAAC;AAEH,QAAA,MAAM,GAAW;IAC5B,MAAM,EAAE;QACN,MAAM,EAAE,OAAO,CAAC,GAAG,CAAC,cAAc;QAClC,cAAc,EAAE;YACd;gBACE,QAAQ,EAAE,0BAA0B;gBACpC,SAAS,EAAE,wBAAwB;aACpC;YACD;gBACE,QAAQ,EAAE,2BAA2B;gBACrC,SAAS,EAAE,wBAAwB;aACpC;YACD;gBACE,QAAQ,EAAE,iCAAiC;gBAC3C,SAAS,EAAE,wBAAwB;aACpC;YACD;gBACE,QAAQ,EAAE,iCAAiC;gBAC3C,SAAS,EAAE,wBAAwB;aACpC;SACF;QACD,YAAY,EAAE,kBAAkB;QAChC,SAAS,EAAE,KAAK;QAChB,WAAW,EAAE,GAAG;QAChB,gBAAgB,EAAE,IAAI;QACtB,uBAAuB,EAAE,OAAO,CAAC,GAAG,CAAC,yBAAyB,KAAK,MAAM;KAC1E;IACD,MAAM,EAAE;QACN,IAAI,EAAE,YAAY;QAClB,OAAO,EAAE,OAAO;KACjB;IACD,OAAO,EAAE;QACP,KAAK,EAAE,OAAO,CAAC,GAAG,CAAC,SAAS,IAAI,MAAM;QACtC,MAAM,EAAE,UAAU;KACnB;CACF,CAAC;AAEF,SAAgB,cAAc;IAC5B,IAAI,CAAC,cAAM,CAAC,MAAM,CAAC,MAAM,EAAE,CAAC;QAC1B,MAAM,IAAI,KAAK,CACb,+CAA+C;YAC/C,yEAAyE,CAC1E,CAAC;IACJ,CAAC;AACH,CAAC"}
1
+ {"version":3,"file":"index.js","sourceRoot":"","sources":["../../src/config/index.ts"],"names":[],"mappings":";;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;AA0CA,wCAOC;AAhDD,+CAAiC;AAEjC,MAAM,CAAC,MAAM,EAAE,CAAC;AAEH,QAAA,MAAM,GAAW;IAC5B,MAAM,EAAE;QACN,MAAM,EAAE,OAAO,CAAC,GAAG,CAAC,cAAc;QAClC,cAAc,EAAE;YACd;gBACE,QAAQ,EAAE,0BAA0B;gBACpC,SAAS,EAAE,wBAAwB;aACpC;YACD;gBACE,QAAQ,EAAE,2BAA2B;gBACrC,SAAS,EAAE,wBAAwB;aACpC;YACD;gBACE,QAAQ,EAAE,iCAAiC;gBAC3C,SAAS,EAAE,wBAAwB;aACpC;YACD;gBACE,QAAQ,EAAE,iCAAiC;gBAC3C,SAAS,EAAE,wBAAwB;aACpC;SACF;QACD,YAAY,EAAE,uCAAuC;QACrD,SAAS,EAAE,KAAK;QAChB,WAAW,EAAE,GAAG;QAChB,gBAAgB,EAAE,IAAI;QACtB,uBAAuB,EAAE,OAAO,CAAC,GAAG,CAAC,yBAAyB,KAAK,MAAM;KAC1E;IACD,MAAM,EAAE;QACN,IAAI,EAAE,YAAY;QAClB,OAAO,EAAE,OAAO;KACjB;IACD,OAAO,EAAE;QACP,KAAK,EAAE,OAAO,CAAC,GAAG,CAAC,SAAS,IAAI,MAAM;QACtC,MAAM,EAAE,UAAU;KACnB;CACF,CAAC;AAEF,SAAgB,cAAc;IAC5B,IAAI,CAAC,cAAM,CAAC,MAAM,CAAC,MAAM,EAAE,CAAC;QAC1B,MAAM,IAAI,KAAK,CACb,+CAA+C;YAC/C,yEAAyE,CAC1E,CAAC;IACJ,CAAC;AACH,CAAC"}
@@ -1 +1 @@
1
- {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":";AAiBA,cAAM,eAAe;IACnB,OAAO,CAAC,MAAM,CAAS;IACvB,OAAO,CAAC,aAAa,CAAgB;IACrC,OAAO,CAAC,KAAK,CAAmB;;IAkChC,OAAO,CAAC,eAAe;IAevB,OAAO,CAAC,aAAa;IAoDf,KAAK,IAAI,OAAO,CAAC,IAAI,CAAC;CAsB7B;AAkCD,OAAO,EAAE,eAAe,EAAE,CAAC"}
1
+ {"version":3,"file":"index.d.ts","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":";AAkBA,cAAM,eAAe;IACnB,OAAO,CAAC,MAAM,CAAS;IACvB,OAAO,CAAC,aAAa,CAAgB;IACrC,OAAO,CAAC,KAAK,CAAmB;;IAkChC,OAAO,CAAC,eAAe;IAiBvB,OAAO,CAAC,aAAa;IAoDf,KAAK,IAAI,OAAO,CAAC,IAAI,CAAC;CAsB7B;AAkCD,OAAO,EAAE,eAAe,EAAE,CAAC"}
package/dist/index.js CHANGED
@@ -13,6 +13,7 @@ const gemini_1 = require("./services/gemini");
13
13
  const gemini_chat_1 = require("./tools/gemini-chat");
14
14
  const gemini_list_models_1 = require("./tools/gemini-list-models");
15
15
  const gemini_deep_research_1 = require("./tools/gemini-deep-research");
16
+ const gemini_describe_video_1 = require("./tools/gemini-describe-video");
16
17
  const logger_1 = __importDefault(require("./utils/logger"));
17
18
  const error_handler_1 = require("./utils/error-handler");
18
19
  class GeminiMcpServer {
@@ -48,9 +49,11 @@ class GeminiMcpServer {
48
49
  const geminiChatTool = new gemini_chat_1.GeminiChatTool(this.geminiService);
49
50
  const geminiListModelsTool = new gemini_list_models_1.GeminiListModelsTool(this.geminiService);
50
51
  const geminiDeepResearchTool = new gemini_deep_research_1.GeminiDeepResearchTool(this.geminiService);
52
+ const geminiDescribeVideoTool = new gemini_describe_video_1.GeminiDescribeVideoTool(this.geminiService);
51
53
  this.tools.set('gemini_chat', geminiChatTool);
52
54
  this.tools.set('gemini_list_models', geminiListModelsTool);
53
55
  this.tools.set('gemini_deep_research', geminiDeepResearchTool);
56
+ this.tools.set('gemini_describe_video', geminiDescribeVideoTool);
54
57
  logger_1.default.info('Tools initialized', {
55
58
  toolCount: this.tools.size,
56
59
  tools: Array.from(this.tools.keys())
package/dist/index.js.map CHANGED
@@ -1 +1 @@
1
- {"version":3,"file":"index.js","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":";;;;;;;AAEA,wEAAmE;AACnE,wEAAiF;AACjF,iEAG4C;AAE5C,qCAAkD;AAClD,8CAAkD;AAClD,qDAAqD;AACrD,mEAAkE;AAClE,uEAAsE;AACtE,4DAAoC;AACpC,yDAAoD;AAEpD,MAAM,eAAe;IACX,MAAM,CAAS;IACf,aAAa,CAAgB;IAC7B,KAAK,CAAmB;IAEhC;QACE,IAAI,CAAC;YACH,IAAA,uBAAc,GAAE,CAAC;QACnB,CAAC;QAAC,OAAO,KAAK,EAAE,CAAC;YACf,gBAAM,CAAC,KAAK,CAAC,iCAAiC,EAAE,EAAE,KAAK,EAAE,CAAC,CAAC;YAC3D,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;QAClB,CAAC;QAED,IAAI,CAAC,aAAa,GAAG,IAAI,sBAAa,CAAC,eAAM,CAAC,MAAM,CAAC,CAAC;QAEtD,IAAI,CAAC,MAAM,GAAG,IAAI,iBAAM,CACtB;YACE,IAAI,EAAE,eAAM,CAAC,MAAM,CAAC,IAAI;YACxB,OAAO,EAAE,eAAM,CAAC,MAAM,CAAC,OAAO;SAC/B,EACD;YACE,YAAY,EAAE;gBACZ,KAAK,EAAE,EAAE;aACV;SACF,CACF,CAAC;QAEF,IAAI,CAAC,KAAK,GAAG,IAAI,GAAG,EAAE,CAAC;QACvB,IAAI,CAAC,eAAe,EAAE,CAAC;QACvB,IAAI,CAAC,aAAa,EAAE,CAAC;QAErB,gBAAM,CAAC,IAAI,CAAC,+BAA+B,EAAE;YAC3C,UAAU,EAAE,eAAM,CAAC,MAAM,CAAC,IAAI;YAC9B,OAAO,EAAE,eAAM,CAAC,MAAM,CAAC,OAAO;SAC/B,CAAC,CAAC;IACL,CAAC;IAEO,eAAe;QACrB,MAAM,cAAc,GAAG,IAAI,4BAAc,CAAC,IAAI,CAAC,aAAa,CAAC,CAAC;QAC9D,MAAM,oBAAoB,GAAG,IAAI,yCAAoB,CAAC,IAAI,CAAC,aAAa,CAAC,CAAC;QAC1E,MAAM,sBAAsB,GAAG,IAAI,6CAAsB,CAAC,IAAI,CAAC,aAAa,CAAC,CAAC;QAE9E,IAAI,CAAC,KAAK,CAAC,GAAG,CAAC,aAAa,EAAE,cAAc,CAAC,CAAC;QAC9C,IAAI,CAAC,KAAK,CAAC,GAAG,CAAC,oBAAoB,EAAE,oBAAoB,CAAC,CAAC;QAC3D,IAAI,CAAC,KAAK,CAAC,GAAG,CAAC,sBAAsB,EAAE,sBAAsB,CAAC,CAAC;QAE/D,gBAAM,CAAC,IAAI,CAAC,mBAAmB,EAAE;YAC/B,SAAS,EAAE,IAAI,CAAC,KAAK,CAAC,IAAI;YAC1B,KAAK,EAAE,KAAK,CAAC,IAAI,CAAC,IAAI,CAAC,KAAK,CAAC,IAAI,EAAE,CAAC;SACrC,CAAC,CAAC;IACL,CAAC;IAEO,aAAa;QACnB,IAAI,CAAC,MAAM,CAAC,iBAAiB,CAAC,iCAAsB,EAAE,KAAK,IAAI,EAAE;YAC/D,gBAAM,CAAC,IAAI,CAAC,6BAA6B,CAAC,CAAC;YAE3C,MAAM,KAAK,GAAG,KAAK,CAAC,IAAI,CAAC,IAAI,CAAC,KAAK,CAAC,MAAM,EAAE,CAAC,CAAC,GAAG,CAAC,IAAI,CAAC,EAAE,CAAC,IAAI,CAAC,aAAa,EAAE,CAAC,CAAC;YAEhF,OAAO;gBACL,KAAK;aACN,CAAC;QACJ,CAAC,CAAC,CAAC;QAEH,IAAI,CAAC,MAAM,CAAC,iBAAiB,CAAC,gCAAqB,EAAE,KAAK,EAAE,OAAO,EAAE,EAAE;YACrE,MAAM,EAAE,IAAI,EAAE,SAAS,EA
AE,IAAI,EAAE,GAAG,OAAO,CAAC,MAAM,CAAC;YAEjD,gBAAM,CAAC,IAAI,CAAC,4BAA4B,EAAE;gBACxC,QAAQ,EAAE,IAAI;gBACd,OAAO,EAAE,CAAC,CAAC,IAAI;aAChB,CAAC,CAAC;YAEH,IAAI,CAAC;gBACH,MAAM,IAAI,GAAG,IAAI,CAAC,KAAK,CAAC,GAAG,CAAC,IAAI,CAAC,CAAC;gBAClC,IAAI,CAAC,IAAI,EAAE,CAAC;oBACV,MAAM,IAAI,KAAK,CAAC,iBAAiB,IAAI,EAAE,CAAC,CAAC;gBAC3C,CAAC;gBAED,MAAM,MAAM,GAAG,MAAM,IAAI,CAAC,OAAO,CAAC,IAAI,CAAC,CAAC;gBAExC,OAAO;oBACL,OAAO,EAAE,MAAM;iBAChB,CAAC;YAEJ,CAAC;YAAC,OAAO,KAAK,EAAE,CAAC;gBACf,gBAAM,CAAC,KAAK,CAAC,uBAAuB,EAAE;oBACpC,QAAQ,EAAE,IAAI;oBACd,KAAK,EAAG,KAAe,CAAC,OAAO;iBAChC,CAAC,CAAC;gBAEH,MAAM,QAAQ,GAAG,IAAA,2BAAW,EAAC,KAAc,EAAE,QAAQ,IAAI,EAAE,CAAC,CAAC;gBAE7D,OAAO;oBACL,OAAO,EAAE;wBACP;4BACE,IAAI,EAAE,MAAM;4BACZ,IAAI,EAAE,QAAQ,CAAC,OAAO;yBACvB;qBACF;oBACD,OAAO,EAAE,IAAI;iBACd,CAAC;YACJ,CAAC;QACH,CAAC,CAAC,CAAC;IACL,CAAC;IAED,KAAK,CAAC,KAAK;QACT,IAAI,CAAC;YACH,MAAM,OAAO,GAAG,MAAM,IAAI,CAAC,aAAa,CAAC,cAAc,EAAE,CAAC;YAC1D,IAAI,CAAC,OAAO,EAAE,CAAC;gBACb,MAAM,IAAI,KAAK,CAAC,kCAAkC,CAAC,CAAC;YACtD,CAAC;YAED,gBAAM,CAAC,IAAI,CAAC,+BAA+B,CAAC,CAAC;YAE7C,MAAM,SAAS,GAAG,IAAI,+BAAoB,EAAE,CAAC;YAC7C,MAAM,IAAI,CAAC,MAAM,CAAC,OAAO,CAAC,SAAS,CAAC,CAAC;YAErC,gBAAM,CAAC,IAAI,CAAC,wCAAwC,EAAE;gBACpD,SAAS,EAAE,OAAO;gBAClB,cAAc,EAAE,KAAK,CAAC,IAAI,CAAC,IAAI,CAAC,KAAK,CAAC,IAAI,EAAE,CAAC;aAC9C,CAAC,CAAC;QAEL,CAAC;QAAC,OAAO,KAAK,EAAE,CAAC;YACf,gBAAM,CAAC,KAAK,CAAC,mCAAmC,EAAE,EAAE,KAAK,EAAE,CAAC,CAAC;YAC7D,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;QAClB,CAAC;IACH,CAAC;CACF;AAkCQ,0CAAe;AAhCxB,OAAO,CAAC,EAAE,CAAC,QAAQ,EAAE,GAAG,EAAE;IACxB,gBAAM,CAAC,IAAI,CAAC,8CAA8C,CAAC,CAAC;IAC5D,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;AAClB,CAAC,CAAC,CAAC;AAEH,OAAO,CAAC,EAAE,CAAC,SAAS,EAAE,GAAG,EAAE;IACzB,gBAAM,CAAC,IAAI,CAAC,+CAA+C,CAAC,CAAC;IAC7D,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;AAClB,CAAC,CAAC,CAAC;AAEH,OAAO,CAAC,EAAE,CAAC,mBAAmB,EAAE,CAAC,KAAK,EAAE,EAAE;IACxC,gBAAM,CAAC,KAAK,CAAC,oBAAoB,EAAE,EAAE,KAAK,EAAE,CAAC,CAAC;IAC9C,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;AAClB,CAAC,CAAC,CAAC;AAEH,OAAO,CAAC,EAAE,CAAC,oBAAoB,EAAE,CAAC,MAAM,E
AAE,OAAO,EAAE,EAAE;IACnD,gBAAM,CAAC,KAAK,CAAC,6BAA6B,EAAE,EAAE,MAAM,EAAE,OAAO,EAAE,CAAC,CAAC;IACjE,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;AAClB,CAAC,CAAC,CAAC;AAEH,KAAK,UAAU,IAAI;IACjB,MAAM,MAAM,GAAG,IAAI,eAAe,EAAE,CAAC;IACrC,MAAM,MAAM,CAAC,KAAK,EAAE,CAAC;AACvB,CAAC;AAED,IAAI,OAAO,CAAC,IAAI,KAAK,MAAM,EAAE,CAAC;IAC5B,IAAI,EAAE,CAAC,KAAK,CAAC,KAAK,CAAC,EAAE;QACnB,gBAAM,CAAC,KAAK,CAAC,uBAAuB,EAAE,EAAE,KAAK,EAAE,CAAC,CAAC;QACjD,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;IAClB,CAAC,CAAC,CAAC;AACL,CAAC"}
1
+ {"version":3,"file":"index.js","sourceRoot":"","sources":["../src/index.ts"],"names":[],"mappings":";;;;;;;AAEA,wEAAmE;AACnE,wEAAiF;AACjF,iEAG4C;AAE5C,qCAAkD;AAClD,8CAAkD;AAClD,qDAAqD;AACrD,mEAAkE;AAClE,uEAAsE;AACtE,yEAAwE;AACxE,4DAAoC;AACpC,yDAAoD;AAEpD,MAAM,eAAe;IACX,MAAM,CAAS;IACf,aAAa,CAAgB;IAC7B,KAAK,CAAmB;IAEhC;QACE,IAAI,CAAC;YACH,IAAA,uBAAc,GAAE,CAAC;QACnB,CAAC;QAAC,OAAO,KAAK,EAAE,CAAC;YACf,gBAAM,CAAC,KAAK,CAAC,iCAAiC,EAAE,EAAE,KAAK,EAAE,CAAC,CAAC;YAC3D,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;QAClB,CAAC;QAED,IAAI,CAAC,aAAa,GAAG,IAAI,sBAAa,CAAC,eAAM,CAAC,MAAM,CAAC,CAAC;QAEtD,IAAI,CAAC,MAAM,GAAG,IAAI,iBAAM,CACtB;YACE,IAAI,EAAE,eAAM,CAAC,MAAM,CAAC,IAAI;YACxB,OAAO,EAAE,eAAM,CAAC,MAAM,CAAC,OAAO;SAC/B,EACD;YACE,YAAY,EAAE;gBACZ,KAAK,EAAE,EAAE;aACV;SACF,CACF,CAAC;QAEF,IAAI,CAAC,KAAK,GAAG,IAAI,GAAG,EAAE,CAAC;QACvB,IAAI,CAAC,eAAe,EAAE,CAAC;QACvB,IAAI,CAAC,aAAa,EAAE,CAAC;QAErB,gBAAM,CAAC,IAAI,CAAC,+BAA+B,EAAE;YAC3C,UAAU,EAAE,eAAM,CAAC,MAAM,CAAC,IAAI;YAC9B,OAAO,EAAE,eAAM,CAAC,MAAM,CAAC,OAAO;SAC/B,CAAC,CAAC;IACL,CAAC;IAEO,eAAe;QACrB,MAAM,cAAc,GAAG,IAAI,4BAAc,CAAC,IAAI,CAAC,aAAa,CAAC,CAAC;QAC9D,MAAM,oBAAoB,GAAG,IAAI,yCAAoB,CAAC,IAAI,CAAC,aAAa,CAAC,CAAC;QAC1E,MAAM,sBAAsB,GAAG,IAAI,6CAAsB,CAAC,IAAI,CAAC,aAAa,CAAC,CAAC;QAC9E,MAAM,uBAAuB,GAAG,IAAI,+CAAuB,CAAC,IAAI,CAAC,aAAa,CAAC,CAAC;QAEhF,IAAI,CAAC,KAAK,CAAC,GAAG,CAAC,aAAa,EAAE,cAAc,CAAC,CAAC;QAC9C,IAAI,CAAC,KAAK,CAAC,GAAG,CAAC,oBAAoB,EAAE,oBAAoB,CAAC,CAAC;QAC3D,IAAI,CAAC,KAAK,CAAC,GAAG,CAAC,sBAAsB,EAAE,sBAAsB,CAAC,CAAC;QAC/D,IAAI,CAAC,KAAK,CAAC,GAAG,CAAC,uBAAuB,EAAE,uBAAuB,CAAC,CAAC;QAEjE,gBAAM,CAAC,IAAI,CAAC,mBAAmB,EAAE;YAC/B,SAAS,EAAE,IAAI,CAAC,KAAK,CAAC,IAAI;YAC1B,KAAK,EAAE,KAAK,CAAC,IAAI,CAAC,IAAI,CAAC,KAAK,CAAC,IAAI,EAAE,CAAC;SACrC,CAAC,CAAC;IACL,CAAC;IAEO,aAAa;QACnB,IAAI,CAAC,MAAM,CAAC,iBAAiB,CAAC,iCAAsB,EAAE,KAAK,IAAI,EAAE;YAC/D,gBAAM,CAAC,IAAI,CAAC,6BAA6B,CAAC,CAAC;YAE3C,MAAM,KAAK,GAAG,KAAK,CAAC,IAAI,CAAC,IAAI,CAAC,KAAK,CAAC,MAAM,EAAE,CAAC,CAAC,GAAG,CAAC,IAAI,CAAC,EAAE,CAAC,IAAI,CAAC,aAAa,EAAE,CAAC,CAAC;YAEhF,OAAO;gBACL
,KAAK;aACN,CAAC;QACJ,CAAC,CAAC,CAAC;QAEH,IAAI,CAAC,MAAM,CAAC,iBAAiB,CAAC,gCAAqB,EAAE,KAAK,EAAE,OAAO,EAAE,EAAE;YACrE,MAAM,EAAE,IAAI,EAAE,SAAS,EAAE,IAAI,EAAE,GAAG,OAAO,CAAC,MAAM,CAAC;YAEjD,gBAAM,CAAC,IAAI,CAAC,4BAA4B,EAAE;gBACxC,QAAQ,EAAE,IAAI;gBACd,OAAO,EAAE,CAAC,CAAC,IAAI;aAChB,CAAC,CAAC;YAEH,IAAI,CAAC;gBACH,MAAM,IAAI,GAAG,IAAI,CAAC,KAAK,CAAC,GAAG,CAAC,IAAI,CAAC,CAAC;gBAClC,IAAI,CAAC,IAAI,EAAE,CAAC;oBACV,MAAM,IAAI,KAAK,CAAC,iBAAiB,IAAI,EAAE,CAAC,CAAC;gBAC3C,CAAC;gBAED,MAAM,MAAM,GAAG,MAAM,IAAI,CAAC,OAAO,CAAC,IAAI,CAAC,CAAC;gBAExC,OAAO;oBACL,OAAO,EAAE,MAAM;iBAChB,CAAC;YAEJ,CAAC;YAAC,OAAO,KAAK,EAAE,CAAC;gBACf,gBAAM,CAAC,KAAK,CAAC,uBAAuB,EAAE;oBACpC,QAAQ,EAAE,IAAI;oBACd,KAAK,EAAG,KAAe,CAAC,OAAO;iBAChC,CAAC,CAAC;gBAEH,MAAM,QAAQ,GAAG,IAAA,2BAAW,EAAC,KAAc,EAAE,QAAQ,IAAI,EAAE,CAAC,CAAC;gBAE7D,OAAO;oBACL,OAAO,EAAE;wBACP;4BACE,IAAI,EAAE,MAAM;4BACZ,IAAI,EAAE,QAAQ,CAAC,OAAO;yBACvB;qBACF;oBACD,OAAO,EAAE,IAAI;iBACd,CAAC;YACJ,CAAC;QACH,CAAC,CAAC,CAAC;IACL,CAAC;IAED,KAAK,CAAC,KAAK;QACT,IAAI,CAAC;YACH,MAAM,OAAO,GAAG,MAAM,IAAI,CAAC,aAAa,CAAC,cAAc,EAAE,CAAC;YAC1D,IAAI,CAAC,OAAO,EAAE,CAAC;gBACb,MAAM,IAAI,KAAK,CAAC,kCAAkC,CAAC,CAAC;YACtD,CAAC;YAED,gBAAM,CAAC,IAAI,CAAC,+BAA+B,CAAC,CAAC;YAE7C,MAAM,SAAS,GAAG,IAAI,+BAAoB,EAAE,CAAC;YAC7C,MAAM,IAAI,CAAC,MAAM,CAAC,OAAO,CAAC,SAAS,CAAC,CAAC;YAErC,gBAAM,CAAC,IAAI,CAAC,wCAAwC,EAAE;gBACpD,SAAS,EAAE,OAAO;gBAClB,cAAc,EAAE,KAAK,CAAC,IAAI,CAAC,IAAI,CAAC,KAAK,CAAC,IAAI,EAAE,CAAC;aAC9C,CAAC,CAAC;QAEL,CAAC;QAAC,OAAO,KAAK,EAAE,CAAC;YACf,gBAAM,CAAC,KAAK,CAAC,mCAAmC,EAAE,EAAE,KAAK,EAAE,CAAC,CAAC;YAC7D,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;QAClB,CAAC;IACH,CAAC;CACF;AAkCQ,0CAAe;AAhCxB,OAAO,CAAC,EAAE,CAAC,QAAQ,EAAE,GAAG,EAAE;IACxB,gBAAM,CAAC,IAAI,CAAC,8CAA8C,CAAC,CAAC;IAC5D,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;AAClB,CAAC,CAAC,CAAC;AAEH,OAAO,CAAC,EAAE,CAAC,SAAS,EAAE,GAAG,EAAE;IACzB,gBAAM,CAAC,IAAI,CAAC,+CAA+C,CAAC,CAAC;IAC7D,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;AAClB,CAAC,CAAC,CAAC;AAEH,OAAO,CAAC,EAAE,CAAC,mBAAmB,EAAE,CAAC,KAAK,EAAE,EAAE;IACxC,gBAAM,CAAC,KAAK,CAAC,oBAA
oB,EAAE,EAAE,KAAK,EAAE,CAAC,CAAC;IAC9C,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;AAClB,CAAC,CAAC,CAAC;AAEH,OAAO,CAAC,EAAE,CAAC,oBAAoB,EAAE,CAAC,MAAM,EAAE,OAAO,EAAE,EAAE;IACnD,gBAAM,CAAC,KAAK,CAAC,6BAA6B,EAAE,EAAE,MAAM,EAAE,OAAO,EAAE,CAAC,CAAC;IACjE,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;AAClB,CAAC,CAAC,CAAC;AAEH,KAAK,UAAU,IAAI;IACjB,MAAM,MAAM,GAAG,IAAI,eAAe,EAAE,CAAC;IACrC,MAAM,MAAM,CAAC,KAAK,EAAE,CAAC;AACvB,CAAC;AAED,IAAI,OAAO,CAAC,IAAI,KAAK,MAAM,EAAE,CAAC;IAC5B,IAAI,EAAE,CAAC,KAAK,CAAC,KAAK,CAAC,EAAE;QACnB,gBAAM,CAAC,KAAK,CAAC,uBAAuB,EAAE,EAAE,KAAK,EAAE,CAAC,CAAC;QACjD,OAAO,CAAC,IAAI,CAAC,CAAC,CAAC,CAAC;IAClB,CAAC,CAAC,CAAC;AACL,CAAC"}
@@ -0,0 +1,38 @@
1
+ import { TextContent, Tool } from '@modelcontextprotocol/sdk/types.js';
2
+ import { GeminiService } from '../services/gemini';
3
+ export interface VideoDescriptionRequest {
4
+ url: string;
5
+ detail_level?: 'brief' | 'standard' | 'comprehensive';
6
+ accessibility_mode?: boolean;
7
+ include_timestamps?: boolean;
8
+ output_format?: 'narrative' | 'structured';
9
+ focus?: 'actions' | 'visual_elements' | 'text_on_screen' | 'all';
10
+ }
11
+ export interface VideoScene {
12
+ timestamp: string;
13
+ description: string;
14
+ actions?: string[];
15
+ tools_visible?: string[];
16
+ text_on_screen?: string | null;
17
+ spatial_notes?: string;
18
+ colors?: string;
19
+ }
20
+ export interface StructuredVideoDescription {
21
+ video_title?: string;
22
+ duration?: string;
23
+ scene_count: number;
24
+ scenes: VideoScene[];
25
+ key_techniques?: string[];
26
+ tools_list?: string[];
27
+ materials_list?: string[];
28
+ safety_notes?: string[];
29
+ summary: string;
30
+ }
31
+ export declare class GeminiDescribeVideoTool {
32
+ private geminiService;
33
+ constructor(geminiService: GeminiService);
34
+ getDefinition(): Tool;
35
+ private buildAccessibilityPrompt;
36
+ execute(args: any): Promise<TextContent[]>;
37
+ }
38
+ //# sourceMappingURL=gemini-describe-video.d.ts.map
@@ -0,0 +1 @@
1
+ {"version":3,"file":"gemini-describe-video.d.ts","sourceRoot":"","sources":["../../src/tools/gemini-describe-video.ts"],"names":[],"mappings":"AAAA,OAAO,EAAE,WAAW,EAAE,IAAI,EAAE,MAAM,oCAAoC,CAAC;AACvE,OAAO,EAAE,aAAa,EAAE,MAAM,oBAAoB,CAAC;AAInD,MAAM,WAAW,uBAAuB;IACtC,GAAG,EAAE,MAAM,CAAC;IACZ,YAAY,CAAC,EAAE,OAAO,GAAG,UAAU,GAAG,eAAe,CAAC;IACtD,kBAAkB,CAAC,EAAE,OAAO,CAAC;IAC7B,kBAAkB,CAAC,EAAE,OAAO,CAAC;IAC7B,aAAa,CAAC,EAAE,WAAW,GAAG,YAAY,CAAC;IAC3C,KAAK,CAAC,EAAE,SAAS,GAAG,iBAAiB,GAAG,gBAAgB,GAAG,KAAK,CAAC;CAClE;AAED,MAAM,WAAW,UAAU;IACzB,SAAS,EAAE,MAAM,CAAC;IAClB,WAAW,EAAE,MAAM,CAAC;IACpB,OAAO,CAAC,EAAE,MAAM,EAAE,CAAC;IACnB,aAAa,CAAC,EAAE,MAAM,EAAE,CAAC;IACzB,cAAc,CAAC,EAAE,MAAM,GAAG,IAAI,CAAC;IAC/B,aAAa,CAAC,EAAE,MAAM,CAAC;IACvB,MAAM,CAAC,EAAE,MAAM,CAAC;CACjB;AAED,MAAM,WAAW,0BAA0B;IACzC,WAAW,CAAC,EAAE,MAAM,CAAC;IACrB,QAAQ,CAAC,EAAE,MAAM,CAAC;IAClB,WAAW,EAAE,MAAM,CAAC;IACpB,MAAM,EAAE,UAAU,EAAE,CAAC;IACrB,cAAc,CAAC,EAAE,MAAM,EAAE,CAAC;IAC1B,UAAU,CAAC,EAAE,MAAM,EAAE,CAAC;IACtB,cAAc,CAAC,EAAE,MAAM,EAAE,CAAC;IAC1B,YAAY,CAAC,EAAE,MAAM,EAAE,CAAC;IACxB,OAAO,EAAE,MAAM,CAAC;CACjB;AAED,qBAAa,uBAAuB;IACtB,OAAO,CAAC,aAAa;gBAAb,aAAa,EAAE,aAAa;IAEhD,aAAa,IAAI,IAAI;IA6CrB,OAAO,CAAC,wBAAwB;IA8E1B,OAAO,CAAC,IAAI,EAAE,GAAG,GAAG,OAAO,CAAC,WAAW,EAAE,CAAC;CA4EjD"}
@@ -0,0 +1,196 @@
1
+ "use strict";
2
+ var __importDefault = (this && this.__importDefault) || function (mod) {
3
+ return (mod && mod.__esModule) ? mod : { "default": mod };
4
+ };
5
+ Object.defineProperty(exports, "__esModule", { value: true });
6
+ exports.GeminiDescribeVideoTool = void 0;
7
+ const error_handler_1 = require("../utils/error-handler");
8
+ const logger_1 = __importDefault(require("../utils/logger"));
9
+ class GeminiDescribeVideoTool {
10
+ geminiService;
11
+ constructor(geminiService) {
12
+ this.geminiService = geminiService;
13
+ }
14
+ getDefinition() {
15
+ return {
16
+ name: 'gemini_describe_video',
17
+ description: 'Generate detailed audio description of YouTube video content for accessibility. Analyzes visual content including actions, spatial relationships, text overlays, and provides time-stamped descriptions.',
18
+ inputSchema: {
19
+ type: 'object',
20
+ properties: {
21
+ url: {
22
+ type: 'string',
23
+ description: 'YouTube URL to analyze'
24
+ },
25
+ detail_level: {
26
+ type: 'string',
27
+ enum: ['brief', 'standard', 'comprehensive'],
28
+ default: 'standard',
29
+ description: 'Level of detail in descriptions: brief (high-level summary), standard (balanced), comprehensive (extensive detail)'
30
+ },
31
+ accessibility_mode: {
32
+ type: 'boolean',
33
+ default: true,
34
+ description: 'Include spatial descriptions, colors, text overlays, and other visual details important for accessibility'
35
+ },
36
+ include_timestamps: {
37
+ type: 'boolean',
38
+ default: true,
39
+ description: 'Include timestamps for all major actions and scene changes'
40
+ },
41
+ output_format: {
42
+ type: 'string',
43
+ enum: ['narrative', 'structured'],
44
+ default: 'structured',
45
+ description: 'Output format: narrative (prose description) or structured (JSON with sections for easy navigation)'
46
+ },
47
+ focus: {
48
+ type: 'string',
49
+ enum: ['actions', 'visual_elements', 'text_on_screen', 'all'],
50
+ default: 'all',
51
+ description: 'What aspects to focus on: actions (what people do), visual_elements (objects, colors, composition), text_on_screen (captions, overlays), or all'
52
+ }
53
+ },
54
+ required: ['url']
55
+ }
56
+ };
57
+ }
58
+ buildAccessibilityPrompt(request) {
59
+ const detailLevelInstructions = {
60
+ brief: 'Provide a concise 2-3 paragraph summary of the video content.',
61
+ standard: 'Provide a balanced description with key scenes and actions, approximately 5-8 scenes.',
62
+ comprehensive: 'Provide extensive detail for every significant action and visual element, breaking the video into 10-15+ distinct scenes.'
63
+ };
64
+ const focusInstructions = {
65
+ actions: 'Focus primarily on what people are doing - movements, gestures, tool usage, and procedures.',
66
+ visual_elements: 'Focus on visual composition - objects, colors, spatial relationships, and visual design.',
67
+ text_on_screen: 'Focus on any text, captions, overlays, or written information visible in the video.',
68
+ all: 'Provide comprehensive coverage of actions, visual elements, and text equally.'
69
+ };
70
+ const accessibilityInstructions = request.accessibility_mode ? `
71
+ ACCESSIBILITY MODE REQUIREMENTS:
72
+ - Describe spatial relationships explicitly (e.g., "on the left side of the frame", "in the upper right corner", "moving from left to right")
73
+ - Include colors where relevant to understanding (e.g., "red laser line", "blue button")
74
+ - Transcribe ALL text overlays, captions, and on-screen text verbatim
75
+ - Describe facial expressions and body language when relevant
76
+ - Note sound cues if visible (e.g., "person speaking", "tool making noise", "music playing")
77
+ - Use clear, concrete language - avoid "this", "that", "here", "there" without context
78
+ - Describe what IS happening, not what viewers should do
79
+ - For tools and objects, describe them before using pronouns
80
+ ` : '';
81
+ const timestampInstructions = request.include_timestamps ? `
82
+ - Include precise timestamps for every scene or major action (format: MM:SS or H:MM:SS)
83
+ - Mark when new scenes begin, when actions change, when new tools appear
84
+ ` : '';
85
+ const outputFormatInstructions = request.output_format === 'structured' ? `
86
+ OUTPUT FORMAT: Return a valid JSON object with this exact structure:
87
+ {
88
+ "video_title": "Brief title if discernible",
89
+ "duration": "Total video length (MM:SS format)",
90
+ "scene_count": number,
91
+ "scenes": [
92
+ {
93
+ "timestamp": "0:00-0:45",
94
+ "description": "Detailed description of what's happening",
95
+ "actions": ["action 1", "action 2"],
96
+ "tools_visible": ["tool 1", "tool 2"],
97
+ "text_on_screen": "Exact text shown, or null",
98
+ "spatial_notes": "Spatial relationships if relevant",
99
+ "colors": "Color descriptions if relevant"
100
+ }
101
+ ],
102
+ "key_techniques": ["technique 1", "technique 2"],
103
+ "tools_list": ["complete list of all tools shown"],
104
+ "materials_list": ["all materials used"],
105
+ "safety_notes": ["any safety considerations visible"],
106
+ "summary": "2-3 sentence overall summary"
107
+ }
108
+
109
+ CRITICAL: Return ONLY valid JSON. Do not include any text before or after the JSON object.
110
+ ` : `
111
+ OUTPUT FORMAT: Return a narrative prose description with clear paragraphs.
112
+ ${request.include_timestamps ? 'Start each major section with timestamps in [MM:SS] format.' : ''}
113
+ `;
114
+ return `You are analyzing a YouTube video to create an audio description for accessibility purposes.
115
+
116
+ VIDEO URL: ${request.url}
117
+
118
+ DETAIL LEVEL: ${request.detail_level || 'standard'}
119
+ ${detailLevelInstructions[request.detail_level || 'standard']}
120
+
121
+ FOCUS: ${request.focus || 'all'}
122
+ ${focusInstructions[request.focus || 'all']}
123
+
124
+ ${accessibilityInstructions}
125
+ ${timestampInstructions}
126
+ ${outputFormatInstructions}
127
+
128
+ Analyze the video and provide the description according to these requirements.`;
129
+ }
+     async execute(args) {
+         try {
+             logger_1.default.info('Executing gemini_describe_video tool', {
+                 url: args.url,
+                 detail_level: args.detail_level,
+                 accessibility_mode: args.accessibility_mode,
+                 output_format: args.output_format
+             });
+             if (!args.url) {
+                 throw new error_handler_1.McpError('URL is required', 'INVALID_PARAMS');
+             }
+             // Validate YouTube URL
+             if (!args.url.includes('youtube.com') && !args.url.includes('youtu.be')) {
+                 throw new error_handler_1.McpError('Only YouTube URLs are supported', 'INVALID_PARAMS');
+             }
+             const request = {
+                 url: args.url,
+                 detail_level: args.detail_level || 'standard',
+                 accessibility_mode: args.accessibility_mode !== false, // Default true
+                 include_timestamps: args.include_timestamps !== false, // Default true
+                 output_format: args.output_format || 'structured',
+                 focus: args.focus || 'all'
+             };
+             const prompt = this.buildAccessibilityPrompt(request);
+             // Use Gemini 2.5 Flash for video understanding (best model for this task)
+             const response = await this.geminiService.chat({
+                 message: prompt,
+                 model: 'gemini-2.5-flash',
+                 temperature: 0.3, // Lower temperature for more consistent, factual descriptions
+                 maxTokens: 16000, // Comprehensive descriptions may be lengthy
+                 grounding: false // Don't use web search for video analysis
+             });
+             // If structured output requested, try to parse and validate JSON
+             if (request.output_format === 'structured') {
+                 try {
+                     // Remove markdown code blocks if present
+                     let cleanedContent = response.content.trim();
+                     cleanedContent = cleanedContent.replace(/```json\n?/g, '').replace(/```\n?/g, '');
+                     const parsed = JSON.parse(cleanedContent);
+                     // Validate basic structure
+                     if (!parsed.scenes || !Array.isArray(parsed.scenes)) {
+                         logger_1.default.warn('Structured output missing scenes array, returning raw response');
+                         return (0, error_handler_1.createToolResult)(true, response.content);
+                     }
+                     // Return formatted JSON
+                     return (0, error_handler_1.createToolResult)(true, JSON.stringify(parsed, null, 2));
+                 }
+                 catch (parseError) {
+                     logger_1.default.warn('Failed to parse structured output as JSON, returning raw response', {
+                         error: parseError.message
+                     });
+                     return (0, error_handler_1.createToolResult)(true, response.content);
+                 }
+             }
+             return (0, error_handler_1.createToolResult)(true, response.content);
+         }
+         catch (error) {
+             logger_1.default.error('gemini_describe_video tool execution failed', { error });
+             if (error instanceof error_handler_1.McpError) {
+                 return (0, error_handler_1.createToolResult)(false, error.message, error);
+             }
+             return (0, error_handler_1.createToolResult)(false, `Unexpected error: ${error.message}`, error);
+         }
+     }
+ }
+ exports.GeminiDescribeVideoTool = GeminiDescribeVideoTool;
+ //# sourceMappingURL=gemini-describe-video.js.map
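
Review note (not part of the published package): the `execute` method above accepts any string that contains `youtube.com` or `youtu.be`, so a URL such as `https://example.com/?next=youtube.com` would pass validation. Its cleanup step also removes every triple-backtick run in the response, including any that occur inside JSON string values. The following is a minimal sketch of stricter handling, assuming a runtime that provides the WHATWG `URL` global (Node.js or a browser). The names `isYouTubeUrl`, `extractJson`, and `FENCE` are hypothetical and do not appear in the package.

```javascript
// Hypothetical helpers, sketched for illustration only.

// Three backticks, built at runtime so this listing never contains a literal fence.
const FENCE = '`'.repeat(3);

// Returns true only when the parsed hostname is youtu.be, youtube.com,
// or a subdomain of youtube.com (for example www.youtube.com).
function isYouTubeUrl(input) {
  let parsed;
  try {
    parsed = new URL(input);
  } catch {
    return false; // not an absolute URL
  }
  if (parsed.protocol !== 'https:' && parsed.protocol !== 'http:') {
    return false;
  }
  const host = parsed.hostname.toLowerCase();
  return host === 'youtu.be' || host === 'youtube.com' || host.endsWith('.youtube.com');
}

// Parses a model reply as JSON, removing only one wrapping code fence
// (with an optional "json" tag) rather than every fence in the text.
// Returns null when the reply is not valid JSON.
function extractJson(text) {
  let body = text.trim();
  if (body.length >= 6 && body.startsWith(FENCE) && body.endsWith(FENCE)) {
    body = body.slice(3, -3).replace(/^json\b/i, '');
  }
  try {
    return JSON.parse(body.trim());
  } catch {
    return null;
  }
}
```

If something like this were adopted, `execute` could call `isYouTubeUrl(args.url)` in place of the two `includes` checks and return the raw response whenever `extractJson` yields `null`, which would keep the existing fallback behaviour on parse failure.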
@@ -0,0 +1 @@
+ {"version":3,"file":"gemini-describe-video.js","sourceRoot":"","sources":["../../src/tools/gemini-describe-video.ts"],"names":[],"mappings":";;;;;;AAEA,0DAAoE;AACpE,6DAAqC;AAiCrC,MAAa,uBAAuB;IACd;IAApB,YAAoB,aAA4B;QAA5B,kBAAa,GAAb,aAAa,CAAe;IAAG,CAAC;IAEpD,aAAa;QACX,OAAO;YACL,IAAI,EAAE,uBAAuB;YAC7B,WAAW,EAAE,0MAA0M;YACvN,WAAW,EAAE;gBACX,IAAI,EAAE,QAAQ;gBACd,UAAU,EAAE;oBACV,GAAG,EAAE;wBACH,IAAI,EAAE,QAAQ;wBACd,WAAW,EAAE,wBAAwB;qBACtC;oBACD,YAAY,EAAE;wBACZ,IAAI,EAAE,QAAQ;wBACd,IAAI,EAAE,CAAC,OAAO,EAAE,UAAU,EAAE,eAAe,CAAC;wBAC5C,OAAO,EAAE,UAAU;wBACnB,WAAW,EAAE,oHAAoH;qBAClI;oBACD,kBAAkB,EAAE;wBAClB,IAAI,EAAE,SAAS;wBACf,OAAO,EAAE,IAAI;wBACb,WAAW,EAAE,2GAA2G;qBACzH;oBACD,kBAAkB,EAAE;wBAClB,IAAI,EAAE,SAAS;wBACf,OAAO,EAAE,IAAI;wBACb,WAAW,EAAE,4DAA4D;qBAC1E;oBACD,aAAa,EAAE;wBACb,IAAI,EAAE,QAAQ;wBACd,IAAI,EAAE,CAAC,WAAW,EAAE,YAAY,CAAC;wBACjC,OAAO,EAAE,YAAY;wBACrB,WAAW,EAAE,qGAAqG;qBACnH;oBACD,KAAK,EAAE;wBACL,IAAI,EAAE,QAAQ;wBACd,IAAI,EAAE,CAAC,SAAS,EAAE,iBAAiB,EAAE,gBAAgB,EAAE,KAAK,CAAC;wBAC7D,OAAO,EAAE,KAAK;wBACd,WAAW,EAAE,iJAAiJ;qBAC/J;iBACF;gBACD,QAAQ,EAAE,CAAC,KAAK,CAAC;aAClB;SACF,CAAC;IACJ,CAAC;IAEO,wBAAwB,CAAC,OAAgC;QAC/D,MAAM,uBAAuB,GAAG;YAC9B,KAAK,EAAE,+DAA+D;YACtE,QAAQ,EAAE,uFAAuF;YACjG,aAAa,EAAE,2HAA2H;SAC3I,CAAC;QAEF,MAAM,iBAAiB,GAAG;YACxB,OAAO,EAAE,6FAA6F;YACtG,eAAe,EAAE,0FAA0F;YAC3G,cAAc,EAAE,qFAAqF;YACrG,GAAG,EAAE,+EAA+E;SACrF,CAAC;QAEF,MAAM,yBAAyB,GAAG,OAAO,CAAC,kBAAkB,CAAC,CAAC,CAAC;;;;;;;;;;CAUlE,CAAC,CAAC,CAAC,EAAE,CAAC;QAEH,MAAM,qBAAqB,GAAG,OAAO,CAAC,kBAAkB,CAAC,CAAC,CAAC;;;CAG9D,CAAC,CAAC,CAAC,EAAE,CAAC;QAEH,MAAM,wBAAwB,GAAG,OAAO,CAAC,aAAa,KAAK,YAAY,CAAC,CAAC,CAAC;;;;;;;;;;;;;;;;;;;;;;;;;CAyB7E,CAAC,CAAC,CAAC;;EAEF,OAAO,CAAC,kBAAkB,CAAC,CAAC,CAAC,6DAA6D,CAAC,CAAC,CAAC,EAAE;CAChG,CAAC;QAEE,OAAO;;aAEE,OAAO,CAAC,GAAG;;gBAER,OAAO,CAAC,YAAY,IAAI,UAAU;EAChD,uBAAuB,CAAC,OAAO,CAAC,YAAY,IAAI,UAAU,CAAC;;SAEpD,OAAO,CAAC,KAAK,IAAI,KAAK;EAC7B,iBAAiB,CAAC,OAAO,CAAC,KAAK,IAAI,KAAK,CAAC;;EAEzC,yBAAyB;EACzB,qBAAqB;EACrB,wBAAwB;;+EAEqD,CAAC;IAC9E,CAAC;IAED,KAAK,CAAC,OAAO
,CAAC,IAAS;QACrB,IAAI,CAAC;YACH,gBAAM,CAAC,IAAI,CAAC,sCAAsC,EAAE;gBAClD,GAAG,EAAE,IAAI,CAAC,GAAG;gBACb,YAAY,EAAE,IAAI,CAAC,YAAY;gBAC/B,kBAAkB,EAAE,IAAI,CAAC,kBAAkB;gBAC3C,aAAa,EAAE,IAAI,CAAC,aAAa;aAClC,CAAC,CAAC;YAEH,IAAI,CAAC,IAAI,CAAC,GAAG,EAAE,CAAC;gBACd,MAAM,IAAI,wBAAQ,CAAC,iBAAiB,EAAE,gBAAgB,CAAC,CAAC;YAC1D,CAAC;YAED,uBAAuB;YACvB,IAAI,CAAC,IAAI,CAAC,GAAG,CAAC,QAAQ,CAAC,aAAa,CAAC,IAAI,CAAC,IAAI,CAAC,GAAG,CAAC,QAAQ,CAAC,UAAU,CAAC,EAAE,CAAC;gBACxE,MAAM,IAAI,wBAAQ,CAAC,iCAAiC,EAAE,gBAAgB,CAAC,CAAC;YAC1E,CAAC;YAED,MAAM,OAAO,GAA4B;gBACvC,GAAG,EAAE,IAAI,CAAC,GAAG;gBACb,YAAY,EAAE,IAAI,CAAC,YAAY,IAAI,UAAU;gBAC7C,kBAAkB,EAAE,IAAI,CAAC,kBAAkB,KAAK,KAAK,EAAE,eAAe;gBACtE,kBAAkB,EAAE,IAAI,CAAC,kBAAkB,KAAK,KAAK,EAAE,eAAe;gBACtE,aAAa,EAAE,IAAI,CAAC,aAAa,IAAI,YAAY;gBACjD,KAAK,EAAE,IAAI,CAAC,KAAK,IAAI,KAAK;aAC3B,CAAC;YAEF,MAAM,MAAM,GAAG,IAAI,CAAC,wBAAwB,CAAC,OAAO,CAAC,CAAC;YAEtD,0EAA0E;YAC1E,MAAM,QAAQ,GAAG,MAAM,IAAI,CAAC,aAAa,CAAC,IAAI,CAAC;gBAC7C,OAAO,EAAE,MAAM;gBACf,KAAK,EAAE,kBAAkB;gBACzB,WAAW,EAAE,GAAG,EAAE,8DAA8D;gBAChF,SAAS,EAAE,KAAK,EAAE,4CAA4C;gBAC9D,SAAS,EAAE,KAAK,CAAC,0CAA0C;aAC5D,CAAC,CAAC;YAEH,iEAAiE;YACjE,IAAI,OAAO,CAAC,aAAa,KAAK,YAAY,EAAE,CAAC;gBAC3C,IAAI,CAAC;oBACH,yCAAyC;oBACzC,IAAI,cAAc,GAAG,QAAQ,CAAC,OAAO,CAAC,IAAI,EAAE,CAAC;oBAC7C,cAAc,GAAG,cAAc,CAAC,OAAO,CAAC,aAAa,EAAE,EAAE,CAAC,CAAC,OAAO,CAAC,SAAS,EAAE,EAAE,CAAC,CAAC;oBAElF,MAAM,MAAM,GAAG,IAAI,CAAC,KAAK,CAAC,cAAc,CAAC,CAAC;oBAE1C,2BAA2B;oBAC3B,IAAI,CAAC,MAAM,CAAC,MAAM,IAAI,CAAC,KAAK,CAAC,OAAO,CAAC,MAAM,CAAC,MAAM,CAAC,EAAE,CAAC;wBACpD,gBAAM,CAAC,IAAI,CAAC,gEAAgE,CAAC,CAAC;wBAC9E,OAAO,IAAA,gCAAgB,EAAC,IAAI,EAAE,QAAQ,CAAC,OAAO,CAAC,CAAC;oBAClD,CAAC;oBAED,wBAAwB;oBACxB,OAAO,IAAA,gCAAgB,EAAC,IAAI,EAAE,IAAI,CAAC,SAAS,CAAC,MAAM,EAAE,IAAI,EAAE,CAAC,CAAC,CAAC,CAAC;gBAEjE,CAAC;gBAAC,OAAO,UAAU,EAAE,CAAC;oBACpB,gBAAM,CAAC,IAAI,CAAC,mEAAmE,EAAE;wBAC/E,KAAK,EAAG,UAAoB,CAAC,OAAO;qBACrC,CAAC,CAAC;oBACH,OAAO,IAAA,gCAAgB,EAAC,IAAI,EAAE,QAAQ,CAAC,OAAO,CAAC,CAAC;gBAClD,CAAC;YACH,CAAC;YAED,OAAO,IAAA,gCAAgB,EAAC,IAAI,EAAE,QAAQ,CA
AC,OAAO,CAAC,CAAC;QAElD,CAAC;QAAC,OAAO,KAAK,EAAE,CAAC;YACf,gBAAM,CAAC,KAAK,CAAC,6CAA6C,EAAE,EAAE,KAAK,EAAE,CAAC,CAAC;YAEvE,IAAI,KAAK,YAAY,wBAAQ,EAAE,CAAC;gBAC9B,OAAO,IAAA,gCAAgB,EAAC,KAAK,EAAE,KAAK,CAAC,OAAO,EAAE,KAAK,CAAC,CAAC;YACvD,CAAC;YAED,OAAO,IAAA,gCAAgB,EAAC,KAAK,EAAE,qBAAsB,KAAe,CAAC,OAAO,EAAE,EAAE,KAAc,CAAC,CAAC;QAClG,CAAC;IACH,CAAC;CACF;AA1MD,0DA0MC"}
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@houtini/gemini-mcp",
-   "version": "1.2.2",
+   "version": "1.3.0",
    "description": "Professional Model Context Protocol server for Google Gemini AI models with enterprise-grade features",
    "main": "dist/index.js",
    "types": "dist/index.d.ts",