getraw 0.1.3 → 0.2.0

package/README.md CHANGED
@@ -1,10 +1,20 @@
  # getraw
 
- Fast media downloader CLI built natively in Bun/TypeScript.
+ Fast media downloader CLI built natively in Bun/TypeScript. A yt-dlp replacement with native JS execution.
 
- ## Installation
+ [![npm](https://img.shields.io/npm/v/getraw)](https://www.npmjs.com/package/getraw)
+ [![tests](https://img.shields.io/badge/tests-386%20passing-brightgreen)]()
+ [![license](https://img.shields.io/badge/license-MIT-blue)]()
+
+ ## Why getraw?
 
- ### Global install (Bun required)
+ - **Native JS execution** — YouTube's player code runs natively in Bun. No external runtime needed (yt-dlp requires Deno/Node).
+ - **50ms cold startup** — Bun-powered, not Python.
+ - **30+ sites** — YouTube, Twitter, TikTok, Instagram, Reddit, Twitch, and more.
+ - **Zero API keys** — All extractors use public endpoints, guest tokens, and page scraping.
+ - **Agent-ready** — Install as an AI agent skill: `npx skills add onkits/getraw`
+
+ ## Installation
 
  ```sh
  bun install -g getraw
@@ -13,53 +23,35 @@ bun install -g getraw
  ### From source
 
  ```sh
- git clone https://github.com/web3mikee/getraw
+ git clone https://github.com/onkits/getraw
  cd getraw
  bun install
  ```
 
- Run directly from source:
+ ### As an AI agent skill
 
  ```sh
- bun run src/cli/index.ts <URL>
+ npx skills add onkits/getraw
  ```
 
- Build a standalone binary:
-
- ```sh
- bun run build
- ./getraw <URL>
- ```
+ Works with Claude Code, Cursor, Copilot, Codex, Windsurf, and 50+ other agents.
 
  ## Quick Start
 
- Download a video at best quality:
-
  ```sh
+ # Download a video
  getraw https://www.youtube.com/watch?v=dQw4w9WgXcQ
- ```
-
- Extract audio as MP3:
 
- ```sh
+ # Extract audio as MP3
  getraw -x --audio-format mp3 https://soundcloud.com/artist/track
- ```
 
- List all available formats before downloading:
-
- ```sh
+ # List available formats
  getraw -F https://vimeo.com/123456789
- ```
 
- Download a specific format and write subtitles:
+ # Download specific quality with subtitles
+ getraw -f "bestvideo[height<=1080]+bestaudio" --write-subs https://www.youtube.com/watch?v=dQw4w9WgXcQ
 
- ```sh
- getraw -f "bestvideo[height<=1080]+bestaudio" --write-subs --sub-langs en https://www.youtube.com/watch?v=dQw4w9WgXcQ
- ```
-
- Dump extracted metadata as JSON without downloading:
-
- ```sh
+ # Get metadata as JSON (no download)
  getraw -j https://www.reddit.com/r/videos/comments/abc123/some_post/
  ```
 
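Editor's sketch (not part of the package): the `-j` example above prints metadata as JSON, and the `jq` filter later in this README reads `.title`, `.duration`, and `.formats[0].url`. Assuming only those three fields, a consumer written in TypeScript might look like the following. The `MediaInfo` and `MediaSummary` types, the `summarizeMetadata` helper, and the sample input are all hypothetical; the real output schema is not shown in this diff.

```typescript
// Hypothetical shape: only the fields referenced by the README's jq filter.
interface MediaInfo {
  title?: string;
  duration?: number;
  formats?: { url?: string }[];
}

interface MediaSummary {
  title: string | null;
  duration: number | null;
  firstFormatUrl: string | null;
}

// Parse one JSON document and pick the three fields, tolerating missing ones.
function summarizeMetadata(jsonText: string): MediaSummary {
  const info = JSON.parse(jsonText) as MediaInfo;
  return {
    title: info.title ?? null,
    duration: info.duration ?? null,
    firstFormatUrl: info.formats?.[0]?.url ?? null,
  };
}

// Made-up sample input for illustration, not real getraw output.
const sample = JSON.stringify({
  title: "Example clip",
  duration: 212,
  formats: [{ url: "https://example.com/video.mp4" }],
});

console.log(summarizeMetadata(sample));
```

Returning `null` for absent fields keeps the caller from having to distinguish "missing" from "undefined" when the extractor for a given site does not report a value.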
@@ -69,97 +61,96 @@ getraw -j https://www.reddit.com/r/videos/comments/abc123/some_post/
  Usage: getraw [OPTIONS] URL [URL...]
  ```
 
- | Flag | Short | Type | Default | Description |
- |------|-------|------|---------|-------------|
- | `--format` | `-f` | string | `bv*+ba/b` | Format selection string |
- | `--output` | `-o` | string | `%(title)s [%(id)s].%(ext)s` | Output filename template |
- | `--extract-audio` | `-x` | boolean | false | Extract audio only |
- | `--audio-format` | | string | `mp3` | Audio format (`mp3`, `aac`, `flac`, etc.) |
- | `--audio-quality` | | string | `5` | Audio quality (0–10 or bitrate) |
- | `--write-subs` | | boolean | false | Write subtitles to file |
- | `--sub-langs` | | string | `en` | Subtitle languages |
- | `--list-formats` | `-F` | boolean | false | List available formats |
- | `--dump-json` | `-j` | boolean | false | Dump info JSON to stdout |
- | `--quiet` | `-q` | boolean | false | Suppress output |
- | `--verbose` | `-v` | boolean | false | Verbose output |
- | `--no-progress` | | boolean | false | Disable progress bar |
- | `--retries` | `-R` | number | `3` | Number of retries |
- | `--rate-limit` | `-r` | number | none | Rate limit in bytes/sec |
- | `--proxy` | | string | none | Proxy URL |
- | `--cookies` | | string | none | Cookie file path |
- | `--user-agent` | | string | `getraw/0.0.0` | Custom User-Agent |
- | `--referer` | | string | none | Custom Referer header |
- | `--embed-thumbnail` | | boolean | false | Embed thumbnail in output file |
- | `--embed-subs` | | boolean | false | Embed subtitles in output file |
- | `--merge-output-format` | | string | none | Output container for merging streams |
- | `--ffmpeg-location` | | string | none | Path to ffmpeg binary |
- | `--version` | `-V` | boolean | false | Print version |
- | `--help` | `-h` | boolean | false | Show help |
-
- ## Supported Sites
-
- | Site | Extractor name | URL pattern | Subtitles |
- |------|---------------|-------------|-----------|
- | YouTube | `youtube` | `youtube.com/watch`, `youtu.be/`, `youtube.com/shorts/`, `youtube.com/live/`, `youtube.com/playlist`, `youtube.com/channel/`, `youtube.com/@handle` | Yes (manual + auto-generated) |
- | Vimeo | `vimeo` | `vimeo.com/<id>`, `player.vimeo.com/video/<id>`, channels, groups | No |
- | Twitter / X | `twitter` | `twitter.com/*/status/*`, `x.com/*/status/*` | No |
- | Twitter Spaces | `twitter:spaces` | `twitter.com/i/spaces/*`, `x.com/i/spaces/*` | No |
- | TikTok | `tiktok` | `tiktok.com/@user/video/<id>`, `vm.tiktok.com/*` | No |
- | TikTok User | `tiktok:user` | `tiktok.com/@username` | No |
- | Instagram | `instagram` | `instagram.com/p/*`, `instagram.com/reel/*`, `instagram.com/reels/*` | No |
- | Instagram Reels feed | `instagram:reels` | `instagram.com/reels/` | No |
- | Twitch VOD | `twitch:vod` | `twitch.tv/videos/<id>` | No |
- | Twitch Clip | `twitch:clip` | `twitch.tv/*/clip/*`, `clips.twitch.tv/*` | No |
- | Twitch Live | `twitch:live` | `twitch.tv/<channel>` | No |
- | Kick VOD | `kick` | `kick.com/video/<id>` | No |
- | Kick Clip | `kick:clips` | `kick.com/<channel>/clips/<id>` | No |
- | Kick Live | `kick:live` | `kick.com/<channel>` | No |
- | Reddit | `reddit` | `reddit.com/r/*/comments/*`, `v.redd.it/*` | No |
- | Reddit Gallery | `reddit:gallery` | `reddit.com/r/*/comments/*`, `reddit.com/gallery/*` | No |
- | SoundCloud | `soundcloud` | `soundcloud.com/<user>/<track>` | No |
- | SoundCloud Playlist | `soundcloud:playlist` | `soundcloud.com/<user>/sets/<playlist>` | No |
- | Bilibili | `bilibili` | `bilibili.com/video/BV*`, `bilibili.com/video/av*` | No |
- | Bilibili Bangumi | `bilibili:bangumi` | `bilibili.com/bangumi/play/ep*`, `bilibili.com/bangumi/play/ss*` | No |
- | Niconico | `niconico` | `nicovideo.jp/watch/sm*`, `nicovideo.jp/watch/nm*` | No |
- | Bandcamp | `bandcamp` | `*.bandcamp.com/track/*`, `*.bandcamp.com/album/*` | No |
- | Dailymotion | `dailymotion` | `dailymotion.com/video/<id>` | No |
- | Streamable | `streamable` | `streamable.com/<id>` | No |
- | Coub | `coub` | `coub.com/view/*`, `coub.com/embed/*` | No |
- | Imgur | `imgur` | `imgur.com/<id>`, `imgur.com/a/<id>`, `imgur.com/gallery/<id>`, `i.imgur.com/*` | No |
- | Rumble | `rumble` | `rumble.com/v*.html`, `rumble.com/embed/*` | No |
- | Odysee | `odysee` | `odysee.com/@*:*/<slug>`, `lbry.tv/@*:*/<slug>` | No |
- | TED | `ted` | `ted.com/talks/<slug>` | Yes |
- | PeerTube | `peertube` | Any PeerTube instance: `<host>/videos/watch/*`, `<host>/w/*`, `<host>/videos/embed/*` | Yes |
- | Google Drive | `google-drive` | `drive.google.com/file/d/*`, `docs.google.com/file/d/*` | No |
- | Dropbox | `dropbox` | `dropbox.com/s/*`, `dropbox.com/sh/*`, `dropbox.com/scl/fo/*` | No |
- | Archive.org | `archive.org` | `archive.org/details/*`, `archive.org/download/*` | No |
- | Spotify | `spotify` | `open.spotify.com/episode/<id>` | No |
- | Generic | `generic` | Any `http://` or `https://` URL (fallback) | No |
-
- > Spotify: only 30-second preview audio is available without authentication. Full episode audio requires Spotify auth (not currently implemented).
-
- See [docs/supported-sites.md](docs/supported-sites.md) for full format and URL pattern details.
+ | Flag | Short | Default | Description |
+ |------|-------|---------|-------------|
+ | `--format` | `-f` | `bv*+ba/b` | Format selection string |
+ | `--output` | `-o` | `%(title)s [%(id)s].%(ext)s` | Output filename template |
+ | `--extract-audio` | `-x` | | Extract audio only |
+ | `--audio-format` | | `mp3` | Audio format (mp3, aac, flac, wav, opus) |
+ | `--write-subs` | | | Write subtitles to file |
+ | `--sub-langs` | | `en` | Subtitle languages |
+ | `--list-formats` | `-F` | | List available formats |
+ | `--dump-json` | `-j` | | Dump info JSON to stdout |
+ | `--quiet` | `-q` | | Suppress output |
+ | `--verbose` | `-v` | | Verbose output |
+ | `--retries` | `-R` | `3` | Number of retries |
+ | `--rate-limit` | `-r` | | Rate limit in bytes/sec |
+ | `--proxy` | | | Proxy URL |
+ | `--cookies` | | | Cookie file path (Netscape format) |
+ | `--embed-thumbnail` | | | Embed thumbnail in output |
+ | `--embed-subs` | | | Embed subtitles in output |
+ | `--version` | `-V` | | Print version |
+ | `--help` | `-h` | | Show help |
+
+ ## Supported Sites (30+)
+
+ | Site | URL Patterns |
+ |------|-------------|
+ | **YouTube** | youtube.com, youtu.be, shorts, live, playlists, channels |
+ | **Twitter/X** | twitter.com/\*/status/\*, x.com/\*/status/\*, Spaces |
+ | **TikTok** | tiktok.com/@\*/video/\*, vm.tiktok.com, user profiles |
+ | **Instagram** | instagram.com/p/\*, /reel/\*, /reels/ |
+ | **Reddit** | reddit.com/r/\*/comments/\*, v.redd.it, galleries |
+ | **Twitch** | VODs, clips, live streams |
+ | **Vimeo** | vimeo.com/\*, player embeds |
+ | **SoundCloud** | Tracks, playlists, albums |
+ | **Bilibili** | Videos, bangumi/anime |
+ | **Dailymotion** | Videos |
+ | **Bandcamp** | Tracks, albums |
+ | **Kick** | VODs, clips, live |
+ | **Rumble** | Videos |
+ | **TED** | Talks (with multi-language subtitles) |
+ | **Niconico** | Videos |
+ | **Streamable** | Videos |
+ | **Imgur** | Videos, GIFs, albums |
+ | **Coub** | Videos (video + audio merge) |
+ | **Odysee/LBRY** | Videos |
+ | **PeerTube** | Any instance |
+ | **Spotify** | Podcast episodes (30s preview) |
+ | **Archive.org** | Any public media |
+ | **Google Drive** | Public files |
+ | **Dropbox** | Public share links |
+ | **+ more** | Generic fallback for direct media URLs |
+
+ See [docs/supported-sites.md](docs/supported-sites.md) for full details.
+
+ ## For AI Agents
+
+ getraw is designed to be used by AI agents. Key commands for automation:
 
- ## Building from Source
+ ```sh
+ # Get structured metadata
+ getraw --dump-json "URL" | jq '.title, .duration, .formats[0].url'
 
- Requires [Bun](https://bun.sh) v1.0 or later.
+ # Download transcript for summarization
+ getraw --write-subs --sub-langs en --skip-download "URL"
+
+ # Extract audio for transcription pipelines
+ getraw -x --audio-format wav -o "audio.wav" "URL"
+
+ # Batch download
+ getraw URL1 URL2 URL3
+ ```
+
+ Install as an agent skill for any compatible AI coding agent:
 
  ```sh
- git clone https://github.com/web3mikee/getraw
- cd getraw
- bun install
- bun run build # produces ./getraw binary
+ npx skills add onkits/getraw
  ```
 
- Run tests:
+ ## Building from Source
 
  ```sh
- bun test
+ git clone https://github.com/onkits/getraw
+ cd getraw
+ bun install
+ bun test # 386 tests
+ bun run build # standalone binary
  ```
 
  ## Writing a Custom Extractor
 
- See [docs/plugin-guide.md](docs/plugin-guide.md) for the `BaseExtractor` interface and a minimal example.
+ See [docs/plugin-guide.md](docs/plugin-guide.md) for the `BaseExtractor` interface and examples.
 
  ## License
 
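Editor's sketch (not part of the package): the options table in this hunk gives `%(title)s [%(id)s].%(ext)s` as the default `--output` template. The snippet below illustrates how such `%(name)s` placeholders could be expanded. It is a minimal sketch, not getraw's implementation: the `expandTemplate` helper is hypothetical, and substituting `NA` for unknown fields is an assumption made for the example.

```typescript
// Replace every %(name)s placeholder with the matching field value.
// Assumption: unknown fields become "NA"; getraw's actual behavior may differ.
function expandTemplate(
  template: string,
  fields: Record<string, string | number>,
): string {
  return template.replace(/%\(([A-Za-z_]+)\)s/g, (_match, name: string) => {
    const value = fields[name];
    return value === undefined ? "NA" : String(value);
  });
}

// Made-up field values for illustration.
const expanded = expandTemplate("%(title)s [%(id)s].%(ext)s", {
  title: "Example clip",
  id: "abc123",
  ext: "mp4",
});

console.log(expanded); // prints: Example clip [abc123].mp4
```

A real implementation would also need to sanitize characters that are invalid in filenames, which this sketch leaves out.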
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "getraw",
- "version": "0.1.3",
+ "version": "0.2.0",
  "description": "Fast media downloader CLI built natively in Bun/TypeScript",
  "type": "module",
  "bin": {
@@ -25,6 +25,6 @@
  "license": "MIT",
  "repository": {
  "type": "git",
- "url": "https://github.com/web3mikee/getraw"
+ "url": "https://github.com/onkits/getraw"
  }
  }
@@ -0,0 +1,163 @@
+ ---
+ name: getraw
+ description: Download videos, audio, and metadata from 30+ sites (YouTube, Twitter, TikTok, Instagram, Reddit, Twitch, Vimeo, SoundCloud, and more). Use when the user asks to download media, extract video info, get transcripts/subtitles, rip audio, or fetch metadata from a URL. Wraps the getraw CLI — a yt-dlp replacement built in Bun/TypeScript.
+ ---
+
+ # getraw
+
+ Download and extract media from 30+ sites. Built in Bun/TypeScript as a yt-dlp replacement.
+
+ ## Prerequisites
+
+ Requires `bun` and `getraw` installed:
+
+ ```bash
+ bun install -g getraw
+ ```
+
+ Optional: `ffmpeg` for audio extraction, format merging, and subtitle embedding.
+
+ ## Commands
+
+ ### Download a video
+
+ ```bash
+ getraw "URL"
+ ```
+
+ Downloads the best available format to the current directory.
+
+ ### Get metadata as JSON (no download)
+
+ ```bash
+ getraw --dump-json "URL"
+ ```
+
+ Returns full metadata: title, description, uploader, duration, formats, subtitles, thumbnails. Use this when you need info about a video without downloading it. Parse the JSON output for structured data.
+
+ ### List available formats
+
+ ```bash
+ getraw --list-formats "URL"
+ ```
+
+ Shows all available quality/format options (resolution, codec, bitrate, filesize).
+
+ ### Download specific format
+
+ ```bash
+ getraw -f "best[height<=720]" "URL"
+ getraw -f "bestvideo+bestaudio" "URL"
+ getraw -f "bestaudio" "URL"
+ ```
+
+ Format selection strings:
+ - `best` — best single file
+ - `bestvideo+bestaudio` — best video + best audio, merged by ffmpeg
+ - `bestaudio` — audio only (best quality)
+ - `best[height<=720]` — best format at 720p or below
+ - Format ID from `--list-formats` (e.g. `137+140`)
+
+ ### Extract audio only
+
+ ```bash
+ getraw -x "URL"
+ getraw -x --audio-format mp3 "URL"
+ getraw -x --audio-format flac "URL"
+ ```
+
+ Supported audio formats: `mp3`, `aac`, `flac`, `wav`, `opus`, `vorbis`, `m4a`.
+
+ ### Download subtitles
+
+ ```bash
+ getraw --write-subs "URL"
+ getraw --write-subs --sub-langs "en,es" "URL"
+ ```
+
+ Downloads subtitle files alongside the video. Use `--sub-langs` to specify languages.
+
+ ### Custom output filename
+
+ ```bash
+ getraw -o "%(title)s.%(ext)s" "URL"
+ getraw -o "%(uploader)s - %(title)s [%(id)s].%(ext)s" "URL"
+ ```
+
+ Template variables: `%(title)s`, `%(id)s`, `%(ext)s`, `%(uploader)s`, `%(upload_date)s`, `%(duration)s`, `%(view_count)s`.
+
+ ### Embed metadata
+
+ ```bash
+ getraw --embed-thumbnail --embed-subs "URL"
+ ```
+
+ Embeds thumbnail art and subtitles into the downloaded file (requires ffmpeg).
+
+ ## Supported Sites
+
+ | Site | URL Pattern |
+ |------|------------|
+ | YouTube | youtube.com, youtu.be, youtube.com/shorts |
+ | Twitter/X | twitter.com/*/status/*, x.com/*/status/* |
+ | TikTok | tiktok.com/@*/video/*, vm.tiktok.com/* |
+ | Instagram | instagram.com/p/*, instagram.com/reel/* |
+ | Reddit | reddit.com/r/*/comments/*, v.redd.it/* |
+ | Twitch | twitch.tv/videos/*, twitch.tv/*/clip/* |
+ | Vimeo | vimeo.com/* |
+ | SoundCloud | soundcloud.com/*/* |
+ | Bilibili | bilibili.com/video/* |
+ | Dailymotion | dailymotion.com/video/* |
+ | Bandcamp | *.bandcamp.com/track/*, *.bandcamp.com/album/* |
+ | Rumble | rumble.com/* |
+ | TED | ted.com/talks/* |
+ | Kick | kick.com/video/*, kick.com/*/clips/* |
+ | Streamable | streamable.com/* |
+ | PeerTube | Any PeerTube instance |
+ | Archive.org | archive.org/details/* |
+ | + 13 more | Imgur, Coub, Odysee, Spotify podcasts, NHK, BBC, etc. |
+
+ ## When to Use
+
+ - User says "download this video" or shares a video URL
+ - User wants video/audio metadata (`--dump-json`)
+ - User wants to extract audio from a video (`-x`)
+ - User wants subtitles or transcripts (`--write-subs`)
+ - User wants to check available qualities (`--list-formats`)
+ - User wants to save media for offline use or processing
+
+ ## Common Patterns
+
+ ### Get video transcript for summarization
+
+ ```bash
+ getraw --write-subs --sub-langs en --skip-download "URL"
+ # Then read the .vtt or .srt file
+ ```
+
+ ### Download audio for TTS/transcription pipeline
+
+ ```bash
+ getraw -x --audio-format wav -o "audio.wav" "URL"
+ ```
+
+ ### Batch download from a list
+
+ ```bash
+ getraw URL1 URL2 URL3
+ ```
+
+ ### Get metadata for multiple videos
+
+ ```bash
+ for url in URL1 URL2 URL3; do
+ getraw --dump-json "$url"
+ done
+ ```
+
+ ## Error Handling
+
+ - If a site is unsupported, getraw returns a clear error with the URL
+ - If a format is unavailable, it falls back to the best available
+ - Network errors retry 3 times with exponential backoff
+ - Use `--verbose` for debug output, `--quiet` to suppress all output
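
Editor's sketch (not part of the package): the Error Handling list says network errors are retried 3 times with exponential backoff. The snippet below shows one common way to structure that pattern in TypeScript. It is a generic illustration, not getraw's code: the function names, the 500 ms base delay, and the reading of "3 retries" as one initial attempt plus up to three retries are all assumptions.

```typescript
// Delay before each retry: base, 2x base, 4x base, ...
function backoffDelays(retries: number, baseDelayMs: number): number[] {
  return Array.from({ length: retries }, (_unused, i) => baseDelayMs * 2 ** i);
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Run `operation`, retrying on failure with exponentially growing delays.
// The `sleep` parameter is injectable so callers can skip real waiting.
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  retries = 3,
  baseDelayMs = 500, // hypothetical default, not taken from getraw
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<T> {
  const delays = backoffDelays(retries, baseDelayMs);
  let lastError: unknown;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      const delay = delays[attempt];
      if (delay === undefined) break; // no retries left
      await sleep(delay);
    }
  }
  throw lastError;
}

// Simulated flaky operation: fails twice, then succeeds.
let calls = 0;
retryWithBackoff(async () => {
  calls += 1;
  if (calls < 3) throw new Error("simulated network error");
  return `succeeded after ${calls} attempts`;
}, 3, 10).then(console.log);
```

With the assumed defaults, the waits between attempts are 500 ms, 1000 ms, and 2000 ms; production code often adds random jitter so that parallel downloads do not retry in lockstep.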