getraw 0.1.0 → 0.2.0

@@ -0,0 +1,43 @@
+ name: Release
+ on:
+   push:
+     tags: ["v*"]
+
+ permissions:
+   contents: write
+   id-token: write
+
+ jobs:
+   release:
+     runs-on: ubuntu-latest
+     environment: release
+     steps:
+       - uses: actions/checkout@v4
+
+       - uses: oven-sh/setup-bun@v2
+         with:
+           bun-version: latest
+
+       - run: sudo apt-get update && sudo apt-get install -y ffmpeg
+
+       - run: bun install
+
+       - run: bun test
+
+       - uses: actions/setup-node@v4
+         with:
+           node-version: "22"
+           registry-url: "https://registry.npmjs.org"
+
+       - name: Publish to npm
+         run: npm publish --access public
+         env:
+           NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
+
+       - name: Create GitHub Release
+         run: |
+           gh release create ${{ github.ref_name }} \
+             --title "${{ github.ref_name }}" \
+             --generate-notes
+         env:
+           GH_TOKEN: ${{ github.token }}
package/CLAUDE.md CHANGED
@@ -1,4 +1,4 @@
- # dlpx
+ # getraw
 
  Fast media downloader CLI — yt-dlp replacement built natively in Bun/TypeScript.
 
package/README.md CHANGED
@@ -1,165 +1,156 @@
- # dlpx
+ # getraw
 
- Fast media downloader CLI built natively in Bun/TypeScript.
+ Fast media downloader CLI built natively in Bun/TypeScript. A yt-dlp replacement with native JS execution.
 
- ## Installation
+ [![npm](https://img.shields.io/npm/v/getraw)](https://www.npmjs.com/package/getraw)
+ [![tests](https://img.shields.io/badge/tests-386%20passing-brightgreen)]()
+ [![license](https://img.shields.io/badge/license-MIT-blue)]()
+
+ ## Why getraw?
 
- ### Global install (Bun required)
+ - **Native JS execution** — YouTube's player code runs natively in Bun. No external runtime needed (yt-dlp requires Deno/Node).
+ - **50ms cold startup** — Bun-powered, not Python.
+ - **30+ sites** — YouTube, Twitter, TikTok, Instagram, Reddit, Twitch, and more.
+ - **Zero API keys** — All extractors use public endpoints, guest tokens, and page scraping.
+ - **Agent-ready** — Install as an AI agent skill: `npx skills add onkits/getraw`
+
+ ## Installation
 
  ```sh
- bun install -g dlpx
+ bun install -g getraw
  ```
 
  ### From source
 
  ```sh
- git clone https://github.com/web3mikee/dlpx
- cd dlpx
+ git clone https://github.com/onkits/getraw
+ cd getraw
  bun install
  ```
 
- Run directly from source:
+ ### As an AI agent skill
 
  ```sh
- bun run src/cli/index.ts <URL>
+ npx skills add onkits/getraw
  ```
 
- Build a standalone binary:
-
- ```sh
- bun run build
- ./dlpx <URL>
- ```
+ Works with Claude Code, Cursor, Copilot, Codex, Windsurf, and 50+ other agents.
 
  ## Quick Start
 
- Download a video at best quality:
-
  ```sh
- dlpx https://www.youtube.com/watch?v=dQw4w9WgXcQ
- ```
+ # Download a video
+ getraw https://www.youtube.com/watch?v=dQw4w9WgXcQ
 
- Extract audio as MP3:
+ # Extract audio as MP3
+ getraw -x --audio-format mp3 https://soundcloud.com/artist/track
 
- ```sh
- dlpx -x --audio-format mp3 https://soundcloud.com/artist/track
- ```
+ # List available formats
+ getraw -F https://vimeo.com/123456789
 
- List all available formats before downloading:
+ # Download specific quality with subtitles
+ getraw -f "bestvideo[height<=1080]+bestaudio" --write-subs https://www.youtube.com/watch?v=dQw4w9WgXcQ
 
- ```sh
- dlpx -F https://vimeo.com/123456789
+ # Get metadata as JSON (no download)
+ getraw -j https://www.reddit.com/r/videos/comments/abc123/some_post/
  ```
 
- Download a specific format and write subtitles:
+ ## CLI Reference
 
- ```sh
- dlpx -f "bestvideo[height<=1080]+bestaudio" --write-subs --sub-langs en https://www.youtube.com/watch?v=dQw4w9WgXcQ
+ ```
+ Usage: getraw [OPTIONS] URL [URL...]
  ```
 
- Dump extracted metadata as JSON without downloading:
+ | Flag | Short | Default | Description |
+ |------|-------|---------|-------------|
+ | `--format` | `-f` | `bv*+ba/b` | Format selection string |
+ | `--output` | `-o` | `%(title)s [%(id)s].%(ext)s` | Output filename template |
+ | `--extract-audio` | `-x` | | Extract audio only |
+ | `--audio-format` | | `mp3` | Audio format (mp3, aac, flac, wav, opus) |
+ | `--write-subs` | | | Write subtitles to file |
+ | `--sub-langs` | | `en` | Subtitle languages |
+ | `--list-formats` | `-F` | | List available formats |
+ | `--dump-json` | `-j` | | Dump info JSON to stdout |
+ | `--quiet` | `-q` | | Suppress output |
+ | `--verbose` | `-v` | | Verbose output |
+ | `--retries` | `-R` | `3` | Number of retries |
+ | `--rate-limit` | `-r` | | Rate limit in bytes/sec |
+ | `--proxy` | | | Proxy URL |
+ | `--cookies` | | | Cookie file path (Netscape format) |
+ | `--embed-thumbnail` | | | Embed thumbnail in output |
+ | `--embed-subs` | | | Embed subtitles in output |
+ | `--version` | `-V` | | Print version |
+ | `--help` | `-h` | | Show help |
+
+ ## Supported Sites (30+)
+
+ | Site | URL Patterns |
+ |------|-------------|
+ | **YouTube** | youtube.com, youtu.be, shorts, live, playlists, channels |
+ | **Twitter/X** | twitter.com/\*/status/\*, x.com/\*/status/\*, Spaces |
+ | **TikTok** | tiktok.com/@\*/video/\*, vm.tiktok.com, user profiles |
+ | **Instagram** | instagram.com/p/\*, /reel/\*, /reels/ |
+ | **Reddit** | reddit.com/r/\*/comments/\*, v.redd.it, galleries |
+ | **Twitch** | VODs, clips, live streams |
+ | **Vimeo** | vimeo.com/\*, player embeds |
+ | **SoundCloud** | Tracks, playlists, albums |
+ | **Bilibili** | Videos, bangumi/anime |
+ | **Dailymotion** | Videos |
+ | **Bandcamp** | Tracks, albums |
+ | **Kick** | VODs, clips, live |
+ | **Rumble** | Videos |
+ | **TED** | Talks (with multi-language subtitles) |
+ | **Niconico** | Videos |
+ | **Streamable** | Videos |
+ | **Imgur** | Videos, GIFs, albums |
+ | **Coub** | Videos (video + audio merge) |
+ | **Odysee/LBRY** | Videos |
+ | **PeerTube** | Any instance |
+ | **Spotify** | Podcast episodes (30s preview) |
+ | **Archive.org** | Any public media |
+ | **Google Drive** | Public files |
+ | **Dropbox** | Public share links |
+ | **+ more** | Generic fallback for direct media URLs |
+
+ See [docs/supported-sites.md](docs/supported-sites.md) for full details.
+
+ ## For AI Agents
+
+ getraw is designed to be used by AI agents. Key commands for automation:
 
  ```sh
- dlpx -j https://www.reddit.com/r/videos/comments/abc123/some_post/
- ```
+ # Get structured metadata
+ getraw --dump-json "URL" | jq '.title, .duration, .formats[0].url'
 
- ## CLI Reference
-
- ```
- Usage: dlpx [OPTIONS] URL [URL...]
- ```
+ # Download transcript for summarization
+ getraw --write-subs --sub-langs en --skip-download "URL"
 
- | Flag | Short | Type | Default | Description |
- |------|-------|------|---------|-------------|
- | `--format` | `-f` | string | `bv*+ba/b` | Format selection string |
- | `--output` | `-o` | string | `%(title)s [%(id)s].%(ext)s` | Output filename template |
- | `--extract-audio` | `-x` | boolean | false | Extract audio only |
- | `--audio-format` | | string | `mp3` | Audio format (`mp3`, `aac`, `flac`, etc.) |
- | `--audio-quality` | | string | `5` | Audio quality (0–10 or bitrate) |
- | `--write-subs` | | boolean | false | Write subtitles to file |
- | `--sub-langs` | | string | `en` | Subtitle languages |
- | `--list-formats` | `-F` | boolean | false | List available formats |
- | `--dump-json` | `-j` | boolean | false | Dump info JSON to stdout |
- | `--quiet` | `-q` | boolean | false | Suppress output |
- | `--verbose` | `-v` | boolean | false | Verbose output |
- | `--no-progress` | | boolean | false | Disable progress bar |
- | `--retries` | `-R` | number | `3` | Number of retries |
- | `--rate-limit` | `-r` | number | none | Rate limit in bytes/sec |
- | `--proxy` | | string | none | Proxy URL |
- | `--cookies` | | string | none | Cookie file path |
- | `--user-agent` | | string | `dlpx/0.0.0` | Custom User-Agent |
- | `--referer` | | string | none | Custom Referer header |
- | `--embed-thumbnail` | | boolean | false | Embed thumbnail in output file |
- | `--embed-subs` | | boolean | false | Embed subtitles in output file |
- | `--merge-output-format` | | string | none | Output container for merging streams |
- | `--ffmpeg-location` | | string | none | Path to ffmpeg binary |
- | `--version` | `-V` | boolean | false | Print version |
- | `--help` | `-h` | boolean | false | Show help |
-
- ## Supported Sites
-
- | Site | Extractor name | URL pattern | Subtitles |
- |------|---------------|-------------|-----------|
- | YouTube | `youtube` | `youtube.com/watch`, `youtu.be/`, `youtube.com/shorts/`, `youtube.com/live/`, `youtube.com/playlist`, `youtube.com/channel/`, `youtube.com/@handle` | Yes (manual + auto-generated) |
- | Vimeo | `vimeo` | `vimeo.com/<id>`, `player.vimeo.com/video/<id>`, channels, groups | No |
- | Twitter / X | `twitter` | `twitter.com/*/status/*`, `x.com/*/status/*` | No |
- | Twitter Spaces | `twitter:spaces` | `twitter.com/i/spaces/*`, `x.com/i/spaces/*` | No |
- | TikTok | `tiktok` | `tiktok.com/@user/video/<id>`, `vm.tiktok.com/*` | No |
- | TikTok User | `tiktok:user` | `tiktok.com/@username` | No |
- | Instagram | `instagram` | `instagram.com/p/*`, `instagram.com/reel/*`, `instagram.com/reels/*` | No |
- | Instagram Reels feed | `instagram:reels` | `instagram.com/reels/` | No |
- | Twitch VOD | `twitch:vod` | `twitch.tv/videos/<id>` | No |
- | Twitch Clip | `twitch:clip` | `twitch.tv/*/clip/*`, `clips.twitch.tv/*` | No |
- | Twitch Live | `twitch:live` | `twitch.tv/<channel>` | No |
- | Kick VOD | `kick` | `kick.com/video/<id>` | No |
- | Kick Clip | `kick:clips` | `kick.com/<channel>/clips/<id>` | No |
- | Kick Live | `kick:live` | `kick.com/<channel>` | No |
- | Reddit | `reddit` | `reddit.com/r/*/comments/*`, `v.redd.it/*` | No |
- | Reddit Gallery | `reddit:gallery` | `reddit.com/r/*/comments/*`, `reddit.com/gallery/*` | No |
- | SoundCloud | `soundcloud` | `soundcloud.com/<user>/<track>` | No |
- | SoundCloud Playlist | `soundcloud:playlist` | `soundcloud.com/<user>/sets/<playlist>` | No |
- | Bilibili | `bilibili` | `bilibili.com/video/BV*`, `bilibili.com/video/av*` | No |
- | Bilibili Bangumi | `bilibili:bangumi` | `bilibili.com/bangumi/play/ep*`, `bilibili.com/bangumi/play/ss*` | No |
- | Niconico | `niconico` | `nicovideo.jp/watch/sm*`, `nicovideo.jp/watch/nm*` | No |
- | Bandcamp | `bandcamp` | `*.bandcamp.com/track/*`, `*.bandcamp.com/album/*` | No |
- | Dailymotion | `dailymotion` | `dailymotion.com/video/<id>` | No |
- | Streamable | `streamable` | `streamable.com/<id>` | No |
- | Coub | `coub` | `coub.com/view/*`, `coub.com/embed/*` | No |
- | Imgur | `imgur` | `imgur.com/<id>`, `imgur.com/a/<id>`, `imgur.com/gallery/<id>`, `i.imgur.com/*` | No |
- | Rumble | `rumble` | `rumble.com/v*.html`, `rumble.com/embed/*` | No |
- | Odysee | `odysee` | `odysee.com/@*:*/<slug>`, `lbry.tv/@*:*/<slug>` | No |
- | TED | `ted` | `ted.com/talks/<slug>` | Yes |
- | PeerTube | `peertube` | Any PeerTube instance: `<host>/videos/watch/*`, `<host>/w/*`, `<host>/videos/embed/*` | Yes |
- | Google Drive | `google-drive` | `drive.google.com/file/d/*`, `docs.google.com/file/d/*` | No |
- | Dropbox | `dropbox` | `dropbox.com/s/*`, `dropbox.com/sh/*`, `dropbox.com/scl/fo/*` | No |
- | Archive.org | `archive.org` | `archive.org/details/*`, `archive.org/download/*` | No |
- | Spotify | `spotify` | `open.spotify.com/episode/<id>` | No |
- | Generic | `generic` | Any `http://` or `https://` URL (fallback) | No |
-
- > Spotify: only 30-second preview audio is available without authentication. Full episode audio requires Spotify auth (not currently implemented).
-
- See [docs/supported-sites.md](docs/supported-sites.md) for full format and URL pattern details.
+ # Extract audio for transcription pipelines
+ getraw -x --audio-format wav -o "audio.wav" "URL"
 
- ## Building from Source
+ # Batch download
+ getraw URL1 URL2 URL3
+ ```
 
- Requires [Bun](https://bun.sh) v1.0 or later.
+ Install as an agent skill for any compatible AI coding agent:
 
  ```sh
- git clone https://github.com/web3mikee/dlpx
- cd dlpx
- bun install
- bun run build # produces ./dlpx binary
+ npx skills add onkits/getraw
  ```
 
- Run tests:
+ ## Building from Source
 
  ```sh
- bun test
+ git clone https://github.com/onkits/getraw
+ cd getraw
+ bun install
+ bun test # 386 tests
+ bun run build # standalone binary
  ```
 
  ## Writing a Custom Extractor
 
- See [docs/plugin-guide.md](docs/plugin-guide.md) for the `BaseExtractor` interface and examples.
+ See [docs/plugin-guide.md](docs/plugin-guide.md) for the `BaseExtractor` interface and examples.
 
  ## License
 
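
Note on the `--dump-json` example in the new "For AI Agents" section: the README pipes the output through `jq '.title, .duration, .formats[0].url'`. The same selection could be made in TypeScript, roughly as sketched below. The `InfoJson` shape is an assumption inferred only from that jq filter, and `summarizeInfo` and `Summary` are hypothetical names for illustration, not part of getraw.

```typescript
// Assumed shape, inferred only from the jq filter in the README
// ('.title, .duration, .formats[0].url'); no other fields are documented here.
interface InfoJson {
  title?: string;
  duration?: number;
  formats?: { url?: string }[];
}

// Hypothetical result type: null marks a field that was absent from the JSON.
interface Summary {
  title: string | null;
  duration: number | null;
  firstFormatUrl: string | null;
}

// Parse the text printed by `getraw --dump-json URL` and pick the three fields
// that the README's jq example selects.
function summarizeInfo(jsonText: string): Summary {
  const info = JSON.parse(jsonText) as InfoJson;
  return {
    title: info.title ?? null,
    duration: info.duration ?? null,
    firstFormatUrl: info.formats?.[0]?.url ?? null,
  };
}
```

The JSON text would come from the command's stdout. Missing fields map to `null` rather than throwing, which may suit automation that handles many sites with differing metadata.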