datagrok-tools 5.1.9 → 6.0.0

@@ -0,0 +1,36 @@
+ FROM node:22-bookworm-slim
+
+ # System deps + Chrome stable
+ RUN apt-get update && apt-get install -y --no-install-recommends \
+       wget gnupg ca-certificates git curl jq procps docker.io \
+     && wget -q -O - https://dl.google.com/linux/linux_signing_key.pub \
+       | gpg --dearmor -o /usr/share/keyrings/google-chrome.gpg \
+     && echo "deb [arch=amd64 signed-by=/usr/share/keyrings/google-chrome.gpg] \
+       http://dl.google.com/linux/chrome/deb/ stable main" \
+       > /etc/apt/sources.list.d/google-chrome.list \
+     && apt-get update && apt-get install -y --no-install-recommends \
+       google-chrome-stable \
+     && rm -rf /var/lib/apt/lists/*
+
+ ENV PUPPETEER_SKIP_DOWNLOAD=true \
+     PUPPETEER_EXECUTABLE_PATH=/usr/bin/google-chrome-stable \
+     CHROME_BIN=/usr/bin/google-chrome-stable
+
+ # grok CLI, Claude Code, Playwright
+ RUN npm install -g datagrok-tools @anthropic-ai/claude-code \
+     && npx playwright install --with-deps chromium
+
+ # Use existing node user (UID 1000), add docker group access
+ RUN usermod -aG docker node 2>/dev/null || true \
+     && mkdir -p /home/node/.grok /home/node/.claude /home/node/.npm \
+     && chown -R node:node /home/node
+
+ COPY entrypoint.sh /usr/local/bin/entrypoint.sh
+ RUN chmod 755 /usr/local/bin/entrypoint.sh
+
+ RUN mkdir -p /workspace/repo && chown -R node:node /workspace
+
+ USER node
+ WORKDIR /workspace
+ ENTRYPOINT ["/usr/local/bin/entrypoint.sh"]
+ CMD ["sleep", "infinity"]
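
One way to confirm that an image built from the Dockerfile above contains the tools it installs is to check each command on `PATH`. The sketch below is not part of the package: `missing_tools` is a hypothetical helper, and the tool list is an assumption derived from the packages the Dockerfile installs (`grok` from `datagrok-tools`, `claude` from `@anthropic-ai/claude-code`).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print every listed command that is not on PATH.
# Returns 0 when all commands are found, 1 when at least one is missing.
missing_tools() {
  local tool missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
      missing=1
    fi
  done
  return "$missing"
}

# Assumed tool list, taken from the Dockerfile; meant to be run inside the container.
missing_tools google-chrome-stable grok claude git curl jq docker \
  || echo "some tools are missing (expected when run outside the tools-dev image)" >&2
```

Run outside the container, the script lists whichever of these tools the host lacks; inside a correctly built image it should print nothing.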
@@ -0,0 +1,501 @@
+ # Packages Dev Container
+
+ Isolated environment for JS/TS package development against a running Datagrok instance.
+ All images are pre-built on Docker Hub — no local build step needed. Install `datagrok-tools`
+ globally and run `grok claude` from any git repo.
+
+ ## Quick start
+
+ ```bash
+ npm i -g datagrok-tools # one-time install
+ export ANTHROPIC_API_KEY=... # or set in shell profile
+
+ # From any package directory:
+ grok claude
+
+ # With options:
+ grok claude --version 1.22.0 --profile full --task GROK-123
+
+ # Stop containers:
+ grok claude --stop --task GROK-123
+ ```
+
+ The `grok claude` command is fully self-contained — the compose configuration is
+ embedded in the CLI, all container images are pulled from Docker Hub (`datagrok/tools-dev`,
+ `datagrok/datagrok`, etc.). No files to copy, no Dockerfile to build.
+ See `grok claude --help` for all options.
+
+ ### What it does
+
+ 1. Generates a `docker-compose.yaml` and `.env` in a temp directory
+ 2. Runs `docker compose up -d --wait` (pulls pre-built images)
+ 3. Waits for Datagrok to be healthy
+ 4. Launches `claude --dangerously-skip-permissions` inside the `tools-dev` container
+ 5. On exit, stops containers (unless `--keep`)
+
+ ### Manual setup
+
+ For manual setup or customization, use the `docker-compose.yaml` in this directory
+ directly. The `Dockerfile.pkg_dev` and `entrypoint.sh` are the build recipe for the
+ `datagrok/tools-dev` Docker Hub image — end users don't need them.
+
+ ## What you can do
+
+ | Workflow | How it works |
+ |----------|-------------|
+ | **Develop packages (public repo)** | Mount public repo worktree as `/workspace`. Agent reads js-api source, help docs, ApiSamples, and CLAUDE.md skills directly. |
+ | **Develop packages (separate repo)** | Mount your repo as `/workspace`. The public repo is auto-cloned to `/workspace/public` on first start, branch matching `DG_VERSION`. |
+ | **Learn from examples** | Agent reads `packages/ApiSamples/scripts/` for runnable code samples and `help/develop/` for guides. |
+ | **Search Jira and GitHub** | MCP plugins for Atlassian Jira and GitHub — agent searches for similar issues, reads context, updates status. |
+ | **Write and run tests** | Playwright for browser automation, `grok test` for Puppeteer-based package tests, Chrome remote debugging. |
+ | **Publish packages** | `grok publish` to the local Datagrok instance or any external server. Agent handles build + publish. |
+ | **Manage the Datagrok stand** | Agent changes `DG_VERSION`, pulls new images, resets DB, redeploys — all from inside the container via Docker socket. |
+ | **Interactive setup** | Agent asks the user for tokens, auth keys, server URLs via Claude Code prompts. No pre-configuration required. |
+
+ ## Architecture
+
+ ```
+ Host machine
+ ┌──────────────────────────────────────────────────────────────────┐
+ │  Worktree: ~/pkg-worktrees/TASK-123/                             │
+ │    └── public/ (auto-cloned if workspace is not public repo)     │
+ │                                                                  │
+ │  Docker network: dg-pkg-TASK-123-net                             │
+ │  ┌────────────────────────────────────────────────────────────┐  │
+ │  │  datagrok (datagrok/datagrok:${VERSION}) → :8080           │  │
+ │  │  postgres (pgvector/pgvector:pg17)                         │  │
+ │  │  rabbitmq (rabbitmq:4.0.5-management)                      │  │
+ │  │  grok_pipe (datagrok/grok_pipe:${VERSION})                 │  │
+ │  │  grok_connect, grok_spawner, jkg (optional profiles)       │  │
+ │  │  world, test_db, northwind (optional demo DBs)             │  │
+ │  │                                                            │  │
+ │  │  tools-dev (datagrok/tools-dev:latest)                     │  │
+ │  │  ├── /workspace ← bind mount of worktree                   │  │
+ │  │  ├── /var/run/docker.sock ← host Docker for stand mgmt     │  │
+ │  │  ├── ~/.claude/ ← bind mount from host                     │  │
+ │  │  ├── Claude Code + MCP plugins (Jira, GitHub)              │  │
+ │  │  ├── Playwright + Chrome for testing                       │  │
+ │  │  └── grok CLI for build/publish/test                       │  │
+ │  └────────────────────────────────────────────────────────────┘  │
+ │                                                                  │
+ │  Host browser → http://localhost:8080 (Datagrok UI)              │
+ │               → chrome://inspect → localhost:9222 (Debug)        │
+ └──────────────────────────────────────────────────────────────────┘
+ ```
+
+ ## docker-compose.yaml
+
+ The `docker-compose.yaml` in this directory uses pre-built images from Docker Hub.
+ The `grok claude` command embeds this same configuration and generates it automatically
+ — you don't need this file unless you want manual control.
+
+ For manual use, set variables in `.env` or export them, then:
+
+ ```bash
+ docker compose up -d
+ ```
+
+ ### Building the tools-dev image
+
+ The `Dockerfile.pkg_dev` and `entrypoint.sh` are the build recipe for the
+ `datagrok/tools-dev` image published to Docker Hub. To build locally:
+
+ ```bash
+ cd public/tools/.devcontainer
+ docker build -f Dockerfile.pkg_dev -t datagrok/tools-dev:latest .
+ ```
+
+ ## Working with repos
+
+ ### Public repo (default)
+
+ ```bash
+ cd /path/to/public
+ git worktree add ~/pkg-worktrees/TASK-123 -b TASK-123
+
+ # Set WORKTREE_PATH in .env or export
+ echo "WORKTREE_PATH=$HOME/pkg-worktrees/TASK-123" >> .env
+ docker compose up -d
+ ```
+
+ Inside the container, the agent sees:
+ - `/workspace/js-api/` — JS API source (read directly, no build needed for reference)
+ - `/workspace/packages/ApiSamples/scripts/` — runnable code examples
+ - `/workspace/help/develop/` — development guides
+ - `/workspace/packages/` — all existing packages as reference
+
+ ### Separate repo (e.g., private packages)
+
+ ```bash
+ git -C /path/to/my-repo worktree add ~/pkg-worktrees/TASK-123 -b TASK-123
+
+ echo "WORKTREE_PATH=$HOME/pkg-worktrees/TASK-123" >> .env
+ docker compose up -d
+ ```
+
+ On first start, the entrypoint detects that `/workspace` is not the public repo
+ (no `js-api/` at root) and automatically clones it to `/workspace/public`:
+
+ ```
+ [tools-dev] Cloning public repo (bleeding-edge) into /workspace/public...
+ [tools-dev] Public repo ready at /workspace/public (branch: bleeding-edge).
+ ```
+
+ The branch is resolved in this order:
+ 1. `DG_PUBLIC_BRANCH` (explicit override in `.env`)
+ 2. `DG_VERSION` (matches the Datagrok image tag — keeps API and server in sync)
+ 3. Falls back to `master` if the branch/tag doesn't exist
+
+ Inside the container:
+ - `/workspace/` — your repo
+ - `/workspace/public/js-api/` — JS API source (matching Datagrok version)
+ - `/workspace/public/packages/ApiSamples/scripts/` — code examples
+ - `/workspace/public/help/` — docs
+ - `/workspace/public/packages/` — reference packages
+
+ To use a private fork of public, set `DG_PUBLIC_REPO` in `.env`:
+ ```bash
+ DG_PUBLIC_REPO=https://github.com/myorg/public-fork.git
+ ```
+
+ Worktrees keep full git connectivity — commit, push, fetch, PR all work.
+
+ ## MCP plugins: Jira and GitHub
+
+ The agent uses MCP plugins to search for similar issues, read context, and update
+ status. Set tokens in `.env` or the agent will ask interactively on first use.
+
+ ### Jira (Atlassian)
+
+ ```bash
+ # Inside tools-dev, register the MCP server:
+ claude mcp add mcp-atlassian -s user -- \
+   npx -y mcp-atlassian \
+   --jira-url "$JIRA_URL" \
+   --jira-username "$JIRA_USERNAME" \
+   --jira-token "$JIRA_TOKEN"
+ ```
+
+ The agent can then:
+ - Search for issues: "find similar bugs to GROK-12345"
+ - Read issue details, comments, attachments
+ - Update issue status, add comments
+ - Create new issues
+
+ ### GitHub
+
+ ```bash
+ # Inside tools-dev, register the MCP server:
+ claude mcp add github -s user -- \
+   npx -y @modelcontextprotocol/server-github
+ ```
+
+ Requires `GITHUB_TOKEN` in the environment (already passed from `.env`). The agent can:
+ - Search issues and PRs across repos
+ - Read issue comments and PR diffs
+ - Create issues, PRs, and comments
+ - Check CI status
+
+ ### Interactive token setup
+
+ If tokens are not pre-configured, the agent asks the user:
+
+ ```
+ Agent: I need a Jira API token to search for similar issues.
+        Go to https://id.atlassian.com/manage-profile/security/api-tokens
+        and create a token. Paste it here.
+ User: <pastes token>
+ Agent: <configures MCP server and proceeds>
+ ```
+
+ The same flow works for GitHub tokens and grok dev keys.
+
+ ## Testing
+
+ ### Playwright (browser automation)
+
+ Playwright is pre-installed with Chromium. Use it for custom browser automation,
+ E2E tests, and UI interaction scenarios.
+
+ ```bash
+ # Inside tools-dev:
+ cd /workspace/packages/MyPkg
+
+ # Run Playwright tests (if the package has them)
+ npx playwright test --project chromium
+
+ # Or write ad-hoc automation
+ npx playwright codegen http://datagrok:8080
+ ```
+
+ ### grok test (Puppeteer-based package tests)
+
+ Standard Datagrok package testing via the `grok` CLI:
+
+ ```bash
+ # Inside tools-dev:
+ cd /workspace/packages/MyPkg
+ grok test --host local # headless
+ grok test --host local --gui # visible browser
+ grok test --host local --verbose # detailed output
+ grok test --host local --category "MyCategory" # filter tests
+ ```
+
+ ### Chrome remote debugging
+
+ `grok test --gui` runs Chrome with `--remote-debugging-port=9222`. Port 9222 is
+ exposed to the host.
+
+ On host: open `chrome://inspect` → Configure → add `localhost:9222` → click "inspect"
+ on the test session. Set breakpoints in package source during test execution.
+
+ ### Developing test scenarios
+
+ The agent can:
+ 1. Read existing tests in `packages/ApiTests/` and other packages for patterns
+ 2. Create new test files following the `package-test.ts` template
+ 3. Run tests and analyze failures
+ 4. Generate test cases from Jira issue descriptions or GitHub issues
+
+ ```bash
+ # Scaffold a test file in a package
+ cd /workspace/packages/MyPkg
+ grok add test
+ ```
+
+ ## Publishing packages
+
+ ### To the local Datagrok instance
+
+ ```bash
+ # Inside tools-dev:
+ cd /workspace/packages/MyPkg
+ grok publish # debug mode (visible only to dev)
+ grok publish --release # public release
+ grok publish --build # build webpack first
+ ```
+
+ ### To an external server
+
+ ```bash
+ # Add a server config
+ grok config add --alias prod \
+   --server https://example.datagrok.ai/api \
+   --key <dev-key>
+
+ # Publish to it
+ grok publish prod --release --build
+ ```
+
+ The agent handles the full cycle: build, check, publish, verify in the UI.
+
+ ## Managing the Datagrok stand
+
+ The tools-dev container has the Docker socket mounted, so the agent can manage the
+ entire compose stack from inside.
+
+ ### Version switching
+
+ ```bash
+ # Inside tools-dev (or from host):
+ DG_VERSION=1.22.0 docker compose up -d --pull always
+ ```
+
+ When the workspace is a separate repo (not public), the auto-cloned public repo
+ should also be updated to match the new version:
+
+ ```bash
+ # Inside tools-dev — update the public clone to match new DG_VERSION:
+ cd /workspace/public && git fetch && git checkout 1.22.0
+ ```
+
+ The agent does this autonomously when asked:
+ ```
+ User: Switch to Datagrok 1.22.0
+ Agent: <updates DG_VERSION, pulls new images, updates public branch>
+        Datagrok 1.22.0 is running. Public repo updated to 1.22.0.
+ ```
+
+ ### DB reset / redeploy
+
+ ```bash
+ docker compose down -v && docker compose up -d
+ ```
+
+ ### Adding profiles on the fly
+
+ ```bash
+ # Need demo databases now
+ docker compose --profile demo up -d
+
+ # Need everything
+ docker compose --profile full up -d
+ ```
+
+ ## Profiles
+
+ | Profile | Services added | Use case |
+ |---------|---------------|----------|
+ | (none) | postgres, rabbitmq, grok_pipe, datagrok, tools-dev | Basic package dev |
+ | `demo` | + world, test_db, northwind | Need demo databases |
+ | `scripting` | + jupyter_kernel_gateway | Need Python/R/Julia scripts |
+ | `full` | + grok_connect, grok_spawner, demo DBs, JKG | Everything |
+
+ ## JS API and code reference
+
+ The agent reads JS API source and documentation directly from the workspace — no
+ generated docs needed.
+
+ ### Key paths (public repo at `/workspace`)
+
+ | What | Path |
+ |------|------|
+ | JS API source (types, classes, methods) | `js-api/src/` |
+ | JS API entry points (grok, ui, dg) | `js-api/grok.ts`, `js-api/ui.ts`, `js-api/dg.ts` |
+ | JS API CLAUDE.md (module map, patterns) | `js-api/CLAUDE.md` |
+ | Runnable code samples | `packages/ApiSamples/scripts/` |
+ | Package development guide | `help/develop/packages/` |
+ | All platform help docs | `help/` |
+ | Existing packages (patterns) | `packages/` |
+ | CLI tool source | `tools/` |
+
+ ### Key paths (separate repo — public auto-cloned)
+
+ Same as above, prefixed with `public/`:
+ - `public/js-api/src/`, `public/packages/ApiSamples/scripts/`, etc.
+
+ The public branch matches `DG_VERSION` by default, so js-api types stay in sync
+ with the running Datagrok server.
+
+ ### CLAUDE.md for the agent
+
+ Add to your project's CLAUDE.md so the agent knows where to look:
+
+ ```markdown
+ ## Reference
+
+ - JS API: read source in `js-api/src/` — see `js-api/CLAUDE.md` for module map
+ - Code samples: `packages/ApiSamples/scripts/` — runnable examples for eval
+ - Help docs: `help/develop/` — guides for packages, viewers, functions
+ - Existing packages: `packages/` — real-world patterns
+ ```
+
+ ## Exposing the client
+
+ - Datagrok UI: `http://localhost:${DG_PORT:-8080}`
+ - Default credentials: **admin / admin** (created on first deploy)
+
+ ### grok CLI config inside the container
+
+ The entrypoint auto-creates `~/.grok/config.yaml` with the local Datagrok instance
+ (dev key `admin`). The grok CLI is ready to use immediately — no manual config needed.
+
+ If you need to reconfigure:
+
+ ```bash
+ # Inside tools-dev:
+ grok config add --alias local \
+   --server http://datagrok:8080/api \
+   --key admin --default
+ ```
+
+ ## Claude Code inside the container
+
+ Mount `~/.claude/` from host (already in the compose file) for credentials and
+ config. Pass `ANTHROPIC_API_KEY` via `.env` or shell export.
+
+ ```bash
+ # Single entry point — launch Claude Code interactively
+ docker exec -it <tools-dev_container> claude --dangerously-skip-permissions
+
+ # One-shot command
+ docker exec <tools-dev_container> claude -p "publish the Chem package" \
+   --dangerously-skip-permissions
+ ```
+
+ The agent can:
+ - Build, test, and publish packages
+ - Search Jira/GitHub for context (via MCP plugins)
+ - Manage the Datagrok stand (version switch, redeploy, DB reset)
+ - Ask the user for tokens and auth when needed
+ - Read js-api source and ApiSamples for reference
+
+ ## Quick reference
+
+ ```bash
+ # Start (basic)
+ docker compose up -d
+
+ # Start (everything)
+ docker compose --profile full up -d
+
+ # Launch Claude agent (single entry point)
+ docker exec -it <tools-dev> claude --dangerously-skip-permissions
+
+ # Shell into tools-dev
+ docker exec -it <tools-dev> bash
+
+ # Publish a package
+ docker exec <tools-dev> bash -c "cd /workspace/packages/MyPkg && grok publish"
+
+ # Run grok tests
+ docker exec <tools-dev> bash -c "cd /workspace/packages/MyPkg && grok test --host local"
+
+ # Run Playwright tests
+ docker exec <tools-dev> bash -c "cd /workspace/packages/MyPkg && npx playwright test"
+
+ # Version switch
+ DG_VERSION=1.22.0 docker compose up -d --pull always
+
+ # DB reset
+ docker compose down -v && docker compose up -d
+
+ # Logs
+ docker compose logs -f datagrok
+
+ # Stop
+ docker compose down
+ ```
+
+ ## Troubleshooting
+
+ ### Datagrok not starting
+ ```bash
+ docker compose logs datagrok
+ # Wait for DB migrations — first start takes 1-2 minutes
+ # Look for: "Server started on port 8080"
+ ```
+
+ ### grok publish fails
+ Ensure `~/.grok/config.yaml` inside the container points to `http://datagrok:8080/api`
+ with a valid dev key. The container reaches Datagrok via Docker DNS, not `localhost`.
+
+ ### Chrome / Playwright not working
+ ```bash
+ docker exec <tools-dev> google-chrome-stable --version
+ docker exec <tools-dev> npx playwright --version
+ ```
+ If missing, pull the latest image: `docker pull datagrok/tools-dev:latest`
+
+ ### MCP plugins not connecting
+ ```bash
+ # Verify env vars are set
+ docker exec <tools-dev> env | grep -E 'JIRA|GITHUB'
+ # Re-register if needed
+ docker exec -it <tools-dev> claude mcp add mcp-atlassian -s user -- \
+   npx -y mcp-atlassian --jira-url "$JIRA_URL" \
+   --jira-username "$JIRA_USERNAME" --jira-token "$JIRA_TOKEN"
+ ```
+
+ ### Agent can't manage Docker (version switch, redeploy)
+ The Docker socket must be mounted and the `node` user must be in the `docker` group:
+ ```bash
+ docker exec <tools-dev> docker ps
+ ```
+ If permission denied, check that `/var/run/docker.sock` is mounted and accessible.
+
+ ### Container can't resolve `datagrok` hostname
+ All services must be on the same Docker network:
+ ```bash
+ docker network inspect dg-pkg-${TASK_KEY:-default}-net
+ ```
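
The README's branch resolution order for the auto-cloned public repo (`DG_PUBLIC_BRANCH`, then `DG_VERSION`, then `master`) can be sketched as a small shell function. This is an illustration only, not the actual `entrypoint.sh`, which this diff does not show. The names `resolve_public_branch`, `branch_exists`, and `KNOWN_REFS` are hypothetical, and the sketch assumes that an override naming a nonexistent ref falls straight back to `master` rather than trying `DG_VERSION` next.

```shell
#!/usr/bin/env bash
# Hypothetical check: succeed when <ref> appears in the space-separated
# KNOWN_REFS list. The list stands in for the remote so the sketch runs
# offline; a real script would more likely query the remote, for example
# with `git ls-remote --exit-code "$DG_PUBLIC_REPO" "$ref"`.
branch_exists() {
  local ref="$1" known
  for known in ${KNOWN_REFS:-}; do
    [ "$known" = "$ref" ] && return 0
  done
  return 1
}

# Resolve the branch to clone: explicit override first, then DG_VERSION,
# then master when the chosen ref is unset or does not exist (assumption).
resolve_public_branch() {
  local candidate="${DG_PUBLIC_BRANCH:-${DG_VERSION:-}}"
  if [ -n "$candidate" ] && branch_exists "$candidate"; then
    printf '%s\n' "$candidate"
  else
    printf '%s\n' master
  fi
}

# Usage: each call runs in a subshell so the variables do not leak.
KNOWN_REFS="master bleeding-edge 1.22.0"
unset DG_PUBLIC_BRANCH DG_VERSION
( DG_VERSION=1.22.0; resolve_public_branch )   # prints 1.22.0
( DG_VERSION=9.9.9; resolve_public_branch )    # prints master
```

Keeping the existence check in its own function makes it easy to swap the offline list for a real remote query without touching the ordering logic.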