synth-ai 0.2.14__py3-none-any.whl → 0.2.16__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Potentially problematic release.
This version of synth-ai might be problematic.
- examples/README.md +1 -0
- examples/multi_step/SFT_README.md +147 -0
- examples/multi_step/configs/crafter_rl_stepwise_hosted_judge.toml +9 -9
- examples/multi_step/configs/crafter_sft_qwen30b_lora.toml +62 -0
- examples/multi_step/convert_traces_to_sft.py +84 -0
- examples/multi_step/run_sft_qwen30b.sh +45 -0
- examples/qwen_coder/configs/coder_lora_30b.toml +2 -1
- examples/qwen_coder/configs/coder_lora_4b.toml +2 -1
- examples/qwen_coder/configs/coder_lora_small.toml +2 -1
- examples/qwen_vl/BUGS_AND_FIXES.md +232 -0
- examples/qwen_vl/IMAGE_VALIDATION_COMPLETE.md +271 -0
- examples/qwen_vl/IMAGE_VALIDATION_SUMMARY.md +260 -0
- examples/qwen_vl/INFERENCE_SFT_TESTS.md +412 -0
- examples/qwen_vl/NEXT_STEPS_2B.md +325 -0
- examples/qwen_vl/QUICKSTART.md +327 -0
- examples/qwen_vl/QUICKSTART_RL_VISION.md +110 -0
- examples/qwen_vl/README.md +154 -0
- examples/qwen_vl/RL_VISION_COMPLETE.md +475 -0
- examples/qwen_vl/RL_VISION_TESTING.md +333 -0
- examples/qwen_vl/SDK_VISION_INTEGRATION.md +328 -0
- examples/qwen_vl/SETUP_COMPLETE.md +275 -0
- examples/qwen_vl/VISION_TESTS_COMPLETE.md +490 -0
- examples/qwen_vl/VLM_PIPELINE_COMPLETE.md +242 -0
- examples/qwen_vl/__init__.py +2 -0
- examples/qwen_vl/collect_data_via_cli.md +423 -0
- examples/qwen_vl/collect_vision_traces.py +368 -0
- examples/qwen_vl/configs/crafter_rl_vision_qwen3vl4b.toml +127 -0
- examples/qwen_vl/configs/crafter_vlm_sft_example.toml +60 -0
- examples/qwen_vl/configs/eval_gpt4o_mini_vision.toml +43 -0
- examples/qwen_vl/configs/eval_gpt4o_vision_proper.toml +29 -0
- examples/qwen_vl/configs/eval_gpt5nano_vision.toml +45 -0
- examples/qwen_vl/configs/eval_qwen2vl_vision.toml +44 -0
- examples/qwen_vl/configs/filter_qwen2vl_sft.toml +50 -0
- examples/qwen_vl/configs/filter_vision_sft.toml +53 -0
- examples/qwen_vl/configs/filter_vision_test.toml +8 -0
- examples/qwen_vl/configs/sft_qwen3_vl_2b_test.toml +54 -0
- examples/qwen_vl/crafter_gpt5nano_agent.py +308 -0
- examples/qwen_vl/crafter_qwen_vl_agent.py +300 -0
- examples/qwen_vl/run_vision_comparison.sh +62 -0
- examples/qwen_vl/run_vision_sft_pipeline.sh +175 -0
- examples/qwen_vl/test_image_validation.py +201 -0
- examples/qwen_vl/test_sft_vision_data.py +110 -0
- examples/rl/README.md +1 -1
- examples/rl/configs/eval_base_qwen.toml +17 -0
- examples/rl/configs/eval_rl_qwen.toml +13 -0
- examples/rl/configs/rl_from_base_qwen.toml +37 -0
- examples/rl/configs/rl_from_base_qwen17.toml +76 -0
- examples/rl/configs/rl_from_ft_qwen.toml +37 -0
- examples/rl/run_eval.py +436 -0
- examples/rl/run_rl_and_save.py +111 -0
- examples/rl/task_app/README.md +22 -0
- examples/rl/task_app/math_single_step.py +990 -0
- examples/rl/task_app/math_task_app.py +111 -0
- examples/sft/README.md +5 -5
- examples/sft/configs/crafter_fft_qwen0p6b.toml +4 -2
- examples/sft/configs/crafter_lora_qwen0p6b.toml +4 -3
- examples/sft/evaluate.py +2 -4
- examples/sft/export_dataset.py +7 -4
- examples/swe/task_app/README.md +1 -1
- examples/swe/task_app/grpo_swe_mini.py +0 -1
- examples/swe/task_app/grpo_swe_mini_task_app.py +0 -12
- examples/swe/task_app/hosted/envs/mini_swe/environment.py +13 -13
- examples/swe/task_app/hosted/policy_routes.py +0 -2
- examples/swe/task_app/hosted/rollout.py +0 -8
- examples/task_apps/crafter/task_app/grpo_crafter.py +4 -7
- examples/task_apps/crafter/task_app/synth_envs_hosted/envs/crafter/policy.py +59 -1
- examples/task_apps/crafter/task_app/synth_envs_hosted/inference/openai_client.py +30 -0
- examples/task_apps/crafter/task_app/synth_envs_hosted/policy_routes.py +62 -31
- examples/task_apps/crafter/task_app/synth_envs_hosted/rollout.py +16 -14
- examples/task_apps/enron/__init__.py +1 -0
- examples/vlm/README.md +3 -3
- examples/vlm/configs/crafter_vlm_gpt4o.toml +2 -0
- examples/vlm/crafter_openai_vlm_agent.py +3 -5
- examples/vlm/filter_image_rows.py +1 -1
- examples/vlm/run_crafter_vlm_benchmark.py +2 -2
- examples/warming_up_to_rl/_utils.py +92 -0
- examples/warming_up_to_rl/analyze_trace_db.py +1 -1
- examples/warming_up_to_rl/configs/crafter_fft.toml +2 -0
- examples/warming_up_to_rl/configs/crafter_fft_4b.toml +2 -0
- examples/warming_up_to_rl/configs/eval_fft_qwen4b.toml +2 -0
- examples/warming_up_to_rl/configs/eval_groq_qwen32b.toml +2 -0
- examples/warming_up_to_rl/configs/eval_modal_qwen4b.toml +2 -1
- examples/warming_up_to_rl/configs/rl_from_base_qwen4b.toml +2 -1
- examples/warming_up_to_rl/configs/rl_from_ft.toml +2 -0
- examples/warming_up_to_rl/export_trace_sft.py +174 -60
- examples/warming_up_to_rl/readme.md +63 -132
- examples/warming_up_to_rl/run_fft_and_save.py +1 -1
- examples/warming_up_to_rl/run_rl_and_save.py +1 -1
- examples/warming_up_to_rl/task_app/README.md +42 -0
- examples/warming_up_to_rl/task_app/grpo_crafter.py +696 -0
- examples/warming_up_to_rl/task_app/grpo_crafter_task_app.py +135 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/README.md +173 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/__init__.py +5 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/branching.py +143 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/environment_routes.py +1226 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/envs/__init__.py +1 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/envs/crafter/__init__.py +6 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/envs/crafter/app.py +1 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/envs/crafter/environment.py +522 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/envs/crafter/policy.py +478 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/envs/crafter/react_agent.py +108 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/envs/crafter/shared.py +305 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/envs/crafter/tools.py +47 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/hosted_app.py +204 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/inference/__init__.py +5 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/inference/openai_client.py +618 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/main.py +100 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/policy_routes.py +1081 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/registry.py +195 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/rollout.py +1861 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/storage/__init__.py +5 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/storage/volume.py +211 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/test_agents.py +161 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/test_service.py +137 -0
- examples/warming_up_to_rl/task_app/synth_envs_hosted/utils.py +62 -0
- synth_ai/__init__.py +44 -30
- synth_ai/_utils/__init__.py +47 -0
- synth_ai/_utils/base_url.py +10 -0
- synth_ai/_utils/http.py +10 -0
- synth_ai/_utils/prompts.py +10 -0
- synth_ai/_utils/task_app_state.py +12 -0
- synth_ai/_utils/user_config.py +10 -0
- synth_ai/api/models/supported.py +144 -7
- synth_ai/api/train/__init__.py +13 -1
- synth_ai/api/train/cli.py +30 -7
- synth_ai/api/train/config_finder.py +18 -11
- synth_ai/api/train/env_resolver.py +13 -10
- synth_ai/cli/__init__.py +62 -78
- synth_ai/cli/_modal_wrapper.py +7 -5
- synth_ai/cli/_typer_patch.py +0 -2
- synth_ai/cli/_validate_task_app.py +22 -4
- synth_ai/cli/legacy_root_backup.py +3 -1
- synth_ai/cli/lib/__init__.py +10 -0
- synth_ai/cli/lib/task_app_discovery.py +7 -0
- synth_ai/cli/lib/task_app_env.py +518 -0
- synth_ai/cli/recent.py +2 -1
- synth_ai/cli/setup.py +266 -0
- synth_ai/cli/status.py +1 -1
- synth_ai/cli/task_app_deploy.py +16 -0
- synth_ai/cli/task_app_list.py +25 -0
- synth_ai/cli/task_app_modal_serve.py +16 -0
- synth_ai/cli/task_app_serve.py +18 -0
- synth_ai/cli/task_apps.py +71 -31
- synth_ai/cli/traces.py +1 -1
- synth_ai/cli/train.py +18 -0
- synth_ai/cli/tui.py +7 -2
- synth_ai/cli/turso.py +1 -1
- synth_ai/cli/watch.py +1 -1
- synth_ai/demos/__init__.py +10 -0
- synth_ai/demos/core/__init__.py +28 -1
- synth_ai/demos/crafter/__init__.py +1 -0
- synth_ai/demos/crafter/crafter_fft_4b.toml +55 -0
- synth_ai/demos/crafter/grpo_crafter_task_app.py +185 -0
- synth_ai/demos/crafter/rl_from_base_qwen4b.toml +74 -0
- synth_ai/demos/demo_registry.py +176 -0
- synth_ai/demos/math/__init__.py +1 -0
- synth_ai/demos/math/_common.py +16 -0
- synth_ai/demos/math/app.py +38 -0
- synth_ai/demos/math/config.toml +76 -0
- synth_ai/demos/math/deploy_modal.py +54 -0
- synth_ai/demos/math/modal_task_app.py +702 -0
- synth_ai/demos/math/task_app_entry.py +51 -0
- synth_ai/environments/environment/core.py +7 -1
- synth_ai/environments/examples/bandit/engine.py +0 -1
- synth_ai/environments/examples/bandit/environment.py +0 -1
- synth_ai/environments/examples/wordle/environment.py +0 -1
- synth_ai/evals/base.py +16 -5
- synth_ai/evals/client.py +1 -1
- synth_ai/inference/client.py +1 -1
- synth_ai/judge_schemas.py +8 -8
- synth_ai/learning/client.py +1 -1
- synth_ai/learning/health.py +1 -1
- synth_ai/learning/jobs.py +1 -1
- synth_ai/learning/rl/client.py +1 -1
- synth_ai/learning/rl/env_keys.py +1 -1
- synth_ai/learning/rl/secrets.py +1 -1
- synth_ai/learning/sft/client.py +1 -1
- synth_ai/learning/sft/data.py +407 -4
- synth_ai/learning/validators.py +4 -1
- synth_ai/task/apps/__init__.py +4 -2
- synth_ai/task/config.py +6 -4
- synth_ai/task/rubrics/__init__.py +1 -2
- synth_ai/task/rubrics/loaders.py +14 -10
- synth_ai/task/rubrics.py +219 -0
- synth_ai/task/trace_correlation_helpers.py +24 -11
- synth_ai/task/tracing_utils.py +14 -3
- synth_ai/task/validators.py +2 -3
- synth_ai/tracing_v3/abstractions.py +3 -3
- synth_ai/tracing_v3/config.py +15 -13
- synth_ai/tracing_v3/constants.py +21 -0
- synth_ai/tracing_v3/db_config.py +3 -1
- synth_ai/tracing_v3/decorators.py +10 -7
- synth_ai/tracing_v3/llm_call_record_helpers.py +5 -5
- synth_ai/tracing_v3/session_tracer.py +7 -7
- synth_ai/tracing_v3/storage/base.py +29 -29
- synth_ai/tracing_v3/storage/config.py +3 -3
- synth_ai/tracing_v3/turso/daemon.py +8 -9
- synth_ai/tracing_v3/turso/native_manager.py +80 -72
- synth_ai/tracing_v3/utils.py +2 -2
- synth_ai/tui/cli/query_experiments.py +4 -4
- synth_ai/tui/cli/query_experiments_v3.py +4 -4
- synth_ai/tui/dashboard.py +14 -9
- synth_ai/utils/__init__.py +101 -0
- synth_ai/utils/base_url.py +94 -0
- synth_ai/utils/cli.py +131 -0
- synth_ai/utils/env.py +287 -0
- synth_ai/utils/http.py +169 -0
- synth_ai/utils/modal.py +308 -0
- synth_ai/utils/process.py +212 -0
- synth_ai/utils/prompts.py +39 -0
- synth_ai/utils/sqld.py +122 -0
- synth_ai/utils/task_app_discovery.py +882 -0
- synth_ai/utils/task_app_env.py +186 -0
- synth_ai/utils/task_app_state.py +318 -0
- synth_ai/utils/user_config.py +137 -0
- synth_ai/v0/config/__init__.py +1 -5
- synth_ai/v0/config/base_url.py +1 -7
- synth_ai/v0/tracing/config.py +1 -1
- synth_ai/v0/tracing/decorators.py +1 -1
- synth_ai/v0/tracing/upload.py +1 -1
- synth_ai/v0/tracing_v1/config.py +1 -1
- synth_ai/v0/tracing_v1/decorators.py +1 -1
- synth_ai/v0/tracing_v1/upload.py +1 -1
- {synth_ai-0.2.14.dist-info → synth_ai-0.2.16.dist-info}/METADATA +85 -31
- {synth_ai-0.2.14.dist-info → synth_ai-0.2.16.dist-info}/RECORD +229 -117
- synth_ai/cli/man.py +0 -106
- synth_ai/compound/cais.py +0 -0
- synth_ai/core/experiment.py +0 -13
- synth_ai/core/system.py +0 -15
- synth_ai/demo_registry.py +0 -295
- synth_ai/handshake.py +0 -109
- synth_ai/http.py +0 -26
- {synth_ai-0.2.14.dist-info → synth_ai-0.2.16.dist-info}/WHEEL +0 -0
- {synth_ai-0.2.14.dist-info → synth_ai-0.2.16.dist-info}/entry_points.txt +0 -0
- {synth_ai-0.2.14.dist-info → synth_ai-0.2.16.dist-info}/licenses/LICENSE +0 -0
- {synth_ai-0.2.14.dist-info → synth_ai-0.2.16.dist-info}/top_level.txt +0 -0
@@ -5,6 +5,7 @@ from __future__ import annotations
 
 import argparse
 import json
+import os
 import sqlite3
 import sys
 from collections import Counter, defaultdict
@@ -12,6 +13,13 @@ from collections.abc import Iterable
 from pathlib import Path
 from typing import Any
 
+from synth_ai._utils.prompts import ensure_required_args
+from synth_ai.tracing_v3.constants import (
+    TRACE_DB_BASENAME,
+    TRACE_DB_DIR,
+    canonical_trace_db_name,
+)
+
 Row = sqlite3.Row
 
 
@@ -489,55 +497,81 @@ def _validate_dataset(records: list[dict[str, Any]]) -> None:
 
 
 def _find_trace_database() -> Path | None:
-    """Automatically discover the trace database in common locations."""
+    """Automatically discover the most recent trace database in common locations."""
 
-
-    try:
-        state_path = Path.home() / ".synth-ai" / "demo.json"
-        if state_path.exists():
-            import json
-
-            with state_path.open() as f:
-                data = json.load(f)
-            demo_dir = data.get("DEMO_DIR")
-            if demo_dir:
-                candidate = Path(demo_dir) / "traces" / "v3" / "synth_ai.db"
-                if candidate.exists():
-                    return candidate
-    except Exception:
-        pass
+    candidates: list[Path] = []
 
-    #
+    # Walk up parent directories from CWD
     cwd = Path.cwd()
     for parent in [cwd] + list(cwd.parents):
-
-
-
-
-
-
-
-
-
-
-
-
+        candidates.append(parent / "traces" / "v3")
+
+    # Standard fallback locations
+    candidates.extend(
+        [
+            TRACE_DB_DIR,
+            Path("../traces"),
+            Path.home() / "synth-ai" / "traces" / "v3",
+        ]
+    )
+
+    found: list[Path] = []
+    for directory in candidates:
         try:
-            if
-
+            if not directory.exists():
+                continue
+            for pattern in (
+                f"{TRACE_DB_BASENAME}_*.db",
+                canonical_trace_db_name(),
+            ):
+                for candidate in directory.glob(pattern):
+                    found.append(candidate.resolve())
         except Exception:
             continue
 
-
+    if not found:
+        return None
+
+    found.sort(key=lambda p: p.stat().st_mtime, reverse=True)
+    return found[0]
+
+
+def _discover_local_trace_dbs(root: Path) -> list[Path]:
+    """Return trace DBs under *root* (recursively), newest first."""
+
+    candidates: set[Path] = set()
+    ignore_dirs = {".git", ".venv", "__pycache__", "node_modules", "dist", "build"}
+    target_exact = canonical_trace_db_name()
+
+    for dirpath, dirnames, filenames in os.walk(root):
+        dirnames[:] = [d for d in dirnames if d not in ignore_dirs]
+        for filename in filenames:
+            if filename == target_exact or (
+                filename.startswith(f"{TRACE_DB_BASENAME}_") and filename.endswith(".db")
+            ):
+                path = Path(dirpath) / filename
+                try:
+                    candidates.add(path.resolve())
+                except Exception:
+                    continue
+
+    return sorted(candidates, key=lambda p: p.stat().st_mtime, reverse=True)
 
 
 def main() -> None:
     parser = argparse.ArgumentParser(description=__doc__)
-    parser.add_argument("--db", type=Path, default=None, help="Path to tracing_v3 SQLite DB")
     parser.add_argument(
-        "--
+        "--in",
+        dest="input_path",
+        type=Path,
+        default=None,
+        help="Path to tracing_v3 SQLite DB",
+    )
+    parser.add_argument(
+        "--out",
+        dest="output_path",
         type=Path,
-
+        default=None,
         help="Destination JSONL path for the exported dataset",
     )
     parser.add_argument(
@@ -593,25 +627,109 @@ def main() -> None:
     )
     args = parser.parse_args()
 
-
-
-
-
-
-
-
-
-
-
+    default_output_path = (Path.cwd() / "ft_data" / "crafter_sft.jsonl").resolve()
+
+    initial_path: Path | None = None
+    if args.input_path is not None:
+        initial_path = Path(args.input_path).expanduser().resolve()
+    else:
+        discovered = _find_trace_database()
+        if discovered is not None:
+            initial_path = discovered.expanduser().resolve()
+    args.input_path = initial_path
+
+    if args.output_path is None:
+        args.output_path = default_output_path
+
+    local_candidates = _discover_local_trace_dbs(Path.cwd())
+    if local_candidates:
+        print("\nDiscovered trace databases:")
+        for idx, path in enumerate(local_candidates, start=1):
+            marker = " <- most recent" if idx == 1 else ""
+            print(f" {idx}) {path}{marker}")
+        print(" m) Enter path manually")
+        print(" 0) Abort")
+
+        default_index = 1
+        if initial_path:
+            for idx, candidate in enumerate(local_candidates, start=1):
+                if candidate == initial_path:
+                    default_index = idx
+                    break
 
+        while True:
+            prompt = f"Select database [{default_index}]: "
+            choice = input(prompt).strip().lower()
+            if not choice:
+                args.input_path = local_candidates[default_index - 1]
+                break
+            if choice == "0":
+                raise SystemExit("Aborted by user.")
+            if choice in {"m", "manual"}:
+                manual = input("Enter trace database path: ").strip()
+                if manual:
+                    args.input_path = Path(manual)
+                    break
+                print("Path required; try again.")
+                continue
+            try:
+                idx = int(choice)
+            except ValueError:
+                print("Invalid selection; enter a number, 'm', or 0 to abort.")
+                continue
+            if 1 <= idx <= len(local_candidates):
+                args.input_path = local_candidates[idx - 1]
+                break
+            print(f"Select between 1 and {len(local_candidates)}, 'm', or 0.")
+    elif initial_path is not None:
+        args.input_path = initial_path
+
+    # If output wasn't overridden, derive it from the chosen DB name
+    if args.output_path == default_output_path and args.input_path:
+        db_name = Path(args.input_path).name  # e.g., task_app_traces_2025-10-23_13-23-02.db
+        timestamp = db_name[:-3] if db_name.endswith(".db") else db_name
+        if timestamp.startswith("task_app_traces_"):
+            timestamp = timestamp[len("task_app_traces_") :]
+        derived_name = f"sft_dataset_{timestamp}.jsonl"
+        args.output_path = (Path.cwd() / "ft_data" / derived_name).resolve()
+
+    input_default = (
+        Path(args.input_path).expanduser().resolve()
+        if args.input_path is not None
+        else (TRACE_DB_DIR / canonical_trace_db_name()).expanduser().resolve()
+    )
+    output_default = Path(args.output_path).expanduser().resolve() if args.output_path else default_output_path
+
+    args = ensure_required_args(
+        args,
+        {
+            "input_path": "Trace database path",
+            "output_path": "Output JSONL path",
+        },
+        coerce={
+            "input_path": lambda raw: Path(raw).expanduser().resolve(),
+            "output_path": lambda raw: Path(raw).expanduser().resolve(),
+        },
+        defaults={
+            "input_path": input_default,
+            "output_path": output_default,
+        },
+    )
+
+    db_path = Path(args.input_path).expanduser().resolve()
+    print(f"Trace database: {db_path}")
     if not db_path.exists():
-
-
+        discovered = _find_trace_database()
+        if discovered and discovered.exists():
+            discovered = discovered.resolve()
+            print(f"Discovered trace database: {discovered}")
+            db_path = discovered
+        else:
+            print(f"Database not found: {db_path}", file=sys.stderr)
+            raise SystemExit(1)
 
-    output_path = args.
-
-        output_path = Path("ft_data/crafter_traces.jsonl")
-        print(f"Output will be written to: {output_path.resolve()}")
+    output_path = Path(args.output_path).expanduser().resolve()
+    print(f"Output dataset: {output_path}")
 
     min_unique = args.min_unique
     if min_unique is None:
@@ -619,15 +737,11 @@ def main() -> None:
         print(f"Minimum unique achievements filter: {min_unique} (all traces)")
 
     # Override args with prompted values
-    args.
-    args.
+    args.input_path = db_path
+    args.output_path = output_path
     args.min_unique = min_unique
 
-
-        print(f"Database not found: {args.db}", file=sys.stderr)
-        raise SystemExit(1)
-
-    conn = connect(args.db)
+    conn = connect(args.input_path)
     try:
         (
             achievements_map,
@@ -708,11 +822,11 @@ def main() -> None:
             raise SystemExit(1)
 
         _validate_dataset(dataset)
-        write_jsonl(args.
+        write_jsonl(args.output_path, dataset)
         session_ids = {item.get("metadata", {}).get("session_id") for item in dataset}
         session_ids.discard(None)
         print(
-            f"Wrote {len(dataset)} examples from {len(session_ids)} session(s) -> {args.
+            f"Wrote {len(dataset)} examples from {len(session_ids)} session(s) -> {args.output_path.resolve()}",
             file=sys.stderr,
         )
     finally:
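The output-naming rule added to `main()` in the hunks above can be exercised in isolation. What follows is a minimal sketch, not the exporter's actual code: `derive_output_name` and `TRACE_PREFIX` are hypothetical names introduced here for illustration. Only the `task_app_traces_` prefix, the `.db` suffix handling, and the `sft_dataset_<timestamp>.jsonl` pattern are taken from the diff.

```python
from pathlib import Path

# Literal prefix that the diff strips from the database file name.
TRACE_PREFIX = "task_app_traces_"


def derive_output_name(db_path: Path) -> str:
    """Hypothetical helper mirroring the naming rule shown in the diff."""
    stem = db_path.name
    # Drop the ".db" suffix if present.
    if stem.endswith(".db"):
        stem = stem[:-3]
    # Drop the trace prefix so that only the timestamp remains.
    if stem.startswith(TRACE_PREFIX):
        stem = stem[len(TRACE_PREFIX):]
    return f"sft_dataset_{stem}.jsonl"


if __name__ == "__main__":
    example = Path("traces/v3/task_app_traces_2025-10-23_13-23-02.db")
    print(derive_output_name(example))
```

Note that a database whose name lacks the prefix keeps its full stem, so `synth_ai.db` would map to `sft_dataset_synth_ai.jsonl` under this rule.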
@@ -1,179 +1,110 @@
 # Warming Up to RL (Crafter)
 
-
-
-## Quick Reference Commands
-
-- Serve task app locally with tracing:
-```bash
-uvx synth-ai serve --port 8001 --env-file examples/warming_up_to_rl/.env --trace traces/v3
-```
-- Deploy to Modal:
-```bash
-uvx synth-ai deploy grpo-crafter --name grpo-crafter-task-app
-```
-- Groq rollout (server-side):
-```bash
-uv run python examples/warming_up_to_rl/run_eval.py --toml examples/warming_up_to_rl/configs/eval_groq_qwen32b.toml --use-rollout
-```
-- Export SFT data from traced runs:
-```bash
-python examples/warming_up_to_rl/export_trace_sft.py --db traces/v3/synth_ai.db --output ft_data/crafter_traces.jsonl
-```
-- FFT via CLI:
-```bash
-uvx synth-ai train --type sft --config examples/warming_up_to_rl/configs/crafter_fft.toml --dataset /absolute/path/to/data.jsonl
-```
-- Evaluate FFT checkpoint:
-```bash
-uv run python examples/warming_up_to_rl/run_eval.py --toml examples/warming_up_to_rl/configs/eval_fft_qwen4b.toml --use-rollout
-```
-- RL via CLI (FFT-first):
-```bash
-uvx synth-ai train --type rl --config examples/warming_up_to_rl/configs/rl_from_ft.toml
-```
-
----
+This folder contains an end-to-end Crafter workflow: stand up the task app, collect Groq-powered rollouts, export tracing data for supervised fine-tuning, run FFT/RL jobs, and evaluate checkpoints. Commands assume the repository root as the working directory unless stated otherwise.
 
 ## 1. Prerequisites
 
 - Python 3.11+
-- `uv
-- Modal CLI (`modal token new`) if you plan to deploy the task app
--
-- `SYNTH_API_KEY`
-- `
-
-- Optional: `GROQ_API_KEY`, `OPENAI_API_KEY` for proxy endpoints
-
-`uvx synth-ai setup` can populate the `.env` by guiding you through the dashboard handshake.
+- [`uv`](https://docs.astral.sh/uv/) / `uvx` (or install `synth-ai` inside a virtualenv)
+- Modal CLI (`modal token new`) if you plan to deploy the task app remotely
+- API keys:
+  - `SYNTH_API_KEY` and `ENVIRONMENT_API_KEY` are required for CLI flows
+  - `GROQ_API_KEY` (used by the Groq policy) and optional `OPENAI_API_KEY`
+- Run `uvx synth-ai setup` once to pair with the Synth dashboard and populate `~/.synth-ai/user_config.json`
 
-
+## 2. Task App
 
-
-
-### Local development
+### Local serve (FastAPI)
 
 ```bash
-uvx synth-ai serve
+uvx synth-ai serve \
+  --env-file examples/warming_up_to_rl/.env \
+  --host 127.0.0.1 --port 8001 \
+  --trace traces/v3
 ```
 
-- `--trace`
-- Add `--
+- `--trace` creates/uses `traces/v3/task_app_traces_<timestamp>.db` for the lifetime of the server. All rollouts append to this file.
+- Add `--trace-db` to override the SQLite path (one DB per server instance).
+- Pass `--reload` during development for auto-reload.
 
 ### Modal deploy / serve
 
 ```bash
-uvx synth-ai deploy grpo-crafter --name grpo-crafter-task-app
-uvx synth-ai modal-serve grpo-crafter --name grpo-crafter-task-app
+uvx synth-ai deploy grpo-crafter --name grpo-crafter-task-app
+uvx synth-ai modal-serve grpo-crafter --name grpo-crafter-task-app
 ```
 
-Both commands
-
-## 3.
-
-
-
-- Groq Qwen3-32B:
-```bash
-uv run python examples/warming_up_to_rl/run_eval.py --toml examples/warming_up_to_rl/configs/eval_groq_qwen32b.toml --use-rollout
-```
-- Synth vLLM Qwen3-4B (Modal-hosted inference URL specified in TOML):
-```bash
-uv run python examples/warming_up_to_rl/run_eval.py --toml examples/warming_up_to_rl/configs/eval_modal_qwen4b.toml --use-rollout
-```
-
-`--use-rollout` drives the task app’s `/rollout` endpoint so achievements and metrics are captured. Without it the script issues per-step `initialize/step/terminate` calls.
-
-## 4. Tracing and SFT Dataset Export
-
-1. Serve the task app with tracing enabled (see Section 2). Optionally, run the traced rollout helper against the running server:
-```bash
-uv run python examples/warming_up_to_rl/run_local_rollout_traced.py \
-  --base-url http://localhost:8001 \
-  --api-key "$ENVIRONMENT_API_KEY" \
-  --inference-api-key "$GROQ_API_KEY" \
-  --model qwen/qwen3-32b \
-  --inference-url https://api.groq.com/openai \
-  --max-llm-calls 3 \
-  --run-id local-trace
-```
-2. Inspect local trace databases:
-```bash
-uvx synth-ai traces --limit 10
-```
-3. Export JSONL suitable for SFT:
-```bash
-python examples/warming_up_to_rl/export_trace_sft.py \
-  --db traces/v3/synth_ai.db \
-  --min-achievements 3 \
-  --output ft_data/crafter_traces.jsonl
-```
-
-The exporter enriches each example with achievements unlocked, model metadata, and reward summaries.
-
-## 5. SFT / FFT Training
-
-### Preferred: `uvx synth-ai train`
+Both commands reuse the same tracing defaults; the backend persists rollouts into the configured SQLite/Turso store.
+
+## 3. Collect rollouts
+
+Hit the running task app with the local helper to gather a traced rollout (Groq policy shown below):
 
 ```bash
-
-  --
-  --
-  --
+python examples/warming_up_to_rl/run_local_rollout_traced.py \
+  --base-url http://localhost:8001 \
+  --api-key "$ENVIRONMENT_API_KEY" \
+  --inference-api-key "$GROQ_API_KEY" \
+  --model qwen/qwen3-32b \
+  --inference-url https://api.groq.com/openai \
+  --max-llm-calls 3 \
+  --run-id local-trace
 ```
 
-
--
--
-- Submit the job and poll until completion unless `--no-poll` is set.
+Artifacts produced per rollout:
+- `traces/v3/task_app_traces_<timestamp>.db`: the task app’s append-only database (one per server lifetime; new rollouts append rows).
+- `local-trace_trace.json`: single-run JSON snapshot for inspection.
 
-
+## 4. Export SFT-ready data
 
 ```bash
-
-  --toml examples/warming_up_to_rl/configs/crafter_fft.toml \
-  --data /absolute/path/to/crafter_traces.jsonl \
-  --poll-seconds 1800
+python examples/warming_up_to_rl/export_trace_sft.py
 ```
 
-
+- When run without `--in`, the script lists every `task_app_traces*.db` under the current directory (and subdirectories), sorted by recency, and prompts you to pick one (the newest is marked `← most recent`).
+- The exporter validates the trace data, filters sessions, and writes JSONL to `ft_data/crafter_sft.jsonl` by default (override with `--out`).
 
-##
+## 5. FFT / SFT Training
 
-
+Recommended via CLI:
 
 ```bash
-
+uvx synth-ai train \
+  --type sft \
+  --config examples/warming_up_to_rl/configs/crafter_fft.toml \
+  --dataset /absolute/path/to/crafter_sft.jsonl
 ```
 
-
+The CLI uploads training data, submits the job to the Synth backend, and polls for completion. A legacy helper (`run_fft_and_save.py`) is still provided for ad-hoc usage.
 
-##
+## 6. Evaluate checkpoints
 
-
+Update the relevant TOML with the model identifier (e.g., `model = "ft:<model_id>"`) and run:
 
 ```bash
-
-  --
-  --
+uv run python examples/warming_up_to_rl/run_eval.py \
+  --toml examples/warming_up_to_rl/configs/eval_fft_qwen4b.toml \
+  --use-rollout
 ```
 
-
+`--use-rollout` exercises the `/rollout` endpoint so achievements/rewards are surfaced in traces.
 
-
+## 7. RL Training
 
 ```bash
-
-  --
+uvx synth-ai train \
+  --type rl \
+  --config examples/warming_up_to_rl/configs/rl_from_base_qwen4b.toml
 ```
 
-
+Start from `rl_from_ft.toml` if you want to bootstrap from a previously fine-tuned checkpoint.
+
+---
 
-
+### Notes on tracing
 
-- `
-- `
-- `
+- **One SQLite DB per server:** every task app instance maintains a single `task_app_traces_<timestamp>.db` and appends each new rollout. If you want a fresh file, start another `synth-ai serve` with a different `--trace-db` path.
+- **JSON snapshots per run:** `run_local_rollout_traced.py` writes `<run_id>_trace.json` so you can inspect or hand-edit individual runs.
+- **Exporter discovery:** the SFT exporter recursively catalogs all `task_app_traces*.db` files beneath the task app directory, allowing you to select any historical snapshot when exporting training data.
 
-
+These conventions keep tracing predictable: continuous history per server, easy selection of historical DBs, and one-off JSON exports for quick analysis.
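The "Exporter discovery" note in the README can be illustrated with a standalone sketch of a recursive, newest-first scan. This is an approximation under stated assumptions, not the package's implementation: `discover_trace_dbs` is a hypothetical name, and the values of `TRACE_DB_BASENAME` and `CANONICAL_NAME` are guesses based on the `task_app_traces_<timestamp>.db` file names in the README (the real constants live in `synth_ai.tracing_v3.constants`, which this diff does not show).

```python
import os
from pathlib import Path

# Assumed values for illustration only; the real constants are defined in
# synth_ai.tracing_v3.constants and are not visible in this diff.
TRACE_DB_BASENAME = "task_app_traces"
CANONICAL_NAME = "task_app_traces.db"
IGNORE_DIRS = {".git", ".venv", "__pycache__", "node_modules", "dist", "build"}


def discover_trace_dbs(root: Path) -> list[Path]:
    """Return trace DB paths under root, most recently modified first."""
    found: set[Path] = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune ignored directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in IGNORE_DIRS]
        for filename in filenames:
            timestamped = filename.startswith(f"{TRACE_DB_BASENAME}_") and filename.endswith(".db")
            if filename == CANONICAL_NAME or timestamped:
                found.add((Path(dirpath) / filename).resolve())
    # Sort by modification time, newest first.
    return sorted(found, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for index, path in enumerate(discover_trace_dbs(Path.cwd()), start=1):
        print(f"{index}) {path}")
```

Sorting on `st_mtime` rather than on the timestamp embedded in the file name means a database that is still receiving rollouts ranks first even if its server started earlier.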
@@ -0,0 +1,42 @@
+# Crafter Task App
+
+This example is now wired through the shared Synth task-app harness. Use the
+`uvx synth-ai` CLI to run it locally or deploy it to Modal without touching the
+underlying FastAPI plumbing.
+
+## Local development
+```bash
+uvx synth-ai serve grpo-crafter --port 8001
+# Optional extras:
+# --env-file path/to/.env # load additional environment variables
+# --reload # enable uvicorn auto-reload
+```
+
+Useful endpoints while the server is running:
+- `GET http://localhost:8001/health`
+- `GET http://localhost:8001/info`
+- `GET http://localhost:8001/task_info?seed=42`
+- `POST http://localhost:8001/rollout`
+
+## Deploy to Modal
+```bash
+uvx synth-ai deploy grpo-crafter --name grpo-crafter-task-app
+```
+
+Requirements:
+- Modal CLI installed and authenticated (`modal token new`).
+- Either provide an `.env` with `ENVIRONMENT_API_KEY`, `GROQ_API_KEY`, and `OPENAI_API_KEY`
+  (recommended; pass via `--env-file`). The deploy command injects these values via an inline
+  Modal secret plus `Secret.from_dotenv`, so the minted environment key stays in sync with
+  what the CLI sends.
+- Or ensure Modal secrets `groq-api-key` and `openai-api-key` exist and continue to supply
+  model vendor credentials that way.
+
+The CLI generates a Modal entrypoint on the fly using the shared
+`TaskAppConfig`, ensuring the container matches the local FastAPI behavior.
+
+## Compatibility note
+`examples/warming_up_to_rl/task_app/grpo_crafter_task_app.py` remains as a
+legacy wrapper exposing `fastapi_app()` and a `__main__` entrypoint. Behind the
+scenes it proxies to the shared configuration; prefer the CLI workflow above
+for new automation and tests.
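The read-only endpoints listed in the task app README can be probed from a short script. The sketch below uses only the Python standard library; `endpoint_url` and `get_json` are hypothetical helper names, and it assumes (without confirmation from the diff) that the `GET` endpoints return JSON and need no extra headers. If the server requires an API key, the request would have to be extended accordingly.

```python
import json
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen


def endpoint_url(base_url: str, path: str, **params: object) -> str:
    """Build a task-app URL such as http://localhost:8001/task_info?seed=42."""
    url = urljoin(base_url.rstrip("/") + "/", path.lstrip("/"))
    if params:
        url = f"{url}?{urlencode(params)}"
    return url


def get_json(base_url: str, path: str, timeout: float = 5.0, **params: object) -> object:
    """GET an endpoint and decode the body as JSON (assumed response format)."""
    with urlopen(endpoint_url(base_url, path, **params), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    base = "http://localhost:8001"  # port taken from the README's serve command
    print(endpoint_url(base, "/health"))
    print(endpoint_url(base, "/task_info", seed=42))
    # Requires a running server, e.g. `uvx synth-ai serve grpo-crafter --port 8001`:
    # print(get_json(base, "/health"))
```

Keeping URL construction separate from the network call makes the former easy to check without a running server.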