@bgicli/bgicli 2.2.8 → 2.2.10

Files changed (113)
  1. package/data/skills/anthropic-algorithmic-art/SKILL.md +405 -0
  2. package/data/skills/anthropic-canvas-design/SKILL.md +130 -0
  3. package/data/skills/anthropic-claude-api/SKILL.md +243 -0
  4. package/data/skills/anthropic-doc-coauthoring/SKILL.md +375 -0
  5. package/data/skills/anthropic-docx/SKILL.md +590 -0
  6. package/data/skills/anthropic-frontend-design/SKILL.md +42 -0
  7. package/data/skills/anthropic-internal-comms/SKILL.md +32 -0
  8. package/data/skills/anthropic-mcp-builder/SKILL.md +236 -0
  9. package/data/skills/anthropic-pdf/SKILL.md +314 -0
  10. package/data/skills/anthropic-pptx/SKILL.md +232 -0
  11. package/data/skills/anthropic-skill-creator/SKILL.md +485 -0
  12. package/data/skills/anthropic-webapp-testing/SKILL.md +96 -0
  13. package/data/skills/anthropic-xlsx/SKILL.md +292 -0
  14. package/data/skills/arxiv-database/SKILL.md +362 -0
  15. package/data/skills/astropy/SKILL.md +329 -0
  16. package/data/skills/ctx-advanced-evaluation/SKILL.md +402 -0
  17. package/data/skills/ctx-bdi-mental-states/SKILL.md +311 -0
  18. package/data/skills/ctx-context-compression/SKILL.md +272 -0
  19. package/data/skills/ctx-context-degradation/SKILL.md +206 -0
  20. package/data/skills/ctx-context-fundamentals/SKILL.md +201 -0
  21. package/data/skills/ctx-context-optimization/SKILL.md +195 -0
  22. package/data/skills/ctx-evaluation/SKILL.md +251 -0
  23. package/data/skills/ctx-filesystem-context/SKILL.md +287 -0
  24. package/data/skills/ctx-hosted-agents/SKILL.md +260 -0
  25. package/data/skills/ctx-memory-systems/SKILL.md +225 -0
  26. package/data/skills/ctx-multi-agent-patterns/SKILL.md +257 -0
  27. package/data/skills/ctx-project-development/SKILL.md +291 -0
  28. package/data/skills/ctx-tool-design/SKILL.md +271 -0
  29. package/data/skills/dhdna-profiler/SKILL.md +162 -0
  30. package/data/skills/generate-image/SKILL.md +183 -0
  31. package/data/skills/geomaster/SKILL.md +365 -0
  32. package/data/skills/get-available-resources/SKILL.md +275 -0
  33. package/data/skills/hamelsmu-build-review-interface/SKILL.md +96 -0
  34. package/data/skills/hamelsmu-error-analysis/SKILL.md +164 -0
  35. package/data/skills/hamelsmu-eval-audit/SKILL.md +183 -0
  36. package/data/skills/hamelsmu-evaluate-rag/SKILL.md +177 -0
  37. package/data/skills/hamelsmu-generate-synthetic-data/SKILL.md +131 -0
  38. package/data/skills/hamelsmu-validate-evaluator/SKILL.md +212 -0
  39. package/data/skills/hamelsmu-write-judge-prompt/SKILL.md +144 -0
  40. package/data/skills/hf-cli/SKILL.md +174 -0
  41. package/data/skills/hf-mcp/SKILL.md +178 -0
  42. package/data/skills/hugging-face-dataset-viewer/SKILL.md +121 -0
  43. package/data/skills/hugging-face-datasets/SKILL.md +542 -0
  44. package/data/skills/hugging-face-evaluation/SKILL.md +651 -0
  45. package/data/skills/hugging-face-jobs/SKILL.md +1042 -0
  46. package/data/skills/hugging-face-model-trainer/SKILL.md +717 -0
  47. package/data/skills/hugging-face-paper-pages/SKILL.md +239 -0
  48. package/data/skills/hugging-face-paper-publisher/SKILL.md +624 -0
  49. package/data/skills/hugging-face-tool-builder/SKILL.md +110 -0
  50. package/data/skills/hugging-face-trackio/SKILL.md +115 -0
  51. package/data/skills/hugging-face-vision-trainer/SKILL.md +593 -0
  52. package/data/skills/huggingface-gradio/SKILL.md +245 -0
  53. package/data/skills/matlab/SKILL.md +376 -0
  54. package/data/skills/modal/SKILL.md +381 -0
  55. package/data/skills/openai-cloudflare-deploy/SKILL.md +224 -0
  56. package/data/skills/openai-develop-web-game/SKILL.md +149 -0
  57. package/data/skills/openai-doc/SKILL.md +80 -0
  58. package/data/skills/openai-figma/SKILL.md +42 -0
  59. package/data/skills/openai-figma-implement-design/SKILL.md +264 -0
  60. package/data/skills/openai-gh-address-comments/SKILL.md +25 -0
  61. package/data/skills/openai-gh-fix-ci/SKILL.md +69 -0
  62. package/data/skills/openai-imagegen/SKILL.md +174 -0
  63. package/data/skills/openai-jupyter-notebook/SKILL.md +107 -0
  64. package/data/skills/openai-linear/SKILL.md +87 -0
  65. package/data/skills/openai-netlify-deploy/SKILL.md +247 -0
  66. package/data/skills/openai-notion-knowledge-capture/SKILL.md +56 -0
  67. package/data/skills/openai-notion-meeting-intelligence/SKILL.md +60 -0
  68. package/data/skills/openai-notion-research-documentation/SKILL.md +59 -0
  69. package/data/skills/openai-notion-spec-to-implementation/SKILL.md +58 -0
  70. package/data/skills/openai-openai-docs/SKILL.md +69 -0
  71. package/data/skills/openai-pdf/SKILL.md +67 -0
  72. package/data/skills/openai-playwright/SKILL.md +147 -0
  73. package/data/skills/openai-render-deploy/SKILL.md +479 -0
  74. package/data/skills/openai-screenshot/SKILL.md +267 -0
  75. package/data/skills/openai-security-best-practices/SKILL.md +86 -0
  76. package/data/skills/openai-security-ownership-map/SKILL.md +206 -0
  77. package/data/skills/openai-security-threat-model/SKILL.md +81 -0
  78. package/data/skills/openai-sentry/SKILL.md +123 -0
  79. package/data/skills/openai-sora/SKILL.md +178 -0
  80. package/data/skills/openai-speech/SKILL.md +144 -0
  81. package/data/skills/openai-spreadsheet/SKILL.md +145 -0
  82. package/data/skills/openai-transcribe/SKILL.md +81 -0
  83. package/data/skills/openai-vercel-deploy/SKILL.md +77 -0
  84. package/data/skills/openai-yeet/SKILL.md +28 -0
  85. package/data/skills/pennylane/SKILL.md +224 -0
  86. package/data/skills/polars-bio/SKILL.md +374 -0
  87. package/data/skills/primekg/SKILL.md +97 -0
  88. package/data/skills/pymatgen/SKILL.md +689 -0
  89. package/data/skills/qiskit/SKILL.md +273 -0
  90. package/data/skills/qutip/SKILL.md +316 -0
  91. package/data/skills/recursive-decomposition/SKILL.md +185 -0
  92. package/data/skills/rowan/SKILL.md +427 -0
  93. package/data/skills/scholar-evaluation/SKILL.md +298 -0
  94. package/data/skills/sentry-create-alert/SKILL.md +210 -0
  95. package/data/skills/sentry-fix-issues/SKILL.md +126 -0
  96. package/data/skills/sentry-pr-code-review/SKILL.md +105 -0
  97. package/data/skills/sentry-python-sdk/SKILL.md +317 -0
  98. package/data/skills/sentry-setup-ai-monitoring/SKILL.md +217 -0
  99. package/data/skills/stable-baselines3/SKILL.md +297 -0
  100. package/data/skills/sympy/SKILL.md +498 -0
  101. package/data/skills/trailofbits-ask-questions-if-underspecified/SKILL.md +85 -0
  102. package/data/skills/trailofbits-audit-context-building/SKILL.md +302 -0
  103. package/data/skills/trailofbits-differential-review/SKILL.md +220 -0
  104. package/data/skills/trailofbits-insecure-defaults/SKILL.md +117 -0
  105. package/data/skills/trailofbits-modern-python/SKILL.md +333 -0
  106. package/data/skills/trailofbits-property-based-testing/SKILL.md +123 -0
  107. package/data/skills/trailofbits-semgrep-rule-creator/SKILL.md +172 -0
  108. package/data/skills/trailofbits-sharp-edges/SKILL.md +292 -0
  109. package/data/skills/trailofbits-variant-analysis/SKILL.md +142 -0
  110. package/data/skills/transformers.js/SKILL.md +637 -0
  111. package/data/skills/writing/SKILL.md +419 -0
  112. package/dist/bgi.js +66 -2
  113. package/package.json +1 -1
@@ -0,0 +1,297 @@
1
+ ---
2
+ name: stable-baselines3
3
+ description: Production-ready reinforcement learning algorithms (PPO, SAC, DQN, TD3, DDPG, A2C) with scikit-learn-like API. Use for standard RL experiments, quick prototyping, and well-documented algorithm implementations. Best for single-agent RL with Gymnasium environments. For high-performance parallel training, multi-agent systems, or custom vectorized environments, use pufferlib instead.
4
+ license: MIT license
5
+ metadata:
6
+ skill-author: K-Dense Inc.
7
+ ---
8
+
9
+ # Stable Baselines3
10
+
11
+ ## Overview
12
+
13
+ Stable Baselines3 (SB3) is a PyTorch-based library providing reliable implementations of reinforcement learning algorithms. This skill provides comprehensive guidance for training RL agents, creating custom environments, implementing callbacks, and optimizing training workflows using SB3's unified API.
14
+
15
+ ## Core Capabilities
16
+
17
+ ### 1. Training RL Agents
18
+
19
+ **Basic Training Pattern:**
20
+
21
+ ```python
+ import gymnasium as gym
+ from stable_baselines3 import PPO
+
+ # Create environment
+ env = gym.make("CartPole-v1")
+
+ # Initialize agent
+ model = PPO("MlpPolicy", env, verbose=1)
+
+ # Train the agent
+ model.learn(total_timesteps=10000)
+
+ # Save the model
+ model.save("ppo_cartpole")
+
+ # Load the model (without prior instantiation)
+ model = PPO.load("ppo_cartpole", env=env)
+ ```
40
+
41
+ **Important Notes:**
42
+ - `total_timesteps` is a lower bound; actual training may exceed this due to batch collection
43
+ - Call `load()` on the algorithm class (e.g. `PPO.load(...)`); it is a class method that creates a new model, so do not call it on an existing instance
44
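+ - The replay buffer is NOT saved with the model (to keep files small); for off-policy algorithms, save it separately with `model.save_replay_buffer()` if training will be resumed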
45
+
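To illustrate why `total_timesteps` acts as a lower bound for on-policy algorithms, here is a minimal arithmetic sketch. It assumes training proceeds in whole rollouts of `n_steps * n_envs` transitions (2048 is PPO's default `n_steps`); the helper name `actual_timesteps` is hypothetical and not part of SB3.

```python
import math


def actual_timesteps(total_timesteps: int, n_steps: int = 2048, n_envs: int = 1) -> int:
    """Estimate collected timesteps, assuming training stops only after a full rollout."""
    rollout_size = n_steps * n_envs  # transitions gathered per rollout
    n_rollouts = math.ceil(total_timesteps / rollout_size)
    return n_rollouts * rollout_size


print(actual_timesteps(10_000))            # 10240
print(actual_timesteps(25_000, n_envs=4))  # 32768
```

Under these assumptions, a request for 10,000 steps ends after five rollouts of 2,048 steps each.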
46
+ **Algorithm Selection:**
47
+ Use `references/algorithms.md` for detailed algorithm characteristics and selection guidance. Quick reference:
48
+ - **PPO/A2C**: General-purpose, supports all action space types, good for multiprocessing
49
+ - **SAC/TD3**: Continuous control, off-policy, sample-efficient
50
+ - **DQN**: Discrete actions, off-policy
51
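+ - **HER**: Goal-conditioned tasks; implemented as a replay buffer (`HerReplayBuffer`) used with off-policy algorithms such as SAC, TD3, or DQN, not as a standalone algorithm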
52
+
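The quick reference above can be encoded as a small lookup. This is only an illustrative sketch: the `CANDIDATES` table and `candidate_algorithms` helper are hypothetical names, and the mapping simply restates the bullets rather than any SB3 API.

```python
# Hypothetical lookup table restating the quick reference (not an SB3 API).
CANDIDATES = {
    "discrete": ["PPO", "A2C", "DQN"],
    "continuous": ["PPO", "A2C", "SAC", "TD3"],
    "multi_discrete": ["PPO", "A2C"],
    "multi_binary": ["PPO", "A2C"],
}


def candidate_algorithms(action_space_kind: str) -> list[str]:
    """Return algorithm names worth trying for the given kind of action space."""
    try:
        return CANDIDATES[action_space_kind]
    except KeyError:
        raise ValueError(f"unknown action space kind: {action_space_kind!r}") from None


print(candidate_algorithms("continuous"))  # ['PPO', 'A2C', 'SAC', 'TD3']
```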
53
+ See `scripts/train_rl_agent.py` for a complete training template with best practices.
54
+
55
+ ### 2. Custom Environments
56
+
57
+ **Requirements:**
58
+ Custom environments must inherit from `gymnasium.Env` and implement:
59
+ - `__init__()`: Define action_space and observation_space
60
+ - `reset(seed, options)`: Return initial observation and info dict
61
+ - `step(action)`: Return observation, reward, terminated, truncated, info
62
+ - `render()`: Visualization (optional)
63
+ - `close()`: Cleanup resources
64
+
65
+ **Key Constraints:**
66
+ - Image observations must be `np.uint8` in range [0, 255]
67
+ - Use channel-first format when possible (channels, height, width)
68
+ - SB3 normalizes images automatically by dividing by 255
69
+ - Set `normalize_images=False` in policy_kwargs if pre-normalized
70
+ - SB3 does NOT support `Discrete` or `MultiDiscrete` spaces with `start!=0`
71
+
72
+ **Validation:**
73
+ ```python
+ from stable_baselines3.common.env_checker import check_env
+
+ check_env(env, warn=True)
+ ```
78
+
79
+ See `scripts/custom_env_template.py` for a complete custom environment template and `references/custom_environments.md` for comprehensive guidance.
80
+
81
+ ### 3. Vectorized Environments
82
+
83
+ **Purpose:**
84
+ Vectorized environments stack multiple independent environment instances into a single batched environment. This can speed up training and is required by certain wrappers (frame-stacking, normalization).
85
+
86
+ **Types:**
87
+ - **DummyVecEnv**: Sequential execution on current process (for lightweight environments)
88
+ - **SubprocVecEnv**: Parallel execution across processes (for compute-heavy environments)
89
+
90
+ **Quick Setup:**
91
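+ ```python
+ from stable_baselines3 import PPO
+ from stable_baselines3.common.env_util import make_vec_env
+ from stable_baselines3.common.vec_env import SubprocVecEnv
+
+ # SubprocVecEnv starts worker processes, so keep the entry point guarded
+ if __name__ == "__main__":
+     # Create 4 parallel environments
+     env = make_vec_env("CartPole-v1", n_envs=4, vec_env_cls=SubprocVecEnv)
+
+     model = PPO("MlpPolicy", env, verbose=1)
+     model.learn(total_timesteps=25000)
+ ```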
100
+
101
+ **Off-Policy Optimization:**
102
+ When using multiple environments with off-policy algorithms (SAC, TD3, DQN), set `gradient_steps=-1` to perform as many gradient steps as transitions collected since the last update (one per environment per vectorized step), balancing wall-clock time and sample efficiency.
103
+
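The effect of `gradient_steps=-1` can be sketched with simple arithmetic. This assumes `train_freq` is expressed in steps and that each vectorized step yields one transition per environment; the helper name `gradient_steps_per_update` is hypothetical.

```python
def gradient_steps_per_update(gradient_steps: int, train_freq: int, n_envs: int) -> int:
    """Gradient steps run after one collection phase, under the stated assumptions."""
    transitions_collected = train_freq * n_envs
    # A negative value means "match the number of collected transitions"
    return gradient_steps if gradient_steps >= 0 else transitions_collected


print(gradient_steps_per_update(1, train_freq=1, n_envs=4))   # 1
print(gradient_steps_per_update(-1, train_freq=1, n_envs=4))  # 4
```

With four environments and a fixed `gradient_steps=1`, only one update would be made per four transitions, which is why `-1` tends to be suggested for multi-environment off-policy training.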
104
+ **API Differences:**
105
+ - `reset()` returns only observations (info available in `vec_env.reset_infos`)
106
+ - `step()` returns 4-tuple: `(obs, rewards, dones, infos)` not 5-tuple
107
+ - Environments auto-reset after episodes
108
+ - Terminal observations available via `infos[env_idx]["terminal_observation"]`
109
+
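The conventions listed above can be imitated with a toy, pure-Python stand-in. `ToyEpisodeEnv` and `ToyVecEnv` are hypothetical classes written only to show the 4-tuple, auto-reset, and `terminal_observation` behaviour; they are not SB3 classes and skip everything else a real `VecEnv` does.

```python
class ToyEpisodeEnv:
    """Hypothetical environment whose episode ends after `length` steps."""

    def __init__(self, length: int):
        self.length = length
        self.t = 0

    def reset(self) -> int:
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        done = self.t >= self.length
        return self.t, 1.0, done, {}


class ToyVecEnv:
    """Imitates VecEnv conventions: batched 4-tuple, auto-reset, terminal_observation."""

    def __init__(self, envs):
        self.envs = envs

    def reset(self):
        return [env.reset() for env in self.envs]  # observations only

    def step(self, actions):
        obs, rewards, dones, infos = [], [], [], []
        for env, action in zip(self.envs, actions):
            o, r, done, info = env.step(action)
            if done:
                info["terminal_observation"] = o  # keep the last real observation
                o = env.reset()                   # auto-reset: return the new episode's first obs
            obs.append(o)
            rewards.append(r)
            dones.append(done)
            infos.append(info)
        return obs, rewards, dones, infos


vec_env = ToyVecEnv([ToyEpisodeEnv(2), ToyEpisodeEnv(3)])
obs = vec_env.reset()
for _ in range(2):
    obs, rewards, dones, infos = vec_env.step([0, 0])

print(obs)                               # [0, 2]
print(dones)                             # [True, False]
print(infos[0]["terminal_observation"])  # 2
```

The first environment finished on the second step, so its returned observation already belongs to the next episode while the final one is only reachable through `infos`.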
110
+ See `references/vectorized_envs.md` for detailed information on wrappers and advanced usage.
111
+
112
+ ### 4. Callbacks for Monitoring and Control
113
+
114
+ **Purpose:**
115
+ Callbacks enable monitoring metrics, saving checkpoints, implementing early stopping, and custom training logic without modifying core algorithms.
116
+
117
+ **Common Callbacks:**
118
+ - **EvalCallback**: Evaluate periodically and save best model
119
+ - **CheckpointCallback**: Save model checkpoints at intervals
120
+ - **StopTrainingOnRewardThreshold**: Stop when target reward reached
121
+ - **ProgressBarCallback**: Display training progress with timing
122
+
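When these callbacks are combined with vectorized environments, frequency arguments such as `save_freq` and `eval_freq` count callback calls, and each call covers `n_envs` timesteps. A small sketch of the usual adjustment follows; the helper name `per_call_freq` is hypothetical.

```python
def per_call_freq(desired_timesteps: int, n_envs: int) -> int:
    """Convert a frequency in total timesteps to a frequency in callback calls."""
    return max(desired_timesteps // n_envs, 1)


print(per_call_freq(10_000, n_envs=4))  # 2500
print(per_call_freq(3, n_envs=8))       # 1
```

The result would be passed as `save_freq` or `eval_freq` so that checkpoints and evaluations happen roughly every `desired_timesteps` environment steps.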
123
+ **Custom Callback Structure:**
124
+ ```python
+ from stable_baselines3.common.callbacks import BaseCallback
+
+ class CustomCallback(BaseCallback):
+     def _on_training_start(self):
+         # Called before first rollout
+         pass
+
+     def _on_step(self):
+         # Called after each environment step
+         # Return False to stop training
+         return True
+
+     def _on_rollout_end(self):
+         # Called at end of rollout
+         pass
+ ```
141
+
142
+ **Available Attributes:**
143
+ - `self.model`: The RL algorithm instance
144
+ - `self.num_timesteps`: Total environment steps
145
+ - `self.training_env`: The training environment
146
+
147
+ **Chaining Callbacks:**
148
+ ```python
+ from stable_baselines3.common.callbacks import CallbackList
+
+ # eval_callback, checkpoint_callback, and custom_callback are callback instances created beforehand
+ callback = CallbackList([eval_callback, checkpoint_callback, custom_callback])
+ model.learn(total_timesteps=10000, callback=callback)
+ ```
154
+
155
+ See `references/callbacks.md` for comprehensive callback documentation.
156
+
157
+ ### 5. Model Persistence and Inspection
158
+
159
+ **Saving and Loading:**
160
+ ```python
+ from stable_baselines3 import PPO
+ from stable_baselines3.common.vec_env import VecNormalize
+
+ # Save model
+ model.save("model_name")
+
+ # Save normalization statistics (if using VecNormalize)
+ vec_env.save("vec_normalize.pkl")
+
+ # Load model
+ model = PPO.load("model_name", env=env)
+
+ # Load normalization statistics
+ vec_env = VecNormalize.load("vec_normalize.pkl", vec_env)
+ ```
173
+
174
+ **Parameter Access:**
175
+ ```python
+ # Get parameters
+ params = model.get_parameters()
+
+ # Set parameters
+ model.set_parameters(params)
+
+ # Access PyTorch state dict
+ state_dict = model.policy.state_dict()
+ ```
185
+
186
+ ### 6. Evaluation and Recording
187
+
188
+ **Evaluation:**
189
+ ```python
+ from stable_baselines3.common.evaluation import evaluate_policy
+
+ mean_reward, std_reward = evaluate_policy(
+     model,
+     env,
+     n_eval_episodes=10,
+     deterministic=True
+ )
+ ```
199
+
200
+ **Video Recording:**
201
+ ```python
+ from stable_baselines3.common.vec_env import VecVideoRecorder
+
+ # Wrap environment with video recorder
+ # (env must be a VecEnv whose environments render with render_mode="rgb_array")
+ env = VecVideoRecorder(
+     env,
+     "videos/",
+     record_video_trigger=lambda x: x % 2000 == 0,
+     video_length=200
+ )
+ ```
212
+
213
+ See `scripts/evaluate_agent.py` for a complete evaluation and recording template.
214
+
215
+ ### 7. Advanced Features
216
+
217
+ **Learning Rate Schedules:**
218
+ ```python
+ def linear_schedule(initial_value):
+     def func(progress_remaining):
+         # progress_remaining goes from 1 to 0
+         return progress_remaining * initial_value
+     return func
+
+ model = PPO("MlpPolicy", env, learning_rate=linear_schedule(0.001))
+ ```
227
+
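To make the schedule concrete, the sketch below evaluates it at a few points of a run. It assumes `progress_remaining` is computed as `1 - num_timesteps / total_timesteps`; the helper `progress_remaining` here is a hypothetical stand-in for that bookkeeping, not an SB3 function.

```python
def linear_schedule(initial_value: float):
    def func(progress_remaining: float) -> float:
        # progress_remaining goes from 1 (start) to 0 (end)
        return progress_remaining * initial_value
    return func


def progress_remaining(num_timesteps: int, total_timesteps: int) -> float:
    """Hypothetical helper: fraction of training still to go."""
    return 1.0 - num_timesteps / total_timesteps


schedule = linear_schedule(0.001)
for steps in (0, 5_000, 10_000):
    lr = schedule(progress_remaining(steps, 10_000))
    print(f"step {steps}: learning rate {lr:.6f}")
```

The learning rate starts at the initial value, is halved midway, and reaches zero at the final step.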
228
+ **Multi-Input Policies (Dict Observations):**
229
+ ```python
+ model = PPO("MultiInputPolicy", env, verbose=1)
+ ```
232
+ Use when observations are dictionaries (e.g., combining images with sensor data).
233
+
234
+ **Hindsight Experience Replay:**
235
+ ```python
+ from stable_baselines3 import SAC, HerReplayBuffer
+
+ model = SAC(
+     "MultiInputPolicy",
+     env,
+     replay_buffer_class=HerReplayBuffer,
+     replay_buffer_kwargs=dict(
+         n_sampled_goal=4,
+         goal_selection_strategy="future",
+     ),
+ )
+ ```
248
+
249
+ **TensorBoard Integration:**
250
+ ```python
+ model = PPO("MlpPolicy", env, tensorboard_log="./tensorboard/")
+ model.learn(total_timesteps=10000)
+ ```
254
+
255
+ ## Workflow Guidance
256
+
257
+ **Starting a New RL Project:**
258
+
259
+ 1. **Define the problem**: Identify observation space, action space, and reward structure
260
+ 2. **Choose algorithm**: Use `references/algorithms.md` for selection guidance
261
+ 3. **Create/adapt environment**: Use `scripts/custom_env_template.py` if needed
262
+ 4. **Validate environment**: Always run `check_env()` before training
263
+ 5. **Set up training**: Use `scripts/train_rl_agent.py` as starting template
264
+ 6. **Add monitoring**: Implement callbacks for evaluation and checkpointing
265
+ 7. **Optimize performance**: Consider vectorized environments for speed
266
+ 8. **Evaluate and iterate**: Use `scripts/evaluate_agent.py` for assessment
267
+
268
+ **Common Issues:**
269
+
270
+ - **Memory errors**: Reduce `buffer_size` for off-policy algorithms or use fewer parallel environments
271
+ - **Slow training**: Consider SubprocVecEnv for parallel environments
272
+ - **Unstable training**: Try different algorithms, tune hyperparameters, or check reward scaling
273
+ - **Import errors**: Ensure `stable_baselines3` is installed: `uv pip install "stable-baselines3[extra]"` (the quotes keep shells such as zsh from expanding the brackets)
274
+
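For the memory-error case, a back-of-the-envelope estimate can show how strongly `buffer_size` drives memory use. The sketch below is a rough approximation only: it assumes each transition stores the observation, the next observation, the action, a float32 reward, and a one-byte done flag, and the helper name `replay_buffer_gib` is hypothetical.

```python
def replay_buffer_gib(buffer_size: int, obs_bytes: int, action_bytes: int,
                      store_next_obs: bool = True) -> float:
    """Approximate replay buffer size in GiB under the stated assumptions."""
    obs_copies = 2 if store_next_obs else 1
    per_transition = obs_bytes * obs_copies + action_bytes + 4 + 1  # + reward + done
    return buffer_size * per_transition / 2**30


# Example: four stacked 84x84 grayscale frames stored as uint8
image_obs_bytes = 4 * 84 * 84

print(round(replay_buffer_gib(1_000_000, image_obs_bytes, action_bytes=8), 1))
print(round(replay_buffer_gib(100_000, image_obs_bytes, action_bytes=8), 1))
```

Under these assumptions, a one-million-transition buffer of image observations needs tens of GiB, while cutting `buffer_size` tenfold brings it down to a few GiB.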
275
+ ## Resources
276
+
277
+ ### scripts/
278
+ - `train_rl_agent.py`: Complete training script template with best practices
279
+ - `evaluate_agent.py`: Agent evaluation and video recording template
280
+ - `custom_env_template.py`: Custom Gymnasium environment template
281
+
282
+ ### references/
283
+ - `algorithms.md`: Detailed algorithm comparison and selection guide
284
+ - `custom_environments.md`: Comprehensive custom environment creation guide
285
+ - `callbacks.md`: Complete callback system reference
286
+ - `vectorized_envs.md`: Vectorized environment usage and wrappers
287
+
288
+ ## Installation
289
+
290
+ ```bash
+ # Basic installation
+ uv pip install stable-baselines3
+
+ # With extra dependencies (TensorBoard, etc.); quotes keep the shell from expanding the brackets
+ uv pip install "stable-baselines3[extra]"
+ ```
297
+