npm - packwise-skills - Versions diffs - 1.0.0 → 1.2.0 - Mend

packwise-skills 1.0.0 → 1.2.0

Files changed (53)

package/.cursorrules +23 -23
package/CLAUDE.md +25 -25
package/LICENSE +21 -0
package/README.md +404 -295
package/audit.md +224 -224
package/bin/packwise.js +322 -155
package/install.sh +123 -0
package/package.json +32 -31
package/skill.md +944 -719
package/sub-skills/ai/local-llm.md +183 -183
package/sub-skills/ai/python-ml.md +164 -164
package/sub-skills/backend/go-server.md +184 -184
package/sub-skills/backend/java-spring.md +241 -241
package/sub-skills/backend/node-server.md +164 -164
package/sub-skills/backend/php-laravel.md +175 -175
package/sub-skills/backend/python-server.md +164 -164
package/sub-skills/backend/rust-backend.md +118 -118
package/sub-skills/cli/python-cli.md +236 -236
package/sub-skills/cli/sdk-library.md +497 -497
package/sub-skills/cloud/ci-cd-pipelines.md +350 -350
package/sub-skills/cloud/docker.md +191 -191
package/sub-skills/cloud/kubernetes.md +277 -277
package/sub-skills/cloud/payment-integration.md +307 -307
package/sub-skills/cross-platform/multiplatform.md +252 -252
package/sub-skills/desktop/electron.md +783 -783
package/sub-skills/desktop/game-dev.md +443 -443
package/sub-skills/desktop/native-app.md +123 -123
package/sub-skills/desktop/scenarios.md +443 -443
package/sub-skills/desktop/smart-platforms.md +324 -324
package/sub-skills/desktop/tauri.md +428 -428
package/sub-skills/desktop/vr-ar.md +252 -252
package/sub-skills/desktop/web-to-desktop.md +153 -153
package/sub-skills/embedded/car-infotainment.md +129 -129
package/sub-skills/embedded/esp32.md +184 -184
package/sub-skills/embedded/ros.md +150 -150
package/sub-skills/embedded/stm32.md +160 -160
package/sub-skills/mobile/android.md +322 -322
package/sub-skills/mobile/capacitor.md +232 -232
package/sub-skills/mobile/flutter-mobile.md +138 -138
package/sub-skills/mobile/harmonyos.md +150 -150
package/sub-skills/mobile/ios.md +245 -245
package/sub-skills/mobile/react-native.md +443 -443
package/sub-skills/mobile/wearables.md +230 -230
package/sub-skills/plugins/browser-extension.md +308 -308
package/sub-skills/plugins/jetbrains-plugin.md +226 -226
package/sub-skills/plugins/vscode-extension.md +204 -204
package/sub-skills/security/security-tools.md +174 -174
package/sub-skills/web/monorepo.md +274 -274
package/sub-skills/web/pwa.md +220 -220
package/sub-skills/web/serverless-edge.md +295 -295
package/sub-skills/web/spa.md +266 -266
package/sub-skills/web/ssr.md +228 -228
package/sub-skills/web/wasm.md +243 -243
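
Most entries in the list above report equal added and removed line counts, while a few (for example `package/LICENSE` and `package/bin/packwise.js`) do not. A minimal sketch of how such a list could be triaged follows. The helper names `parse_stats` and `split_by_balance` and the one-space line layout are assumptions made for illustration, not part of the package or of the diff tool.

```python
import re

# Hypothetical pattern for lines shaped like "package/README.md +404 -295".
STAT_LINE = re.compile(r"^(?P<path>\S+) \+(?P<added>\d+) -(?P<removed>\d+)$")


def parse_stats(lines):
    """Return (path, added, removed) tuples for every line that matches."""
    stats = []
    for line in lines:
        match = STAT_LINE.match(line.strip())
        if match:
            stats.append(
                (match["path"], int(match["added"]), int(match["removed"]))
            )
    return stats


def split_by_balance(stats):
    """Separate entries with equal add/remove counts from the rest."""
    balanced = [s for s in stats if s[1] == s[2]]
    unbalanced = [s for s in stats if s[1] != s[2]]
    return balanced, unbalanced


# A few entries copied from the list above.
sample = [
    "package/.cursorrules +23 -23",
    "package/LICENSE +21 -0",
    "package/README.md +404 -295",
    "package/bin/packwise.js +322 -155",
    "package/sub-skills/ai/python-ml.md +164 -164",
]

balanced, unbalanced = split_by_balance(parse_stats(sample))
print("balanced:", [path for path, _, _ in balanced])
print("unbalanced:", [path for path, _, _ in unbalanced])
```

Equal counts alone do not show that a file's text is unchanged; they only mark which files are worth checking for a whole-file rewrite, such as a line-ending or whitespace change.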

package/sub-skills/ai/python-ml.md CHANGED

@@ -1,164 +1,164 @@
-# Python ML Model Packaging Sub-Skill
-Package, optimize, and serve machine learning models for production.
-**Current version**: Python 3.12+ / PyTorch 2.x / TensorFlow 2.17+ / ONNX Runtime 1.19+ (2025-2026)
-## When to Use
-- Trained ML model deployed as API service
-- Image recognition / NLP / recommendation system
-- Data analysis pipeline
-- Research model publication and reproducibility
-- Edge ML deployment (mobile/embedded)
-## Deployment Options
-### FastAPI + Uvicorn (Recommended for APIs)
-```python
-from fastapi import FastAPI
-from transformers import pipeline
-import torch
-app = FastAPI()
-# Load model once at startup
-device = "cuda" if torch.cuda.is_available() else "cpu"
-classifier = pipeline("text-classification", model="my-model", device=device)
-@app.post("/predict")
-async def predict(text: str):
-    return classifier(text)
-@app.get("/health")
-async def health():
-    return {"status": "ok", "device": device}
-```
-### Flask + Gunicorn
-```python
-from flask import Flask, request, jsonify
-import pickle
-app = Flask(__name__)
-model = pickle.load(open("model.pkl", "rb"))
-@app.route("/predict", methods=["POST"])
-def predict():
-    data = request.json
-    result = model.predict([data["features"]])
-    return jsonify({"prediction": result.tolist()})
-```
-### ONNX Runtime (High-Performance Inference)
-```python
-import onnxruntime as ort
-import numpy as np
-# CPU inference
-session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
-# GPU inference (CUDA)
-session = ort.InferenceSession("model.onnx", providers=["CUDAExecutionProvider"])
-# TensorRT (fastest, NVIDIA GPU only)
-session = ort.InferenceSession("model.onnx", providers=["TensorrtExecutionProvider", "CUDAExecutionProvider"])
-result = session.run(None, {"input": input_data.astype(np.float32)})
-```
-### Model Export
-```python
-# PyTorch → ONNX
-import torch
-model = torch.load("model.pt")
-model.eval()
-dummy_input = torch.randn(1, 3, 224, 224)
-torch.onnx.export(model, dummy_input, "model.onnx", opset_version=17)
-# TensorFlow → SavedModel
-model.save("saved_model/")
-# TensorFlow → TFLite (mobile)
-converter = tf.lite.TFLiteConverter.from_saved_model("saved_model/")
-converter.optimizations = [tf.lite.Optimize.DEFAULT]
-tflite_model = converter.convert()
-with open("model.tflite", "wb") as f:
-    f.write(tflite_model)
-```
-## Model Formats
-| Format | Framework | Size | Inference Speed | Best For |
-|--------|----------|------|----------------|----------|
-| PyTorch (.pt) | PyTorch | Large | Standard | Research, training |
-| ONNX (.onnx) | Cross-framework | Medium | Fast | Production APIs |
-| SavedModel | TensorFlow | Large | Standard | TF Serving |
-| TFLite (.tflite) | TensorFlow Lite | Small | Fast (mobile) | Mobile/edge |
-| TensorRT | NVIDIA | Medium | Fastest (GPU) | NVIDIA GPU servers |
-| GGUF | llama.cpp | Small | Fast (CPU) | Local LLM inference |
-| CoreML (.mlmodel) | Apple | Medium | Fast (Apple) | iOS/macOS on-device |
-| OpenVINO | Intel | Medium | Fast (Intel CPU) | Intel hardware |
-## Docker (GPU)
-```dockerfile
-FROM nvidia/cuda:12.4-runtime-ubuntu22.04
-RUN apt-get update && apt-get install -y python3 python3-pip && rm -rf /var/lib/apt/lists/*
-WORKDIR /app
-COPY requirements.txt .
-RUN pip3 install --no-cache-dir -r requirements.txt
-COPY . .
-RUN groupadd -r appuser && useradd -r -g appuser appuser
-USER appuser
-EXPOSE 8000
-HEALTHCHECK --interval=30s --timeout=3s CMD curl -f http://localhost:8000/health || exit 1
-CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
-```
-```dockerfile
-# CPU-only (smaller)
-FROM python:3.13-slim
-WORKDIR /app
-COPY requirements.txt .
-RUN pip install --no-cache-dir -r requirements.txt
-COPY . .
-RUN groupadd -r appuser && useradd -r -g appuser appuser
-USER appuser
-EXPOSE 8000
-CMD ["uvicorn", "app:app", "--host", "0.0.0.0"]
-```
-## Model Optimization
-| Technique | Description | Typical Savings |
-|-----------|-------------|----------------|
-| Quantization (INT8) | Reduce precision from FP32 to INT8 | 4x smaller, 2-3x faster |
-| Quantization (INT4) | Further reduction | 8x smaller (LLMs) |
-| Pruning | Remove redundant weights | 2-10x smaller |
-| Knowledge Distillation | Train smaller model from larger | Custom |
-| ONNX Optimization | Graph optimization | 10-30% faster |
-| TensorRT | NVIDIA GPU optimization | 2-5x faster |
-```python
-# ONNX quantization
-from onnxruntime.quantization import quantize_dynamic, QuantType
-quantize_dynamic("model.onnx", "model_quant.onnx", weight_type=QuantType.QUInt8)
-```
-## Common Pitfalls
-| Issue | Fix |
-|-------|-----|
-| Model too large for deployment | Quantize (INT8/INT4); use ONNX; prune |
-| GPU out of memory | Reduce batch size; use gradient checkpointing; quantize |
-| Dependency conflicts | Use Docker isolation; pin exact versions |
-| Slow inference | Convert to ONNX or TensorRT; use GPU; batch requests |
-| CUDA version mismatch | Match PyTorch CUDA version with system CUDA; use Docker |
-| Model loading time too long | Load once at startup; use model server (TorchServe, TF Serving) |
-| Inconsistent results between environments | Fix random seeds; pin all dependency versions |
-| ONNX export fails | Check opset version; use `torch.onnx.export` with `opset_version=17` |
+# Python ML Model Packaging Sub-Skill
+Package, optimize, and serve machine learning models for production.
+**Current version**: Python 3.12+ / PyTorch 2.x / TensorFlow 2.17+ / ONNX Runtime 1.19+ (2025-2026)
+## When to Use
+- Trained ML model deployed as API service
+- Image recognition / NLP / recommendation system
+- Data analysis pipeline
+- Research model publication and reproducibility
+- Edge ML deployment (mobile/embedded)
+## Deployment Options
+### FastAPI + Uvicorn (Recommended for APIs)
+```python
+from fastapi import FastAPI
+from transformers import pipeline
+import torch
+app = FastAPI()
+# Load model once at startup
+device = "cuda" if torch.cuda.is_available() else "cpu"
+classifier = pipeline("text-classification", model="my-model", device=device)
+@app.post("/predict")
+async def predict(text: str):
+    return classifier(text)
+@app.get("/health")
+async def health():
+    return {"status": "ok", "device": device}
+```
+### Flask + Gunicorn
+```python
+from flask import Flask, request, jsonify
+import pickle
+app = Flask(__name__)
+model = pickle.load(open("model.pkl", "rb"))
+@app.route("/predict", methods=["POST"])
+def predict():
+    data = request.json
+    result = model.predict([data["features"]])
+    return jsonify({"prediction": result.tolist()})
+```
+### ONNX Runtime (High-Performance Inference)
+```python
+import onnxruntime as ort
+import numpy as np
+# CPU inference
+session = ort.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])
+# GPU inference (CUDA)
+session = ort.InferenceSession("model.onnx", providers=["CUDAExecutionProvider"])
+# TensorRT (fastest, NVIDIA GPU only)
+session = ort.InferenceSession("model.onnx", providers=["TensorrtExecutionProvider", "CUDAExecutionProvider"])
+result = session.run(None, {"input": input_data.astype(np.float32)})
+```
+### Model Export
+```python
+# PyTorch → ONNX
+import torch
+model = torch.load("model.pt")
+model.eval()
+dummy_input = torch.randn(1, 3, 224, 224)
+torch.onnx.export(model, dummy_input, "model.onnx", opset_version=17)
+# TensorFlow → SavedModel
+model.save("saved_model/")
+# TensorFlow → TFLite (mobile)
+converter = tf.lite.TFLiteConverter.from_saved_model("saved_model/")
+converter.optimizations = [tf.lite.Optimize.DEFAULT]
+tflite_model = converter.convert()
+with open("model.tflite", "wb") as f:
+    f.write(tflite_model)
+```
+## Model Formats
+| Format | Framework | Size | Inference Speed | Best For |
+|--------|----------|------|----------------|----------|
+| PyTorch (.pt) | PyTorch | Large | Standard | Research, training |
+| ONNX (.onnx) | Cross-framework | Medium | Fast | Production APIs |
+| SavedModel | TensorFlow | Large | Standard | TF Serving |
+| TFLite (.tflite) | TensorFlow Lite | Small | Fast (mobile) | Mobile/edge |
+| TensorRT | NVIDIA | Medium | Fastest (GPU) | NVIDIA GPU servers |
+| GGUF | llama.cpp | Small | Fast (CPU) | Local LLM inference |
+| CoreML (.mlmodel) | Apple | Medium | Fast (Apple) | iOS/macOS on-device |
+| OpenVINO | Intel | Medium | Fast (Intel CPU) | Intel hardware |
+## Docker (GPU)
+```dockerfile
+FROM nvidia/cuda:12.4-runtime-ubuntu22.04
+RUN apt-get update && apt-get install -y python3 python3-pip && rm -rf /var/lib/apt/lists/*
+WORKDIR /app
+COPY requirements.txt .
+RUN pip3 install --no-cache-dir -r requirements.txt
+COPY . .
+RUN groupadd -r appuser && useradd -r -g appuser appuser
+USER appuser
+EXPOSE 8000
+HEALTHCHECK --interval=30s --timeout=3s CMD curl -f http://localhost:8000/health || exit 1
+CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
+```
+```dockerfile
+# CPU-only (smaller)
+FROM python:3.13-slim
+WORKDIR /app
+COPY requirements.txt .
+RUN pip install --no-cache-dir -r requirements.txt
+COPY . .
+RUN groupadd -r appuser && useradd -r -g appuser appuser
+USER appuser
+EXPOSE 8000
+CMD ["uvicorn", "app:app", "--host", "0.0.0.0"]
+```
+## Model Optimization
+| Technique | Description | Typical Savings |
+|-----------|-------------|----------------|
+| Quantization (INT8) | Reduce precision from FP32 to INT8 | 4x smaller, 2-3x faster |
+| Quantization (INT4) | Further reduction | 8x smaller (LLMs) |
+| Pruning | Remove redundant weights | 2-10x smaller |
+| Knowledge Distillation | Train smaller model from larger | Custom |
+| ONNX Optimization | Graph optimization | 10-30% faster |
+| TensorRT | NVIDIA GPU optimization | 2-5x faster |
+```python
+# ONNX quantization
+from onnxruntime.quantization import quantize_dynamic, QuantType
+quantize_dynamic("model.onnx", "model_quant.onnx", weight_type=QuantType.QUInt8)
+```
+## Common Pitfalls
+| Issue | Fix |
+|-------|-----|
+| Model too large for deployment | Quantize (INT8/INT4); use ONNX; prune |
+| GPU out of memory | Reduce batch size; use gradient checkpointing; quantize |
+| Dependency conflicts | Use Docker isolation; pin exact versions |
+| Slow inference | Convert to ONNX or TensorRT; use GPU; batch requests |
+| CUDA version mismatch | Match PyTorch CUDA version with system CUDA; use Docker |
+| Model loading time too long | Load once at startup; use model server (TorchServe, TF Serving) |
+| Inconsistent results between environments | Fix random seeds; pin all dependency versions |
+| ONNX export fails | Check opset version; use `torch.onnx.export` with `opset_version=17` |
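
The removed and added sides of the hunk above appear to carry the same visible text. The sketch below shows one way to check that mechanically. The function names `split_hunk` and `sides_match` are hypothetical, and the input is assumed to be hunk body lines only, without `---`/`+++` file headers, as in this excerpt.

```python
def split_hunk(lines):
    """Split unified-diff body lines into (removed, added) content lists.

    Assumes no '---' / '+++' file header lines are present; they would
    otherwise be counted as removed or added content.
    """
    removed, added = [], []
    for line in lines:
        if line.startswith("@@"):
            continue  # hunk header carries line ranges, not content
        if line.startswith("-"):
            removed.append(line[1:])
        elif line.startswith("+"):
            added.append(line[1:])
    return removed, added


def sides_match(lines):
    """True when the removed and added sides hold identical text."""
    removed, added = split_hunk(lines)
    return removed == added


# A shortened stand-in for the hunk above.
hunk = [
    "@@ -1,164 +1,164 @@",
    "-# Python ML Model Packaging Sub-Skill",
    "-## When to Use",
    "+# Python ML Model Packaging Sub-Skill",
    "+## When to Use",
]

print(sides_match(hunk))
```

A match on text copied from a rendered page only means the visible characters agree. The underlying change could still be line endings, trailing whitespace, or another difference the page does not display, and this excerpt does not say which.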