ganbatte-os 0.2.37 → 0.2.38
- package/.gos/agents/profiles/ganbatte-os-master.md +100 -0
- package/.gos/libraries/caveman-rules.md +58 -0
- package/.gos/libraries/cloudflare-stack-kb.md +161 -0
- package/.gos/libraries/default-stack-kb.md +98 -0
- package/.gos/libraries/engineering-best-practices.md +208 -0
- package/.gos/libraries/gos-compress-setup.md +62 -0
- package/.gos/libraries/intake-questions-mom-test.md +91 -0
- package/.gos/libraries/lucide-icons-policy.md +174 -0
- package/.gos/libraries/security-best-practices.md +138 -0
- package/.gos/libraries/supabase-stack-kb.md +124 -0
- package/.gos/libraries/timer-pattern-spec.md +252 -0
- package/.gos/libraries/typeform-pattern-spec.md +204 -0
- package/.gos/libraries/ui-guardrails-checklist.md +144 -0
- package/.gos/libraries/visual-diff-lenses.md +114 -0
- package/.gos/skills/adr-tech-decisions/SKILL.md +166 -0
- package/.gos/skills/audit-screenshots/SKILL.md +21 -3
- package/.gos/skills/cloudflare-pages-setup/SKILL.md +180 -0
- package/.gos/skills/figma-print-diff/SKILL.md +165 -0
- package/.gos/skills/gos-caveman/SKILL.md +110 -0
- package/.gos/skills/gos-compress/SKILL.md +134 -0
- package/.gos/skills/gos-compress/scripts/compress.py +346 -0
- package/.gos/skills/gos-compress/scripts/setup.py +91 -0
- package/.gos/skills/idea-intake/SKILL.md +147 -0
- package/.gos/skills/plan-blueprint/SKILL.md +10 -3
- package/.gos/skills/plan-to-tasks/SKILL.md +28 -0
- package/.gos/skills/prd-from-intake/SKILL.md +94 -0
- package/.gos/skills/prototype-orchestrator/SKILL.md +120 -0
- package/.gos/skills/registry.json +12 -1
- package/.gos/skills/timer-component-pattern/SKILL.md +245 -0
- package/.gos/skills/typeform-form-pattern/SKILL.md +210 -0
- package/.gos/skills/ui-guardrails/SKILL.md +111 -0
- package/.gos/templates/intakeTemplate.md +41 -0
- package/.gos/templates/prdLeanTemplate.md +40 -0
- package/package.json +1 -1
@@ -0,0 +1,134 @@
+---
+name: gos-compress
+description: >
+  Compresses the INPUT of long prompts before they are sent to another skill or subagent, using
+  Microsoft's LLMLingua-2 (local XLM-RoBERTa model). Typical reduction is 40-60% without semantic
+  loss. Wrapper around sandeco-token-reduce. Requires a one-time setup (~1GB HuggingFace model).
+argument-hint: "<action: setup|compress|init> [args]"
+allowedTools: [Read, Write, Bash]
+sourceDocs:
+  - libraries/gos-compress-setup.md
+use-when:
+  - a long context (>2000 tokens) must be passed to a subagent/skill
+  - the user asks to "compress context" or "reduce tokens"
+  - prototype-orchestrator is advancing between phases with long PRD/intake documents
+do-not-use-for:
+  - short texts (<500 tokens; not worth it)
+  - code (loses precise semantics)
+  - frontmatter, structured tables (the parser breaks)
+metadata:
+  category: optimization
+  attribution: github sandeco/sandeco-token-reduce + microsoft/llmlingua-2
+---
+
+You are running as the **Input Compressor** via the `gos-compress` skill. Wrapper around sandeco-token-reduce + Microsoft's LLMLingua-2.
+
+## Pre-flight: one-time setup
+
+Before first use, the `.venv` must exist at `skills/gos-compress/.venv/`.
+
+### Check
+```bash
+test -d <skill-dir>/.venv && echo "ready" || echo "needs-setup"
+```
+
+### Setup (once only, ~1GB download)
+```bash
+python "<skill-dir>/scripts/setup.py"
+```
+
+Setup does the following:
+- Creates `.venv`
+- Installs `llmlingua` + `anthropic`
+- Downloads `microsoft/llmlingua-2-xlm-roberta-large-meetingbank` (~1GB) into `~/.cache/huggingface/`
+- Is idempotent
+
+## Actions
+
+### `setup`: installs/updates the model
+```
+*gos-compress setup
+```
+
+### `compress`: compresses a file or text
+```
+*gos-compress compress --file path/to/long.md --rate 0.4
+*gos-compress compress --file path/to/long.md --rate 0.4 --output path/compressed.md
+*gos-compress compress --text "long text" --rate 0.5
+```
+
+Parameters:
+| Flag | Default | Description |
+|------|---------|-------------|
+| `--rate` | 0.4 | Fraction of tokens to keep (0.5 light, 0.2 aggressive) |
+| `--file` | - | Path to the file |
+| `--text` | - | Inline text |
+| `--output` | - | Saves the result to a file |
+| `--json` | false | JSON output with stats |
+
+### `init`: alias for setup
+
+## Recommended rate by use case
+
+| Case | Rate | Rationale |
+|------|------|-----------|
+| Intake -> PRD | 0.5 | Preserve nuance |
+| PRD -> ADR | 0.4 | Decisions need context |
+| ADR -> Plan | 0.4 | Default |
+| Long thread -> summary | 0.3 | May lose detail |
+| Code (DO NOT USE) | - | Use Read directly |
+
+## Pipelines in G-OS
+
+### In prototype-orchestrator
+Between consecutive phases, compress the previous phase's output before passing it on:
+
+```
+intake.md (4000 tokens)
+  -> gos-compress rate=0.5
+  -> intake.compressed (2000 tokens)
+  -> input for prd-from-intake
+```
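Reviewer note: the pipeline figures follow from `--rate` being the fraction of tokens to keep. The arithmetic can be sketched as below. The helper name `estimate_kept_tokens` is hypothetical and not part of the package, and real output sizes will differ somewhat because forced tokens and digit-bearing tokens are always kept.

```python
def estimate_kept_tokens(origin_tokens: int, rate: float) -> int:
    """Rough target size: `rate` is the fraction of tokens to keep."""
    if not 0.0 < rate <= 1.0:
        raise ValueError("rate must be in (0, 1]")
    return round(origin_tokens * rate)


# The intake example from the skill: 4000 tokens at rate 0.5.
print(estimate_kept_tokens(4000, 0.5))
```

This is only a planning estimate; the `--json` stats report the actual counts after compression.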
+
+### In plan-blueprint
+When running with `--compress-context`, compress `docs/stack.md` + `docs/prd/PRD-NNN/prd.md` + `docs/adr/*.md` before injecting them into the context.
+
+## JSON output (programmatic)
+
+```json
+{
+  "compression": {
+    "compressed_prompt": "<text>",
+    "origin_tokens": 312,
+    "compressed_tokens": 124,
+    "ratio": 2.52,
+    "saving": 188,
+    "rate_requested": 0.4
+  }
+}
+```
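Reviewer note: a caller consuming the `--json` output could cross-check the reported stats as sketched here. The `summarize` helper is hypothetical; the field names are the ones in the JSON sample from the skill file, and the payload is inlined rather than read from the script's stdout.

```python
import json

# Inlined copy of the sample payload; in practice this would come from stdout.
raw = """
{
  "compression": {
    "compressed_prompt": "<text>",
    "origin_tokens": 312,
    "compressed_tokens": 124,
    "ratio": 2.52,
    "saving": 188,
    "rate_requested": 0.4
  }
}
"""


def summarize(payload: str) -> dict:
    """Recompute ratio, saving and kept fraction from the raw token counts."""
    stats = json.loads(payload)["compression"]
    origin, compressed = stats["origin_tokens"], stats["compressed_tokens"]
    return {
        "kept_fraction": round(compressed / origin, 3),
        "ratio": round(origin / compressed, 2),
        "saving": origin - compressed,
    }


summary = summarize(raw)
print(summary)
```

The recomputed `ratio` and `saving` agree with the sample, and the kept fraction lands close to the requested rate of 0.4.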
+
+## Anti-use
+
+- Code: LLMLingua removes semantically "less relevant" tokens, but code requires exact preservation.
+- Tables: the `|---|` structure breaks.
+- YAML frontmatter: same.
+- `idea-intake` skill outputs: a non-technical user may be confused by fragmented prose.
+
+## Automatic pre-processing (sandeco)
+
+Before compressing, sandeco applies `strip_markdown()`:
+- Removes `---`, table grids, `**bold**` (keeps the text), `##` headers (keeps the text), checkboxes, code fences, and runs of blank lines.
+
+Base reduction is ~6% before the model runs.
+
+## Preserved force tokens
+
+Always preserved by the model (sandeco config):
+- `\n`, `.`, `,`, `?`, `!`, `:`
+- PT negations: `nao`, `sem`, `nenhum`, `nunca`, `nem`, `nenhuma`
+- Tokens containing digits (IDs, ms, percentages) via `force_reserve_digit=True`
+
+## Input
+
+$ARGUMENTS
@@ -0,0 +1,346 @@
+"""
+compress.py: main script of the sandeco-token-reduce skill.
+
+Compresses text with LLMLingua-2 and optionally sends it to Claude.
+Auto-detects the GPU.
+
+Requires prior initialization with setup.py.
+
+Modes:
+  Compress only:      python compress.py --file texto.txt --rate 0.4
+  Compress + save:    python compress.py --file texto.txt --rate 0.4 --output comprimido.txt
+  Compress + Claude:  python compress.py --file texto.txt --rate 0.4 --ask "Summarize this text"
+  JSON output:        python compress.py --file texto.txt --rate 0.4 --json
+"""
+
+import argparse
+import json as json_mod
+import re
+import sys
+from pathlib import Path
+
+
+SKILL_DIR = Path(__file__).parent.parent
+VENV_DIR = SKILL_DIR / ".venv"
+
+
+# -- Environment check ---------------------------------------------------------
+
+def check_environment():
+    """Check that the venv exists and that we are running inside it."""
+    venv_python = VENV_DIR / "Scripts" / "python.exe"
+    if not venv_python.exists():
+        venv_python = VENV_DIR / "bin" / "python"
+
+    if not VENV_DIR.exists() or not venv_python.exists():
+        print("ERROR: The skill has not been initialized yet.", file=sys.stderr)
+        print(f"Run first: python \"{SKILL_DIR / 'scripts' / 'setup.py'}\"",
+              file=sys.stderr)
+        sys.exit(1)
+
+    if sys.prefix == sys.base_prefix:
+        print("ERROR: Run this script with the venv's Python:", file=sys.stderr)
+        print(f"  \"{venv_python}\" \"{__file__}\" ...", file=sys.stderr)
+        sys.exit(1)
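Reviewer note: the environment check relies on two standard-library facts: inside a virtual environment `sys.prefix` differs from `sys.base_prefix`, and the interpreter lives under `Scripts/` on Windows and `bin/` elsewhere. A minimal standalone sketch follows; the function names are hypothetical, and unlike the script it returns values instead of exiting.

```python
import sys
from pathlib import Path


def in_virtualenv() -> bool:
    """True when the running interpreter belongs to a virtual environment."""
    return sys.prefix != sys.base_prefix


def venv_interpreter(venv_dir: Path) -> Path:
    """Return the venv's Python path: Windows layout if present, else Unix."""
    win = venv_dir / "Scripts" / "python.exe"
    return win if win.exists() else venv_dir / "bin" / "python"


print(in_virtualenv(), venv_interpreter(Path(".venv")))
```

Returning values keeps the check testable; the script's choice to print to stderr and exit suits a CLI entry point.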
+
+
+# -- Compressor ----------------------------------------------------------------
+
+def detect_device() -> str:
+    try:
+        import torch
+        if torch.cuda.is_available():
+            return "cuda"
+    except ImportError:
+        pass
+    return "cpu"
+
+
+MODEL_NAME = "microsoft/llmlingua-2-xlm-roberta-large-meetingbank"
+CHUNK_MAX_TOKENS = 400
+COMPRESS_KWARGS = dict(
+    force_tokens=[
+        # Structural
+        "\n", ".", ",", "?", "!", ":", ";",
+        # PT negations
+        "não", "sem", "nenhum", "nunca", "nem", "nenhuma",
+        # Priority / obligation
+        "Must", "must", "Obrigatória", "obrigatória",
+        # Operational
+        "npm", "run", "dev",
+        # Resilience / fallback
+        "retry", "fallback", "timeout", "WAL", "graciosamente",
+        # Metrics
+        "baseline", "target", "Baseline", "Target",
+        # Inline code delimiters
+        "`",
+    ],
+    force_reserve_digit=True,
+    drop_consecutive=True,
+)
+
+
+def load_compressor():
+    from llmlingua import PromptCompressor
+
+    device = detect_device()
+    print(f"Loading LLMLingua-2 (device={device})...", file=sys.stderr)
+    return PromptCompressor(
+        model_name=MODEL_NAME,
+        use_llmlingua2=True,
+        device_map=device,
+    )
+
+
+def load_tokenizer():
+    from transformers import AutoTokenizer
+    return AutoTokenizer.from_pretrained(MODEL_NAME)
+
+
+# -- Markdown pre-processing ---------------------------------------------------
+
+def tables_to_keyvalue(text: str) -> str:
+    """Convert markdown tables to a key: value format that survives compression."""
+    lines = text.split('\n')
+    result = []
+    i = 0
+    while i < len(lines):
+        if lines[i].strip().startswith('|') and '|' in lines[i][1:]:
+            table_lines = []
+            while i < len(lines) and lines[i].strip().startswith('|'):
+                table_lines.append(lines[i])
+                i += 1
+            if len(table_lines) < 2:
+                result.extend(table_lines)
+                continue
+            headers = [h.strip() for h in table_lines[0].split('|')[1:-1]]
+            for tl in table_lines[1:]:
+                if re.match(r'^\s*\|[-:\s|]+\|\s*$', tl):
+                    continue
+                values = [v.strip() for v in tl.split('|')[1:-1]]
+                for h, v in zip(headers, values):
+                    if v:
+                        result.append(f"{h}: {v}")
+                result.append('')
+        else:
+            result.append(lines[i])
+            i += 1
+    return '\n'.join(result)
+
+
+def normalize_identifiers(text: str) -> str:
+    """Join ID prefixes to their digits so that force_reserve_digit protects the whole ID.
+    E.g.: G-01 -> G01, RF-01 -> RF01, EC-04 -> EC04"""
+    return re.sub(r'\b(G|RF|RNF|NG|EC)-(\d+)', r'\1\2', text)
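Reviewer note: the effect of the identifier regex is easiest to see on a sample. The sketch below reuses the pattern from the function; the sample sentence and the `ABC-9` counter-example are made up for illustration.

```python
import re

# Same pattern as in the diff: only the listed prefixes are joined to their digits.
ID_PATTERN = re.compile(r'\b(G|RF|RNF|NG|EC)-(\d+)')


def normalize_identifiers(text: str) -> str:
    return ID_PATTERN.sub(r'\1\2', text)


sample = "G-01 depends on RF-12 and RNF-3; see EC-04, not ABC-9."
normalized = normalize_identifiers(sample)
print(normalized)
```

Prefixes outside the fixed list keep their hyphen, so an ID scheme with other prefixes would need the alternation extended.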
+
+
+def strip_markdown(text: str) -> str:
+    """Remove markdown decoration while keeping the textual content."""
+    text = re.sub(r'^---+\s*$', '', text, flags=re.MULTILINE)       # separators
+    text = re.sub(r'\*\*(.+?)\*\*', r'\1', text)                    # **bold** -> text
+    text = re.sub(r'^#{1,6}\s+', '', text, flags=re.MULTILINE)      # ## headers -> text
+    text = re.sub(r'- \[[ x]\]\s*', '', text)                       # checkboxes
+    text = re.sub(r'^```\w*\s*$', '', text, flags=re.MULTILINE)     # code fences (triple)
+    text = re.sub(r'\n{3,}', '\n\n', text)                          # collapse blank lines
+    return text.strip()
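Reviewer note: a small run of the stripping rules on a made-up sample. This is a sketch, not the packaged function: the fence marker is built at runtime (`FENCE`, a hypothetical name) so the example contains no literal triple backtick; the regexes otherwise mirror the ones in the diff.

```python
import re

FENCE = "`" * 3  # assembled at runtime; equivalent to a literal code-fence marker


def strip_markdown(text: str) -> str:
    text = re.sub(r'^---+\s*$', '', text, flags=re.MULTILINE)           # separators
    text = re.sub(r'\*\*(.+?)\*\*', r'\1', text)                        # bold -> text
    text = re.sub(r'^#{1,6}\s+', '', text, flags=re.MULTILINE)          # headers -> text
    text = re.sub(r'- \[[ x]\]\s*', '', text)                           # checkboxes
    text = re.sub(r'^' + FENCE + r'\w*\s*$', '', text, flags=re.MULTILINE)  # fences
    text = re.sub(r'\n{3,}', '\n\n', text)                              # blank-line runs
    return text.strip()


sample = "\n".join([
    "## Title", "---", "**bold** text", "- [x] done",
    FENCE + "python", "x = 1", FENCE, "end", "", "", "", "last",
])
stripped = strip_markdown(sample)
print(stripped)
```

Only the decoration is removed: the code line `x = 1` inside the fence survives, which is one reason the skill advises against feeding code to the compressor.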
+
+
+# -- Chunking ------------------------------------------------------------------
+
+def split_into_chunks(text: str, tokenizer, max_tokens: int = CHUNK_MAX_TOKENS) -> list[str]:
+    """Split text into chunks of up to max_tokens, cutting at line boundaries."""
+    lines = text.split('\n')
+    chunks = []
+    current_lines = []
+    current_tokens = 0
+
+    for line in lines:
+        line_tokens = len(tokenizer.encode(line, add_special_tokens=False))
+        if current_tokens + line_tokens > max_tokens and current_lines:
+            chunks.append('\n'.join(current_lines))
+            current_lines = []
+            current_tokens = 0
+        current_lines.append(line)
+        current_tokens += line_tokens
+
+    if current_lines:
+        chunks.append('\n'.join(current_lines))
+
+    return chunks
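Reviewer note: the chunking logic can be exercised without downloading the model by substituting a stand-in tokenizer. `WhitespaceTokenizer` below is hypothetical (one token per whitespace-separated word) and only mimics the `encode` call the function uses; real token counts from the XLM-RoBERTa tokenizer will differ.

```python
class WhitespaceTokenizer:
    """Hypothetical stand-in for the real tokenizer: one token per word."""

    def encode(self, text: str, add_special_tokens: bool = False) -> list:
        return text.split()


def split_into_chunks(text, tokenizer, max_tokens=400):
    chunks, current_lines, current_tokens = [], [], 0
    for line in text.split('\n'):
        line_tokens = len(tokenizer.encode(line, add_special_tokens=False))
        # Close the current chunk before this line would push it over budget.
        if current_tokens + line_tokens > max_tokens and current_lines:
            chunks.append('\n'.join(current_lines))
            current_lines, current_tokens = [], 0
        current_lines.append(line)
        current_tokens += line_tokens
    if current_lines:
        chunks.append('\n'.join(current_lines))
    return chunks


chunks = split_into_chunks("a b c\nd e\nf g h i\nj", WhitespaceTokenizer(), max_tokens=5)
print(chunks)
```

Because cuts happen only at line boundaries, a single line longer than `max_tokens` is emitted as its own oversized chunk rather than being split.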
+
+
+# -- Compression ---------------------------------------------------------------
+
+def compress(text: str, rate: float) -> dict:
+    text = tables_to_keyvalue(text)
+    text = normalize_identifiers(text)
+    text = strip_markdown(text)
+    compressor = load_compressor()
+    tokenizer = load_tokenizer()
+
+    chunks = split_into_chunks(text, tokenizer)
+
+    if len(chunks) == 1:
+        result = compressor.compress_prompt(text, rate=rate, **COMPRESS_KWARGS)
+        return {
+            "compressed_prompt": result["compressed_prompt"],
+            "origin_tokens": result["origin_tokens"],
+            "compressed_tokens": result["compressed_tokens"],
+            "ratio": round(float(str(result["ratio"]).rstrip("x")), 2),
+            "saving": result["origin_tokens"] - result["compressed_tokens"],
+            "rate_requested": rate,
+        }
+
+    print(f"Text split into {len(chunks)} chunks (max {CHUNK_MAX_TOKENS} tokens each).",
+          file=sys.stderr)
+
+    compressed_parts = []
+    total_origin = 0
+    total_compressed = 0
+
+    for i, chunk in enumerate(chunks, 1):
+        r = compressor.compress_prompt(chunk, rate=rate, **COMPRESS_KWARGS)
+        compressed_parts.append(r["compressed_prompt"])
+        total_origin += r["origin_tokens"]
+        total_compressed += r["compressed_tokens"]
+        print(f"  Chunk {i}/{len(chunks)}: {r['origin_tokens']} -> {r['compressed_tokens']} tokens",
+              file=sys.stderr)
+
+    ratio = round(total_origin / total_compressed, 2) if total_compressed > 0 else 0.0
+
+    return {
+        "compressed_prompt": "\n".join(compressed_parts),
+        "origin_tokens": total_origin,
+        "compressed_tokens": total_compressed,
+        "ratio": ratio,
+        "saving": total_origin - total_compressed,
+        "rate_requested": rate,
+    }
+
+
+# -- Claude --------------------------------------------------------------------
+
+def ask_claude(compressed_text: str, question: str, model: str, max_tokens: int) -> dict:
+    import anthropic
+
+    client = anthropic.Anthropic()
+    prompt = f"""Context (compressed via LLMLingua-2; irrelevant tokens were removed,
+but the meaning was preserved):
+
+{compressed_text}
+
+Question: {question}"""
+
+    response = client.messages.create(
+        model=model,
+        max_tokens=max_tokens,
+        messages=[{"role": "user", "content": prompt}],
+    )
+
+    answer = ""
+    for block in response.content:
+        if block.type == "text":
+            answer += block.text
+
+    return {
+        "answer": answer,
+        "model": model,
+        "input_tokens": response.usage.input_tokens,
+        "output_tokens": response.usage.output_tokens,
+    }
+
+
+# -- CLI -----------------------------------------------------------------------
+
+def parse_args():
+    p = argparse.ArgumentParser(
+        description="Compresses tokens with LLMLingua-2 (Microsoft)"
+    )
+    # Input
+    group = p.add_mutually_exclusive_group(required=True)
+    group.add_argument("--text", help="Text to compress")
+    group.add_argument("--file", help="Path to a text file")
+
+    # Compression
+    p.add_argument("--rate", type=float, default=0.4,
+                   help="Fraction of tokens to keep: 0.5=light, 0.4=default, 0.2=aggressive")
+
+    # Output
+    p.add_argument("--output", "-o", help="Save the compressed text to this file")
+    p.add_argument("--json", action="store_true",
+                   help="JSON output (useful for programmatic integration)")
+
+    # Claude (optional)
+    p.add_argument("--ask", help="Question to send to Claude with the compressed context")
+    p.add_argument("--model", default="claude-sonnet-4-6",
+                   help="Claude model (default: claude-sonnet-4-6)")
+    p.add_argument("--max-tokens", type=int, default=4096,
+                   help="Max tokens in Claude's response (default: 4096)")
+
+    return p.parse_args()
+
+
+def main():
+    check_environment()
+
+    args = parse_args()
+
+    # Load the text
+    if args.file:
+        path = Path(args.file)
+        if not path.exists():
+            print(f"ERROR: file not found: {path}", file=sys.stderr)
+            sys.exit(1)
+        text = path.read_text(encoding="utf-8")
+    else:
+        text = args.text
+
+    if not text.strip():
+        print("ERROR: empty text.", file=sys.stderr)
+        sys.exit(1)
+
+    # Compress
+    result = compress(text, args.rate)
+
+    # Save if requested
+    if args.output:
+        out = Path(args.output)
+        out.parent.mkdir(parents=True, exist_ok=True)
+        out.write_text(result["compressed_prompt"], encoding="utf-8")
+        print(f"Saved to: {out.resolve()}", file=sys.stderr)
+
+    # Ask Claude if requested
+    claude_result = None
+    if args.ask:
+        claude_result = ask_claude(
+            result["compressed_prompt"], args.ask, args.model, args.max_tokens
+        )
+
+    # Output
+    if args.json:
+        output = {
+            "compression": result,
+        }
+        if claude_result:
+            output["claude"] = claude_result
+        print(json_mod.dumps(output, ensure_ascii=False, indent=2))
+    else:
+        print(f"\nOriginal tokens:   {result['origin_tokens']}")
+        print(f"Compressed tokens: {result['compressed_tokens']}")
+        print(f"Compression ratio: {result['ratio']}x")
+        print(f"Tokens saved:      {result['saving']}")
+
+        if not args.output:
+            print("\n--- Compressed text ---")
+            print(result["compressed_prompt"])
+            print("-----------------------")
+
+        if claude_result:
+            print(f"\n--- Claude's response ({args.model}) ---")
+            print(claude_result["answer"])
+            print(f"\nClaude tokens: input {claude_result['input_tokens']}, "
+                  f"output {claude_result['output_tokens']}")
+
+
+if __name__ == "__main__":
+    main()
@@ -0,0 +1,91 @@
+"""
+setup.py: initialization of the sandeco-token-reduce skill.
+
+Creates the .venv, installs the dependencies, and downloads the LLMLingua-2 model.
+Must be run once before using the skill.
+
+Usage:
+    python setup.py
+"""
+
+import subprocess
+import sys
+from pathlib import Path
+
+SKILL_DIR = Path(__file__).parent.parent
+VENV_DIR = SKILL_DIR / ".venv"
+PACKAGES = ["llmlingua", "anthropic"]
+
+
+def venv_python():
+    """Return the path of the Python inside the .venv (Windows and Unix)."""
+    win = VENV_DIR / "Scripts" / "python.exe"
+    unix = VENV_DIR / "bin" / "python"
+    return win if win.exists() else unix
+
+
+def run(cmd, **kwargs):
+    print(f" $ {' '.join(str(c) for c in cmd)}")
+    result = subprocess.run(cmd, **kwargs)
+    if result.returncode != 0:
+        print(f"\nERROR: command failed with code {result.returncode}")
+        sys.exit(result.returncode)
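Reviewer note: the `run` helper echoes each command and aborts setup on the first failure. A testable variant of the same pattern is sketched below. The name `run_checked` is hypothetical, and it raises an exception where the script calls `sys.exit`, so a caller can decide how to react.

```python
import subprocess
import sys


def run_checked(cmd: list) -> int:
    """Echo a command, run it, and raise if it exits with a non-zero status."""
    print(" $ " + " ".join(str(c) for c in cmd))
    completed = subprocess.run(cmd)
    if completed.returncode != 0:
        raise RuntimeError(f"command failed with exit code {completed.returncode}")
    return completed.returncode


# Use the current interpreter so the example needs nothing beyond Python itself.
run_checked([sys.executable, "-c", "print('ok')"])
```

`subprocess.run(cmd, check=True)` would give similar fail-fast behavior by raising `CalledProcessError`; the explicit check here mirrors the structure of the script.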
+
+
+def main():
+    print("=" * 60)
+    print(" sandeco-token-reduce: Initialization")
+    print("=" * 60)
+    print(f"\n Skill dir: {SKILL_DIR}")
+    print(f" Venv dir: {VENV_DIR}\n")
+
+    # 1. Create the venv
+    if not VENV_DIR.exists():
+        print("[1/4] Creating virtual environment (.venv)...")
+        run([sys.executable, "-m", "venv", str(VENV_DIR)])
+        print(" OK\n")
+    else:
+        print("[1/4] .venv already exists, skipping creation.\n")
+
+    python = venv_python()
+    if not python.exists():
+        print(f"ERROR: venv Python not found at {python}")
+        sys.exit(1)
+
+    # 2. Upgrade pip
+    print("[2/4] Upgrading pip...")
+    run([str(python), "-m", "pip", "install", "--upgrade", "pip", "-q"])
+    print(" OK\n")
+
+    # 3. Install packages
+    print(f"[3/4] Installing dependencies: {', '.join(PACKAGES)}")
+    print(" (this may take a few minutes the first time)")
+    run([str(python), "-m", "pip", "install"] + PACKAGES + ["-q"])
+    print(" OK\n")
+
+    # 4. Download the LLMLingua-2 model (runs a minimal compression to force the download)
+    print("[4/4] Downloading LLMLingua-2 model (~1 GB the first time)...")
+    print(" (the model is cached in ~/.cache/huggingface/)")
+    download_script = (
+        "from llmlingua import PromptCompressor; "
+        "c = PromptCompressor("
+        "model_name='microsoft/llmlingua-2-xlm-roberta-large-meetingbank', "
+        "use_llmlingua2=True, device_map='cpu'); "
+        "r = c.compress_prompt('Hello world test.', rate=0.5); "
+        "print(' Model loaded successfully!')"
+    )
+    run([str(python), "-c", download_script])
+    print()
+
+    # Summary
+    print("=" * 60)
+    print(" Initialization complete!")
+    print("=" * 60)
+    print(f"\n venv Python: {python}")
+    print("\n The skill is ready to use.")
+    print(" Ask Claude to compress a text!")
+    print()
+
+
+if __name__ == "__main__":
+    main()