create-genia-os 2.1.1 → 2.2.0
- package/README.md +154 -154
- package/package.json +4 -2
- package/template/.claude/CLAUDE.md +215 -215
- package/template/.claude/agent-memory/analyst/MEMORY.md +20 -20
- package/template/.claude/agent-memory/architect/MEMORY.md +20 -20
- package/template/.claude/agent-memory/dev/MEMORY.md +20 -20
- package/template/.claude/agent-memory/devops/MEMORY.md +20 -20
- package/template/.claude/agent-memory/pm/MEMORY.md +20 -20
- package/template/.claude/agent-memory/po/MEMORY.md +20 -20
- package/template/.claude/agent-memory/qa/MEMORY.md +20 -20
- package/template/.claude/agent-memory/reviewer/MEMORY.md +20 -20
- package/template/.claude/agent-memory/sm/MEMORY.md +20 -20
- package/template/.claude/hooks/enforce-git-push-authority.py +70 -70
- package/template/.claude/hooks/metrics-tracker.cjs +65 -0
- package/template/.claude/hooks/precompact-session-digest.cjs +87 -87
- package/template/.claude/hooks/sql-governance.py +65 -65
- package/template/.claude/hooks/synapse-engine.cjs +122 -122
- package/template/.claude/hooks/write-path-validation.py +59 -59
- package/template/.claude/rules/agent-authority.md +39 -39
- package/template/.claude/rules/agent-handoff.md +71 -71
- package/template/.claude/rules/agent-memory.md +61 -61
- package/template/.claude/rules/ids-principles.md +52 -52
- package/template/.claude/rules/mcp-usage.md +49 -49
- package/template/.claude/rules/new-project.md +157 -0
- package/template/.claude/rules/story-lifecycle.md +87 -87
- package/template/.claude/rules/workflow-execution.md +68 -68
- package/template/.claude/settings.json +58 -58
- package/template/.claude/settings.local.json +14 -14
- package/template/.genia/CONSTITUTION.md +129 -129
- package/template/.genia/contexts/api-patterns.md +134 -134
- package/template/.genia/contexts/nextjs-react.md +210 -210
- package/template/.genia/contexts/projeto.md +18 -18
- package/template/.genia/contexts/supabase.md +152 -152
- package/template/.genia/contexts/whatsapp-cloud.md +176 -176
- package/template/.genia/core-config.yaml +192 -192
- package/template/.genia/development/agents/analyst.md +138 -138
- package/template/.genia/development/agents/architect.md +171 -171
- package/template/.genia/development/agents/dev.md +160 -160
- package/template/.genia/development/agents/devops.md +200 -200
- package/template/.genia/development/agents/pm.md +142 -142
- package/template/.genia/development/agents/po.md +165 -165
- package/template/.genia/development/agents/qa.md +183 -183
- package/template/.genia/development/agents/reviewer.md +198 -198
- package/template/.genia/development/agents/sm.md +230 -230
- package/template/.genia/development/checklists/architecture-review.md +189 -189
- package/template/.genia/development/checklists/pre-commit.md +205 -205
- package/template/.genia/development/checklists/pre-deploy.md +230 -230
- package/template/.genia/development/checklists/qa-gate.md +216 -216
- package/template/.genia/development/checklists/story-dod.md +155 -155
- package/template/.genia/development/tasks/code-review.md +197 -197
- package/template/.genia/development/tasks/criar-prd.md +170 -170
- package/template/.genia/development/tasks/criar-spec.md +188 -188
- package/template/.genia/development/tasks/criar-story.md +185 -185
- package/template/.genia/development/tasks/debug-sistematico.md +230 -230
- package/template/.genia/development/tasks/dev-implement.md +199 -199
- package/template/.genia/development/tasks/qa-review.md +224 -224
- package/template/.genia/development/workflows/brownfield.md +178 -178
- package/template/.genia/development/workflows/delivery.md +208 -208
- package/template/.genia/development/workflows/development.md +189 -189
- package/template/.genia/development/workflows/greenfield.md +166 -166
- package/template/.genia/development/workflows/planning.md +167 -167
- package/template/.genia/development/workflows/qa-loop.md +179 -179
- package/template/.genia/development/workflows/spec-pipeline.md +192 -192
- package/template/.genia/development/workflows/story-development-cycle.md +252 -252
- package/template/.genia/guidelines/clean-code.md +98 -98
- package/template/.genia/guidelines/testing.md +176 -176
- package/template/.genia/skills/design/canvas-design.md +109 -109
- package/template/.genia/skills/design/frontend-design.md +140 -140
- package/template/.genia/skills/dev/mcp-builder.md +172 -172
- package/template/.genia/skills/dev/webapp-testing.md +150 -150
- package/template/.genia/skills/documents/docx.md +153 -153
- package/template/.genia/skills/documents/pdf.md +134 -134
- package/template/.genia/skills/documents/pptx.md +118 -118
- package/template/.genia/skills/documents/xlsx.md +140 -140
- package/template/.synapse/agent-analyst +8 -8
- package/template/.synapse/agent-architect +8 -8
- package/template/.synapse/agent-dev +8 -8
- package/template/.synapse/agent-devops +8 -8
- package/template/.synapse/agent-pm +8 -8
- package/template/.synapse/agent-po +7 -7
- package/template/.synapse/agent-qa +8 -8
- package/template/.synapse/agent-reviewer +7 -7
- package/template/.synapse/agent-sm +7 -7
- package/template/.synapse/constitution +7 -7
- package/template/.synapse/context +8 -8
- package/template/.synapse/global +8 -8
- package/template/.synapse/manifest +14 -14
- package/template/README.md +53 -53
@@ -1,153 +1,153 @@
-# Skill: /docx
-
-## Metadata
-- **Name**: Word Document Processing
-- **Command**: /docx
-- **Agent**: @dev
-- **Category**: documents
-- **Version**: 2.0
-
-## Description
-Creation, editing, and analysis of Word documents, with support for tracked changes, comments, formatting, and text extraction.
-
-## When to Use
-- Create new Word documents
-- Modify/edit content
-- Work with tracked changes
-- Add comments
-- Extract text from documents
-
-## Decision Tree
-
-### Read/Analyze
-→ Use "Text Extraction" or "Raw XML Access"
-
-### Create New Document
-→ Use **docx-js** (JavaScript)
-
-### Edit Existing Document
-- Own document + simple changes → "Basic OOXML editing"
-- Third-party document → **Redlining Workflow**
-- Legal/academic/business docs → **Redlining Workflow**
-
-## Text Extraction
-```bash
-# Convert to markdown with tracked changes
-pandoc --track-changes=all documento.docx -o output.md
-```
-
-## Raw XML Access
-
-### Unpack file
-```bash
-python ooxml/scripts/unpack.py <arquivo.docx> <diretorio>
-```
-
-### File structure
-- `word/document.xml` - Main content
-- `word/comments.xml` - Comments
-- `word/media/` - Images and media
-- Tracked changes: `<w:ins>` (insertions) and `<w:del>` (deletions)
-
-## Create New Document (docx-js)
-
-```javascript
-import { Document, Paragraph, TextRun, Packer } from 'docx';
-
-const doc = new Document({
-  sections: [{
-    properties: {},
-    children: [
-      new Paragraph({
-        children: [
-          new TextRun({ text: "Titulo", bold: true, size: 32 }),
-        ],
-      }),
-      new Paragraph({
-        children: [
-          new TextRun("Conteudo do documento..."),
-        ],
-      }),
-    ],
-  }],
-});
-
-// Export
-const buffer = await Packer.toBuffer(doc);
-fs.writeFileSync("documento.docx", buffer);
-```
-
-## Redlining Workflow
-
-### 1. Get markdown
-```bash
-pandoc --track-changes=all arquivo.docx -o atual.md
-```
-
-### 2. Identify changes
-- Group into batches of 3-10 changes
-- Location methods: section/heading, numbered paragraphs, grep patterns
-
-### 3. Read documentation and unpack
-```bash
-python ooxml/scripts/unpack.py arquivo.docx diretorio
-```
-
-### 4. Implement changes in batches
-Use `get_node` to find nodes, implement changes, `doc.save()`
-
-### 5. Repack
-```bash
-python ooxml/scripts/pack.py diretorio documento-revisado.docx
-```
-
-### 6. Final verification
-```bash
-pandoc --track-changes=all documento-revisado.docx -o verificacao.md
-```
-
-## Principle: Minimal, Precise Edits
-
-### WRONG - Replaces the entire sentence
-```xml
-<w:del><w:delText>O prazo e 30 dias.</w:delText></w:del>
-<w:ins><w:t>O prazo e 60 dias.</w:t></w:ins>
-```
-
-### CORRECT - Marks only what changed
-```xml
-<w:r><w:t>O prazo e </w:t></w:r>
-<w:del><w:delText>30</w:delText></w:del>
-<w:ins><w:t>60</w:t></w:ins>
-<w:r><w:t> dias.</w:t></w:r>
-```
-
-## Convert to Images
-```bash
-# DOCX → PDF
-soffice --headless --convert-to pdf documento.docx
-
-# PDF → JPEG
-pdftoppm -jpeg -r 150 documento.pdf pagina
-```
-
-## Dependencies
-```bash
-# Python
-pip install defusedxml
-
-# Node
-npm install docx
-
-# System
-sudo apt-get install pandoc libreoffice poppler-utils
-```
-
-## Related Tasks
-- task:criar-contrato
-- task:revisar-documento
-- task:gerar-relatorio-word
-
-## Workflows
-- workflow:revisao-documentos
-- workflow:geracao-contratos
+# Skill: /docx
+
+## Metadata
+- **Name**: Word Document Processing
+- **Command**: /docx
+- **Agent**: @dev
+- **Category**: documents
+- **Version**: 2.0
+
+## Description
+Creation, editing, and analysis of Word documents, with support for tracked changes, comments, formatting, and text extraction.
+
+## When to Use
+- Create new Word documents
+- Modify/edit content
+- Work with tracked changes
+- Add comments
+- Extract text from documents
+
+## Decision Tree
+
+### Read/Analyze
+→ Use "Text Extraction" or "Raw XML Access"
+
+### Create New Document
+→ Use **docx-js** (JavaScript)
+
+### Edit Existing Document
+- Own document + simple changes → "Basic OOXML editing"
+- Third-party document → **Redlining Workflow**
+- Legal/academic/business docs → **Redlining Workflow**
+
+## Text Extraction
+```bash
+# Convert to markdown with tracked changes
+pandoc --track-changes=all documento.docx -o output.md
+```
+
+## Raw XML Access
+
+### Unpack file
+```bash
+python ooxml/scripts/unpack.py <arquivo.docx> <diretorio>
+```
+
+### File structure
+- `word/document.xml` - Main content
+- `word/comments.xml` - Comments
+- `word/media/` - Images and media
+- Tracked changes: `<w:ins>` (insertions) and `<w:del>` (deletions)
+
+## Create New Document (docx-js)
+
+```javascript
+import { Document, Paragraph, TextRun, Packer } from 'docx';
+
+const doc = new Document({
+  sections: [{
+    properties: {},
+    children: [
+      new Paragraph({
+        children: [
+          new TextRun({ text: "Titulo", bold: true, size: 32 }),
+        ],
+      }),
+      new Paragraph({
+        children: [
+          new TextRun("Conteudo do documento..."),
+        ],
+      }),
+    ],
+  }],
+});
+
+// Export
+const buffer = await Packer.toBuffer(doc);
+fs.writeFileSync("documento.docx", buffer);
+```
+
+## Redlining Workflow
+
+### 1. Get markdown
+```bash
+pandoc --track-changes=all arquivo.docx -o atual.md
+```
+
+### 2. Identify changes
+- Group into batches of 3-10 changes
+- Location methods: section/heading, numbered paragraphs, grep patterns
+
+### 3. Read documentation and unpack
+```bash
+python ooxml/scripts/unpack.py arquivo.docx diretorio
+```
+
+### 4. Implement changes in batches
+Use `get_node` to find nodes, implement changes, `doc.save()`
+
+### 5. Repack
+```bash
+python ooxml/scripts/pack.py diretorio documento-revisado.docx
+```
+
+### 6. Final verification
+```bash
+pandoc --track-changes=all documento-revisado.docx -o verificacao.md
+```
+
+## Principle: Minimal, Precise Edits
+
+### WRONG - Replaces the entire sentence
+```xml
+<w:del><w:delText>O prazo e 30 dias.</w:delText></w:del>
+<w:ins><w:t>O prazo e 60 dias.</w:t></w:ins>
+```
+
+### CORRECT - Marks only what changed
+```xml
+<w:r><w:t>O prazo e </w:t></w:r>
+<w:del><w:delText>30</w:delText></w:del>
+<w:ins><w:t>60</w:t></w:ins>
+<w:r><w:t> dias.</w:t></w:r>
+```
+
+## Convert to Images
+```bash
+# DOCX → PDF
+soffice --headless --convert-to pdf documento.docx
+
+# PDF → JPEG
+pdftoppm -jpeg -r 150 documento.pdf pagina
+```
+
+## Dependencies
+```bash
+# Python
+pip install defusedxml
+
+# Node
+npm install docx
+
+# System
+sudo apt-get install pandoc libreoffice poppler-utils
+```
+
+## Related Tasks
+- task:criar-contrato
+- task:revisar-documento
+- task:gerar-relatorio-word
+
+## Workflows
+- workflow:revisao-documentos
+- workflow:geracao-contratos
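Note on the minimal-edit principle in the /docx skill above: the "correct" markup can be derived mechanically from the old and new sentence. The sketch below is only an illustration under stated assumptions, not the package's implementation. It uses Python's standard `difflib`; the function name `minimal_redline` is hypothetical. It emits the same simplified markup as the skill's example. Real WordprocessingML would likely also need runs inside `<w:ins>`/`<w:del>`, revision attributes such as `w:id` and `w:author`, and `xml:space="preserve"` on text with leading or trailing spaces.

```python
import difflib
import re
from xml.sax.saxutils import escape


def minimal_redline(old: str, new: str) -> str:
    """Return simplified tracked-change markup covering only the changed tokens.

    Hypothetical helper: mirrors the simplified XML shown in the skill,
    not complete WordprocessingML.
    """
    # Split into words and whitespace runs so unchanged spacing is preserved.
    old_tokens = re.findall(r"\s+|\S+", old)
    new_tokens = re.findall(r"\s+|\S+", new)
    matcher = difflib.SequenceMatcher(a=old_tokens, b=new_tokens, autojunk=False)

    parts = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        removed = escape("".join(old_tokens[i1:i2]))
        added = escape("".join(new_tokens[j1:j2]))
        if tag == "equal":
            # Unchanged text stays in a plain run.
            parts.append(f"<w:r><w:t>{removed}</w:t></w:r>")
            continue
        if removed:
            parts.append(f"<w:del><w:delText>{removed}</w:delText></w:del>")
        if added:
            parts.append(f"<w:ins><w:t>{added}</w:t></w:ins>")
    return "\n".join(parts)


print(minimal_redline("O prazo e 30 dias.", "O prazo e 60 dias."))
```

For the sentence pair used in the skill, only `30` and `60` end up inside deletion and insertion markers, and the surrounding text stays in ordinary runs.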
@@ -1,134 +1,134 @@
-# Skill: /pdf
-
-## Metadata
-- **Name**: PDF Processing
-- **Command**: /pdf
-- **Agent**: @dev
-- **Category**: documents
-- **Version**: 2.0
-
-## Description
-Complete toolkit for PDF manipulation: extract text and tables, create new PDFs, merge/split documents, and fill in forms.
-
-## When to Use
-- Extract text from PDFs
-- Extract tables to Excel/CSV
-- Merge multiple PDFs
-- Split a PDF into pages
-- Create new PDFs
-- Fill in PDF forms
-- Add a watermark
-- OCR on scanned PDFs
-
-## Python Libraries
-
-### pypdf - Basic Operations
-```python
-from pypdf import PdfReader, PdfWriter
-
-# Read PDF
-reader = PdfReader("documento.pdf")
-print(f"Paginas: {len(reader.pages)}")
-
-# Extract text
-text = ""
-for page in reader.pages:
-    text += page.extract_text()
-```
-
-### Merge PDFs
-```python
-from pypdf import PdfWriter, PdfReader
-
-writer = PdfWriter()
-for pdf_file in ["doc1.pdf", "doc2.pdf", "doc3.pdf"]:
-    reader = PdfReader(pdf_file)
-    for page in reader.pages:
-        writer.add_page(page)
-
-with open("mesclado.pdf", "wb") as output:
-    writer.write(output)
-```
-
-### Split PDF
-```python
-reader = PdfReader("input.pdf")
-for i, page in enumerate(reader.pages):
-    writer = PdfWriter()
-    writer.add_page(page)
-    with open(f"pagina_{i+1}.pdf", "wb") as output:
-        writer.write(output)
-```
-
-### pdfplumber - Extract Tables
-```python
-import pdfplumber
-import pandas as pd
-
-with pdfplumber.open("documento.pdf") as pdf:
-    all_tables = []
-    for page in pdf.pages:
-        tables = page.extract_tables()
-        for table in tables:
-            if table:
-                df = pd.DataFrame(table[1:], columns=table[0])
-                all_tables.append(df)
-
-if all_tables:
-    combined_df = pd.concat(all_tables, ignore_index=True)
-    combined_df.to_excel("tabelas.xlsx", index=False)
-```
-
-### reportlab - Create PDFs
-```python
-from reportlab.lib.pagesizes import letter
-from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer
-from reportlab.lib.styles import getSampleStyleSheet
-
-doc = SimpleDocTemplate("relatorio.pdf", pagesize=letter)
-styles = getSampleStyleSheet()
-story = []
-
-story.append(Paragraph("Titulo", styles['Title']))
-story.append(Spacer(1, 12))
-story.append(Paragraph("Conteudo aqui...", styles['Normal']))
-
-doc.build(story)
-```
-
-## CLI Commands
-
-### pdftotext (poppler-utils)
-```bash
-pdftotext input.pdf output.txt
-pdftotext -layout input.pdf output.txt # Preserves layout
-```
-
-### qpdf
-```bash
-qpdf --empty --pages file1.pdf file2.pdf -- merged.pdf # Merge
-qpdf input.pdf --pages . 1-5 -- pages1-5.pdf # Extract pages
-```
-
-## Quick Reference
-
-| Task | Library | Code |
-|--------|------------|--------|
-| Merge PDFs | pypdf | `writer.add_page(page)` |
-| Extract text | pdfplumber | `page.extract_text()` |
-| Extract tables | pdfplumber | `page.extract_tables()` |
-| Create PDFs | reportlab | Canvas or Platypus |
-| OCR | pytesseract | Convert to image first |
-| Forms | pypdf/pdf-lib | See forms.md |
-
-## Dependencies
-```bash
-pip install pypdf pdfplumber reportlab pytesseract pdf2image
-```
-
-## Related Tasks
-- task:extrair-dados-pdf
-- task:gerar-relatorio-pdf
-
-## Workflows
-- workflow:processar-documentos
+# Skill: /pdf
+
+## Metadata
+- **Name**: PDF Processing
+- **Command**: /pdf
+- **Agent**: @dev
+- **Category**: documents
+- **Version**: 2.0
+
+## Description
+Complete toolkit for PDF manipulation: extract text and tables, create new PDFs, merge/split documents, and fill in forms.
+
+## When to Use
+- Extract text from PDFs
+- Extract tables to Excel/CSV
+- Merge multiple PDFs
+- Split a PDF into pages
+- Create new PDFs
+- Fill in PDF forms
+- Add a watermark
+- OCR on scanned PDFs
+
+## Python Libraries
+
+### pypdf - Basic Operations
+```python
+from pypdf import PdfReader, PdfWriter
+
+# Read PDF
+reader = PdfReader("documento.pdf")
+print(f"Paginas: {len(reader.pages)}")
+
+# Extract text
+text = ""
+for page in reader.pages:
+    text += page.extract_text()
+```
+
+### Merge PDFs
+```python
+from pypdf import PdfWriter, PdfReader
+
+writer = PdfWriter()
+for pdf_file in ["doc1.pdf", "doc2.pdf", "doc3.pdf"]:
+    reader = PdfReader(pdf_file)
+    for page in reader.pages:
+        writer.add_page(page)
+
+with open("mesclado.pdf", "wb") as output:
+    writer.write(output)
+```
+
+### Split PDF
+```python
+reader = PdfReader("input.pdf")
+for i, page in enumerate(reader.pages):
+    writer = PdfWriter()
+    writer.add_page(page)
+    with open(f"pagina_{i+1}.pdf", "wb") as output:
+        writer.write(output)
+```
+
+### pdfplumber - Extract Tables
+```python
+import pdfplumber
+import pandas as pd
+
+with pdfplumber.open("documento.pdf") as pdf:
+    all_tables = []
+    for page in pdf.pages:
+        tables = page.extract_tables()
+        for table in tables:
+            if table:
+                df = pd.DataFrame(table[1:], columns=table[0])
+                all_tables.append(df)
+
+if all_tables:
+    combined_df = pd.concat(all_tables, ignore_index=True)
+    combined_df.to_excel("tabelas.xlsx", index=False)
+```
+
+### reportlab - Create PDFs
+```python
+from reportlab.lib.pagesizes import letter
+from reportlab.platypus import SimpleDocTemplate, Paragraph, Spacer
+from reportlab.lib.styles import getSampleStyleSheet
+
+doc = SimpleDocTemplate("relatorio.pdf", pagesize=letter)
+styles = getSampleStyleSheet()
+story = []
+
+story.append(Paragraph("Titulo", styles['Title']))
+story.append(Spacer(1, 12))
+story.append(Paragraph("Conteudo aqui...", styles['Normal']))
+
+doc.build(story)
+```
+
+## CLI Commands
+
+### pdftotext (poppler-utils)
+```bash
+pdftotext input.pdf output.txt
+pdftotext -layout input.pdf output.txt # Preserves layout
+```
+
+### qpdf
+```bash
+qpdf --empty --pages file1.pdf file2.pdf -- merged.pdf # Merge
+qpdf input.pdf --pages . 1-5 -- pages1-5.pdf # Extract pages
+```
+
+## Quick Reference
+
+| Task | Library | Code |
+|--------|------------|--------|
+| Merge PDFs | pypdf | `writer.add_page(page)` |
+| Extract text | pdfplumber | `page.extract_text()` |
+| Extract tables | pdfplumber | `page.extract_tables()` |
+| Create PDFs | reportlab | Canvas or Platypus |
+| OCR | pytesseract | Convert to image first |
+| Forms | pypdf/pdf-lib | See forms.md |
+
+## Dependencies
+```bash
+pip install pypdf pdfplumber reportlab pytesseract pdf2image
+```
+
+## Related Tasks
+- task:extrair-dados-pdf
+- task:gerar-relatorio-pdf
+
+## Workflows
+- workflow:processar-documentos
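Note on table extraction in the /pdf skill above: the skill lists Excel/CSV as targets but only shows the pandas/Excel path. The sketch below is a standard-library illustration of the CSV path, not part of the package. It assumes each table is a list of rows with the header row first and `None` for empty cells, which is the shape `page.extract_tables()` is expected to return. The helper name `table_to_csv` and the sample data are hypothetical.

```python
import csv
import io


def table_to_csv(table):
    """Serialize one table (header row first) to CSV text.

    Hypothetical helper; None cells are written as empty fields.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in table:
        writer.writerow(["" if cell is None else cell for cell in row])
    return buffer.getvalue()


# Hypothetical sample shaped like the result of extract_tables() for one page:
# one real table and one empty table, which is skipped like `if table:` in the skill.
sample_tables = [
    [["item", "qty"], ["widget", "2"], ["gadget", None]],
    [],
]
csv_texts = [table_to_csv(table) for table in sample_tables if table]
print(csv_texts[0])
```

Writing one CSV per table avoids having to reconcile differing column sets, which the `pd.concat` approach handles by taking the union of columns.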