vaultex 0.1.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- vaultex-0.1.0/PKG-INFO +106 -0
- vaultex-0.1.0/README.md +87 -0
- vaultex-0.1.0/pyproject.toml +37 -0
- vaultex-0.1.0/setup.cfg +4 -0
- vaultex-0.1.0/setup.py +3 -0
- vaultex-0.1.0/vaultex/__init__.py +4 -0
- vaultex-0.1.0/vaultex/__main__.py +4 -0
- vaultex-0.1.0/vaultex/app.py +343 -0
- vaultex-0.1.0/vaultex/core.py +152 -0
- vaultex-0.1.0/vaultex.egg-info/PKG-INFO +106 -0
- vaultex-0.1.0/vaultex.egg-info/SOURCES.txt +13 -0
- vaultex-0.1.0/vaultex.egg-info/dependency_links.txt +1 -0
- vaultex-0.1.0/vaultex.egg-info/entry_points.txt +2 -0
- vaultex-0.1.0/vaultex.egg-info/requires.txt +1 -0
- vaultex-0.1.0/vaultex.egg-info/top_level.txt +1 -0
vaultex-0.1.0/PKG-INFO
ADDED
@@ -0,0 +1,106 @@
+Metadata-Version: 2.4
+Name: vaultex
+Version: 0.1.0
+Summary: Extract and merge files from a folder with precision — a lightweight GUI tool for developers and LLM workflows.
+Author-email: Ian Gong <gongzhijie535@gmail.com>
+License: MIT
+Project-URL: Homepage, https://github.com/gongzhijie535-ctrl/vaultex
+Project-URL: Repository, https://github.com/gongzhijie535-ctrl/vaultex
+Project-URL: Issues, https://github.com/gongzhijie535-ctrl/vaultex
+Keywords: file,extract,merge,llm,gradio,code,tools
+Classifier: Programming Language :: Python :: 3
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Operating System :: OS Independent
+Classifier: Intended Audience :: Developers
+Classifier: Topic :: Utilities
+Requires-Python: >=3.10
+Description-Content-Type: text/markdown
+Requires-Dist: gradio>=4.0
+
+# 🔐 Vaultex
+
+> Extract what you need. Nothing more.
+
+Vaultex is a lightweight GUI tool that lets you **scan a folder and merge all matching files into a single text output** — ready to paste into an LLM, a doc, or anywhere you need a full-project snapshot.
+
+The idea is simple: when you're working with a codebase or a collection of files, you often need to quickly gather *specific types* of files across nested folders. Vaultex gives you precise control over what gets included, what gets skipped, and how the result is organized — all through a clean interface.
+
+---
+
+## ✨ Features
+
+- 📂 **Folder picker** — browse or paste a path directly
+- 📄 **File type selector** — pick from common extensions or add your own
+- 🎯 **Whitelist / blacklist filtering** — specify exactly which folders and files to include or exclude
+- 📦 **File size limit** — skip files that are too large
+- 🔁 **Recursive or flat mode** — go deep or stay shallow
+- 🔃 **Sort options** — by path, filename, or last modified time
+- 🔍 **Preview before extracting** — scan first, extract when ready
+- 💾 **Save to file** — optionally write the merged output back to disk
+- 🤖 **Token estimator** — rough count to check if output fits your LLM context window
+
+---
+
+## 🚀 Installation
+
+```bash
+pip install vaultex
+```
+
+Then launch:
+
+```bash
+vaultex
+```
+
+Or run directly from source:
+
+```bash
+git clone https://github.com/gongzhijie535-ctrl/vaultex
+cd vaultex
+pip install -e .
+python -m vaultex
+```
+
+---
+
+## 🖥️ Usage
+
+1. Select a folder using the **📂 picker** or paste a path
+2. Check the file types you want (`.py`, `.md`, `.json`, etc.)
+3. Expand **Filter Options** to narrow down by folder or filename
+4. Click **🔍 Preview** to confirm the file list
+5. Click **🚀 Extract** to merge and view the output
+
+---
+
+## 📁 Project Structure
+
+```
+vaultex/
+├── __init__.py
+├── __main__.py # entry point: python -m vaultex
+├── core.py # file collection + merging logic
+└── app.py # Gradio UI
+```
+
+---
+
+## 📦 Requirements
+
+- Python ≥ 3.10
+- gradio
+
+---
+
+## 👤 Author
+
+**Ian Gong (龚智杰)**
+📧 gongzhijie535@gmail.com
+🐙 [@gongzhijie535-ctrl](https://github.com/gongzhijie535-ctrl)
+
+---
+
+## 📄 License
+
+MIT
vaultex-0.1.0/README.md
ADDED
@@ -0,0 +1,87 @@
+# 🔐 Vaultex
+
+> Extract what you need. Nothing more.
+
+Vaultex is a lightweight GUI tool that lets you **scan a folder and merge all matching files into a single text output** — ready to paste into an LLM, a doc, or anywhere you need a full-project snapshot.
+
+The idea is simple: when you're working with a codebase or a collection of files, you often need to quickly gather *specific types* of files across nested folders. Vaultex gives you precise control over what gets included, what gets skipped, and how the result is organized — all through a clean interface.
+
+---
+
+## ✨ Features
+
+- 📂 **Folder picker** — browse or paste a path directly
+- 📄 **File type selector** — pick from common extensions or add your own
+- 🎯 **Whitelist / blacklist filtering** — specify exactly which folders and files to include or exclude
+- 📦 **File size limit** — skip files that are too large
+- 🔁 **Recursive or flat mode** — go deep or stay shallow
+- 🔃 **Sort options** — by path, filename, or last modified time
+- 🔍 **Preview before extracting** — scan first, extract when ready
+- 💾 **Save to file** — optionally write the merged output back to disk
+- 🤖 **Token estimator** — rough count to check if output fits your LLM context window
+
+---
+
+## 🚀 Installation
+
+```bash
+pip install vaultex
+```
+
+Then launch:
+
+```bash
+vaultex
+```
+
+Or run directly from source:
+
+```bash
+git clone https://github.com/gongzhijie535-ctrl/vaultex
+cd vaultex
+pip install -e .
+python -m vaultex
+```
+
+---
+
+## 🖥️ Usage
+
+1. Select a folder using the **📂 picker** or paste a path
+2. Check the file types you want (`.py`, `.md`, `.json`, etc.)
+3. Expand **Filter Options** to narrow down by folder or filename
+4. Click **🔍 Preview** to confirm the file list
+5. Click **🚀 Extract** to merge and view the output
+
+---
+
+## 📁 Project Structure
+
+```
+vaultex/
+├── __init__.py
+├── __main__.py # entry point: python -m vaultex
+├── core.py # file collection + merging logic
+└── app.py # Gradio UI
+```
+
+---
+
+## 📦 Requirements
+
+- Python ≥ 3.10
+- gradio
+
+---
+
+## 👤 Author
+
+**Ian Gong (龚智杰)**
+📧 gongzhijie535@gmail.com
+🐙 [@gongzhijie535-ctrl](https://github.com/gongzhijie535-ctrl)
+
+---
+
+## 📄 License
+
+MIT
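The README's "Token estimator" feature is described only as a rough count. The core.py hunk of this diff computes it as character count divided by 4, so a minimal sketch of that heuristic could look like the following. The names `estimate_tokens` and `fits_context` and the `chars_per_token` parameter are illustrative assumptions, not part of the package.

```python
def estimate_tokens(text: str, chars_per_token: int = 4) -> int:
    """Rough token estimate: character count divided by a fixed ratio.

    The ratio of 4 mirrors the `char_count // 4` heuristic in core.py;
    it is an approximation, not a real tokenizer.
    """
    return len(text) // chars_per_token


def fits_context(text: str, context_window: int, chars_per_token: int = 4) -> bool:
    """Return True when the estimated token count fits the given window."""
    return estimate_tokens(text, chars_per_token) <= context_window


sample = "def add(a, b):\n    return a + b\n"
print(estimate_tokens(sample))
print(fits_context(sample, context_window=128_000))
```

Because the ratio is fixed, text with many non-ASCII characters will usually tokenize to more tokens than this estimate suggests, so it is best treated as a lower bound for such input.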
vaultex-0.1.0/pyproject.toml
ADDED
@@ -0,0 +1,37 @@
+[build-system]
+requires = ["setuptools>=68", "wheel"]
+build-backend = "setuptools.build_meta"
+
+[project]
+name = "vaultex"
+version = "0.1.0"
+description = "Extract and merge files from a folder with precision — a lightweight GUI tool for developers and LLM workflows."
+readme = "README.md"
+license = { text = "MIT" }
+authors = [
+    { name = "Ian Gong", email = "gongzhijie535@gmail.com" }
+]
+keywords = ["file", "extract", "merge", "llm", "gradio", "code", "tools"]
+classifiers = [
+    "Programming Language :: Python :: 3",
+    "License :: OSI Approved :: MIT License",
+    "Operating System :: OS Independent",
+    "Intended Audience :: Developers",
+    "Topic :: Utilities",
+]
+requires-python = ">=3.10"
+dependencies = [
+    "gradio>=4.0",
+]
+
+[project.urls]
+Homepage = "https://github.com/gongzhijie535-ctrl/vaultex"
+Repository = "https://github.com/gongzhijie535-ctrl/vaultex"
+Issues = "https://github.com/gongzhijie535-ctrl/vaultex"
+
+[project.scripts]
+vaultex = "vaultex.app:launch"
+
+[tool.setuptools.packages.find]
+where = ["."]
+include = ["vaultex*"]
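The `[project.scripts]` entry `vaultex = "vaultex.app:launch"` is what makes the `vaultex` command available after installation: the installer generates a small launcher that imports the module before the colon and calls the attribute after it. A minimal sketch of that lookup, assuming nothing about the launcher beyond the `module:attribute` convention, might be the following. The helper name `resolve_entry_point` is hypothetical, and a standard-library target is used so the sketch runs without vaultex installed.

```python
from importlib import import_module


def resolve_entry_point(spec: str):
    """Resolve a 'package.module:attribute' string to the object it names."""
    module_name, _, attr_path = spec.partition(":")
    obj = import_module(module_name)
    # The attribute part may be dotted (e.g. "Class.method"); walk it step by step.
    for part in attr_path.split("."):
        if part:
            obj = getattr(obj, part)
    return obj


# Demonstrated on a standard-library function; with vaultex installed,
# "vaultex.app:launch" would be resolved the same way and then called.
basename = resolve_entry_point("os.path:basename")
print(basename("/tmp/example.txt"))
```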
vaultex-0.1.0/setup.cfg
ADDED
vaultex-0.1.0/setup.py
ADDED
vaultex-0.1.0/vaultex/app.py
ADDED
@@ -0,0 +1,343 @@
+import os
+import gradio as gr
+from vaultex.core import extract, DEFAULT_EXTENSIONS
+
+
+def pick_folder():
+    try:
+        import tkinter as tk
+        from tkinter import filedialog
+        root = tk.Tk()
+        root.withdraw()
+        root.wm_attributes("-topmost", True)
+        folder = filedialog.askdirectory(title="Select a folder")
+        root.destroy()
+        return folder or ""
+    except Exception as e:
+        return f"⚠️ Could not open the folder picker: {e}"
+
+
+def _merge_extensions(selected, custom_str):
+    exts = set(selected or [])
+    for line in custom_str.splitlines():
+        line = line.strip()
+        if not line:
+            continue
+        if not line.startswith("."):
+            line = "." + line
+        exts.add(line.lower())
+    return list(exts)
+
+
+def _parse_lines(raw: str) -> list:
+    return [s.strip() for s in raw.splitlines() if s.strip()]
+
+
+def _parse_keyword_files(raw: str) -> set:
+    return {s.strip().lower() for s in raw.splitlines() if s.strip()}
+
+
+def _build_common_args(folder_path, selected_extensions, custom_extensions_str,
+                       recursive, only_folders_str, skip_folders_str,
+                       keyword_files_str, skip_files_str, max_file_kb):
+    folder_path = folder_path.strip()
+    extensions = _merge_extensions(selected_extensions, custom_extensions_str)
+    only_folders = _parse_lines(only_folders_str)
+    skip_folders = _parse_lines(skip_folders_str)
+    keyword_files = _parse_keyword_files(keyword_files_str)
+    skip_files = _parse_lines(skip_files_str)
+    max_kb = int(max_file_kb) if str(max_file_kb).strip().isdigit() else 0
+    return folder_path, extensions, only_folders, skip_folders, keyword_files, skip_files, max_kb
+
+
+def run_scan(folder_path, selected_extensions, custom_extensions_str,
+             recursive, only_folders_str, skip_folders_str,
+             keyword_files_str, skip_files_str, max_file_kb):
+
+    folder_path, extensions, only_folders, skip_folders, keyword_files, skip_files, max_kb = _build_common_args(
+        folder_path, selected_extensions, custom_extensions_str,
+        recursive, only_folders_str, skip_folders_str,
+        keyword_files_str, skip_files_str, max_file_kb
+    )
+
+    if not folder_path:
+        return "⚠️ Please enter a folder path"
+    if not os.path.isdir(folder_path):
+        return "⚠️ Path does not exist or is not a folder"
+    if not extensions:
+        return "⚠️ Please select or enter at least one file type"
+
+    from vaultex.core import _collect_files
+    file_list = _collect_files(
+        folder_path, extensions, recursive,
+        skip_folders, skip_files, only_folders, max_kb, keyword_files
+    )
+    file_list.sort()
+
+    if not file_list:
+        return "⚠️ No matching files found"
+
+    lines = [f"🔍 Scan complete, found {len(file_list)} files:", ""]
+    for i, f in enumerate(file_list, 1):
+        rel = os.path.relpath(f, folder_path)
+        size = os.path.getsize(f) / 1024
+        lines.append(f" {i:>3}. {rel} ({size:.1f} KB)")
+
+    return "\n".join(lines)
+
+
+def run_extract(folder_path, selected_extensions, custom_extensions_str,
+                recursive, separator, only_folders_str, skip_folders_str,
+                keyword_files_str, skip_files_str, max_file_kb,
+                sort_by, save_to_file, output_filename):
+
+    folder_path, extensions, only_folders, skip_folders, keyword_files, skip_files, max_kb = _build_common_args(
+        folder_path, selected_extensions, custom_extensions_str,
+        recursive, only_folders_str, skip_folders_str,
+        keyword_files_str, skip_files_str, max_file_kb
+    )
+
+    if not folder_path:
+        return "⚠️ Please enter a folder path", "No folder selected"
+    if not os.path.isdir(folder_path):
+        return "⚠️ Path does not exist or is not a folder", "Invalid path"
+    if not extensions:
+        return "⚠️ Please select or enter at least one file type", "No file type selected"
+
+    separator_str = separator.strip() if separator.strip() else "=" * 60
+
+    merged, file_list, stats = extract(
+        folder_path=folder_path,
+        extensions=extensions,
+        recursive=recursive,
+        separator=separator_str,
+        skip_folders=skip_folders,
+        skip_files=skip_files,
+        only_folders=only_folders,
+        max_file_kb=max_kb,
+        sort_by=sort_by,
+        keyword_files=keyword_files,
+    )
+
+    if not file_list:
+        return "⚠️ No matching files found", "0 files"
+
+    saved_msg = ""
+    if save_to_file:
+        out_name = output_filename.strip() or "code_summary.txt"
+        if not out_name.endswith(".txt"):
+            out_name += ".txt"
+        out_path = os.path.join(folder_path, out_name)
+        try:
+            with open(out_path, "w", encoding="utf-8") as f:
+                f.write(merged)
+            saved_msg = f"\n💾 Saved to: {out_path}"
+        except Exception as e:
+            saved_msg = f"\n⚠️ Save failed: {e}"
+
+    summary_lines = [
+        f"✅ Extracted {stats['file_count']} files",
+        f"📊 Total characters: {stats['char_count']:,}",
+        f"🤖 Estimated tokens: {stats['token_est']:,} (characters ÷ 4; accurate for English, low for Chinese)",
+        "",
+        "📋 File list:",
+    ]
+    for f in file_list:
+        summary_lines.append(f" • {os.path.relpath(f, folder_path)}")
+    if saved_msg:
+        summary_lines.append(saved_msg)
+
+    return merged, "\n".join(summary_lines)
+
+
+HELP_TEXT = """
+## 📖 Usage Guide
+
+**Basic workflow**
+1. Click the `📂 Select` button to choose the target folder, or paste a path directly
+2. Check the file types you need (you can also add custom types below)
+3. Expand "Filters" and "Output Settings" as needed and fill in the parameters
+4. Click `🔍 Preview File List` to confirm which files are in scope
+5. Once everything looks right, click `🚀 Start Extraction`
+
+---
+
+**Filter options**
+
+| Option | Description |
+|------|------|
+| ✅ Only folders | Search only in these folders, one per line; leave empty for no restriction |
+| ⛔ Skip folders | Exclude these folders, one per line |
+| ✅ Only filenames | Exact match on the full filename (including extension), one per line; leave empty for no restriction |
+| ⛔ Skip filenames | Exclude these files, one full filename (including extension) per line |
+| 📦 File size limit | Skip files larger than the given size in KB; 0 = no limit |
+| 🔁 Recursive mode | When checked, descend into all subfolders; when unchecked, read only the current level |
+
+---
+
+**Token estimate**
+
+- Pure English code: small error
+- With Chinese comments: the actual token count will be higher
+
+Context windows of mainstream models, for reference: GPT-4o ≈ 128K, Claude ≈ 200K, Gemini 1.5 Pro ≈ 1M
+""".strip()
+
+
+def launch():
+    with gr.Blocks(title="Vaultex") as demo:
+
+        gr.Markdown("# 🔐 Vaultex\n### Extract and merge text file contents from a folder")
+
+        with gr.Tabs():
+
+            # ── Tab 1: main interface ────────────────────────────────────
+            with gr.Tab("🚀 Extract"):
+                with gr.Row():
+
+                    # Left column: configuration
+                    with gr.Column(scale=1, min_width=340):
+
+                        # Path
+                        with gr.Row():
+                            folder_input = gr.Textbox(
+                                label="📁 Folder path",
+                                placeholder="Type a path or click the button on the right",
+                                lines=1,
+                                scale=5
+                            )
+                            pick_btn = gr.Button("📂 Select", scale=1, min_width=60)
+                        pick_btn.click(fn=pick_folder, inputs=[], outputs=[folder_input])
+
+                        # File types
+                        with gr.Accordion("📄 File types", open=True):
+                            ext_selector = gr.CheckboxGroup(
+                                choices=DEFAULT_EXTENSIONS,
+                                value=[".txt", ".md", ".py", ".js", ".json"],
+                                label="Select types"
+                            )
+                            custom_ext_input = gr.Textbox(
+                                label="➕ Custom types (one per line)",
+                                placeholder=".vue\n.svelte\n.lock",
+                                lines=2
+                            )
+
+                        # Filters (collapsed by default)
+                        with gr.Accordion("🔽 Filters", open=False):
+
+                            gr.Markdown("**📂 Folders**")
+                            with gr.Row():
+                                only_folders_input = gr.Textbox(
+                                    label="✅ Only folders (search only these)",
+                                    placeholder="src\nlib\nutils",
+                                    lines=4
+                                )
+                                skip_folders_input = gr.Textbox(
+                                    label="⛔ Skip folders (exclude these)",
+                                    placeholder="__pycache__\n.git\nnode_modules",
+                                    lines=4
+                                )
+
+                            gr.Markdown("**📄 Files**")
+                            with gr.Row():
+                                keyword_files_input = gr.Textbox(
+                                    label="✅ Only filenames (include only these, with extension)",
+                                    placeholder="model.py\nconfig.json",
+                                    lines=4
+                                )
+                                skip_files_input = gr.Textbox(
+                                    label="⛔ Skip filenames (exclude these, with extension)",
+                                    placeholder="setup.py\nconfig.py",
+                                    lines=4
+                                )
+
+                            max_kb_input = gr.Number(
+                                label="📦 File size limit (KB, 0 = no limit)",
+                                value=0,
+                                precision=0
+                            )
+
+                        # Output settings (collapsed by default)
+                        with gr.Accordion("💾 Output Settings", open=False):
+                            with gr.Row():
+                                recursive_toggle = gr.Checkbox(
+                                    label="🔁 Include subfolders",
+                                    value=True,
+                                    scale=1
+                                )
+                                sort_selector = gr.Radio(
+                                    choices=[("Path", "path"), ("Filename", "name"), ("Modified time", "mtime")],
+                                    value="path",
+                                    label="🔃 Sort by",
+                                    scale=2
+                                )
+                            separator_input = gr.Textbox(
+                                label="✂️ File separator",
+                                value="=" * 60,
+                                lines=1
+                            )
+                            save_toggle = gr.Checkbox(
+                                label="💾 Also save to file",
+                                value=False
+                            )
+                            output_filename_input = gr.Textbox(
+                                label="📄 Output filename",
+                                value="code_summary.txt",
+                                lines=1
+                            )
+
+                        # Buttons
+                        with gr.Row():
+                            scan_btn = gr.Button("🔍 Preview File List", variant="secondary", scale=1)
+                            extract_btn = gr.Button("🚀 Start Extraction", variant="primary", scale=1)
+
+                    # Right column: output
+                    with gr.Column(scale=2):
+                        scan_output = gr.Textbox(
+                            label="🔍 File list preview",
+                            lines=8,
+                            interactive=False
+                        )
+                        summary_output = gr.Textbox(
+                            label="📋 Extraction summary",
+                            lines=8,
+                            interactive=False
+                        )
+                        result_output = gr.Textbox(
+                            label="📝 Merged content",
+                            lines=18,
+                            interactive=False
+                        )
+
+            # ── Tab 2: usage guide ──────────────────────────────────
+            with gr.Tab("📖 Usage Guide"):
+                gr.Markdown(HELP_TEXT)
+
+        # Event bindings
+        scan_btn.click(
+            fn=run_scan,
+            inputs=[
+                folder_input, ext_selector, custom_ext_input,
+                recursive_toggle, only_folders_input, skip_folders_input,
+                keyword_files_input, skip_files_input, max_kb_input
+            ],
+            outputs=[scan_output]
+        )
+
+        extract_btn.click(
+            fn=run_extract,
+            inputs=[
+                folder_input, ext_selector, custom_ext_input,
+                recursive_toggle, separator_input,
+                only_folders_input, skip_folders_input,
+                keyword_files_input, skip_files_input, max_kb_input,
+                sort_selector, save_toggle, output_filename_input
+            ],
+            outputs=[result_output, summary_output]
+        )
+
+    demo.launch(theme=gr.themes.Soft())
+
+
+if __name__ == "__main__":
+    launch()
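In `run_extract` above, the output filename from the UI is joined to the folder path with `os.path.join` as typed, so a name containing directory components (or an absolute path) would resolve outside the selected folder. A hedged sketch of one way to confine the write to that folder follows. The helper `safe_output_path` and its `default_name` value are hypothetical and not part of the package.

```python
import os


def safe_output_path(folder_path: str, output_filename: str,
                     default_name: str = "merged_output.txt") -> str:
    """Build an output path that always stays directly inside folder_path.

    Only the final path component of the user-supplied name is kept, so
    inputs such as "../../x" or "/tmp/x" cannot escape the folder.
    """
    name = os.path.basename(output_filename.strip()) or default_name
    if not name.endswith(".txt"):
        name += ".txt"
    return os.path.join(folder_path, name)


print(safe_output_path("/projects/demo", "../../etc/passwd"))
print(safe_output_path("/projects/demo", "   "))
```

Dropping directory components silently is only one possible policy; rejecting such names with an error message would be an equally reasonable design.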
vaultex-0.1.0/vaultex/core.py
ADDED
@@ -0,0 +1,152 @@
+import os
+
+DEFAULT_EXTENSIONS = [
+    ".py", ".js", ".ts", ".jsx", ".tsx",
+    ".html", ".css", ".scss",
+    ".json", ".yaml", ".yml", ".toml", ".ini", ".env",
+    ".md", ".txt", ".rst", ".csv", ".xml",
+    ".sh", ".bat", ".ps1",
+    ".c", ".cpp", ".h", ".java", ".go", ".rs",
+]
+
+
+def extract(
+    folder_path: str,
+    extensions: list,
+    recursive: bool = True,
+    separator: str = "=" * 60,
+    skip_folders: list = None,
+    skip_files: list = None,
+    only_folders: list = None,
+    max_file_kb: int = 0,
+    sort_by: str = "path",
+    keyword_files: set = None,
+) -> tuple[str, list, dict]:
+
+    skip_folders = skip_folders or []
+    skip_files = skip_files or []
+    only_folders = only_folders or []
+    keyword_files = keyword_files or set()
+
+    file_list = _collect_files(
+        folder_path, extensions, recursive,
+        skip_folders, skip_files, only_folders, max_file_kb, keyword_files
+    )
+
+    if sort_by == "name":
+        file_list.sort(key=lambda p: os.path.basename(p).lower())
+    elif sort_by == "mtime":
+        file_list.sort(key=lambda p: os.path.getmtime(p), reverse=True)
+    else:
+        file_list.sort()
+
+    if not file_list:
+        return "", [], {}
+
+    lines = []
+    total = len(file_list)
+
+    lines.append("📁 Code Summary Report")
+    lines.append(f"Root directory: {folder_path}")
+    lines.append(f"Files read: {total}")
+    lines.append(separator)
+    lines.append("")
+
+    lines.append("📋 File Index")
+    lines.append("-" * 40)
+    for i, filepath in enumerate(file_list, 1):
+        rel = os.path.relpath(filepath, folder_path)
+        size_kb = os.path.getsize(filepath) / 1024
+        lines.append(f" {i:>3}. {rel} ({size_kb:.1f} KB)")
+    lines.append("")
+    lines.append("")
+
+    for i, filepath in enumerate(file_list, 1):
+        rel = os.path.relpath(filepath, folder_path)
+        lines.append(separator)
+        lines.append(f"# File {i}/{total}: {rel}")
+        lines.append(separator)
+        lines.append("")
+        content = _read_file(filepath)
+        lines.append(content)
+        lines.append("")
+        lines.append("")
+
+    merged = "\n".join(lines)
+
+    char_count = sum(len(_read_file(f)) for f in file_list)
+    stats = {
+        "file_count": total,
+        "char_count": char_count,
+        "token_est": char_count // 4,
+    }
+
+    return merged, file_list, stats
+
+
+def _collect_files(folder, extensions, recursive, skip_folders, skip_files,
+                   only_folders, max_file_kb, keyword_files=None):
+    collected = []
+    keyword_files = keyword_files or set()
+
+    if recursive:
+        for root, dirs, files in os.walk(folder):
+            # Only-folders filter: folder names at the current level must be in the list
+            if only_folders:
+                dirs[:] = [
+                    d for d in dirs
+                    if d in only_folders and d not in skip_folders and not d.startswith(".")
+                ]
+            else:
+                dirs[:] = [
+                    d for d in dirs
+                    if d not in skip_folders and not d.startswith(".")
+                ]
+
+            # When only-folders is set, files in the root directory itself must also be checked against the specified scope
+            # Root-level files are collected directly; subdirectories are collected only if they are in only_folders
+            rel_root = os.path.relpath(root, folder)
+            if only_folders and rel_root != ".":
+                top_dir = rel_root.split(os.sep)[0]
+                if top_dir not in only_folders:
+                    continue
+
+            for file in files:
+                full_path = os.path.join(root, file)
+                if not _passes_filters(file, extensions, skip_files, max_file_kb, keyword_files, full_path):
+                    continue
+                collected.append(full_path)
+    else:
+        for item in os.listdir(folder):
+            full_path = os.path.join(folder, item)
+            if not os.path.isfile(full_path):
+                continue
+            if not _passes_filters(item, extensions, skip_files, max_file_kb, keyword_files, full_path):
+                continue
+            collected.append(full_path)
+
+    return collected
+
+
+def _passes_filters(filename, extensions, skip_files, max_file_kb, keyword_files, full_path):
+    if filename in skip_files:
+        return False
+    if not any(filename.endswith(ext) for ext in extensions):
+        return False
+    if max_file_kb > 0 and os.path.getsize(full_path) / 1024 > max_file_kb:
+        return False
+    if keyword_files and filename.lower() not in keyword_files:
+        return False
+    return True
+
+
+def _read_file(filepath):
+    for encoding in ("utf-8", "gbk", "latin-1"):
+        try:
+            with open(filepath, "r", encoding=encoding) as f:
+                return f.read()
+        except UnicodeDecodeError:
+            continue
+        except Exception as e:
+            return f"⚠️ Read failed: {e}"
+    return "⚠️ Read failed: no encoding could decode the file"
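`_read_file` above tries UTF-8, then GBK, then Latin-1. Since Latin-1 maps every possible byte to a character, the last attempt cannot raise `UnicodeDecodeError`, so the final "read failed" return would not be reached through decoding errors alone. The following self-contained sketch illustrates the same fallback order and also reports which encoding succeeded. `read_text_with_fallback` and `decode_sample` are illustrative names, not functions from the package.

```python
import os
import tempfile


def read_text_with_fallback(filepath, encodings=("utf-8", "gbk", "latin-1")):
    """Return (text, encoding_used), trying each encoding in order."""
    for encoding in encodings:
        try:
            with open(filepath, "r", encoding=encoding) as f:
                return f.read(), encoding
        except UnicodeDecodeError:
            continue  # wrong guess, try the next encoding
    return None, None  # only reachable if the list has no catch-all codec


def decode_sample(raw: bytes):
    """Write raw bytes to a temporary file and read them back with fallback."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.txt")
        with open(path, "wb") as f:
            f.write(raw)
        return read_text_with_fallback(path)


utf8_result = decode_sample("héllo".encode("utf-8"))
gbk_result = decode_sample("中文".encode("gbk"))
latin1_result = decode_sample(b"\xe9")  # a lone 0xE9 is invalid in UTF-8 and GBK
print(utf8_result[1], gbk_result[1], latin1_result[1])
```

One consequence of this ordering is that a file in some other legacy encoding will still be "decoded" by the Latin-1 step, just with the wrong characters, so a successful read does not guarantee correct text.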
vaultex-0.1.0/vaultex.egg-info/PKG-INFO
ADDED
@@ -0,0 +1,106 @@
+Metadata-Version: 2.4
+Name: vaultex
+Version: 0.1.0
+Summary: Extract and merge files from a folder with precision — a lightweight GUI tool for developers and LLM workflows.
+Author-email: Ian Gong <gongzhijie535@gmail.com>
+License: MIT
+Project-URL: Homepage, https://github.com/gongzhijie535-ctrl/vaultex
+Project-URL: Repository, https://github.com/gongzhijie535-ctrl/vaultex
+Project-URL: Issues, https://github.com/gongzhijie535-ctrl/vaultex
+Keywords: file,extract,merge,llm,gradio,code,tools
+Classifier: Programming Language :: Python :: 3
+Classifier: License :: OSI Approved :: MIT License
+Classifier: Operating System :: OS Independent
+Classifier: Intended Audience :: Developers
+Classifier: Topic :: Utilities
+Requires-Python: >=3.10
+Description-Content-Type: text/markdown
+Requires-Dist: gradio>=4.0
+
+# 🔐 Vaultex
+
+> Extract what you need. Nothing more.
+
+Vaultex is a lightweight GUI tool that lets you **scan a folder and merge all matching files into a single text output** — ready to paste into an LLM, a doc, or anywhere you need a full-project snapshot.
+
+The idea is simple: when you're working with a codebase or a collection of files, you often need to quickly gather *specific types* of files across nested folders. Vaultex gives you precise control over what gets included, what gets skipped, and how the result is organized — all through a clean interface.
+
+---
+
+## ✨ Features
+
+- 📂 **Folder picker** — browse or paste a path directly
+- 📄 **File type selector** — pick from common extensions or add your own
+- 🎯 **Whitelist / blacklist filtering** — specify exactly which folders and files to include or exclude
+- 📦 **File size limit** — skip files that are too large
+- 🔁 **Recursive or flat mode** — go deep or stay shallow
+- 🔃 **Sort options** — by path, filename, or last modified time
+- 🔍 **Preview before extracting** — scan first, extract when ready
+- 💾 **Save to file** — optionally write the merged output back to disk
+- 🤖 **Token estimator** — rough count to check if output fits your LLM context window
+
+---
+
+## 🚀 Installation
+
+```bash
+pip install vaultex
+```
+
+Then launch:
+
+```bash
+vaultex
+```
+
+Or run directly from source:
+
+```bash
+git clone https://github.com/gongzhijie535-ctrl/vaultex
+cd vaultex
+pip install -e .
+python -m vaultex
+```
+
+---
+
+## 🖥️ Usage
+
+1. Select a folder using the **📂 picker** or paste a path
+2. Check the file types you want (`.py`, `.md`, `.json`, etc.)
+3. Expand **Filter Options** to narrow down by folder or filename
+4. Click **🔍 Preview** to confirm the file list
+5. Click **🚀 Extract** to merge and view the output
+
+---
+
+## 📁 Project Structure
+
+```
+vaultex/
+├── __init__.py
+├── __main__.py # entry point: python -m vaultex
+├── core.py # file collection + merging logic
+└── app.py # Gradio UI
+```
+
+---
+
+## 📦 Requirements
+
+- Python ≥ 3.10
+- gradio
+
+---
+
+## 👤 Author
+
+**Ian Gong (龚智杰)**
+📧 gongzhijie535@gmail.com
+🐙 [@gongzhijie535-ctrl](https://github.com/gongzhijie535-ctrl)
+
+---
+
+## 📄 License
+
+MIT
vaultex-0.1.0/vaultex.egg-info/SOURCES.txt
ADDED
@@ -0,0 +1,13 @@
+README.md
+pyproject.toml
+setup.py
+vaultex/__init__.py
+vaultex/__main__.py
+vaultex/app.py
+vaultex/core.py
+vaultex.egg-info/PKG-INFO
+vaultex.egg-info/SOURCES.txt
+vaultex.egg-info/dependency_links.txt
+vaultex.egg-info/entry_points.txt
+vaultex.egg-info/requires.txt
+vaultex.egg-info/top_level.txt
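vaultex-0.1.0/vaultex.egg-info/dependency_links.txt
ADDED
@@ -0,0 +1 @@
+
vaultex-0.1.0/vaultex.egg-info/requires.txt
ADDED
@@ -0,0 +1 @@
+gradio>=4.0
vaultex-0.1.0/vaultex.egg-info/top_level.txt
ADDED
@@ -0,0 +1 @@
+vaultex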