sortai-0.1.0.tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- sortai-0.1.0/PKG-INFO +158 -0
- sortai-0.1.0/README.md +136 -0
- sortai-0.1.0/pyproject.toml +36 -0
- sortai-0.1.0/setup.cfg +4 -0
- sortai-0.1.0/sortai/__init__.py +3 -0
- sortai-0.1.0/sortai/ai.py +105 -0
- sortai-0.1.0/sortai/cli.py +105 -0
- sortai-0.1.0/sortai/organizer.py +53 -0
- sortai-0.1.0/sortai/reader.py +101 -0
- sortai-0.1.0/sortai.egg-info/PKG-INFO +158 -0
- sortai-0.1.0/sortai.egg-info/SOURCES.txt +13 -0
- sortai-0.1.0/sortai.egg-info/dependency_links.txt +1 -0
- sortai-0.1.0/sortai.egg-info/entry_points.txt +2 -0
- sortai-0.1.0/sortai.egg-info/requires.txt +4 -0
- sortai-0.1.0/sortai.egg-info/top_level.txt +1 -0
sortai-0.1.0/PKG-INFO
ADDED
@@ -0,0 +1,158 @@
+Metadata-Version: 2.4
+Name: sortai
+Version: 0.1.0
+Summary: LLM-powered directory organizer using Google Gemini
+Author: sortai
+License: MIT
+Keywords: cli,organize,files,gemini,llm
+Classifier: Development Status :: 4 - Beta
+Classifier: Environment :: Console
+Classifier: Intended Audience :: Developers
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Requires-Python: >=3.9
+Description-Content-Type: text/markdown
+Requires-Dist: click>=8.0
+Requires-Dist: google-generativeai>=0.3.0
+Requires-Dist: pdfplumber>=0.10.0
+Requires-Dist: python-docx>=1.0
+
+# sortai
+
+LLM-powered directory organizer. Uses **Google Gemini** to suggest a folder structure from filenames and (for text-based files) the first ~500 characters of content, then moves files into the suggested subfolders.
+
+- **Dry-run by default** – see exactly what would move where before touching anything.
+- **Confirm before apply** – with `--apply`, you are prompted to confirm before any files are moved.
+
+## Install
+
+```bash
+pip install sortai
+```
+
+Development install from source:
+
+```bash
+git clone <repo>
+cd sortai
+pip install -e .
+```
+
+## Setup
+
+Set your Google Gemini API key (required):
+
+```bash
+export GEMINI_API_KEY=your_key_here
+```
+
+Get a key at: **https://aistudio.google.com/app/apikey**
+
+You can copy `.env.example` to `.env` and set `GEMINI_API_KEY` there; load it with your shell or a tool like `python-dotenv` if you use one (sortai does not load `.env` automatically).
+
+## Demo
+
+<!-- TODO: add demo GIF -->
+
+
+*(Placeholder: add a short GIF showing `sortai ./folder`, dry-run output, then `--apply` and confirmation.)*
+
+## Usage
+
+| Command | Description |
+|--------|-------------|
+| `sortai <path>` | Dry-run: show what would be moved where (default). |
+| `sortai <path> --apply` | After dry-run, prompt and then actually move files. |
+| `sortai <path> --depth 2` | Organize up to 2 levels of subfolders (e.g. `documents/work`). |
+| `sortai <path> --model gemini-1.5-flash` | Override Gemini model (default: gemini-1.5-flash). |
+| `sortai --version` | Print version. |
+| `sortai --help` | Show help. |
+
+### Example output
+
+**Before (flat directory):**
+
+```
+my-folder/
+├── report.pdf
+├── notes.txt
+├── budget.csv
+├── vacation.jpg
+└── readme.md
+```
+
+**Dry-run:**
+
+```
+$ sortai ./my-folder
+Dry run – would move:
+report.pdf -> documents/
+notes.txt -> documents/
+budget.csv -> finance/
+vacation.jpg -> images/
+readme.md -> (keep at root)
+Run with --apply to perform moves.
+```
+
+**After applying:**
+
+```
+my-folder/
+├── readme.md
+├── documents/
+│   ├── report.pdf
+│   └── notes.txt
+├── finance/
+│   └── budget.csv
+└── images/
+    └── vacation.jpg
+```
+
+## Supported file types for content reading
+
+sortai reads the **first ~500 characters** of content for:
+
+- `.pdf` (first page via pdfplumber)
+- `.txt`, `.md`, `.csv` (plain text)
+- `.docx` (paragraph text via python-docx)
+
+All other files are categorized by **filename and extension only**.
+
+## Publishing to PyPI
+
+1. **Create a PyPI account** (and optionally [Test PyPI](https://test.pypi.org/) for testing):
+   - https://pypi.org/account/register/
+
+2. **Install build tools** (one-time):
+   ```bash
+   pip install build twine
+   ```
+
+3. **Bump version** in `pyproject.toml` and `sortai/__init__.py` when releasing a new version.
+
+4. **Build the package** (from the project root):
+   ```bash
+   python -m build
+   ```
+   This creates `dist/sortai-0.1.0.tar.gz` and a wheel.
+
+5. **Upload to PyPI**:
+   ```bash
+   twine upload dist/*
+   ```
+   Twine will prompt for your PyPI username and password. Prefer an [API token](https://pypi.org/manage/account/token/) (username: `__token__`, password: your token) over your account password.
+
+   To try Test PyPI first:
+   ```bash
+   twine upload --repository testpypi dist/*
+   ```
+   Then install with: `pip install -i https://test.pypi.org/simple/ sortai`
+
+**Note:** If the name `sortai` is already taken on PyPI, change the `name` in `pyproject.toml` to something unique (e.g. `sortai-cli`) and publish under that name.
+
+## License
+
+MIT
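The PKG-INFO file above uses the RFC 822 style header layout of Python core metadata: `Key: value` headers, a blank line, then the long description. As a minimal sketch (not part of the package), the standard-library `email` parser can read it. `SAMPLE_PKG_INFO` is an abbreviated copy of the fields shown above, and `read_metadata` is a hypothetical helper name.

```python
from email import message_from_string

# Abbreviated sample copied from the PKG-INFO fields shown above.
SAMPLE_PKG_INFO = """\
Metadata-Version: 2.4
Name: sortai
Version: 0.1.0
Requires-Python: >=3.9
Requires-Dist: click>=8.0
Requires-Dist: google-generativeai>=0.3.0
Requires-Dist: pdfplumber>=0.10.0
Requires-Dist: python-docx>=1.0

# sortai
"""


def read_metadata(text: str) -> dict:
    """Hypothetical helper: parse the headers; the body after the blank line is the description."""
    msg = message_from_string(text)
    return {
        "name": msg["Name"],
        "version": msg["Version"],
        "requires_python": msg["Requires-Python"],
        # Repeated headers such as Requires-Dist need get_all(), not msg[...].
        "requires_dist": msg.get_all("Requires-Dist") or [],
        "description": msg.get_payload(),
    }


info = read_metadata(SAMPLE_PKG_INFO)
print(info["name"], info["version"], info["requires_dist"])
```

Using `get_all` matters here because indexing a message by key returns only one of several repeated `Requires-Dist` headers.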
sortai-0.1.0/README.md
ADDED
@@ -0,0 +1,136 @@
+# sortai
+
+LLM-powered directory organizer. Uses **Google Gemini** to suggest a folder structure from filenames and (for text-based files) the first ~500 characters of content, then moves files into the suggested subfolders.
+
+- **Dry-run by default** – see exactly what would move where before touching anything.
+- **Confirm before apply** – with `--apply`, you are prompted to confirm before any files are moved.
+
+## Install
+
+```bash
+pip install sortai
+```
+
+Development install from source:
+
+```bash
+git clone <repo>
+cd sortai
+pip install -e .
+```
+
+## Setup
+
+Set your Google Gemini API key (required):
+
+```bash
+export GEMINI_API_KEY=your_key_here
+```
+
+Get a key at: **https://aistudio.google.com/app/apikey**
+
+You can copy `.env.example` to `.env` and set `GEMINI_API_KEY` there; load it with your shell or a tool like `python-dotenv` if you use one (sortai does not load `.env` automatically).
+
+## Demo
+
+<!-- TODO: add demo GIF -->
+
+
+*(Placeholder: add a short GIF showing `sortai ./folder`, dry-run output, then `--apply` and confirmation.)*
+
+## Usage
+
+| Command | Description |
+|--------|-------------|
+| `sortai <path>` | Dry-run: show what would be moved where (default). |
+| `sortai <path> --apply` | After dry-run, prompt and then actually move files. |
+| `sortai <path> --depth 2` | Organize up to 2 levels of subfolders (e.g. `documents/work`). |
+| `sortai <path> --model gemini-1.5-flash` | Override Gemini model (default: gemini-1.5-flash). |
+| `sortai --version` | Print version. |
+| `sortai --help` | Show help. |
+
+### Example output
+
+**Before (flat directory):**
+
+```
+my-folder/
+├── report.pdf
+├── notes.txt
+├── budget.csv
+├── vacation.jpg
+└── readme.md
+```
+
+**Dry-run:**
+
+```
+$ sortai ./my-folder
+Dry run – would move:
+report.pdf -> documents/
+notes.txt -> documents/
+budget.csv -> finance/
+vacation.jpg -> images/
+readme.md -> (keep at root)
+Run with --apply to perform moves.
+```
+
+**After applying:**
+
+```
+my-folder/
+├── readme.md
+├── documents/
+│   ├── report.pdf
+│   └── notes.txt
+├── finance/
+│   └── budget.csv
+└── images/
+    └── vacation.jpg
+```
+
+## Supported file types for content reading
+
+sortai reads the **first ~500 characters** of content for:
+
+- `.pdf` (first page via pdfplumber)
+- `.txt`, `.md`, `.csv` (plain text)
+- `.docx` (paragraph text via python-docx)
+
+All other files are categorized by **filename and extension only**.
+
+## Publishing to PyPI
+
+1. **Create a PyPI account** (and optionally [Test PyPI](https://test.pypi.org/) for testing):
+   - https://pypi.org/account/register/
+
+2. **Install build tools** (one-time):
+   ```bash
+   pip install build twine
+   ```
+
+3. **Bump version** in `pyproject.toml` and `sortai/__init__.py` when releasing a new version.
+
+4. **Build the package** (from the project root):
+   ```bash
+   python -m build
+   ```
+   This creates `dist/sortai-0.1.0.tar.gz` and a wheel.
+
+5. **Upload to PyPI**:
+   ```bash
+   twine upload dist/*
+   ```
+   Twine will prompt for your PyPI username and password. Prefer an [API token](https://pypi.org/manage/account/token/) (username: `__token__`, password: your token) over your account password.
+
+   To try Test PyPI first:
+   ```bash
+   twine upload --repository testpypi dist/*
+   ```
+   Then install with: `pip install -i https://test.pypi.org/simple/ sortai`
+
+**Note:** If the name `sortai` is already taken on PyPI, change the `name` in `pyproject.toml` to something unique (e.g. `sortai-cli`) and publish under that name.
+
+## License
+
+MIT
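The README's Setup section notes that sortai does not load `.env` automatically. For readers who prefer not to add a dependency, the following is a minimal sketch of a loader using only the standard library. `load_env_file` is a hypothetical helper, not something the package ships, and it handles only plain `KEY=VALUE` lines (no `export` prefix, no multi-line values).

```python
import os
import tempfile
from pathlib import Path


def load_env_file(path: Path, environ=os.environ) -> list:
    """Hypothetical helper: copy KEY=VALUE pairs from a file into `environ`.

    Existing variables win, so a key already exported in the shell is never overwritten.
    """
    loaded = []
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blanks, comments, and anything that is not an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")
        if key and key not in environ:
            environ[key] = value
            loaded.append(key)
    return loaded


# Usage with a throwaway file and a plain dict instead of the real environment.
with tempfile.TemporaryDirectory() as tmp:
    env_file = Path(tmp) / ".env"
    env_file.write_text("# local secrets\nGEMINI_API_KEY=your_key_here\n", encoding="utf-8")
    fake_environ = {}
    loaded_keys = load_env_file(env_file, fake_environ)
    print(loaded_keys, fake_environ)
```

Calling `load_env_file(Path(".env"))` before running the CLI logic would populate `os.environ` in the same way.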
sortai-0.1.0/pyproject.toml
ADDED
@@ -0,0 +1,36 @@
+[build-system]
+requires = ["setuptools>=61.0"]
+build-backend = "setuptools.build_meta"
+
+[project]
+name = "sortai"
+version = "0.1.0"
+description = "LLM-powered directory organizer using Google Gemini"
+readme = "README.md"
+requires-python = ">=3.9"
+license = { text = "MIT" }
+authors = [{ name = "sortai" }]
+keywords = ["cli", "organize", "files", "gemini", "llm"]
+classifiers = [
+    "Development Status :: 4 - Beta",
+    "Environment :: Console",
+    "Intended Audience :: Developers",
+    "Programming Language :: Python :: 3",
+    "Programming Language :: Python :: 3.9",
+    "Programming Language :: Python :: 3.10",
+    "Programming Language :: Python :: 3.11",
+    "Programming Language :: Python :: 3.12",
+]
+dependencies = [
+    "click>=8.0",
+    "google-generativeai>=0.3.0",
+    "pdfplumber>=0.10.0",
+    "python-docx>=1.0",
+]
+
+[project.scripts]
+sortai = "sortai.cli:main"
+
+[tool.setuptools.packages.find]
+where = ["."]
+include = ["sortai*"]
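The `[project.scripts]` entry above maps the `sortai` command to the string `sortai.cli:main`, which has the form `module:attribute`. As a rough sketch of how such a reference resolves to a callable (a simplification; real installers generate a launcher script rather than calling a helper like this), the hypothetical `resolve_entry_point` below is demonstrated on a standard-library target so it runs without sortai installed.

```python
import importlib


def resolve_entry_point(spec: str):
    """Hypothetical helper: turn 'package.module:attr' into the object it names."""
    module_name, _, attr_path = spec.partition(":")
    obj = importlib.import_module(module_name)
    # The attribute part may be dotted (e.g. 'Class.method'); walk it step by step.
    for attr in filter(None, attr_path.split(".")):
        obj = getattr(obj, attr)
    return obj


# Demonstrated with a stdlib target; 'sortai.cli:main' would resolve the same way.
dumps = resolve_entry_point("json:dumps")
print(dumps({"ok": True}))
```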
sortai-0.1.0/setup.cfg
ADDED
sortai-0.1.0/sortai/ai.py
ADDED
@@ -0,0 +1,105 @@
+"""Gemini API client: build prompt from file list, call model, parse JSON moves."""
+
+import json
+import os
+import re
+from typing import Any
+
+GEMINI_API_KEY_URL = "https://aistudio.google.com/app/apikey"
+
+
+class MissingApiKeyError(Exception):
+    """Raised when GEMINI_API_KEY is not set."""
+
+    def __init__(self) -> None:
+        super().__init__(
+            "GEMINI_API_KEY is not set. Get an API key at " + GEMINI_API_KEY_URL
+        )
+
+
+def get_moves(
+    file_list: list[dict],
+    depth: int,
+    model_name: str = "gemini-1.5-flash",
+) -> list[tuple[str, str]]:
+    """
+    Call Gemini to suggest folder structure. Returns list of (relative_path, target_folder).
+    target_folder may be "." for root or e.g. "documents" or "documents/work" when depth > 1.
+    Raises MissingApiKeyError if GEMINI_API_KEY is not set.
+    """
+    api_key = os.environ.get("GEMINI_API_KEY")
+    if not api_key or not api_key.strip():
+        raise MissingApiKeyError()
+
+    import google.generativeai as genai
+    genai.configure(api_key=api_key.strip())
+    model = genai.GenerativeModel(model_name)
+
+    prompt = _build_prompt(file_list, depth)
+    response = model.generate_content(prompt)
+    text = (response.text or "").strip()
+    return _parse_moves(text, file_list)
+
+
+def _build_prompt(file_list: list[dict], depth: int) -> str:
+    """Build the prompt for Gemini with file list and depth rules."""
+    lines = [
+        "You are organizing files in a directory. Given the list of files below (with optional content previews), suggest a folder structure.",
+        "",
+        "Rules:",
+        "- Use only the relative paths exactly as given in the file list.",
+        f"- Maximum folder depth is {depth}. So target_folder must be at most {depth} path segments (e.g. for depth 1 use a single folder name like 'documents'; for depth 2 you can use 'documents/work').",
+        "- Use forward slashes in target_folder (e.g. 'documents/work').",
+        "- To leave a file at the root, use target_folder: '.'.",
+        "- Do not suggest moving outside the given directory or using absolute paths.",
+        "- Output ONLY a single JSON object, no other text. Format:",
+        '{"moves": [{"path": "filename.txt", "target_folder": "documents"}, ...]}',
+        "",
+        "File list:",
+    ]
+    for item in file_list:
+        path = item.get("path", "")
+        preview = item.get("content_preview")
+        if preview:
+            lines.append(f"- {path}")
+            lines.append(f" content_preview: {preview!r}")
+        else:
+            lines.append(f"- {path} (filename/extension only)")
+    return "\n".join(lines)
+
+
+def _parse_moves(response_text: str, file_list: list[dict]) -> list[tuple[str, str]]:
+    """Extract JSON from response, validate paths against file_list, return list of (path, target_folder)."""
+    valid_paths = {item["path"] for item in file_list}
+
+    # Strip markdown code fences if present
+    text = response_text.strip()
+    match = re.search(r"```(?:json)?\s*([\s\S]*?)\s*```", text)
+    if match:
+        text = match.group(1).strip()
+    # Try raw parse in case there's no fence
+    try:
+        data = json.loads(text)
+    except json.JSONDecodeError:
+        return []
+
+    moves_raw = data.get("moves")
+    if not isinstance(moves_raw, list):
+        return []
+
+    result = []
+    for entry in moves_raw:
+        if not isinstance(entry, dict):
+            continue
+        path = entry.get("path")
+        target = entry.get("target_folder")
+        if path is None or target is None:
+            continue
+        path = str(path).strip()
+        target = str(target).strip().replace("\\", "/")
+        if path not in valid_paths:
+            continue
+        if not target:
+            target = "."
+        result.append((path, target))
+    return result
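In the `ai.py` hunk above, the prompt asks the model not to use absolute paths or leave the directory, and `_parse_moves` checks `path` against the known file list, but as shown it accepts any `target_folder` string. A caller who wants to enforce the prompt's rules locally could add a check along these lines. This is a sketch under assumptions: `is_safe_target` is a hypothetical helper and not part of the published code.

```python
from pathlib import PurePosixPath


def is_safe_target(target: str, max_depth: int) -> bool:
    """Hypothetical check: relative, no '..', no drive letters, at most max_depth segments."""
    normalized = target.strip().replace("\\", "/")
    if normalized in ("", "."):
        return True  # "." (or empty) means "keep at root"
    candidate = PurePosixPath(normalized)
    if candidate.is_absolute():
        return False
    parts = candidate.parts
    if any(part == ".." for part in parts):
        return False
    # Reject Windows-style drive prefixes such as "C:/Users".
    if any(":" in part for part in parts):
        return False
    return len(parts) <= max_depth


for sample in ["documents", "documents/work", "../outside", "/tmp", "C:\\Users"]:
    print(sample, is_safe_target(sample, max_depth=1))
```

Entries that fail the check could be skipped in the same way `_parse_moves` already skips unknown paths.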
sortai-0.1.0/sortai/cli.py
ADDED
@@ -0,0 +1,105 @@
+"""Click CLI entrypoint for sortai."""
+
+import os
+from pathlib import Path
+
+import click
+
+from sortai import __version__
+from sortai.ai import GEMINI_API_KEY_URL, MissingApiKeyError, get_moves
+from sortai.organizer import apply_moves, confirm, dry_run
+from sortai.reader import list_files
+
+
+@click.command()
+@click.argument(
+    "path",
+    type=click.Path(exists=True, file_okay=False, path_type=Path),
+    required=False,
+    default=None,
+)
+@click.option(
+    "--apply",
+    is_flag=True,
+    default=False,
+    help="Actually move files after confirmation (default: dry-run only).",
+)
+@click.option(
+    "--depth",
+    type=int,
+    default=1,
+    help="Organize recursively up to N levels of subfolders (default: 1).",
+)
+@click.option(
+    "--model",
+    type=str,
+    default="gemini-1.5-flash",
+    help="Gemini model name (default: gemini-1.5-flash).",
+)
+@click.option(
+    "--version",
+    "show_version",
+    is_flag=True,
+    default=False,
+    help="Show version and exit.",
+)
+def main(
+    path: Path | None,
+    apply: bool,
+    depth: int,
+    model: str,
+    show_version: bool,
+) -> None:
+    """Organize files in a directory using Google Gemini."""
+    if show_version:
+        click.echo(f"sortai {__version__}")
+        raise SystemExit(0)
+
+    if path is None:
+        click.echo("Error: PATH is required. Use --help for usage.", err=True)
+        raise SystemExit(1)
+
+    root = path.resolve()
+    if not root.is_dir():
+        click.echo(f"Error: not a directory: {path}", err=True)
+        raise SystemExit(1)
+
+    if not (os.environ.get("GEMINI_API_KEY") or "").strip():
+        click.echo("Error: GEMINI_API_KEY is not set.", err=True)
+        click.echo(f"Get an API key at: {GEMINI_API_KEY_URL}", err=True)
+        raise SystemExit(1)
+
+    file_list = list_files(root, max_depth=depth)
+    if not file_list:
+        click.echo("No files found to organize.")
+        raise SystemExit(0)
+
+    try:
+        moves = get_moves(file_list, depth=depth, model_name=model)
+    except MissingApiKeyError as e:
+        click.echo(f"Error: {e}", err=True)
+        click.echo(f"Get an API key at: {GEMINI_API_KEY_URL}", err=True)
+        raise SystemExit(1)
+    except Exception as e:
+        click.echo(f"Error calling Gemini: {e}", err=True)
+        raise SystemExit(1)
+
+    if not moves:
+        click.echo("No moves suggested.")
+        raise SystemExit(0)
+
+    dry_run(root, moves, echo=click.echo)
+    if not apply:
+        click.echo("Run with --apply to perform moves.")
+        raise SystemExit(0)
+
+    if not confirm(echo=click.echo):
+        click.echo("Aborted.")
+        raise SystemExit(0)
+
+    apply_moves(root, moves, echo=click.echo)
+    click.echo("Done.")
+
+
+if __name__ == "__main__":
+    main()
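One detail in the `cli.py` hunk above is worth a note: `main` annotates `path` as `Path | None`. Function annotations are evaluated when the function is defined unless the module starts with `from __future__ import annotations` (which the hunk does not show), and the `X | None` union syntax on classes needs Python 3.10 or newer, while the package metadata declares `>=3.9`. The sketch below shows a spelling that also works on 3.9. `resolve_root` is a hypothetical stand-in for the PATH checks at the top of `main`, not code from the package.

```python
from pathlib import Path
from typing import Optional


def resolve_root(path: Optional[Path]) -> Path:
    """Hypothetical stand-in for the PATH checks at the top of `main`."""
    if path is None:
        raise ValueError("PATH is required")
    root = path.resolve()
    if not root.is_dir():
        raise NotADirectoryError(str(path))
    return root


print(resolve_root(Path(".")))
```

`Optional[Path]` and `Path | None` mean the same thing to a type checker; only the runtime evaluation differs between Python versions.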
sortai-0.1.0/sortai/organizer.py
ADDED
@@ -0,0 +1,53 @@
+"""Dry-run display, confirmation prompt, and file move execution."""
+
+import os
+import shutil
+from pathlib import Path
+from typing import Callable
+
+
+def dry_run(root: Path, moves: list[tuple[str, str]], echo: Callable[[str], None]) -> None:
+    """Print what would be moved where. No filesystem changes."""
+    root = root.resolve()
+    echo("Dry run – would move:")
+    for rel_path, target_folder in moves:
+        if target_folder == ".":
+            dest_desc = "(keep at root)"
+        else:
+            dest_desc = f"{target_folder}/"
+        echo(f" {rel_path} -> {dest_desc}")
+
+
+def confirm(echo: Callable[[str], None]) -> bool:
+    """Prompt 'Apply these moves? [y/N]'. Returns True for y/yes, False otherwise."""
+    try:
+        answer = input("Apply these moves? [y/N] ").strip().lower()
+    except EOFError:
+        return False
+    return answer in ("y", "yes")
+
+
+def apply_moves(root: Path, moves: list[tuple[str, str]], echo: Callable[[str], None]) -> None:
+    """
+    Create target dirs and move files. Skips and warns if destination file already exists
+    (no overwrite without user consent).
+    """
+    root = root.resolve()
+    for rel_path, target_folder in moves:
+        src = root / rel_path.replace("/", os.sep)
+        if not src.is_file():
+            echo(f" Skip (not a file): {rel_path}")
+            continue
+        if target_folder == ".":
+            continue
+        target_dir = root / target_folder.replace("/", os.sep)
+        target_dir.mkdir(parents=True, exist_ok=True)
+        dest = target_dir / src.name
+        if dest.exists() and dest.resolve() != src.resolve():
+            echo(f" Skip (destination exists): {rel_path} -> {dest.relative_to(root)}")
+            continue
+        try:
+            shutil.move(str(src), str(dest))
+            echo(f" Moved: {rel_path} -> {target_folder}/")
+        except OSError as e:
+            echo(f" Error moving {rel_path}: {e}")
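The skip-if-destination-exists behaviour of `apply_moves` in the hunk above can be tried in isolation. The following is a simplified sketch run inside a temporary directory. `move_without_overwrite` is a hypothetical helper that mirrors the same idea and is not the package's function.

```python
import shutil
import tempfile
from pathlib import Path


def move_without_overwrite(src: Path, target_dir: Path) -> str:
    """Hypothetical helper: move src into target_dir unless a file of that name is already there."""
    if not src.is_file():
        return "skip: not a file"
    target_dir.mkdir(parents=True, exist_ok=True)
    dest = target_dir / src.name
    if dest.exists():
        return "skip: destination exists"
    shutil.move(str(src), str(dest))
    return "moved"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "notes.txt").write_text("hello", encoding="utf-8")
    (root / "documents").mkdir()
    (root / "documents" / "budget.csv").write_text("old", encoding="utf-8")
    (root / "budget.csv").write_text("new", encoding="utf-8")

    first = move_without_overwrite(root / "notes.txt", root / "documents")
    second = move_without_overwrite(root / "budget.csv", root / "documents")
    kept = (root / "documents" / "budget.csv").read_text(encoding="utf-8")
    print(first, second, kept)
```

The second move is refused, so the existing `documents/budget.csv` keeps its original content.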
sortai-0.1.0/sortai/reader.py
ADDED
@@ -0,0 +1,101 @@
+"""File listing and content extraction (first ~500 chars) for text-based types."""
+
+import os
+from pathlib import Path
+from typing import Optional
+
+CONTENT_PREVIEW_LENGTH = 500
+
+# Extensions for which we read file content; everything else is categorized by filename/extension only.
+CONTENT_EXTENSIONS = {".pdf", ".txt", ".md", ".docx", ".csv"}
+
+
+def _read_text_preview(path: Path, limit: int = CONTENT_PREVIEW_LENGTH) -> Optional[str]:
+    """Read first `limit` characters from a text file. Returns None on error."""
+    try:
+        with open(path, encoding="utf-8", errors="replace") as f:
+            return (f.read(limit + 1))[:limit] or None
+    except OSError:
+        return None
+
+
+def _read_pdf_preview(path: Path, limit: int = CONTENT_PREVIEW_LENGTH) -> Optional[str]:
+    """Extract text from first page of PDF and truncate. Returns None on error."""
+    try:
+        import pdfplumber
+        with pdfplumber.open(path) as pdf:
+            if not pdf.pages:
+                return None
+            text = pdf.pages[0].extract_text()
+            if not text:
+                return None
+            return (text[: limit + 1])[:limit]
+    except Exception:
+        return None
+
+
+def _read_docx_preview(path: Path, limit: int = CONTENT_PREVIEW_LENGTH) -> Optional[str]:
+    """Extract paragraph text from docx and truncate. Returns None on error."""
+    try:
+        from docx import Document
+        doc = Document(path)
+        parts = []
+        n = 0
+        for para in doc.paragraphs:
+            if para.text:
+                parts.append(para.text)
+                n += len(para.text)
+                if n >= limit:
+                    break
+        text = " ".join(parts)
+        return (text[: limit + 1])[:limit] if text else None
+    except Exception:
+        return None
+
+
+def get_content_preview(path: Path, limit: int = CONTENT_PREVIEW_LENGTH) -> Optional[str]:
+    """Return first ~limit characters of file content for supported types, else None."""
+    ext = path.suffix.lower()
+    if ext in (".txt", ".md", ".csv"):
+        return _read_text_preview(path, limit)
+    if ext == ".pdf":
+        return _read_pdf_preview(path, limit)
+    if ext == ".docx":
+        return _read_docx_preview(path, limit)
+    return None
+
+
+def list_files(
+    root: Path,
+    max_depth: Optional[int] = None,
+) -> list[dict]:
+    """
+    Walk directory up to max_depth levels; return list of dicts with path (relative),
+    name (filename), and content_preview (first ~500 chars for supported types, else None).
+    """
+    root = root.resolve()
+    if not root.is_dir():
+        return []
+
+    result = []
+    for dirpath, dirnames, filenames in os.walk(root):
+        dirpath = Path(dirpath)
+        depth = len(dirpath.relative_to(root).parts) if dirpath != root else 0
+        if max_depth is not None and depth > max_depth:
+            continue
+        if max_depth is not None and depth >= max_depth:
+            dirnames[:] = []  # do not descend further
+        for name in filenames:
+            full_path = dirpath / name
+            try:
+                rel = full_path.relative_to(root)
+            except ValueError:
+                continue
+            rel_str = str(rel).replace("\\", "/")
+            content = get_content_preview(full_path) if full_path.suffix.lower() in CONTENT_EXTENSIONS else None
+            result.append({
+                "path": rel_str,
+                "name": name,
+                "content_preview": content,
+            })
+    return result
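`list_files` in the hunk above limits recursion by clearing `dirnames` in place, which tells `os.walk` (in its default top-down mode) not to descend further. A minimal sketch of just that pruning step follows. `walk_limited` is a hypothetical reduced version that returns relative paths only and skips the content previews.

```python
import os
import tempfile
from pathlib import Path


def walk_limited(root: Path, max_depth: int) -> list:
    """Hypothetical reduced walker: relative file paths at most max_depth directories below root."""
    root = root.resolve()
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        depth = len(Path(dirpath).relative_to(root).parts)  # 0 for root itself
        if depth >= max_depth:
            dirnames[:] = []  # in-place edit: os.walk will not descend further
        for name in filenames:
            found.append((Path(dirpath) / name).relative_to(root).as_posix())
    return sorted(found)


with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    (base / "a" / "b").mkdir(parents=True)
    (base / "top.txt").write_text("x", encoding="utf-8")
    (base / "a" / "mid.txt").write_text("x", encoding="utf-8")
    (base / "a" / "b" / "deep.txt").write_text("x", encoding="utf-8")
    depth0 = walk_limited(base, 0)
    depth1 = walk_limited(base, 1)
    print(depth0, depth1)
```

Assigning `dirnames = []` instead of `dirnames[:] = []` would rebind the local name and leave the list that `os.walk` iterates untouched, so the slice assignment is the important part.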
sortai-0.1.0/sortai.egg-info/PKG-INFO
ADDED
@@ -0,0 +1,158 @@
+Metadata-Version: 2.4
+Name: sortai
+Version: 0.1.0
+Summary: LLM-powered directory organizer using Google Gemini
+Author: sortai
+License: MIT
+Keywords: cli,organize,files,gemini,llm
+Classifier: Development Status :: 4 - Beta
+Classifier: Environment :: Console
+Classifier: Intended Audience :: Developers
+Classifier: Programming Language :: Python :: 3
+Classifier: Programming Language :: Python :: 3.9
+Classifier: Programming Language :: Python :: 3.10
+Classifier: Programming Language :: Python :: 3.11
+Classifier: Programming Language :: Python :: 3.12
+Requires-Python: >=3.9
+Description-Content-Type: text/markdown
+Requires-Dist: click>=8.0
+Requires-Dist: google-generativeai>=0.3.0
+Requires-Dist: pdfplumber>=0.10.0
+Requires-Dist: python-docx>=1.0
+
+# sortai
+
+LLM-powered directory organizer. Uses **Google Gemini** to suggest a folder structure from filenames and (for text-based files) the first ~500 characters of content, then moves files into the suggested subfolders.
+
+- **Dry-run by default** – see exactly what would move where before touching anything.
+- **Confirm before apply** – with `--apply`, you are prompted to confirm before any files are moved.
+
+## Install
+
+```bash
+pip install sortai
+```
+
+Development install from source:
+
+```bash
+git clone <repo>
+cd sortai
+pip install -e .
+```
+
+## Setup
+
+Set your Google Gemini API key (required):
+
+```bash
+export GEMINI_API_KEY=your_key_here
+```
+
+Get a key at: **https://aistudio.google.com/app/apikey**
+
+You can copy `.env.example` to `.env` and set `GEMINI_API_KEY` there; load it with your shell or a tool like `python-dotenv` if you use one (sortai does not load `.env` automatically).
+
+## Demo
+
+<!-- TODO: add demo GIF -->
+
+
+*(Placeholder: add a short GIF showing `sortai ./folder`, dry-run output, then `--apply` and confirmation.)*
+
+## Usage
+
+| Command | Description |
+|--------|-------------|
+| `sortai <path>` | Dry-run: show what would be moved where (default). |
+| `sortai <path> --apply` | After dry-run, prompt and then actually move files. |
+| `sortai <path> --depth 2` | Organize up to 2 levels of subfolders (e.g. `documents/work`). |
+| `sortai <path> --model gemini-1.5-flash` | Override Gemini model (default: gemini-1.5-flash). |
+| `sortai --version` | Print version. |
+| `sortai --help` | Show help. |
+
+### Example output
+
+**Before (flat directory):**
+
+```
+my-folder/
+├── report.pdf
+├── notes.txt
+├── budget.csv
+├── vacation.jpg
+└── readme.md
+```
+
+**Dry-run:**
+
+```
+$ sortai ./my-folder
+Dry run – would move:
+report.pdf -> documents/
+notes.txt -> documents/
+budget.csv -> finance/
+vacation.jpg -> images/
+readme.md -> (keep at root)
+Run with --apply to perform moves.
+```
+
+**After applying:**
+
+```
+my-folder/
+├── readme.md
+├── documents/
+│   ├── report.pdf
+│   └── notes.txt
+├── finance/
+│   └── budget.csv
+└── images/
+    └── vacation.jpg
+```
+
+## Supported file types for content reading
+
+sortai reads the **first ~500 characters** of content for:
+
+- `.pdf` (first page via pdfplumber)
+- `.txt`, `.md`, `.csv` (plain text)
+- `.docx` (paragraph text via python-docx)
+
+All other files are categorized by **filename and extension only**.
+
+## Publishing to PyPI
+
+1. **Create a PyPI account** (and optionally [Test PyPI](https://test.pypi.org/) for testing):
+   - https://pypi.org/account/register/
+
+2. **Install build tools** (one-time):
+   ```bash
+   pip install build twine
+   ```
+
+3. **Bump version** in `pyproject.toml` and `sortai/__init__.py` when releasing a new version.
+
+4. **Build the package** (from the project root):
+   ```bash
+   python -m build
+   ```
+   This creates `dist/sortai-0.1.0.tar.gz` and a wheel.
+
+5. **Upload to PyPI**:
+   ```bash
+   twine upload dist/*
+   ```
+   Twine will prompt for your PyPI username and password. Prefer an [API token](https://pypi.org/manage/account/token/) (username: `__token__`, password: your token) over your account password.
+
+   To try Test PyPI first:
+   ```bash
+   twine upload --repository testpypi dist/*
+   ```
+   Then install with: `pip install -i https://test.pypi.org/simple/ sortai`
+
+**Note:** If the name `sortai` is already taken on PyPI, change the `name` in `pyproject.toml` to something unique (e.g. `sortai-cli`) and publish under that name.
+
+## License
+
+MIT
sortai-0.1.0/sortai.egg-info/SOURCES.txt
ADDED
@@ -0,0 +1,13 @@
+README.md
+pyproject.toml
+sortai/__init__.py
+sortai/ai.py
+sortai/cli.py
+sortai/organizer.py
+sortai/reader.py
+sortai.egg-info/PKG-INFO
+sortai.egg-info/SOURCES.txt
+sortai.egg-info/dependency_links.txt
+sortai.egg-info/entry_points.txt
+sortai.egg-info/requires.txt
+sortai.egg-info/top_level.txt
sortai-0.1.0/sortai.egg-info/dependency_links.txt
ADDED
@@ -0,0 +1 @@
+
sortai-0.1.0/sortai.egg-info/top_level.txt
ADDED
@@ -0,0 +1 @@
+sortai