sortai 0.1.0__tar.gz

sortai-0.1.0/PKG-INFO ADDED
@@ -0,0 +1,158 @@
1
+ Metadata-Version: 2.4
2
+ Name: sortai
3
+ Version: 0.1.0
4
+ Summary: LLM-powered directory organizer using Google Gemini
5
+ Author: sortai
6
+ License: MIT
7
+ Keywords: cli,organize,files,gemini,llm
8
+ Classifier: Development Status :: 4 - Beta
9
+ Classifier: Environment :: Console
10
+ Classifier: Intended Audience :: Developers
11
+ Classifier: Programming Language :: Python :: 3
12
+ Classifier: Programming Language :: Python :: 3.9
13
+ Classifier: Programming Language :: Python :: 3.10
14
+ Classifier: Programming Language :: Python :: 3.11
15
+ Classifier: Programming Language :: Python :: 3.12
16
+ Requires-Python: >=3.9
17
+ Description-Content-Type: text/markdown
18
+ Requires-Dist: click>=8.0
19
+ Requires-Dist: google-generativeai>=0.3.0
20
+ Requires-Dist: pdfplumber>=0.10.0
21
+ Requires-Dist: python-docx>=1.0
22
+
23
+ # sortai
24
+
25
+ LLM-powered directory organizer. Uses **Google Gemini** to suggest a folder structure from filenames and (for text-based files) the first ~500 characters of content, then moves files into the suggested subfolders.
26
+
27
+ - **Dry-run by default** – see exactly what would move where before touching anything.
28
+ - **Confirm before apply** – with `--apply`, you are prompted to confirm before any files are moved.
29
+
30
+ ## Install
31
+
32
+ ```bash
33
+ pip install sortai
34
+ ```
35
+
36
+ Development install from source:
37
+
38
+ ```bash
39
+ git clone <repo>
40
+ cd sortai
41
+ pip install -e .
42
+ ```
43
+
44
+ ## Setup
45
+
46
+ Set your Google Gemini API key (required):
47
+
48
+ ```bash
49
+ export GEMINI_API_KEY=your_key_here
50
+ ```
51
+
52
+ Get a key at: **https://aistudio.google.com/app/apikey**
53
+
54
+ You can copy `.env.example` to `.env` and set `GEMINI_API_KEY` there; load it with your shell or a tool like `python-dotenv` if you use one (sortai does not load `.env` automatically).
55
+
56
+ ## Demo
57
+
58
+ <!-- TODO: add demo GIF -->
59
+ ![Demo](docs/demo.gif)
60
+
61
+ *(Placeholder: add a short GIF showing `sortai ./folder`, dry-run output, then `--apply` and confirmation.)*
62
+
63
+ ## Usage
64
+
65
+ | Command | Description |
66
+ |--------|-------------|
67
+ | `sortai <path>` | Dry-run: show what would be moved where (default). |
68
+ | `sortai <path> --apply` | After dry-run, prompt and then actually move files. |
69
+ | `sortai <path> --depth 2` | Organize up to 2 levels of subfolders (e.g. `documents/work`). |
70
+ | `sortai <path> --model gemini-1.5-flash` | Override Gemini model (default: gemini-1.5-flash). |
71
+ | `sortai --version` | Print version. |
72
+ | `sortai --help` | Show help. |
73
+
74
+ ### Example output
75
+
76
+ **Before (flat directory):**
77
+
78
+ ```
79
+ my-folder/
80
+ ├── report.pdf
81
+ ├── notes.txt
82
+ ├── budget.csv
83
+ ├── vacation.jpg
84
+ └── readme.md
85
+ ```
86
+
87
+ **Dry-run:**
88
+
89
+ ```
90
+ $ sortai ./my-folder
91
+ Dry run – would move:
92
+ report.pdf -> documents/
93
+ notes.txt -> documents/
94
+ budget.csv -> finance/
95
+ vacation.jpg -> images/
96
+ readme.md -> (keep at root)
97
+ Run with --apply to perform moves.
98
+ ```
99
+
100
+ **After applying:**
101
+
102
+ ```
103
+ my-folder/
104
+ ├── readme.md
105
+ ├── documents/
106
+ │ ├── report.pdf
107
+ │ └── notes.txt
108
+ ├── finance/
109
+ │ └── budget.csv
110
+ └── images/
111
+ └── vacation.jpg
112
+ ```
113
+
114
+ ## Supported file types for content reading
115
+
116
+ sortai reads the **first ~500 characters** of content for:
117
+
118
+ - `.pdf` (first page via pdfplumber)
119
+ - `.txt`, `.md`, `.csv` (plain text)
120
+ - `.docx` (paragraph text via python-docx)
121
+
122
+ All other files are categorized by **filename and extension only**.
123
+
124
+ ## Publishing to PyPI
125
+
126
+ 1. **Create a PyPI account** (and optionally [Test PyPI](https://test.pypi.org/) for testing):
127
+ - https://pypi.org/account/register/
128
+
129
+ 2. **Install build tools** (one-time):
130
+ ```bash
131
+ pip install build twine
132
+ ```
133
+
134
+ 3. **Bump version** in `pyproject.toml` and `sortai/__init__.py` when releasing a new version.
135
+
136
+ 4. **Build the package** (from the project root):
137
+ ```bash
138
+ python -m build
139
+ ```
140
+ This creates `dist/sortai-0.1.0.tar.gz` and a wheel.
141
+
142
+ 5. **Upload to PyPI**:
143
+ ```bash
144
+ twine upload dist/*
145
+ ```
146
+ Twine will prompt for credentials. Use an [API token](https://pypi.org/manage/account/token/) (username: `__token__`, password: your token); PyPI no longer accepts account passwords for uploads.
147
+
148
+ To try Test PyPI first:
149
+ ```bash
150
+ twine upload --repository testpypi dist/*
151
+ ```
152
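+ Then install with: `pip install -i https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ sortai` (the extra index lets pip resolve dependencies that are not published on Test PyPI).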
153
+
154
+ **Note:** If the name `sortai` is already taken on PyPI, change the `name` in `pyproject.toml` to something unique (e.g. `sortai-cli`) and publish under that name.
155
+
156
+ ## License
157
+
158
+ MIT
sortai-0.1.0/README.md ADDED
@@ -0,0 +1,136 @@
1
+ # sortai
2
+
3
+ LLM-powered directory organizer. Uses **Google Gemini** to suggest a folder structure from filenames and (for text-based files) the first ~500 characters of content, then moves files into the suggested subfolders.
4
+
5
+ - **Dry-run by default** – see exactly what would move where before touching anything.
6
+ - **Confirm before apply** – with `--apply`, you are prompted to confirm before any files are moved.
7
+
8
+ ## Install
9
+
10
+ ```bash
11
+ pip install sortai
12
+ ```
13
+
14
+ Development install from source:
15
+
16
+ ```bash
17
+ git clone <repo>
18
+ cd sortai
19
+ pip install -e .
20
+ ```
21
+
22
+ ## Setup
23
+
24
+ Set your Google Gemini API key (required):
25
+
26
+ ```bash
27
+ export GEMINI_API_KEY=your_key_here
28
+ ```
29
+
30
+ Get a key at: **https://aistudio.google.com/app/apikey**
31
+
32
+ You can copy `.env.example` to `.env` and set `GEMINI_API_KEY` there; load it with your shell or a tool like `python-dotenv` if you use one (sortai does not load `.env` automatically).
33
+
34
+ ## Demo
35
+
36
+ <!-- TODO: add demo GIF -->
37
+ ![Demo](docs/demo.gif)
38
+
39
+ *(Placeholder: add a short GIF showing `sortai ./folder`, dry-run output, then `--apply` and confirmation.)*
40
+
41
+ ## Usage
42
+
43
+ | Command | Description |
44
+ |--------|-------------|
45
+ | `sortai <path>` | Dry-run: show what would be moved where (default). |
46
+ | `sortai <path> --apply` | After dry-run, prompt and then actually move files. |
47
+ | `sortai <path> --depth 2` | Organize up to 2 levels of subfolders (e.g. `documents/work`). |
48
+ | `sortai <path> --model gemini-1.5-flash` | Override Gemini model (default: gemini-1.5-flash). |
49
+ | `sortai --version` | Print version. |
50
+ | `sortai --help` | Show help. |
51
+
52
+ ### Example output
53
+
54
+ **Before (flat directory):**
55
+
56
+ ```
57
+ my-folder/
58
+ ├── report.pdf
59
+ ├── notes.txt
60
+ ├── budget.csv
61
+ ├── vacation.jpg
62
+ └── readme.md
63
+ ```
64
+
65
+ **Dry-run:**
66
+
67
+ ```
68
+ $ sortai ./my-folder
69
+ Dry run – would move:
70
+ report.pdf -> documents/
71
+ notes.txt -> documents/
72
+ budget.csv -> finance/
73
+ vacation.jpg -> images/
74
+ readme.md -> (keep at root)
75
+ Run with --apply to perform moves.
76
+ ```
77
+
78
+ **After applying:**
79
+
80
+ ```
81
+ my-folder/
82
+ ├── readme.md
83
+ ├── documents/
84
+ │ ├── report.pdf
85
+ │ └── notes.txt
86
+ ├── finance/
87
+ │ └── budget.csv
88
+ └── images/
89
+ └── vacation.jpg
90
+ ```
91
+
92
+ ## Supported file types for content reading
93
+
94
+ sortai reads the **first ~500 characters** of content for:
95
+
96
+ - `.pdf` (first page via pdfplumber)
97
+ - `.txt`, `.md`, `.csv` (plain text)
98
+ - `.docx` (paragraph text via python-docx)
99
+
100
+ All other files are categorized by **filename and extension only**.
101
+
102
+ ## Publishing to PyPI
103
+
104
+ 1. **Create a PyPI account** (and optionally [Test PyPI](https://test.pypi.org/) for testing):
105
+ - https://pypi.org/account/register/
106
+
107
+ 2. **Install build tools** (one-time):
108
+ ```bash
109
+ pip install build twine
110
+ ```
111
+
112
+ 3. **Bump version** in `pyproject.toml` and `sortai/__init__.py` when releasing a new version.
113
+
114
+ 4. **Build the package** (from the project root):
115
+ ```bash
116
+ python -m build
117
+ ```
118
+ This creates `dist/sortai-0.1.0.tar.gz` and a wheel.
119
+
120
+ 5. **Upload to PyPI**:
121
+ ```bash
122
+ twine upload dist/*
123
+ ```
124
+ Twine will prompt for credentials. Use an [API token](https://pypi.org/manage/account/token/) (username: `__token__`, password: your token); PyPI no longer accepts account passwords for uploads.
125
+
126
+ To try Test PyPI first:
127
+ ```bash
128
+ twine upload --repository testpypi dist/*
129
+ ```
130
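+ Then install with: `pip install -i https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ sortai` (the extra index lets pip resolve dependencies that are not published on Test PyPI).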
131
+
132
+ **Note:** If the name `sortai` is already taken on PyPI, change the `name` in `pyproject.toml` to something unique (e.g. `sortai-cli`) and publish under that name.
133
+
134
+ ## License
135
+
136
+ MIT
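
The Setup section of the README notes that sortai does not load `.env` automatically. For readers who prefer not to add `python-dotenv`, the following is a minimal sketch of a standard-library loader. The helper name `load_env_file` is hypothetical and not part of sortai; it assumes simple `KEY=VALUE` lines and leaves variables that are already set untouched.

```python
import os
from pathlib import Path


def load_env_file(path, environ=None):
    """Copy simple KEY=VALUE lines from `path` into `environ` (default: os.environ).

    Hypothetical helper, not part of sortai. Variables that already exist
    are not overwritten. Returns a dict of the values that were actually set.
    """
    environ = os.environ if environ is None else environ
    loaded = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        if key.startswith("export "):
            key = key[len("export "):].strip()
        value = value.strip().strip("'\"")  # drop surrounding quotes
        if key and key not in environ:
            environ[key] = value
            loaded[key] = value
    return loaded
```

Calling `load_env_file(".env")` before invoking the CLI entry point would make `GEMINI_API_KEY` visible to `os.environ.get`, which is how the package reads it.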
@@ -0,0 +1,36 @@
1
+ [build-system]
2
+ requires = ["setuptools>=61.0"]
3
+ build-backend = "setuptools.build_meta"
4
+
5
+ [project]
6
+ name = "sortai"
7
+ version = "0.1.0"
8
+ description = "LLM-powered directory organizer using Google Gemini"
9
+ readme = "README.md"
10
+ requires-python = ">=3.9"
11
+ license = { text = "MIT" }
12
+ authors = [{ name = "sortai" }]
13
+ keywords = ["cli", "organize", "files", "gemini", "llm"]
14
+ classifiers = [
15
+ "Development Status :: 4 - Beta",
16
+ "Environment :: Console",
17
+ "Intended Audience :: Developers",
18
+ "Programming Language :: Python :: 3",
19
+ "Programming Language :: Python :: 3.9",
20
+ "Programming Language :: Python :: 3.10",
21
+ "Programming Language :: Python :: 3.11",
22
+ "Programming Language :: Python :: 3.12",
23
+ ]
24
+ dependencies = [
25
+ "click>=8.0",
26
+ "google-generativeai>=0.3.0",
27
+ "pdfplumber>=0.10.0",
28
+ "python-docx>=1.0",
29
+ ]
30
+
31
+ [project.scripts]
32
+ sortai = "sortai.cli:main"
33
+
34
+ [tool.setuptools.packages.find]
35
+ where = ["."]
36
+ include = ["sortai*"]
sortai-0.1.0/setup.cfg ADDED
@@ -0,0 +1,4 @@
1
+ [egg_info]
2
+ tag_build =
3
+ tag_date = 0
4
+
@@ -0,0 +1,3 @@
1
+ """sortai – LLM-powered directory organizer using Google Gemini."""
2
+
3
+ __version__ = "0.1.0"
@@ -0,0 +1,105 @@
+ """Gemini API client: build prompt from file list, call model, parse JSON moves."""
+
+ import json
+ import os
+ import re
+ from typing import Any
+
+ GEMINI_API_KEY_URL = "https://aistudio.google.com/app/apikey"
+
+
+ class MissingApiKeyError(Exception):
+     """Raised when GEMINI_API_KEY is not set."""
+
+     def __init__(self) -> None:
+         super().__init__(
+             "GEMINI_API_KEY is not set. Get an API key at " + GEMINI_API_KEY_URL
+         )
+
+
+ def get_moves(
+     file_list: list[dict],
+     depth: int,
+     model_name: str = "gemini-1.5-flash",
+ ) -> list[tuple[str, str]]:
+     """
+     Call Gemini to suggest folder structure. Returns list of (relative_path, target_folder).
+     target_folder may be "." for root or e.g. "documents" or "documents/work" when depth > 1.
+     Raises MissingApiKeyError if GEMINI_API_KEY is not set.
+     """
+     api_key = os.environ.get("GEMINI_API_KEY")
+     if not api_key or not api_key.strip():
+         raise MissingApiKeyError()
+
+     import google.generativeai as genai
+     genai.configure(api_key=api_key.strip())
+     model = genai.GenerativeModel(model_name)
+
+     prompt = _build_prompt(file_list, depth)
+     response = model.generate_content(prompt)
+     text = (response.text or "").strip()
+     return _parse_moves(text, file_list)
+
+
+ def _build_prompt(file_list: list[dict], depth: int) -> str:
+     """Build the prompt for Gemini with file list and depth rules."""
+     lines = [
+         "You are organizing files in a directory. Given the list of files below (with optional content previews), suggest a folder structure.",
+         "",
+         "Rules:",
+         "- Use only the relative paths exactly as given in the file list.",
+         f"- Maximum folder depth is {depth}. So target_folder must be at most {depth} path segments (e.g. for depth 1 use a single folder name like 'documents'; for depth 2 you can use 'documents/work').",
+         "- Use forward slashes in target_folder (e.g. 'documents/work').",
+         "- To leave a file at the root, use target_folder: '.'.",
+         "- Do not suggest moving outside the given directory or using absolute paths.",
+         "- Output ONLY a single JSON object, no other text. Format:",
+         '{"moves": [{"path": "filename.txt", "target_folder": "documents"}, ...]}',
+         "",
+         "File list:",
+     ]
+     for item in file_list:
+         path = item.get("path", "")
+         preview = item.get("content_preview")
+         if preview:
+             lines.append(f"- {path}")
+             lines.append(f" content_preview: {preview!r}")
+         else:
+             lines.append(f"- {path} (filename/extension only)")
+     return "\n".join(lines)
+
+
+ def _parse_moves(response_text: str, file_list: list[dict]) -> list[tuple[str, str]]:
+     """Extract JSON from response, validate paths against file_list, return list of (path, target_folder)."""
+     valid_paths = {item["path"] for item in file_list}
+
+     # Strip markdown code fences if present
+     text = response_text.strip()
+     match = re.search(r"```(?:json)?\s*([\s\S]*?)\s*```", text)
+     if match:
+         text = match.group(1).strip()
+     # Try raw parse in case there's no fence
+     try:
+         data = json.loads(text)
+     except json.JSONDecodeError:
+         return []
+
+     moves_raw = data.get("moves") if isinstance(data, dict) else None  # a top-level list or string has no .get()
+     if not isinstance(moves_raw, list):
+         return []
+
+     result = []
+     for entry in moves_raw:
+         if not isinstance(entry, dict):
+             continue
+         path = entry.get("path")
+         target = entry.get("target_folder")
+         if path is None or target is None:
+             continue
+         path = str(path).strip()
+         target = str(target).strip().replace("\\", "/")
+         if path not in valid_paths:
+             continue
+         if not target:
+             target = "."
+         result.append((path, target))
+     return result
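
To make the parsing step in `sortai/ai.py` easier to try in isolation, here is a standalone sketch of the same approach: strip an optional Markdown code fence, parse the JSON, and keep only entries whose `path` was in the original file list. The function name `extract_moves` and the `FENCE` constant are illustrative assumptions, not names from the package. The sketch treats any top-level JSON value that is not an object as "no moves".

```python
import json
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable


def extract_moves(response_text, valid_paths):
    """Return (path, target_folder) pairs from a model reply, dropping unknown paths."""
    text = response_text.strip()
    fenced = re.search(FENCE + r"(?:json)?\s*([\s\S]*?)\s*" + FENCE, text)
    if fenced:
        text = fenced.group(1).strip()
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return []
    # Only a JSON object can carry a "moves" key; lists and strings are ignored.
    moves = data.get("moves") if isinstance(data, dict) else None
    if not isinstance(moves, list):
        return []
    result = []
    for entry in moves:
        if not isinstance(entry, dict):
            continue
        path, target = entry.get("path"), entry.get("target_folder")
        if path is None or target is None:
            continue
        path = str(path).strip()
        if path not in valid_paths:
            continue  # the model referred to a file that was never listed
        # Normalise Windows separators; an empty target means "keep at root".
        target = str(target).strip().replace("\\", "/") or "."
        result.append((path, target))
    return result
```

Filtering against `valid_paths` is the important design choice here: whatever the model returns, only files that were actually listed can end up in the move plan.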
@@ -0,0 +1,105 @@
+ """Click CLI entrypoint for sortai."""
+
+ import os
+ from pathlib import Path
+
+ import click
+
+ from sortai import __version__
+ from sortai.ai import GEMINI_API_KEY_URL, MissingApiKeyError, get_moves
+ from sortai.organizer import apply_moves, confirm, dry_run
+ from sortai.reader import list_files
+
+
+ @click.command()
+ @click.argument(
+     "path",
+     type=click.Path(exists=True, file_okay=False, path_type=Path),
+     required=False,
+     default=None,
+ )
+ @click.option(
+     "--apply",
+     is_flag=True,
+     default=False,
+     help="Actually move files after confirmation (default: dry-run only).",
+ )
+ @click.option(
+     "--depth",
+     type=int,
+     default=1,
+     help="Organize recursively up to N levels of subfolders (default: 1).",
+ )
+ @click.option(
+     "--model",
+     type=str,
+     default="gemini-1.5-flash",
+     help="Gemini model name (default: gemini-1.5-flash).",
+ )
+ @click.option(
+     "--version",
+     "show_version",
+     is_flag=True,
+     default=False,
+     help="Show version and exit.",
+ )
+ def main(
+     path: "Path | None",  # quoted: a bare `Path | None` raises TypeError at import time on Python 3.9
+     apply: bool,
+     depth: int,
+     model: str,
+     show_version: bool,
+ ) -> None:
+     """Organize files in a directory using Google Gemini."""
+     if show_version:
+         click.echo(f"sortai {__version__}")
+         raise SystemExit(0)
+
+     if path is None:
+         click.echo("Error: PATH is required. Use --help for usage.", err=True)
+         raise SystemExit(1)
+
+     root = path.resolve()
+     if not root.is_dir():
+         click.echo(f"Error: not a directory: {path}", err=True)
+         raise SystemExit(1)
+
+     if not (os.environ.get("GEMINI_API_KEY") or "").strip():
+         click.echo("Error: GEMINI_API_KEY is not set.", err=True)
+         click.echo(f"Get an API key at: {GEMINI_API_KEY_URL}", err=True)
+         raise SystemExit(1)
+
+     file_list = list_files(root, max_depth=depth)
+     if not file_list:
+         click.echo("No files found to organize.")
+         raise SystemExit(0)
+
+     try:
+         moves = get_moves(file_list, depth=depth, model_name=model)
+     except MissingApiKeyError as e:
+         click.echo(f"Error: {e}", err=True)
+         click.echo(f"Get an API key at: {GEMINI_API_KEY_URL}", err=True)
+         raise SystemExit(1)
+     except Exception as e:
+         click.echo(f"Error calling Gemini: {e}", err=True)
+         raise SystemExit(1)
+
+     if not moves:
+         click.echo("No moves suggested.")
+         raise SystemExit(0)
+
+     dry_run(root, moves, echo=click.echo)
+     if not apply:
+         click.echo("Run with --apply to perform moves.")
+         raise SystemExit(0)
+
+     if not confirm(echo=click.echo):
+         click.echo("Aborted.")
+         raise SystemExit(0)
+
+     apply_moves(root, moves, echo=click.echo)
+     click.echo("Done.")
+
+
+ if __name__ == "__main__":
+     main()
@@ -0,0 +1,53 @@
+ """Dry-run display, confirmation prompt, and file move execution."""
+
+ import os
+ import shutil
+ from pathlib import Path
+ from typing import Callable
+
+
+ def dry_run(root: Path, moves: list[tuple[str, str]], echo: Callable[[str], None]) -> None:
+     """Print what would be moved where. No filesystem changes."""
+     root = root.resolve()
+     echo("Dry run – would move:")
+     for rel_path, target_folder in moves:
+         if target_folder == ".":
+             dest_desc = "(keep at root)"
+         else:
+             dest_desc = f"{target_folder}/"
+         echo(f" {rel_path} -> {dest_desc}")
+
+
+ def confirm(echo: Callable[[str], None]) -> bool:
+     """Prompt 'Apply these moves? [y/N]'. Returns True for y/yes, False otherwise."""
+     try:
+         answer = input("Apply these moves? [y/N] ").strip().lower()
+     except EOFError:
+         return False
+     return answer in ("y", "yes")
+
+
+ def apply_moves(root: Path, moves: list[tuple[str, str]], echo: Callable[[str], None]) -> None:
+     """
+     Create target dirs and move files. Skips and warns if destination file already exists
+     (no overwrite without user consent).
+     """
+     root = root.resolve()
+     for rel_path, target_folder in moves:
+         src = root / rel_path.replace("/", os.sep)
+         if not src.is_file():
+             echo(f" Skip (not a file): {rel_path}")
+             continue
+         if target_folder == ".":
+             continue
+         target_dir = root / target_folder.replace("/", os.sep)
+         target_dir.mkdir(parents=True, exist_ok=True)
+         dest = target_dir / src.name
+         if dest.exists() and dest.resolve() != src.resolve():
+             echo(f" Skip (destination exists): {rel_path} -> {dest.relative_to(root)}")
+             continue
+         try:
+             shutil.move(str(src), str(dest))
+             echo(f" Moved: {rel_path} -> {target_folder}/")
+         except OSError as e:
+             echo(f" Error moving {rel_path}: {e}")
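
`apply_moves` joins the model-supplied `target_folder` onto `root` as given. The prompt asks the model to stay inside the directory and within the depth limit, but a check before moving would enforce that. Below is a minimal sketch of such a check. The helper `is_safe_target` is hypothetical and not part of sortai, and it is deliberately conservative: it rejects any segment containing a colon, so that Windows drive prefixes cannot slip through.

```python
from pathlib import PurePosixPath


def is_safe_target(target_folder, max_depth):
    """Return True if target_folder is '.', or a relative path of at most
    max_depth plain segments with no parent references.

    Hypothetical helper, not part of sortai.
    """
    target = target_folder.strip().replace("\\", "/")
    if target in ("", "."):
        return True  # "keep at root" is always acceptable
    path = PurePosixPath(target)
    if path.is_absolute():
        return False
    parts = path.parts
    # Reject ".." segments and drive-like prefixes such as "C:".
    if any(part == ".." or ":" in part for part in parts):
        return False
    return len(parts) <= max_depth
```

A caller could skip, and report, any move for which `is_safe_target(target_folder, depth)` is false before creating directories or calling `shutil.move`.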
@@ -0,0 +1,101 @@
+ """File listing and content extraction (first ~500 chars) for text-based types."""
+
+ import os
+ from pathlib import Path
+ from typing import Optional
+
+ CONTENT_PREVIEW_LENGTH = 500
+
+ # Extensions for which we read file content; everything else is categorized by filename/extension only.
+ CONTENT_EXTENSIONS = {".pdf", ".txt", ".md", ".docx", ".csv"}
+
+
+ def _read_text_preview(path: Path, limit: int = CONTENT_PREVIEW_LENGTH) -> Optional[str]:
+     """Read first `limit` characters from a text file. Returns None on error."""
+     try:
+         with open(path, encoding="utf-8", errors="replace") as f:
+             return (f.read(limit + 1))[:limit] or None
+     except OSError:
+         return None
+
+
+ def _read_pdf_preview(path: Path, limit: int = CONTENT_PREVIEW_LENGTH) -> Optional[str]:
+     """Extract text from first page of PDF and truncate. Returns None on error."""
+     try:
+         import pdfplumber
+         with pdfplumber.open(path) as pdf:
+             if not pdf.pages:
+                 return None
+             text = pdf.pages[0].extract_text()
+             if not text:
+                 return None
+             return (text[: limit + 1])[:limit]
+     except Exception:
+         return None
+
+
+ def _read_docx_preview(path: Path, limit: int = CONTENT_PREVIEW_LENGTH) -> Optional[str]:
+     """Extract paragraph text from docx and truncate. Returns None on error."""
+     try:
+         from docx import Document
+         doc = Document(path)
+         parts = []
+         n = 0
+         for para in doc.paragraphs:
+             if para.text:
+                 parts.append(para.text)
+                 n += len(para.text)
+                 if n >= limit:
+                     break
+         text = " ".join(parts)
+         return (text[: limit + 1])[:limit] if text else None
+     except Exception:
+         return None
+
+
+ def get_content_preview(path: Path, limit: int = CONTENT_PREVIEW_LENGTH) -> Optional[str]:
+     """Return first ~limit characters of file content for supported types, else None."""
+     ext = path.suffix.lower()
+     if ext in (".txt", ".md", ".csv"):
+         return _read_text_preview(path, limit)
+     if ext == ".pdf":
+         return _read_pdf_preview(path, limit)
+     if ext == ".docx":
+         return _read_docx_preview(path, limit)
+     return None
+
+
+ def list_files(
+     root: Path,
+     max_depth: Optional[int] = None,
+ ) -> list[dict]:
+     """
+     Walk directory up to max_depth levels; return list of dicts with path (relative),
+     name (filename), and content_preview (first ~500 chars for supported types, else None).
+     """
+     root = root.resolve()
+     if not root.is_dir():
+         return []
+
+     result = []
+     for dirpath, dirnames, filenames in os.walk(root):
+         dirpath = Path(dirpath)
+         depth = len(dirpath.relative_to(root).parts) if dirpath != root else 0
+         if max_depth is not None and depth > max_depth:
+             continue
+         if max_depth is not None and depth >= max_depth:
+             dirnames[:] = []  # do not descend further
+         for name in filenames:
+             full_path = dirpath / name
+             try:
+                 rel = full_path.relative_to(root)
+             except ValueError:
+                 continue
+             rel_str = str(rel).replace("\\", "/")
+             content = get_content_preview(full_path) if full_path.suffix.lower() in CONTENT_EXTENSIONS else None
+             result.append({
+                 "path": rel_str,
+                 "name": name,
+                 "content_preview": content,
+             })
+     return result
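
The depth limit in `list_files` relies on a documented property of `os.walk`: in top-down mode, editing `dirnames` in place controls which subdirectories are visited next. The following standalone sketch isolates that technique. The function name `walk_limited` is an assumption made for illustration, and it yields only relative paths rather than the dictionaries the package builds.

```python
import os
from pathlib import Path


def walk_limited(root, max_depth):
    """Yield POSIX-style relative file paths located at most `max_depth`
    directory levels below `root` (0 means files directly in root)."""
    root = Path(root).resolve()
    for dirpath, dirnames, filenames in os.walk(root):
        depth = len(Path(dirpath).relative_to(root).parts)
        if depth >= max_depth:
            # In-place edit: os.walk will not descend into any subdirectory here.
            dirnames[:] = []
        for name in sorted(filenames):
            yield (Path(dirpath) / name).relative_to(root).as_posix()
```

Pruning with `dirnames[:] = []` avoids walking deep trees only to discard the results, which matters when the target directory contains large nested folders.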
@@ -0,0 +1,158 @@
1
+ Metadata-Version: 2.4
2
+ Name: sortai
3
+ Version: 0.1.0
4
+ Summary: LLM-powered directory organizer using Google Gemini
5
+ Author: sortai
6
+ License: MIT
7
+ Keywords: cli,organize,files,gemini,llm
8
+ Classifier: Development Status :: 4 - Beta
9
+ Classifier: Environment :: Console
10
+ Classifier: Intended Audience :: Developers
11
+ Classifier: Programming Language :: Python :: 3
12
+ Classifier: Programming Language :: Python :: 3.9
13
+ Classifier: Programming Language :: Python :: 3.10
14
+ Classifier: Programming Language :: Python :: 3.11
15
+ Classifier: Programming Language :: Python :: 3.12
16
+ Requires-Python: >=3.9
17
+ Description-Content-Type: text/markdown
18
+ Requires-Dist: click>=8.0
19
+ Requires-Dist: google-generativeai>=0.3.0
20
+ Requires-Dist: pdfplumber>=0.10.0
21
+ Requires-Dist: python-docx>=1.0
22
+
23
+ # sortai
24
+
25
+ LLM-powered directory organizer. Uses **Google Gemini** to suggest a folder structure from filenames and (for text-based files) the first ~500 characters of content, then moves files into the suggested subfolders.
26
+
27
+ - **Dry-run by default** – see exactly what would move where before touching anything.
28
+ - **Confirm before apply** – with `--apply`, you are prompted to confirm before any files are moved.
29
+
30
+ ## Install
31
+
32
+ ```bash
33
+ pip install sortai
34
+ ```
35
+
36
+ Development install from source:
37
+
38
+ ```bash
39
+ git clone <repo>
40
+ cd sortai
41
+ pip install -e .
42
+ ```
43
+
44
+ ## Setup
45
+
46
+ Set your Google Gemini API key (required):
47
+
48
+ ```bash
49
+ export GEMINI_API_KEY=your_key_here
50
+ ```
51
+
52
+ Get a key at: **https://aistudio.google.com/app/apikey**
53
+
54
+ You can copy `.env.example` to `.env` and set `GEMINI_API_KEY` there; load it with your shell or a tool like `python-dotenv` if you use one (sortai does not load `.env` automatically).
55
+
56
+ ## Demo
57
+
58
+ <!-- TODO: add demo GIF -->
59
+ ![Demo](docs/demo.gif)
60
+
61
+ *(Placeholder: add a short GIF showing `sortai ./folder`, dry-run output, then `--apply` and confirmation.)*
62
+
63
+ ## Usage
64
+
65
+ | Command | Description |
66
+ |--------|-------------|
67
+ | `sortai <path>` | Dry-run: show what would be moved where (default). |
68
+ | `sortai <path> --apply` | After dry-run, prompt and then actually move files. |
69
+ | `sortai <path> --depth 2` | Organize up to 2 levels of subfolders (e.g. `documents/work`). |
70
+ | `sortai <path> --model gemini-1.5-flash` | Override Gemini model (default: gemini-1.5-flash). |
71
+ | `sortai --version` | Print version. |
72
+ | `sortai --help` | Show help. |
73
+
74
+ ### Example output
75
+
76
+ **Before (flat directory):**
77
+
78
+ ```
79
+ my-folder/
80
+ ├── report.pdf
81
+ ├── notes.txt
82
+ ├── budget.csv
83
+ ├── vacation.jpg
84
+ └── readme.md
85
+ ```
86
+
87
+ **Dry-run:**
88
+
89
+ ```
90
+ $ sortai ./my-folder
91
+ Dry run – would move:
92
+ report.pdf -> documents/
93
+ notes.txt -> documents/
94
+ budget.csv -> finance/
95
+ vacation.jpg -> images/
96
+ readme.md -> (keep at root)
97
+ Run with --apply to perform moves.
98
+ ```
99
+
100
+ **After applying:**
101
+
102
+ ```
103
+ my-folder/
104
+ ├── readme.md
105
+ ├── documents/
106
+ │ ├── report.pdf
107
+ │ └── notes.txt
108
+ ├── finance/
109
+ │ └── budget.csv
110
+ └── images/
111
+ └── vacation.jpg
112
+ ```
113
+
114
+ ## Supported file types for content reading
115
+
116
+ sortai reads the **first ~500 characters** of content for:
117
+
118
+ - `.pdf` (first page via pdfplumber)
119
+ - `.txt`, `.md`, `.csv` (plain text)
120
+ - `.docx` (paragraph text via python-docx)
121
+
122
+ All other files are categorized by **filename and extension only**.
123
+
124
+ ## Publishing to PyPI
125
+
126
+ 1. **Create a PyPI account** (and optionally [Test PyPI](https://test.pypi.org/) for testing):
127
+ - https://pypi.org/account/register/
128
+
129
+ 2. **Install build tools** (one-time):
130
+ ```bash
131
+ pip install build twine
132
+ ```
133
+
134
+ 3. **Bump version** in `pyproject.toml` and `sortai/__init__.py` when releasing a new version.
135
+
136
+ 4. **Build the package** (from the project root):
137
+ ```bash
138
+ python -m build
139
+ ```
140
+ This creates `dist/sortai-0.1.0.tar.gz` and a wheel.
141
+
142
+ 5. **Upload to PyPI**:
143
+ ```bash
144
+ twine upload dist/*
145
+ ```
146
+ Twine will prompt for credentials. Use an [API token](https://pypi.org/manage/account/token/) (username: `__token__`, password: your token); PyPI no longer accepts account passwords for uploads.
147
+
148
+ To try Test PyPI first:
149
+ ```bash
150
+ twine upload --repository testpypi dist/*
151
+ ```
152
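+ Then install with: `pip install -i https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ sortai` (the extra index lets pip resolve dependencies that are not published on Test PyPI).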
153
+
154
+ **Note:** If the name `sortai` is already taken on PyPI, change the `name` in `pyproject.toml` to something unique (e.g. `sortai-cli`) and publish under that name.
155
+
156
+ ## License
157
+
158
+ MIT
@@ -0,0 +1,13 @@
1
+ README.md
2
+ pyproject.toml
3
+ sortai/__init__.py
4
+ sortai/ai.py
5
+ sortai/cli.py
6
+ sortai/organizer.py
7
+ sortai/reader.py
8
+ sortai.egg-info/PKG-INFO
9
+ sortai.egg-info/SOURCES.txt
10
+ sortai.egg-info/dependency_links.txt
11
+ sortai.egg-info/entry_points.txt
12
+ sortai.egg-info/requires.txt
13
+ sortai.egg-info/top_level.txt
@@ -0,0 +1,2 @@
1
+ [console_scripts]
2
+ sortai = sortai.cli:main
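
The `console_scripts` entry above maps the `sortai` command to an object reference of the form `module:attribute`. As a rough illustration of how such a reference can be resolved, here is a simplified sketch. The helper `resolve_entry_point` is hypothetical; real installers generate a wrapper script and do not necessarily use this exact code.

```python
import importlib


def resolve_entry_point(spec):
    """Resolve a 'package.module:attribute' string to the object it names.

    Simplified, hypothetical helper: imports the module part, then follows
    the dotted attribute path after the colon (if any).
    """
    module_name, _, attr_path = spec.partition(":")
    obj = importlib.import_module(module_name)
    for attr in filter(None, attr_path.split(".")):
        obj = getattr(obj, attr)
    return obj
```

With the package installed, `resolve_entry_point("sortai.cli:main")` would return the Click command that the `sortai` executable calls.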
@@ -0,0 +1,4 @@
1
+ click>=8.0
2
+ google-generativeai>=0.3.0
3
+ pdfplumber>=0.10.0
4
+ python-docx>=1.0
@@ -0,0 +1 @@
1
+ sortai