tensor-grep 0.1.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- tensor_grep-0.1.0/.github/workflows/benchmark.yml +13 -0
- tensor_grep-0.1.0/.github/workflows/release.yml +102 -0
- tensor_grep-0.1.0/.gitignore +31 -0
- tensor_grep-0.1.0/.hypothesis/constants/0321b4d9a56ecd5d +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/08b4847abc9ccb4c +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/09619f2e7b2664f4 +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/097541b56a59acb5 +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/0aa14fe3f4d66eb8 +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/1987ea0e0b593c3b +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/244a932d4f46a9e8 +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/259b95774007f88d +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/26f238250b8499f1 +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/2d172376ce08cd3e +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/2fec718b7d5b449f +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/3881db1d5f997c09 +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/468c441a699c5e31 +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/54e4b49940b89cd5 +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/5a48abcf717d7583 +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/6524b0fcb7a55341 +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/67abd2ccbf381759 +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/6b442457f5745159 +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/6dfa700ee8bb8fa5 +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/72f590d9256fc7a6 +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/740a6fb9febec20c +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/74ea32f85ecfd56c +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/a3fd68449bfef454 +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/ab596ae2844fd274 +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/ac1d131da99aea9b +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/b0714817c49af4f1 +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/b0f490b7b2d6bf29 +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/c30bf139df59b92f +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/c63b8b09cb09eb60 +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/cabbe58a6de53f9b +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/cba1b20bea830cc3 +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/d7e5661845384844 +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/da39a3ee5e6b4b0d +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/dff29feff30d7749 +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/e2c8c4005a3c9dad +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/f06fa362a827014e +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/f167f3c3dea5b11a +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/f46aa3ceacc645fd +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/f5078de4866a65c6 +4 -0
- tensor_grep-0.1.0/.hypothesis/constants/f9ba50acb0fd2077 +4 -0
- tensor_grep-0.1.0/.hypothesis/patches/2026-02-24--4ebab487.patch +16 -0
- tensor_grep-0.1.0/.hypothesis/tmp/tmp30a3q4k9 +0 -0
- tensor_grep-0.1.0/.hypothesis/tmp/tmp6up_04ih +0 -0
- tensor_grep-0.1.0/.hypothesis/tmp/tmp7hc5c3hh +0 -0
- tensor_grep-0.1.0/.hypothesis/tmp/tmpdyqfr9tp +0 -0
- tensor_grep-0.1.0/.hypothesis/tmp/tmpjnolnllk +0 -0
- tensor_grep-0.1.0/.hypothesis/tmp/tmpmmxg2dc2 +0 -0
- tensor_grep-0.1.0/.hypothesis/tmp/tmppojmtll6 +0 -0
- tensor_grep-0.1.0/.hypothesis/tmp/tmpzkk8myux +0 -0
- tensor_grep-0.1.0/.hypothesis/unicode_data/16.0.0/charmap.json.gz +0 -0
- tensor_grep-0.1.0/.hypothesis/unicode_data/16.0.0/codec-utf-8.json.gz +0 -0
- tensor_grep-0.1.0/PKG-INFO +32 -0
- tensor_grep-0.1.0/README.md +87 -0
- tensor_grep-0.1.0/benchmarks/run_ast_benchmarks.py +127 -0
- tensor_grep-0.1.0/benchmarks/run_benchmarks.py +209 -0
- tensor_grep-0.1.0/docs/architecture.md +19 -0
- tensor_grep-0.1.0/docs/benchmarks.md +18 -0
- tensor_grep-0.1.0/docs/benchmarks_ast.md +18 -0
- tensor_grep-0.1.0/docs/index.md +22 -0
- tensor_grep-0.1.0/docs/installation.md +37 -0
- tensor_grep-0.1.0/docs/planning/ENTERPRISE_PLAN.md +44 -0
- tensor_grep-0.1.0/docs/planning/PROJECT_PLAN.md +1088 -0
- tensor_grep-0.1.0/docs/planning/V1_RELEASE_PLAN.md +80 -0
- tensor_grep-0.1.0/main.build/.gitignore +1 -0
- tensor_grep-0.1.0/mkdocs.yml +46 -0
- tensor_grep-0.1.0/npm/install.js +70 -0
- tensor_grep-0.1.0/npm/package.json +32 -0
- tensor_grep-0.1.0/oimiragieo.tensor-grep.yaml +22 -0
- tensor_grep-0.1.0/pyproject.toml +84 -0
- tensor_grep-0.1.0/scripts/build_binaries.py +47 -0
- tensor_grep-0.1.0/scripts/verify_gpu.py +36 -0
- tensor_grep-0.1.0/src/tensor_grep/__init__.py +0 -0
- tensor_grep-0.1.0/src/tensor_grep/backends/__init__.py +0 -0
- tensor_grep-0.1.0/src/tensor_grep/backends/ast_backend.py +162 -0
- tensor_grep-0.1.0/src/tensor_grep/backends/base.py +12 -0
- tensor_grep-0.1.0/src/tensor_grep/backends/cpu_backend.py +88 -0
- tensor_grep-0.1.0/src/tensor_grep/backends/cudf_backend.py +136 -0
- tensor_grep-0.1.0/src/tensor_grep/backends/cybert_backend.py +65 -0
- tensor_grep-0.1.0/src/tensor_grep/backends/torch_backend.py +166 -0
- tensor_grep-0.1.0/src/tensor_grep/cli/__init__.py +0 -0
- tensor_grep-0.1.0/src/tensor_grep/cli/main.py +582 -0
- tensor_grep-0.1.0/src/tensor_grep/core/__init__.py +0 -0
- tensor_grep-0.1.0/src/tensor_grep/core/config.py +123 -0
- tensor_grep-0.1.0/src/tensor_grep/core/pipeline.py +51 -0
- tensor_grep-0.1.0/src/tensor_grep/core/query_analyzer.py +23 -0
- tensor_grep-0.1.0/src/tensor_grep/core/result.py +19 -0
- tensor_grep-0.1.0/src/tensor_grep/formatters/__init__.py +0 -0
- tensor_grep-0.1.0/src/tensor_grep/formatters/base.py +7 -0
- tensor_grep-0.1.0/src/tensor_grep/formatters/csv_fmt.py +15 -0
- tensor_grep-0.1.0/src/tensor_grep/formatters/json_fmt.py +17 -0
- tensor_grep-0.1.0/src/tensor_grep/formatters/ripgrep_fmt.py +41 -0
- tensor_grep-0.1.0/src/tensor_grep/formatters/table_fmt.py +10 -0
- tensor_grep-0.1.0/src/tensor_grep/gpu/__init__.py +0 -0
- tensor_grep-0.1.0/src/tensor_grep/gpu/device_detect.py +60 -0
- tensor_grep-0.1.0/src/tensor_grep/gpu/memory_manager.py +33 -0
- tensor_grep-0.1.0/src/tensor_grep/io/__init__.py +0 -0
- tensor_grep-0.1.0/src/tensor_grep/io/base.py +6 -0
- tensor_grep-0.1.0/src/tensor_grep/io/directory_scanner.py +84 -0
- tensor_grep-0.1.0/src/tensor_grep/io/reader_cudf.py +22 -0
- tensor_grep-0.1.0/src/tensor_grep/io/reader_dstorage.py +17 -0
- tensor_grep-0.1.0/src/tensor_grep/io/reader_fallback.py +24 -0
- tensor_grep-0.1.0/src/tensor_grep/io/reader_kvikio.py +15 -0
- tensor_grep-0.1.0/tensor-grep.rb +29 -0
- tensor_grep-0.1.0/tests/conftest.py +43 -0
- tensor_grep-0.1.0/tests/e2e/snapshots/test_output_snapshots/test_json_output_snapshot/json_output.json +1 -0
- tensor_grep-0.1.0/tests/e2e/test_backend_contracts.py +26 -0
- tensor_grep-0.1.0/tests/e2e/test_cli_classify.py +19 -0
- tensor_grep-0.1.0/tests/e2e/test_cli_no_gpu.py +28 -0
- tensor_grep-0.1.0/tests/e2e/test_cli_search.py +26 -0
- tensor_grep-0.1.0/tests/e2e/test_io_contracts.py +14 -0
- tensor_grep-0.1.0/tests/e2e/test_output_snapshots.py +16 -0
- tensor_grep-0.1.0/tests/e2e/test_reader_props.py +46 -0
- tensor_grep-0.1.0/tests/e2e/test_ripgrep_parity.py +24 -0
- tensor_grep-0.1.0/tests/e2e/test_throughput.py +27 -0
- tensor_grep-0.1.0/tests/e2e/test_tokenizer_props.py +15 -0
- tensor_grep-0.1.0/tests/e2e/test_vs_ripgrep.py +39 -0
- tensor_grep-0.1.0/tests/integration/test_cudf_read_text.py +27 -0
- tensor_grep-0.1.0/tests/integration/test_gpu_memory.py +59 -0
- tensor_grep-0.1.0/tests/integration/test_pipeline_e2e.py +23 -0
- tensor_grep-0.1.0/tests/unit/test_ast_backend.py +39 -0
- tensor_grep-0.1.0/tests/unit/test_cpu_backend.py +104 -0
- tensor_grep-0.1.0/tests/unit/test_cudf_backend.py +104 -0
- tensor_grep-0.1.0/tests/unit/test_cybert_backend.py +79 -0
- tensor_grep-0.1.0/tests/unit/test_device_detect.py +75 -0
- tensor_grep-0.1.0/tests/unit/test_directory_scanner.py +65 -0
- tensor_grep-0.1.0/tests/unit/test_formatters.py +42 -0
- tensor_grep-0.1.0/tests/unit/test_memory_manager.py +84 -0
- tensor_grep-0.1.0/tests/unit/test_pipeline.py +37 -0
- tensor_grep-0.1.0/tests/unit/test_query_analyzer.py +20 -0
- tensor_grep-0.1.0/tests/unit/test_reader_cudf.py +21 -0
- tensor_grep-0.1.0/tests/unit/test_reader_dstorage.py +39 -0
- tensor_grep-0.1.0/tests/unit/test_reader_fallback.py +40 -0
- tensor_grep-0.1.0/tests/unit/test_reader_kvikio.py +24 -0
- tensor_grep-0.1.0/tests/unit/test_result.py +13 -0
- tensor_grep-0.1.0/uv.lock +3530 -0
@@ -0,0 +1,13 @@
+name: Benchmarks
+on:
+  pull_request:
+    paths: ['src/**', 'tests/performance/**']
+jobs:
+  benchmark:
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+      - uses: actions/setup-python@v5
+        with: { python-version: '3.11' }
+      - run: pip install -e ".[dev]"
+      - run: pytest tests/performance/ -v --tb=short
@@ -0,0 +1,102 @@
+name: Release
+on:
+  push:
+    tags:
+      - 'v*'
+
+jobs:
+  build-binaries:
+    name: Build on ${{ matrix.os }}
+    runs-on: ${{ matrix.os }}
+    strategy:
+      matrix:
+        os: [ubuntu-latest, windows-latest, macos-latest]
+
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+
+      - name: Install dependencies
+        run: |
+          pip install -e ".[dev]"
+          pip install nuitka
+
+      - name: Build Binary
+        run: python build_binaries.py
+
+      - name: Rename Artifact (Windows)
+        if: runner.os == 'Windows'
+        run: mv tg.exe tg-windows-amd64.exe
+
+      - name: Rename Artifact (Linux)
+        if: runner.os == 'Linux'
+        run: mv tg.bin tg-linux-amd64
+
+      - name: Rename Artifact (macOS)
+        if: runner.os == 'macOS'
+        run: mv tg.bin tg-macos-amd64
+
+      - name: Upload Artifact
+        uses: actions/upload-artifact@v4
+        with:
+          name: binary-${{ runner.os }}
+          path: tg-*
+
+  create-release:
+    needs: build-binaries
+    runs-on: ubuntu-latest
+    permissions:
+      contents: write
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Download Artifacts
+        uses: actions/download-artifact@v4
+        with:
+          path: artifacts
+
+      - name: Create GitHub Release
+        uses: softprops/action-gh-release@v2
+        with:
+          files: artifacts/**/tg-*
+          generate_release_notes: true
+
+  publish-npm:
+    needs: create-release
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Setup Node.js
+        uses: actions/setup-node@v4
+        with:
+          node-version: '20'
+          registry-url: 'https://registry.npmjs.org'
+
+      - name: Publish NPM Package
+        working-directory: npm
+        run: npm publish --access public
+        env:
+          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
+
+  publish-docs:
+    runs-on: ubuntu-latest
+    permissions:
+      contents: write
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Set up Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: '3.11'
+
+      - name: Install mkdocs
+        run: pip install mkdocs-material
+
+      - name: Deploy Docs
+        run: mkdocs gh-deploy --force
@@ -0,0 +1,31 @@
+# Python
+__pycache__/
+*.py[cod]
+*$py.class
+
+# Environments
+.env
+.venv
+env/
+venv/
+ENV/
+
+# Distribution / packaging
+dist/
+build/
+*.egg-info/
+
+# Pytest / Coverage
+.pytest_cache/
+.coverage
+htmlcov/
+
+# Mypy
+.mypy_cache/
+
+# Ruff
+.ruff_cache/
+
+# IDEs
+.vscode/
+.idea/
@@ -0,0 +1,16 @@
+From HEAD Mon Sep 17 00:00:00 2001
+From: Hypothesis 6.151.9 <no-reply@hypothesis.works>
+Date: Wed, 25 Feb 2026 04:56:08
+Subject: [PATCH] Hypothesis: add explicit examples
+
+---
+--- ./tests\property\test_tokenizer_props.py
++++ ./tests\property\test_tokenizer_props.py
+@@ -4,6 +4,7 @@
+ pytestmark = pytest.mark.property
+ 
+ @given(st.text(min_size=1, max_size=10000, alphabet=st.characters(blacklist_categories=("Cs",))))
++@example(text="0").via("discovered failure")
+ def test_tokenizer_never_crashes_on_valid_text(text):
+     from cudf_grep.backends.cybert_backend import tokenize
+     tokens = tokenize([text])
Binary file
Binary file
Binary file
Binary file
Binary file
Binary file
Binary file
Binary file
Binary file
Binary file
@@ -0,0 +1,32 @@
+Metadata-Version: 2.4
+Name: tensor-grep
+Version: 0.1.0
+Requires-Python: >=3.11
+Requires-Dist: rich>=13.0
+Requires-Dist: typer[all]>=0.12
+Provides-Extra: ast
+Requires-Dist: tree-sitter-javascript; extra == 'ast'
+Requires-Dist: tree-sitter-python; extra == 'ast'
+Requires-Dist: tree-sitter>=0.22; extra == 'ast'
+Provides-Extra: dev
+Requires-Dist: hypothesis>=6.100; extra == 'dev'
+Requires-Dist: mutmut>=3.0; extra == 'dev'
+Requires-Dist: mypy>=1.11; extra == 'dev'
+Requires-Dist: pytest-asyncio>=0.24; extra == 'dev'
+Requires-Dist: pytest-cov>=5.0; extra == 'dev'
+Requires-Dist: pytest-mock>=3.14; extra == 'dev'
+Requires-Dist: pytest-snapshot>=0.9; extra == 'dev'
+Requires-Dist: pytest>=8.0; extra == 'dev'
+Requires-Dist: ruff>=0.6; extra == 'dev'
+Provides-Extra: gpu
+Requires-Dist: cudf-cu12; extra == 'gpu'
+Requires-Dist: kvikio-cu12; extra == 'gpu'
+Requires-Dist: torch-geometric>=2.5.0; extra == 'gpu'
+Requires-Dist: torch>=2.0; extra == 'gpu'
+Provides-Extra: gpu-win
+Requires-Dist: dstorage-gpu>=1.0; extra == 'gpu-win'
+Requires-Dist: torch-geometric>=2.5.0; extra == 'gpu-win'
+Requires-Dist: torch>=2.0; extra == 'gpu-win'
+Provides-Extra: nlp
+Requires-Dist: transformers>=4.40; extra == 'nlp'
+Requires-Dist: tritonclient[all]; extra == 'nlp'
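The `Requires-Dist` lines in the package metadata attach optional dependencies to extras through `extra == '...'` markers. As a minimal sketch (not part of the package), the grouping can be inspected with the standard library, because core metadata uses an email-header-style layout. The helper name `extras_from_metadata` and the trimmed `SAMPLE` text are illustrative assumptions, and the marker handling is a deliberately naive string split rather than a full PEP 508 parser.

```python
from collections import defaultdict
from email.parser import Parser

# Trimmed copy of the PKG-INFO metadata, for illustration only.
SAMPLE = """\
Metadata-Version: 2.4
Name: tensor-grep
Version: 0.1.0
Requires-Dist: rich>=13.0
Requires-Dist: typer[all]>=0.12
Provides-Extra: ast
Requires-Dist: tree-sitter>=0.22; extra == 'ast'
Provides-Extra: gpu
Requires-Dist: cudf-cu12; extra == 'gpu'
Requires-Dist: torch>=2.0; extra == 'gpu'
"""


def extras_from_metadata(text: str) -> dict[str, list[str]]:
    """Group Requires-Dist entries by extra; the "" key holds core requirements."""
    headers = Parser().parsestr(text, headersonly=True)
    grouped: dict[str, list[str]] = defaultdict(list)
    for entry in headers.get_all("Requires-Dist", []):
        requirement, _, marker = entry.partition(";")
        extra = ""
        if "extra ==" in marker:
            # Naive marker handling (assumption): take the quoted name after `extra ==`.
            extra = marker.split("extra ==", 1)[1].strip().strip("'\"")
        grouped[extra].append(requirement.strip())
    return dict(grouped)


if __name__ == "__main__":
    for extra, requirements in extras_from_metadata(SAMPLE).items():
        print(extra or "(core)", requirements)
```

Read this way, the metadata shows that only `rich` and `typer` are unconditional, while the GPU, AST, NLP and development stacks are opt-in extras.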
@@ -0,0 +1,87 @@
+# tensor-grep (tg)
+
+**The GPU-Accelerated Semantic Log Parsing CLI**
+
+`tensor-grep` combines the raw regex speed of traditional tools like `ripgrep` with the semantic understanding of Transformer AI networks (`cyBERT`), parallelized across multiple GPUs using NVIDIA RAPIDS `cuDF`.
+
+## Features
+* **Drop-in Replacement:** Supports 70+ `ripgrep` CLI flags (e.g., `-i`, `-v`, `-C`, `-g`, `-t`).
+* **AST-Grep Parity (NEW):** Structural code searching via PyTorch Geometric Graph Neural Networks (GNNs). Run `tg run`, `tg scan`, `tg lsp` natively on your GPU!
+* **Multi-GPU Scaling:** Automatically detects and shards massive log files across dual, quad, or enterprise GPU arrays.
+* **Semantic NLP Classification:** Utilize cyBERT to classify logs contextually (e.g. identify "ERROR" severity without explicit regexes) in a single pass.
+* **CPU Fallback Resiliency:** Works gracefully on Windows, macOS, and CPU-only systems using a resilient Python Regex backend.
+
+---
+
+## 💻 Hardware & Software Requirements
+
+`tensor-grep` runs on any machine with Python 3.11+ using its highly-optimized CPU fallback. However, to unlock its 3x-10x GPU-accelerated speeds, your system must meet these requirements:
+
+* **Hardware:**
+  * NVIDIA GPU (GTX 10-Series or newer, RTX 30/40/50 series recommended)
+  * Minimum 4GB VRAM (8GB+ recommended for massive logs)
+* **Software / Drivers:**
+  * **NVIDIA Display Drivers:** v535.xx or newer
+  * **CUDA Toolkit:** 12.0 or newer (CUDA 12.4 highly recommended)
+* **Python Environments:**
+  * **Linux / WSL2:** Requires NVIDIA RAPIDS `cuDF` (`cudf-cu12`) for maximum throughput via instant `fork()` process spanning.
+  * **Windows Native:** Requires PyTorch with CUDA 12 support (`torch==2.5.1+cu124`). Note that PyTorch `spawn()` on Windows adds a ~10-second initial overhead, so for files <50MB, `tg` intelligently routes to the CPU backend instead.
+
+---
+
+## 🚀 GPU Acceleration Setup (CRITICAL)
+
+To achieve the 3x-10x performance gains over traditional CPU tools, `tensor-grep` utilizes NVIDIA's RAPIDS suite (`cuDF`) on Linux/WSL2, and falls back to an optimized native **PyTorch Tensor** pipeline when running natively on Windows.
+
+### Windows Native GPU Support (No WSL2 Required)
+If you do not want to use WSL2 and want to run `tensor-grep` natively from PowerShell/CMD while still utilizing your GPU, you can use `uv` (the fast Python package manager) to dynamically provision an isolated Python 3.12 environment with CUDA bindings:
+
+```powershell
+# Run using uv to automatically pull PyTorch CUDA 12.4 hooks securely on Windows
+uv run --python 3.12 --extra-index-url https://download.pytorch.org/whl/cu124 --index-strategy unsafe-best-match --with "torch==2.5.1+cu124" tg search "ERROR" /var/logs
+```
+`tensor-grep` will automatically detect Windows + PyTorch and dispatch workloads to the `TorchBackend`.
+
+#### ⚠️ Windows PyTorch Spawn Overhead
+Because Windows Python `multiprocessing` requires `spawn()` rather than Linux's `fork()`, the PyTorch CUDA context takes ~11 seconds to initialize across multiple worker processes on Windows.
+- For small files (< 50MB), `tensor-grep` automatically bypasses the GPU on Windows to avoid this delay, routing to an optimized `CPUBackend` instead.
+- For massive logs (> 200MB), the 11s Windows spawn overhead is absorbed by the sheer throughput of the GPU matrix math.
+
+### Linux / Windows WSL2 (Maximum Enterprise Performance) 🚀
+For absolute maximum performance using raw CUDA C++ string bindings (`cuDF`), **run tensor-grep inside WSL2 or Linux.**
+Because Linux uses `fork()`, process initialization is practically instantaneous, meaning you will actually see sub-`0.02s` speeds across your dual GPUs!
+
+```bash
+# If using a RAPIDS conda environment:
+conda activate rapids-24.04
+
+# Or using uv to pull the linux cuDF wheels directly:
+uv run --python 3.12 --extra-index-url https://pypi.nvidia.com --with "cudf-cu12" python run_benchmarks.py
+```
+
+Once installed, `tensor-grep` will automatically detect `cuDF`, discover your GPUs, and route all regex and string operations directly to your video cards' VRAM using the `CuDFBackend`.
+
+### 3. Install tensor-grep
+```bash
+pip install tensor-grep
+```
+
+Once installed, `tensor-grep` will automatically detect `cuDF`, discover your GPUs, and route all regex and string operations directly to your video cards' VRAM.
+
+## Usage
+
+```bash
+# Standard regex search (GPU Accelerated)
+tg search "Exception.*timeout" /var/logs
+
+# Context lines, case-insensitive, ripgrep parity
+tg search -i -C 2 "database" /var/logs
+
+# AI Semantic Classification
+tg classify /var/logs/syslog.log --format json
+
+# AST Structural Code Search (ast-grep parity via PyTorch GNNs)
+tg run --ast --lang python "if ($A) { return $B; }" ./src
+tg scan -c sgconfig.yml
+tg lsp
+```
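The README describes a routing rule: cuDF on Linux/WSL2, PyTorch on native Windows, and the CPU backend for small files on Windows or when no GPU library is present. The following is a minimal sketch of that rule as described in the text, not the package's actual implementation. The function names `choose_backend` and `detect_and_choose` are hypothetical, the 50MB threshold is read as 50 MiB, and the choice of `TorchBackend` when PyTorch is present without cuDF on a non-Windows system is an assumption the README does not confirm.

```python
import importlib.util
import sys

# Threshold from the README text (files under 50MB skip the GPU on Windows).
# Interpreting "MB" as MiB is an assumption.
SMALL_FILE_BYTES = 50 * 1024 * 1024


def choose_backend(platform: str, file_size: int, has_cudf: bool, has_torch: bool) -> str:
    """Return the backend name that the README's stated rules would select."""
    on_windows = platform.startswith("win")
    if has_cudf and not on_windows:
        return "CuDFBackend"  # Linux / WSL2 path
    if has_torch:
        if on_windows and file_size < SMALL_FILE_BYTES:
            return "CPUBackend"  # avoid the spawn() start-up cost on small files
        return "TorchBackend"
    return "CPUBackend"  # no GPU libraries available


def detect_and_choose(file_size: int) -> str:
    """Probe the current interpreter for cuDF / PyTorch without importing them."""
    has_cudf = importlib.util.find_spec("cudf") is not None
    has_torch = importlib.util.find_spec("torch") is not None
    return choose_backend(sys.platform, file_size, has_cudf, has_torch)


if __name__ == "__main__":
    print(detect_and_choose(10 * 1024 * 1024))
```

Keeping the decision in a pure function separates it from environment probing, so the rule can be checked on a machine with no GPU at all.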