tensor-grep 0.1.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (138)
  1. tensor_grep-0.1.0/.github/workflows/benchmark.yml +13 -0
  2. tensor_grep-0.1.0/.github/workflows/release.yml +102 -0
  3. tensor_grep-0.1.0/.gitignore +31 -0
  4. tensor_grep-0.1.0/.hypothesis/constants/0321b4d9a56ecd5d +4 -0
  5. tensor_grep-0.1.0/.hypothesis/constants/08b4847abc9ccb4c +4 -0
  6. tensor_grep-0.1.0/.hypothesis/constants/09619f2e7b2664f4 +4 -0
  7. tensor_grep-0.1.0/.hypothesis/constants/097541b56a59acb5 +4 -0
  8. tensor_grep-0.1.0/.hypothesis/constants/0aa14fe3f4d66eb8 +4 -0
  9. tensor_grep-0.1.0/.hypothesis/constants/1987ea0e0b593c3b +4 -0
  10. tensor_grep-0.1.0/.hypothesis/constants/244a932d4f46a9e8 +4 -0
  11. tensor_grep-0.1.0/.hypothesis/constants/259b95774007f88d +4 -0
  12. tensor_grep-0.1.0/.hypothesis/constants/26f238250b8499f1 +4 -0
  13. tensor_grep-0.1.0/.hypothesis/constants/2d172376ce08cd3e +4 -0
  14. tensor_grep-0.1.0/.hypothesis/constants/2fec718b7d5b449f +4 -0
  15. tensor_grep-0.1.0/.hypothesis/constants/3881db1d5f997c09 +4 -0
  16. tensor_grep-0.1.0/.hypothesis/constants/468c441a699c5e31 +4 -0
  17. tensor_grep-0.1.0/.hypothesis/constants/54e4b49940b89cd5 +4 -0
  18. tensor_grep-0.1.0/.hypothesis/constants/5a48abcf717d7583 +4 -0
  19. tensor_grep-0.1.0/.hypothesis/constants/6524b0fcb7a55341 +4 -0
  20. tensor_grep-0.1.0/.hypothesis/constants/67abd2ccbf381759 +4 -0
  21. tensor_grep-0.1.0/.hypothesis/constants/6b442457f5745159 +4 -0
  22. tensor_grep-0.1.0/.hypothesis/constants/6dfa700ee8bb8fa5 +4 -0
  23. tensor_grep-0.1.0/.hypothesis/constants/72f590d9256fc7a6 +4 -0
  24. tensor_grep-0.1.0/.hypothesis/constants/740a6fb9febec20c +4 -0
  25. tensor_grep-0.1.0/.hypothesis/constants/74ea32f85ecfd56c +4 -0
  26. tensor_grep-0.1.0/.hypothesis/constants/a3fd68449bfef454 +4 -0
  27. tensor_grep-0.1.0/.hypothesis/constants/ab596ae2844fd274 +4 -0
  28. tensor_grep-0.1.0/.hypothesis/constants/ac1d131da99aea9b +4 -0
  29. tensor_grep-0.1.0/.hypothesis/constants/b0714817c49af4f1 +4 -0
  30. tensor_grep-0.1.0/.hypothesis/constants/b0f490b7b2d6bf29 +4 -0
  31. tensor_grep-0.1.0/.hypothesis/constants/c30bf139df59b92f +4 -0
  32. tensor_grep-0.1.0/.hypothesis/constants/c63b8b09cb09eb60 +4 -0
  33. tensor_grep-0.1.0/.hypothesis/constants/cabbe58a6de53f9b +4 -0
  34. tensor_grep-0.1.0/.hypothesis/constants/cba1b20bea830cc3 +4 -0
  35. tensor_grep-0.1.0/.hypothesis/constants/d7e5661845384844 +4 -0
  36. tensor_grep-0.1.0/.hypothesis/constants/da39a3ee5e6b4b0d +4 -0
  37. tensor_grep-0.1.0/.hypothesis/constants/dff29feff30d7749 +4 -0
  38. tensor_grep-0.1.0/.hypothesis/constants/e2c8c4005a3c9dad +4 -0
  39. tensor_grep-0.1.0/.hypothesis/constants/f06fa362a827014e +4 -0
  40. tensor_grep-0.1.0/.hypothesis/constants/f167f3c3dea5b11a +4 -0
  41. tensor_grep-0.1.0/.hypothesis/constants/f46aa3ceacc645fd +4 -0
  42. tensor_grep-0.1.0/.hypothesis/constants/f5078de4866a65c6 +4 -0
  43. tensor_grep-0.1.0/.hypothesis/constants/f9ba50acb0fd2077 +4 -0
  44. tensor_grep-0.1.0/.hypothesis/patches/2026-02-24--4ebab487.patch +16 -0
  45. tensor_grep-0.1.0/.hypothesis/tmp/tmp30a3q4k9 +0 -0
  46. tensor_grep-0.1.0/.hypothesis/tmp/tmp6up_04ih +0 -0
  47. tensor_grep-0.1.0/.hypothesis/tmp/tmp7hc5c3hh +0 -0
  48. tensor_grep-0.1.0/.hypothesis/tmp/tmpdyqfr9tp +0 -0
  49. tensor_grep-0.1.0/.hypothesis/tmp/tmpjnolnllk +0 -0
  50. tensor_grep-0.1.0/.hypothesis/tmp/tmpmmxg2dc2 +0 -0
  51. tensor_grep-0.1.0/.hypothesis/tmp/tmppojmtll6 +0 -0
  52. tensor_grep-0.1.0/.hypothesis/tmp/tmpzkk8myux +0 -0
  53. tensor_grep-0.1.0/.hypothesis/unicode_data/16.0.0/charmap.json.gz +0 -0
  54. tensor_grep-0.1.0/.hypothesis/unicode_data/16.0.0/codec-utf-8.json.gz +0 -0
  55. tensor_grep-0.1.0/PKG-INFO +32 -0
  56. tensor_grep-0.1.0/README.md +87 -0
  57. tensor_grep-0.1.0/benchmarks/run_ast_benchmarks.py +127 -0
  58. tensor_grep-0.1.0/benchmarks/run_benchmarks.py +209 -0
  59. tensor_grep-0.1.0/docs/architecture.md +19 -0
  60. tensor_grep-0.1.0/docs/benchmarks.md +18 -0
  61. tensor_grep-0.1.0/docs/benchmarks_ast.md +18 -0
  62. tensor_grep-0.1.0/docs/index.md +22 -0
  63. tensor_grep-0.1.0/docs/installation.md +37 -0
  64. tensor_grep-0.1.0/docs/planning/ENTERPRISE_PLAN.md +44 -0
  65. tensor_grep-0.1.0/docs/planning/PROJECT_PLAN.md +1088 -0
  66. tensor_grep-0.1.0/docs/planning/V1_RELEASE_PLAN.md +80 -0
  67. tensor_grep-0.1.0/main.build/.gitignore +1 -0
  68. tensor_grep-0.1.0/mkdocs.yml +46 -0
  69. tensor_grep-0.1.0/npm/install.js +70 -0
  70. tensor_grep-0.1.0/npm/package.json +32 -0
  71. tensor_grep-0.1.0/oimiragieo.tensor-grep.yaml +22 -0
  72. tensor_grep-0.1.0/pyproject.toml +84 -0
  73. tensor_grep-0.1.0/scripts/build_binaries.py +47 -0
  74. tensor_grep-0.1.0/scripts/verify_gpu.py +36 -0
  75. tensor_grep-0.1.0/src/tensor_grep/__init__.py +0 -0
  76. tensor_grep-0.1.0/src/tensor_grep/backends/__init__.py +0 -0
  77. tensor_grep-0.1.0/src/tensor_grep/backends/ast_backend.py +162 -0
  78. tensor_grep-0.1.0/src/tensor_grep/backends/base.py +12 -0
  79. tensor_grep-0.1.0/src/tensor_grep/backends/cpu_backend.py +88 -0
  80. tensor_grep-0.1.0/src/tensor_grep/backends/cudf_backend.py +136 -0
  81. tensor_grep-0.1.0/src/tensor_grep/backends/cybert_backend.py +65 -0
  82. tensor_grep-0.1.0/src/tensor_grep/backends/torch_backend.py +166 -0
  83. tensor_grep-0.1.0/src/tensor_grep/cli/__init__.py +0 -0
  84. tensor_grep-0.1.0/src/tensor_grep/cli/main.py +582 -0
  85. tensor_grep-0.1.0/src/tensor_grep/core/__init__.py +0 -0
  86. tensor_grep-0.1.0/src/tensor_grep/core/config.py +123 -0
  87. tensor_grep-0.1.0/src/tensor_grep/core/pipeline.py +51 -0
  88. tensor_grep-0.1.0/src/tensor_grep/core/query_analyzer.py +23 -0
  89. tensor_grep-0.1.0/src/tensor_grep/core/result.py +19 -0
  90. tensor_grep-0.1.0/src/tensor_grep/formatters/__init__.py +0 -0
  91. tensor_grep-0.1.0/src/tensor_grep/formatters/base.py +7 -0
  92. tensor_grep-0.1.0/src/tensor_grep/formatters/csv_fmt.py +15 -0
  93. tensor_grep-0.1.0/src/tensor_grep/formatters/json_fmt.py +17 -0
  94. tensor_grep-0.1.0/src/tensor_grep/formatters/ripgrep_fmt.py +41 -0
  95. tensor_grep-0.1.0/src/tensor_grep/formatters/table_fmt.py +10 -0
  96. tensor_grep-0.1.0/src/tensor_grep/gpu/__init__.py +0 -0
  97. tensor_grep-0.1.0/src/tensor_grep/gpu/device_detect.py +60 -0
  98. tensor_grep-0.1.0/src/tensor_grep/gpu/memory_manager.py +33 -0
  99. tensor_grep-0.1.0/src/tensor_grep/io/__init__.py +0 -0
  100. tensor_grep-0.1.0/src/tensor_grep/io/base.py +6 -0
  101. tensor_grep-0.1.0/src/tensor_grep/io/directory_scanner.py +84 -0
  102. tensor_grep-0.1.0/src/tensor_grep/io/reader_cudf.py +22 -0
  103. tensor_grep-0.1.0/src/tensor_grep/io/reader_dstorage.py +17 -0
  104. tensor_grep-0.1.0/src/tensor_grep/io/reader_fallback.py +24 -0
  105. tensor_grep-0.1.0/src/tensor_grep/io/reader_kvikio.py +15 -0
  106. tensor_grep-0.1.0/tensor-grep.rb +29 -0
  107. tensor_grep-0.1.0/tests/conftest.py +43 -0
  108. tensor_grep-0.1.0/tests/e2e/snapshots/test_output_snapshots/test_json_output_snapshot/json_output.json +1 -0
  109. tensor_grep-0.1.0/tests/e2e/test_backend_contracts.py +26 -0
  110. tensor_grep-0.1.0/tests/e2e/test_cli_classify.py +19 -0
  111. tensor_grep-0.1.0/tests/e2e/test_cli_no_gpu.py +28 -0
  112. tensor_grep-0.1.0/tests/e2e/test_cli_search.py +26 -0
  113. tensor_grep-0.1.0/tests/e2e/test_io_contracts.py +14 -0
  114. tensor_grep-0.1.0/tests/e2e/test_output_snapshots.py +16 -0
  115. tensor_grep-0.1.0/tests/e2e/test_reader_props.py +46 -0
  116. tensor_grep-0.1.0/tests/e2e/test_ripgrep_parity.py +24 -0
  117. tensor_grep-0.1.0/tests/e2e/test_throughput.py +27 -0
  118. tensor_grep-0.1.0/tests/e2e/test_tokenizer_props.py +15 -0
  119. tensor_grep-0.1.0/tests/e2e/test_vs_ripgrep.py +39 -0
  120. tensor_grep-0.1.0/tests/integration/test_cudf_read_text.py +27 -0
  121. tensor_grep-0.1.0/tests/integration/test_gpu_memory.py +59 -0
  122. tensor_grep-0.1.0/tests/integration/test_pipeline_e2e.py +23 -0
  123. tensor_grep-0.1.0/tests/unit/test_ast_backend.py +39 -0
  124. tensor_grep-0.1.0/tests/unit/test_cpu_backend.py +104 -0
  125. tensor_grep-0.1.0/tests/unit/test_cudf_backend.py +104 -0
  126. tensor_grep-0.1.0/tests/unit/test_cybert_backend.py +79 -0
  127. tensor_grep-0.1.0/tests/unit/test_device_detect.py +75 -0
  128. tensor_grep-0.1.0/tests/unit/test_directory_scanner.py +65 -0
  129. tensor_grep-0.1.0/tests/unit/test_formatters.py +42 -0
  130. tensor_grep-0.1.0/tests/unit/test_memory_manager.py +84 -0
  131. tensor_grep-0.1.0/tests/unit/test_pipeline.py +37 -0
  132. tensor_grep-0.1.0/tests/unit/test_query_analyzer.py +20 -0
  133. tensor_grep-0.1.0/tests/unit/test_reader_cudf.py +21 -0
  134. tensor_grep-0.1.0/tests/unit/test_reader_dstorage.py +39 -0
  135. tensor_grep-0.1.0/tests/unit/test_reader_fallback.py +40 -0
  136. tensor_grep-0.1.0/tests/unit/test_reader_kvikio.py +24 -0
  137. tensor_grep-0.1.0/tests/unit/test_result.py +13 -0
  138. tensor_grep-0.1.0/uv.lock +3530 -0
tensor_grep-0.1.0/.github/workflows/benchmark.yml
@@ -0,0 +1,13 @@
+ name: Benchmarks
+ on:
+   pull_request:
+     paths: ['src/**', 'tests/performance/**']
+ jobs:
+   benchmark:
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4
+       - uses: actions/setup-python@v5
+         with: { python-version: '3.11' }
+       - run: pip install -e ".[dev]"
+       - run: pytest tests/performance/ -v --tb=short
tensor_grep-0.1.0/.github/workflows/release.yml
@@ -0,0 +1,102 @@
+ name: Release
+ on:
+   push:
+     tags:
+       - 'v*'
+
+ jobs:
+   build-binaries:
+     name: Build on ${{ matrix.os }}
+     runs-on: ${{ matrix.os }}
+     strategy:
+       matrix:
+         os: [ubuntu-latest, windows-latest, macos-latest]
+
+     steps:
+       - uses: actions/checkout@v4
+
+       - name: Set up Python
+         uses: actions/setup-python@v5
+         with:
+           python-version: '3.11'
+
+       - name: Install dependencies
+         run: |
+           pip install -e ".[dev]"
+           pip install nuitka
+
+       - name: Build Binary
+         run: python build_binaries.py
+
+       - name: Rename Artifact (Windows)
+         if: runner.os == 'Windows'
+         run: mv tg.exe tg-windows-amd64.exe
+
+       - name: Rename Artifact (Linux)
+         if: runner.os == 'Linux'
+         run: mv tg.bin tg-linux-amd64
+
+       - name: Rename Artifact (macOS)
+         if: runner.os == 'macOS'
+         run: mv tg.bin tg-macos-amd64
+
+       - name: Upload Artifact
+         uses: actions/upload-artifact@v4
+         with:
+           name: binary-${{ runner.os }}
+           path: tg-*
+
+   create-release:
+     needs: build-binaries
+     runs-on: ubuntu-latest
+     permissions:
+       contents: write
+     steps:
+       - uses: actions/checkout@v4
+
+       - name: Download Artifacts
+         uses: actions/download-artifact@v4
+         with:
+           path: artifacts
+
+       - name: Create GitHub Release
+         uses: softprops/action-gh-release@v2
+         with:
+           files: artifacts/**/tg-*
+           generate_release_notes: true
+
+   publish-npm:
+     needs: create-release
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4
+
+       - name: Setup Node.js
+         uses: actions/setup-node@v4
+         with:
+           node-version: '20'
+           registry-url: 'https://registry.npmjs.org'
+
+       - name: Publish NPM Package
+         working-directory: npm
+         run: npm publish --access public
+         env:
+           NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
+
+   publish-docs:
+     runs-on: ubuntu-latest
+     permissions:
+       contents: write
+     steps:
+       - uses: actions/checkout@v4
+
+       - name: Set up Python
+         uses: actions/setup-python@v5
+         with:
+           python-version: '3.11'
+
+       - name: Install mkdocs
+         run: pip install mkdocs-material
+
+       - name: Deploy Docs
+         run: mkdocs gh-deploy --force
tensor_grep-0.1.0/.gitignore
@@ -0,0 +1,31 @@
+ # Python
+ __pycache__/
+ *.py[cod]
+ *$py.class
+
+ # Environments
+ .env
+ .venv
+ env/
+ venv/
+ ENV/
+
+ # Distribution / packaging
+ dist/
+ build/
+ *.egg-info/
+
+ # Pytest / Coverage
+ .pytest_cache/
+ .coverage
+ htmlcov/
+
+ # Mypy
+ .mypy_cache/
+
+ # Ruff
+ .ruff_cache/
+
+ # IDEs
+ .vscode/
+ .idea/
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\formatters\json_fmt.py
+ # hypothesis_version: 6.151.9
+
+ ['file', 'line_number', 'matches', 'text', 'total_files', 'total_matches']
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\tensor_grep\core\pipeline.py
+ # hypothesis_version: 6.151.9
+
+ []
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\backends\cpu_backend.py
+ # hypothesis_version: 6.151.9
+
+ ['latin-1', 'r', 'utf-8']
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\tensor_grep\io\reader_fallback.py
+ # hypothesis_version: 6.151.9
+
+ ['.gz', 'latin-1', 'rt', 'utf-8']
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\gpu\device_detect.py
+ # hypothesis_version: 6.151.9
+
+ [1024, '/run/WSL', 'linux', 'win32']
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\core\pipeline.py
+ # hypothesis_version: 6.151.9
+
+ []
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\io\reader_dstorage.py
+ # hypothesis_version: 6.151.9
+
+ ['dstorage_gpu.Tensor', 'win32']
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\formatters\table_fmt.py
+ # hypothesis_version: 6.151.9
+
+ ['File\tLine\tMatch']
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\gpu\device_detect.py
+ # hypothesis_version: 6.151.9
+
+ [1024, '/run/WSL', 'linux', 'win32']
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\backends\cybert_backend.py
+ # hypothesis_version: 6.151.9
+
+ [0.1, 0.8, 0.9, 'INT64', 'bert-base-uncased', 'confidence', 'cybert', 'error', 'info', 'input_ids', 'label', 'localhost:8000', 'logits', 'np', 'warn']
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\backends\cybert_backend.py
+ # hypothesis_version: 6.151.9
+
+ [0.1, 0.8, 0.9, 'INT64', 'bert-base-uncased', 'confidence', 'cybert', 'error', 'info', 'input_ids', 'label', 'localhost:8000', 'logits', 'np', 'warn']
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\gpu\memory_manager.py
+ # hypothesis_version: 6.151.9
+
+ [0.8]
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\io\reader_cudf.py
+ # hypothesis_version: 6.151.9
+
+ ['cudf.Series']
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\gpu\memory_manager.py
+ # hypothesis_version: 6.151.9
+
+ [0.8]
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\tensor_grep\backends\cpu_backend.py
+ # hypothesis_version: 6.151.9
+
+ ['latin-1', 'r', 'utf-8']
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\io\reader_kvikio.py
+ # hypothesis_version: 6.151.9
+
+ ['r']
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\core\query_analyzer.py
+ # hypothesis_version: 6.151.9
+
+ ['anomaly', 'classify', 'detect', 'extract entities']
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\backends\base.py
+ # hypothesis_version: 6.151.9
+
+ []
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\io\reader_fallback.py
+ # hypothesis_version: 6.151.9
+
+ ['.gz', 'latin-1', 'rt', 'utf-8']
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\core\result.py
+ # hypothesis_version: 6.151.9
+
+ []
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\tensor_grep\formatters\json_fmt.py
+ # hypothesis_version: 6.151.9
+
+ ['file', 'line_number', 'matches', 'text', 'total_files', 'total_matches']
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\tensor_grep\backends\base.py
+ # hypothesis_version: 6.151.9
+
+ []
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\formatters\base.py
+ # hypothesis_version: 6.151.9
+
+ []
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\io\base.py
+ # hypothesis_version: 6.151.9
+
+ []
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\tensor_grep\io\reader_cudf.py
+ # hypothesis_version: 6.151.9
+
+ ['cudf.Series']
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\io\reader_fallback.py
+ # hypothesis_version: 6.151.9
+
+ ['.gz', 'latin-1', 'rt', 'utf-8']
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\io\reader_kvikio.py
+ # hypothesis_version: 6.151.9
+
+ ['r']
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\tensor_grep\formatters\base.py
+ # hypothesis_version: 6.151.9
+
+ []
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\io\reader_cudf.py
+ # hypothesis_version: 6.151.9
+
+ []
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\backends\cudf_backend.py
+ # hypothesis_version: 6.151.9
+
+ [512, 1024]
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\tensor_grep\formatters\csv_fmt.py
+ # hypothesis_version: 6.151.9
+
+ ['file', 'line_number', 'text']
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\formatters\csv_fmt.py
+ # hypothesis_version: 6.151.9
+
+ ['file', 'line_number', 'text']
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\__init__.py
+ # hypothesis_version: 6.151.9
+
+ []
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\tensor_grep\gpu\memory_manager.py
+ # hypothesis_version: 6.151.9
+
+ [0.8]
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\io\reader_dstorage.py
+ # hypothesis_version: 6.151.9
+
+ ['win32']
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\tensor_grep\formatters\ripgrep_fmt.py
+ # hypothesis_version: 6.151.9
+
+ []
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\formatters\ripgrep_fmt.py
+ # hypothesis_version: 6.151.9
+
+ []
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\tensor_grep\formatters\table_fmt.py
+ # hypothesis_version: 6.151.9
+
+ ['File\tLine\tMatch']
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\tensor_grep\backends\cudf_backend.py
+ # hypothesis_version: 6.151.9
+
+ [512, 1024]
@@ -0,0 +1,4 @@
+ # file: C:\dev\projects\cudf-grep\src\cudf_grep\io\reader_cudf.py
+ # hypothesis_version: 6.151.9
+
+ ['cudf.Series']
tensor_grep-0.1.0/.hypothesis/patches/2026-02-24--4ebab487.patch
@@ -0,0 +1,16 @@
+ From HEAD Mon Sep 17 00:00:00 2001
+ From: Hypothesis 6.151.9 <no-reply@hypothesis.works>
+ Date: Wed, 25 Feb 2026 04:56:08
+ Subject: [PATCH] Hypothesis: add explicit examples
+
+ ---
+ --- ./tests\property\test_tokenizer_props.py
+ +++ ./tests\property\test_tokenizer_props.py
+ @@ -4,6 +4,7 @@
+  pytestmark = pytest.mark.property
+
+  @given(st.text(min_size=1, max_size=10000, alphabet=st.characters(blacklist_categories=("Cs",))))
+ +@example(text="0").via("discovered failure")
+  def test_tokenizer_never_crashes_on_valid_text(text):
+      from cudf_grep.backends.cybert_backend import tokenize
+      tokens = tokenize([text])
tensor_grep-0.1.0/PKG-INFO
@@ -0,0 +1,32 @@
+ Metadata-Version: 2.4
+ Name: tensor-grep
+ Version: 0.1.0
+ Requires-Python: >=3.11
+ Requires-Dist: rich>=13.0
+ Requires-Dist: typer[all]>=0.12
+ Provides-Extra: ast
+ Requires-Dist: tree-sitter-javascript; extra == 'ast'
+ Requires-Dist: tree-sitter-python; extra == 'ast'
+ Requires-Dist: tree-sitter>=0.22; extra == 'ast'
+ Provides-Extra: dev
+ Requires-Dist: hypothesis>=6.100; extra == 'dev'
+ Requires-Dist: mutmut>=3.0; extra == 'dev'
+ Requires-Dist: mypy>=1.11; extra == 'dev'
+ Requires-Dist: pytest-asyncio>=0.24; extra == 'dev'
+ Requires-Dist: pytest-cov>=5.0; extra == 'dev'
+ Requires-Dist: pytest-mock>=3.14; extra == 'dev'
+ Requires-Dist: pytest-snapshot>=0.9; extra == 'dev'
+ Requires-Dist: pytest>=8.0; extra == 'dev'
+ Requires-Dist: ruff>=0.6; extra == 'dev'
+ Provides-Extra: gpu
+ Requires-Dist: cudf-cu12; extra == 'gpu'
+ Requires-Dist: kvikio-cu12; extra == 'gpu'
+ Requires-Dist: torch-geometric>=2.5.0; extra == 'gpu'
+ Requires-Dist: torch>=2.0; extra == 'gpu'
+ Provides-Extra: gpu-win
+ Requires-Dist: dstorage-gpu>=1.0; extra == 'gpu-win'
+ Requires-Dist: torch-geometric>=2.5.0; extra == 'gpu-win'
+ Requires-Dist: torch>=2.0; extra == 'gpu-win'
+ Provides-Extra: nlp
+ Requires-Dist: transformers>=4.40; extra == 'nlp'
+ Requires-Dist: tritonclient[all]; extra == 'nlp'
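The extras above gate the optional GPU and NLP backends. As an illustration only (this is not the package's actual code), runtime backend selection driven by which extras are importable might look like the following sketch; the function name and priority order are assumptions:

```python
import importlib.util

def pick_backend() -> str:
    """Choose a search backend based on which optional extras are importable."""
    if importlib.util.find_spec("cudf") is not None:
        return "cudf"    # [gpu] extra on Linux/WSL2 (cudf-cu12)
    if importlib.util.find_spec("torch") is not None:
        return "torch"   # [gpu-win] extra: Windows-native CUDA path
    return "cpu"         # no extras installed: pure-Python fallback

print(pick_backend())
```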
tensor_grep-0.1.0/README.md
@@ -0,0 +1,87 @@
+ # tensor-grep (tg)
+
+ **The GPU-Accelerated Semantic Log Parsing CLI**
+
+ `tensor-grep` combines the raw regex speed of traditional tools like `ripgrep` with the semantic understanding of Transformer AI networks (`cyBERT`), parallelized across multiple GPUs using NVIDIA RAPIDS `cuDF`.
+
+ ## Features
+ * **Drop-in Replacement:** Supports 70+ `ripgrep` CLI flags (e.g., `-i`, `-v`, `-C`, `-g`, `-t`).
+ * **AST-Grep Parity (NEW):** Structural code searching via PyTorch Geometric Graph Neural Networks (GNNs). Run `tg run`, `tg scan`, `tg lsp` natively on your GPU!
+ * **Multi-GPU Scaling:** Automatically detects and shards massive log files across dual, quad, or enterprise GPU arrays.
+ * **Semantic NLP Classification:** Utilize cyBERT to classify logs contextually (e.g. identify "ERROR" severity without explicit regexes) in a single pass.
+ * **CPU Fallback Resiliency:** Works gracefully on Windows, macOS, and CPU-only systems using a resilient Python Regex backend.
+
+ ---
+
+ ## 💻 Hardware & Software Requirements
+
+ `tensor-grep` runs on any machine with Python 3.11+ using its highly optimized CPU fallback. However, to unlock its 3x-10x GPU-accelerated speeds, your system must meet these requirements:
+
+ * **Hardware:**
+   * NVIDIA GPU (GTX 10-Series or newer, RTX 30/40/50 series recommended)
+   * Minimum 4GB VRAM (8GB+ recommended for massive logs)
+ * **Software / Drivers:**
+   * **NVIDIA Display Drivers:** v535.xx or newer
+   * **CUDA Toolkit:** 12.0 or newer (CUDA 12.4 highly recommended)
+ * **Python Environments:**
+   * **Linux / WSL2:** Requires NVIDIA RAPIDS `cuDF` (`cudf-cu12`) for maximum throughput via near-instant `fork()` process spawning.
+   * **Windows Native:** Requires PyTorch with CUDA 12 support (`torch==2.5.1+cu124`). Note that PyTorch `spawn()` on Windows adds a ~10-second initial overhead, so for files <50MB, `tg` intelligently routes to the CPU backend instead.
+
+ ---
+
+ ## 🚀 GPU Acceleration Setup (CRITICAL)
+
+ To achieve the 3x-10x performance gains over traditional CPU tools, `tensor-grep` utilizes NVIDIA's RAPIDS suite (`cuDF`) on Linux/WSL2, and falls back to an optimized native **PyTorch Tensor** pipeline when running natively on Windows.
+
+ ### Windows Native GPU Support (No WSL2 Required)
+ If you do not want to use WSL2 and would rather run `tensor-grep` natively from PowerShell/CMD while still utilizing your GPU, you can use `uv` (the fast Python package manager) to dynamically provision an isolated Python 3.12 environment with CUDA bindings:
+
+ ```powershell
+ # Run using uv to automatically pull PyTorch CUDA 12.4 hooks securely on Windows
+ uv run --python 3.12 --extra-index-url https://download.pytorch.org/whl/cu124 --index-strategy unsafe-best-match --with "torch==2.5.1+cu124" tg search "ERROR" /var/logs
+ ```
+ `tensor-grep` will automatically detect Windows + PyTorch and dispatch workloads to the `TorchBackend`.
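The tensor pipeline this dispatch targets boils down to vectorized comparisons over byte arrays. The sketch below illustrates the core windowed-comparison idea using NumPy on the CPU; the real backend works on PyTorch CUDA tensors and supports full regex, whereas `vector_find` is a hypothetical name that handles only fixed strings:

```python
import numpy as np

def vector_find(haystack: bytes, needle: bytes) -> list[int]:
    """Return all match offsets of a fixed byte pattern, vectorized."""
    h = np.frombuffer(haystack, dtype=np.uint8)
    n = np.frombuffer(needle, dtype=np.uint8)
    if n.size == 0 or n.size > h.size:
        return []
    # View every length-m window of the haystack as a row, then compare
    # all windows against the needle in a single vectorized operation.
    windows = np.lib.stride_tricks.sliding_window_view(h, n.size)
    hits = (windows == n).all(axis=1)
    return np.flatnonzero(hits).tolist()
```

On a GPU the same pattern runs as one batched tensor comparison, which is why throughput grows with input size instead of pattern complexity.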
+
+ #### ⚠️ Windows PyTorch Spawn Overhead
+ Because Windows Python `multiprocessing` requires `spawn()` rather than Linux's `fork()`, the PyTorch CUDA context takes ~11 seconds to initialize across multiple worker processes on Windows.
+ - For small files (< 50MB), `tensor-grep` automatically bypasses the GPU on Windows to avoid this delay, routing to an optimized `CPUBackend` instead.
+ - For massive logs (> 200MB), the 11s Windows spawn overhead is absorbed by the sheer throughput of the GPU matrix math.
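The two bullet points above amount to a simple size-based dispatch. A minimal sketch under the stated thresholds (the `route_windows` name and the exact cutoff handling are illustrative, not the package's API):

```python
# README threshold: below ~50MB the ~11s spawn()/CUDA init cost dominates.
SMALL_FILE_BYTES = 50 * 1024 * 1024

def route_windows(file_size: int, gpu_available: bool) -> str:
    """Pick a backend on native Windows: skip the GPU for small inputs."""
    if not gpu_available or file_size < SMALL_FILE_BYTES:
        return "cpu"     # no GPU, or too small to amortize spawn overhead
    return "torch"       # large logs absorb the startup cost
```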
+
+ ### Linux / Windows WSL2 (Maximum Enterprise Performance) 🚀
+ For absolute maximum performance using raw CUDA C++ string bindings (`cuDF`), **run tensor-grep inside WSL2 or Linux.**
+ Because Linux uses `fork()`, process initialization is practically instantaneous, meaning you will actually see sub-`0.02s` speeds across your dual GPUs!
+
+ ```bash
+ # If using a RAPIDS conda environment:
+ conda activate rapids-24.04
+
+ # Or using uv to pull the linux cuDF wheels directly:
+ uv run --python 3.12 --extra-index-url https://pypi.nvidia.com --with "cudf-cu12" python run_benchmarks.py
+ ```
+
+ Once installed, `tensor-grep` will automatically detect `cuDF`, discover your GPUs, and route all regex and string operations directly to your video cards' VRAM using the `CuDFBackend`.
+
+ ### 3. Install tensor-grep
+ ```bash
+ pip install tensor-grep
+ ```
+
+ Once installed, `tensor-grep` will automatically detect `cuDF`, discover your GPUs, and route all regex and string operations directly to your video cards' VRAM.
+
+ ## Usage
+
+ ```bash
+ # Standard regex search (GPU Accelerated)
+ tg search "Exception.*timeout" /var/logs
+
+ # Context lines, case-insensitive, ripgrep parity
+ tg search -i -C 2 "database" /var/logs
+
+ # AI Semantic Classification
+ tg classify /var/logs/syslog.log --format json
+
+ # AST Structural Code Search (ast-grep parity via PyTorch GNNs)
+ tg run --ast --lang python "if ($A) { return $B; }" ./src
+ tg scan -c sgconfig.yml
+ tg lsp
+ ```