semantic-chunker-langchain 0.1.2__py3-none-any.whl → 0.1.4__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,6 +1,6 @@
  Metadata-Version: 2.3
  Name: semantic-chunker-langchain
- Version: 0.1.2
+ Version: 0.1.4
  Summary: Token-aware, LangChain-compatible semantic chunker with PDF and layout support
  License: MIT
  Author: Prajwal Shivaji Mandale
@@ -22,7 +22,8 @@ Description-Content-Type: text/markdown

  # Semantic Chunker for LangChain

- A **token-aware**, **LangChain-compatible** chunker that splits text (from PDF, markdown, or plain text) into semantically coherent chunks while respecting model token limits.
+ Hitting token limits when passing large contexts to an LLM with a limited context window? Not anymore: this chunker solves that problem.
+ It is a **token-aware**, **LangChain-compatible** chunker that splits text (from PDF, markdown, or plain text) into semantically coherent chunks while respecting model token limits.

  ---

@@ -42,7 +43,7 @@ A **token-aware**, **LangChain-compatible** chunker that splits text (from PDF,

  ---

- ## 📦 Installation
+ ## 📆 Installation

  ```bash
  pip install semantic-chunker-langchain
@@ -62,6 +63,7 @@ semantic-chunker sample.pdf --txt chunks.txt --json chunks.json

  ### 🔸 From Code

+ ```python
  from semantic_chunker_langchain.chunker import SemanticChunker, SimpleSemanticChunker
  from semantic_chunker_langchain.extractors.pdf import extract_pdf
  from semantic_chunker_langchain.outputs.formatter import write_to_txt
@@ -79,7 +81,7 @@ write_to_txt(chunks, "output.txt")
  # Using SimpleSemanticChunker
  simple_chunker = SimpleSemanticChunker(model_name="gpt-3.5-turbo")
  simple_chunks = simple_chunker.split_documents(docs)
-
+ ```

  ### 🔸 Convert to Retriever

@@ -90,7 +92,7 @@ retriever = chunker.to_retriever(chunks, embedding=OpenAIEmbeddings())

  ---

- ## 🧪 Testing
+ ## 📊 Testing

  ```bash
  poetry run pytest tests/
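
The README hunks above wrap the usage snippet in a fenced Python block in 0.1.4. For reference, a minimal end-to-end sketch assembled only from the calls visible in this diff (extract_pdf, SemanticChunker, SimpleSemanticChunker, write_to_txt, to_retriever); constructor arguments, return types, and the OpenAIEmbeddings import path not shown in the diff are assumptions:

```python
# Sketch assembled from the snippets visible in the 0.1.4 README diff above.
# Anything not shown in the diff (constructor args, return types) is an assumption.
from semantic_chunker_langchain.chunker import SemanticChunker, SimpleSemanticChunker
from semantic_chunker_langchain.extractors.pdf import extract_pdf
from semantic_chunker_langchain.outputs.formatter import write_to_txt
from langchain_openai import OpenAIEmbeddings  # assumed import path for OpenAIEmbeddings

docs = extract_pdf("sample.pdf")                       # load a PDF into LangChain documents (assumed return type)
chunker = SemanticChunker(model_name="gpt-3.5-turbo")  # assumed kwarg, mirroring SimpleSemanticChunker
chunks = chunker.split_documents(docs)                 # token-aware semantic chunking
write_to_txt(chunks, "output.txt")                     # persist chunks as plain text

# Alternative splitter and retriever conversion, as shown in the diff context lines
simple_chunks = SimpleSemanticChunker(model_name="gpt-3.5-turbo").split_documents(docs)
retriever = chunker.to_retriever(chunks, embedding=OpenAIEmbeddings())
```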
@@ -3,8 +3,8 @@ semantic_chunker_langchain/chunker.py,sha256=6WdJMu1cI9hDPAM9ISJWpAhenrsOMNkiW5r
  semantic_chunker_langchain/extractors/pdf.py,sha256=8jRWBCMeIK3M_WgOyDqxxadEHQw678CzD5ryAJ0tvAA,356
  semantic_chunker_langchain/outputs/formatter.py,sha256=tYShwikgwIleV6Nz1ohmtGX6nQRVnY41NOkOT6v43Qk,964
  semantic_chunker_langchain/utils.py,sha256=E0Ajj2IBa6EFJJkGYZ8pyWUEKEAjiL9_Uof8KPnM8ew,288
- semantic_chunker_langchain-0.1.2.dist-info/entry_points.txt,sha256=Kve0GJQ5uzNSMBidDihM9sFuoUY90OeP5THfJWQLDVQ,45
- semantic_chunker_langchain-0.1.2.dist-info/LICENSE,sha256=vfqlCGc0OOjpze243uuSsBAAq1OFEoCLbmElHpljFWM,1111
- semantic_chunker_langchain-0.1.2.dist-info/METADATA,sha256=ozfUlZC1kMoImi9tA9xBZQkQ8TYZ52aHCHSDVuzIFlU,2906
- semantic_chunker_langchain-0.1.2.dist-info/WHEEL,sha256=b4K_helf-jlQoXBBETfwnf4B04YC67LOev0jo4fX5m8,88
- semantic_chunker_langchain-0.1.2.dist-info/RECORD,,
+ semantic_chunker_langchain-0.1.4.dist-info/entry_points.txt,sha256=Kve0GJQ5uzNSMBidDihM9sFuoUY90OeP5THfJWQLDVQ,45
+ semantic_chunker_langchain-0.1.4.dist-info/LICENSE,sha256=vfqlCGc0OOjpze243uuSsBAAq1OFEoCLbmElHpljFWM,1111
+ semantic_chunker_langchain-0.1.4.dist-info/METADATA,sha256=6caibf7aBTjCZIrmxODnvI7c9H4erc199bF89D8MUPA,3062
+ semantic_chunker_langchain-0.1.4.dist-info/WHEEL,sha256=b4K_helf-jlQoXBBETfwnf4B04YC67LOev0jo4fX5m8,88
+ semantic_chunker_langchain-0.1.4.dist-info/RECORD,,
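
To confirm which of the two wheels is installed in a given environment, a standard-library check is enough:

```python
# Verify the installed wheel version (expects 0.1.4 after upgrading).
from importlib.metadata import version

print(version("semantic-chunker-langchain"))
```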