text-summarizer-aweebtaku 1.2.6__tar.gz → 1.2.7__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (21)
  1. {text_summarizer_aweebtaku-1.2.6 → text_summarizer_aweebtaku-1.2.7}/LICENSE +20 -20
  2. {text_summarizer_aweebtaku-1.2.6 → text_summarizer_aweebtaku-1.2.7}/MANIFEST.in +6 -6
  3. {text_summarizer_aweebtaku-1.2.6/text_summarizer_aweebtaku.egg-info → text_summarizer_aweebtaku-1.2.7}/PKG-INFO +217 -206
  4. {text_summarizer_aweebtaku-1.2.6 → text_summarizer_aweebtaku-1.2.7}/README.md +179 -179
  5. {text_summarizer_aweebtaku-1.2.6 → text_summarizer_aweebtaku-1.2.7}/create_shortcuts.bat +15 -15
  6. text_summarizer_aweebtaku-1.2.7/requirements.txt +6 -0
  7. {text_summarizer_aweebtaku-1.2.6 → text_summarizer_aweebtaku-1.2.7}/setup.cfg +4 -4
  8. {text_summarizer_aweebtaku-1.2.6 → text_summarizer_aweebtaku-1.2.7}/setup.py +46 -46
  9. {text_summarizer_aweebtaku-1.2.6 → text_summarizer_aweebtaku-1.2.7}/text_summarizer/__init__.py +3 -3
  10. {text_summarizer_aweebtaku-1.2.6 → text_summarizer_aweebtaku-1.2.7}/text_summarizer/cli.py +96 -96
  11. {text_summarizer_aweebtaku-1.2.6 → text_summarizer_aweebtaku-1.2.7}/text_summarizer/create_shortcuts.py +63 -63
  12. text_summarizer_aweebtaku-1.2.7/text_summarizer/data/tennis.csv +9 -0
  13. {text_summarizer_aweebtaku-1.2.6 → text_summarizer_aweebtaku-1.2.7}/text_summarizer/summarizer.py +322 -322
  14. {text_summarizer_aweebtaku-1.2.6 → text_summarizer_aweebtaku-1.2.7}/text_summarizer/ui.py +379 -379
  15. {text_summarizer_aweebtaku-1.2.6 → text_summarizer_aweebtaku-1.2.7/text_summarizer_aweebtaku.egg-info}/PKG-INFO +217 -206
  16. {text_summarizer_aweebtaku-1.2.6 → text_summarizer_aweebtaku-1.2.7}/text_summarizer_aweebtaku.egg-info/SOURCES.txt +2 -0
  17. {text_summarizer_aweebtaku-1.2.6 → text_summarizer_aweebtaku-1.2.7}/text_summarizer/data/__init__.py +0 -0
  18. {text_summarizer_aweebtaku-1.2.6 → text_summarizer_aweebtaku-1.2.7}/text_summarizer_aweebtaku.egg-info/dependency_links.txt +0 -0
  19. {text_summarizer_aweebtaku-1.2.6 → text_summarizer_aweebtaku-1.2.7}/text_summarizer_aweebtaku.egg-info/entry_points.txt +0 -0
  20. {text_summarizer_aweebtaku-1.2.6 → text_summarizer_aweebtaku-1.2.7}/text_summarizer_aweebtaku.egg-info/requires.txt +0 -0
  21. {text_summarizer_aweebtaku-1.2.6 → text_summarizer_aweebtaku-1.2.7}/text_summarizer_aweebtaku.egg-info/top_level.txt +0 -0
@@ -1,21 +1,21 @@
- MIT License
-
- Copyright (c) 2026 Aditya Chaurasiya
-
- Permission is hereby granted, free of charge, to any person obtaining a copy
- of this software and associated documentation files (the "Software"), to deal
- in the Software without restriction, including without limitation the rights
- to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
- copies of the Software, and to permit persons to whom the Software is
- furnished to do so, subject to the following conditions:
-
- The above copyright notice and this permission notice shall be included in all
- copies or substantial portions of the Software.
-
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
- OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ MIT License
+
+ Copyright (c) 2026 Aditya Chaurasiya
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
  SOFTWARE.
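Every changed line in this LICENSE hunk is removed and re-added with the same visible text. That pattern usually points to a whitespace or line-ending change (for example CRLF to LF) rather than an edit to the wording, but the rendered diff does not show the raw bytes, so this reading is an assumption. A minimal sketch of how one might check it locally follows; the helper names and the sample byte strings are hypothetical and are not taken from either release.

```python
def normalize_newlines(data: bytes) -> bytes:
    """Convert CRLF and lone CR line endings to LF."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def differs_only_in_line_endings(old: bytes, new: bytes) -> bool:
    """True when the inputs differ, but only in their line endings."""
    return old != new and normalize_newlines(old) == normalize_newlines(new)


# Illustrative inputs only, not the real file bytes from 1.2.6 or 1.2.7.
old_license = b"MIT License\r\n\r\nCopyright (c) 2026 Aditya Chaurasiya\r\n"
new_license = b"MIT License\n\nCopyright (c) 2026 Aditya Chaurasiya\n"
print(differs_only_in_line_endings(old_license, new_license))
```

To apply this to the real archives, one would read the two LICENSE files in binary mode (`open(path, "rb").read()`) and pass the results to the same function.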
@@ -1,7 +1,7 @@
- include README.md
- include LICENSE
- include requirements.txt
- include create_shortcuts.bat
- recursive-include text_summarizer/data *.csv
- global-exclude *.pyc
+ include README.md
+ include LICENSE
+ include requirements.txt
+ include create_shortcuts.bat
+ recursive-include text_summarizer/data *.csv
+ global-exclude *.pyc
  global-exclude __pycache__
@@ -1,206 +1,217 @@
- Metadata-Version: 2.1
- Name: text-summarizer-aweebtaku
- Version: 1.2.6
- Summary: A text summarization tool using GloVe embeddings and PageRank algorithm
- Home-page: https://github.com/AWeebTaku/Summarizer
- Author: Aditya Chaurasiya
- Author-email: adityachaurasiya57527@gmail.com
- License: MIT
- Classifier: Development Status :: 4 - Beta
- Classifier: Intended Audience :: Developers
- Classifier: Operating System :: OS Independent
- Classifier: Programming Language :: Python :: 3
- Classifier: Programming Language :: Python :: 3.8
- Classifier: Programming Language :: Python :: 3.9
- Classifier: Programming Language :: Python :: 3.10
- Classifier: Programming Language :: Python :: 3.11
- Requires-Python: >=3.8
- Description-Content-Type: text/markdown
- License-File: LICENSE
- Requires-Dist: pandas>=1.3.0
- Requires-Dist: numpy>=1.21.0
- Requires-Dist: nltk>=3.6
- Requires-Dist: scikit-learn>=1.0
- Requires-Dist: networkx>=2.6
- Requires-Dist: requests>=2.25.0
-
- # Text Summarizer
-
- A Python-based text summarization tool that uses GloVe word embeddings and PageRank algorithm to generate extractive summaries of documents.
-
- ## Features
-
- - **Extractive Summarization**: Uses sentence similarity and PageRank to identify the most important sentences
- - **GloVe Embeddings**: Leverages pre-trained GloVe word vectors for semantic similarity calculation
- - **Multiple Input Methods**: Support for single documents, CSV files, or interactive creation
- - **GUI Interface**: User-friendly Tkinter-based graphical interface
- - **Command Line Interface**: Scriptable command-line tool for automation
- - **Batch Processing**: Process multiple documents at once
-
- ## Installation
-
- ### Prerequisites
-
- - Python 3.8 or higher
- - Required packages (automatically installed): pandas, numpy, nltk, scikit-learn, networkx
-
- ### Install from PyPI
-
- ```bash
- pip install text-summarizer-aweebtaku
- ```
-
- ### Install from Source
-
- 1. Clone the repository:
- ```bash
- git clone https://github.com/AWeebTaku/Summarizer.git
- cd Summarizer
- ```
-
- 2. Install the package:
- ```bash
- pip install -e .
- ```
-
- ### Upgrade Package
-
- To upgrade to the latest version with new features:
- ```bash
- pip install --upgrade text-summarizer-aweebtaku
- ```
-
- ### Create Desktop Shortcuts (Windows)
-
- After installation, create desktop shortcuts for easy access:
-
- **Option 1: Automatic (Recommended)**
- ```bash
- text-summarizer-shortcuts
- ```
- This will create desktop shortcuts for both GUI and CLI versions.
-
- **Option 2: Manual**
- Run the included batch file:
- ```cmd
- create_shortcuts.bat
- ```
-
- ### Download GloVe Embeddings
-
- **No manual download required!** The package will automatically download GloVe embeddings (100d, ~400MB) on first use and cache them in your home directory (`~/.text_summarizer/`).
-
- If you prefer to use your own GloVe file, you can specify the path:
- ```python
- summarizer = TextSummarizer(glove_path='path/to/your/glove.6B.100d.txt')
- ```
-
- ## Usage
-
- ### Console Scripts
-
- After installation, you can use these commands from anywhere:
-
- ```bash
- # Upgrade to the latest version
- pip install --upgrade text-summarizer-aweebtaku
-
- # Launch the graphical user interface
- text-summarizer-gui
-
- # Use the command line interface
- text-summarizer-aweebtaku --help
-
- # Create desktop shortcuts (Windows only)
- text-summarizer-shortcuts
- ```
-
- ### Command Line Interface
-
- ```bash
- # Summarize a CSV file
- text-summarizer-aweebtaku --csv-file data/tennis.csv --article-id 1
-
- # Interactive mode
- text-summarizer-aweebtaku
- ```
-
- ### Graphical User Interface
-
- ```bash
- # Launch GUI (easiest way)
- text-summarizer-aweebtaku --gui
-
- # Or use the dedicated GUI command
- text-summarizer-gui
- ```
-
- ### Python API
-
- ```python
- from text_summarizer import TextSummarizer
-
- # Initialize summarizer (automatic GloVe download)
- summarizer = TextSummarizer(num_sentences=3)
-
- # Simple text summarization
- text = "Your long text here..."
- summary = summarizer.summarize_text(text)
- print(summary)
-
- # Advanced usage with DataFrame
- import pandas as pd
- df = pd.DataFrame([{'article_id': 1, 'article_text': text}])
- scored_sentences = summarizer.run_summarization(df)
- article_text, summary = summarizer.summarize_article(scored_sentences, 1, df)
- ```
-
- ## Data Format
-
- Input data should be in CSV format with columns:
- - `article_id`: Unique identifier for each document
- - `article_text`: The full text of the document
-
- Example:
- ```csv
- article_id,article_text
- 1,"This is the first article. It contains multiple sentences..."
- 2,"This is the second article. It also has several sentences..."
- ```
-
- ## Algorithm
-
- The summarization process follows these steps:
-
- 1. **Sentence Tokenization**: Split documents into individual sentences
- 2. **Text Cleaning**: Remove punctuation, convert to lowercase, remove stopwords
- 3. **Sentence Vectorization**: Convert sentences to vectors using GloVe embeddings
- 4. **Similarity Calculation**: Compute cosine similarity between all sentence pairs
- 5. **PageRank Scoring**: Apply PageRank algorithm to identify important sentences
- 6. **Summary Extraction**: Select top-ranked sentences in original order
-
- ## Configuration
-
- - `glove_path`: Path to GloVe embeddings file (default: 'glove.6B.100d.txt/glove.6B.100d.txt')
- - `num_sentences`: Number of sentences in summary (default: 5)
-
- ## License
-
- This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
-
- ## Contributing
-
- Contributions are welcome! Please feel free to submit a Pull Request.
-
- ## Citation
-
- If you use this tool in your research, please cite:
-
- ```
- @software{text_summarizer,
- title = {Text Summarizer},
- author = {Aditya Chaurasiya},
- url = {https://github.com/AWeebTaku/Summarizer},
- year = {2026}
- }
- ```
+ Metadata-Version: 2.4
+ Name: text-summarizer-aweebtaku
+ Version: 1.2.7
+ Summary: A text summarization tool using GloVe embeddings and PageRank algorithm
+ Home-page: https://github.com/AWeebTaku/Summarizer
+ Author: Aditya Chaurasiya
+ Author-email: adityachaurasiya57527@gmail.com
+ License: MIT
+ Classifier: Development Status :: 4 - Beta
+ Classifier: Intended Audience :: Developers
+ Classifier: Operating System :: OS Independent
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.8
+ Classifier: Programming Language :: Python :: 3.9
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Requires-Python: >=3.8
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+ Requires-Dist: pandas>=1.3.0
+ Requires-Dist: numpy>=1.21.0
+ Requires-Dist: nltk>=3.6
+ Requires-Dist: scikit-learn>=1.0
+ Requires-Dist: networkx>=2.6
+ Requires-Dist: requests>=2.25.0
+ Dynamic: author
+ Dynamic: author-email
+ Dynamic: classifier
+ Dynamic: description
+ Dynamic: description-content-type
+ Dynamic: home-page
+ Dynamic: license
+ Dynamic: license-file
+ Dynamic: requires-dist
+ Dynamic: requires-python
+ Dynamic: summary
+
+ # Text Summarizer
+
+ A Python-based text summarization tool that uses GloVe word embeddings and PageRank algorithm to generate extractive summaries of documents.
+
+ ## Features
+
+ - **Extractive Summarization**: Uses sentence similarity and PageRank to identify the most important sentences
+ - **GloVe Embeddings**: Leverages pre-trained GloVe word vectors for semantic similarity calculation
+ - **Multiple Input Methods**: Support for single documents, CSV files, or interactive creation
+ - **GUI Interface**: User-friendly Tkinter-based graphical interface
+ - **Command Line Interface**: Scriptable command-line tool for automation
+ - **Batch Processing**: Process multiple documents at once
+
+ ## Installation
+
+ ### Prerequisites
+
+ - Python 3.8 or higher
+ - Required packages (automatically installed): pandas, numpy, nltk, scikit-learn, networkx
+
+ ### Install from PyPI
+
+ ```bash
+ pip install text-summarizer-aweebtaku
+ ```
+
+ ### Install from Source
+
+ 1. Clone the repository:
+ ```bash
+ git clone https://github.com/AWeebTaku/Summarizer.git
+ cd Summarizer
+ ```
+
+ 2. Install the package:
+ ```bash
+ pip install -e .
+ ```
+
+ ### Upgrade Package
+
+ To upgrade to the latest version with new features:
+ ```bash
+ pip install --upgrade text-summarizer-aweebtaku
+ ```
+
+ ### Create Desktop Shortcuts (Windows)
+
+ After installation, create desktop shortcuts for easy access:
+
+ **Option 1: Automatic (Recommended)**
+ ```bash
+ text-summarizer-shortcuts
+ ```
+ This will create desktop shortcuts for both GUI and CLI versions.
+
+ **Option 2: Manual**
+ Run the included batch file:
+ ```cmd
+ create_shortcuts.bat
+ ```
+
+ ### Download GloVe Embeddings
+
+ **No manual download required!** The package will automatically download GloVe embeddings (100d, ~400MB) on first use and cache them in your home directory (`~/.text_summarizer/`).
+
+ If you prefer to use your own GloVe file, you can specify the path:
+ ```python
+ summarizer = TextSummarizer(glove_path='path/to/your/glove.6B.100d.txt')
+ ```
+
+ ## Usage
+
+ ### Console Scripts
+
+ After installation, you can use these commands from anywhere:
+
+ ```bash
+ # Upgrade to the latest version
+ pip install --upgrade text-summarizer-aweebtaku
+
+ # Launch the graphical user interface
+ text-summarizer-gui
+
+ # Use the command line interface
+ text-summarizer-aweebtaku --help
+
+ # Create desktop shortcuts (Windows only)
+ text-summarizer-shortcuts
+ ```
+
+ ### Command Line Interface
+
+ ```bash
+ # Summarize a CSV file
+ text-summarizer-aweebtaku --csv-file data/tennis.csv --article-id 1
+
+ # Interactive mode
+ text-summarizer-aweebtaku
+ ```
+
+ ### Graphical User Interface
+
+ ```bash
+ # Launch GUI (easiest way)
+ text-summarizer-aweebtaku --gui
+
+ # Or use the dedicated GUI command
+ text-summarizer-gui
+ ```
+
+ ### Python API
+
+ ```python
+ from text_summarizer import TextSummarizer
+
+ # Initialize summarizer (automatic GloVe download)
+ summarizer = TextSummarizer(num_sentences=3)
+
+ # Simple text summarization
+ text = "Your long text here..."
+ summary = summarizer.summarize_text(text)
+ print(summary)
+
+ # Advanced usage with DataFrame
+ import pandas as pd
+ df = pd.DataFrame([{'article_id': 1, 'article_text': text}])
+ scored_sentences = summarizer.run_summarization(df)
+ article_text, summary = summarizer.summarize_article(scored_sentences, 1, df)
+ ```
+
+ ## Data Format
+
+ Input data should be in CSV format with columns:
+ - `article_id`: Unique identifier for each document
+ - `article_text`: The full text of the document
+
+ Example:
+ ```csv
+ article_id,article_text
+ 1,"This is the first article. It contains multiple sentences..."
+ 2,"This is the second article. It also has several sentences..."
+ ```
+
+ ## Algorithm
+
+ The summarization process follows these steps:
+
+ 1. **Sentence Tokenization**: Split documents into individual sentences
+ 2. **Text Cleaning**: Remove punctuation, convert to lowercase, remove stopwords
+ 3. **Sentence Vectorization**: Convert sentences to vectors using GloVe embeddings
+ 4. **Similarity Calculation**: Compute cosine similarity between all sentence pairs
+ 5. **PageRank Scoring**: Apply PageRank algorithm to identify important sentences
+ 6. **Summary Extraction**: Select top-ranked sentences in original order
+
+ ## Configuration
+
+ - `glove_path`: Path to GloVe embeddings file (default: 'glove.6B.100d.txt/glove.6B.100d.txt')
+ - `num_sentences`: Number of sentences in summary (default: 5)
+
+ ## License
+
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+
+ ## Contributing
+
+ Contributions are welcome! Please feel free to submit a Pull Request.
+
+ ## Citation
+
+ If you use this tool in your research, please cite:
+
+ ```
+ @software{text_summarizer,
+ title = {Text Summarizer},
+ author = {Aditya Chaurasiya},
+ url = {https://github.com/AWeebTaku/Summarizer},
+ year = {2026}
+ }
+ ```
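The Algorithm section of the package description in this hunk lists six steps: sentence tokenization, cleaning, vectorization, cosine similarity, PageRank scoring, and extraction in original order. The sketch below walks through those steps using only the standard library. It is not the package's implementation: the diff does not show `summarizer.py`, so every function name here is hypothetical, a tiny hand-made embedding table stands in for GloVe, a regex stands in for a real sentence tokenizer, and PageRank is a plain power iteration rather than a library call.

```python
import math
import re

# Tiny stand-in stopword list; a real tool would use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "are", "of", "and", "to", "in", "it"}


def split_sentences(text):
    """Step 1: naive split after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def clean(sentence):
    """Step 2: lowercase, drop non-letters, remove stopwords."""
    words = re.sub(r"[^a-z ]", " ", sentence.lower()).split()
    return [w for w in words if w not in STOPWORDS]


def sentence_vector(words, embeddings, dim):
    """Step 3: average the vectors of known words (zero vector if none)."""
    known = [embeddings[w] for w in words if w in embeddings]
    if not known:
        return [0.0] * dim
    return [sum(col) / len(known) for col in zip(*known)]


def cosine(u, v):
    """Step 4: cosine similarity, defined as 0.0 when either vector is zero."""
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(x * x for x in v))
    return sum(x * y for x, y in zip(u, v)) / norm if norm else 0.0


def pagerank(weights, damping=0.85, iterations=50):
    """Step 5: power iteration over a weighted adjacency matrix.

    Nodes with no outgoing weight simply pass on no score (a simplification).
    """
    n = len(weights)
    scores = [1.0 / n] * n
    out_sums = [sum(row) for row in weights]
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                scores[j] * weights[j][i] / out_sums[j]
                for j in range(n)
                if out_sums[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(text, embeddings, dim, num_sentences=2):
    """Step 6: keep the top-ranked sentences, emitted in original order."""
    sentences = split_sentences(text)
    if not sentences:
        return ""
    vectors = [sentence_vector(clean(s), embeddings, dim) for s in sentences]
    n = len(sentences)
    # Zero diagonal; negative similarities are clamped so weights stay >= 0.
    sim = [
        [max(0.0, cosine(vectors[i], vectors[j])) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    scores = pagerank(sim)
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:num_sentences]
    return " ".join(sentences[i] for i in sorted(top))


# Hypothetical 2-dimensional vectors, standing in for 100-dimensional GloVe.
toy_embeddings = {
    "tennis": [1.0, 0.0], "match": [0.9, 0.1], "player": [0.8, 0.2],
    "weather": [0.0, 1.0], "rain": [0.1, 0.9],
}
text = "The tennis match was long. The player won the match. Rain is weather."
print(summarize(text, toy_embeddings, dim=2, num_sentences=2))
```

With these toy vectors the two tennis sentences are close to each other and far from the weather sentence, so they reinforce each other's PageRank score and are the ones selected. Sorting the chosen indices before joining is what preserves the original sentence order described in step 6.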