text-summarizer-aweebtaku 1.2.4__tar.gz → 1.2.6__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (22)
  1. {text_summarizer_aweebtaku-1.2.4 → text_summarizer_aweebtaku-1.2.6}/LICENSE +20 -20
  2. {text_summarizer_aweebtaku-1.2.4 → text_summarizer_aweebtaku-1.2.6}/MANIFEST.in +6 -6
  3. {text_summarizer_aweebtaku-1.2.4/text_summarizer_aweebtaku.egg-info → text_summarizer_aweebtaku-1.2.6}/PKG-INFO +206 -207
  4. {text_summarizer_aweebtaku-1.2.4 → text_summarizer_aweebtaku-1.2.6}/README.md +179 -169
  5. {text_summarizer_aweebtaku-1.2.4 → text_summarizer_aweebtaku-1.2.6}/create_shortcuts.bat +15 -15
  6. {text_summarizer_aweebtaku-1.2.4 → text_summarizer_aweebtaku-1.2.6}/setup.cfg +4 -4
  7. {text_summarizer_aweebtaku-1.2.4 → text_summarizer_aweebtaku-1.2.6}/setup.py +46 -39
  8. {text_summarizer_aweebtaku-1.2.4 → text_summarizer_aweebtaku-1.2.6}/text_summarizer/__init__.py +3 -3
  9. {text_summarizer_aweebtaku-1.2.4 → text_summarizer_aweebtaku-1.2.6}/text_summarizer/cli.py +96 -88
  10. {text_summarizer_aweebtaku-1.2.4 → text_summarizer_aweebtaku-1.2.6}/text_summarizer/create_shortcuts.py +63 -63
  11. {text_summarizer_aweebtaku-1.2.4 → text_summarizer_aweebtaku-1.2.6}/text_summarizer/summarizer.py +322 -322
  12. {text_summarizer_aweebtaku-1.2.4 → text_summarizer_aweebtaku-1.2.6}/text_summarizer/ui.py +379 -379
  13. {text_summarizer_aweebtaku-1.2.4 → text_summarizer_aweebtaku-1.2.6/text_summarizer_aweebtaku.egg-info}/PKG-INFO +206 -207
  14. {text_summarizer_aweebtaku-1.2.4 → text_summarizer_aweebtaku-1.2.6}/text_summarizer_aweebtaku.egg-info/SOURCES.txt +0 -2
  15. text_summarizer_aweebtaku-1.2.6/text_summarizer_aweebtaku.egg-info/requires.txt +6 -0
  16. text_summarizer_aweebtaku-1.2.4/requirements.txt +0 -6
  17. text_summarizer_aweebtaku-1.2.4/text_summarizer/data/tennis.csv +0 -9
  18. text_summarizer_aweebtaku-1.2.4/text_summarizer_aweebtaku.egg-info/requires.txt +0 -6
  19. {text_summarizer_aweebtaku-1.2.4 → text_summarizer_aweebtaku-1.2.6}/text_summarizer/data/__init__.py +0 -0
  20. {text_summarizer_aweebtaku-1.2.4 → text_summarizer_aweebtaku-1.2.6}/text_summarizer_aweebtaku.egg-info/dependency_links.txt +0 -0
  21. {text_summarizer_aweebtaku-1.2.4 → text_summarizer_aweebtaku-1.2.6}/text_summarizer_aweebtaku.egg-info/entry_points.txt +0 -0
  22. {text_summarizer_aweebtaku-1.2.4 → text_summarizer_aweebtaku-1.2.6}/text_summarizer_aweebtaku.egg-info/top_level.txt +0 -0
@@ -1,21 +1,21 @@
- MIT License
-
- Copyright (c) 2026 Aditya Chaurasiya
-
- Permission is hereby granted, free of charge, to any person obtaining a copy
- of this software and associated documentation files (the "Software"), to deal
- in the Software without restriction, including without limitation the rights
- to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
- copies of the Software, and to permit persons to whom the Software is
- furnished to do so, subject to the following conditions:
-
- The above copyright notice and this permission notice shall be included in all
- copies or substantial portions of the Software.
-
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
- AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
- LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
- OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ MIT License
+
+ Copyright (c) 2026 Aditya Chaurasiya
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
  SOFTWARE.
@@ -1,7 +1,7 @@
- include README.md
- include LICENSE
- include requirements.txt
- include create_shortcuts.bat
- recursive-include text_summarizer/data *.csv
- global-exclude *.pyc
+ include README.md
+ include LICENSE
+ include requirements.txt
+ include create_shortcuts.bat
+ recursive-include text_summarizer/data *.csv
+ global-exclude *.pyc
  global-exclude __pycache__
@@ -1,207 +1,206 @@
- Metadata-Version: 2.4
- Name: text-summarizer-aweebtaku
- Version: 1.2.4
- Summary: A text summarization tool using GloVe embeddings and PageRank algorithm
- Home-page: https://github.com/AWeebTaku/Summarizer
- Author: Aditya Chaurasiya
- Author-email: adityachaurasiya57527@gmail.com
- License: MIT
- Classifier: Development Status :: 4 - Beta
- Classifier: Intended Audience :: Developers
- Classifier: Operating System :: OS Independent
- Classifier: Programming Language :: Python :: 3
- Classifier: Programming Language :: Python :: 3.8
- Classifier: Programming Language :: Python :: 3.9
- Classifier: Programming Language :: Python :: 3.10
- Classifier: Programming Language :: Python :: 3.11
- Requires-Python: >=3.8
- Description-Content-Type: text/markdown
- License-File: LICENSE
- Requires-Dist: pandas
- Requires-Dist: numpy
- Requires-Dist: nltk
- Requires-Dist: scikit-learn
- Requires-Dist: networkx
- Requires-Dist: requests
- Dynamic: author
- Dynamic: author-email
- Dynamic: classifier
- Dynamic: description
- Dynamic: description-content-type
- Dynamic: home-page
- Dynamic: license
- Dynamic: license-file
- Dynamic: requires-dist
- Dynamic: requires-python
- Dynamic: summary
-
- # Text Summarizer
-
- A Python-based text summarization tool that uses GloVe word embeddings and PageRank algorithm to generate extractive summaries of documents.
-
- ## Features
-
- - **Extractive Summarization**: Uses sentence similarity and PageRank to identify the most important sentences
- - **GloVe Embeddings**: Leverages pre-trained GloVe word vectors for semantic similarity calculation
- - **Multiple Input Methods**: Support for single documents, CSV files, or interactive creation
- - **GUI Interface**: User-friendly Tkinter-based graphical interface
- - **Command Line Interface**: Scriptable command-line tool for automation
- - **Batch Processing**: Process multiple documents at once
-
- ## Installation
-
- ### Prerequisites
-
- - Python 3.8 or higher
- - Required packages (automatically installed): pandas, numpy, nltk, scikit-learn, networkx
-
- ### Install from PyPI
-
- ```bash
- pip install text-summarizer-aweebtaku
- ```
-
- ### Install from Source
-
- 1. Clone the repository:
- ```bash
- git clone https://github.com/AWeebTaku/Summarizer.git
- cd Summarizer
- ```
-
- 2. Install the package:
- ```bash
- pip install -e .
- ```
-
- ### Create Desktop Shortcuts (Windows)
-
- After installation, create desktop shortcuts for easy access:
-
- **Option 1: Automatic (Recommended)**
- ```bash
- text-summarizer-shortcuts
- ```
- This will create desktop shortcuts for both GUI and CLI versions.
-
- **Option 2: Manual**
- Run the included batch file:
- ```cmd
- create_shortcuts.bat
- ```
-
- ### Download GloVe Embeddings
-
- **No manual download required!** The package will automatically download GloVe embeddings (100d, ~400MB) on first use and cache them in your home directory (`~/.text_summarizer/`).
-
- If you prefer to use your own GloVe file, you can specify the path:
- ```python
- summarizer = TextSummarizer(glove_path='path/to/your/glove.6B.100d.txt')
- ```
-
- ## Usage
-
- ### Console Scripts
-
- After installation, you can use these commands from anywhere:
-
- ```bash
- # Launch the graphical user interface
- text-summarizer-gui
-
- # Use the command line interface
- text-summarizer-aweebtaku --help
-
- # Create desktop shortcuts (Windows only)
- text-summarizer-shortcuts
- ```
-
- ### Command Line Interface
-
- ```bash
- # Summarize a CSV file
- text-summarizer-aweebtaku --csv-file data/tennis.csv --article-id 1
-
- # Interactive mode
- text-summarizer-aweebtaku
- ```
-
- ### Graphical User Interface
-
- ```bash
- # Launch GUI (easiest way)
- text-summarizer-aweebtaku --gui
-
- # Or use the dedicated GUI command
- text-summarizer-gui
- ```
-
- ### Python API
-
- ```python
- from text_summarizer import TextSummarizer
-
- # Initialize summarizer (automatic GloVe download)
- summarizer = TextSummarizer(num_sentences=3)
-
- # Simple text summarization
- text = "Your long text here..."
- summary = summarizer.summarize_text(text)
- print(summary)
-
- # Advanced usage with DataFrame
- import pandas as pd
- df = pd.DataFrame([{'article_id': 1, 'article_text': text}])
- scored_sentences = summarizer.run_summarization(df)
- article_text, summary = summarizer.summarize_article(scored_sentences, 1, df)
- ```
-
- ## Data Format
-
- Input data should be in CSV format with columns:
- - `article_id`: Unique identifier for each document
- - `article_text`: The full text of the document
-
- Example:
- ```csv
- article_id,article_text
- 1,"This is the first article. It contains multiple sentences..."
- 2,"This is the second article. It also has several sentences..."
- ```
-
- ## Algorithm
-
- The summarization process follows these steps:
-
- 1. **Sentence Tokenization**: Split documents into individual sentences
- 2. **Text Cleaning**: Remove punctuation, convert to lowercase, remove stopwords
- 3. **Sentence Vectorization**: Convert sentences to vectors using GloVe embeddings
- 4. **Similarity Calculation**: Compute cosine similarity between all sentence pairs
- 5. **PageRank Scoring**: Apply PageRank algorithm to identify important sentences
- 6. **Summary Extraction**: Select top-ranked sentences in original order
-
- ## Configuration
-
- - `glove_path`: Path to GloVe embeddings file (default: 'glove.6B.100d.txt/glove.6B.100d.txt')
- - `num_sentences`: Number of sentences in summary (default: 5)
-
- ## License
-
- This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
-
- ## Contributing
-
- Contributions are welcome! Please feel free to submit a Pull Request.
-
- ## Citation
-
- If you use this tool in your research, please cite:
-
- ```
- @software{text_summarizer,
- title = {Text Summarizer},
- author = {Aditya Chaurasiya},
- url = {https://github.com/AWeebTaku/Summarizer},
- year = {2026}
- }
- ```
+ Metadata-Version: 2.1
+ Name: text-summarizer-aweebtaku
+ Version: 1.2.6
+ Summary: A text summarization tool using GloVe embeddings and PageRank algorithm
+ Home-page: https://github.com/AWeebTaku/Summarizer
+ Author: Aditya Chaurasiya
+ Author-email: adityachaurasiya57527@gmail.com
+ License: MIT
+ Classifier: Development Status :: 4 - Beta
+ Classifier: Intended Audience :: Developers
+ Classifier: Operating System :: OS Independent
+ Classifier: Programming Language :: Python :: 3
+ Classifier: Programming Language :: Python :: 3.8
+ Classifier: Programming Language :: Python :: 3.9
+ Classifier: Programming Language :: Python :: 3.10
+ Classifier: Programming Language :: Python :: 3.11
+ Requires-Python: >=3.8
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+ Requires-Dist: pandas>=1.3.0
+ Requires-Dist: numpy>=1.21.0
+ Requires-Dist: nltk>=3.6
+ Requires-Dist: scikit-learn>=1.0
+ Requires-Dist: networkx>=2.6
+ Requires-Dist: requests>=2.25.0
+
+ # Text Summarizer
+
+ A Python-based text summarization tool that uses GloVe word embeddings and PageRank algorithm to generate extractive summaries of documents.
+
+ ## Features
+
+ - **Extractive Summarization**: Uses sentence similarity and PageRank to identify the most important sentences
+ - **GloVe Embeddings**: Leverages pre-trained GloVe word vectors for semantic similarity calculation
+ - **Multiple Input Methods**: Support for single documents, CSV files, or interactive creation
+ - **GUI Interface**: User-friendly Tkinter-based graphical interface
+ - **Command Line Interface**: Scriptable command-line tool for automation
+ - **Batch Processing**: Process multiple documents at once
+
+ ## Installation
+
+ ### Prerequisites
+
+ - Python 3.8 or higher
+ - Required packages (automatically installed): pandas, numpy, nltk, scikit-learn, networkx
+
+ ### Install from PyPI
+
+ ```bash
+ pip install text-summarizer-aweebtaku
+ ```
+
+ ### Install from Source
+
+ 1. Clone the repository:
+ ```bash
+ git clone https://github.com/AWeebTaku/Summarizer.git
+ cd Summarizer
+ ```
+
+ 2. Install the package:
+ ```bash
+ pip install -e .
+ ```
+
+ ### Upgrade Package
+
+ To upgrade to the latest version with new features:
+ ```bash
+ pip install --upgrade text-summarizer-aweebtaku
+ ```
+
+ ### Create Desktop Shortcuts (Windows)
+
+ After installation, create desktop shortcuts for easy access:
+
+ **Option 1: Automatic (Recommended)**
+ ```bash
+ text-summarizer-shortcuts
+ ```
+ This will create desktop shortcuts for both GUI and CLI versions.
+
+ **Option 2: Manual**
+ Run the included batch file:
+ ```cmd
+ create_shortcuts.bat
+ ```
+
+ ### Download GloVe Embeddings
+
+ **No manual download required!** The package will automatically download GloVe embeddings (100d, ~400MB) on first use and cache them in your home directory (`~/.text_summarizer/`).
+
+ If you prefer to use your own GloVe file, you can specify the path:
+ ```python
+ summarizer = TextSummarizer(glove_path='path/to/your/glove.6B.100d.txt')
+ ```
+
+ ## Usage
+
+ ### Console Scripts
+
+ After installation, you can use these commands from anywhere:
+
+ ```bash
+ # Upgrade to the latest version
+ pip install --upgrade text-summarizer-aweebtaku
+
+ # Launch the graphical user interface
+ text-summarizer-gui
+
+ # Use the command line interface
+ text-summarizer-aweebtaku --help
+
+ # Create desktop shortcuts (Windows only)
+ text-summarizer-shortcuts
+ ```
+
+ ### Command Line Interface
+
+ ```bash
+ # Summarize a CSV file
+ text-summarizer-aweebtaku --csv-file data/tennis.csv --article-id 1
+
+ # Interactive mode
+ text-summarizer-aweebtaku
+ ```
+
+ ### Graphical User Interface
+
+ ```bash
+ # Launch GUI (easiest way)
+ text-summarizer-aweebtaku --gui
+
+ # Or use the dedicated GUI command
+ text-summarizer-gui
+ ```
+
+ ### Python API
+
+ ```python
+ from text_summarizer import TextSummarizer
+
+ # Initialize summarizer (automatic GloVe download)
+ summarizer = TextSummarizer(num_sentences=3)
+
+ # Simple text summarization
+ text = "Your long text here..."
+ summary = summarizer.summarize_text(text)
+ print(summary)
+
+ # Advanced usage with DataFrame
+ import pandas as pd
+ df = pd.DataFrame([{'article_id': 1, 'article_text': text}])
+ scored_sentences = summarizer.run_summarization(df)
+ article_text, summary = summarizer.summarize_article(scored_sentences, 1, df)
+ ```
+
+ ## Data Format
+
+ Input data should be in CSV format with columns:
+ - `article_id`: Unique identifier for each document
+ - `article_text`: The full text of the document
+
+ Example:
+ ```csv
+ article_id,article_text
+ 1,"This is the first article. It contains multiple sentences..."
+ 2,"This is the second article. It also has several sentences..."
+ ```
+
+ ## Algorithm
+
+ The summarization process follows these steps:
+
+ 1. **Sentence Tokenization**: Split documents into individual sentences
+ 2. **Text Cleaning**: Remove punctuation, convert to lowercase, remove stopwords
+ 3. **Sentence Vectorization**: Convert sentences to vectors using GloVe embeddings
+ 4. **Similarity Calculation**: Compute cosine similarity between all sentence pairs
+ 5. **PageRank Scoring**: Apply PageRank algorithm to identify important sentences
+ 6. **Summary Extraction**: Select top-ranked sentences in original order
+
+ ## Configuration
+
+ - `glove_path`: Path to GloVe embeddings file (default: 'glove.6B.100d.txt/glove.6B.100d.txt')
+ - `num_sentences`: Number of sentences in summary (default: 5)
+
+ ## License
+
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
+
+ ## Contributing
+
+ Contributions are welcome! Please feel free to submit a Pull Request.
+
+ ## Citation
+
+ If you use this tool in your research, please cite:
+
+ ```
+ @software{text_summarizer,
+ title = {Text Summarizer},
+ author = {Aditya Chaurasiya},
+ url = {https://github.com/AWeebTaku/Summarizer},
+ year = {2026}
+ }
+ ```
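
Reviewer note: the six Algorithm steps listed in the README text of this PKG-INFO hunk can be sketched in a few lines. The following is a minimal illustration and not the package's code. It assumes a hand-made three-dimensional vector table (`TOY_VECTORS`, hypothetical values standing in for GloVe), a naive regex sentence splitter, a tiny hypothetical stopword set, and a hand-written power-iteration PageRank, all using only the Python standard library. Every function name here (`split_sentences`, `clean`, `sentence_vector`, `cosine`, `pagerank`, `summarize`) is an assumption chosen for the sketch; the package declares nltk, scikit-learn, and networkx as dependencies and may delegate these steps to them.

```python
import math
import re

# Hypothetical 3-d vectors standing in for GloVe (illustration only).
TOY_VECTORS = {
    "tennis": [0.9, 0.1, 0.0],
    "match": [0.8, 0.2, 0.1],
    "player": [0.7, 0.3, 0.0],
    "rain": [0.0, 0.9, 0.2],
    "weather": [0.1, 0.8, 0.3],
}
STOPWORDS = {"the", "a", "an", "is", "was", "of", "and", "in", "to", "by"}
DIM = 3


def split_sentences(text):
    # Step 1: naive split on sentence-ending punctuation followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def clean(sentence):
    # Step 2: drop non-letters, lowercase, remove stopwords.
    words = re.sub(r"[^a-zA-Z ]", " ", sentence).lower().split()
    return [w for w in words if w not in STOPWORDS]


def sentence_vector(words):
    # Step 3: average the vectors of known words; zero vector if none are known.
    vecs = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not vecs:
        return [0.0] * DIM
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def cosine(u, v):
    # Step 4: cosine similarity, defined as 0 when either vector is all zeros.
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)


def pagerank(sim, damping=0.85, iterations=50):
    # Step 5: power iteration on the weighted similarity graph (no self-loops).
    n = len(sim)
    if n == 0:
        return []
    out = [sum(sim[j][k] for k in range(n) if k != j) for j in range(n)]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                sim[j][i] / out[j] * scores[j]
                for j in range(n)
                if j != i and out[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(text, num_sentences=2):
    sentences = split_sentences(text)
    if not sentences:
        return ""
    vectors = [sentence_vector(clean(s)) for s in sentences]
    sim = [[cosine(u, v) for v in vectors] for u in vectors]
    scores = pagerank(sim)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    # Step 6: keep the top-ranked sentences, restored to their original order.
    return " ".join(sentences[i] for i in sorted(ranked[:num_sentences]))
```

Calling `summarize(text, 2)` on a short three-sentence passage about a tennis match and a rain delay keeps the two tennis sentences, because the toy vectors make them far more similar to each other than to the weather sentence.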