gptmed 0.3.4__tar.gz → 0.4.0__tar.gz

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (56)
  1. {gptmed-0.3.4/gptmed.egg-info → gptmed-0.4.0}/PKG-INFO +180 -43
  2. {gptmed-0.3.4 → gptmed-0.4.0}/README.md +170 -41
  3. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/__init__.py +37 -3
  4. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/model/__init__.py +2 -2
  5. gptmed-0.4.0/gptmed/observability/__init__.py +43 -0
  6. gptmed-0.4.0/gptmed/observability/base.py +369 -0
  7. gptmed-0.4.0/gptmed/observability/callbacks.py +397 -0
  8. gptmed-0.4.0/gptmed/observability/metrics_tracker.py +544 -0
  9. gptmed-0.4.0/gptmed/services/__init__.py +15 -0
  10. gptmed-0.4.0/gptmed/services/device_manager.py +252 -0
  11. gptmed-0.4.0/gptmed/services/training_service.py +489 -0
  12. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/training/trainer.py +124 -10
  13. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/utils/checkpoints.py +1 -1
  14. {gptmed-0.3.4 → gptmed-0.4.0/gptmed.egg-info}/PKG-INFO +180 -43
  15. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed.egg-info/SOURCES.txt +7 -0
  16. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed.egg-info/requires.txt +10 -0
  17. {gptmed-0.3.4 → gptmed-0.4.0}/pyproject.toml +14 -2
  18. {gptmed-0.3.4 → gptmed-0.4.0}/LICENSE +0 -0
  19. {gptmed-0.3.4 → gptmed-0.4.0}/MANIFEST.in +0 -0
  20. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/api.py +0 -0
  21. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/configs/__init__.py +0 -0
  22. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/configs/config_loader.py +0 -0
  23. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/configs/train_config.py +0 -0
  24. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/configs/training_config.yaml +0 -0
  25. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/data/__init__.py +0 -0
  26. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/data/parsers/__init__.py +0 -0
  27. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/data/parsers/medquad_parser.py +0 -0
  28. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/data/parsers/text_formatter.py +0 -0
  29. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/inference/__init__.py +0 -0
  30. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/inference/decoding_utils.py +0 -0
  31. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/inference/generation_config.py +0 -0
  32. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/inference/generator.py +0 -0
  33. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/inference/sampling.py +0 -0
  34. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/model/architecture/__init__.py +0 -0
  35. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/model/architecture/attention.py +0 -0
  36. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/model/architecture/decoder_block.py +0 -0
  37. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/model/architecture/embeddings.py +0 -0
  38. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/model/architecture/feedforward.py +0 -0
  39. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/model/architecture/transformer.py +0 -0
  40. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/model/configs/__init__.py +0 -0
  41. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/model/configs/model_config.py +0 -0
  42. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/tokenizer/__init__.py +0 -0
  43. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/tokenizer/tokenize_data.py +0 -0
  44. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/tokenizer/train_tokenizer.py +0 -0
  45. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/training/__init__.py +0 -0
  46. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/training/dataset.py +0 -0
  47. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/training/train.py +0 -0
  48. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/training/utils.py +0 -0
  49. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/utils/__init__.py +0 -0
  50. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed/utils/logging.py +0 -0
  51. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed.egg-info/dependency_links.txt +0 -0
  52. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed.egg-info/entry_points.txt +0 -0
  53. {gptmed-0.3.4 → gptmed-0.4.0}/gptmed.egg-info/top_level.txt +0 -0
  54. {gptmed-0.3.4 → gptmed-0.4.0}/requirements.txt +0 -0
  55. {gptmed-0.3.4 → gptmed-0.4.0}/setup.cfg +0 -0
  56. {gptmed-0.3.4 → gptmed-0.4.0}/setup.py +0 -0
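The bulk of the new code lands in `gptmed/observability/` (observer interfaces, metrics tracking, callbacks) and `gptmed/services/` (device management and a high-level training service); the README diff below documents how they fit together. As a quick orientation, here is a minimal end-to-end sketch assembled only from the calls shown in that diff — the file paths and config name are placeholders, not part of the package:

```python
import gptmed
from gptmed.observability import MetricsTracker, ConsoleCallback, EarlyStoppingCallback

# 1. Scaffold a training config, then edit data paths / model size by hand
gptmed.create_config('my_config.yaml')

# 2. Train with the v0.4.0 observers attached
tracker = MetricsTracker(output_dir='./metrics')
gptmed.train_from_config(
    'my_config.yaml',
    observers=[tracker, ConsoleCallback(print_every=50), EarlyStoppingCallback(patience=3)],
)

# 3. Inspect and export the collected metrics
report = tracker.get_report()
print(f"final loss {report['final_loss']:.4f} after {report['total_steps']} steps")
tracker.export_to_csv('training_metrics.csv')

# 4. Generate an answer from the saved checkpoint
answer = gptmed.generate(
    checkpoint='model/checkpoints/best_model.pt',
    tokenizer='tokenizer/my_tokenizer.model',
    prompt='What is machine learning?',
    max_length=150,
    temperature=0.7,
)
print(answer)
```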
PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.4
 Name: gptmed
-Version: 0.3.4
+Version: 0.4.0
 Summary: A lightweight GPT-based language model framework for training custom question-answering models on any domain
 Author-email: Sanjog Sigdel <sigdelsanjog@gmail.com>
 Maintainer-email: Sanjog Sigdel <sigdelsanjog@gmail.com>
@@ -10,7 +10,7 @@ Project-URL: Documentation, https://github.com/sigdelsanjog/gptmed#readme
 Project-URL: Repository, https://github.com/sigdelsanjog/gptmed
 Project-URL: Issues, https://github.com/sigdelsanjog/gptmed/issues
 Keywords: nlp,language-model,transformer,gpt,pytorch,qa,question-answering,training,deep-learning,custom-model
-Classifier: Development Status :: 3 - Alpha
+Classifier: Development Status :: 4 - Beta
 Classifier: Intended Audience :: Developers
 Classifier: Intended Audience :: Science/Research
 Classifier: Intended Audience :: Education
@@ -38,28 +38,64 @@ Requires-Dist: mypy>=0.950; extra == "dev"
 Provides-Extra: training
 Requires-Dist: tensorboard>=2.10.0; extra == "training"
 Requires-Dist: wandb>=0.13.0; extra == "training"
+Provides-Extra: visualization
+Requires-Dist: matplotlib>=3.5.0; extra == "visualization"
+Requires-Dist: seaborn>=0.12.0; extra == "visualization"
+Provides-Extra: xai
+Requires-Dist: matplotlib>=3.5.0; extra == "xai"
+Requires-Dist: seaborn>=0.12.0; extra == "xai"
+Requires-Dist: captum>=0.6.0; extra == "xai"
+Requires-Dist: scikit-learn>=1.0.0; extra == "xai"
 Dynamic: license-file

 # GptMed 🤖

-A lightweight GPT-based language model framework for training custom question-answering models on any domain. This package provides a transformer-based GPT architecture that you can train on your own Q&A datasets - whether it's casual conversations, technical support, education, or any other domain.
-
+[![Downloads](https://static.pepy.tech/badge/gptmed)](https://pepy.tech/project/gptmed)
+[![Downloads/Month](https://static.pepy.tech/badge/gptmed/month)](https://pepy.tech/project/gptmed)
 [![PyPI version](https://badge.fury.io/py/gptmed.svg)](https://badge.fury.io/py/gptmed)
 [![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

-## 📖 [Complete User Manual](USER_MANUAL.md) | [Quick Start](#quick-start)
+A lightweight GPT-based language model framework for training custom question-answering models on any domain. This package provides a transformer-based GPT architecture that you can train on your own Q&A datasets - whether it's casual conversations, technical support, education, or any other domain.

-> **New to GptMed?** Check out the [**step-by-step User Manual**](USER_MANUAL.md) for a complete guide on training your own model!
+## Citation

-## Features
+If you use this model in your research, please cite:

-- 🧠 **Custom GPT Architecture**: Lightweight transformer model for any Q&A domain
-- 🎯 **Domain-Agnostic**: Train on any question-answering dataset (casual chat, tech support, education, etc.)
-- **Fast Inference**: Optimized for quick question answering
-- 🔧 **Flexible Training**: Easy to train on your own custom datasets
-- 📦 **Lightweight**: Small model size suitable for edge deployment
-- 🛠️ **Complete Toolkit**: Includes tokenizer training, model training, and inference utilities
+```bibtex
+@software{gptmed_2026,
+  author = {Sanjog Sigdel},
+  title = {GptMed: A custom causal question answering general purpose GPT Transformer Architecture Model},
+  year = {2026},
+  url = {https://github.com/sigdelsanjog/gptmed}
+}
+```
+
+## Table of Contents
+
+- [Installation](#installation)
+- [From PyPI (Recommended)](#from-pypi-recommended)
+- [From Source](#from-source)
+- [With Optional Dependencies](#with-optional-dependencies)
+- [Quick Start](#quick-start)
+- [Using the High-Level API](#using-the-high-level-api)
+- [Inference (Generate Answers)](#inference-generate-answers)
+- [Using Command Line](#using-command-line)
+- [Training Your Own Model](#training-your-own-model)
+- [Model Architecture](#model-architecture)
+- [Configuration](#configuration)
+- [Model Sizes](#model-sizes)
+- [Training Configuration](#training-configuration)
+- [Observability](#observability)
+- [Project Structure](#project-structure)
+- [Requirements](#requirements)
+- [Documentation](#documentation)
+- [Performance](#performance)
+- [Examples](#examples)
+- [Contributing](#contributing)
+- [Citation](#citation)
+- [License](#license)
+- [Support](#support)

 ## Installation

@@ -83,15 +119,49 @@ pip install -e .
 # For development
 pip install gptmed[dev]

-# For training
+# For training with logging integrations
 pip install gptmed[training]

+# For visualization (loss curves, metrics plots)
+pip install gptmed[visualization]
+
+# For Explainable AI features
+pip install gptmed[xai]
+
 # All dependencies
-pip install gptmed[dev,training]
+pip install gptmed[dev,training,visualization,xai]
 ```

 ## Quick Start

+### Using the High-Level API
+
+The easiest way to use GptMed is through the high-level API:
+
+```python
+import gptmed
+
+# 1. Create a training configuration
+gptmed.create_config('my_config.yaml')
+
+# 2. Edit my_config.yaml with your settings (data paths, model size, etc.)
+
+# 3. Train the model
+gptmed.train_from_config('my_config.yaml')
+
+# 4. Generate answers
+answer = gptmed.generate(
+    checkpoint='model/checkpoints/best_model.pt',
+    tokenizer='tokenizer/my_tokenizer.model',
+    prompt='What is machine learning?',
+    max_length=150,
+    temperature=0.7
+)
+print(answer)
+```
+
+For a complete API testing workflow, see the [gptmed-api folder](https://github.com/sigdelsanjog/gptmed/tree/main/gptmed-api) with ready-to-run examples.
+
 ### Inference (Generate Answers)

 ```python
@@ -187,6 +257,50 @@ config = TrainingConfig(
 )
 ```

+## Observability
+
+**New in v0.4.0**: Built-in training monitoring with Observer Pattern architecture.
+
+### Features
+
+- 📊 **Loss Curves**: Track training/validation loss over time
+- 📈 **Metrics Tracking**: Perplexity, gradient norms, learning rates
+- 🔔 **Callbacks**: Console output, JSON logging, early stopping
+- 📁 **Export**: CSV export, matplotlib visualizations
+- 🔌 **Extensible**: Add custom observers for integrations (W&B, TensorBoard)
+
+### Quick Example
+
+```python
+from gptmed.observability import MetricsTracker, ConsoleCallback, EarlyStoppingCallback
+
+# Create observers
+tracker = MetricsTracker(output_dir='./metrics')
+console = ConsoleCallback(print_every=50)
+early_stop = EarlyStoppingCallback(patience=3)
+
+# Use with TrainingService (automatic)
+from gptmed.services import TrainingService
+service = TrainingService(config_path='config.yaml')
+service.train()  # Automatically creates MetricsTracker
+
+# Or use with Trainer directly
+trainer = Trainer(model, train_loader, config, observers=[tracker, console])
+trainer.train()
+```
+
+### Available Observers
+
+| Observer                | Description                                                |
+| ----------------------- | ---------------------------------------------------------- |
+| `MetricsTracker`        | Comprehensive metrics collection with export capabilities  |
+| `ConsoleCallback`       | Real-time console output with progress bars                |
+| `JSONLoggerCallback`    | Structured JSON logging for analysis                       |
+| `EarlyStoppingCallback` | Stop training when validation loss plateaus                |
+| `LRSchedulerCallback`   | Learning rate scheduling integration                       |
+
+See [XAI.md](XAI.md) for future Explainable AI features roadmap.
+
 ## Project Structure

 ```
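The Observability section above advertises custom observers for integrations such as W&B or TensorBoard but does not show one. Below is a rough sketch of what such an observer could look like, passed through the same `observers=[...]` argument used in the Quick Example; the hook name `on_step_end` and its signature are assumptions, since the actual interface lives in `gptmed/observability/base.py` and is not shown in this diff:

```python
from torch.utils.tensorboard import SummaryWriter


class TensorBoardObserver:
    """Hypothetical custom observer that forwards per-step metrics to TensorBoard.

    The real observer interface is defined in gptmed/observability/base.py;
    the hook name and signature below are assumed, not taken from the package.
    """

    def __init__(self, log_dir: str = './runs'):
        self.writer = SummaryWriter(log_dir=log_dir)

    def on_step_end(self, step: int, metrics: dict) -> None:
        # Assumed to be called after each optimizer step with scalar metrics
        # such as loss, learning rate, and gradient norm.
        for name, value in metrics.items():
            self.writer.add_scalar(name, value, step)

    def close(self) -> None:
        self.writer.close()


# Used like the built-in observers shown above:
# trainer = Trainer(model, train_loader, config, observers=[TensorBoardObserver()])
```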
@@ -201,10 +315,16 @@ gptmed/
 │   ├── train.py             # Training script
 │   ├── trainer.py           # Training loop
 │   └── dataset.py           # Data loading
+├── observability/           # Training monitoring & XAI (v0.4.0+)
+│   ├── base.py              # Observer pattern interfaces
+│   ├── metrics_tracker.py   # Loss curves & metrics
+│   └── callbacks.py         # Console, JSON, early stopping
 ├── tokenizer/
 │   └── train_tokenizer.py   # SentencePiece tokenizer
 ├── configs/
 │   └── train_config.py      # Training configurations
+├── services/
+│   └── training_service.py  # High-level training orchestration
 └── utils/
     ├── checkpoints.py       # Model checkpointing
     └── logging.py           # Training logging
@@ -226,6 +346,7 @@

 - [User Manual](USER_MANUAL.md) - **Start here!** Complete training pipeline guide
 - [Architecture Guide](ARCHITECTURE_EXTENSION_GUIDE.md) - Understanding the model architecture
+- [XAI Roadmap](XAI.md) - Explainable AI features & implementation guide
 - [Deployment Guide](DEPLOYMENT_GUIDE.md) - Publishing to PyPI
 - [Changelog](CHANGELOG.md) - Version history

@@ -241,20 +362,53 @@ _Tested on GTX 1080 8GB_

 ## Examples

-### Medical Question Answering
+### Domain-Agnostic Usage
+
+GptMed works with **any domain** - just train on your own Q&A data:

 ```python
-# Example 1: Symptoms inquiry
-question = "What are the early signs of Alzheimer's disease?"
+# Technical Support Bot
+question = "How do I reset my WiFi router?"
 answer = generator.generate(question, temperature=0.7)

-# Example 2: Treatment information
-question = "How is Type 2 diabetes treated?"
+# Educational Assistant
+question = "Explain the water cycle in simple terms"
 answer = generator.generate(question, temperature=0.6)

-# Example 3: Medical definitions
-question = "What is hypertension?"
+# Customer Service
+question = "What is your return policy?"
 answer = generator.generate(question, temperature=0.5)
+
+# Medical Q&A (example domain)
+question = "What are the symptoms of flu?"
+answer = generator.generate(question, temperature=0.7)
+```
+
+### Training Observability (v0.4.0+)
+
+Monitor your training with built-in observability:
+
+```python
+from gptmed.observability import MetricsTracker, ConsoleCallback
+
+# Create observers
+tracker = MetricsTracker(output_dir='./metrics')
+console = ConsoleCallback(print_every=10)
+
+# Train with observability
+gptmed.train_from_config(
+    'my_config.yaml',
+    observers=[tracker, console]
+)
+
+# After training - get the report
+report = tracker.get_report()
+print(f"Final Loss: {report['final_loss']:.4f}")
+print(f"Total Steps: {report['total_steps']}")
+
+# Export metrics
+tracker.export_to_csv('training_metrics.csv')
+tracker.plot_loss_curves('loss_curves.png')  # Requires matplotlib
 ```

 ## Contributing
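The Training Observability example above ends with `export_to_csv` and `plot_loss_curves` (the latter needs the matplotlib extra). The exported CSV can also be post-processed by hand; a small sketch follows, with the caveat that the column names `step` and `train_loss` are assumptions about the export format rather than something documented in this diff:

```python
import csv

import matplotlib.pyplot as plt

steps, losses = [], []
with open('training_metrics.csv', newline='') as f:
    for row in csv.DictReader(f):
        # Column names are assumed; adjust to whatever MetricsTracker actually writes.
        steps.append(int(row['step']))
        losses.append(float(row['train_loss']))

plt.plot(steps, losses)
plt.xlabel('step')
plt.ylabel('training loss')
plt.savefig('loss_curves_manual.png')
```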
@@ -267,19 +421,6 @@ Contributions are welcome! Please feel free to submit a Pull Request.
 4. Push to the branch (`git push origin feature/AmazingFeature`)
 5. Open a Pull Request

-## Citation
-
-If you use this model in your research, please cite:
-
-```bibtex
-@software{llm_med_2026,
-  author = {Sanjog Sigdel},
-  title = {GptMed: A custom causal question answering general purpose GPT Transformer Architecture Model},
-  year = {2026},
-  url = {https://github.com/sigdelsanjog/gptmed}
-}
-```
-
 ## License

 This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
@@ -289,16 +430,12 @@ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file
 - MedQuAD dataset creators
 - PyTorch team

-## Disclaimer
-
-⚠️ **Medical Disclaimer**: This model is for research and educational purposes only. It should NOT be used for actual medical diagnosis or treatment decisions. Always consult qualified healthcare professionals for medical advice.
-
 ## Support

-- **[User Manual](USER_MANUAL.md)** - Complete step-by-step training guide
-- 📫 Issues: [GitHub Issues](https://github.com/sigdelsanjog/gptmed/issues)
+- 📫 [User Manual](USER_MANUAL.md)\*\* - Complete step-by-step training guide
+- 📫 Issues: [GitHub Issues](https://github.com/sigdelsanjog/gptmed/issues)
 - 💬 Discussions: [GitHub Discussions](https://github.com/sigdelsanjog/gptmed/discussions)
-- 📧 Email: sanjog.sigdel@ku.edu.np
+- 📧 Email: sigdelsanjog@gmail.com | sanjog.sigdel@ku.edu.np

 ## Changelog

@@ -306,4 +443,4 @@ See [CHANGELOG.md](CHANGELOG.md) for version history.

 ---

-Made with ❤️ for learning purpose
+#### Made with ❤️ from Nepal
README.md
@@ -1,23 +1,51 @@
 # GptMed 🤖

-A lightweight GPT-based language model framework for training custom question-answering models on any domain. This package provides a transformer-based GPT architecture that you can train on your own Q&A datasets - whether it's casual conversations, technical support, education, or any other domain.
-
+[![Downloads](https://static.pepy.tech/badge/gptmed)](https://pepy.tech/project/gptmed)
+[![Downloads/Month](https://static.pepy.tech/badge/gptmed/month)](https://pepy.tech/project/gptmed)
 [![PyPI version](https://badge.fury.io/py/gptmed.svg)](https://badge.fury.io/py/gptmed)
 [![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
 [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

-## 📖 [Complete User Manual](USER_MANUAL.md) | [Quick Start](#quick-start)
+A lightweight GPT-based language model framework for training custom question-answering models on any domain. This package provides a transformer-based GPT architecture that you can train on your own Q&A datasets - whether it's casual conversations, technical support, education, or any other domain.

-> **New to GptMed?** Check out the [**step-by-step User Manual**](USER_MANUAL.md) for a complete guide on training your own model!
+## Citation

-## Features
+If you use this model in your research, please cite:

-- 🧠 **Custom GPT Architecture**: Lightweight transformer model for any Q&A domain
-- 🎯 **Domain-Agnostic**: Train on any question-answering dataset (casual chat, tech support, education, etc.)
-- **Fast Inference**: Optimized for quick question answering
-- 🔧 **Flexible Training**: Easy to train on your own custom datasets
-- 📦 **Lightweight**: Small model size suitable for edge deployment
-- 🛠️ **Complete Toolkit**: Includes tokenizer training, model training, and inference utilities
+```bibtex
+@software{gptmed_2026,
+  author = {Sanjog Sigdel},
+  title = {GptMed: A custom causal question answering general purpose GPT Transformer Architecture Model},
+  year = {2026},
+  url = {https://github.com/sigdelsanjog/gptmed}
+}
+```
+
+## Table of Contents
+
+- [Installation](#installation)
+- [From PyPI (Recommended)](#from-pypi-recommended)
+- [From Source](#from-source)
+- [With Optional Dependencies](#with-optional-dependencies)
+- [Quick Start](#quick-start)
+- [Using the High-Level API](#using-the-high-level-api)
+- [Inference (Generate Answers)](#inference-generate-answers)
+- [Using Command Line](#using-command-line)
+- [Training Your Own Model](#training-your-own-model)
+- [Model Architecture](#model-architecture)
+- [Configuration](#configuration)
+- [Model Sizes](#model-sizes)
+- [Training Configuration](#training-configuration)
+- [Observability](#observability)
+- [Project Structure](#project-structure)
+- [Requirements](#requirements)
+- [Documentation](#documentation)
+- [Performance](#performance)
+- [Examples](#examples)
+- [Contributing](#contributing)
+- [Citation](#citation)
+- [License](#license)
+- [Support](#support)

 ## Installation

@@ -41,15 +69,49 @@ pip install -e .
 # For development
 pip install gptmed[dev]

-# For training
+# For training with logging integrations
 pip install gptmed[training]

+# For visualization (loss curves, metrics plots)
+pip install gptmed[visualization]
+
+# For Explainable AI features
+pip install gptmed[xai]
+
 # All dependencies
-pip install gptmed[dev,training]
+pip install gptmed[dev,training,visualization,xai]
 ```

 ## Quick Start

+### Using the High-Level API
+
+The easiest way to use GptMed is through the high-level API:
+
+```python
+import gptmed
+
+# 1. Create a training configuration
+gptmed.create_config('my_config.yaml')
+
+# 2. Edit my_config.yaml with your settings (data paths, model size, etc.)
+
+# 3. Train the model
+gptmed.train_from_config('my_config.yaml')
+
+# 4. Generate answers
+answer = gptmed.generate(
+    checkpoint='model/checkpoints/best_model.pt',
+    tokenizer='tokenizer/my_tokenizer.model',
+    prompt='What is machine learning?',
+    max_length=150,
+    temperature=0.7
+)
+print(answer)
+```
+
+For a complete API testing workflow, see the [gptmed-api folder](https://github.com/sigdelsanjog/gptmed/tree/main/gptmed-api) with ready-to-run examples.
+
 ### Inference (Generate Answers)

 ```python
@@ -145,6 +207,50 @@ config = TrainingConfig(
 )
 ```

+## Observability
+
+**New in v0.4.0**: Built-in training monitoring with Observer Pattern architecture.
+
+### Features
+
+- 📊 **Loss Curves**: Track training/validation loss over time
+- 📈 **Metrics Tracking**: Perplexity, gradient norms, learning rates
+- 🔔 **Callbacks**: Console output, JSON logging, early stopping
+- 📁 **Export**: CSV export, matplotlib visualizations
+- 🔌 **Extensible**: Add custom observers for integrations (W&B, TensorBoard)
+
+### Quick Example
+
+```python
+from gptmed.observability import MetricsTracker, ConsoleCallback, EarlyStoppingCallback
+
+# Create observers
+tracker = MetricsTracker(output_dir='./metrics')
+console = ConsoleCallback(print_every=50)
+early_stop = EarlyStoppingCallback(patience=3)
+
+# Use with TrainingService (automatic)
+from gptmed.services import TrainingService
+service = TrainingService(config_path='config.yaml')
+service.train()  # Automatically creates MetricsTracker
+
+# Or use with Trainer directly
+trainer = Trainer(model, train_loader, config, observers=[tracker, console])
+trainer.train()
+```
+
+### Available Observers
+
+| Observer                | Description                                                |
+| ----------------------- | ---------------------------------------------------------- |
+| `MetricsTracker`        | Comprehensive metrics collection with export capabilities  |
+| `ConsoleCallback`       | Real-time console output with progress bars                |
+| `JSONLoggerCallback`    | Structured JSON logging for analysis                       |
+| `EarlyStoppingCallback` | Stop training when validation loss plateaus                |
+| `LRSchedulerCallback`   | Learning rate scheduling integration                       |
+
+See [XAI.md](XAI.md) for future Explainable AI features roadmap.
+
 ## Project Structure

 ```
@@ -159,10 +265,16 @@ gptmed/
 │   ├── train.py             # Training script
 │   ├── trainer.py           # Training loop
 │   └── dataset.py           # Data loading
+├── observability/           # Training monitoring & XAI (v0.4.0+)
+│   ├── base.py              # Observer pattern interfaces
+│   ├── metrics_tracker.py   # Loss curves & metrics
+│   └── callbacks.py         # Console, JSON, early stopping
 ├── tokenizer/
 │   └── train_tokenizer.py   # SentencePiece tokenizer
 ├── configs/
 │   └── train_config.py      # Training configurations
+├── services/
+│   └── training_service.py  # High-level training orchestration
 └── utils/
     ├── checkpoints.py       # Model checkpointing
     └── logging.py           # Training logging
@@ -184,6 +296,7 @@ gptmed/

 - [User Manual](USER_MANUAL.md) - **Start here!** Complete training pipeline guide
 - [Architecture Guide](ARCHITECTURE_EXTENSION_GUIDE.md) - Understanding the model architecture
+- [XAI Roadmap](XAI.md) - Explainable AI features & implementation guide
 - [Deployment Guide](DEPLOYMENT_GUIDE.md) - Publishing to PyPI
 - [Changelog](CHANGELOG.md) - Version history

@@ -199,20 +312,53 @@ _Tested on GTX 1080 8GB_

 ## Examples

-### Medical Question Answering
+### Domain-Agnostic Usage
+
+GptMed works with **any domain** - just train on your own Q&A data:

 ```python
-# Example 1: Symptoms inquiry
-question = "What are the early signs of Alzheimer's disease?"
+# Technical Support Bot
+question = "How do I reset my WiFi router?"
 answer = generator.generate(question, temperature=0.7)

-# Example 2: Treatment information
-question = "How is Type 2 diabetes treated?"
+# Educational Assistant
+question = "Explain the water cycle in simple terms"
 answer = generator.generate(question, temperature=0.6)

-# Example 3: Medical definitions
-question = "What is hypertension?"
+# Customer Service
+question = "What is your return policy?"
 answer = generator.generate(question, temperature=0.5)
+
+# Medical Q&A (example domain)
+question = "What are the symptoms of flu?"
+answer = generator.generate(question, temperature=0.7)
+```
+
+### Training Observability (v0.4.0+)
+
+Monitor your training with built-in observability:
+
+```python
+from gptmed.observability import MetricsTracker, ConsoleCallback
+
+# Create observers
+tracker = MetricsTracker(output_dir='./metrics')
+console = ConsoleCallback(print_every=10)
+
+# Train with observability
+gptmed.train_from_config(
+    'my_config.yaml',
+    observers=[tracker, console]
+)
+
+# After training - get the report
+report = tracker.get_report()
+print(f"Final Loss: {report['final_loss']:.4f}")
+print(f"Total Steps: {report['total_steps']}")
+
+# Export metrics
+tracker.export_to_csv('training_metrics.csv')
+tracker.plot_loss_curves('loss_curves.png')  # Requires matplotlib
 ```

 ## Contributing
@@ -225,19 +371,6 @@ Contributions are welcome! Please feel free to submit a Pull Request.
 4. Push to the branch (`git push origin feature/AmazingFeature`)
 5. Open a Pull Request

-## Citation
-
-If you use this model in your research, please cite:
-
-```bibtex
-@software{llm_med_2026,
-  author = {Sanjog Sigdel},
-  title = {GptMed: A custom causal question answering general purpose GPT Transformer Architecture Model},
-  year = {2026},
-  url = {https://github.com/sigdelsanjog/gptmed}
-}
-```
-
 ## License

 This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
@@ -247,16 +380,12 @@ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file
 - MedQuAD dataset creators
 - PyTorch team

-## Disclaimer
-
-⚠️ **Medical Disclaimer**: This model is for research and educational purposes only. It should NOT be used for actual medical diagnosis or treatment decisions. Always consult qualified healthcare professionals for medical advice.
-
 ## Support

-- **[User Manual](USER_MANUAL.md)** - Complete step-by-step training guide
-- 📫 Issues: [GitHub Issues](https://github.com/sigdelsanjog/gptmed/issues)
+- 📫 [User Manual](USER_MANUAL.md)\*\* - Complete step-by-step training guide
+- 📫 Issues: [GitHub Issues](https://github.com/sigdelsanjog/gptmed/issues)
 - 💬 Discussions: [GitHub Discussions](https://github.com/sigdelsanjog/gptmed/discussions)
-- 📧 Email: sanjog.sigdel@ku.edu.np
+- 📧 Email: sigdelsanjog@gmail.com | sanjog.sigdel@ku.edu.np

 ## Changelog

@@ -264,4 +393,4 @@ See [CHANGELOG.md](CHANGELOG.md) for version history.

 ---

-Made with ❤️ for learning purpose
+#### Made with ❤️ from Nepal