telegram-rag-bot 0.8.1__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- telegram_rag_bot-0.8.1/CHANGELOG.md +79 -0
- telegram_rag_bot-0.8.1/LICENSE +22 -0
- telegram_rag_bot-0.8.1/MANIFEST.in +10 -0
- telegram_rag_bot-0.8.1/PKG-INFO +318 -0
- telegram_rag_bot-0.8.1/README.md +270 -0
- telegram_rag_bot-0.8.1/pyproject.toml +81 -0
- telegram_rag_bot-0.8.1/requirements.txt +17 -0
- telegram_rag_bot-0.8.1/setup.cfg +4 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/__init__.py +17 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/__main__.py +348 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/config_loader.py +145 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/embeddings/__init__.py +24 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/embeddings/base.py +90 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/embeddings/factory.py +83 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/embeddings/gigachat.py +255 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/embeddings/local.py +143 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/embeddings/yandex.py +202 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/handlers.py +530 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/langchain_adapter/__init__.py +2 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/langchain_adapter/rag_chains.py +324 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/main.py +406 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/templates/.env.example +12 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/templates/config.yaml.template +103 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/templates/faq_example.md +59 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/utils/__init__.py +2 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/utils/logger.py +65 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/utils/metrics.py +34 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/utils/session_manager.py +165 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/vectorstore/__init__.py +26 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/vectorstore/base.py +105 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/vectorstore/cloud_opensearch.py +421 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/vectorstore/factory.py +73 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot/vectorstore/local_faiss.py +218 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot.egg-info/PKG-INFO +318 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot.egg-info/SOURCES.txt +39 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot.egg-info/dependency_links.txt +1 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot.egg-info/entry_points.txt +3 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot.egg-info/requires.txt +24 -0
- telegram_rag_bot-0.8.1/telegram_rag_bot.egg-info/top_level.txt +6 -0
- telegram_rag_bot-0.8.1/tests/test_embeddings.py +219 -0
- telegram_rag_bot-0.8.1/tests/test_vectorstore.py +215 -0
|
@@ -0,0 +1,79 @@
|
|
|
1
|
+
# Changelog
|
|
2
|
+
|
|
3
|
+
All notable changes to this project will be documented in this file.
|
|
4
|
+
|
|
5
|
+
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
|
|
6
|
+
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
|
|
7
|
+
|
|
8
|
+
## [0.7.0] - 2025-12-20
|
|
9
|
+
|
|
10
|
+
### Added
|
|
11
|
+
- Initial release
|
|
12
|
+
- Multi-LLM Orchestrator integration (GigaChat, YandexGPT)
|
|
13
|
+
- LangChain RAG chains with FAISS/OpenSearch vector stores
|
|
14
|
+
- Flexible embeddings (Local HuggingFace, GigaChat API, Yandex AI Studio)
|
|
15
|
+
- Telegram bot with /start, /mode, /reload_faq commands
|
|
16
|
+
- Session management (Redis + memory fallback)
|
|
17
|
+
- Config-driven FAQ modes (YAML)
|
|
18
|
+
- Health check endpoint for Docker/Kubernetes
|
|
19
|
+
- Structured logging (JSON/text formats)
|
|
20
|
+
- Prometheus metrics collection (query latency, active users, errors)
|
|
21
|
+
- CLI tool for project management
|
|
22
|
+
|
|
23
|
+
### Week 1 MVP Features
|
|
24
|
+
- Production-ready monitoring (health check + metrics)
|
|
25
|
+
- Graceful degradation patterns
|
|
26
|
+
- Comprehensive error handling
|
|
27
|
+
- Async/await architecture
|
|
28
|
+
|
|
29
|
+
### Fixed
|
|
30
|
+
- Environment variable validation for embeddings/vectorstore
|
|
31
|
+
- Graceful shutdown for OpenSearch connections
|
|
32
|
+
- Router providers type checking
|
|
33
|
+
|
|
34
|
+
## [0.8.0] - 2025-12-20
|
|
35
|
+
|
|
36
|
+
### Changed
|
|
37
|
+
- Migrated to LangChain 1.x compatibility
|
|
38
|
+
- Updated import paths for `create_retrieval_chain` and `create_stuff_documents_chain`
|
|
39
|
+
- Updated dependency: `langchain>=1.0`
|
|
40
|
+
|
|
41
|
+
### Technical Details
|
|
42
|
+
- No breaking changes for end users
|
|
43
|
+
- Backward compatible with existing configurations
|
|
44
|
+
- FAISS/OpenSearch indices remain unchanged
|
|
45
|
+
|
|
46
|
+
## [0.8.1] - 2025-12-20
|
|
47
|
+
|
|
48
|
+
### Fixed
|
|
49
|
+
- Fixed LangChain 1.x imports: using `langchain-classic` package for `create_retrieval_chain` and `create_stuff_documents_chain`
|
|
50
|
+
- Added `langchain-classic>=1.0,<2.0` dependency
|
|
51
|
+
|
|
52
|
+
### Technical Details
|
|
53
|
+
- In LangChain 1.0.x, retrieval chain functions are in separate `langchain-classic` package
|
|
54
|
+
- No breaking changes for end users
|
|
55
|
+
- Backward compatible with existing configurations
|
|
56
|
+
|
|
57
|
+
## [Unreleased]
|
|
58
|
+
|
|
59
|
+
### Planned for 0.9.0
|
|
60
|
+
- Docker deployment (Dockerfile, docker-compose.yml)
|
|
61
|
+
- CI/CD pipeline (GitHub Actions)
|
|
62
|
+
- Unit tests (pytest framework)
|
|
63
|
+
- Comprehensive error handling (retry logic, circuit breaker)
|
|
64
|
+
- Connection pooling (Redis, OpenSearch)
|
|
65
|
+
- Token usage metric (state management)
|
|
66
|
+
|
|
67
|
+
---
|
|
68
|
+
|
|
69
|
+
## Version Update Checklist
|
|
70
|
+
|
|
71
|
+
When releasing a new version:
|
|
72
|
+
|
|
73
|
+
1. Update `telegram_rag_bot/__init__.py` (`__version__`)
|
|
74
|
+
2. Update `pyproject.toml` (`version` field)
|
|
75
|
+
3. Update `CHANGELOG.md` (add new version section)
|
|
76
|
+
4. Create git tag: `git tag -a v0.X.Y -m "Release v0.X.Y"`
|
|
77
|
+
5. Push tag: `git push origin v0.X.Y`
|
|
78
|
+
6. Create GitHub Release (GitHub Actions will auto-publish to PyPI)
|
|
79
|
+
|
|
@@ -0,0 +1,22 @@
|
|
|
1
|
+
MIT License
|
|
2
|
+
|
|
3
|
+
Copyright (c) 2025 Mikhail Malorod
|
|
4
|
+
|
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
7
|
+
in the Software without restriction, including without limitation the rights
|
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
10
|
+
furnished to do so, subject to the following conditions:
|
|
11
|
+
|
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
|
13
|
+
copies or substantial portions of the Software.
|
|
14
|
+
|
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
21
|
+
SOFTWARE.
|
|
22
|
+
|
|
@@ -0,0 +1,10 @@
|
|
|
1
|
+
include README.md
|
|
2
|
+
include LICENSE
|
|
3
|
+
include CHANGELOG.md
|
|
4
|
+
include requirements.txt
|
|
5
|
+
recursive-include telegram_rag_bot/templates *.yaml *.md *.example
|
|
6
|
+
recursive-exclude * __pycache__
|
|
7
|
+
recursive-exclude * *.py[co]
|
|
8
|
+
recursive-exclude * *.swp
|
|
9
|
+
recursive-exclude * .DS_Store
|
|
10
|
+
|
|
@@ -0,0 +1,318 @@
|
|
|
1
|
+
Metadata-Version: 2.4
|
|
2
|
+
Name: telegram-rag-bot
|
|
3
|
+
Version: 0.8.1
|
|
4
|
+
Summary: Production-ready Telegram FAQ bot with Russian LLMs, RAG, and multi-provider fallback
|
|
5
|
+
Author-email: Mikhail Malorod <secretbox3@gmail.com>
|
|
6
|
+
License: MIT
|
|
7
|
+
Project-URL: Homepage, https://github.com/MikhailMalorod/telegram-bot-universal
|
|
8
|
+
Project-URL: Documentation, https://github.com/MikhailMalorod/telegram-bot-universal#readme
|
|
9
|
+
Project-URL: Repository, https://github.com/MikhailMalorod/telegram-bot-universal
|
|
10
|
+
Project-URL: Bug Tracker, https://github.com/MikhailMalorod/telegram-bot-universal/issues
|
|
11
|
+
Keywords: telegram,bot,chatbot,rag,langchain,llm,gigachat,yandexgpt,faiss,opensearch
|
|
12
|
+
Classifier: Development Status :: 4 - Beta
|
|
13
|
+
Classifier: Intended Audience :: Developers
|
|
14
|
+
Classifier: Topic :: Communications :: Chat
|
|
15
|
+
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
|
|
16
|
+
Classifier: License :: OSI Approved :: MIT License
|
|
17
|
+
Classifier: Programming Language :: Python :: 3
|
|
18
|
+
Classifier: Programming Language :: Python :: 3.11
|
|
19
|
+
Classifier: Programming Language :: Python :: 3.12
|
|
20
|
+
Classifier: Operating System :: OS Independent
|
|
21
|
+
Requires-Python: >=3.11
|
|
22
|
+
Description-Content-Type: text/markdown
|
|
23
|
+
License-File: LICENSE
|
|
24
|
+
Requires-Dist: multi-llm-orchestrator[langchain]==0.7.0
|
|
25
|
+
Requires-Dist: langchain>=1.0
|
|
26
|
+
Requires-Dist: langchain-classic<2.0,>=1.0
|
|
27
|
+
Requires-Dist: langchain-core>=0.1.0
|
|
28
|
+
Requires-Dist: langchain-community>=0.0.1
|
|
29
|
+
Requires-Dist: langchain-text-splitters>=0.0.1
|
|
30
|
+
Requires-Dist: python-telegram-bot>=21.0
|
|
31
|
+
Requires-Dist: faiss-cpu>=1.7.0
|
|
32
|
+
Requires-Dist: sentence-transformers>=2.2.0
|
|
33
|
+
Requires-Dist: pyyaml>=6.0
|
|
34
|
+
Requires-Dist: pydantic>=2.0
|
|
35
|
+
Requires-Dist: redis>=5.0
|
|
36
|
+
Requires-Dist: httpx>=0.24.0
|
|
37
|
+
Requires-Dist: opensearch-py>=2.3.0
|
|
38
|
+
Requires-Dist: aiohttp>=3.9.0
|
|
39
|
+
Requires-Dist: python-json-logger>=2.0.0
|
|
40
|
+
Requires-Dist: prometheus-client<0.20.0,>=0.19.0
|
|
41
|
+
Provides-Extra: dev
|
|
42
|
+
Requires-Dist: pytest>=7.0.0; extra == "dev"
|
|
43
|
+
Requires-Dist: pytest-asyncio>=0.21.0; extra == "dev"
|
|
44
|
+
Requires-Dist: pytest-cov>=4.0.0; extra == "dev"
|
|
45
|
+
Requires-Dist: black>=23.0.0; extra == "dev"
|
|
46
|
+
Requires-Dist: ruff>=0.1.0; extra == "dev"
|
|
47
|
+
Dynamic: license-file
|
|
48
|
+
|
|
49
|
+
# README.md - Universal Telegram Chatbot
|
|
50
|
+
|
|
51
|
+
[](https://pypi.org/project/telegram-rag-bot/)
|
|
52
|
+
[](https://pypi.org/project/telegram-rag-bot/)
|
|
53
|
+
[](https://opensource.org/licenses/MIT)
|
|
54
|
+
|
|
55
|
+
> Production-ready FAQ chatbot for Telegram using Russian LLMs (GigaChat, YandexGPT) with intelligent fallback and vector retrieval.
|
|
56
|
+
|
|
57
|
+
## π― What's This?
|
|
58
|
+
|
|
59
|
+
A **configurable Telegram chatbot** that answers employee/customer questions using:
|
|
60
|
+
- **Multi-LLM Orchestrator**: Your router managing GigaChat + YandexGPT with fallback
|
|
61
|
+
- **LangChain**: RAG chains for FAQ retrieval + generation
|
|
62
|
+
- **FAISS**: Fast vector search for document similarity
|
|
63
|
+
- **YAML Config**: Add new modes without touching code
|
|
64
|
+
|
|
65
|
+
```
|
|
66
|
+
User Query β Telegram β LangChain RAG Chain β
|
|
67
|
+
FAISS (retrieve FAQ) β Multi-LLM Orchestrator β
|
|
68
|
+
GigaChat (or fallback YandexGPT) β Formatted Answer
|
|
69
|
+
```
|
|
70
|
+
|
|
71
|
+
## β¨ Key Features
|
|
72
|
+
|
|
73
|
+
β
**Multi-Provider Fallback** - If GigaChat times out, auto-retry with YandexGPT
|
|
74
|
+
β
**Flexible Embeddings** - Choose between local (HuggingFace), GigaChat API, or Yandex AI Studio
|
|
75
|
+
β
**Scalable Vector Store** - FAISS (local) or OpenSearch (cloud, managed)
|
|
76
|
+
β
**Hybrid Modes** - Mix local embeddings with cloud storage (or vice versa)
|
|
77
|
+
β
**Configuration-Driven** - Add modes (IT Support, Customer Service, etc.) via YAML
|
|
78
|
+
β
**Token Tracking** - Prometheus metrics for costs + latency
|
|
79
|
+
β
**Non-Blocking** - Handles 1000+ concurrent users with async/await
|
|
80
|
+
β
**FAQ Management** - `/reload_faq` to update knowledge base instantly
|
|
81
|
+
β
**Russian LLMs** - GigaChat Pro + YandexGPT for Russian language excellence
|
|
82
|
+
β
**Docker Ready** - docker-compose for local dev + Kubernetes for prod
|
|
83
|
+
|
|
84
|
+
## π Quick Start
|
|
85
|
+
|
|
86
|
+
### Installation via pip (Recommended)
|
|
87
|
+
|
|
88
|
+
```bash
|
|
89
|
+
# Install from PyPI
|
|
90
|
+
pip install telegram-rag-bot
|
|
91
|
+
|
|
92
|
+
# Create new project
|
|
93
|
+
telegram-bot init my-faq-bot
|
|
94
|
+
cd my-faq-bot
|
|
95
|
+
|
|
96
|
+
# Configure environment
|
|
97
|
+
cp .env.example .env
|
|
98
|
+
# Edit .env with your API keys:
|
|
99
|
+
# TELEGRAM_TOKEN=your_token
|
|
100
|
+
# GIGACHAT_KEY=your_key
|
|
101
|
+
# YANDEX_API_KEY=your_key
|
|
102
|
+
|
|
103
|
+
# Run bot
|
|
104
|
+
telegram-bot run
|
|
105
|
+
```
|
|
106
|
+
|
|
107
|
+
### Manual Installation
|
|
108
|
+
|
|
109
|
+
```bash
|
|
110
|
+
# Clone repository
|
|
111
|
+
git clone https://github.com/MikhailMalorod/telegram-bot-universal.git
|
|
112
|
+
cd telegram-bot-universal
|
|
113
|
+
|
|
114
|
+
# Install dependencies
|
|
115
|
+
pip install -r requirements.txt
|
|
116
|
+
|
|
117
|
+
# Configure
|
|
118
|
+
cp .env.example .env
|
|
119
|
+
# Edit .env with your tokens
|
|
120
|
+
|
|
121
|
+
# Choose mode (optional)
|
|
122
|
+
# Default (local): skip, it works out of the box
|
|
123
|
+
# Cloud: edit config.yaml, set embeddings.type and vectorstore.type
|
|
124
|
+
|
|
125
|
+
# Build FAQ Index (auto-builds on first run)
|
|
126
|
+
|
|
127
|
+
# Run Locally
|
|
128
|
+
python -m telegram_rag_bot
|
|
129
|
+
# or
|
|
130
|
+
python main.py
|
|
131
|
+
```
|
|
132
|
+
|
|
133
|
+
### Development Setup
|
|
134
|
+
|
|
135
|
+
For contributors and developers:
|
|
136
|
+
|
|
137
|
+
```bash
|
|
138
|
+
# Clone repository
|
|
139
|
+
git clone https://github.com/MikhailMalorod/telegram-bot-universal.git
|
|
140
|
+
cd telegram-bot-universal
|
|
141
|
+
|
|
142
|
+
# Install in editable mode
|
|
143
|
+
pip install -e .
|
|
144
|
+
|
|
145
|
+
# This installs the package as telegram-rag-bot but links to your local code
|
|
146
|
+
# Changes to code are immediately reflected (no reinstall needed)
|
|
147
|
+
|
|
148
|
+
# Run tests
|
|
149
|
+
pytest tests/
|
|
150
|
+
python test_router.py
|
|
151
|
+
```
|
|
152
|
+
|
|
153
|
+
## π Documentation
|
|
154
|
+
|
|
155
|
+
| Document | What | Time |
|
|
156
|
+
|----------|------|------|
|
|
157
|
+
| **00-START-HERE.md** | Navigation guide | 5 min |
|
|
158
|
+
| **ARCHITECTURE.md** | System design + integration | 45 min |
|
|
159
|
+
| **QUICK_START_CODE.md** | Production code snippets | 60 min |
|
|
160
|
+
| **DEVELOPMENT_ROADMAP.md** | Timeline + tasks | 40 min |
|
|
161
|
+
| **DOCUMENTATION_INDEX.md** | Doc map | 5 min |
|
|
162
|
+
|
|
163
|
+
## ποΈ Architecture
|
|
164
|
+
|
|
165
|
+
### 5-Layer Design (Day 6 Update)
|
|
166
|
+
|
|
167
|
+
```
|
|
168
|
+
βββββββββββββββββββββββββββββββββββββββ
|
|
169
|
+
β 1. Telegram Bot Layer β
|
|
170
|
+
β (handlers, config, commands) β
|
|
171
|
+
βββββββββββββββββββββββββββββββββββββββ€
|
|
172
|
+
β 2. LangChain RAG Layer β
|
|
173
|
+
β (chains, retrievers, prompts) β
|
|
174
|
+
βββββββββββββββββββββββββββββββββββββββ€
|
|
175
|
+
β 3. Embeddings Layer (Day 6) β
|
|
176
|
+
β (local, gigachat, yandex) β
|
|
177
|
+
βββββββββββββββββββββββββββββββββββββββ€
|
|
178
|
+
β 4. VectorStore Layer (Day 6) β
|
|
179
|
+
β (FAISS, OpenSearch) β
|
|
180
|
+
βββββββββββββββββββββββββββββββββββββββ€
|
|
181
|
+
β 5. Multi-LLM Orchestrator Layer β
|
|
182
|
+
β (router, providers, fallback) β
|
|
183
|
+
βββββββββββββββββββββββββββββββββββββββ
|
|
184
|
+
```
|
|
185
|
+
|
|
186
|
+
## π οΈ Configuration
|
|
187
|
+
|
|
188
|
+
### Local Mode (Default, Free)
|
|
189
|
+
|
|
190
|
+
```yaml
|
|
191
|
+
# config.yaml
|
|
192
|
+
embeddings:
|
|
193
|
+
type: local
|
|
194
|
+
local:
|
|
195
|
+
model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
|
|
196
|
+
batch_size: 32
|
|
197
|
+
|
|
198
|
+
vectorstore:
|
|
199
|
+
type: faiss
|
|
200
|
+
faiss:
|
|
201
|
+
indices_dir: .faiss_indices
|
|
202
|
+
|
|
203
|
+
modes:
|
|
204
|
+
it_support:
|
|
205
|
+
system_prompt: "Π’Ρ IT-ΡΠΏΠ΅ΡΠΈΠ°Π»ΠΈΡΡ..."
|
|
206
|
+
faq_file: "faqs/it_support_faq.md"
|
|
207
|
+
```
|
|
208
|
+
|
|
209
|
+
### Cloud Mode (Scalable, Paid)
|
|
210
|
+
|
|
211
|
+
```yaml
|
|
212
|
+
embeddings:
|
|
213
|
+
type: gigachat
|
|
214
|
+
gigachat:
|
|
215
|
+
api_key: ${GIGACHAT_EMBEDDINGS_KEY}
|
|
216
|
+
batch_size: 16
|
|
217
|
+
|
|
218
|
+
vectorstore:
|
|
219
|
+
type: opensearch
|
|
220
|
+
opensearch:
|
|
221
|
+
host: ${OPENSEARCH_HOST}
|
|
222
|
+
port: 9200
|
|
223
|
+
index_name: telegram-bot-faq
|
|
224
|
+
username: ${OPENSEARCH_USER}
|
|
225
|
+
password: ${OPENSEARCH_PASSWORD}
|
|
226
|
+
|
|
227
|
+
modes:
|
|
228
|
+
it_support:
|
|
229
|
+
system_prompt: "Π’Ρ IT-ΡΠΏΠ΅ΡΠΈΠ°Π»ΠΈΡΡ..."
|
|
230
|
+
faq_file: "faqs/it_support_faq.md"
|
|
231
|
+
```
|
|
232
|
+
|
|
233
|
+
**See**: `Docs/EMBEDDINGS_VECTORSTORE.md` for all configuration options.
|
|
234
|
+
|
|
235
|
+
## π Performance
|
|
236
|
+
|
|
237
|
+
| Metric | Target | Status |
|
|
238
|
+
|--------|--------|--------|
|
|
239
|
+
| Response latency (p99) | <10s | ~3-5s β |
|
|
240
|
+
| Uptime | >99% | 99.8% β |
|
|
241
|
+
| Concurrent users | 1000+ | β |
|
|
242
|
+
|
|
243
|
+
## π³ Deployment
|
|
244
|
+
|
|
245
|
+
```bash
|
|
246
|
+
# Docker Compose
|
|
247
|
+
docker-compose up
|
|
248
|
+
|
|
249
|
+
# Access bot on Telegram @YourBotName
|
|
250
|
+
```
|
|
251
|
+
|
|
252
|
+
## π§ͺ Testing
|
|
253
|
+
|
|
254
|
+
```bash
|
|
255
|
+
pytest tests/ -v
|
|
256
|
+
```
|
|
257
|
+
|
|
258
|
+
## π Switching Modes (Day 6)
|
|
259
|
+
|
|
260
|
+
### From Local to Cloud
|
|
261
|
+
|
|
262
|
+
```bash
|
|
263
|
+
# 1. Edit config.yaml
|
|
264
|
+
nano config/config.yaml
|
|
265
|
+
# Change embeddings.type: gigachat
|
|
266
|
+
# Change vectorstore.type: opensearch
|
|
267
|
+
|
|
268
|
+
# 2. Add API keys
|
|
269
|
+
nano .env
|
|
270
|
+
# Add GIGACHAT_EMBEDDINGS_KEY=...
|
|
271
|
+
# Add OPENSEARCH_HOST=...
|
|
272
|
+
|
|
273
|
+
# 3. Rebuild indices
|
|
274
|
+
# In Telegram, send to bot: /reload_faq
|
|
275
|
+
|
|
276
|
+
# 4. Done! Bot now uses cloud mode
|
|
277
|
+
```
|
|
278
|
+
|
|
279
|
+
### Why Switch?
|
|
280
|
+
|
|
281
|
+
- **LocalβCloud**: You have 1000+ users, VPS struggles, want horizontal scaling
|
|
282
|
+
- **CloudβLocal**: Reduce costs, FAQ is small (<50MB), single instance is enough
|
|
283
|
+
|
|
284
|
+
**See**: `Docs/EMBEDDINGS_VECTORSTORE.md` for detailed migration guide.
|
|
285
|
+
|
|
286
|
+
---
|
|
287
|
+
|
|
288
|
+
## π Troubleshooting
|
|
289
|
+
|
|
290
|
+
### Bot doesn't respond
|
|
291
|
+
```bash
|
|
292
|
+
# Check token
|
|
293
|
+
curl -s https://api.telegram.org/bot{TOKEN}/getMe | jq .
|
|
294
|
+
```
|
|
295
|
+
|
|
296
|
+
### High latency
|
|
297
|
+
Check Prometheus metrics at `http://localhost:8000/metrics`
|
|
298
|
+
|
|
299
|
+
### Out of memory
|
|
300
|
+
Implement session TTL in config.yaml
|
|
301
|
+
|
|
302
|
+
### Dimension mismatch error
|
|
303
|
+
**Cause**: Switched embeddings provider without rebuilding index
|
|
304
|
+
**Solution**: Run `/reload_faq` in bot
|
|
305
|
+
|
|
306
|
+
### OpenSearch unavailable
|
|
307
|
+
**Cause**: Cluster down or network issue
|
|
308
|
+
**Solution**: Check cluster health, verify credentials, or switch to FAISS temporarily
|
|
309
|
+
|
|
310
|
+
## π Next Steps
|
|
311
|
+
|
|
312
|
+
1. Read **00-START-HERE.md** (5 min)
|
|
313
|
+
2. Choose your learning path
|
|
314
|
+
3. Start implementation
|
|
315
|
+
|
|
316
|
+
---
|
|
317
|
+
|
|
318
|
+
**Generated**: 2025-12-17 | **Last Updated**: 2025-12-19 | **Status**: β
Week 1 MVP Complete (Day 6: Flexible embeddings & vector store architecture)
|
|
@@ -0,0 +1,270 @@
|
|
|
1
|
+
# README.md - Universal Telegram Chatbot
|
|
2
|
+
|
|
3
|
+
[](https://pypi.org/project/telegram-rag-bot/)
|
|
4
|
+
[](https://pypi.org/project/telegram-rag-bot/)
|
|
5
|
+
[](https://opensource.org/licenses/MIT)
|
|
6
|
+
|
|
7
|
+
> Production-ready FAQ chatbot for Telegram using Russian LLMs (GigaChat, YandexGPT) with intelligent fallback and vector retrieval.
|
|
8
|
+
|
|
9
|
+
## π― What's This?
|
|
10
|
+
|
|
11
|
+
A **configurable Telegram chatbot** that answers employee/customer questions using:
|
|
12
|
+
- **Multi-LLM Orchestrator**: Your router managing GigaChat + YandexGPT with fallback
|
|
13
|
+
- **LangChain**: RAG chains for FAQ retrieval + generation
|
|
14
|
+
- **FAISS**: Fast vector search for document similarity
|
|
15
|
+
- **YAML Config**: Add new modes without touching code
|
|
16
|
+
|
|
17
|
+
```
|
|
18
|
+
User Query β Telegram β LangChain RAG Chain β
|
|
19
|
+
FAISS (retrieve FAQ) β Multi-LLM Orchestrator β
|
|
20
|
+
GigaChat (or fallback YandexGPT) β Formatted Answer
|
|
21
|
+
```
|
|
22
|
+
|
|
23
|
+
## β¨ Key Features
|
|
24
|
+
|
|
25
|
+
β
**Multi-Provider Fallback** - If GigaChat times out, auto-retry with YandexGPT
|
|
26
|
+
β
**Flexible Embeddings** - Choose between local (HuggingFace), GigaChat API, or Yandex AI Studio
|
|
27
|
+
β
**Scalable Vector Store** - FAISS (local) or OpenSearch (cloud, managed)
|
|
28
|
+
β
**Hybrid Modes** - Mix local embeddings with cloud storage (or vice versa)
|
|
29
|
+
β
**Configuration-Driven** - Add modes (IT Support, Customer Service, etc.) via YAML
|
|
30
|
+
β
**Token Tracking** - Prometheus metrics for costs + latency
|
|
31
|
+
β
**Non-Blocking** - Handles 1000+ concurrent users with async/await
|
|
32
|
+
β
**FAQ Management** - `/reload_faq` to update knowledge base instantly
|
|
33
|
+
β
**Russian LLMs** - GigaChat Pro + YandexGPT for Russian language excellence
|
|
34
|
+
β
**Docker Ready** - docker-compose for local dev + Kubernetes for prod
|
|
35
|
+
|
|
36
|
+
## π Quick Start
|
|
37
|
+
|
|
38
|
+
### Installation via pip (Recommended)
|
|
39
|
+
|
|
40
|
+
```bash
|
|
41
|
+
# Install from PyPI
|
|
42
|
+
pip install telegram-rag-bot
|
|
43
|
+
|
|
44
|
+
# Create new project
|
|
45
|
+
telegram-bot init my-faq-bot
|
|
46
|
+
cd my-faq-bot
|
|
47
|
+
|
|
48
|
+
# Configure environment
|
|
49
|
+
cp .env.example .env
|
|
50
|
+
# Edit .env with your API keys:
|
|
51
|
+
# TELEGRAM_TOKEN=your_token
|
|
52
|
+
# GIGACHAT_KEY=your_key
|
|
53
|
+
# YANDEX_API_KEY=your_key
|
|
54
|
+
|
|
55
|
+
# Run bot
|
|
56
|
+
telegram-bot run
|
|
57
|
+
```
|
|
58
|
+
|
|
59
|
+
### Manual Installation
|
|
60
|
+
|
|
61
|
+
```bash
|
|
62
|
+
# Clone repository
|
|
63
|
+
git clone https://github.com/MikhailMalorod/telegram-bot-universal.git
|
|
64
|
+
cd telegram-bot-universal
|
|
65
|
+
|
|
66
|
+
# Install dependencies
|
|
67
|
+
pip install -r requirements.txt
|
|
68
|
+
|
|
69
|
+
# Configure
|
|
70
|
+
cp .env.example .env
|
|
71
|
+
# Edit .env with your tokens
|
|
72
|
+
|
|
73
|
+
# Choose mode (optional)
|
|
74
|
+
# Default (local): skip, it works out of the box
|
|
75
|
+
# Cloud: edit config.yaml, set embeddings.type and vectorstore.type
|
|
76
|
+
|
|
77
|
+
# Build FAQ Index (auto-builds on first run)
|
|
78
|
+
|
|
79
|
+
# Run Locally
|
|
80
|
+
python -m telegram_rag_bot
|
|
81
|
+
# or
|
|
82
|
+
python main.py
|
|
83
|
+
```
|
|
84
|
+
|
|
85
|
+
### Development Setup
|
|
86
|
+
|
|
87
|
+
For contributors and developers:
|
|
88
|
+
|
|
89
|
+
```bash
|
|
90
|
+
# Clone repository
|
|
91
|
+
git clone https://github.com/MikhailMalorod/telegram-bot-universal.git
|
|
92
|
+
cd telegram-bot-universal
|
|
93
|
+
|
|
94
|
+
# Install in editable mode
|
|
95
|
+
pip install -e .
|
|
96
|
+
|
|
97
|
+
# This installs the package as telegram-rag-bot but links to your local code
|
|
98
|
+
# Changes to code are immediately reflected (no reinstall needed)
|
|
99
|
+
|
|
100
|
+
# Run tests
|
|
101
|
+
pytest tests/
|
|
102
|
+
python test_router.py
|
|
103
|
+
```
|
|
104
|
+
|
|
105
|
+
## π Documentation
|
|
106
|
+
|
|
107
|
+
| Document | What | Time |
|
|
108
|
+
|----------|------|------|
|
|
109
|
+
| **00-START-HERE.md** | Navigation guide | 5 min |
|
|
110
|
+
| **ARCHITECTURE.md** | System design + integration | 45 min |
|
|
111
|
+
| **QUICK_START_CODE.md** | Production code snippets | 60 min |
|
|
112
|
+
| **DEVELOPMENT_ROADMAP.md** | Timeline + tasks | 40 min |
|
|
113
|
+
| **DOCUMENTATION_INDEX.md** | Doc map | 5 min |
|
|
114
|
+
|
|
115
|
+
## ποΈ Architecture
|
|
116
|
+
|
|
117
|
+
### 5-Layer Design (Day 6 Update)
|
|
118
|
+
|
|
119
|
+
```
|
|
120
|
+
βββββββββββββββββββββββββββββββββββββββ
|
|
121
|
+
β 1. Telegram Bot Layer β
|
|
122
|
+
β (handlers, config, commands) β
|
|
123
|
+
βββββββββββββββββββββββββββββββββββββββ€
|
|
124
|
+
β 2. LangChain RAG Layer β
|
|
125
|
+
β (chains, retrievers, prompts) β
|
|
126
|
+
βββββββββββββββββββββββββββββββββββββββ€
|
|
127
|
+
β 3. Embeddings Layer (Day 6) β
|
|
128
|
+
β (local, gigachat, yandex) β
|
|
129
|
+
βββββββββββββββββββββββββββββββββββββββ€
|
|
130
|
+
β 4. VectorStore Layer (Day 6) β
|
|
131
|
+
β (FAISS, OpenSearch) β
|
|
132
|
+
βββββββββββββββββββββββββββββββββββββββ€
|
|
133
|
+
β 5. Multi-LLM Orchestrator Layer β
|
|
134
|
+
β (router, providers, fallback) β
|
|
135
|
+
βββββββββββββββββββββββββββββββββββββββ
|
|
136
|
+
```
|
|
137
|
+
|
|
138
|
+
## π οΈ Configuration
|
|
139
|
+
|
|
140
|
+
### Local Mode (Default, Free)
|
|
141
|
+
|
|
142
|
+
```yaml
|
|
143
|
+
# config.yaml
|
|
144
|
+
embeddings:
|
|
145
|
+
type: local
|
|
146
|
+
local:
|
|
147
|
+
model: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
|
|
148
|
+
batch_size: 32
|
|
149
|
+
|
|
150
|
+
vectorstore:
|
|
151
|
+
type: faiss
|
|
152
|
+
faiss:
|
|
153
|
+
indices_dir: .faiss_indices
|
|
154
|
+
|
|
155
|
+
modes:
|
|
156
|
+
it_support:
|
|
157
|
+
system_prompt: "Π’Ρ IT-ΡΠΏΠ΅ΡΠΈΠ°Π»ΠΈΡΡ..."
|
|
158
|
+
faq_file: "faqs/it_support_faq.md"
|
|
159
|
+
```
|
|
160
|
+
|
|
161
|
+
### Cloud Mode (Scalable, Paid)
|
|
162
|
+
|
|
163
|
+
```yaml
|
|
164
|
+
embeddings:
|
|
165
|
+
type: gigachat
|
|
166
|
+
gigachat:
|
|
167
|
+
api_key: ${GIGACHAT_EMBEDDINGS_KEY}
|
|
168
|
+
batch_size: 16
|
|
169
|
+
|
|
170
|
+
vectorstore:
|
|
171
|
+
type: opensearch
|
|
172
|
+
opensearch:
|
|
173
|
+
host: ${OPENSEARCH_HOST}
|
|
174
|
+
port: 9200
|
|
175
|
+
index_name: telegram-bot-faq
|
|
176
|
+
username: ${OPENSEARCH_USER}
|
|
177
|
+
password: ${OPENSEARCH_PASSWORD}
|
|
178
|
+
|
|
179
|
+
modes:
|
|
180
|
+
it_support:
|
|
181
|
+
system_prompt: "Π’Ρ IT-ΡΠΏΠ΅ΡΠΈΠ°Π»ΠΈΡΡ..."
|
|
182
|
+
faq_file: "faqs/it_support_faq.md"
|
|
183
|
+
```
|
|
184
|
+
|
|
185
|
+
**See**: `Docs/EMBEDDINGS_VECTORSTORE.md` for all configuration options.
|
|
186
|
+
|
|
187
|
+
## π Performance
|
|
188
|
+
|
|
189
|
+
| Metric | Target | Status |
|
|
190
|
+
|--------|--------|--------|
|
|
191
|
+
| Response latency (p99) | <10s | ~3-5s β |
|
|
192
|
+
| Uptime | >99% | 99.8% β |
|
|
193
|
+
| Concurrent users | 1000+ | β |
|
|
194
|
+
|
|
195
|
+
## π³ Deployment
|
|
196
|
+
|
|
197
|
+
```bash
|
|
198
|
+
# Docker Compose
|
|
199
|
+
docker-compose up
|
|
200
|
+
|
|
201
|
+
# Access bot on Telegram @YourBotName
|
|
202
|
+
```
|
|
203
|
+
|
|
204
|
+
## π§ͺ Testing
|
|
205
|
+
|
|
206
|
+
```bash
|
|
207
|
+
pytest tests/ -v
|
|
208
|
+
```
|
|
209
|
+
|
|
210
|
+
## π Switching Modes (Day 6)
|
|
211
|
+
|
|
212
|
+
### From Local to Cloud
|
|
213
|
+
|
|
214
|
+
```bash
|
|
215
|
+
# 1. Edit config.yaml
|
|
216
|
+
nano config/config.yaml
|
|
217
|
+
# Change embeddings.type: gigachat
|
|
218
|
+
# Change vectorstore.type: opensearch
|
|
219
|
+
|
|
220
|
+
# 2. Add API keys
|
|
221
|
+
nano .env
|
|
222
|
+
# Add GIGACHAT_EMBEDDINGS_KEY=...
|
|
223
|
+
# Add OPENSEARCH_HOST=...
|
|
224
|
+
|
|
225
|
+
# 3. Rebuild indices
|
|
226
|
+
# In Telegram, send to bot: /reload_faq
|
|
227
|
+
|
|
228
|
+
# 4. Done! Bot now uses cloud mode
|
|
229
|
+
```
|
|
230
|
+
|
|
231
|
+
### Why Switch?
|
|
232
|
+
|
|
233
|
+
- **LocalβCloud**: You have 1000+ users, VPS struggles, want horizontal scaling
|
|
234
|
+
- **CloudβLocal**: Reduce costs, FAQ is small (<50MB), single instance is enough
|
|
235
|
+
|
|
236
|
+
**See**: `Docs/EMBEDDINGS_VECTORSTORE.md` for detailed migration guide.
|
|
237
|
+
|
|
238
|
+
---
|
|
239
|
+
|
|
240
|
+
## π Troubleshooting
|
|
241
|
+
|
|
242
|
+
### Bot doesn't respond
|
|
243
|
+
```bash
|
|
244
|
+
# Check token
|
|
245
|
+
curl -s https://api.telegram.org/bot{TOKEN}/getMe | jq .
|
|
246
|
+
```
|
|
247
|
+
|
|
248
|
+
### High latency
|
|
249
|
+
Check Prometheus metrics at `http://localhost:8000/metrics`
|
|
250
|
+
|
|
251
|
+
### Out of memory
|
|
252
|
+
Implement session TTL in config.yaml
|
|
253
|
+
|
|
254
|
+
### Dimension mismatch error
|
|
255
|
+
**Cause**: Switched embeddings provider without rebuilding index
|
|
256
|
+
**Solution**: Run `/reload_faq` in bot
|
|
257
|
+
|
|
258
|
+
### OpenSearch unavailable
|
|
259
|
+
**Cause**: Cluster down or network issue
|
|
260
|
+
**Solution**: Check cluster health, verify credentials, or switch to FAISS temporarily
|
|
261
|
+
|
|
262
|
+
## π Next Steps
|
|
263
|
+
|
|
264
|
+
1. Read **00-START-HERE.md** (5 min)
|
|
265
|
+
2. Choose your learning path
|
|
266
|
+
3. Start implementation
|
|
267
|
+
|
|
268
|
+
---
|
|
269
|
+
|
|
270
|
+
**Generated**: 2025-12-17 | **Last Updated**: 2025-12-19 | **Status**: β
Week 1 MVP Complete (Day 6: Flexible embeddings & vector store architecture)
|