django-cfg 1.1.61__py3-none-any.whl → 1.1.63__py3-none-any.whl
- django_cfg/__init__.py +1 -1
- django_cfg/management/commands/rundramatiq.py +174 -202
- django_cfg/modules/django_tasks.py +54 -428
- django_cfg/modules/dramatiq_setup.py +16 -0
- {django_cfg-1.1.61.dist-info → django_cfg-1.1.63.dist-info}/METADATA +145 -4
- {django_cfg-1.1.61.dist-info → django_cfg-1.1.63.dist-info}/RECORD +9 -27
- django_cfg/apps/accounts/tests/__init__.py +0 -1
- django_cfg/apps/accounts/tests/test_models.py +0 -412
- django_cfg/apps/accounts/tests/test_otp_views.py +0 -143
- django_cfg/apps/accounts/tests/test_serializers.py +0 -331
- django_cfg/apps/accounts/tests/test_services.py +0 -401
- django_cfg/apps/accounts/tests/test_signals.py +0 -110
- django_cfg/apps/accounts/tests/test_views.py +0 -255
- django_cfg/apps/newsletter/tests/__init__.py +0 -1
- django_cfg/apps/newsletter/tests/run_tests.py +0 -47
- django_cfg/apps/newsletter/tests/test_email_integration.py +0 -256
- django_cfg/apps/newsletter/tests/test_email_tracking.py +0 -332
- django_cfg/apps/newsletter/tests/test_newsletter_manager.py +0 -83
- django_cfg/apps/newsletter/tests/test_newsletter_models.py +0 -157
- django_cfg/apps/support/tests/__init__.py +0 -0
- django_cfg/apps/support/tests/test_models.py +0 -106
- django_cfg/apps/tasks/@docs/CONFIGURATION.md +0 -663
- django_cfg/apps/tasks/@docs/README.md +0 -195
- django_cfg/apps/tasks/@docs/TASKS_QUEUES.md +0 -423
- django_cfg/apps/tasks/@docs/TROUBLESHOOTING.md +0 -506
- {django_cfg-1.1.61.dist-info → django_cfg-1.1.63.dist-info}/WHEEL +0 -0
- {django_cfg-1.1.61.dist-info → django_cfg-1.1.63.dist-info}/entry_points.txt +0 -0
- {django_cfg-1.1.61.dist-info → django_cfg-1.1.63.dist-info}/licenses/LICENSE +0 -0
@@ -1,506 +0,0 @@
# 🔧 Django-CFG Tasks Troubleshooting Guide

## 🎯 Overview

Comprehensive troubleshooting guide for Django-CFG task system issues. Based on real production problems and their solutions.

**TAGS**: `troubleshooting, debugging, dramatiq, redis, workers`

---

## 🚨 Critical Issues %%PRIORITY:HIGH%%

### 1. Tasks Stuck in Pending Status

**Symptoms**:
- Tasks appear in the queue but are never processed
- Document processing status remains "pending"
- No worker activity in logs

**Diagnostic Commands**:
```bash
# Check if workers are running (the bracket keeps grep from matching itself)
ps aux | grep '[r]undramatiq'
pgrep -f rundramatiq

# Check Redis queue length
redis-cli -n 2 LLEN dramatiq:queue:knowledge

# Check Django task status
python manage.py shell -c "
from django_dramatiq.models import Task
print(f'Pending: {Task.objects.filter(status=\"pending\").count()}')
print(f'Failed: {Task.objects.filter(status=\"failed\").count()}')
"
```

**Solutions**:
1. **Start Workers**: `poetry run python manage.py rundramatiq`
2. **Check Redis Connection**: Verify Redis is running on the correct port and DB
3. **Verify Queue Names**: Ensure the task is sent to the correct queue
4. **Check Logs**: Look for worker startup errors

---

### 2. Redis Database Mismatch %%BREAKING_CHANGE%%

**Problem**: Tasks are sent to the wrong Redis database

**Symptoms**:
- Tasks are enqueued successfully but workers don't see them
- Queue appears empty in worker logs
- Redis shows tasks in a different DB

**Root Cause**:
```python
# BROKEN: appending the DB number to a URL that already ends in one
redis_url = f"{base_url}/{redis_db}"  # "redis://localhost:6379/0/2"
```

**Fix Applied in `generation.py`**:
```python
from urllib.parse import urlparse, urlunparse

def fix_redis_url(redis_url: str, redis_db: int) -> str:
    parsed = urlparse(redis_url)
    return urlunparse((
        parsed.scheme,
        parsed.netloc,
        f"/{redis_db}",  # Replace path with correct DB
        parsed.params,
        parsed.query,
        parsed.fragment
    ))
```
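
As a quick check, the sketch below contrasts the naive concatenation with the path-replacing approach. It restates `fix_redis_url` (using `ParseResult._replace` for brevity) so the block runs on its own; the URL and DB number are the illustrative values from the example above, not values read from any real configuration.

```python
from urllib.parse import urlparse, urlunparse


def fix_redis_url(redis_url: str, redis_db: int) -> str:
    """Return redis_url with its path replaced by /<redis_db>."""
    parsed = urlparse(redis_url)
    return urlunparse(parsed._replace(path=f"/{redis_db}"))


base_url = "redis://localhost:6379/0"  # illustrative value

# Naive concatenation keeps the old DB segment in the path.
broken = f"{base_url}/2"
# Replacing the path leaves exactly one DB segment.
fixed = fix_redis_url(base_url, 2)

print(broken)
print(fixed)
```

Credentials and query parameters in the URL are preserved, because only the path component is swapped.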

**Verification**:
```bash
# Check which DB tasks are in (quote the pattern so the shell does not expand it)
redis-cli -n 0 KEYS 'dramatiq:*'  # Should be empty
redis-cli -n 2 KEYS 'dramatiq:*'  # Should show queues
```

---

### 3. Worker Subprocess Failures %%DEPRECATED%%

**Problem**: Auto-start workers fail with "broker required" error

**Error Message**:
```
dramatiq: error: the following arguments are required: broker
```

**Root Cause**:
- Subprocess couldn't find `rundramatiq` command
- `DJANGO_SETTINGS_MODULE` not inherited
- Poetry environment not available in subprocess

**Solution**: %%BREAKING_CHANGE%% **Removed auto-start functionality**
- Use manual worker management
- Start workers via process manager (systemd, supervisor)
- No more subprocess complexity

---

### 4. Database Routing Issues

**Problem**: `ConnectionDoesNotExist: The connection 'knowledge' doesn't exist`

**Root Cause**: Incorrect `app_label` in database routing

**Fix Applied**:
```python
# BEFORE: Used full app path as key
DATABASE_ROUTING_RULES = {
    "apps.knowbase": ["knowbase"]  # ❌ Wrong
}

# AFTER: Extract app_label correctly
app_label = app_path.split('.')[-1]  # "apps.knowbase" → "knowbase"
DATABASE_ROUTING_RULES = {
    "knowbase": ["knowbase"]  # ✅ Correct
}
```
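
The re-keying step can be sketched as a small helper. `normalize_routing_rules` is a hypothetical name used only for illustration, not part of Django-CFG; it assumes each app uses Django's default label, which is the final component of the dotted app path.

```python
from typing import Dict, List


def normalize_routing_rules(rules: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Re-key routing rules by app label (last component of the dotted path)."""
    normalized: Dict[str, List[str]] = {}
    for app_path, targets in rules.items():
        app_label = app_path.split(".")[-1]  # "apps.knowbase" -> "knowbase"
        if app_label in normalized:
            # Two dotted paths collapsing to one label would silently clash.
            raise ValueError(f"duplicate app label: {app_label}")
        normalized[app_label] = list(targets)
    return normalized


print(normalize_routing_rules({"apps.knowbase": ["knowbase"]}))
```

Keys that are already bare labels pass through unchanged, so the helper is safe to apply to rules that are already correct.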

---

### 5. Task Decoding Errors

**Problem**: "Error decoding message: Extra data: line 1 column 4 (char 3)"

**Symptoms**:
- Tasks end up in Dead Letter Queue (DLQ)
- Worker logs show message decoding failures
- Tasks appear corrupted

**Common Causes**:
1. **Message Format Issues**: Incorrect serialization
2. **Version Mismatch**: Different Dramatiq versions
3. **Encoding Problems**: Unicode/binary issues
4. **Corrupted Messages**: Redis data corruption

**Debugging**:
```bash
# Check DLQ contents
redis-cli -n 2 LRANGE dramatiq:queue:knowledge.DQ 0 -1

# Clear DLQ (development only)
redis-cli -n 2 DEL dramatiq:queue:knowledge.DQ
```
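
To see where an "Extra data" error comes from, a raw payload can be run through Python's `json` module, which raises this error when a complete JSON value is followed by further content. The sketch below assumes messages are JSON-encoded (Dramatiq's default encoder); `decode_error` and the sample payloads are hypothetical and only illustrate the failure mode.

```python
import json
from typing import Optional


def decode_error(raw: bytes) -> Optional[str]:
    """Return None if raw holds exactly one JSON document, else the error text."""
    try:
        json.loads(raw.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        return str(exc)
    return None


# A well-formed payload decodes cleanly.
print(decode_error(b'{"actor_name": "process_document_async"}'))
# A three-character JSON value followed by more data fails at char 3.
print(decode_error(b'123{"actor_name": "process_document_async"}'))
```

The position in the message (`char 3`) marks where the first valid JSON value ended, which hints at what was written to the queue in place of a proper message.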

---

## 🔍 Diagnostic Tools

### Redis Queue Inspection

```bash
# Connect to Dramatiq Redis DB
redis-cli -n 2

# List all Dramatiq keys (run inside the redis-cli session)
KEYS dramatiq:*

# Check queue lengths
LLEN dramatiq:queue:default
LLEN dramatiq:queue:knowledge
LLEN dramatiq:queue:high

# View queue contents (first 5 items)
LRANGE dramatiq:queue:knowledge 0 4

# Check dead letter queues
LLEN dramatiq:queue:knowledge.DQ
LRANGE dramatiq:queue:knowledge.DQ 0 -1
```

### Worker Process Monitoring

```bash
# Find running workers
ps aux | grep rundramatiq
pgrep -f rundramatiq

# Monitor worker resource usage (top expects a comma-separated PID list)
top -p "$(pgrep -d, -f rundramatiq)"

# Check worker logs
tail -f /tmp/dramatiq_worker.log
```

### Django Task Status

```python
# In Django shell
from django_dramatiq.models import Task

# Task counts by status
for status in ['pending', 'running', 'completed', 'failed']:
    count = Task.objects.filter(status=status).count()
    print(f"{status}: {count}")

# Recent failed tasks
failed_tasks = Task.objects.filter(
    status='failed'
).order_by('-created_at')[:10]

for task in failed_tasks:
    print(f"{task.actor_name}: {task.traceback}")
```

---

## 🛠️ Recovery Procedures

### Clear Stuck Tasks (Development)

```bash
# Clear all queues (DESTRUCTIVE)
redis-cli -n 2 FLUSHDB

# Clear specific queue
redis-cli -n 2 DEL dramatiq:queue:knowledge

# Mark pending Django task records as failed
python manage.py shell -c "
from django_dramatiq.models import Task
Task.objects.filter(status='pending').update(status='failed')
"
```

### Restart Workers Safely

```bash
# Graceful shutdown (wait for current tasks)
pkill -TERM -f rundramatiq

# Force shutdown (immediate; only if the graceful shutdown does not finish)
pkill -KILL -f rundramatiq

# Start fresh workers
poetry run python manage.py rundramatiq --processes 2 --threads 4
```
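
The "graceful first, force only if needed" sequence can also be expressed in Python when a script owns the worker process. This is a minimal sketch using only the standard library; `stop_gracefully` is a hypothetical helper, and the sleeping child process merely stands in for a worker.

```python
import subprocess
import sys


def stop_gracefully(proc: subprocess.Popen, timeout: float = 30.0) -> int:
    """Send SIGTERM, wait up to timeout seconds, then fall back to SIGKILL."""
    proc.terminate()  # SIGTERM on POSIX
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX
        return proc.wait()


if __name__ == "__main__":
    # Stand-in for a worker; a real deployment would leave this to
    # systemd or supervisor.
    worker = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    print("exit code:", stop_gracefully(worker, timeout=5.0))
```

The timeout should be at least as long as the longest task that is allowed to finish during shutdown.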

### Retry Failed Tasks

```python
# In Django shell
from django_dramatiq.models import Task
from apps.knowbase.tasks.document_processing import process_document_async

# Retry specific failed tasks
failed_tasks = Task.objects.filter(
    status='failed',
    actor_name='process_document_async'
)

for task in failed_tasks:
    # Re-enqueue the task
    process_document_async.send(task.kwargs['document_id'])
    print(f"Retried task for document: {task.kwargs['document_id']}")
```

---

## 📊 Performance Issues

### High Memory Usage

**Symptoms**:
- Workers consuming excessive RAM
- System becomes unresponsive
- OOM killer terminates workers

**Solutions**:
1. **Reduce Worker Count**: Lower `--processes` parameter
2. **Limit Task Size**: Break large tasks into smaller chunks
3. **Add Memory Limits**: Use systemd or supervisor limits
4. **Monitor Memory**: Add memory usage logging
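
Breaking a large job into smaller chunks (solution 2) might look like the sketch below. `chunked` is a hypothetical helper and the document IDs are made up; in a real task module each batch would be handed to an actor rather than printed.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of items with at most size elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


document_ids = list(range(1, 8))  # illustrative IDs
for batch in chunked(document_ids, 3):
    # Each batch would become one task message instead of one huge task.
    print(batch)
```

Smaller batches keep the peak memory of any single task bounded, at the cost of more messages on the queue.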

### Slow Task Processing

**Symptoms**:
- Tasks take longer than expected
- Queue backlog grows
- Users experience delays

**Debugging**:
```python
# Add timing to tasks
import logging
import time

import dramatiq

@dramatiq.actor
def my_task():
    start_time = time.time()
    try:
        # Task logic here
        pass
    finally:
        duration = time.time() - start_time
        logging.info(f"Task completed in {duration:.2f}s")
```

**Solutions**:
1. **Profile Tasks**: Identify bottlenecks
2. **Add More Workers**: Scale horizontally
3. **Optimize Database Queries**: Use `select_related`, `prefetch_related`
4. **Cache Results**: Avoid redundant processing

---

## 🔧 Configuration Issues

### Middleware Problems

**Problem**: Tasks fail due to middleware configuration

**Common Issues**:
```python
# ❌ Wrong middleware order
middleware = [
    "dramatiq.middleware.Retries",   # Should be later
    "dramatiq.middleware.AgeLimit",  # Should be first
]

# ✅ Correct order
middleware = [
    "dramatiq.middleware.AgeLimit",   # Age limit first
    "dramatiq.middleware.TimeLimit",  # Time limit second
    "dramatiq.middleware.Callbacks",  # Callbacks third
    "dramatiq.middleware.Retries",    # Retries fourth
    "dramatiq.middleware.Shutdown",   # Shutdown last
]
```
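
A configured list can be checked against the recommended order mechanically. The sketch below is a hypothetical helper, not part of Django-CFG or Dramatiq; it takes the order listed above as given and reports pairs that appear in the wrong relative order, ignoring entries it does not know.

```python
from typing import List, Tuple

# Relative order recommended in the section above.
RECOMMENDED_ORDER = [
    "dramatiq.middleware.AgeLimit",
    "dramatiq.middleware.TimeLimit",
    "dramatiq.middleware.Callbacks",
    "dramatiq.middleware.Retries",
    "dramatiq.middleware.Shutdown",
]


def misordered_pairs(configured: List[str]) -> List[Tuple[str, str]]:
    """Return (a, b) pairs where a is configured before b but recommended after it."""
    rank = {name: i for i, name in enumerate(RECOMMENDED_ORDER)}
    known = [name for name in configured if name in rank]
    return [
        (a, b)
        for i, a in enumerate(known)
        for b in known[i + 1:]
        if rank[a] > rank[b]
    ]


wrong = ["dramatiq.middleware.Retries", "dramatiq.middleware.AgeLimit"]
print(misordered_pairs(wrong))
```

An empty result means the configured entries already follow the recommended relative order.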

### Queue Configuration

**Problem**: Tasks sent to non-existent queues

**Debugging**:
```python
# Check configured queues
from django_cfg.modules.django_tasks import get_task_service
service = get_task_service()
print("Configured queues:", service.config.dramatiq.queues)

# Check if queue exists in Redis
import redis
r = redis.Redis(host='localhost', port=6379, db=2)
queues = [key.decode() for key in r.keys('dramatiq:queue:*')]
print("Redis queues:", queues)
```
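
Once both lists are in hand, the comparison itself is a set difference. The sketch below is a hypothetical helper that assumes the `dramatiq:queue:` key prefix and `.DQ` suffix used in the commands above; it reports queue names that hold data in Redis but are absent from the configuration, which is the typical sign of a task sent to a misspelled queue.

```python
from typing import Iterable, List

QUEUE_PREFIX = "dramatiq:queue:"  # key prefix used in the commands above


def unconfigured_queues(configured: Iterable[str], redis_keys: Iterable[str]) -> List[str]:
    """Return queue names found in Redis that are not in the configuration."""
    found = set()
    for key in redis_keys:
        if not key.startswith(QUEUE_PREFIX):
            continue  # not a Dramatiq queue key
        name = key[len(QUEUE_PREFIX):]
        if name.endswith(".DQ"):
            name = name[:-3]  # fold the .DQ key into its base queue
        found.add(name)
    return sorted(found - set(configured))


keys = [
    "dramatiq:queue:knowledge",
    "dramatiq:queue:knowlege",      # misspelled queue name
    "dramatiq:queue:knowledge.DQ",
]
print(unconfigured_queues(["default", "knowledge", "high"], keys))
```

The reverse check is less informative: Redis deletes a list key once the list is empty, so a configured queue without a key is usually just idle.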

---

## 🚨 Emergency Procedures

### System Overload

**Symptoms**:
- Redis memory usage at 100%
- Workers consuming all CPU
- System unresponsive

**Immediate Actions**:
```bash
# 1. Stop all workers immediately
pkill -KILL -f rundramatiq

# 2. Check Redis memory usage
redis-cli INFO memory

# 3. Clear queues if necessary (DESTRUCTIVE)
redis-cli -n 2 FLUSHDB

# 4. Restart with minimal workers
poetry run python manage.py rundramatiq --processes 1 --threads 1
```

### Data Corruption

**Symptoms**:
- Tasks fail with serialization errors
- Redis shows corrupted data
- Unexpected task behavior

**Recovery**:
```bash
# 1. Stop all workers
pkill -f rundramatiq

# 2. Backup Redis data
redis-cli -n 2 BGSAVE

# 3. Clear corrupted queues
redis-cli -n 2 DEL dramatiq:queue:knowledge

# 4. Delete all task records in Django (DESTRUCTIVE)
python manage.py shell -c "
from django_dramatiq.models import Task
Task.objects.all().delete()
"

# 5. Restart system
poetry run python manage.py rundramatiq
```

---

## 📝 Logging & Monitoring

### Enhanced Logging

```python
# Add to Django settings
LOGGING = {
    'version': 1,
    'disable_existing_loggers': False,
    'formatters': {
        # The 'verbose' formatter must be defined for the handler below;
        # this format string is only an example
        'verbose': {
            'format': '{levelname} {asctime} {name} {message}',
            'style': '{',
        },
    },
    'handlers': {
        'dramatiq_file': {
            'level': 'INFO',
            'class': 'logging.FileHandler',
            'filename': '/var/log/dramatiq/tasks.log',
            'formatter': 'verbose',
        },
    },
    'loggers': {
        'dramatiq': {
            'handlers': ['dramatiq_file'],
            'level': 'INFO',
            'propagate': True,
        },
        'django_cfg.modules.django_tasks': {
            'handlers': ['dramatiq_file'],
            'level': 'DEBUG',
            'propagate': True,
        },
    },
}
```

### Health Check Script

```python
#!/usr/bin/env python
"""Health check script for Dramatiq workers."""

import redis
import subprocess
import sys

def check_redis():
    """Return True if the Dramatiq Redis DB answers a PING."""
    try:
        r = redis.Redis(host='localhost', port=6379, db=2)
        r.ping()
        return True
    except redis.RedisError:
        return False

def check_workers():
    """Return True if at least one rundramatiq process is running."""
    try:
        result = subprocess.run(['pgrep', '-f', 'rundramatiq'],
                                capture_output=True)
        return result.returncode == 0
    except OSError:  # pgrep is missing or cannot be executed
        return False

def main():
    redis_ok = check_redis()
    workers_ok = check_workers()

    print(f"Redis: {'OK' if redis_ok else 'FAIL'}")
    print(f"Workers: {'OK' if workers_ok else 'FAIL'}")

    if not (redis_ok and workers_ok):
        sys.exit(1)

if __name__ == '__main__':
    main()
```

---

## 🧠 Prevention Tips

### Development Best Practices

1. **Always Test Locally**: Run workers during development
2. **Monitor Queue Lengths**: Set up alerts for queue buildup
3. **Use Timeouts**: Set reasonable task timeouts
4. **Handle Failures**: Implement proper error handling
5. **Log Everything**: Add comprehensive logging

### Production Checklist

- [ ] Workers managed by process manager
- [ ] Redis persistence enabled
- [ ] Monitoring and alerting configured
- [ ] Log rotation set up
- [ ] Backup procedures in place
- [ ] Health checks implemented
- [ ] Resource limits configured

**DEPENDS_ON**: Redis, Dramatiq, Django-CFG task system
**USED_BY**: Development teams, DevOps, Production support