@gravito/zenith 0.1.0-beta.1
- package/ARCHITECTURE.md +88 -0
- package/BATCH_OPERATIONS_IMPLEMENTATION.md +159 -0
- package/DEMO.md +156 -0
- package/DEPLOYMENT.md +157 -0
- package/DOCS_INTERNAL.md +73 -0
- package/Dockerfile +46 -0
- package/Dockerfile.demo-worker +29 -0
- package/EVOLUTION_BLUEPRINT.md +112 -0
- package/JOBINSPECTOR_SCROLL_FIX.md +152 -0
- package/PULSE_IMPLEMENTATION_PLAN.md +111 -0
- package/QUICK_TEST_GUIDE.md +72 -0
- package/README.md +33 -0
- package/ROADMAP.md +85 -0
- package/TESTING_BATCH_OPERATIONS.md +252 -0
- package/bin/flux-console.ts +2 -0
- package/dist/bin.js +108196 -0
- package/dist/client/assets/index-DGYEwTDL.css +1 -0
- package/dist/client/assets/index-oyTdySX0.js +421 -0
- package/dist/client/index.html +13 -0
- package/dist/server/index.js +108191 -0
- package/docker-compose.yml +40 -0
- package/docs/integrations/LARAVEL.md +207 -0
- package/package.json +50 -0
- package/postcss.config.js +6 -0
- package/scripts/flood-logs.ts +21 -0
- package/scripts/seed.ts +213 -0
- package/scripts/verify-throttle.ts +45 -0
- package/scripts/worker.ts +123 -0
- package/src/bin.ts +6 -0
- package/src/client/App.tsx +70 -0
- package/src/client/Layout.tsx +644 -0
- package/src/client/Sidebar.tsx +102 -0
- package/src/client/ThroughputChart.tsx +135 -0
- package/src/client/WorkerStatus.tsx +170 -0
- package/src/client/components/ConfirmDialog.tsx +103 -0
- package/src/client/components/JobInspector.tsx +524 -0
- package/src/client/components/LogArchiveModal.tsx +383 -0
- package/src/client/components/NotificationBell.tsx +203 -0
- package/src/client/components/Toaster.tsx +80 -0
- package/src/client/components/UserProfileDropdown.tsx +177 -0
- package/src/client/contexts/AuthContext.tsx +93 -0
- package/src/client/contexts/NotificationContext.tsx +103 -0
- package/src/client/index.css +174 -0
- package/src/client/index.html +12 -0
- package/src/client/main.tsx +15 -0
- package/src/client/pages/LoginPage.tsx +153 -0
- package/src/client/pages/MetricsPage.tsx +408 -0
- package/src/client/pages/OverviewPage.tsx +511 -0
- package/src/client/pages/QueuesPage.tsx +372 -0
- package/src/client/pages/SchedulesPage.tsx +531 -0
- package/src/client/pages/SettingsPage.tsx +449 -0
- package/src/client/pages/WorkersPage.tsx +316 -0
- package/src/client/pages/index.ts +7 -0
- package/src/client/utils.ts +6 -0
- package/src/server/index.ts +556 -0
- package/src/server/middleware/auth.ts +127 -0
- package/src/server/services/AlertService.ts +160 -0
- package/src/server/services/QueueService.ts +828 -0
- package/tailwind.config.js +73 -0
- package/tests/placeholder.test.ts +7 -0
- package/tsconfig.json +38 -0
- package/tsconfig.node.json +12 -0
- package/vite.config.ts +27 -0
package/Dockerfile.demo-worker
ADDED
@@ -0,0 +1,29 @@
# Use the official Bun image
FROM oven/bun:1.1.26 AS base
WORKDIR /usr/src/app

# ---- 1. Install Dependencies ----
FROM base AS install
COPY package.json bun.lock ./
COPY packages/photon/package.json ./packages/photon/
COPY packages/stream/package.json ./packages/stream/
COPY packages/flux-console/package.json ./packages/flux-console/
RUN bun install --frozen-lockfile

# ---- 2. Copy Source ----
FROM base AS build
COPY --from=install /usr/src/app/node_modules ./node_modules
COPY --from=install /usr/src/app/packages ./packages
COPY . .

# ---- 3. Runner ----
FROM base AS release
WORKDIR /usr/src/app
COPY --from=build /usr/src/app ./

# Env defaults
ENV NODE_ENV=production

# Start the demo worker
# It uses the local packages/dist if available, but Bun can run TS directly
CMD ["bun", "run", "packages/flux-console/scripts/demo-worker.ts"]
package/EVOLUTION_BLUEPRINT.md
ADDED
@@ -0,0 +1,112 @@
# Gravito Zenith Evolution Blueprint
**Target**: Zenith v2.0 - The Universal Application Control Plane
**Core Philosophy**: Lightweight, Redis-Native, Language-Agnostic.

## 🧭 Strategic Vision

We aim to evolve Zenith from a **Queue Manager** into a **Lightweight Application Control Plane**. We will absorb the best features from industry leaders (Laravel Pulse, Sidekiq, BullMQ) while strictly avoiding their architectural pitfalls (heavy deps, SQL locks, language coupling).

---

## 📅 Roadmap Structure

### Phase 1: Deep Visibility (Queue Insights)
**Focus**: Enhance the depth of information for the current Queue System.
**Benchmarks**: Sidekiq, BullMQ.

#### 1.1 Worker "X-Ray" Vision (The "Busy" State)
- **Concept**: Instead of just showing "Busy", show *what* the worker is doing.
- **Implementation**: Update Heartbeat Protocol to include `currentJobId` and `jobName`.
- **UI**: Workers table shows "Processing: Order #1024 (SendEmail)".
- **Benefit**: Instantly identify which job is hogging a worker.
- **Difference**: Unlike Sidekiq, which uses expensive, extensive locking, we use ephemeral keys.
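
A minimal sketch of how the extended heartbeat could drive the Workers table label. Only `currentJobId` and `jobName` come from the blueprint; the `WorkerHeartbeat` shape, the `status` field, and `describeWorker` are assumptions made for illustration.

```typescript
// Hypothetical heartbeat shape; only currentJobId and jobName are named in the blueprint.
interface WorkerHeartbeat {
  workerId: string;
  status: "idle" | "busy";
  currentJobId?: string;
  jobName?: string;
}

// Build the label shown in the Workers table from a heartbeat.
function describeWorker(hb: WorkerHeartbeat): string {
  if (hb.status !== "busy") return "Idle";
  // Older workers may not report job details yet, so fall back to plain "Busy".
  if (!hb.currentJobId) return "Busy";
  const suffix = hb.jobName ? ` (${hb.jobName})` : "";
  return `Processing: ${hb.currentJobId}${suffix}`;
}
```

Because the label is derived from whatever the heartbeat last reported, a worker that stops sending heartbeats would simply drop out once its key expires, assuming the heartbeat is stored under a key with a TTL.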

#### 1.2 Enhanced Payload Inspector
- **Concept**: Developer-friendly data inspection.
- **Implementation**:
  - Syntax-highlighted JSON viewer with folding.
  - "Copy as cURL" or "Copy to Clipboard" actions.
  - Display stack traces, potentially with click-to-open when running locally.
- **UX Rule**: Always lazy-load heavy payloads. Never fetch them in the list view.

#### 1.3 Timeline Visualization (Gantt-lite)
- **Concept**: Visualize concurrency.
- **Implementation**: A canvas-based timeline showing job execution durations overlapping in time.
- **Benefit**: Spot resource contention or gaps in processing.

---

### Phase 2: System Pulse (Resource Monitoring)
**Focus**: Lightweight APM (Application Performance Monitoring).
**Benchmarks**: Laravel Pulse, PM2.

#### 2.1 Gravito Pulse Protocol (GPP)
- **Concept**: A standardized JSON structure for services to report health.
- **Structure**: Redis key `pulse:{service}:{id}` (TTL 30s).
- **Metrics**: CPU%, RAM (RSS/Heap), Disk Usage, Uptime, Event Loop Lag.
- **Avoidance**: Do NOT store historical time-series data in SQL. Use Redis Lists/Streams with aggressive trimming (e.g., keep last 1 hour).
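
As a rough illustration of the key format and the "keep last 1 hour" rule, the sketch below builds a `pulse:{service}:{id}` key and trims an in-memory sample list to a time window. The `PulseSample` fields and both function names are assumptions; in Redis the same trimming would be done on the list or stream itself rather than on an array.

```typescript
// Hypothetical sample shape; the blueprint lists the metrics but not their field names.
interface PulseSample {
  timestamp: number; // epoch milliseconds
  cpuPercent: number;
  rssBytes: number;
}

// TTL stated in the blueprint for each pulse key.
const PULSE_TTL_SECONDS = 30;

// Build the heartbeat key for one service instance.
function pulseKey(service: string, id: string): string {
  return `pulse:${service}:${id}`;
}

// Keep only samples inside the retention window (default: the last hour).
function trimHistory(samples: PulseSample[], now: number, windowMs: number = 60 * 60 * 1000): PulseSample[] {
  const cutoff = now - windowMs;
  return samples.filter((sample) => sample.timestamp >= cutoff);
}
```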

#### 2.2 Grid Dashboard
- **Concept**: Customizable "Mission Control" view.
- **UI**: Drag-and-drop grid system.
- **Widgets**:
  - Host Health (CPU/RAM gauges).
  - Queue Backlog (Sparklines).
  - Exception Rate (Counter).
  - Slowest Routes (List).

#### 2.3 Cross-Language Recorders
- **Goal**: Provide simple SDKs.
  - `@gravito/recorder-node`: For Express/Hono/Nest.
  - `@gravito/recorder-php`: For Laravel/Symfony.
  - `@gravito/recorder-python`: For Django/FastAPI.

---

### Phase 3: Intelligent Operations (Proactive Ops)
**Focus**: Alerting and Anomaly Detection.
**Benchmarks**: New Relic (Lite), Oban.

#### 3.1 Outlier Detection (The "Slow" & "Error" Trap)
- **Concept**: Only capture interesting data.
- **Logic**:
  - If `duration > threshold`: Push to `pulse:slow_jobs`.
  - If `status >= 500`: Push to `pulse:exceptions`.
- **Benefit**: Nothing is recorded for successful, fast requests, so they add no storage overhead.
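
The two rules above can be sketched as a pure function that decides where, if anywhere, an event should be pushed. The `CompletedEvent` shape and the function name are assumptions; the two key names are the ones given in the blueprint.

```typescript
// Hypothetical event shape for a finished job or request.
interface CompletedEvent {
  id: string;
  durationMs: number;
  status: number; // HTTP-style status code
}

// Return the Redis keys this event should be pushed to; empty means "not interesting".
function outlierTargets(event: CompletedEvent, thresholdMs: number): string[] {
  const targets: string[] = [];
  if (event.durationMs > thresholdMs) targets.push("pulse:slow_jobs");
  if (event.status >= 500) targets.push("pulse:exceptions");
  return targets;
}
```

An empty result means no write is issued at all, which is where the savings for fast, successful requests would come from.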

#### 3.2 Smart Alerting Engine
- **Concept**: "Don't spam me."
- **Features**:
  - **Thresholds**: "CPU > 90% for 5 minutes".
  - **Cooldown**: "Alert once per hour per rule".
  - **Channels**: Slack, Discord, Email, Webhook.
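
One way the threshold and cooldown features might combine is sketched below: a rule fires only after the value has stayed above its threshold for a sustained period, and then at most once per cooldown window. `AlertRule`, `AlertGate`, and their fields are hypothetical names, not part of Zenith's `AlertService`.

```typescript
// Hypothetical rule shape, e.g. "CPU > 90% for 5 minutes, alert once per hour".
interface AlertRule {
  id: string;
  threshold: number;
  sustainMs: number;
  cooldownMs: number;
}

class AlertGate {
  private breachStart = new Map<string, number>();
  private lastFired = new Map<string, number>();

  // Returns true when an alert should be sent for this reading.
  check(rule: AlertRule, value: number, now: number): boolean {
    if (value <= rule.threshold) {
      // Back under the threshold: forget the running breach.
      this.breachStart.delete(rule.id);
      return false;
    }
    const start = this.breachStart.get(rule.id) ?? now;
    this.breachStart.set(rule.id, start);
    if (now - start < rule.sustainMs) return false; // not sustained long enough yet
    const last = this.lastFired.get(rule.id);
    if (last !== undefined && now - last < rule.cooldownMs) return false; // still cooling down
    this.lastFired.set(rule.id, now);
    return true;
  }
}
```

Delivery to Slack, Discord, email, or a webhook would sit behind the `true` result; the gate itself stays free of I/O so it can be tested with fixed timestamps.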

---

## 🎨 UI/UX Unification Strategy

To prevent "Feature Bloat" (UI Clutter), we will enforce strict design rules:

### 1. Navigation Hierarchy
Refactor Sidebar into logical groups:
- **Dashboards** (Overview, System Pulse)
- **Queues** (Active, Waiting, Delayed, Failed)
- **Infrastructure** (Workers, Databases, Redis)
- **Settings** (Alerts, API Keys)

### 2. Contextual Density
- **List View**: Minimal info (ID, Name, Status, Time).
- **Detail Panel**: Slide-over panel for deep inspection (Payloads, Stack Traces, Logs).
- **Avoidance**: Don't try to cram everything into the table rows.

### 3. Unified Filters
- Create a shared "Filter Bar" component (Date Range, Status, Queue Name) that works consistently across Logs, Jobs, and Pulse views.

---

## ✅ Implementation Checklist (Next Steps)

1. **[ ] Protocol Definition**: Document `GPP` (Gravito Pulse Protocol).
2. **[ ] Backend Core**: Implement `SystemMonitor` service in Zenith.
3. **[ ] Frontend Core**: Create the `GridDashboard` component layout.
4. **[ ] Feature**: Implement `Worker X-Ray` (easiest win from Phase 1).
package/JOBINSPECTOR_SCROLL_FIX.md
ADDED
@@ -0,0 +1,152 @@
# JobInspector Scroll Fix

## 🐛 Problem Description

When a queue holds a large number of jobs (e.g., 100+), the JobInspector side panel is stretched very tall, which means:
- The whole side panel exceeds the screen height
- The "Dismiss" button at the bottom cannot be seen
- The entire page has to be scrolled to view all the content
- The UX is poor

## ✅ Fix

### Modified CSS Classes

#### 1. Main container (`motion.div`)
```tsx
// Before
className="bg-card border-l h-full w-full max-w-2xl shadow-2xl flex flex-col"

// After
className="bg-card border-l h-screen max-h-screen w-full max-w-2xl shadow-2xl flex flex-col overflow-hidden"
```

**Changes:**
- `h-full` → `h-screen max-h-screen`: explicitly limits the height to the viewport height
- Added `overflow-hidden`: prevents content from overflowing

#### 2. Top header area
```tsx
// Before
className="p-6 border-b flex justify-between items-center bg-muted/20"

// After
className="p-6 border-b flex justify-between items-center bg-muted/20 flex-shrink-0"
```

**Changes:**
- Added `flex-shrink-0`: prevents the header area from being compressed

#### 3. Scrollable content area
```tsx
// Before
className="p-0 overflow-y-auto flex-1 bg-muted/5"

// After
className="flex-1 overflow-y-auto bg-muted/5 min-h-0"
```

**Changes:**
- Removed `p-0` (padding is controlled by the inner elements)
- Added `min-h-0`: allows the flex child to shrink to 0 so that scrolling works correctly

#### 4. Bottom button area
```tsx
// Before
className="p-4 border-t bg-card text-right"

// After
className="p-4 border-t bg-card text-right flex-shrink-0"
```

**Changes:**
- Added `flex-shrink-0`: prevents the button area from being compressed

## 🎯 Result

### Before the fix
```
┌─────────────────────────────┐
│  Header (title area)        │ ← OK
├─────────────────────────────┤
│  Job 1                      │
│  Job 2                      │
│  Job 3                      │
│  ...                        │
│  Job 98                     │
│  Job 99                     │
│  Job 100                    │ ← off-screen
├─────────────────────────────┤
│  [Dismiss] button           │ ← not visible
└─────────────────────────────┘
```

### After the fix
```
┌─────────────────────────────┐
│  Header (title area)        │ ← pinned to the top
├─────────────────────────────┤
│  Job 1                      │ ↕️
│  Job 2                      │ ↕️
│  Job 3                      │ ↕️ scrollable area
│  ...                        │ ↕️
│  Job 50                     │ ↕️
├─────────────────────────────┤
│  [Dismiss] button           │ ← pinned to the bottom
└─────────────────────────────┘
```

## 📊 Technical Details

### Flexbox layout structure
```
motion.div (h-screen, flex flex-col, overflow-hidden)
├─ Header (flex-shrink-0)                      ← will not be compressed
├─ Content (flex-1, overflow-y-auto, min-h-0)  ← scrollable, takes the remaining space
└─ Footer (flex-shrink-0)                      ← will not be compressed
```

### Key CSS properties

1. **`h-screen max-h-screen`**: limits the container height to the viewport height
2. **`overflow-hidden`**: prevents the container itself from scrolling
3. **`flex-shrink-0`**: prevents the header and footer from being compressed
4. **`flex-1`**: lets the content area take all the remaining space
5. **`overflow-y-auto`**: enables vertical scrolling only in the content area
6. **`min-h-0`**: allows the flex child to shrink (flex items default to `min-height: auto`, which otherwise keeps them from shrinking below their content size)

## 🧪 Test Verification

### Test cases
1. ✅ Open a queue with 100+ jobs
2. ✅ The side panel height is fixed to the screen height
3. ✅ The job list can be scrolled
4. ✅ The top header is always visible
5. ✅ The bottom Dismiss button is always visible
6. ✅ Scrolling is smooth, with no stutter

### Test steps
```bash
# 1. Make sure the test data exists
bun scripts/test-batch-operations.ts create

# 2. Open Flux Console
# http://localhost:3000

# 3. Click Inspect on the test-batch queue
# 4. Confirm that the top header and the bottom button are visible
# 5. Scroll the job list and confirm that scrolling is smooth
```

## 📝 Related Files

- `src/client/pages/QueuesPage.tsx` - JobInspector component
  - Modified lines: 255, 258, 294, 505

## 🎉 Summary

This fix ensures that, when handling a large number of jobs, the JobInspector:
- ✅ Has its height fixed to the screen height
- ✅ Lets the content area scroll independently
- ✅ Keeps the top and bottom areas always visible
- ✅ Provides a much better UX
package/PULSE_IMPLEMENTATION_PLAN.md
ADDED
@@ -0,0 +1,111 @@
# Gravito Pulse Implementation Plan
**Version**: 1.1.0 (Beta Targeted)
**Status**: Active
**Target**: Zenith v1.0 Beta

This document outlines the implementation plan for the **System Pulse** and **Universal Queue Connector**, enabling Zenith to monitor not just Gravito Stream, but also Laravel, BullMQ, and other queue systems directly in the current Beta phase.

---

## 🏗 Architecture Specifications

### 1. Redis Schema (Updated)

| Key Pattern | Type | TTL / Retention | Description |
| :--- | :--- | :--- | :--- |
| `pulse:server:{app}:{id}` | `String` (JSON) | 30s | **Heartbeat**. System resources (CPU/RAM). |
| `pulse:queues:{app}` | `String` (JSON) | 30s | **Queue Snapshot**. Metrics from external queues. |
| `pulse:slow:{app}` | `Stream` | MaxLen 1000 | **Slow Logs**. Validated heavy requests. |

### 2. Gravito Pulse Protocol (GPP) - Shared Types

```typescript
// System Heartbeat
interface PulseHeartbeat {
  // ... (Existing fields)
}

// Universal Queue Snapshot
interface QueueSnapshot {
  timestamp: number;
  app: string;
  queues: Array<{
    name: string;
    driver: 'gravito-stream' | 'laravel-horizon' | 'bullmq' | 'sqs' | 'other';
    metrics: {
      waiting: number;
      active?: number | null; // Optional (some drivers can't count active)
      delayed?: number;
      failed?: number;
    };
    meta?: Record<string, any>;
  }>;
}
```
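
Since snapshots arrive as JSON strings written by adapters in other languages, a consumer would probably want key builders and a runtime check before trusting a payload. The following is a sketch under that assumption; `pulseKeys`, `isRecord`, and `isQueueSnapshot` are illustrative names, and the check covers only the required fields of the snapshot type above.

```typescript
// Key builders mirroring the Redis schema table; the object and its method names are illustrative.
const pulseKeys = {
  server: (app: string, id: string): string => `pulse:server:${app}:${id}`,
  queues: (app: string): string => `pulse:queues:${app}`,
  slow: (app: string): string => `pulse:slow:${app}`,
};

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// Runtime check that a decoded JSON value has the required snapshot fields.
function isQueueSnapshot(value: unknown): boolean {
  if (!isRecord(value)) return false;
  if (typeof value["timestamp"] !== "number" || typeof value["app"] !== "string") return false;
  const queues = value["queues"];
  if (!Array.isArray(queues)) return false;
  return queues.every((queue: unknown) => {
    if (!isRecord(queue)) return false;
    if (typeof queue["name"] !== "string" || typeof queue["driver"] !== "string") return false;
    const metrics = queue["metrics"];
    // `waiting` is the only metric every driver must report.
    return isRecord(metrics) && typeof metrics["waiting"] === "number";
  });
}
```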

---

## 📅 Implementation Phases (Beta Priority)

### Phase 1: Foundation (Protocol & Node SDK)
**Goal**: Define the standard so other languages can conform.

- [ ] **Task 1.1: Create `packages/pulse-protocol`**
  - Define TypeScript interfaces for Heartbeat AND QueueSnapshot.
  - Export Redis key constants.

- [ ] **Task 1.2: Create `packages/pulse-node`**
  - Implement System Recorder (CPU/RAM).

### Phase 2: Universal Queue Adapters (v1.0 Beta Target)
**Goal**: Enable "One Dashboard, Any Queue".

- [ ] **Task 2.1: Laravel Adapter Specification (Concept)**
  - Target: `gravito/zenith-laravel` (Composer package).
  - Logic:
    - Hook into Laravel Schedule.
    - Run `Redis::llen` on queue lists or query `failed_jobs` table.
    - Push JSON to `pulse:queues:{laravel_app}`.
  - *Action*: Create a POC documentation/spec for PHP developers.

- [ ] **Task 2.2: BullMQ Adapter (Node.js)**
  - Create `packages/adapter-bullmq`.
  - Wrapper that accepts a BullMQ queue instance and auto-reports metrics to Zenith.
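
A rough sketch of what the Task 2.2 wrapper might do, written against a small structural interface rather than the BullMQ types. `JobCountSource`, `toQueueEntry`, and `buildSnapshot` are hypothetical, and whether a real BullMQ `Queue` satisfies `JobCountSource` as written (including the exact count keys it returns) is an assumption to verify against the BullMQ documentation.

```typescript
// Hypothetical structural stand-in for a queue object that can report per-state job counts.
interface JobCountSource {
  name: string;
  getJobCounts(): Promise<Record<string, number>>;
}

interface QueueEntry {
  name: string;
  driver: "bullmq";
  metrics: { waiting: number; active?: number; delayed?: number; failed?: number };
}

// Map raw per-state counts onto the snapshot's metrics shape.
function toQueueEntry(name: string, counts: Record<string, number>): QueueEntry {
  return {
    name,
    driver: "bullmq",
    metrics: {
      waiting: counts["waiting"] ?? 0, // required field, so default to 0
      active: counts["active"],
      delayed: counts["delayed"],
      failed: counts["failed"],
    },
  };
}

// Collect one snapshot across all wrapped queues; the caller decides how to publish it.
async function buildSnapshot(app: string, sources: JobCountSource[], now: number) {
  const queues: QueueEntry[] = [];
  for (const source of sources) {
    queues.push(toQueueEntry(source.name, await source.getJobCounts()));
  }
  return { timestamp: now, app, queues };
}
```

The result would then be serialized and written to `pulse:queues:{app}` with the 30s TTL from the schema table, on a timer shorter than that TTL.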

### Phase 3: Zenith Integration (Aggregated UI)
**Goal**: Visualize mixed queue sources.

- [ ] **Task 3.1: Backend Aggregation**
  - `MonitorService` must now SCAN both `pulse:server:*` and `pulse:queues:*`.
  - Stream consolidated data via SSE.

- [ ] **Task 3.2: Unified Queue Dashboard**
  - Update `QueuesPage` to support "External Queues".
  - External queues might be "Read-Only" initially (Metrics only, no Retry controls yet).
  - Add visuals to distinguish Gravito Queues vs. External Queues.

---

## 🔮 Future Roadmap (v2.0)

### Phase 4: Full Application Monitoring (Pulse)
**Goal**: Catch the "bad" requests and visualize deep application health.

- [ ] **Task 4.1: Slow Request Interceptor**
  - Add `httpMiddleware` to `pulse-node`.
  - Tracks request start/end time.
  - If the duration exceeds the threshold (default 1s), XADD to `pulse:slow:{app}`.

- [ ] **Task 4.2: Exception Tracking**
  - Aggregated 5xx error tracking via ZSET.

- [ ] **Task 4.3: Cross-Language SDKs**
  - Python (Django/FastAPI) and Go Recorders.
  - GPU Monitoring for AI workloads.
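
For Task 4.1, the core decision can be sketched without any HTTP framework: measure the duration and hand the entry to a sink only when it exceeds the threshold. `SlowEntry`, `SlowSink`, and `recordIfSlow` are assumed names; in a real adapter the sink would issue the XADD to `pulse:slow:{app}`.

```typescript
// Hypothetical entry shape and sink signature.
interface SlowEntry {
  path: string;
  durationMs: number;
}
type SlowSink = (entry: SlowEntry) => void;

// Report a request to the sink only when it took longer than the threshold (default 1s).
function recordIfSlow(path: string, startMs: number, endMs: number, sink: SlowSink, thresholdMs: number = 1000): boolean {
  const durationMs = endMs - startMs;
  if (durationMs <= thresholdMs) return false;
  sink({ path, durationMs });
  return true;
}
```

Keeping the sink injectable means the middleware can be exercised with an in-memory array and swapped for a Redis-backed sink in production.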

---

## ⚡ Development Guidelines for External Adapters
1. **Passive Reporting**: Adapters should strictly REPORT data. They should not rely on Zenith for commands in V1.
2. **Fault Tolerance**: If Redis is down, the adapter must not crash the main application.
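
The fault-tolerance guideline could be enforced with a small wrapper around every report call, so that a Redis outage turns into a logged failure instead of an exception in the host application. `reportSafely` and its `onError` callback are illustrative, not part of any existing adapter.

```typescript
// Run a report call and swallow any failure; returns whether the report succeeded.
async function reportSafely(
  report: () => Promise<void>,
  onError: (error: unknown) => void = () => {},
): Promise<boolean> {
  try {
    await report();
    return true;
  } catch (error) {
    // Never rethrow: monitoring must not take the main application down.
    onError(error);
    return false;
  }
}
```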
package/QUICK_TEST_GUIDE.md
ADDED
@@ -0,0 +1,72 @@
# 🎯 Flux Console Unified Testing Guide

## 🚀 Quick Start

All test scripts have been consolidated into `seed.ts` and `worker.ts`. The common testing workflows are described below.

### 1. Initialize Data (Seeding)

Choose a mode based on what you need to test:

```bash
# Basic data (Waiting, Delayed, Failed)
bun scripts/seed.ts standard

# Stress test (creates 15 queues, each with a large amount of data)
bun scripts/seed.ts stress

# Batch operations test (dedicated bulk data)
bun scripts/seed.ts batch

# Register schedules (Cron Jobs)
bun scripts/seed.ts cron

# All-in-one: run every mode above in one go
bun scripts/seed.ts all

# Clean up all data
bun scripts/seed.ts cleanup
```

### 2. Start Background Processing (Workers)

Simulate jobs being processed, succeeding, or failing:

```bash
# Start processing the default queue
bun scripts/worker.ts

# Simulate a high failure rate and processing delay
bun scripts/worker.ts orders,reports --fail=0.3 --delay=500
```
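
The source of `worker.ts` is not reproduced in this guide, so the following is only a guess at how arguments of this shape could be parsed: a comma-separated queue list followed by `--fail` (a failure probability) and `--delay` (assumed to be milliseconds). The `WorkerOptions` type, the default values, and `parseWorkerArgs` are all assumptions.

```typescript
// Hypothetical options shape; the defaults are placeholders, not the script's real defaults.
interface WorkerOptions {
  queues: string[];
  failRate: number; // probability between 0 and 1
  delayMs: number;
}

function parseWorkerArgs(argv: string[]): WorkerOptions {
  const options: WorkerOptions = { queues: ["default"], failRate: 0, delayMs: 0 };
  for (const arg of argv) {
    if (arg.startsWith("--fail=")) {
      options.failRate = Number(arg.slice("--fail=".length));
    } else if (arg.startsWith("--delay=")) {
      options.delayMs = Number(arg.slice("--delay=".length));
    } else if (!arg.startsWith("--")) {
      // First positional argument: comma-separated queue names.
      options.queues = arg.split(",").filter((name) => name.length > 0);
    }
  }
  return options;
}
```

With the second command above, such a parser would yield the queues `orders` and `reports`, a failure rate of 0.3, and a delay of 500.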

---

## 🧪 Core Test Scenarios

### Test 1: Schedule Management (Schedules)
1. Run `bun scripts/seed.ts cron`.
2. In the UI, click **"Schedules"** in the sidebar.
3. Check that schedules such as `cleanup-tmp` and `daily-report` are displayed.
4. Click "Run Now" and check that the job immediately enters the `system` or `reports` queue.

### Test 2: Batch Operations
1. Run `bun scripts/seed.ts batch`.
2. Open the `test-batch` queue.
3. Press **Cmd+A** to select all, then click "Delete Selected".
4. Test the "Delete all X jobs" warning banner at the bottom of the page.

### Test 3: Layout and Search
1. Run `bun scripts/seed.ts stress`.
2. Check that the queue list in the sidebar collapses and scrolls correctly.
3. Press **Cmd+K** to open the command palette, then type `billing` to jump to that queue.

---

## 🧹 Cleanup

After testing, it is recommended to run:
```bash
bun scripts/seed.ts cleanup
```
This removes all keys in Redis with the `queue:` prefix, as well as the Console's log cache.
package/README.md
ADDED
@@ -0,0 +1,33 @@
# @gravito/flux-console

Management and Monitoring UI for Gravito Stream.

## Features

- **Real-time Monitoring**: Throughput and error rates.
- **Worker Health**: Live CPU and RAM metrics.
- **Queue Management**: Pause/Resume queues, View Waiting/Delayed/Failed jobs.
- **DLQ Operations**: Batch retry or clear failed jobs directly from the UI.
- **Job Auditing & Search**: Permanent history via SQL (MySQL/SQLite) with global search.
- **Operational Log Archiving**: Persistent storage for system events and worker activities with history search.
- **Automated Alerting**: Slack notifications for failure spikes or backlog issues.
- **Batch Actions**: Flush delayed jobs, purge queues, and bulk operations.
- **Schedule Management**: Full UI for Cron jobs.
- **Zero-Config**: Built-in SQLite support for local auditing without a DB server.

## Development

```bash
# Start backend and frontend (proxy mode)
bun run dev

# Seed test data
bun scripts/seed-data.ts

# Start demo worker
bun scripts/demo-worker.ts
```

## Technical Specification

See [ARCHITECTURE.md](./ARCHITECTURE.md) and [DOCS_INTERNAL.md](./DOCS_INTERNAL.md) for implementation details.
package/ROADMAP.md
ADDED
@@ -0,0 +1,85 @@
# Gravito Zenith Roadmap (Control Plane)

This document outlines the future development plan for **Gravito Zenith**, moving from a basic monitoring tool to a comprehensive polyglot control plane for any job processing system.

## 🚀 High Priority (Immediate Next Steps)

### 1. Polyglot & Framework Integration (P0)
**Goal**: Official support for non-Node.js workers, making Zenith a universal dashboard.
- **Tasks**:
  - [x] **Protocol Specification**: Defined standard Redis structures for Heartbeat and Logs.
  - [x] **Laravel Integration**: Blueprint for `ZenithServiceProvider` and `ZenithConnector` defined.
  - [ ] **Go/Python SDK**: Create lightweight client libraries for other languages.

### 2. Docker & Cloud-Native Deployment (P1)
**Goal**: Enable "One-Click Deployment" for any environment (local, EC2, K8s).
- **Current Blocker**: Local workspace dependencies (`workspace:*`) cause build failures in standard Docker contexts.
- **Tasks**:
  - [x] Fix `Dockerfile` dependency resolution (via multi-stage builds).
  - [x] Create `docker-compose.yml` for a full stack setup (Console + Redis + Demo Worker).
  - [x] Implementation of Scheduled Jobs (Cron) Management.

### 3. System Pulse Monitoring (Lightweight APM) (P0 - NEW)
**Goal**: Application-aware, lightweight resource monitoring and alerting.
**Philosophy**: "If you can connect to Redis, you are monitored." No extra agents, no Prometheus config.
- **Features**:
  - **Live Resource Cards**: CPU / RAM / Disk Usage per service (Node.js, PHP, Python).
  - **Health Heartbeats**: Auto-discovery of active services via `pulse:{service}:{id}` keys.
  - **Proactive Alerting**: Slack notifications for Disk Space < 10% or High CPU Load.
  - **Protocol-Based**: Language-agnostic design (Gravito Pulse Protocol).

### 3. History Persistence (SQL Archive) (Completed ✅)
**Goal**: Store job history permanently for auditing and long-term analysis.
- [x] Implement a `PersistenceAdapter` in `@gravito/stream`.
- [x] Automatically archive completed/failed jobs to a SQL database.
- [x] **Zero-Config (SQLite)**: Integrated support for local testing.
- [x] **Time Travel Audit**: Comprehensive UI for tracing job history.

## ✨ Feature Enhancements (Mid-Term)

### 4. Alerting & Notifications (Completed ✅)
**Goal**: Proactive issue notification system.
- [x] **AlertService**: Lightweight rules for failure spikes, backlog, and worker loss.
- [x] **Cooldown Mechanism**: Prevents alerting storms.
- [x] **Slack Integration**: Webhook support with test notification UI.
- [x] **Real-time Monitoring**: Integrated directly into server metrics loop.

### 5. Scheduled Jobs (Cron) Management (Completed ✅)
**Goal**: UI-based management for recurring tasks.
- [x] Dashboard to view all registered Cron jobs.
- [x] Ability to "Trigger Now" manually.
- [x] Ability to Pause/Resume (Delete/Register) specific Cron schedules.
- [x] Real-time ticking via the Console server.

### 6. Batch Operations (Completed ✅)
**Goal**: Bulk management actions.
- **Problem**: Can only retry/delete one job or "all" jobs. Hard to handle "the 50 jobs that failed due to the bug yesterday".
- **Tasks**:
  - [x] Multi-select checkboxes in job lists.
  - [x] Bulk Retry / Bulk Delete.
  - [x] Select All Matching Query (Delete/Retry ALL jobs of a type).
  - [x] Confirmation dialogs with loading states.
  - [x] Keyboard shortcuts (Ctrl+A, Escape).
  - [x] Visual feedback and total count display.

## 🔮 Enterprise Features (Long-Term)

### 7. Role-Based Access Control (RBAC)
**Goal**: Granular permission management for teams.
- **Problem**: Single password for everyone. Risky for junior devs to have "Delete Queue" power.
- **Features**:
  - Roles: `Viewer` (Read-only), `Operator` (Retry/Pause), `Admin` (Delete/Purge).
  - User Management system (potentially integrated with OAuth/SSO).
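
The three roles could be modeled as a simple permission table, sketched below. The role names and their headline capabilities come from the roadmap; the `Action` names and the assumption that each role also includes the permissions of the one below it are illustrative.

```typescript
type Role = "Viewer" | "Operator" | "Admin";
type Action = "read" | "retry" | "pause" | "delete" | "purge";

// Assumed to be cumulative: Operator can also read, Admin can do everything.
const ROLE_ACTIONS: Record<Role, Action[]> = {
  Viewer: ["read"],
  Operator: ["read", "retry", "pause"],
  Admin: ["read", "retry", "pause", "delete", "purge"],
};

// Check whether a role is allowed to perform an action.
function can(role: Role, action: Action): boolean {
  return ROLE_ACTIONS[role].indexOf(action) !== -1;
}
```

A server-side guard would call `can(role, action)` before executing a destructive endpoint, so that hiding a button in the UI is never the only line of defense.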

### 8. Multi-Cluster Management
**Goal**: Centralized control plane for multiple environments.
- **Problem**: Need to open 3 different tabs for Dev, Staging, and Prod.
- **Features**:
  - Connection Switcher in the UI header.
  - Unified view of multiple Redis instances.

### 9. Enhanced Search (Indexer)
**Goal**: Full-text search on Job Payloads.
- **Problem**: Can only search by Job ID. Cannot search by "email: user@example.com".
- **Solution**: Implement a lightweight search index (RediSearch or external engine) to index payloads.