@sysnee/pgs 0.1.0
- package/README.md +314 -0
- package/docker-compose.yml +16 -0
- package/docs/ARCHITECTURE_RECOMMENDATIONS.md +480 -0
- package/docs/CRITICAL_REVIEW.md +748 -0
- package/docs/EXECUTIVE_SUMMARY.md +210 -0
- package/docs/PROJECT.md +250 -0
- package/haproxy-lua/pg-route.lua +177 -0
- package/manager.js +510 -0
- package/manifest.json +32 -0
- package/package.json +24 -0
@@ -0,0 +1,210 @@
# Executive Summary: Analysis for a Commercial Product

## 🎯 Overview

**Goal**: Turn the current solution into a commercial managed PostgreSQL product (PaaS)

**Current State**: ✅ Excellent for an MVP/POC, ⚠️ needs critical improvements for commercial production

---

## 🔴 CRITICAL GAPS (Blockers)

### 1. Docker Compose - How far is it acceptable?

**Practical Limits:**
- ✅ **ACCEPTABLE**: < 100 tenants, MVP, development
- ❌ **UNACCEPTABLE**: > 500 tenants, enterprise production, multi-region

**Problems:**
- Single host = single point of failure
- Does not scale beyond ~50-100 containers/host
- A single YAML file grows without bound
- No automatic orchestration

**Recommendation:**
- **Phase 1 (MVP)**: Improved Docker Compose with multi-host sharding
- **Phase 2-3**: Migrate to Kubernetes
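
This summary does not spell out how multi-host sharding would work. One possible reading, sketched here in JavaScript because the package runs on Node.js, is to assign each tenant deterministically to one of several Docker Compose hosts by hashing its ID. The names `hashTenantId` and `pickHost` and the host list are hypothetical and not part of the package.

```javascript
// Hypothetical sketch: deterministic tenant-to-host assignment.
// FNV-1a 32-bit hash over the UTF-16 code units of the tenant ID.
function hashTenantId(tenantId) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < tenantId.length; i++) {
    hash ^= tenantId.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // keep it an unsigned 32-bit value
  }
  return hash >>> 0;
}

// Pick the host that should run this tenant's container.
function pickHost(tenantId, hosts) {
  if (hosts.length === 0) {
    throw new Error("no hosts configured");
  }
  return hosts[hashTenantId(tenantId) % hosts.length];
}
```

Plain modulo assignment moves many tenants whenever the host list changes, so a real deployment would more likely store each tenant's assignment once it is made.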

### 2. High Availability - CRITICAL

**Problem**: Zero redundancy
- 1 PostgreSQL container = 1 point of failure
- HAProxy has no redundancy
- No automatic failover

**Solution**:
- PostgreSQL replication (Patroni/pg_auto_failover)
- Redundant HAProxy with Keepalived
- Auto-failover configured

### 3. Backup and Recovery - ESSENTIAL

**Problem**: No backup system exists

**Solution**:
- Automatic daily backups (pg_dump/pgBackRest)
- Point-in-Time Recovery (PITR)
- Configurable retention (7/30/90 days)
- Offsite storage (S3/GCS)
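
Configurable retention comes down to comparing each backup's age with a per-plan window. The sketch below shows only that selection step. The backup record shape (`name`, `createdAt`) and the function name `selectExpiredBackups` are assumptions for illustration, and taking or deleting backups is out of scope.

```javascript
const RETENTION_DAY_MS = 24 * 60 * 60 * 1000;

// Return the backups that fall outside the retention window.
// `backups` is an array of { name, createdAt: Date } (assumed shape).
function selectExpiredBackups(backups, retentionDays, now = new Date()) {
  const cutoff = now.getTime() - retentionDays * RETENTION_DAY_MS;
  // A backup created exactly at the cutoff is still kept.
  return backups.filter((backup) => backup.createdAt.getTime() < cutoff);
}
```

Called with `retentionDays` set to 7, 30, or 90, the same function covers the three retention tiers listed above.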

### 4. Monitoring and Alerts

**Problem**: No visibility into system health

**Solution**:
- Prometheus + Grafana
- Critical alerts (container down, backup failed)
- SLA tracking
- Per-tenant metrics

### 5. Security

**Problems**:
- Plaintext passwords
- No SSL/TLS
- No secret management

**Solution**:
- HashiCorp Vault for secrets
- Mandatory SSL/TLS
- Automatic password rotation
- Network policies

### 6. API and Automation

**Problem**: Manual CLI only

**Solution**:
- Full REST API
- Webhooks for events
- Official SDK/CLI
- Integration with external systems

---

## 📊 COMPARISON: Docker Compose vs. Alternatives

| Solution | Multi-Host | Auto-Scale | HA | Complexity | Production |
|----------|------------|------------|----|------------|------------|
| **Docker Compose** | ❌ | ❌ | ❌ | ✅ Low | ❌ |
| **Docker Swarm** | ✅ | ⚠️ | ✅ | ⚠️ Medium | ⚠️ |
| **Kubernetes** | ✅ | ✅ | ✅ | ❌ High | ✅ |
| **Nomad** | ✅ | ✅ | ✅ | ⚠️ Medium | ✅ |

---

## 🚀 RECOMMENDED ROADMAP

### PHASE 1: Commercial MVP (2-3 months)
```
✅ Improved Docker Compose (multi-host sharding)
✅ Basic automatic backup
✅ REST API
✅ Basic monitoring (Prometheus + Grafana)
✅ Basic secrets management
✅ SSL/TLS
```

**Estimated cost**: $5K-15K (infrastructure + development)

### PHASE 2: Production (3-4 months)
```
✅ High availability (replication)
✅ Advanced backup (pgBackRest + PITR)
✅ Web dashboard
✅ Integrated billing
✅ Automatic alerts
```

**Estimated cost**: $10K-25K

### PHASE 3: Scale (6-12 months)
```
✅ Migration to Kubernetes
✅ Multi-region
✅ Read replicas
✅ Auto-scaling
✅ Enterprise features
```

**Estimated cost**: $25K-50K

---

## 💰 BUSINESS MODEL

### Suggested Plans

**Starter** ($29/month)
- 1 CPU, 2GB RAM, 50GB storage
- Daily backup (7-day retention)
- 99.9% SLA

**Professional** ($99/month)
- 2 CPU, 4GB RAM, 200GB storage
- Hourly backup (30-day retention)
- 99.95% SLA
- Read replica (optional)

**Enterprise** ($299/month)
- 4 CPU, 16GB RAM, 1TB storage
- Continuous backup (90-day retention)
- 99.99% SLA
- Multi-region
- Guaranteed SLA
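
To make the SLA tiers concrete, each percentage can be converted into a monthly downtime budget. The sketch below assumes a 30-day measurement window, which this summary does not specify, and the function name `allowedDowntimeMinutes` is illustrative.

```javascript
// Allowed downtime in minutes for a given SLA percentage.
// Assumes a 30-day month unless another window is passed in.
function allowedDowntimeMinutes(slaPercent, daysInMonth = 30) {
  const totalMinutes = daysInMonth * 24 * 60;
  return totalMinutes * (1 - slaPercent / 100);
}

for (const sla of [99.9, 99.95, 99.99]) {
  console.log(`${sla}% SLA: ${allowedDowntimeMinutes(sla).toFixed(2)} min/month`);
}
```

Under that assumption the three plans allow roughly 43.2, 21.6, and 4.32 minutes of downtime per month. This is why the higher tiers depend on the high-availability work described earlier.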

---

## ⚠️ MAIN RISKS

1. **Scalability**: Docker Compose does not scale beyond ~100 tenants
2. **Reliability**: Without HA, any failure affects multiple customers
3. **Compliance**: Without backups/audit, regulations are hard to meet
4. **Operations**: Manual overhead grows with scale

---

## ✅ IMMEDIATE ACTIONS

1. **Define MVP Scope**
   - How many initial tenants?
   - Minimum acceptable SLA?
   - Essential features?

2. **Choose Phase 1 Architecture**
   - Improved Docker Compose (recommended)
   - Or invest directly in Kubernetes?

3. **Prioritize Features**
   - Automatic backup (ESSENTIAL)
   - REST API (ESSENTIAL)
   - Monitoring (ESSENTIAL)
   - HA (Phase 2)

4. **Cost Estimate**
   - Cloud infrastructure
   - Development
   - Operations

---

## 📚 RELATED DOCUMENTATION

- **[CRITICAL_REVIEW.md](./CRITICAL_REVIEW.md)** - Complete detailed analysis
- **[ARCHITECTURE_RECOMMENDATIONS.md](./ARCHITECTURE_RECOMMENDATIONS.md)** - Practical technical recommendations

---

## 🎯 CONCLUSION

**Verdict**: The solution has an excellent foundation but needs significant investment in:

1. ✅ **High Availability** (critical)
2. ✅ **Backup/Recovery** (critical)
3. ✅ **REST API** (essential)
4. ✅ **Monitoring** (essential)
5. ✅ **Migration to an orchestrator** (scalability)

**Final Recommendation**: Start with improved Docker Compose for a fast MVP, but plan the migration to Kubernetes once the platform reaches ~100 tenants or HA becomes necessary.

package/docs/PROJECT.md
ADDED
@@ -0,0 +1,250 @@
# PostgreSQL Multi-Tenant Instance Manager

## Project Definition

### What It Is

A **dynamic PostgreSQL multi-tenant management system** that provides complete database isolation by creating dedicated PostgreSQL instances per tenant. The system uses HAProxy with custom PostgreSQL protocol parsing to route connections intelligently while maintaining complete tenant isolation at the database server level.

### Core Concept

Instead of sharing a single PostgreSQL instance with multiple databases (shared database architecture), this system creates **one PostgreSQL container per tenant**, ensuring:

- **Complete Data Isolation**: Each tenant has its own PostgreSQL process and data directory
- **Independent Scaling**: Resources can be allocated per tenant
- **Enhanced Security**: No risk of cross-tenant data access
- **Simplified Operations**: Each tenant can be managed independently

### Key Features

1. **Dynamic Tenant Provisioning**
   - Create new PostgreSQL instances on-demand
   - Automatic volume and network configuration
   - Custom initialization scripts per tenant

2. **Intelligent Routing**
   - HAProxy parses PostgreSQL protocol to extract username
   - Routes connections to correct tenant backend automatically
   - Single external port (5432) for all tenants

3. **Access Control**
   - Per-tenant external access enable/disable
   - Secure by default (access disabled on creation)
   - Runtime access control without service restart

4. **Complete Isolation**
   - Separate Docker containers per tenant
   - Isolated volumes for data persistence
   - Network isolation via Docker bridge network
   - No shared processes or memory

5. **Zero-Downtime Operations**
   - Graceful HAProxy reloads
   - Independent tenant management
   - No impact on other tenants during operations

## Architecture

```
┌─────────────────────────────────────────────────────────┐
│                     External Access                     │
│                    (localhost:5432)                     │
└───────────────────────┬─────────────────────────────────┘
                        │
                        ▼
┌─────────────────────────────────────────────────────────┐
│                      HAProxy Proxy                      │
│  ┌──────────────────────────────────────────────────┐   │
│  │ Frontend: postgres_frontend                      │   │
│  │   - Listens on port 5432                         │   │
│  │   - Parses PostgreSQL protocol (Lua script)      │   │
│  │   - Extracts username from startup packet        │   │
│  │   - Checks tenant-access.json for permissions    │   │
│  └──────────────────────────────────────────────────┘   │
└───────────────────────┬─────────────────────────────────┘
                        │
       ┌────────────────┼────────────────┐
       │                │                │
       ▼                ▼                ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│   Backend    │ │   Backend    │ │   Backend    │
│ pgs_tenant1  │ │ pgs_tenant2  │ │ pgs_tenant3  │
└──────┬───────┘ └──────┬───────┘ └──────┬───────┘
       │                │                │
       ▼                ▼                ▼
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│  PostgreSQL  │ │  PostgreSQL  │ │  PostgreSQL  │
│ Container 1  │ │ Container 2  │ │ Container 3  │
│              │ │              │ │              │
│  Port: 5432  │ │  Port: 5432  │ │  Port: 5432  │
│  (internal)  │ │  (internal)  │ │  (internal)  │
│              │ │              │ │              │
│   Volume:    │ │   Volume:    │ │   Volume:    │
│   pgdata_1   │ │   pgdata_2   │ │   pgdata_3   │
└──────────────┘ └──────────────┘ └──────────────┘
```

## Technical Implementation

### Components

1. **Manager Script (manager.js)**
   - Node.js CLI tool for tenant lifecycle management
   - Dynamically generates docker-compose.yml entries
   - Manages HAProxy configuration
   - Controls tenant access permissions

2. **HAProxy Reverse Proxy**
   - TCP-level load balancer and router
   - Custom Lua script for PostgreSQL protocol parsing
   - Routes based on extracted username
   - Per-tenant access control

3. **PostgreSQL Protocol Parser (pg-route.lua)**
   - Parses binary PostgreSQL startup packet
   - Extracts username and connection parameters
   - Handles SSL negotiation
   - Enforces access control policies

4. **Docker Infrastructure**
   - Separate container per tenant
   - Bridge network for internal communication
   - Persistent volumes for data
   - Isolated execution environments
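
The contents of `manager.js` are not shown in this part of the diff, so the following is only a guess at what generating a docker-compose.yml entry for one tenant could look like. The function name `buildTenantService`, the option names, and the tenant ID pattern are hypothetical. `POSTGRES_USER` and `POSTGRES_DB` are environment variables of the official PostgreSQL image, and the `pgs_` and `pgdata_` prefixes follow the diagram above.

```javascript
// Hypothetical sketch, not the package's manager.js.
// Builds the service and volume definitions for one tenant as plain objects,
// ready to be merged into a parsed docker-compose.yml document.
function buildTenantService(tenantId, options) {
  // Assumed rule: IDs become part of container and backend names, so keep them simple.
  if (!/^[a-z0-9_]+$/.test(tenantId)) {
    throw new Error(`invalid tenant id: ${tenantId}`);
  }
  const { image, dataMountPath, network } = options;
  return {
    name: `pgs_${tenantId}`,
    service: {
      image,
      container_name: `pgs_${tenantId}`,
      // The password is left out on purpose; it should come from a secret store.
      environment: { POSTGRES_USER: tenantId, POSTGRES_DB: tenantId },
      volumes: [`pgdata_${tenantId}:${dataMountPath}`],
      networks: [network],
    },
    volume: `pgdata_${tenantId}`,
  };
}
```

The image tag, data mount path, and network name are passed in rather than hard-coded because none of them can be confirmed from the text shown here.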

### Connection Flow

1. Client connects to `localhost:5432` with username `tenant_id`
2. HAProxy receives connection and invokes Lua script
3. Script parses PostgreSQL startup packet and extracts username
4. Script checks `tenant-access.json` for permission
5. If allowed, routes to backend `pgs_{tenant_id}`
6. Backend forwards to PostgreSQL container on internal network
7. Connection established with complete isolation
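
Steps 3 to 5 can be reproduced outside HAProxy for experimentation. The sketch below is a JavaScript mirror of the routing decision, not the Lua script itself. It builds a protocol 3.0 startup packet, extracts its parameters, and applies the access check. All three function names are illustrative, and SSL and cancel requests are not handled.

```javascript
// Build a startup packet: [int32 length][int32 version][key\0value\0 ...][\0]
function buildStartupPacket(params) {
  const parts = [];
  for (const [key, value] of Object.entries(params)) {
    parts.push(Buffer.from(`${key}\0${value}\0`, "utf8"));
  }
  parts.push(Buffer.from([0])); // empty key terminates the parameter list
  const body = Buffer.concat(parts);
  const header = Buffer.alloc(8);
  header.writeUInt32BE(body.length + 8, 0); // the length field counts itself
  header.writeUInt32BE(3 << 16, 4); // protocol version 3.0
  return Buffer.concat([header, body]);
}

// Extract the key/value parameters, or return null for anything unexpected.
function parseStartupParams(packet) {
  if (packet.length < 8) return null;
  const len = packet.readUInt32BE(0);
  if (packet.length < len) return null;
  if (packet.readUInt16BE(4) !== 3) return null; // major version must be 3
  const params = {};
  let pos = 8;
  while (pos < len) {
    const keyEnd = packet.indexOf(0, pos);
    if (keyEnd === -1 || keyEnd >= len || keyEnd === pos) break;
    const valEnd = packet.indexOf(0, keyEnd + 1);
    if (valEnd === -1 || valEnd >= len) break;
    params[packet.toString("utf8", pos, keyEnd)] = packet.toString("utf8", keyEnd + 1, valEnd);
    pos = valEnd + 1;
  }
  return params;
}

// Decide the backend for a connection; tenants not explicitly enabled are blocked.
function routeConnection(packet, access) {
  const params = parseStartupParams(packet);
  const user = params && params.user;
  if (!user || access[user] !== true) return { blocked: true };
  return { blocked: false, backend: `pgs_${user}` };
}
```

For example, `routeConnection(buildStartupPacket({ user: "tenant1" }), { tenant1: true })` selects the backend `pgs_tenant1`, while a tenant that is missing from the access map or set to `false` is blocked.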

## Comparison with Similar Solutions

### Shared Database Architecture

**Traditional Multi-Tenant PostgreSQL:**
- Single PostgreSQL instance
- Multiple databases/schemas per instance
- Shared processes and memory
- Risk of cross-tenant data access
- Less isolation

**This Solution:**
- Multiple PostgreSQL instances
- One instance per tenant
- Complete process isolation
- Zero risk of cross-tenant access
- Maximum isolation

### Similar Open Source Solutions

#### 1. **PgBouncer**
- **Purpose**: Connection pooling, not tenant isolation
- **Difference**: Pools connections to single instance; this creates separate instances
- **Use Case**: Different - PgBouncer optimizes connections; this isolates tenants

#### 2. **Citus**
- **Purpose**: PostgreSQL extension for distributed PostgreSQL
- **Difference**: Shards data across nodes; this creates separate instances per tenant
- **Use Case**: Horizontal scaling vs. tenant isolation

#### 3. **Patroni + HAProxy**
- **Purpose**: High availability and load balancing
- **Difference**: Replicates single database; this creates isolated instances
- **Use Case**: HA for single database vs. multi-tenant isolation

#### 4. **Schema-based Multi-tenancy**
- **Purpose**: Share database, separate schemas
- **Difference**: Shared instance; this uses separate instances
- **Use Case**: Resource efficiency vs. complete isolation

#### 5. **Row-level Security (RLS)**
- **Purpose**: Application-level tenant isolation
- **Difference**: Logic-based separation; this uses infrastructure isolation
- **Use Case**: Application isolation vs. infrastructure isolation

### Unique Aspects of This Solution

1. **Instance-per-tenant at infrastructure level**
   - Not just database or schema separation
   - Complete process and memory isolation

2. **Dynamic provisioning with single external port**
   - No need for port management
   - Automatic routing based on connection parameters

3. **Protocol-aware routing**
   - Parses PostgreSQL binary protocol
   - Routes before connection completion
   - Handles SSL negotiation

4. **Runtime access control**
   - Enable/disable tenant access without restart
   - No downtime for access changes

5. **Docker-native architecture**
   - Leverages container isolation
   - Simple deployment and scaling
   - Resource limits per tenant
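
Runtime access control rests on `tenant-access.json`. According to the Lua script in this package, it is a flat object of tenant names mapped to booleans, reloaded with a 5-second cache, and tenants missing from it are denied. The sketch below shows how a tool might flip one tenant's flag and serialize the result. The function names `setTenantAccess` and `serializeAccessMap` are hypothetical, and writing the file to disk is left out.

```javascript
// Return a new access map with one tenant's external access flag changed.
function setTenantAccess(accessMap, tenantId, enabled) {
  return { ...accessMap, [tenantId]: Boolean(enabled) };
}

// Serialize as a flat JSON object, e.g. { "tenant1": true }.
function serializeAccessMap(accessMap) {
  return JSON.stringify(accessMap, null, 2) + "\n";
}
```

Because the proxy caches the file for a few seconds, a change made this way takes effect shortly after the file is written, without an HAProxy restart.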

## Use Cases

### Ideal For

- **SaaS Applications** requiring strict tenant data isolation
- **Healthcare/Finance** applications with compliance requirements
- **Multi-tenant platforms** needing independent scaling
- **Development/Testing** environments with isolated databases
- **Legacy application migration** requiring tenant separation

### Not Ideal For

- Thousands of tenants (resource overhead)
- Shared resource requirements
- Simple multi-tenant applications without strict isolation needs
- Environments requiring minimal resource usage

## Advantages

✅ **Maximum Isolation**: Complete process and data separation
✅ **Security**: Zero risk of cross-tenant data access
✅ **Flexibility**: Independent scaling and management per tenant
✅ **Simplicity**: Single external port, automatic routing
✅ **Compliance**: Easier to meet regulatory requirements
✅ **Debugging**: Isolated environments simplify troubleshooting

## Trade-offs

⚠️ **Resource Usage**: Higher memory/CPU per tenant
⚠️ **Management Overhead**: More containers to manage
⚠️ **Scaling Limits**: Practical limit on number of tenants per host
⚠️ **Backup Complexity**: Need to back up multiple instances

## Technology Stack

- **Runtime**: Node.js (ES Modules)
- **Container Orchestration**: Docker Compose
- **Reverse Proxy**: HAProxy with Lua scripting
- **Database**: PostgreSQL 18+
- **Protocol Parsing**: Custom Lua script
- **Configuration**: YAML (docker-compose.yml), JSON (tenant-access.json)

## Future Enhancements

- [ ] Health checks and automatic failover
- [ ] Backup/restore automation per tenant
- [ ] Resource limits (CPU/memory) per tenant
- [ ] Monitoring and metrics collection
- [ ] Tenant migration tools
- [ ] Kubernetes support
- [ ] Connection pooling per tenant
- [ ] SSL/TLS termination

## License & Status

This is a custom solution built for specific multi-tenant requirements. It combines open-source tools (HAProxy, PostgreSQL, Docker) with custom routing logic to achieve instance-per-tenant isolation with intelligent connection routing.

@@ -0,0 +1,177 @@
-- PostgreSQL Protocol Parser for HAProxy
-- Routes connections based on username extracted from startup packet
-- Checks tenant-access.json for access control

-- Simple JSON parser for our limited use case (flat object with string keys and boolean values)
local function parse_json(str)
    local result = {}
    -- Match patterns like "key": true or "key": false
    for key, value in string.gmatch(str, '"([^"]+)":%s*(%w+)') do
        if value == "true" then
            result[key] = true
        elseif value == "false" then
            result[key] = false
        end
    end
    return result
end

-- Cache for tenant access configuration
local tenant_access_cache = {}
local cache_timestamp = 0
local CACHE_TTL = 5 -- seconds

-- Load tenant access configuration
local function load_tenant_access()
    local now = core.now().sec
    if now - cache_timestamp < CACHE_TTL then
        return tenant_access_cache
    end

    local file = io.open("/etc/haproxy/tenant-access.json", "r")
    if not file then
        core.Warning("tenant-access.json not found, denying all connections")
        tenant_access_cache = {}
        cache_timestamp = now
        return tenant_access_cache
    end

    local content = file:read("*all")
    file:close()

    local data = parse_json(content)
    tenant_access_cache = data or {}
    cache_timestamp = now

    return tenant_access_cache
end

-- Parse PostgreSQL startup packet to extract username
-- PostgreSQL startup packet format:
-- [4 bytes: length] [4 bytes: protocol version] [key=value pairs\0]
local function parse_startup_packet(data)
    if #data < 8 then
        return nil
    end

    -- Read packet length (big-endian)
    local len = (string.byte(data, 1) * 16777216) +
                (string.byte(data, 2) * 65536) +
                (string.byte(data, 3) * 256) +
                string.byte(data, 4)

    if #data < len then
        return nil
    end

    -- Read protocol version
    local major = (string.byte(data, 5) * 256) + string.byte(data, 6)
    local minor = (string.byte(data, 7) * 256) + string.byte(data, 8)

    -- Check for SSL request (protocol 1234.5679)
    if major == 1234 and minor == 5679 then
        return { ssl_request = true }
    end

    -- Check for cancel request (protocol 1234.5678)
    if major == 1234 and minor == 5678 then
        return { cancel_request = true }
    end

    -- Normal startup message (protocol 3.0)
    if major ~= 3 then
        return nil
    end

    -- Parse key-value pairs starting at byte 9
    local params = {}
    local pos = 9

    while pos < len do
        -- Read key
        local key_start = pos
        while pos <= len and string.byte(data, pos) ~= 0 do
            pos = pos + 1
        end

        if pos > len then break end

        local key = string.sub(data, key_start, pos - 1)
        pos = pos + 1 -- skip null

        if key == "" then break end -- end of parameters

        -- Read value
        local val_start = pos
        while pos <= len and string.byte(data, pos) ~= 0 do
            pos = pos + 1
        end

        local value = string.sub(data, val_start, pos - 1)
        pos = pos + 1 -- skip null

        params[key] = value
    end

    return params
end

-- Main routing function called by HAProxy
function pg_route(txn)
    -- Get data from the request buffer
    local data = txn.req:data(0)

    if not data or #data == 0 then
        -- No data yet, need to wait
        return
    end

    local params = parse_startup_packet(data)

    if not params then
        core.Warning("Failed to parse PostgreSQL startup packet")
        txn:set_var("txn.pg_blocked", true)
        return
    end

    -- Handle SSL request - route to default pool for SSL negotiation
    -- PostgreSQL will respond 'N' (no SSL), then client retries without SSL
    if params.ssl_request then
        core.Info("SSL request received, routing to SSL negotiation pool")
        -- Route to SSL pool backend which forwards to any available PostgreSQL
        -- This allows SSL negotiation to complete
        txn:set_var("txn.pg_backend", "pg_ssl_pool")
        return
    end

    -- Handle cancel request
    if params.cancel_request then
        txn:set_var("txn.pg_blocked", true)
        return
    end

    local username = params["user"]
    if not username then
        core.Warning("No username in PostgreSQL startup packet")
        txn:set_var("txn.pg_blocked", true)
        return
    end

    -- Load tenant access configuration
    local access = load_tenant_access()

    -- Check if tenant has external access enabled
    if not access[username] then
        core.Warning("Access denied: " .. username)
        txn:set_var("txn.pg_blocked", true)
        return
    end

    -- Route to tenant's backend
    local backend = "pgs_" .. username
    core.Info("Routing user " .. username .. " to backend " .. backend)
    txn:set_var("txn.pg_backend", backend)
end

-- Register the action with HAProxy
core.register_action("pg_route", { "tcp-req" }, pg_route, 0)