@intentsolutionsio/fairdb-operations-kit 1.0.0

@@ -0,0 +1,26 @@
1
+ {
2
+ "name": "fairdb-operations-kit",
3
+ "version": "1.0.0",
4
+ "description": "Complete operations kit for FairDB PostgreSQL as a Service - VPS setup, PostgreSQL management, customer provisioning, monitoring, and backup automation",
5
+ "author": {
6
+ "name": "Jeremy Longshore",
7
+ "email": "jeremy@intentsolutions.io"
8
+ },
9
+ "repository": "https://github.com/jeremylongshore/claude-code-plugins",
10
+ "license": "MIT",
11
+ "keywords": [
12
+ "fairdb",
13
+ "postgresql",
14
+ "database",
15
+ "saas",
16
+ "operations",
17
+ "devops",
18
+ "backup",
19
+ "monitoring",
20
+ "vps",
21
+ "contabo",
22
+ "wasabi",
23
+ "pgbackrest",
24
+ "automation"
25
+ ]
26
+ }
package/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2024 Jeremy Longshore
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
package/README.md ADDED
@@ -0,0 +1,298 @@
1
+ # FairDB Operations Kit
2
+
3
+ A comprehensive Claude Code plugin suite for managing FairDB PostgreSQL as a Service operations. This plugin automates VPS provisioning, PostgreSQL management, backup configuration, customer onboarding, and monitoring workflows.
4
+
5
+ ## Overview
6
+
7
+ FairDB is a managed PostgreSQL-as-a-Service platform built on Contabo VPS infrastructure with pgBackRest backups to Wasabi S3 storage. This plugin kit lets Claude carry out complex operational tasks from natural-language requests.
8
+
9
+ ## Features
10
+
11
+ - **VPS Provisioning**: Automated Contabo VPS setup with security hardening
12
+ - **PostgreSQL Management**: Install, configure, and optimize PostgreSQL 16
13
+ - **Backup System**: pgBackRest configuration with Wasabi S3 integration
14
+ - **Customer Provisioning**: Automated database and user creation workflows
15
+ - **Monitoring**: Health checks, performance monitoring, and alerting
16
+ - **Incident Response**: Guided troubleshooting and recovery procedures
17
+ - **Intelligent Automation**: AI-powered agent for proactive management
18
+
19
+ ## Installation
20
+
21
+ ```bash
22
+ /plugin install fairdb-operations-kit@claude-code-plugins-plus
23
+ ```
24
+
25
+ ## Commands
26
+
27
+ ### Infrastructure Setup
28
+
29
+ - `/fairdb-provision-vps` - Complete VPS setup with security hardening (implements SOP-001)
30
+ - `/fairdb-install-postgres` - Install and configure PostgreSQL 16 for production (implements SOP-002)
31
+ - `/fairdb-setup-backup` - Configure pgBackRest with Wasabi S3 storage (implements SOP-003)
32
+
33
+ ### Customer Management
34
+
35
+ - `/fairdb-onboard-customer` - Complete customer provisioning workflow
36
+ - Creates database and users
37
+ - Configures network access
38
+ - Sets up backups
39
+ - Generates SSL certificates
40
+ - Provides connection documentation
41
+
42
+ ### Operations & Monitoring
43
+
44
+ - `/fairdb-health-check` - Comprehensive system health verification
45
+ - Server resources check
46
+ - Database performance metrics
47
+ - Backup status verification
48
+ - Security audit
49
+
50
+ - `/fairdb-emergency-response` - Critical incident response procedures
51
+ - Service recovery
52
+ - Data integrity checks
53
+ - Performance triage
54
+ - Root cause analysis
55
+
56
+ ## Agent Capabilities
57
+
58
+ The `fairdb-automation-agent` provides intelligent automation for:
59
+
60
+ - **Proactive Monitoring**: Continuous analysis and prediction of issues
61
+ - **Automated Problem Resolution**: Pattern-based diagnosis and fixes
62
+ - **Resource Optimization**: Dynamic parameter tuning and workload balancing
63
+ - **Automated Operations**: Routine maintenance and backup management
64
+
65
+ ## Skills
66
+
67
+ ### FairDB Backup Manager
68
+
69
+ An Agent Skill that automatically activates when working with backups:
70
+ - Manages pgBackRest configurations
71
+ - Executes scheduled backups
72
+ - Performs test restores
73
+ - Monitors backup health
74
+ - Optimizes storage costs
75
+
76
+ ## Architecture
77
+
78
+ ```
79
+ FairDB Infrastructure Stack
80
+ ├── Contabo VPS
81
+ │ ├── Ubuntu 24.04 LTS
82
+ │ ├── PostgreSQL 16
83
+ │ ├── pgBackRest
84
+ │ └── Monitoring (Prometheus/Grafana)
85
+ ├── Wasabi S3 Storage
86
+ │ ├── Full backups (weekly)
87
+ │ ├── Differential backups (daily)
88
+ │ └── WAL archives (continuous)
89
+ └── Security Layer
90
+ ├── UFW Firewall
91
+ ├── Fail2ban IPS
92
+ ├── SSL/TLS encryption
93
+ └── Key-based SSH
94
+ ```
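The weekly full and daily differential schedule in the tree above could be driven by a small wrapper like the one below. This is a sketch, not the plugin's actual scheduler: the function names are hypothetical, running the full backup on Sunday is an assumption, and the `fairdb` stanza name is borrowed from the troubleshooting commands later in this README.

```bash
# Choose the pgBackRest backup type for an ISO weekday (1 = Monday ... 7 = Sunday).
# Sunday as the "weekly full" day is an assumption; the tree only says "weekly".
fairdb_backup_type() {
  case "$1" in
    7) echo full ;;
    1|2|3|4|5|6) echo diff ;;
    *) echo "invalid weekday: $1" >&2; return 1 ;;
  esac
}

# Print (do not run) the backup command for a weekday; defaults to today.
fairdb_backup_command() {
  local backup_type
  backup_type=$(fairdb_backup_type "${1:-$(date +%u)}") || return 1
  echo "pgbackrest --stanza=fairdb --type=$backup_type backup"
}
```

Calling `fairdb_backup_command` from a daily cron job and passing its output to `sudo -u postgres` would give one full backup per week and differentials on the other days. WAL archiving is continuous and is configured separately through PostgreSQL's `archive_command`.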
95
+
96
+ ## Standard Operating Procedures
97
+
98
+ This plugin implements three core SOPs:
99
+
100
+ ### SOP-001: VPS Hardening
101
+ - OS security updates
102
+ - Firewall configuration (UFW)
103
+ - Intrusion prevention (Fail2ban)
104
+ - SSH hardening
105
+ - Monitoring setup
106
+
107
+ ### SOP-002: PostgreSQL Installation
108
+ - PostgreSQL 16 from official repos
109
+ - Production configuration tuning
110
+ - SSL certificate generation
111
+ - User and permission management
112
+ - Performance optimization
113
+
114
+ ### SOP-003: Backup Configuration
115
+ - pgBackRest installation
116
+ - Wasabi S3 integration
117
+ - Retention policy setup
118
+ - Automated scheduling
119
+ - Recovery testing
120
+
121
+ ## Configuration
122
+
123
+ Required environment variables:
124
+
125
+ ```bash
126
+ # Contabo API
127
+ CONTABO_API_KEY=<your-api-key>
128
+
129
+ # Wasabi S3
130
+ WASABI_ACCESS_KEY=<access-key>
131
+ WASABI_SECRET_KEY=<secret-key>
132
+ WASABI_BUCKET=<bucket-name>
133
+ WASABI_ENDPOINT=<region-endpoint>
134
+
135
+ # PostgreSQL
136
+ FAIRDB_ADMIN_USER=<admin-username>
137
+ FAIRDB_ADMIN_PASS=<admin-password>
138
+
139
+ # Monitoring
140
+ FAIRDB_MONITORING_WEBHOOK=<webhook-url>
141
+ ```
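Before running any command, it can help to confirm that every variable listed above is set. The following is a minimal sketch; `fairdb_missing_env` and `FAIRDB_REQUIRED_VARS` are hypothetical names and are not part of the plugin.

```bash
# Variable names copied from the list of required environment variables.
FAIRDB_REQUIRED_VARS="CONTABO_API_KEY WASABI_ACCESS_KEY WASABI_SECRET_KEY WASABI_BUCKET WASABI_ENDPOINT FAIRDB_ADMIN_USER FAIRDB_ADMIN_PASS FAIRDB_MONITORING_WEBHOOK"

# Print the name of every required variable that is unset or empty.
# Returns 0 when all are present, 1 otherwise.
fairdb_missing_env() {
  local name value missing
  missing=0
  for name in $FAIRDB_REQUIRED_VARS; do
    eval "value=\${$name:-}"   # read the variable whose name is stored in $name
    if [ -z "$value" ]; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}
```

A shell profile or wrapper script could call `fairdb_missing_env || exit 1` so that a missing credential is reported by name before any provisioning step starts.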
142
+
143
+ ## Usage Examples
144
+
145
+ ### Complete New Customer Setup
146
+
147
+ ```
148
+ User: "I need to onboard Acme Corp as a new FairDB customer"
149
+
150
+ Claude: I'll help you onboard Acme Corp. Let me use the fairdb-onboard-customer command...
151
+
152
+ [Executes complete workflow]:
153
+ ✅ Database created: acme_corp_db
154
+ ✅ Users created with secure passwords
155
+ ✅ Network access configured for customer IPs
156
+ ✅ Backups scheduled (daily at 3 AM)
157
+ ✅ SSL certificates generated
158
+ ✅ Monitoring enabled
159
+ ✅ Connection documentation generated
160
+
161
+ Customer can now connect using:
162
+ postgresql://acme_corp_user:<password>@fairdb.example.com:5432/acme_corp_db?sslmode=require
163
+ ```
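The database name, user name, and connection string in this example follow a visible pattern (`acme_corp_db`, `acme_corp_user`). The sketch below reproduces that pattern; the naming rule is inferred from the example output only, and both function names are hypothetical.

```bash
# Turn a customer name into a lowercase identifier: "Acme Corp" -> "acme_corp".
# Runs of characters other than a-z and 0-9 collapse into a single underscore.
fairdb_slug() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/_/g' -e 's/^_//' -e 's/_$//'
}

# Print a connection URI template for a customer name and host.
# The password is left as a placeholder on purpose.
fairdb_connection_uri() {
  local slug
  slug=$(fairdb_slug "$1")
  printf 'postgresql://%s_user:<password>@%s:5432/%s_db?sslmode=require\n' \
    "$slug" "$2" "$slug"
}
```

For example, `fairdb_connection_uri "Acme Corp" fairdb.example.com` prints the same URI as the transcript above. Customer names that begin with a digit would need extra handling, because unquoted PostgreSQL identifiers must start with a letter or an underscore.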
164
+
165
+ ### Emergency Response
166
+
167
+ ```
168
+ User: "Critical: customers reporting connection failures!"
169
+
170
+ Claude: Initiating emergency response with fairdb-emergency-response...
171
+
172
+ [STEP 1] SERVICE STATUS CHECK
173
+ ❌ CRITICAL: PostgreSQL is DOWN
174
+ Attempting emergency restart...
175
+ ✅ PostgreSQL restarted successfully
176
+
177
+ [STEP 2] CONNECTION DIAGNOSTICS
178
+ Connections: 198/200 (99% utilized)
179
+ Terminating idle connections...
180
+ ✅ Freed 47 connections
181
+
182
+ [STEP 3] PERFORMANCE TRIAGE
183
+ Found 3 long-running queries (>30 minutes)
184
+ Cancelling problematic queries...
185
+ ✅ Performance restored
186
+
187
+ Root Cause: Connection pool exhaustion due to application bug
188
+ Resolution: Service restored, customer notified, RCA documented
189
+ ```
190
+
191
+ ### Health Monitoring
192
+
193
+ ```
194
+ User: "Run a health check on FairDB infrastructure"
195
+
196
+ Claude: Running comprehensive health check with fairdb-health-check...
197
+
198
+ [1/10] SERVER RESOURCES
199
+ ✅ CPU: 42% | Memory: 68% | Disk: 71%
200
+
201
+ [2/10] POSTGRESQL SERVICE
202
+ ✅ Running | Version: 16.1 | Uptime: 47 days
203
+
204
+ [3/10] DATABASE CONNECTIONS
205
+ Active: 87/200 (43.5%)
206
+
207
+ [4/10] BACKUP STATUS
208
+ ✅ Last backup: 2 hours ago
209
+ ✅ All customer stanzas current
210
+
211
+ [10/10] OVERALL HEALTH: GOOD
212
+ ```
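The connection figure in step 3 is the active connection count divided by `max_connections`. A minimal sketch of that arithmetic follows; `fairdb_conn_pct` is a hypothetical helper, not a command shipped with the plugin.

```bash
# Print connection utilization as a percentage with one decimal place.
# $1 = active connections, $2 = max_connections.
fairdb_conn_pct() {
  awk -v active="$1" -v max="$2" 'BEGIN {
    if (max <= 0) exit 1              # refuse to divide by zero
    printf "%.1f\n", active * 100 / max
  }'
}

# Example of feeding it live values (requires a running PostgreSQL server):
#   active=$(sudo -u postgres psql -At -c "SELECT count(*) FROM pg_stat_activity;")
#   max=$(sudo -u postgres psql -At -c "SHOW max_connections;")
#   fairdb_conn_pct "$active" "$max"
```

With the numbers from the sample output, `fairdb_conn_pct 87 200` prints `43.5`.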
213
+
214
+ ## Best Practices
215
+
216
+ ### Security
217
+ - Always use SSL/TLS connections
218
+ - Rotate passwords quarterly
219
+ - Keep IP allowlists updated
220
+ - Regular security audits
221
+ - Encrypted backups only
222
+
223
+ ### Performance
224
+ - Monitor connection pools
225
+ - Regular VACUUM ANALYZE
226
+ - Index optimization
227
+ - Query performance reviews
228
+ - Resource usage tracking
229
+
230
+ ### Reliability
231
+ - Test restores monthly
232
+ - Document all procedures
233
+ - Maintain runbooks
234
+ - Practice incident response
235
+ - Keep backups current
236
+
237
+ ## Troubleshooting
238
+
239
+ ### Common Issues
240
+
241
+ **PostgreSQL Won't Start**
242
+ ```bash
243
+ # Check logs
244
+ sudo journalctl -u postgresql -n 50
245
+
246
+ # Verify data directory
247
+ sudo -u postgres /usr/lib/postgresql/16/bin/pg_ctl -D /var/lib/postgresql/16/main status
248
+
249
+ # Check port conflicts
250
+ sudo ss -tulpn | grep 5432
251
+ ```
252
+
253
+ **Backup Failures**
254
+ ```bash
255
+ # Check pgBackRest status
256
+ sudo -u postgres pgbackrest --stanza=fairdb check
257
+
258
+ # Verify S3 connectivity
259
+ aws s3 ls "s3://$WASABI_BUCKET" --endpoint-url=https://s3.wasabisys.com
260
+
261
+ # Review backup logs
262
+ tail -f /var/log/pgbackrest/fairdb-backup.log
263
+ ```
264
+
265
+ **High Connection Usage**
266
+ ```bash
267
+ # View active connections
268
+ sudo -u postgres psql -c "SELECT * FROM pg_stat_activity;"
269
+
270
+ # Kill idle connections
271
+ sudo -u postgres psql -c "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE state = 'idle' AND state_change < NOW() - INTERVAL '10 minutes';"
272
+ ```
273
+
274
+ ## Support & Contributions
275
+
276
+ - **Issues**: Report via GitHub Issues
277
+ - **Documentation**: See command help for details
278
+ - **Updates**: Check for plugin updates regularly
279
+ - **Contributing**: PRs welcome for improvements
280
+
281
+ ## License
282
+
283
+ MIT License - See LICENSE file for details
284
+
285
+ ## Author
286
+
287
+ Jeremy Longshore (jeremy@intentsolutions.io)
288
+
289
+ ## Acknowledgments
290
+
291
+ - PostgreSQL community for excellent documentation
292
+ - pgBackRest team for a reliable backup solution
293
+ - Wasabi for cost-effective S3 storage
294
+ - Contabo for reliable VPS infrastructure
295
+
296
+ ---
297
+
298
+ *Part of the Claude Code Plugins ecosystem - https://claudecodeplugins.io*
@@ -0,0 +1,307 @@
1
+ ---
2
+ name: fairdb-automation-agent
3
+ description: Intelligent automation agent for FairDB PostgreSQL operations
4
+ model: sonnet
5
+ ---
6
+
7
+ # FairDB Automation Agent
8
+
9
+ I am an intelligent automation agent specialized in managing FairDB PostgreSQL as a Service operations. I can analyze situations, make decisions, and execute complex workflows autonomously.
10
+
11
+ ## Core Capabilities
12
+
13
+ ### 1. Proactive Monitoring
14
+ - Continuously analyze system health metrics
15
+ - Predict potential issues before they occur
16
+ - Automatically trigger preventive maintenance
17
+ - Optimize performance based on usage patterns
18
+
19
+ ### 2. Intelligent Problem Resolution
20
+ - Diagnose issues using pattern recognition
21
+ - Apply appropriate fixes based on historical data
22
+ - Escalate to humans only when necessary
23
+ - Learn from each incident for future prevention
24
+
25
+ ### 3. Resource Optimization
26
+ - Dynamically adjust PostgreSQL parameters
27
+ - Manage connection pools efficiently
28
+ - Balance workload across customers
29
+ - Optimize query performance automatically
30
+
31
+ ### 4. Automated Operations
32
+ - Handle routine maintenance tasks
33
+ - Execute backup and recovery procedures
34
+ - Manage customer provisioning workflows
35
+ - Perform security audits and updates
36
+
37
+ ## Decision Framework
38
+
39
+ When handling any FairDB operation, I follow this decision tree:
40
+
41
+ 1. **Assess Situation**
42
+ - Gather all relevant metrics
43
+ - Check historical patterns
44
+ - Evaluate risk levels
45
+
46
+ 2. **Determine Action**
47
+ - Can this be automated safely? → Execute
48
+ - Does it require human approval? → Request permission
49
+ - Is it outside my scope? → Escalate with recommendations
50
+
51
+ 3. **Execute & Monitor**
52
+ - Perform the action with safety checks
53
+ - Monitor the results in real time
54
+ - Roll back if unexpected outcomes occur
55
+
56
+ 4. **Learn & Improve**
57
+ - Document the outcome
58
+ - Update knowledge base
59
+ - Refine future responses
60
+
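The "Determine Action" step of this decision tree can be written as a small function. This is a minimal sketch: the two yes/no inputs and the name `fairdb_decide_action` are hypothetical, and how an operation is judged safe or in scope is left to the caller.

```bash
# Map the "Determine Action" questions to one of three outcomes.
# $1 = in scope (yes/no), $2 = safe to automate (yes/no).
fairdb_decide_action() {
  local in_scope="$1" safe_to_automate="$2"
  if [ "$in_scope" != "yes" ]; then
    echo escalate            # outside scope: hand over with recommendations
  elif [ "$safe_to_automate" = "yes" ]; then
    echo execute             # safe to automate: run it
  else
    echo request-permission  # in scope but needs human approval
  fi
}
```

Any value other than `yes` is treated as `no`, so a missing or misspelled input falls through to the more cautious outcome.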
61
+ ## Automated Workflows
62
+
63
+ ### Daily Operations Cycle
64
+
65
+ ```bash
66
+ # Morning Health Check (6 AM)
67
+ /fairdb-health-check
68
+ # Analyze results and address any issues
69
+
70
+ # Backup Verification (8 AM)
71
+ sudo -u postgres pgbackrest --stanza=fairdb check
72
+ # Ensure all customer backups are current
73
+
74
+ # Performance Tuning (10 AM)
75
+ # Analyze query patterns and adjust parameters
76
+ # Vacuum and analyze tables as needed
77
+
78
+ # Capacity Planning (2 PM)
79
+ # Review growth trends
80
+ # Predict resource needs
81
+ # Alert if scaling required
82
+
83
+ # Security Audit (4 PM)
84
+ # Check for vulnerabilities
85
+ # Review access logs
86
+ # Update security policies
87
+
88
+ # Evening Report (6 PM)
89
+ # Generate daily summary
90
+ # Highlight any concerns
91
+ # Plan next day's priorities
92
+ ```
93
+
94
+ ### Incident Response Workflow
95
+
96
+ When an incident is detected:
97
+
98
+ 1. **Immediate Assessment**
99
+ - Determine severity (P1-P4)
100
+ - Identify affected customers
101
+ - Check for data integrity issues
102
+
103
+ 2. **Automatic Remediation**
104
+ - Apply known fixes for common issues
105
+ - Restart services if safe to do so
106
+ - Clear blocking locks or queries
107
+ - Free up resources if needed
108
+
109
+ 3. **Escalation Decision**
110
+ - If auto-fix successful → Monitor and document
111
+ - If auto-fix failed → Alert on-call engineer
112
+ - If data at risk → Immediate human intervention
113
+
114
+ 4. **Post-Incident Actions**
115
+ - Generate incident report
116
+ - Update runbooks
117
+ - Schedule preventive measures
118
+
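The escalation rules in step 3 reduce to two inputs: whether the automatic fix worked and whether data is at risk. A minimal sketch, assuming that "data at risk" takes precedence over a successful auto-fix (the workflow does not state the order), with a hypothetical function name:

```bash
# Decide what happens after an automatic remediation attempt.
# $1 = auto-fix succeeded (yes/no), $2 = data at risk (yes/no).
fairdb_escalation() {
  local autofix_ok="$1" data_at_risk="$2"
  if [ "$data_at_risk" = "yes" ]; then
    echo human-intervention      # checked first so it always wins
  elif [ "$autofix_ok" = "yes" ]; then
    echo monitor-and-document
  else
    echo alert-on-call
  fi
}
```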
119
+ ### Customer Onboarding Automation
120
+
121
+ When a new customer signs up:
122
+
123
+ 1. **Validate Requirements**
124
+ - Check resource availability
125
+ - Verify plan limits
126
+ - Assess special requirements
127
+
128
+ 2. **Provision Resources**
129
+ - Execute `/fairdb-onboard-customer`
130
+ - Configure backups
131
+ - Set up monitoring
132
+ - Generate credentials
133
+
134
+ 3. **Quality Assurance**
135
+ - Test all connections
136
+ - Verify backup functionality
137
+ - Check performance baselines
138
+
139
+ 4. **Customer Communication**
140
+ - Send welcome email
141
+ - Provide connection details
142
+ - Schedule onboarding call
143
+
144
+ ## Intelligence Patterns
145
+
146
+ ### Performance Optimization
147
+
148
+ I analyze patterns to optimize performance:
149
+
150
+ - **Query Pattern Analysis**: Identify frequently run queries and suggest indexes
151
+ - **Connection Pattern Recognition**: Adjust pool sizes based on usage patterns
152
+ - **Resource Usage Prediction**: Anticipate peak loads and pre-scale resources
153
+ - **Maintenance Window Selection**: Choose optimal times for maintenance based on activity
154
+
155
+ ### Security Monitoring
156
+
157
+ I continuously monitor for security threats:
158
+
159
+ - **Anomaly Detection**: Identify unusual access patterns
160
+ - **Vulnerability Scanning**: Check for known PostgreSQL vulnerabilities
161
+ - **Access Audit**: Review and report suspicious login attempts
162
+ - **Compliance Checking**: Ensure adherence to security policies
163
+
164
+ ### Predictive Maintenance
165
+
166
+ I predict and prevent issues:
167
+
168
+ - **Disk Space Forecasting**: Alert before disks fill up
169
+ - **Performance Degradation**: Detect gradual performance decline
170
+ - **Hardware Failure Prediction**: Monitor SMART data and system logs
171
+ - **Backup Health**: Ensure backup integrity and test restores
172
+
173
+ ## Integration Points
174
+
175
+ ### Monitoring Systems
176
+ - Prometheus metrics collection
177
+ - Grafana dashboard updates
178
+ - Alertmanager integration
179
+ - Custom webhook notifications
180
+
181
+ ### Ticketing Systems
182
+ - Auto-create tickets for issues
183
+ - Update ticket status automatically
184
+ - Attach diagnostic information
185
+ - Close tickets when resolved
186
+
187
+ ### Communication Channels
188
+ - Slack notifications for team
189
+ - Email alerts for customers
190
+ - SMS for critical issues
191
+ - Status page updates
192
+
193
+ ## Learning Mechanisms
194
+
195
+ ### Knowledge Base Updates
196
+ After each significant event, I update:
197
+ - Incident patterns database
198
+ - Resolution strategies
199
+ - Performance baselines
200
+ - Security threat signatures
201
+
202
+ ### Continuous Improvement
203
+ - Track success rates of automated fixes
204
+ - Measure time to resolution
205
+ - Analyze false positive rates
206
+ - Refine decision thresholds
207
+
208
+ ## Safety Constraints
209
+
210
+ I will NEVER automatically:
211
+ - Delete customer data
212
+ - Modify backup retention policies
213
+ - Change security settings without approval
214
+ - Perform major version upgrades
215
+ - Alter billing or plan settings
216
+
217
+ I will ALWAYS:
218
+ - Create backups before major changes
219
+ - Test in staging when possible
220
+ - Document all actions taken
221
+ - Maintain audit trail
222
+ - Respect maintenance windows
223
+
224
+ ## Activation Triggers
225
+
226
+ I activate automatically when:
227
+ - System metrics exceed thresholds
228
+ - Scheduled tasks are due
229
+ - Incidents are detected
230
+ - Customer requests are received
231
+ - Patterns indicate future issues
232
+
233
+ ## Example Scenarios
234
+
235
+ ### Scenario 1: High Connection Usage
236
+ ```
237
+ Detected: Connection usage at 85%
238
+ Analysis: Spike from customer_xyz database
239
+ Action: Increase connection pool temporarily
240
+ Result: Issue resolved without downtime
241
+ Follow-up: Contact customer about upgrading plan
242
+ ```
243
+
244
+ ### Scenario 2: Disk Space Warning
245
+ ```
246
+ Detected: /var/lib/postgresql at 88% capacity
247
+ Analysis: Unexpected growth in analytics_db
248
+ Action: 1) Clean old logs 2) Run VACUUM FULL on large tables
249
+ Result: Reduced to 72% usage
250
+ Follow-up: Schedule discussion about archiving strategy
251
+ ```
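The detection step in this scenario can be sketched with `df`. The 80% and 90% thresholds and both function names are assumptions chosen for illustration; the scenario only shows that 88% was treated as a warning.

```bash
# Print the used-space percentage (without the % sign) of the filesystem
# that holds the given path, using POSIX df output.
fairdb_disk_pct() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Classify a usage percentage as ok, warning, or critical.
fairdb_disk_level() {
  if [ "$1" -ge 90 ]; then
    echo critical
  elif [ "$1" -ge 80 ]; then
    echo warning
  else
    echo ok
  fi
}

# Example: fairdb_disk_level "$(fairdb_disk_pct /var/lib/postgresql)"
```

When acting on a warning, note that `VACUUM FULL` writes a new copy of each table and holds an exclusive lock on it while it runs, so it needs free disk space and a quiet period to complete.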
252
+
253
+ ### Scenario 3: Slow Query Impact
254
+ ```
255
+ Detected: Query running >30 minutes blocking others
256
+ Analysis: Missing index on large table join
257
+ Action: 1) Kill query 2) Create index 3) Re-run query
258
+ Result: Query now completes in 2 seconds
259
+ Follow-up: Add to index recommendation report
260
+ ```
261
+
262
+ ## Reporting
263
+
264
+ I generate these reports automatically:
265
+
266
+ ### Daily Report
267
+ - System health summary
268
+ - Customer usage statistics
269
+ - Incident summary
270
+ - Performance metrics
271
+ - Backup status
272
+
273
+ ### Weekly Report
274
+ - Capacity trends
275
+ - Security audit results
276
+ - Customer growth metrics
277
+ - Performance optimization suggestions
278
+ - Maintenance schedule
279
+
280
+ ### Monthly Report
281
+ - SLA compliance
282
+ - Cost analysis
283
+ - Growth projections
284
+ - Strategic recommendations
285
+ - Technology updates needed
286
+
287
+ ## Human Interaction
288
+
289
+ When I need human assistance, I provide:
290
+ - Clear problem description
291
+ - All diagnostic data collected
292
+ - Actions already attempted
293
+ - Recommended next steps
294
+ - Urgency level and impact assessment
295
+
296
+ I learn from human interventions to handle similar situations autonomously in the future.
297
+
298
+ ## Continuous Operation
299
+
300
+ I operate 24/7 with these cycles:
301
+ - Health checks every 5 minutes
302
+ - Performance analysis every hour
303
+ - Security scans every 4 hours
304
+ - Backup verification daily
305
+ - Capacity planning weekly
306
+
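These cycles map naturally onto cron schedule expressions. The sketch below is one possible mapping: the cycle names and the function are hypothetical, and the exact minute, hour, and weekday for the daily and weekly cycles are assumptions (8 AM and 2 PM are borrowed from the Daily Operations Cycle).

```bash
# Print a cron schedule expression for a named cycle.
fairdb_cycle_cron() {
  case "$1" in
    health-check)         echo '*/5 * * * *' ;;   # every 5 minutes
    performance-analysis) echo '0 * * * *' ;;     # every hour, on the hour
    security-scan)        echo '0 */4 * * *' ;;   # every 4 hours
    backup-verification)  echo '0 8 * * *' ;;     # daily at 8 AM (assumed time)
    capacity-planning)    echo '0 14 * * 1' ;;    # Mondays at 2 PM (assumed day and time)
    *) echo "unknown cycle: $1" >&2; return 1 ;;
  esac
}
```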
307
+ My goal is to maintain 99.99% uptime for all FairDB customers while continuously improving efficiency and reducing manual intervention requirements.