archsight 0.1.2 → 0.1.3

Files changed (58)
  1. checksums.yaml +4 -4
  2. data/README.md +26 -5
  3. data/lib/archsight/analysis/executor.rb +112 -0
  4. data/lib/archsight/analysis/result.rb +174 -0
  5. data/lib/archsight/analysis/sandbox.rb +319 -0
  6. data/lib/archsight/analysis.rb +11 -0
  7. data/lib/archsight/annotations/architecture_annotations.rb +2 -2
  8. data/lib/archsight/cli.rb +163 -0
  9. data/lib/archsight/database.rb +6 -2
  10. data/lib/archsight/helpers/analysis_renderer.rb +83 -0
  11. data/lib/archsight/helpers/formatting.rb +95 -0
  12. data/lib/archsight/helpers.rb +20 -4
  13. data/lib/archsight/import/concurrent_progress.rb +341 -0
  14. data/lib/archsight/import/executor.rb +466 -0
  15. data/lib/archsight/import/git_analytics.rb +626 -0
  16. data/lib/archsight/import/handler.rb +263 -0
  17. data/lib/archsight/import/handlers/github.rb +161 -0
  18. data/lib/archsight/import/handlers/gitlab.rb +202 -0
  19. data/lib/archsight/import/handlers/jira_base.rb +189 -0
  20. data/lib/archsight/import/handlers/jira_discover.rb +161 -0
  21. data/lib/archsight/import/handlers/jira_metrics.rb +179 -0
  22. data/lib/archsight/import/handlers/openapi_schema_parser.rb +279 -0
  23. data/lib/archsight/import/handlers/repository.rb +439 -0
  24. data/lib/archsight/import/handlers/rest_api.rb +293 -0
  25. data/lib/archsight/import/handlers/rest_api_index.rb +183 -0
  26. data/lib/archsight/import/progress.rb +91 -0
  27. data/lib/archsight/import/registry.rb +54 -0
  28. data/lib/archsight/import/shared_file_writer.rb +67 -0
  29. data/lib/archsight/import/team_matcher.rb +195 -0
  30. data/lib/archsight/import.rb +14 -0
  31. data/lib/archsight/resources/analysis.rb +91 -0
  32. data/lib/archsight/resources/application_component.rb +2 -2
  33. data/lib/archsight/resources/application_service.rb +12 -12
  34. data/lib/archsight/resources/business_product.rb +12 -12
  35. data/lib/archsight/resources/data_object.rb +1 -1
  36. data/lib/archsight/resources/import.rb +79 -0
  37. data/lib/archsight/resources/technology_artifact.rb +23 -2
  38. data/lib/archsight/version.rb +1 -1
  39. data/lib/archsight/web/api/docs.rb +17 -0
  40. data/lib/archsight/web/api/json_helpers.rb +164 -0
  41. data/lib/archsight/web/api/openapi/spec.yaml +500 -0
  42. data/lib/archsight/web/api/routes.rb +101 -0
  43. data/lib/archsight/web/application.rb +66 -43
  44. data/lib/archsight/web/doc/import.md +458 -0
  45. data/lib/archsight/web/doc/index.md.erb +1 -0
  46. data/lib/archsight/web/public/css/artifact.css +10 -0
  47. data/lib/archsight/web/public/css/graph.css +14 -0
  48. data/lib/archsight/web/public/css/instance.css +489 -0
  49. data/lib/archsight/web/views/api_docs.erb +19 -0
  50. data/lib/archsight/web/views/partials/artifact/_project_estimate.haml +14 -8
  51. data/lib/archsight/web/views/partials/instance/_analysis_detail.haml +74 -0
  52. data/lib/archsight/web/views/partials/instance/_analysis_result.haml +64 -0
  53. data/lib/archsight/web/views/partials/instance/_detail.haml +7 -3
  54. data/lib/archsight/web/views/partials/instance/_import_detail.haml +87 -0
  55. data/lib/archsight/web/views/partials/instance/_relations.haml +4 -4
  56. data/lib/archsight/web/views/partials/layout/_content.haml +4 -0
  57. data/lib/archsight/web/views/partials/layout/_navigation.haml +6 -5
  58. metadata +78 -1
@@ -0,0 +1,458 @@
1
+ # Import System
2
+
3
+ The import system allows you to declaratively define data imports that generate architecture resources from external sources like GitLab, GitHub, and git repositories.
4
+
5
+ ## Overview
6
+
7
+ Imports are defined as YAML resources with kind `Import`. Each import specifies:
8
+
9
+ - A **handler** that knows how to fetch and process data
10
+ - **Configuration** specific to that handler
11
+ - Optional **caching** to avoid re-running unchanged imports
12
+
13
+ Dependencies between imports are automatically derived from the `generates` relation. When a parent import generates child imports, those children depend on the parent and will run after it completes.
14
+
15
+ The import executor runs imports concurrently in dependency order, with a visual progress display showing completion percentage and ETA.
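
As an illustration of how an execution order can be derived from `generates` relations, here is a minimal Ruby sketch using the standard library's `TSort`. The `ImportGraph` class and its input hash are hypothetical and are not taken from archsight's executor. The sketch also produces a serial order only, whereas the real executor runs imports concurrently.

```ruby
require "tsort"

# Hypothetical helper: orders imports so that every parent runs before
# the child imports it generates.
class ImportGraph
  include TSort

  # generates: { "parent import name" => ["child import name", ...] }
  def initialize(generates)
    @deps = Hash.new { |hash, key| hash[key] = [] }
    generates.each do |parent, children|
      @deps[parent] # register the parent even if it has no dependencies
      children.each { |child| @deps[child] << parent }
    end
  end

  # TSort hook: enumerate every import name.
  def tsort_each_node(&block)
    @deps.each_key(&block)
  end

  # TSort hook: enumerate the imports that `name` depends on.
  def tsort_each_child(name, &block)
    @deps.fetch(name).each(&block)
  end
end

graph = ImportGraph.new(
  "Import:GitLab" => ["Import:Repo:gitlab:company:my-service"]
)
ORDER = graph.tsort
puts ORDER
```

`TSort#tsort` emits dependencies before the nodes that depend on them and raises `TSort::Cyclic` when the relations form a cycle, which is one way a circular dependency chain could be detected.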
16
+
17
+ ## Running Imports
18
+
19
+ ```bash
20
+ # Run all pending imports
21
+ archsight import
22
+
23
+ # Verbose output
24
+ archsight import -v
25
+
26
+ # Show execution plan without running
27
+ archsight import --dry-run
28
+
29
+ # Force re-run all imports (ignore cache)
30
+ archsight import --force
31
+ ```
32
+
33
+ ### Progress Display
34
+
35
+ In TTY mode, imports show a live progress display:
36
+
37
+ ```
38
+ Overall ████████████░░░░░░░░ 60% [30/50] ETA: 2:15
39
+ Import:Repo:project-a - Analyzing code
40
+ Import:Repo:project-b - Cloning repository
41
+ Import:Repo:project-c - Done
42
+ ```
43
+
44
+ ## Defining an Import
45
+
46
+ Create YAML files in the `imports/` directory:
47
+
48
+ ```yaml
49
+ apiVersion: architecture/v1alpha1
50
+ kind: Import
51
+ metadata:
52
+ name: Import:MyData
53
+ annotations:
54
+ import/handler: <handler-name>
55
+ import/priority: "10"
56
+ import/cacheTime: "24h"
57
+ import/config/key: value
58
+ spec: {}
59
+ ```
60
+
61
+ ### Core Annotations
62
+
63
+ | Annotation | Description |
64
+ |------------|-------------|
65
+ | `import/handler` | Handler to execute (required) |
66
+ | `import/enabled` | Set to "false" to disable |
67
+ | `import/priority` | Execution order (lower runs first, default: 0) |
68
+ | `import/cacheTime` | Cache duration: "30m", "1h", "24h", "7d", or "never" |
69
+ | `import/outputPath` | Output file path relative to resources directory |
70
+
71
+ ### Caching
72
+
73
+ Imports can be cached so they are not re-run until a configured time window has elapsed:
74
+
75
+ ```yaml
76
+ import/cacheTime: "24h" # Re-run after 24 hours
77
+ ```
78
+
79
+ Supported duration formats:
80
+ - `30m` - 30 minutes
81
+ - `1h` - 1 hour
82
+ - `24h` - 24 hours
83
+ - `7d` - 7 days
84
+ - `never` - Always run (default)
85
+
86
+ The cache uses the `generated/at` annotation written by handlers. When an import completes, it writes a marker with the current timestamp. On subsequent runs, the executor checks if `generated/at + cacheTime > now` to skip cached imports.
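
The check described above can be sketched in Ruby as follows. This is an assumption-laden illustration, not archsight's actual code: `parse_cache_time` and `cached?` are hypothetical names, and only the duration formats listed above are handled.

```ruby
require "time"

# Seconds per supported unit suffix.
UNIT_SECONDS = { "m" => 60, "h" => 3600, "d" => 86_400 }.freeze

# Hypothetical helper: turns "30m", "1h", "24h", "7d" into seconds.
# Returns nil for "never" or any unrecognized value (meaning: do not cache).
def parse_cache_time(value)
  match = /\A(\d+)([mhd])\z/.match(value.to_s)
  return nil unless match

  match[1].to_i * UNIT_SECONDS.fetch(match[2])
end

# True when generated_at + cacheTime is still in the future,
# i.e. the import can be skipped.
def cached?(generated_at, cache_time, now: Time.now.utc)
  seconds = parse_cache_time(cache_time)
  return false if seconds.nil? || generated_at.nil?

  Time.iso8601(generated_at) + seconds > now
end

now = Time.utc(2025, 1, 2, 12, 0, 0)
puts cached?("2025-01-02T00:00:00Z", "24h", now: now)   # true: 12 hours old
puts cached?("2025-01-01T00:00:00Z", "24h", now: now)   # false: 36 hours old
puts cached?("2025-01-02T00:00:00Z", "never", now: now) # false: never cached
```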
87
+
88
+ ### Configuration Pattern
89
+
90
+ Handler-specific configuration uses `import/config/*` annotations:
91
+
92
+ ```yaml
93
+ import/config/host: gitlab.company.com
94
+ import/config/fallbackTeam: "Team:Platform"
95
+ ```
96
+
97
+ ## Available Handlers
98
+
99
+ ### gitlab
100
+
101
+ Lists repositories from a GitLab instance and generates child Import resources.
102
+
103
+ **Configuration:**
104
+ - `host` - GitLab host (required)
105
+ - `exploreGroups` - If "true", explore all visible groups (default: false)
106
+ - `repoOutputPath` - Output path for repository handler results
107
+ - `fallbackTeam` - Default team when no contributor match found
108
+ - `botTeam` - Team for bot-only repositories
109
+
110
+ **Environment:**
111
+ - `GITLAB_TOKEN` - Personal access token (required)
112
+
113
+ **Output:** Generates `Import:Repo:gitlab:*` resources for each repository.
114
+
115
+ ### github
116
+
117
+ Lists repositories from a GitHub organization and generates child Import resources.
118
+
119
+ **Configuration:**
120
+ - `org` - GitHub organization (required)
121
+ - `repoOutputPath` - Output path for repository handler results
122
+
123
+ **Environment:**
124
+ - `GITHUB_TOKEN` - GitHub Personal Access Token (required)
125
+ - Create at: https://github.com/settings/tokens
126
+ - Required scopes: `repo` (private repos) or `public_repo` (public only)
127
+ - If you have `gh` CLI authenticated: `export GITHUB_TOKEN=$(gh auth token)`
128
+
129
+ **Output:** Generates `Import:Repo:github:*` resources for each repository.
130
+
131
+ ### repository
132
+
133
+ Analyzes a single git repository and generates a TechnologyArtifact resource.
134
+
135
+ **Configuration:**
136
+ - `path` - Local repository path (required)
137
+ - `gitUrl` - Git URL to clone from (if not already cloned)
138
+ - `archived` - If "true", mark as archived
139
+ - `visibility` - Repository visibility: internal, public, open-source
140
+ - `sccPath` - Path to scc binary (default: scc)
141
+ - `fallbackTeam` - Default team when no contributor match found
142
+ - `botTeam` - Team for bot-only repositories
143
+
144
+ **Output:** Generates one TechnologyArtifact resource with:
145
+ - Code analysis metrics (languages, LOC, estimated cost)
146
+ - Git activity metrics (commits, contributors, bus factor)
147
+ - Team matching based on contributor history
148
+ - Deployment artifact detection (containers, charts, etc.)
149
+ - Agentic tool detection (Claude, Cursor, etc.)
150
+
151
+ **Special Cases:**
152
+
153
+ The repository handler creates minimal artifacts for repositories that can't be fully analyzed:
154
+
155
+ | Status | Reason |
156
+ |--------|--------|
157
+ | `inaccessible` | Clone failed due to access denied or auth errors |
158
+ | `empty` | Repository has no commits |
159
+ | `no-code` | No analyzable source code (only config, docs, etc.) |
160
+
161
+ These artifacts include `activity/status` and `activity/reason` annotations documenting why full analysis wasn't possible.
162
+
163
+ ### rest-api-index
164
+
165
+ Fetches an API index JSON and generates child Import resources for each API.
166
+
167
+ **Configuration:**
168
+ - `indexUrl` - URL to fetch API index JSON (required)
169
+ - `baseUrl` - Base URL for spec files (optional, derived from indexUrl)
170
+ - `interfaceOutputPath` - Shared output path for ApplicationInterface resources
171
+ - `dataObjectOutputPath` - Shared output path for DataObject resources
172
+ - `skipVisibility` - Comma-separated visibilities to skip (e.g., "public-preview,beta")
173
+
174
+ **Expected Index Format:**
175
+ ```json
176
+ [
177
+ {
178
+ "name": "compute",
179
+ "version": "6.0",
180
+ "visibility": "private",
181
+ "specPath": "/rest-api/compute/openapi.yaml",
182
+ "redocPath": "/rest-api/compute/redoc.html",
183
+ "gate": "GA"
184
+ }
185
+ ]
186
+ ```
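
To make the index format concrete, the following Ruby sketch parses such an index, drops entries whose visibility is listed in `skipVisibility`, and resolves each `specPath` against a base URL. The function name `child_specs` and the shape of its return value are assumptions for illustration and do not describe the handler's real implementation.

```ruby
require "json"
require "uri"

# Hypothetical helper: returns one hash per API that should get a child import.
def child_specs(index_json, base_url:, skip_visibility: "")
  skipped = skip_visibility.split(",").map(&:strip)

  JSON.parse(index_json)
      .reject { |api| skipped.include?(api["visibility"]) }
      .map do |api|
        {
          "name" => api["name"],
          "version" => api["version"],
          # Resolve the index-relative path into a full spec URL.
          "specUrl" => URI.join(base_url, api["specPath"]).to_s
        }
      end
end

INDEX = <<~JSON
  [
    { "name": "compute", "version": "6.0", "visibility": "private",
      "specPath": "/rest-api/compute/openapi.yaml" },
    { "name": "preview", "version": "0.1", "visibility": "public-preview",
      "specPath": "/rest-api/preview/openapi.yaml" }
  ]
JSON

SPECS = child_specs(INDEX, base_url: "https://api.example.com",
                           skip_visibility: "public-preview,beta")
puts SPECS.inspect
```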
187
+
188
+ **Output:** Generates `Import:RestApi:*` resources for each API in the index.
189
+
190
+ ### rest-api
191
+
192
+ Downloads an OpenAPI spec and generates ApplicationInterface and DataObject resources.
193
+
194
+ **Configuration:**
195
+ - `name` - API name (required)
196
+ - `version` - API version (default: "1.0")
197
+ - `visibility` - API visibility (default: "private")
198
+ - `specUrl` - Full URL to OpenAPI spec - http, https, or file:// (required)
199
+ - `htmlUrl` - Full URL to HTML documentation (optional)
200
+ - `gate` - Release gate: "GA", "BETA", etc. (default: "GA")
201
+ - `interfaceOutputPath` - Output path for ApplicationInterface resources
202
+ - `dataObjectOutputPath` - Output path for DataObject resources
203
+
204
+ **Output:**
205
+ - ApplicationInterface resource with detected auth methods (JWT, Basic, API Key, OAuth2, OIDC)
206
+ - DataObject resources extracted from OpenAPI schema definitions
207
+
208
+ **Interface Naming:** `{Visibility}:{ApiName}:v{MajorVersion}:RestAPI`
209
+ - Example: `Private:Compute:v6:RestAPI`
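
A minimal Ruby sketch of this naming pattern, assuming the major version is the part of `version` before the first dot and that visibility and name are simply capitalized. `interface_name` is a hypothetical helper, and how multi-word names or visibilities such as `public-preview` are rendered is not specified here.

```ruby
# Hypothetical helper: builds "{Visibility}:{ApiName}:v{MajorVersion}:RestAPI".
def interface_name(name:, version: "1.0", visibility: "private")
  major = version.to_s.split(".").first
  "#{visibility.capitalize}:#{name.capitalize}:v#{major}:RestAPI"
end

puts interface_name(name: "compute", version: "6.0", visibility: "private")
# → Private:Compute:v6:RestAPI
```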
210
+
211
+ ### Example: REST API Multi-Stage Import
212
+
213
+ ```yaml
214
+ # imports/rest-api.yaml
215
+ apiVersion: architecture/v1alpha1
216
+ kind: Import
217
+ metadata:
218
+ name: Import:RestApi:Index
219
+ annotations:
220
+ import/handler: rest-api-index
221
+ import/priority: "1"
222
+ import/cacheTime: "24h"
223
+ import/config/indexUrl: https://api.example.com/index.json
224
+ import/config/interfaceOutputPath: generated/rest-api-interfaces.yaml
225
+ import/config/dataObjectOutputPath: generated/rest-api-data-objects.yaml
226
+ import/config/skipVisibility: public-preview
227
+ spec: {}
228
+ ```
229
+
230
+ After running, this generates child imports like:
231
+
232
+ ```yaml
233
+ # generated/Import_RestApi_Index.yaml
234
+ ---
235
+ apiVersion: architecture/v1alpha1
236
+ kind: Import
237
+ metadata:
238
+ name: Import:RestApi:compute
239
+ annotations:
240
+ import/handler: rest-api
241
+ import/config/name: compute
242
+ import/config/version: "6.0"
243
+ import/config/visibility: private
244
+ import/config/specUrl: https://api.example.com/rest-api/compute/openapi.yaml
245
+ import/config/htmlUrl: https://api.example.com/rest-api/compute/redoc.html
246
+ import/config/gate: GA
247
+ import/config/interfaceOutputPath: generated/rest-api-interfaces.yaml
248
+ import/config/dataObjectOutputPath: generated/rest-api-data-objects.yaml
249
+ spec: {}
250
+ ---
251
+ # Parent import tracks what it generates
252
+ apiVersion: architecture/v1alpha1
253
+ kind: Import
254
+ metadata:
255
+ name: Import:RestApi:Index
256
+ spec:
257
+ generates:
258
+ imports:
259
+ - Import:RestApi:compute
260
+ ```
261
+
262
+ The dependency of `Import:RestApi:compute` on `Import:RestApi:Index` is automatically derived from the `generates` relation above.
263
+
264
+ ## Multi-Stage Import Pattern
265
+
266
+ Imports can generate other Import resources, enabling multi-stage workflows:
267
+
268
+ 1. **GitLab import** runs and discovers repositories
269
+ 2. Generates `Import:Repo:*` for each repository
270
+ 3. Database reloads, discovers new Import resources
271
+ 4. **Repository imports** run concurrently (up to 20)
272
+ 5. Each generates TechnologyArtifact resources
273
+ 6. Loop continues until no pending imports remain
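
The loop in the steps above can be sketched as follows. This is a simplified, serial illustration under stated assumptions: `run_until_settled` and the `discover` lambda are hypothetical stand-ins for running a handler and reloading the database, and no concurrency limit is modeled.

```ruby
# Hypothetical sketch: run pending imports in rounds until a round
# discovers no new Import resources.
# discover: callable that "runs" an import and returns the names of any
# child imports it generated.
def run_until_settled(pending, discover)
  completed = []
  until pending.empty?
    batch = pending.uniq
    generated = batch.flat_map { |name| discover.call(name) }
    completed.concat(batch)
    # Only imports that have not run yet are pending for the next round.
    pending = generated.uniq - completed
  end
  completed
end

discover = lambda do |name|
  name == "Import:GitLab" ? ["Import:Repo:project-a", "Import:Repo:project-b"] : []
end

RUN_ORDER = run_until_settled(["Import:GitLab"], discover)
puts RUN_ORDER
```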
274
+
275
+ ### Example: GitLab Multi-Stage Import
276
+
277
+ ```yaml
278
+ # imports/gitlab.yaml
279
+ apiVersion: architecture/v1alpha1
280
+ kind: Import
281
+ metadata:
282
+ name: Import:GitLab
283
+ annotations:
284
+ import/handler: gitlab
285
+ import/priority: "1"
286
+ import/cacheTime: "24h"
287
+ import/config/host: gitlab.company.com
288
+ import/config/repoOutputPath: generated/repositories.yaml
289
+ import/config/fallbackTeam: "Team:Platform"
290
+ spec: {}
291
+ ```
292
+
293
+ After running, this generates:
294
+
295
+ ```yaml
296
+ # generated/gitlab-imports.yaml
297
+ ---
298
+ apiVersion: architecture/v1alpha1
299
+ kind: Import
300
+ metadata:
301
+ name: Import:Repo:gitlab:company:my-service
302
+ annotations:
303
+ import/handler: repository
304
+ import/config/path: ~/.cache/archsight/git/gitlab/company/my-service
305
+ import/config/gitUrl: git@gitlab.company.com:company/my-service.git
306
+ import/config/archived: "false"
307
+ import/config/visibility: internal
308
+ import/outputPath: generated/repositories.yaml
309
+ spec: {}
310
+ ---
311
+ # Parent import tracks what it generates
312
+ apiVersion: architecture/v1alpha1
313
+ kind: Import
314
+ metadata:
315
+ name: Import:GitLab
316
+ spec:
317
+ generates:
318
+ imports:
319
+ - Import:Repo:gitlab:company:my-service
320
+ ```
321
+
322
+ The child import depends on `Import:GitLab` because the parent's `generates` relation includes it.
323
+
324
+ ## Generated Annotations
325
+
326
+ ### TechnologyArtifact Annotations
327
+
328
+ Repository analysis generates these annotations:
329
+
330
+ **Repository Info:**
331
+ - `artifact/type` - Always "repo" for repositories
332
+ - `repository/git` - Git URL
333
+ - `repository/visibility` - internal, public, open-source
334
+ - `repository/recentTags` - Recent git tags (releases)
335
+ - `repository/accessible` - "false" if repo couldn't be accessed
336
+ - `repository/error` - Error message for inaccessible repos
337
+
338
+ **Code Metrics (from scc):**
339
+ - `scc/languages` - Comma-separated language list
340
+ - `scc/estimatedCost` - COCOMO cost estimate
341
+ - `scc/estimatedPeople` - Estimated team size
342
+ - `scc/language/*/loc` - Lines of code per language
343
+
344
+ **Activity Metrics:**
345
+ - `activity/status` - active, abandoned, bot-only, archived, inaccessible, empty, no-code
346
+ - `activity/reason` - Explanation for non-standard statuses
347
+ - `activity/commits` - Monthly commit counts (12 months)
348
+ - `activity/contributors` - Monthly contributor counts
349
+ - `activity/contributors/6m` - Unique contributors (6 months)
350
+ - `activity/contributors/total` - Total unique contributors
351
+ - `activity/busFactor` - Risk assessment: high, medium, low
352
+ - `activity/createdAt` - First commit date
353
+ - `activity/lastHumanCommit` - Last non-bot commit date
354
+
355
+ **Deployment Detection:**
356
+ - `repository/artifacts` - Detected artifact types (container, chart, debian, rpm)
357
+ - `deployment/images` - OCI container image names
358
+ - `workflow/platforms` - CI/CD platforms (github-actions, gitlab-ci)
359
+ - `workflow/types` - Workflow types (build, test, deploy, etc.)
360
+ - `agentic/tools` - AI coding tools detected (claude, cursor, aider)
361
+
362
+ ## Troubleshooting
363
+
364
+ ### Deadlock Error
365
+
366
+ If you see "Deadlock: pending imports have unsatisfied dependencies", check that:
367
+
368
+ 1. All dependencies exist as Import resources
369
+ 2. Dependencies don't form a circular chain
370
+ 3. The imports they depend on haven't failed
371
+
372
+ ### Access Denied Errors
373
+
374
+ When a repository can't be cloned due to access issues, the handler creates a minimal artifact with:
375
+ - `activity/status: inaccessible`
376
+ - `repository/accessible: false`
377
+ - `repository/error: <error message>`
378
+
379
+ This allows the import to continue processing other repositories.
380
+
381
+ ### Cached Imports Not Updating
382
+
383
+ If an import isn't re-running when expected:
384
+ 1. Check `import/cacheTime` annotation value
385
+ 2. Check `generated/at` annotation on the Import resource
386
+ 3. Use `archsight import --force` to bypass cache for all imports
387
+ 4. Alternatively, use `import/cacheTime: never` to always re-run a specific import
388
+
389
+ **Note:** The cache is automatically invalidated when:
390
+ - The output file (specified by `import/outputPath`) is deleted or missing
391
+ - The import source YAML file has been modified since the last run
392
+
393
+ ## Creating Custom Handlers
394
+
395
+ To create a new handler:
396
+
397
+ 1. Create a Ruby file in `lib/archsight/import/handlers/`
398
+ 2. Inherit from `Archsight::Import::Handler`
399
+ 3. Implement the `execute` method
400
+ 4. Register with `Registry.register("name", YourHandler)`
401
+
402
+ ```ruby
403
+ # Custom handler - fetches data from a configured URL and generates resources
404
+ class Archsight::Import::Handlers::Custom < Archsight::Import::Handler
405
+ def execute
406
+ # Read configuration
407
+ url = config("url")
408
+
409
+ # Update progress
410
+ progress.update("Fetching data")
411
+
412
+ # Fetch and process data
413
+ data = fetch_data(url)
414
+
415
+ # Generate resources
416
+ resources = data.map { |item| build_resource(item) }
417
+
418
+ # Write output with self-marker for caching
419
+ yaml_content = resources_to_yaml(resources) + YAML.dump(self_marker)
420
+ write_yaml(yaml_content)
421
+ end
422
+
423
+ # Marker for cache timestamp
424
+ def self_marker
425
+ {
426
+ "apiVersion" => "architecture/v1alpha1",
427
+ "kind" => "Import",
428
+ "metadata" => {
429
+ "name" => import_resource.name,
430
+ "annotations" => { "generated/at" => Time.now.utc.iso8601 }
431
+ },
432
+ "spec" => {}
433
+ }
434
+ end
435
+
436
+ private
437
+
438
+ def build_resource(item)
439
+ resource_yaml(
440
+ kind: "TechnologyArtifact",
441
+ name: item["name"],
442
+ annotations: { "custom/field" => item["value"] }
443
+ )
444
+ end
445
+ end
446
+
447
+ Archsight::Import::Registry.register("custom", Archsight::Import::Handlers::Custom)
448
+ ```
449
+
450
+ ### Handler Helper Methods
451
+
452
+ | Method | Description |
453
+ |--------|-------------|
454
+ | `config(key, default:)` | Get configuration value |
455
+ | `progress.update(msg)` | Update progress display |
456
+ | `write_yaml(content)` | Write YAML to output path |
457
+ | `resource_yaml(kind:, name:, ...)` | Build resource hash |
458
+ | `import_yaml(name:, handler:, ...)` | Build child import hash |
@@ -7,6 +7,7 @@ Welcome to the Architecture Documentation System. This documentation covers how
7
7
  - [Tool Guide](/doc/tool) - How to use the architecture tool
8
8
  - [Query Syntax](/doc/search) - Search and filter resources
9
9
  - [Architecture Modeling](/doc/modeling) - How to model your architecture
10
+ - [REST API Documentation](/api/docs) - JSON API for programmatic access
10
11
 
11
12
  ## Resource Model
12
13
 
@@ -129,6 +129,16 @@
129
129
  color: var(--color);
130
130
  }
131
131
 
132
+ .estimate-note {
133
+ margin-left: 0.5rem;
134
+ color: var(--muted-color);
135
+ cursor: help;
136
+ }
137
+
138
+ .estimate-note i {
139
+ font-size: 0.875rem;
140
+ }
141
+
132
142
  /* Activity summary */
133
143
  .activity-summary {
134
144
  display: flex;
@@ -104,3 +104,17 @@
104
104
  fill: #666666 !important;
105
105
  }
106
106
  }
107
+
108
+
109
+ /* ===== Graph Too Large Message ===== */
110
+
111
+ .graph-too-large {
112
+ color: var(--pico-muted-color);
113
+ font-size: 0.875rem;
114
+ margin: 0.5rem 0;
115
+ }
116
+
117
+ .graph-too-large i {
118
+ margin-right: 0.25rem;
119
+ opacity: 0.7;
120
+ }