delivery-friction-analyzer 0.10.0 → 0.12.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (18)

package/docs/contracts/friction-report.md +7 -2
package/docs/contracts/normalized-entities.md +3 -0
package/docs/reference/github-access-coverage.md +4 -1
package/docs/reference/repository-profile.md +26 -0
package/docs/reference/run-presets.md +91 -0
package/package.json +1 -1
package/release-log.md +16 -0
package/schemas/normalized-entities.schema.json +26 -0
package/schemas/repository-profile.schema.json +23 -0
package/src/cli/analyze-github.js +344 -27
package/src/collect/coverage.js +8 -0
package/src/collect/gh-provider.js +11 -0
package/src/collect/github-source-bundle.js +107 -1
package/src/github/comment-source.js +6 -1
package/src/normalize/github-fixture.js +9 -1
package/src/profile/contributor-source.js +165 -0
package/src/report/evidence-artifacts.js +21 -2
package/src/report/friction-report.js +47 -1

package/docs/contracts/friction-report.md CHANGED

@@ -1,6 +1,6 @@
 # Friction Report Contract
-Milestone 3 introduced `friction-report.v1`, a deterministic report generated from a `friction-metrics.v1` repository metrics summary. Milestones 4 and 5 add Markdown and methodology profile suggestions without adding report JSON fields. The report layer does not fetch GitHub data, mutate repositories, rank individuals, or depend on services beyond the data collection path that produced the metrics summary.
+Milestone 3 introduced `friction-report.v1`, a deterministic report generated from a `friction-metrics.v1` repository metrics summary. Milestone 4 added configured workflow context. Milestone 5 adds sanitized contributor-source metadata for configured `.all-contributorsrc` coverage without raw contributor file contents or individual rankings. The report layer does not fetch GitHub data, mutate repositories, rank individuals, or depend on services beyond the data collection path that produced the metrics summary.
 ## Outputs
@@ -28,6 +28,7 @@ The command reads local `friction-metrics.v1` JSON and writes deterministic `fri
 - `targetRepository`: analyzed repository identity; live analysis sample size is encoded as `targetRepository.analysisPullRequestLimit` from collection metadata.
 - `analysisFilter`: optional metadata for explicit filters applied before metrics computation, including excluded PR classes and before/after PR counts.
 - `configuredWorkflow`: optional user-configured workflow context from the repository profile. It is not observed GitHub evidence and does not change scoring, ranking, CSV exports, or PR class matching.
+- `contributorSource`: optional sanitized contributor-source metadata from the repository profile and collection path. It records source type, path, coverage status, parsed hint count, and guardrail note. It does not include raw contributor file contents or contributor rankings.
 - `summary`: repository totals and top bottleneck identifiers.
 - `coverage`: PR-open diff, workflow-run, and review-thread coverage counts plus caveats.
 - `commentSources`: total and source-grouped review comments for Copilot, human, bot, scanner, author replies, and unknown sources.
@@ -57,6 +58,7 @@ The Markdown renderer presents the same report data for human review:
 - a compact recommendation-category snapshot before detailed bottlenecks, with the full category reference retained later in the report;
 - a short "How To Read This Report" guide that distinguishes observed evidence, interpretation, recommendations, and caveats;
 - a configured workflow context section only when repository profile workflow fields are present, labeled as user-configured profile context rather than observed GitHub evidence;
+- a contributor source context section only when a contributor source is configured, labeled as metadata that may improve comment-source classification coverage without changing scores, authorship conclusions, reviewer attribution, CSV export shape, person-level CSV output, or individual ranking guardrails;
 - workflow data caveats when configured workflow context clarifies unavailable PR-open diff or workflow-run evidence;
 - evidence-quality and coverage tables before detailed recommendations;
 - key findings that highlight top bottlenecks, strongest displayed signal, outlier caveats, PR class caveats, and coverage caveats;
@@ -75,7 +77,7 @@ The Markdown renderer presents the same report data for human review:
 - a reference to the detailed `methodology.md` artifact generated by full live analysis;
 - guardrails, follow-up, and artifact-sensitivity guidance.
-Markdown output should not include individual contributor or reviewer rankings.
+Markdown output should not include raw contributor file contents or individual contributor/reviewer rankings.
 Status labels are Markdown presentation helpers, not `friction-report.v1` fields. They should preserve the underlying source labels and counts rather than replacing auditable evidence.
 Profile suggestions are also presentation helpers, not `friction-report.v1` fields. They are derived from existing PR class and file-surface evidence, appear at most once per suggestion category, and do not change scores, rankings, CSV exports, filtering, or PR class matching. Because the report JSON does not carry repository-profile rule inventory, all analyzed PRs using fallback `unknown` PR class evidence is the renderer's small-sample proxy for no configured PR class rule producing usable classification evidence.
@@ -114,6 +116,7 @@ Full live analysis writes `methodology.md` as a hybrid artifact: stable explanat
 - target repository and report/metric versions;
 - profile path when available;
 - configured workflow context when supplied by the repository profile, labeled as user-configured context rather than observed GitHub evidence;
+- contributor-source context when configured, including source type, path, coverage status, and parsed hint count, without raw contributor contents or rankings;
 - profile suggestions when PR class, file/path, or workflow-context profile evidence crosses deterministic fallback thresholds, or an explicit no-threshold note when none were triggered;
 - requested and collected PR counts;
 - collection coverage status and API-family diagnostics;
@@ -139,6 +142,8 @@ Minimum CSV column groups:
 Empty CSV cells mean unavailable or not applicable. Numeric zero should be used only for observed or computed zero counts. Count columns that depend on optional GitHub coverage should keep source or coverage labels nearby so spreadsheet readers can tell unavailable evidence apart from observed zeroes. CSVs must not include raw comment bodies, raw workflow logs, tokens, secret-bearing environment details, or individual contributor/reviewer rankings.
+Contributor-source coverage appears in `collection-coverage.csv` as the `contributor_source` API family when configured. CSVs may include aggregate comment-source counts influenced by contributor hints, but they must not include raw `.all-contributorsrc` contents, contributor names, contributor login lists, or person rankings.
 ## Optional Downstream Narrative Drafting
 The existing `friction-report.json` artifact plus the curated CSV exports are sufficient context for an optional local workflow where a separate model drafts a narrative summary. M2 does not justify a new `report-context.json`, CLI flag, fixture output, or artifact write path.

package/docs/contracts/normalized-entities.md CHANGED

@@ -9,6 +9,7 @@ Milestone 1 defines the first normalized fixture shape. It is intentionally limi
 - `TargetRepository`: owner/name/default branch/visibility/window for the repository being analyzed.
 - `AnalysisFilter`: optional metadata for downstream analysis filters applied after collection and normalization. When present, it records excluded PR classes, the original collected PR count, and the filtered PR count.
 - `RepositoryLanguageDistribution`: byte counts from `GET /repos/{owner}/{repo}/languages`, stored as context only.
+- `ContributorSource`: optional sanitized metadata from a configured structured contributor source. It records source type, path, coverage status, diagnostics, and parsed hint count. It must not preserve raw contributor file contents or contributor login lists.
 - `PullRequest`: source IDs, author login when known, URL, state, PR class evidence, lifecycle timestamps, final diff shape, PR-open diff source confidence, optional PR-open additions/deletions/changed-file counts when direct or reconstructed data is available, files, reviews, review decision summary, review threads, comments, checks, and workflow-run coverage.
 - `PrClassSummary`: profile-driven PR class, classification source, and winning rule ID. Unmatched PRs use `class: "unknown"`, `classificationSource: "fallback_rule"`, and `ruleId: null`.
 - `Commit`: commit OID, authored timestamp, committed timestamp when present, and message headline.
@@ -24,6 +25,8 @@ Milestone 1 defines the first normalized fixture shape. It is intentionally limi
 Normalized data must preserve whether a value came from a public API, GraphQL thread query, repository profile rule, fallback rule, internal UI partial, or unavailable coverage. Later metric and report stages should use those source labels before making confidence claims.
+Configured contributor hints may classify otherwise-unknown comment authors into existing comment-source groups during the analysis run, but parsed login lists are transient and must not be persisted in generated artifacts. Hints must not change PR authorship, reviewer attribution, scoring formulas, or person-level report/CSV rows.
 ## Analysis Filters
 `source-bundle.json` remains the full collected sample. When a local analysis excludes one or more PR classes, `normalized.json` contains the filtered PR set and an `analysisFilter` object:

package/docs/reference/github-access-coverage.md CHANGED

@@ -6,6 +6,7 @@ The MVP runs locally with the user's GitHub credentials. Reports must expose una
 | --- | --- | --- | --- | --- | --- |
 | REST repository metadata | unauthenticated or token | public read | `repo` or fine-grained metadata read | repository visibility, default branch confirmation | mark repository metadata partial |
 | REST languages | unauthenticated or token | public read | `repo` or fine-grained contents/metadata read | language byte distribution context | omit language context; do not infer file role from language |
+| REST repository contents for configured contributor source | token recommended | public contents read | `repo` or fine-grained contents read | optional `.all-contributorsrc` contributor hints for comment-source classification metadata | mark contributor-source coverage unavailable, malformed, partial, or unsupported; do not infer identities from other sources |
 | Pull request metadata | `gh` token / GraphQL-backed PR fields | public read | `repo` or pull request read | lifecycle, final diff shape, files, commits, reviews | mark PR inventory partial |
 | REST review comments | token recommended | public read | `repo` or pull request read | individual review comments and comment paths | source breakdown based only on reviews if unavailable |
 | GraphQL review threads | token | public read | `repo` or pull request read | thread count, resolved state, outdated state | thread metrics unavailable; comment count can remain REST-only |
@@ -18,8 +19,10 @@ The MVP runs locally with the user's GitHub credentials. Reports must expose una
 The analyzer should record:
 - API family attempted.
-- Coverage status: `available`, `partial`, `unavailable`, or `rate_limited`.
+- Coverage status: `available`, `partial`, `unavailable`, `malformed`, `unsupported`, or `rate_limited`.
 - Required scope or permission when known.
 - Impact on downstream metrics.
 Missing coverage must flow into report metadata. For example, unavailable GraphQL review threads should disable thread-resolution metrics while preserving REST review-comment counts.
+Contributor-source coverage is optional. Missing, inaccessible, malformed, or unsupported contributor files should not fail analysis and should not cause the analyzer to infer private identity data from names, emails, commits, or external services.
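
The contributor-source rules in the hunks above (record a coverage status, never fail analysis on a bad file, never persist login lists) can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's `src/profile/contributor-source.js`: `summarizeContributorSource`, the `source` label, and the diagnostic strings are hypothetical, and the sketch assumes `.all-contributorsrc` carries a top-level `contributors` array whose entries have a `login` field. The returned object follows the `contributorSource` shape in `schemas/normalized-entities.schema.json`.

```javascript
// Hypothetical sketch: build sanitized contributor-source metadata.
// Only a count and a coverage status leave this function; logins are never returned.
function summarizeContributorSource(path, rawText) {
  const coverage = {
    family: "contributor_source",
    source: "repository_contents", // assumed label, not taken from the package
    status: "available",
    attempts: 1,
    diagnostics: [],
    downstreamImpact: null,
  };
  let hintCount = 0;
  try {
    const parsed = JSON.parse(rawText);
    const entries = parsed && Array.isArray(parsed.contributors) ? parsed.contributors : null;
    if (entries === null) {
      coverage.status = "malformed";
      coverage.diagnostics.push("contributors array missing");
    } else {
      // The login set is transient: it exists only long enough to be counted.
      const logins = new Set(
        entries
          .map((entry) => (entry && typeof entry.login === "string" ? entry.login.trim().toLowerCase() : ""))
          .filter((login) => login.length > 0),
      );
      hintCount = logins.size;
    }
  } catch {
    // A parse failure is recorded as coverage, not rethrown, so analysis continues.
    coverage.status = "malformed";
    coverage.diagnostics.push("contributor source is not valid JSON");
  }
  if (coverage.status !== "available") {
    coverage.downstreamImpact = "contributor hints unavailable for comment-source classification";
  }
  return { sourceType: "all_contributors", path, coverage, hintCount };
}

const sample = JSON.stringify({
  contributors: [{ login: "octocat" }, { login: "hubot" }, { name: "entry without login" }],
});
const meta = summarizeContributorSource(".all-contributorsrc", sample);
console.log(meta.hintCount, meta.coverage.status);
```

Returning only `hintCount` rather than the parsed set is what keeps the artifact free of contributor names and logins, even if the whole object is serialized.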

package/docs/reference/repository-profile.md CHANGED

@@ -4,6 +4,8 @@ Repository profiles map paths to file categories, file roles, and functional sur
 Schema: `schemas/repository-profile.schema.json`.
+Repository profiles own repository semantics. Keep file rules, PR class rules, workflow context, branch or release strategy, and contributor-source declarations here. Optional [run presets](run-presets.md) only store reusable run settings such as the target repository, profile path, sample size, output directory, dry-run mode, CSV preference, JSON completion preference, validation-target mode, and requested PR class exclusions. Explicit CLI flags override preset values.
 ## Categories
 - `code`
@@ -160,3 +162,27 @@ Example:
 Use stable identifiers exactly as shown above. Display labels such as "squash merges" or "release PRs" belong in CLI prompts or documentation, not in profile data.
 When interactive setup writes profile changes, it preserves deterministic two-space JSON formatting in place. If an existing profile uses other formatting, setup writes a generated profile copy and prints that generated path in completion output instead of rewriting the original file.
+## Contributor Source
+`contributors` is optional user-configured context for structured contributor hints. The first supported source is `.all-contributorsrc` as `all_contributors` JSON. When omitted, analysis runs normally without contributor hints.
+Supported fields:
+- `sourceType`: optional, defaults to `all_contributors` when `contributors` is present.
+- `path`: optional trimmed, slash-delimited repository-relative path, defaults to `.all-contributorsrc`.
+Example:
+```json
+{
+  "contributors": {
+    "sourceType": "all_contributors",
+    "path": ".all-contributorsrc"
+  }
+}
+```
+Markdown contributor files such as `CONTRIBUTORS.md` are not supported contributor sources in this milestone. The analyzer records them as unsupported/unparsed coverage when encountered and does not parse Markdown into identities.
+Contributor hints may improve repository-level comment-source classification coverage, such as classifying a configured contributor login as an existing human-reviewer source. They do not change scoring formulas, PR authorship conclusions, reviewer attribution, PR class matching, CSV export shape, or individual rankings. Generated artifacts expose only contributor-source metadata such as type, path, status, diagnostics, and parsed hint count; they do not include raw contributor file contents, contributor login lists, or contributor rankings.
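
The defaulting behaviour documented above (`sourceType` falls back to `all_contributors`, `path` falls back to `.all-contributorsrc`, and an omitted `contributors` block means no hints) might look roughly like this. `resolveContributorSource` is a hypothetical helper name used only for illustration, and the sketch does not perform schema validation.

```javascript
// Hypothetical sketch: apply the documented defaults to a repository profile.
function resolveContributorSource(profile) {
  // No `contributors` block: analysis runs normally without contributor hints.
  if (!profile || typeof profile.contributors !== "object" || profile.contributors === null) {
    return null;
  }
  const { sourceType = "all_contributors", path = ".all-contributorsrc" } = profile.contributors;
  return { sourceType, path };
}

console.log(resolveContributorSource({ contributors: { path: "docs/.all-contributorsrc" } }));
console.log(resolveContributorSource({}));
```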

package/docs/reference/run-presets.md ADDED

@@ -0,0 +1,91 @@
+# Run Presets
+Run presets are optional local JSON files for reusing CLI run settings. They are intended for rerunning the same analysis without re-answering interactive prompts.
+Repository meaning stays in repository profiles. Put file rules, PR class rules, workflow context, branch or release strategy, and contributor-source declarations in a repository profile. A run preset may only point at a profile and store run inputs or preferences such as the target repository, sample size, output directory, dry-run mode, CSV preference, JSON completion preference, validation-target mode, and requested PR class exclusions.
+## Save A Preset
+Interactive setup asks whether to save a local run preset near the end of the prompt flow. If you answer yes, you choose the preset path explicitly. The CLI does not invent a global or cloud-synced preset location.
+Saving a preset may overwrite an existing regular file at that path, but the path must not be a directory, symbolic link, or other special file.
+You can also save a preset from flags:
+```sh
+npm run analyze:github -- \
+  --repo example/example-repo \
+  --limit 30 \
+  --profile profiles/example-repo.json \
+  --out reports/example-repo \
+  --save-preset .delivery-friction-analyzer/example-repo.run-preset.json
+```
+When a preset is written, the completion output includes:
+```text
+Run preset saved: .delivery-friction-analyzer/example-repo.run-preset.json.
+```
+With `--json`, the same path is emitted as `savedRunPresetPath` in the machine-readable completion receipt.
+## Rerun From A Preset
+Use `--preset <path>` to load saved settings without prompts:
+```sh
+npm run analyze:github -- --preset .delivery-friction-analyzer/example-repo.run-preset.json
+```
+Explicit CLI flags override preset values. This makes one-off reruns predictable:
+```sh
+npm run analyze:github -- \
+  --preset .delivery-friction-analyzer/example-repo.run-preset.json \
+  --limit 10 \
+  --no-csv
+```
+In that command, the preset still supplies values such as `--repo`, `--profile`, and `--out`, while `--limit 10` and `--no-csv` win over the saved sample size and CSV preference.
+Boolean preset values can be overridden in either direction:
+- `--dry-run` or `--no-dry-run`
+- `--validation-target` or `--no-validation-target`
+- `--csv` or `--no-csv`
+- `--json` or `--no-json`
+If both forms are provided in one command, the later flag wins. For example, `--preset local.json --dry-run --no-dry-run` runs a full analysis, while `--preset local.json --no-csv --csv` writes CSV evidence files.
+## Format
+Preset files use `analyze-github-run-preset.v1`:
+```json
+{
+  "schemaVersion": "analyze-github-run-preset.v1",
+  "run": {
+    "repository": "example/example-repo",
+    "limit": 30,
+    "profilePath": "profiles/example-repo.json",
+    "outDir": "reports/example-repo",
+    "dryRun": false,
+    "isValidationTarget": false,
+    "csv": true,
+    "json": false,
+    "excludedPrClasses": []
+  }
+}
+```
+The CLI only reads and writes the allowlisted `run` keys shown above. Presets must not contain GitHub tokens, secrets, raw source bundles, normalized data, metrics, reports, methodology text, CSV contents, contributor file contents, or repository profile rules.
+## Cleanup
+Preset files are local user-owned files. Delete a preset when it no longer matches how you want to run the analyzer:
+```sh
+rm .delivery-friction-analyzer/example-repo.run-preset.json
+```
+Deleting a preset does not delete generated reports or repository profiles. If a preset points at a generated profile, review and clean up that profile separately.
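
The precedence rules in `run-presets.md` (explicit flags override preset values, and when both forms of a boolean flag appear the later one wins) can be sketched as below. This is an illustration only, not the logic in `src/cli/analyze-github.js`: `resolveRunSettings` and `BOOLEAN_FLAGS` are hypothetical names, and the sketch handles only the four boolean flag pairs plus `--limit`.

```javascript
// Hypothetical sketch: merge a run preset with explicit CLI flags.
// Maps each documented boolean flag name to its key in the preset's `run` object.
const BOOLEAN_FLAGS = {
  "dry-run": "dryRun",
  "validation-target": "isValidationTarget",
  csv: "csv",
  json: "json",
};

function resolveRunSettings(presetRun, argv) {
  // Start from the preset values, then let explicit flags overwrite them.
  const settings = { ...presetRun };
  for (let i = 0; i < argv.length; i += 1) {
    const arg = argv[i];
    if (arg === "--limit") {
      settings.limit = Number(argv[i + 1]);
      i += 1; // skip the value that was just consumed
      continue;
    }
    if (!arg.startsWith("--")) continue;
    const negated = arg.startsWith("--no-");
    const name = arg.slice(negated ? 5 : 2);
    const key = BOOLEAN_FLAGS[name];
    // Flags are applied in order, so a later form replaces an earlier one.
    if (key) settings[key] = !negated;
  }
  return settings;
}

const preset = { repository: "example/example-repo", limit: 30, dryRun: false, csv: true, json: false };
const resolved = resolveRunSettings(preset, ["--limit", "10", "--no-csv", "--dry-run", "--no-dry-run"]);
console.log(resolved);
```

Processing flags strictly left to right is the simplest way to get the "later flag wins" behaviour without special-casing conflicting pairs.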

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "delivery-friction-analyzer",
-  "version": "0.10.0",
+  "version": "0.12.0",
   "description": "Local GitHub pull request analytics for delivery friction reports.",
   "license": "MIT",
   "type": "module",

package/release-log.md CHANGED

@@ -2,6 +2,22 @@
 ## Unreleased
+### 2026-06-20 — Reusable Run Presets
+- What changed: GitHub analysis can now load local run settings from `--preset` and save reusable settings with `--save-preset` or the interactive setup flow, with explicit CLI flags taking precedence.
+- Why it matters: Maintainers can rerun an interactive setup non-interactively without moving repository semantics out of repository profiles.
+- Who is affected: Maintainers using `--interactive` or repeated local analysis commands.
+- Action needed: Optional; save a local preset for repeated runs and delete stale preset files when they no longer match the desired analysis settings.
+- PR: #48
+### 2026-06-20 — Contributor Source Configuration
+- What changed: Repository profiles can now configure `.all-contributorsrc` as a structured contributor source, and analysis records contributor-source coverage while using sanitized hints only for aggregate comment-source classification.
+- Why it matters: Maintainers can improve comment-source coverage without parsing Markdown contributor files, changing scores, or emitting raw contributor contents or person rankings in reports or CSVs.
+- Who is affected: Maintainers authoring repository profiles or reviewing generated reports, methodology, and coverage artifacts.
+- Action needed: Optional; add `contributors.sourceType: "all_contributors"` and a repository-relative `contributors.path` when a target repository has a trusted `.all-contributorsrc`.
+- PR: #47
 ### 2026-06-20 — Workflow Data Caveats
 - What changed: Markdown friction reports and methodology now explain PR-open diff and workflow-run coverage limits with configured workflow context when it is available, and suggest adding workflow context when omitted context would clarify unavailable evidence.

package/schemas/normalized-entities.schema.json CHANGED

@@ -36,6 +36,32 @@
         }
       }
     },
+    "contributorSource": {
+      "type": "object",
+      "additionalProperties": false,
+      "required": ["sourceType", "path", "coverage", "hintCount"],
+      "properties": {
+        "sourceType": { "enum": ["all_contributors"] },
+        "path": { "type": "string" },
+        "coverage": {
+          "type": "object",
+          "additionalProperties": false,
+          "required": ["family", "source", "status", "attempts", "diagnostics", "downstreamImpact"],
+          "properties": {
+            "family": { "const": "contributor_source" },
+            "source": { "type": "string" },
+            "status": { "enum": ["available", "partial", "unavailable", "malformed", "unsupported", "rate_limited"] },
+            "attempts": { "type": "integer", "minimum": 1 },
+            "diagnostics": {
+              "type": "array",
+              "items": { "type": "string" }
+            },
+            "downstreamImpact": { "type": ["string", "null"] }
+          }
+        },
+        "hintCount": { "type": "integer", "minimum": 0 }
+      }
+    },
     "pullRequests": {
       "type": "array",
       "items": {

package/schemas/repository-profile.schema.json CHANGED

@@ -103,6 +103,29 @@
         }
       },
       "minProperties": 1
+    },
+    "contributors": {
+      "type": "object",
+      "additionalProperties": false,
+      "properties": {
+        "sourceType": {
+          "enum": ["all_contributors"]
+        },
+        "path": {
+          "type": "string",
+          "minLength": 1,
+          "pattern": "\\S",
+          "not": {
+            "anyOf": [
+              { "pattern": "^/" },
+              { "pattern": "^\\s|\\s$" },
+              { "pattern": "\\\\" },
+              { "pattern": "(^|/)\\.\\.(/|$)" }
+            ]
+          }
+        }
+      },
+      "minProperties": 1
     }
   }
 }
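
The `contributors.path` constraints in the schema hunk above combine a positive `pattern` with a `not`/`anyOf` list of forbidden shapes. A plain-JavaScript reading of those rules, assuming the usual unanchored regex search semantics of JSON Schema `pattern`, is sketched here. `isValidContributorPath` and `FORBIDDEN_PATH_PATTERNS` are hypothetical names and are not exported by the package.

```javascript
// Sketch of the `contributors.path` constraints from repository-profile.schema.json.
// Each entry mirrors one pattern under `not.anyOf` in the schema.
const FORBIDDEN_PATH_PATTERNS = [
  /^\//, // absolute path
  /^\s|\s$/, // leading or trailing whitespace
  /\\/, // backslash separator
  /(^|\/)\.\.(\/|$)/, // ".." path segment
];

function isValidContributorPath(path) {
  if (typeof path !== "string" || path.length < 1) return false; // type + minLength
  if (!/\S/.test(path)) return false; // must contain a non-whitespace character
  return !FORBIDDEN_PATH_PATTERNS.some((pattern) => pattern.test(path));
}

const candidates = [".all-contributorsrc", "docs/.all-contributorsrc", "/etc/passwd", "../outside", "a\\b", " padded"];
for (const candidate of candidates) {
  console.log(JSON.stringify(candidate), isValidContributorPath(candidate));
}
```

Note that the `..` pattern only rejects whole path segments, so a file name such as `a..b` still passes while `docs/../x` does not.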