metascope 0.1.0
- package/dist/.DS_Store +0 -0
- package/dist/bin/chunk-BjEoQXZ0.js +1 -0
- package/dist/bin/cli.js +45845 -0
- package/dist/bin/jiti-D2Njwwqq.js +9 -0
- package/dist/bin/web-tree-sitter-LICENSE +21 -0
- package/dist/bin/web-tree-sitter.wasm +0 -0
- package/dist/grammars/tree-sitter-python-LICENSE +21 -0
- package/dist/grammars/tree-sitter-python.wasm +0 -0
- package/dist/grammars/tree-sitter-ruby-LICENSE +22 -0
- package/dist/grammars/tree-sitter-ruby.wasm +0 -0
- package/dist/lib/chunk-DrSxFLj_.js +14 -0
- package/dist/lib/index.d.ts +1496 -0
- package/dist/lib/index.js +6215 -0
- package/license.txt +21 -0
- package/package.json +109 -0
- package/readme.md +548 -0
package/readme.md
ADDED
|
@@ -0,0 +1,548 @@
|
|
|
1
|
+
<!--+ Warning: Content inside HTML comment blocks was generated by mdat and may be overwritten. +-->
|
|
2
|
+
|
|
3
|
+
<!-- title -->
|
|
4
|
+
|
|
5
|
+
# metascope
|
|
6
|
+
|
|
7
|
+
<!-- /title -->
|
|
8
|
+
|
|
9
|
+
<!-- badges -->
|
|
10
|
+
|
|
11
|
+
[](https://npmjs.com/package/metascope)
|
|
12
|
+
[](https://opensource.org/licenses/MIT)
|
|
13
|
+
|
|
14
|
+
<!-- /badges -->
|
|
15
|
+
|
|
16
|
+
<!-- short-description -->
|
|
17
|
+
|
|
18
|
+
**A CLI tool and TypeScript library to easily extract metadata from all kinds of software repositories.**
|
|
19
|
+
|
|
20
|
+
<!-- /short-description -->
|
|
21
|
+
|
|
22
|
+
> [!NOTE]
|
|
23
|
+
>
|
|
24
|
+
> Metascope is under development. Expect breaking changes until a 1.0 release.
|
|
25
|
+
|
|
26
|
+
## Overview
|
|
27
|
+
|
|
28
|
+
Metascope aggregates metadata from a local code repository into a single monolithic JSON object. Given a project directory, it checks multiple sources in parallel — local git history, package manifests, the GitHub API, the NPM registry, lines of code analysis, and more — and returns a JSON object containing everything it could find.
|
|
29
|
+
|
|
30
|
+
From there, an (optional) template system lets you refine and transform the output to reflect exactly which fields you need, useful for archival purposes, populating dashboards, or feeding data into other tools. The template system also provides a spec-compliant implementation of the [CodeMeta](https://codemeta.github.io/) vocabulary, allowing easy generation of `codemeta.json` files for a semantically normalized view of a variety of project types.
|
|
31
|
+
|
|
32
|
+
Highlights:
|
|
33
|
+
|
|
34
|
+
- **A wide net**\
|
|
35
|
+
Metascope pulls project metadata from many available sources: `package.json`, `pyproject.toml`, NPM, PyPI, GitHub, git, filesystem stats, and [more](#sources).
|
|
36
|
+
|
|
37
|
+
- **Graceful degradation**\
|
|
38
|
+
Each source checks its own availability before extraction. Missing tools, unavailable APIs, or absent credentials are silently skipped — you always get back whatever data _is_ available.
|
|
39
|
+
|
|
40
|
+
- **Parallel extraction**\
|
|
41
|
+
After an initial codemeta pass for discovery hints (package name, repository URL, keywords), all remaining sources are checked and extracted concurrently.
|
|
42
|
+
|
|
43
|
+
- **Typed templates**\
|
|
44
|
+
The `defineTemplate()` helper provides full autocomplete on available fields. TypeScript infers the return type from your template function, so `getMetadata()` returns exactly the shape you need.
|
|
45
|
+
|
|
46
|
+
- **CLI and library**\
|
|
47
|
+
Use it as a command-line tool for quick inspection or pipe-friendly JSON output, or import it as a library for programmatic access with full type safety.
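
The graceful-degradation and parallel-extraction behavior described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Metascope's actual implementation: the `Source` interface and the `extractAll` function are hypothetical names invented for this example.

```ts
// Hypothetical source shape; Metascope's real internal interfaces may differ.
interface Source {
  key: string
  isAvailable: () => Promise<boolean>
  extract: () => Promise<unknown>
}

// Check and extract every source concurrently, keeping only what succeeded.
async function extractAll(sources: Source[]): Promise<Record<string, unknown>> {
  const entries = await Promise.all(
    sources.map(async (source): Promise<[string, unknown] | undefined> => {
      try {
        // Unavailable sources are skipped rather than treated as errors.
        if (!(await source.isAvailable())) return undefined
        return [source.key, await source.extract()]
      } catch {
        // A failing source (e.g. a rejected network call) is dropped silently.
        return undefined
      }
    }),
  )

  const result: Record<string, unknown> = {}
  for (const entry of entries) {
    if (entry !== undefined) result[entry[0]] = entry[1]
  }
  return result
}
```

Catching errors per source, rather than around the whole batch, is what lets one missing tool or API leave the remaining results intact.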
|
|
48
|
+
|
|
49
|
+
## Getting started
|
|
50
|
+
|
|
51
|
+
### Dependencies
|
|
52
|
+
|
|
53
|
+
Metascope requires [Node.js](https://nodejs.org/) 22.17+. It is implemented in TypeScript, ships as ESM, and bundles complete type definitions.
|
|
54
|
+
|
|
55
|
+
Metascope also requires a recent version of [git](https://git-scm.com/) on your path for quickly identifying ignored files and aggregating repository statistics.
|
|
56
|
+
|
|
57
|
+
Optional external tools:
|
|
58
|
+
|
|
59
|
+
- [GitHub CLI](https://cli.github.com)\
|
|
60
|
+
Used as a fallback for GitHub API authentication if no token is provided via `--github-token` or `$GITHUB_TOKEN`. It's trivially installed from [Homebrew](https://brew.sh/): `brew install gh`.
|
|
61
|
+
|
|
62
|
+
### Installation
|
|
63
|
+
|
|
64
|
+
Invoke directly on the current directory:
|
|
65
|
+
|
|
66
|
+
```sh
|
|
67
|
+
npx metascope
|
|
68
|
+
```
|
|
69
|
+
|
|
70
|
+
...or install locally:
|
|
71
|
+
|
|
72
|
+
```sh
|
|
73
|
+
npm install metascope
|
|
74
|
+
```
|
|
75
|
+
|
|
76
|
+
...or install globally:
|
|
77
|
+
|
|
78
|
+
```sh
|
|
79
|
+
npm install --global metascope
|
|
80
|
+
```
|
|
81
|
+
|
|
82
|
+
If you're using PNPM, you can safely ignore the build scripts for the tree-sitter dependencies, since we're only interested in their bundled WASM implementations.
|
|
83
|
+
|
|
84
|
+
In your `pnpm-workspace.yaml`:
|
|
85
|
+
|
|
86
|
+
```yaml
|
|
87
|
+
ignoredBuiltDependencies:
|
|
88
|
+
- tree-sitter-python
|
|
89
|
+
- tree-sitter-ruby
|
|
90
|
+
```
|
|
91
|
+
|
|
92
|
+
## Usage
|
|
93
|
+
|
|
94
|
+
### CLI
|
|
95
|
+
|
|
96
|
+
<!-- cli-help -->
|
|
97
|
+
|
|
98
|
+
#### Command: `metascope`
|
|
99
|
+
|
|
100
|
+
Extract metadata from a code repository.
|
|
101
|
+
|
|
102
|
+
Usage:
|
|
103
|
+
|
|
104
|
+
```txt
|
|
105
|
+
metascope [path]
|
|
106
|
+
```
|
|
107
|
+
|
|
108
|
+
| Positional Argument | Description | Type | Default |
|
|
109
|
+
| ------------------- | ---------------------- | -------- | ------- |
|
|
110
|
+
| `path` | Project directory path | `string` | `"."` |
|
|
111
|
+
|
|
112
|
+
| Option | Description | Type | Default |
|
|
113
|
+
| ---------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------- | ------- |
|
|
114
|
+
| `--template`<br>`-t` | Built-in template name (`codemeta`, `frontmatter`, `project`) or path to a custom template file | `string` | |
|
|
115
|
+
| `--github-token` | GitHub API token (or set `$GITHUB_TOKEN`) | `string` | |
|
|
116
|
+
| `--author-name` | Optional author name(s) for ownership checks in templates | `array` | |
|
|
117
|
+
| `--github-account` | Optional GitHub account name(s) for ownership checks in templates | `array` | |
|
|
118
|
+
| `--absolute` | Output absolute paths. Use `--no-absolute` for relative paths. | `boolean` | `true` |
|
|
119
|
+
| `--offline` | Skip sources requiring network requests | `boolean` | `false` |
|
|
120
|
+
| `--no-ignore` | Include files ignored by .gitignore in the file tree | `boolean` | `false` |
|
|
121
|
+
| `--recursive`<br>`-r` | Search for metadata files recursively in subdirectories | `boolean` | `false` |
|
|
122
|
+
| `--workspaces`<br>`-w` | Include workspace-specific metadata in monorepos; pass a `boolean` to enable or disable auto-detection, or pass one or more `string`s to explicitly define workspace paths | | `true` |
|
|
123
|
+
| `--verbose` | Run with verbose logging | `boolean` | `false` |
|
|
124
|
+
| `--help`<br>`-h` | Show help | `boolean` | |
|
|
125
|
+
| `--version`<br>`-v` | Show version number | `boolean` | |
|
|
126
|
+
|
|
127
|
+
<!-- /cli-help -->
|
|
128
|
+
|
|
129
|
+
#### Examples
|
|
130
|
+
|
|
131
|
+
##### Basic metadata extraction
|
|
132
|
+
|
|
133
|
+
Extract all available metadata from the current directory:
|
|
134
|
+
|
|
135
|
+
```sh
|
|
136
|
+
metascope
|
|
137
|
+
```
|
|
138
|
+
|
|
139
|
+
Output is pretty-printed JSON when writing to a terminal, compact JSON when piped.
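
A terminal-aware formatter along these lines would produce that behavior. This is only a sketch: `formatJson` is a hypothetical helper, and the two-space indent is an assumption rather than a documented Metascope setting.

```ts
// Hypothetical helper: pretty-print for humans, compact for pipes.
function formatJson(data: unknown, isTerminal: boolean): string {
  // The two-space indent is an assumption, not a documented Metascope setting.
  return isTerminal ? JSON.stringify(data, undefined, 2) : JSON.stringify(data)
}
```

In Node.js, `process.stdout.isTTY` is the usual way to decide what to pass as `isTerminal`.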
|
|
140
|
+
|
|
141
|
+
##### Scan a specific directory
|
|
142
|
+
|
|
143
|
+
```sh
|
|
144
|
+
metascope /path/to/project
|
|
145
|
+
```
|
|
146
|
+
|
|
147
|
+
##### Use a built-in template
|
|
148
|
+
|
|
149
|
+
```sh
|
|
150
|
+
metascope --template project
|
|
151
|
+
```
|
|
152
|
+
|
|
153
|
+
##### Pass template data for ownership checks
|
|
154
|
+
|
|
155
|
+
Some preset templates return information based on the (relative) ownership status of a repo. This requires additional context data, which can be passed in via additional CLI flags:
|
|
156
|
+
|
|
157
|
+
```sh
|
|
158
|
+
metascope --template project --author-name "Jane Doe" --github-account janedoe
|
|
159
|
+
```
|
|
160
|
+
|
|
161
|
+
Multiple values are supported:
|
|
162
|
+
|
|
163
|
+
```sh
|
|
164
|
+
metascope --template project --author-name "Jane Doe" "John Doe" --github-account janedoe johndoe
|
|
165
|
+
```
|
|
166
|
+
|
|
167
|
+
##### Use a custom template file
|
|
168
|
+
|
|
169
|
+
```sh
|
|
170
|
+
metascope --template ./my-template.ts
|
|
171
|
+
```
|
|
172
|
+
|
|
173
|
+
Where `my-template.ts` might look like:
|
|
174
|
+
|
|
175
|
+
```ts
|
|
176
|
+
import { defineTemplate, helpers } from 'metascope'
|
|
177
|
+
|
|
178
|
+
export default defineTemplate(({ codemetaJson, github, gitStats }) => {
|
|
179
|
+
const codemeta = helpers.firstOf(codemetaJson)
|
|
180
|
+
const git = helpers.firstOf(gitStats)
|
|
181
|
+
const gh = helpers.firstOf(github)
|
|
182
|
+
return {
|
|
183
|
+
commits: git?.data.commitCount,
|
|
184
|
+
name: codemeta?.data.name,
|
|
185
|
+
stars: gh?.data.stargazerCount,
|
|
186
|
+
version: codemeta?.data.version,
|
|
187
|
+
}
|
|
188
|
+
})
|
|
189
|
+
```
|
|
190
|
+
|
|
191
|
+
##### Pipe compact JSON to another tool
|
|
192
|
+
|
|
193
|
+
```sh
|
|
194
|
+
metascope | jq '.github.stargazerCount'
|
|
195
|
+
```
|
|
196
|
+
|
|
197
|
+
##### Provide a GitHub token
|
|
198
|
+
|
|
199
|
+
An optional GitHub token can allow access to metadata about private repositories, and raises the request limit if you're operating on a large collection of repositories:
|
|
200
|
+
|
|
201
|
+
```sh
|
|
202
|
+
metascope --github-token ghp_xxxxxxxxxxxx
|
|
203
|
+
```
|
|
204
|
+
|
|
205
|
+
Or set the `GITHUB_TOKEN` environment variable, or authenticate via `gh auth login`. Metascope will attempt to find a credential without bothering you.
|
|
206
|
+
|
|
207
|
+
##### Verbose logging
|
|
208
|
+
|
|
209
|
+
```sh
|
|
210
|
+
metascope --verbose
|
|
211
|
+
```
|
|
212
|
+
|
|
213
|
+
Logs source availability checks, extraction durations, and other diagnostics to stderr.
|
|
214
|
+
|
|
215
|
+
### API
|
|
216
|
+
|
|
217
|
+
The `metascope` library exports `getMetadata` as its primary function, `defineTemplate` for type-safe template authoring, and a `helpers` namespace with utility functions for working with metadata in templates.
|
|
218
|
+
|
|
219
|
+
#### `getMetadata`
|
|
220
|
+
|
|
221
|
+
```ts
|
|
222
|
+
// Without a template — returns full MetadataContext
|
|
223
|
+
function getMetadata(options: GetMetadataOptions): Promise<MetadataContext>
|
|
224
|
+
|
|
225
|
+
// With a template — returns the template's return type
|
|
226
|
+
function getMetadata<T>(options: GetMetadataTemplateOptions<T>): Promise<T>
|
|
227
|
+
```
|
|
228
|
+
|
|
229
|
+
The function accepts a project directory path, optional credentials, and an optional template (a built-in name or a template function). It returns a promise resolving to either the full `MetadataContext` or the shaped output of your template.
|
|
230
|
+
|
|
231
|
+
All `undefined` values and empty source objects are deep-stripped from the output before returning.
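
For intuition, a deep-strip pass of this kind might look like the sketch below. `deepStrip` is a hypothetical re-implementation for illustration, and it assumes plain JSON-like data (no class instances such as `Date`); the library's own rules for what counts as "empty" may differ.

```ts
// Hypothetical re-implementation for illustration only; assumes JSON-like input.
function deepStrip(value: unknown): unknown {
  if (Array.isArray(value)) {
    // Strip each element, then drop any that became undefined.
    return value.map((item) => deepStrip(item)).filter((item) => item !== undefined)
  }
  if (value !== null && typeof value === 'object') {
    const source = value as Record<string, unknown>
    const result: Record<string, unknown> = {}
    for (const key of Object.keys(source)) {
      const stripped = deepStrip(source[key])
      if (stripped !== undefined) result[key] = stripped
    }
    // An object left with no keys is treated as empty and dropped.
    return Object.keys(result).length > 0 ? result : undefined
  }
  // Primitives, including null, pass through unchanged.
  return value
}
```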
|
|
232
|
+
|
|
233
|
+
#### `defineTemplate`
|
|
234
|
+
|
|
235
|
+
```ts
|
|
236
|
+
function defineTemplate<T>(
|
|
237
|
+
transform: (context: MetadataContext, templateData: TemplateData) => T,
|
|
238
|
+
): Template<T>
|
|
239
|
+
```
|
|
240
|
+
|
|
241
|
+
An identity wrapper that provides autocomplete and type inference when authoring templates. The optional second `templateData` argument provides user-supplied values (like author names or GitHub accounts) for parameterized ownership checks. Templates that don't need it can simply ignore the argument. Template developers can pass additional values as needed.
|
|
242
|
+
|
|
243
|
+
#### Examples
|
|
244
|
+
|
|
245
|
+
##### Get all metadata
|
|
246
|
+
|
|
247
|
+
```ts
|
|
248
|
+
import { getMetadata, helpers } from 'metascope'
|
|
249
|
+
|
|
250
|
+
const metadata = await getMetadata({ path: '.' })
|
|
251
|
+
console.log(helpers.firstOf(metadata.codemetaJson)?.data.name)
|
|
252
|
+
console.log(helpers.firstOf(metadata.github)?.data.stargazerCount)
|
|
253
|
+
console.log(helpers.firstOf(metadata.gitStats)?.data.commitCount)
|
|
254
|
+
```
|
|
255
|
+
|
|
256
|
+
_See [output sample](./docs/metascope-basic.json) for this repository._
|
|
257
|
+
|
|
258
|
+
##### Get shaped metadata via a template
|
|
259
|
+
|
|
260
|
+
```ts
|
|
261
|
+
import { defineTemplate, getMetadata, helpers } from 'metascope'
|
|
262
|
+
|
|
263
|
+
const template = defineTemplate(({ codemetaJson, github }) => ({
|
|
264
|
+
name: helpers.firstOf(codemetaJson)?.data.name,
|
|
265
|
+
stars: helpers.firstOf(github)?.data.stargazerCount,
|
|
266
|
+
}))
|
|
267
|
+
|
|
268
|
+
// Result is typed as { name: ..., stars: ... }
|
|
269
|
+
const result = await getMetadata({ path: '.', template })
|
|
270
|
+
```
|
|
271
|
+
|
|
272
|
+
##### Provide credentials
|
|
273
|
+
|
|
274
|
+
```ts
|
|
275
|
+
import { getMetadata } from 'metascope'
|
|
276
|
+
|
|
277
|
+
const metadata = await getMetadata({
|
|
278
|
+
credentials: { githubToken: 'ghp_xxxxxxxxxxxx' },
|
|
279
|
+
path: '.',
|
|
280
|
+
})
|
|
281
|
+
```
|
|
282
|
+
|
|
283
|
+
Credential resolution follows a precedence chain: explicit options > environment variables > CLI tool fallbacks (e.g. `gh auth token`). This makes metascope work in both CI environments and local development without configuration.
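
The precedence chain can be pictured as a simple fall-through. The sketch below is illustrative only: `resolveGithubToken`, its parameters, and the injected `cliFallback` function are hypothetical names, not part of Metascope's API. The environment and the CLI lookup are passed in as arguments so the logic stays easy to test.

```ts
// Hypothetical resolver; names and signature are illustrative, not Metascope's API.
type TokenLookup = () => string | undefined

function resolveGithubToken(
  explicitToken: string | undefined,
  env: Record<string, string | undefined>,
  cliFallback: TokenLookup,
): string | undefined {
  // 1. An explicitly passed option wins.
  if (explicitToken) return explicitToken
  // 2. Then the environment variable.
  if (env.GITHUB_TOKEN) return env.GITHUB_TOKEN
  // 3. Finally a CLI fallback, e.g. a wrapper around `gh auth token`.
  try {
    return cliFallback() || undefined
  } catch {
    // A missing or unauthenticated CLI yields no credential rather than an error.
    return undefined
  }
}
```

In a real program, `env` would typically be `process.env`, and `cliFallback` would shell out to the GitHub CLI.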
|
|
284
|
+
|
|
285
|
+
##### Pass template data
|
|
286
|
+
|
|
287
|
+
```ts
|
|
288
|
+
import { defineTemplate, getMetadata, helpers } from 'metascope'
|
|
289
|
+
|
|
290
|
+
const template = defineTemplate(({ codemetaJson }, { authorName }) => {
|
|
291
|
+
const codemeta = helpers.firstOf(codemetaJson)
|
|
292
|
+
return {
|
|
293
|
+
isAuthoredByMe: codemeta?.data.author?.some((a) => a.name === authorName),
|
|
294
|
+
name: codemeta?.data.name,
|
|
295
|
+
}
|
|
296
|
+
})
|
|
297
|
+
|
|
298
|
+
const result = await getMetadata({
|
|
299
|
+
path: '.',
|
|
300
|
+
template,
|
|
301
|
+
templateData: { authorName: 'Jane Doe' },
|
|
302
|
+
})
|
|
303
|
+
```
|
|
304
|
+
|
|
305
|
+
##### Use a built-in template
|
|
306
|
+
|
|
307
|
+
```ts
|
|
308
|
+
import { getMetadata } from 'metascope'
|
|
309
|
+
|
|
310
|
+
const result = await getMetadata({ path: '.', template: 'frontmatter' })
|
|
311
|
+
```
|
|
312
|
+
|
|
313
|
+
## Sources
|
|
314
|
+
|
|
315
|
+
Metascope extracts data from a wide range of data sources:
|
|
316
|
+
|
|
317
|
+
### Local Files
|
|
318
|
+
|
|
319
|
+
| Ecosystem | Organization | Metascope Key | Source Specifications |
|
|
320
|
+
| ---------- | ------------------------------------------------------------------------------------------------------- | ----------------------------- | --------------------------------------------------------------------------------------------------- |
|
|
321
|
+
| Agnostic | | `readmeFile` | `README.md` (and variants) |
|
|
322
|
+
| Agnostic | [CodeMeta (v1)](https://codemeta.github.io/) | `codemetaJson` | [`codemeta.json`](https://raw.githubusercontent.com/codemeta/codemeta/1.0/codemeta.jsonld) |
|
|
323
|
+
| Agnostic | [CodeMeta (v2)](https://codemeta.github.io/) | `codemetaJson` | [`codemeta.json`](https://raw.githubusercontent.com/codemeta/codemeta/2.0/codemeta.jsonld) |
|
|
324
|
+
| Agnostic | [CodeMeta (v3.1)](https://codemeta.github.io/) | `codemetaJson` | [`codemeta.json`](https://raw.githubusercontent.com/codemeta/codemeta/3.1/codemeta.jsonld) |
|
|
325
|
+
| Agnostic | [CodeMeta (v3)](https://codemeta.github.io/) | `codemetaJson` | [`codemeta.json`](https://raw.githubusercontent.com/codemeta/codemeta/3.0/codemeta.jsonld) |
|
|
326
|
+
| Agnostic | [Documented below](#about-metadatajson) | `metadataFile` | `metadata.json` (and `.yaml` / `.yml` variants) |
|
|
327
|
+
| Agnostic | [Git](https://git-scm.com/) | `gitConfig` | `.git/config` |
|
|
328
|
+
| Agnostic | [Public Code](https://publiccode.net/) | `publiccodeYaml` | [`publiccode.yml`](https://yml.publiccode.tools/schema.core.html) (Also matches `.yaml`) |
|
|
329
|
+
| Agnostic | [SPDX](https://spdx.org/) | `licenseFile` | `LICENSE`, `LICENCE`, `COPYING`, `UNLICENSE` (and `.md`/`.txt` variants) |
|
|
330
|
+
| Apple | [Apple Info.plist](https://developer.apple.com/documentation/bundleresources/information-property-list) | `xcodeInfoPlist` | [`Info.plist`](https://developer.apple.com/documentation/bundleresources/information-property-list) |
|
|
331
|
+
| Apple | [Xcode Project](https://developer.apple.com/xcode/) | `xcodeProjectPbxproj` | [`*.xcodeproj/project.pbxproj`](https://developer.apple.com/documentation/xcode) |
|
|
332
|
+
| C++ | [Arduino Library](https://docs.arduino.cc/arduino-cli/library-specification/) | `arduinoLibraryProperties` | [`library.properties`](https://docs.arduino.cc/arduino-cli/library-specification/) |
|
|
333
|
+
| C++ | [Cinder CinderBlock](https://libcinder.org/docs/guides/cinder-blocks/index.html) | `cinderCinderblockXml` | [`cinderblock.xml`](https://libcinder.org/docs/guides/cinder-blocks/index.html) |
|
|
334
|
+
| C++ | [openFrameworks Addon (Legacy)](https://openframeworks.cc/) | `openframeworksInstallXml` | [`install.xml`](https://openframeworks.cc/) (Legacy format, replaced by `addon_config.mk`) |
|
|
335
|
+
| C++ | [openFrameworks Addon](https://openframeworks.cc/) | `openframeworksAddonConfigMk` | [`addon_config.mk`](https://github.com/openframeworks/ofxAddonTemplate) |
|
|
336
|
+
| Go | [Go Modules](https://go.dev/ref/mod) | `goGoMod` | [`go.mod`](https://go.dev/doc/modules/gomod-ref) |
|
|
337
|
+
| Go | [GoReleaser](https://goreleaser.com/) | `goGoreleaserYaml` | [`.goreleaser.yaml`](https://goreleaser.com/customization/) (Also matches `.yml`) |
|
|
338
|
+
| Java | [Maven](https://search.maven.org/) | `javaPomXml` | [`pom.xml`](https://maven.apache.org/pom.html) |
|
|
339
|
+
| Java | [Processing Library](https://github.com/benfry/processing4/wiki/Library-Guidelines) | `processingLibraryProperties` | [`library.properties`](https://github.com/benfry/processing4/wiki/Library-Guidelines) |
|
|
340
|
+
| Java | [Processing Sketch](https://processing.org/) | `processingSketchProperties` | [`sketch.properties`](https://github.com/benfry/processing4) (Not really specified...) |
|
|
341
|
+
| JavaScript | [NPM](https://www.npmjs.com/) | `nodePackageJson` | [`package.json`](https://docs.npmjs.com/cli/v11/configuring-npm/package-json) |
|
|
342
|
+
| Obsidian | [Obsidian](https://obsidian.md/) | `obsidianPluginManifestJson` | [`manifest.json`](https://docs.obsidian.md/Reference/Manifest) |
|
|
343
|
+
| Python | [PyPI (Distutils)](https://pypi.org/) | `pythonSetupCfg` | [`setup.cfg`](https://docs.python.org/3/distutils/apiref.html#distutils.config) |
|
|
344
|
+
| Python | [PyPI (Distutils)](https://pypi.org/) | `pythonSetupPy` | [`setup.py`](https://docs.python.org/3/distutils/setupscript.html) |
|
|
345
|
+
| Python | [PyPI (PEP 621)](https://pypi.org/) | `pythonPyprojectToml` | [`pyproject.toml`](https://peps.python.org/pep-0621/) |
|
|
346
|
+
| Python | [PyPI (PKG-INFO)](https://pypi.org/) | `pythonPkgInfo` | [`.egg-info/PKG-INFO`](https://packaging.python.org/en/latest/specifications/) |
|
|
347
|
+
| Ruby | [Ruby Gems](https://rubygems.org/) | `rubyGemspec` | [`*.gemspec`](https://guides.rubygems.org/specification-reference/) |
|
|
348
|
+
| Rust | [Crates](https://crates.io/) | `rustCargoToml` | [`Cargo.toml`](https://doc.rust-lang.org/cargo/reference/manifest.html) |
|
|
349
|
+
|
|
350
|
+
### Local Tools
|
|
351
|
+
|
|
352
|
+
| Ecosystem | Organization | Metascope Key | Source Specifications |
|
|
353
|
+
| --------- | --------------------------- | ------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------- |
|
|
354
|
+
| Agnostic | | `dependencyUpdates` | Dependency freshness (outdated packages, libyears) |
|
|
355
|
+
| Agnostic | | `fileStats` | Filesystem metadata (file counts, directory counts, total size) |
|
|
356
|
+
| Agnostic | [Git](https://git-scm.com/) | `gitStats` | Git CLI statistics (commits, branches, tags, contributors) |
|
|
357
|
+
| Agnostic | None | `codeStats` | Lines of code analysis from [tokei](https://github.com/XAMPPRocky/tokei) via [bundled native bindings](https://github.com/kitschpatrol/napi-tokei) |
|
|
358
|
+
|
|
359
|
+
### Remote Sources
|
|
360
|
+
|
|
361
|
+
You can skip network calls by passing `--offline` to the CLI.
|
|
362
|
+
|
|
363
|
+
| Ecosystem | Organization | Metascope Key | Source Specifications |
|
|
364
|
+
| ---------- | --------------------------------------------------------------------------------------- | ------------------------ | -------------------------------------------------------------------- |
|
|
365
|
+
| Agnostic | [GitHub Repository Metadata](https://docs.github.com/rest/repos/repos#get-a-repository) | `github` | _GitHub GraphQL metadata_ |
|
|
366
|
+
| JavaScript | [NPM Registry](https://www.npmjs.com/) | `nodeNpmRegistry` | _NPM registry API_ (download counts, publish dates, latest version) |
|
|
367
|
+
| Obsidian | [Obsidian Community Plugins](https://obsidian.md/plugins) | `obsidianPluginRegistry` | _Obsidian community plugin stats_ (download counts) |
|
|
368
|
+
| Python | [PyPI Registry](https://pypi.org/) | `pythonPypiRegistry` | _PyPI registry API_ (download counts, publish dates, latest version) |
|
|
369
|
+
|
|
370
|
+
### About metadata.json
|
|
371
|
+
|
|
372
|
+
Metascope supports a minimalist `metadata.json` (or `.yaml`) file, which captures the minimal metadata required to populate the description, homepage, and topics on a GitHub repository page.
|
|
373
|
+
|
|
374
|
+
This is a non-standard format that exists primarily for use in combination with [github-action-repo-sync](https://github.com/kitschpatrol/github-action-repo-sync).
|
|
375
|
+
|
|
376
|
+
| Key | Key Aliases | CodeMeta Property | Notes |
|
|
377
|
+
| ------------- | ---------------------------- | ----------------- | ----------------------------------------------------------------------------- |
|
|
378
|
+
| `description` | _None_ | `description` | String description of project |
|
|
379
|
+
| `homepage` | `url` `repository` `website` | `url` | For repository values, git+ prefix and .git suffix are automatically stripped |
|
|
380
|
+
| `keywords` | `tags` `topics` | `keywords` | Array of strings, or a single comma-delimited string |
|
|
381
|
+
|
|
382
|
+
If multiple key aliases are present in the object, priority for populating the associated `codemeta.json` goes to the key, then falls through to the key aliases in the order shown above. (For example, `homepage` takes priority over `url`.)
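
The alias priority and value handling from the table can be sketched as below. This is a hypothetical normalizer written for illustration (`normalizeMetadata`, `firstString`, and the type names are invented here), not Metascope's actual parser.

```ts
// Hypothetical normalizer for illustration; not Metascope's actual parser.
type RawMetadata = Record<string, unknown>

interface NormalizedMetadata {
  description?: string
  keywords?: string[]
  url?: string
}

// Return the first non-empty string found among the keys, in priority order.
function firstString(raw: RawMetadata, keys: string[]): [string, string] | undefined {
  for (const key of keys) {
    const value = raw[key]
    if (typeof value === 'string' && value.trim() !== '') return [key, value.trim()]
  }
  return undefined
}

function normalizeMetadata(raw: RawMetadata): NormalizedMetadata {
  const result: NormalizedMetadata = {}

  const description = firstString(raw, ['description'])
  if (description) result.description = description[1]

  const homepage = firstString(raw, ['homepage', 'url', 'repository', 'website'])
  if (homepage) {
    const [key, value] = homepage
    // Only repository values get the git+ prefix and .git suffix stripped.
    result.url = key === 'repository' ? value.replace(/^git\+/, '').replace(/\.git$/, '') : value
  }

  for (const key of ['keywords', 'tags', 'topics']) {
    const value = raw[key]
    // Accept an array of strings or a single comma-delimited string.
    const list = Array.isArray(value)
      ? value
      : typeof value === 'string'
        ? value.split(',')
        : undefined
    if (list === undefined) continue
    result.keywords = list
      .filter((item): item is string => typeof item === 'string')
      .map((item) => item.trim())
      .filter((item) => item !== '')
    break
  }

  return result
}
```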
|
|
383
|
+
|
|
384
|
+
_If you have more metadata to define but your project lacks a canonical package specification format, then creating a `codemeta.json` file is recommended over the non-standard `metadata.json`._
|
|
385
|
+
|
|
386
|
+
## Templates
|
|
387
|
+
|
|
388
|
+
Metascope provides basic templating and output-transformation functionality for composing its output into more compact and focused representations.
|
|
389
|
+
|
|
390
|
+
### Built-in templates
|
|
391
|
+
|
|
392
|
+
Three built-in templates are available by name. Pass the name as the `template` option on the CLI or in the API.
|
|
393
|
+
|
|
394
|
+
#### `codemeta`
|
|
395
|
+
|
|
396
|
+
The [CodeMeta](https://codemeta.github.io/) template provides a standard way to describe software using [JSON-LD](https://json-ld.org/) and [schema.org](https://schema.org/) terms. Most software projects already have rich metadata in manifests and other files (e.g. `package.json`, `Cargo.toml`, `pyproject.toml`, `LICENSE`, etc.), but the name and structure of semantically equivalent metadata is often inconsistent across ecosystems.
|
|
397
|
+
|
|
398
|
+
It leverages the [crosswalk](https://codemeta.github.io/crosswalk/) data generously compiled by CodeMeta contributors to assist in automating the mapping of various metadata formats to the CodeMeta standard. Where crosswalk data is unavailable or incomplete, heuristics are used instead.
|
|
399
|
+
|
|
400
|
+
This tool always outputs [CodeMeta v3.1](https://w3id.org/codemeta/3.1) files. When ingesting `codemeta.json` files defined in the older [CodeMeta v1](https://doi.org/10.5063/SCHEMA/CODEMETA-1.0) and [CodeMeta v2](https://doi.org/10.5063/SCHEMA/CODEMETA-2.0) contexts, all simple key re-mappings as defined in the crosswalk table are applied. However, some more nuanced conditional transformations (like the reassignment of copyright holding agents in v1) are not implemented.
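
As a rough picture of what a "simple key re-mapping" means, consider the sketch below. It is illustrative only: `remapKeys` and `renamedKeys` are hypothetical names, and the table holds just two renames that are believed to have been introduced between CodeMeta v2 and v3. Verify them against the crosswalk before relying on them; the complete mapping lives there.

```ts
// Illustrative subset only: two property renames between CodeMeta v2 and v3.
// The complete mapping lives in the CodeMeta crosswalk table.
const renamedKeys: Record<string, string> = {
  contIntegration: 'continuousIntegration',
  embargoDate: 'embargoEndDate',
}

// Hypothetical helper: apply simple one-to-one key renames to a document.
function remapKeys(document: Record<string, unknown>): Record<string, unknown> {
  const result: Record<string, unknown> = {}
  for (const key of Object.keys(document)) {
    const target = renamedKeys[key] ?? key
    // If both the old and the new key are present, the new key's value is kept.
    if (target !== key && target in document) continue
    result[target] = document[key]
  }
  return result
}
```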
|
|
401
|
+
|
|
402
|
+
More mature Python-based tools like [codemetapy](https://github.com/proycon/codemetapy) and [codemeta-harvester](https://github.com/proycon/codemeta-harvester) perform a similar task, and either is recommended if you need `codemeta.json` output and aren't limited to a Node.js runtime.
|
|
403
|
+
|
|
404
|
+
Note that Metascope and its author are not affiliated with the CodeMeta project or its governing bodies.
|
|
405
|
+
|
|
406
|
+
```sh
|
|
407
|
+
metascope --template codemeta
|
|
408
|
+
```
|
|
409
|
+
|
|
410
|
+
_See an [output sample](./docs/metascope-template-codemeta.json) from the `codemeta` template run against this repository._
|
|
411
|
+
|
|
412
|
+
#### `frontmatter`
|
|
413
|
+
|
|
414
|
+
A compact, non-nested, polyglot overview of the project. Designed for Obsidian frontmatter — flat keys with natural language names, blending all available sources into a single trackable snapshot. Uses `null` for missing values to ensure stable keys.
|
|
415
|
+
|
|
416
|
+
```sh
|
|
417
|
+
metascope --template frontmatter
|
|
418
|
+
```
|
|
419
|
+
|
|
420
|
+
_See an [output sample](./docs/metascope-template-frontmatter.json) from the `frontmatter` template run against this repository._
|
|
421
|
+
|
|
422
|
+
#### `project`
|
|
423
|
+
|
|
424
|
+
I needed this one for a legacy internal dashboard application. Includes ownership checks via `authorName` and `githubAccount` template data.
|
|
425
|
+
|
|
426
|
+
```sh
|
|
427
|
+
metascope --template project --author-name "Jane Doe" --github-account janedoe
|
|
428
|
+
```
|
|
429
|
+
|
|
430
|
+
_See an [output sample](./docs/metascope-template-project.json) from the `project` template run against this repository._
|
|
431
|
+
|
|
432
|
+
### Defining a custom template
|
|
433
|
+
|
|
434
|
+
Templates are pure functions that receive the full `MetadataContext` and an optional `TemplateData` object, and return whatever shape you like. They are applied _after_ all sources have been extracted, so all available data is accessible.
|
|
435
|
+
|
|
436
|
+
Yes, you can just pipe output to [jq](https://jqlang.org/) and filter / transform as you please, but for complex templates with a lot of logic, TypeScript can be nicer to work with.
|
|
437
|
+
|
|
438
|
+
Use `defineTemplate()` for type inference and autocomplete.
|
|
439
|
+
|
|
440
|
+
Many helper functions for working with template data are also under the `helpers` namespace:
|
|
441
|
+
|
|
442
|
+
```ts
|
|
443
|
+
// In e.g. "metascope-template.ts":
|
|
444
|
+
import { defineTemplate, helpers } from 'metascope'
|
|
445
|
+
|
|
446
|
+
export default defineTemplate(({ codemetaJson, codeStats, github, gitStats }) => {
|
|
447
|
+
const codemeta = helpers.firstOf(codemetaJson)
|
|
448
|
+
const git = helpers.firstOf(gitStats)
|
|
449
|
+
const gh = helpers.firstOf(github)
|
|
450
|
+
const loc = helpers.firstOf(codeStats)
|
|
451
|
+
return {
|
|
452
|
+
commits: git?.data.commitCount,
|
|
453
|
+
forks: gh?.data.forkCount,
|
|
454
|
+
linesOfCode: loc?.data.total?.code,
|
|
455
|
+
name: codemeta?.data.name,
|
|
456
|
+
stars: gh?.data.stargazerCount,
|
|
457
|
+
version: codemeta?.data.version,
|
|
458
|
+
}
|
|
459
|
+
})
|
|
460
|
+
```
|
|
461
|
+
|
|
462
|
+
### Passing template data
|
|
463
|
+
|
|
464
|
+
The second argument to a template function is a `TemplateData` object with optional `authorName` and `githubAccount` fields. This lets templates parameterize ownership checks instead of hardcoding author names:
|
|
465
|
+
|
|
466
|
+
```ts
|
|
467
|
+
import { defineTemplate, helpers } from 'metascope'
|
|
468
|
+
|
|
469
|
+
export default defineTemplate(({ codemetaJson }, { authorName, githubAccount }) => {
|
|
470
|
+
const codemeta = helpers.firstOf(codemetaJson)
|
|
471
|
+
const authors = codemeta?.data.author?.map((a) => a.name) ?? []
|
|
472
|
+
const repo = codemeta?.data.codeRepository?.toLowerCase() ?? ''
|
|
473
|
+
return {
|
|
474
|
+
isMyProject: authors.includes(authorName),
|
|
475
|
+
isOnMyGitHub: typeof githubAccount === 'string' && repo.includes(`/${githubAccount}/`),
|
|
476
|
+
name: codemeta?.data.name,
|
|
477
|
+
}
|
|
478
|
+
})
|
|
479
|
+
```
|
|
480
|
+
|
|
481
|
+
Values for the built-in templates are provided via the `--author-name` and `--github-account` CLI flags, or via the `templateData` option in the API. Templates that don't need this data can simply omit the second argument.
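
Because the CLI flags accept multiple values, a custom template may want to handle both a single name and a list. The helper below is a sketch under two assumptions that the documentation above does not confirm: that template data values may arrive as either a string or an array of strings, and that case-insensitive comparison is desirable. `toList` and `matchesAny` are hypothetical names.

```ts
// Hypothetical helpers; assumes template data values may be a string or an array of strings.
type OneOrMany = string | string[] | undefined

function toList(value: OneOrMany): string[] {
  if (value === undefined) return []
  return Array.isArray(value) ? value : [value]
}

// True when any candidate equals any of the supplied names, ignoring case.
function matchesAny(candidates: string[], names: OneOrMany): boolean {
  const wanted = toList(names).map((name) => name.toLowerCase())
  return candidates.some((candidate) => wanted.includes(candidate.toLowerCase()))
}
```

Inside a template, something like `matchesAny(authors, authorName)` could then stand in for a direct `includes()` call.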
|
|
482
|
+
|
|
483
|
+
### Using a custom template via the CLI
|
|
484
|
+
|
|
485
|
+
```sh
|
|
486
|
+
metascope --template ./metascope-template.ts
|
|
487
|
+
```
|
|
488
|
+
|
|
489
|
+
Template files are loaded via [jiti](https://github.com/unjs/jiti), so TypeScript works out of the box without a build step.
|
|
490
|
+
|
|
491
|
+
## Background
|
|
492
|
+
|
|
493
|
+
Metascope was built to support automated generation of project dashboards, badges, and documentation where a single source of truth for project metadata is useful. Rather than querying each API individually, metascope handles the discovery, authentication, and aggregation in one pass for a wide variety of project types.
|
|
494
|
+
|
|
495
|
+
### Related projects
|
|
496
|
+
|
|
497
|
+
- [codemeta](https://codemeta.github.io/)\
|
|
498
|
+
Standard shared metadata vocabulary (JSON-LD)
|
|
499
|
+
- [codemetapy](https://github.com/proycon/codemetapy)\
|
|
500
|
+
Translate software metadata into the CodeMeta vocabulary (Python)
|
|
501
|
+
- [codemeta-harvester](https://github.com/proycon/codemeta-harvester)\
|
|
502
|
+
Aggregate software metadata into the CodeMeta vocabulary from source repositories and service endpoints (Python)
|
|
503
|
+
- [bibliothecary](https://github.com/librariesio/bibliothecary)\
|
|
504
|
+
Manifest discovery and parsing for [libraries.io](https://libraries.io/) (Ruby)
|
|
505
|
+
- [diggity](https://github.com/carbonetes/diggity)\
|
|
506
|
+
Generates SBOMs for container images, filesystems, archives, and more (Go)
|
|
507
|
+
- [SOMEF](https://github.com/KnowledgeCaptureAndDiscovery/somef/)\
|
|
508
|
+
Software Metadata Extraction Framework (Python)
|
|
509
|
+
- [Upstream Ontologist](https://github.com/jelmer/upstream-ontologist)\
|
|
510
|
+
A common interface for finding metadata about upstream software projects (Rust)
|
|
511
|
+
- [GrimoireLab](https://chaoss.github.io/grimoirelab/)\
|
|
512
|
+
Platform for software development analytics and insights (Python)
|
|
513
|
+
- [OSS Review Toolkit](https://oss-review-toolkit.org/ort/)\
|
|
514
|
+
A suite of CLI tools to automate software compliance checks (Kotlin)
|
|
515
|
+
- [Git Truck](https://github.com/git-truck/git-truck)\
|
|
516
|
+
Repository visualization (TypeScript)
|
|
517
|
+
|
|
518
|
+
## Slop factor
|
|
519
|
+
|
|
520
|
+
_Medium._
|
|
521
|
+
|
|
522
|
+
The architecture and non-boilerplate parts of the documentation were human-driven, but sizable chunks of the implementation were mostly Claude Code's doing and have been subject to only moderate post-facto human scrutiny.
|
|
523
|
+
|
|
524
|
+
## Maintainers
|
|
525
|
+
|
|
526
|
+
[@kitschpatrol](https://github.com/kitschpatrol)
|
|
527
|
+
|
|
528
|
+
## Acknowledgments
|
|
529
|
+
|
|
530
|
+
Thank you to the [CodeMeta Project Management Committee and contributors](https://codemeta.github.io/governance/people/) for their development and stewardship of the standard.
|
|
531
|
+
|
|
532
|
+
Jacob Peddicord's [askalono](https://github.com/jpeddicord/askalono) project inspired the [Dice-Sørensen](https://en.wikipedia.org/wiki/Dice-S%C3%B8rensen_coefficient) scoring strategy used for classifying arbitrary license text.
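
For readers unfamiliar with the metric, the Dice-Sørensen coefficient of two multisets A and B is 2·|A ∩ B| / (|A| + |B|). The sketch below computes it over character bigrams. This is a generic illustration: tokenizing by character bigrams and the whitespace normalization are assumptions made here, and neither Metascope's nor askalono's actual implementation is being reproduced.

```ts
// Collect character bigrams as a multiset (bigram -> count).
function bigrams(text: string): Map<string, number> {
  // Lowercase and collapse whitespace so formatting differences don't affect the score.
  const normalized = text.toLowerCase().replace(/\s+/g, ' ').trim()
  const counts = new Map<string, number>()
  for (let i = 0; i < normalized.length - 1; i++) {
    const gram = normalized.slice(i, i + 2)
    counts.set(gram, (counts.get(gram) ?? 0) + 1)
  }
  return counts
}

// Dice-Sørensen coefficient: 2 * |A ∩ B| / (|A| + |B|), ranging from 0 to 1.
function diceCoefficient(a: string, b: string): number {
  const left = bigrams(a)
  const right = bigrams(b)
  let total = 0
  let overlap = 0
  left.forEach((count, gram) => {
    total += count
    overlap += Math.min(count, right.get(gram) ?? 0)
  })
  right.forEach((count) => {
    total += count
  })
  // Two inputs with no bigrams at all are scored 0 to avoid dividing by zero.
  return total === 0 ? 0 : (2 * overlap) / total
}
```

For example, `diceCoefficient('night', 'nacht')` evaluates to `0.25`, since the two words share one bigram (`ht`) out of eight in total.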
|
|
533
|
+
|
|
534
|
+
<!-- contributing -->
|
|
535
|
+
|
|
536
|
+
## Contributing
|
|
537
|
+
|
|
538
|
+
[Issues](https://github.com/kitschpatrol/metascope/issues) and pull requests are welcome.
|
|
539
|
+
|
|
540
|
+
<!-- /contributing -->
|
|
541
|
+
|
|
542
|
+
<!-- license -->
|
|
543
|
+
|
|
544
|
+
## License
|
|
545
|
+
|
|
546
|
+
[MIT](license.txt) © Eric Mika
|
|
547
|
+
|
|
548
|
+
<!-- /license -->
|