tracer-sh 0.1.0
- package/LICENSE +93 -0
- package/README.md +113 -0
- package/bin/tracer.mjs +31 -0
- package/package.json +46 -0
- package/packages/server/dist/chunk-4VNS5WPM.js +42 -0
- package/packages/server/dist/chunk-5IQ4TST5.js +233 -0
- package/packages/server/dist/chunk-6QUU7TGZ.js +196 -0
- package/packages/server/dist/chunk-7J2BYJNR.js +544 -0
- package/packages/server/dist/domain-knowledge-V6AENXZV.js +17 -0
- package/packages/server/dist/domain-knowledge-WHIEZOOH.js +13 -0
- package/packages/server/dist/index.js +58916 -0
- package/packages/server/dist/token-4WRACUIQ.js +69 -0
- package/packages/server/dist/token-util-NKFL6ZOU.js +5 -0
- package/packages/web/dist/assets/SearchableSelect-B7Oz7kCC.js +1 -0
- package/packages/web/dist/assets/Settings-6i0rgfI8.js +1 -0
- package/packages/web/dist/assets/highlighted-body-OFNGDK62-B2U64epE.js +1 -0
- package/packages/web/dist/assets/index-ckfqBqoi.css +1 -0
- package/packages/web/dist/assets/index-mD-B-NKT.js +48 -0
- package/packages/web/dist/assets/mermaid-GHXKKRXX-8orKVVwC.js +204 -0
- package/packages/web/dist/assets/mermaid-GHXKKRXX-B9Z7D9kT.css +1 -0
- package/packages/web/dist/favicon.svg +7 -0
- package/packages/web/dist/index.html +23 -0
- package/packages/web/dist/logo.svg +3 -0
package/LICENSE
ADDED
@@ -0,0 +1,93 @@
+Elastic License 2.0
+
+URL: https://www.elastic.co/licensing/elastic-license
+
+## Acceptance
+
+By using the software, you agree to all of the terms and conditions below.
+
+## Copyright License
+
+The licensor grants you a non-exclusive, royalty-free, worldwide,
+non-sublicensable, non-transferable license to use, copy, distribute, make
+available, and prepare derivative works of the software, in each case subject to
+the limitations and conditions below.
+
+## Limitations
+
+You may not provide the software to third parties as a hosted or managed
+service, where the service provides users with access to any substantial set of
+the features or functionality of the software.
+
+You may not move, change, disable, or circumvent the license key functionality
+in the software, and you may not remove or obscure any functionality in the
+software that is protected by the license key.
+
+You may not alter, remove, or obscure any licensing, copyright, or other notices
+of the licensor in the software. Any use of the licensor’s trademarks is subject
+to applicable law.
+
+## Patents
+
+The licensor grants you a license, under any patent claims the licensor can
+license, or becomes able to license, to make, have made, use, sell, offer for
+sale, import and have imported the software, in each case subject to the
+limitations and conditions in this license. This license does not cover any
+patent claims that you cause to be infringed by modifications or additions to
+the software. If you or your company make any written claim that the software
+infringes or contributes to infringement of any patent, your patent license for
+the software granted under these terms ends immediately. If your company makes
+such a claim, your patent license ends immediately for work on behalf of your
+company.
+
+## Notices
+
+You must ensure that anyone who gets a copy of any part of the software from you
+also gets a copy of these terms.
+
+If you modify the software, you must include in any modified copies of the
+software prominent notices stating that you have modified the software.
+
+## No Other Rights
+
+These terms do not imply any licenses other than those expressly granted in
+these terms.
+
+## Termination
+
+If you use the software in violation of these terms, such use is not licensed,
+and your licenses will automatically terminate. If the licensor provides you
+with a notice of your violation, and you cease all violation of this license no
+later than 30 days after you receive that notice, your licenses will be
+reinstated retroactively. However, if you violate these terms after such
+reinstatement, any additional violation of these terms will cause your licenses
+to terminate automatically and permanently.
+
+## No Liability
+
+*As far as the law allows, the software comes as is, without any warranty or
+condition, and the licensor will not be liable to you for any damages arising
+out of these terms or the use or nature of the software, under any kind of
+legal claim.*
+
+## Definitions
+
+The **licensor** is the entity offering these terms, and the **software** is the
+software the licensor makes available under these terms, including any portion
+of it.
+
+**you** refers to the individual or entity agreeing to these terms.
+
+**your company** is any legal entity, sole proprietorship, or other kind of
+organization that you work for, plus all organizations that have control over,
+are under the control of, or are under common control with that
+organization. **control** means ownership of substantially all the assets of an
+entity, or the power to direct its management and policies by vote, contract, or
+otherwise. Control can be direct or indirect.
+
+**your licenses** are all the licenses granted to you for the software under
+these terms.
+
+**use** means anything you do with the software requiring one of your licenses.
+
+**trademark** means trademarks, service marks, and similar rights.
package/README.md
ADDED
@@ -0,0 +1,113 @@
+# Tracer
+
+[](https://www.npmjs.com/package/tracer-sh)
+[](https://github.com/sholub-dev/tracer/actions/workflows/ci.yml)
+[](https://github.com/sholub-dev/tracer/actions/workflows/codeql.yml)
+
+Local-first AI-powered observability platform.
+
+During an incident, most time goes to switching between observability tools
+and gathering context — not fixing the problem. Tracer connects your providers
+to a single AI chat interface so you find the root cause in one place.
+
+## Debug
+
+Chat with an AI agent that queries your providers in real-time and finds root causes — all from a single conversation.
+
+- Natural language investigation
+- Live query execution with inline charts
+- Post-mortem reports — download as Markdown to share
+- Share investigations as PNG — drop the exported image back into Tracer to re-open the analysis
+- Agent memory across sessions
+- Session history and cost tracking
+
+
+
+## Settings
+
+Configure providers, LLM credentials, agent behavior, and memory. All data is stored locally — nothing leaves your machine except the API calls you configure.
+
+- Anthropic (Claude) and Google (Gemini) API keys
+- Data provider setup with connectivity tests
+- Thinking budgets and step limits
+- Agent memory management
+
+
+
+## How it works
+
+```
+┌─────────┐        your API keys        ┌──────────────────┐
+│         │ ◄──────────────────────────►│  Observability   │
+│ Tracer  │                             │    Providers     │
+│  local  │        your API keys        ├──────────────────┤
+│         │ ◄──────────────────────────►│  LLM Providers   │
+└─────────┘                             └──────────────────┘
+```
+
+Everything runs on your machine. Your data stays local in a SQLite database.
+Tracer talks directly to your provider and LLM APIs using your own API keys —
+no intermediary servers, no data leaves your machine except API calls you control.
+
+## Install
+
+Requires [Node.js 20+](https://nodejs.org/).
+
+```bash
+npx tracer-sh
+```
+
+Or install globally:
+
+```bash
+npm install -g tracer-sh
+tracer-sh
+```
+
+Open `http://localhost:3579`, go to **Settings** to add your API keys and choose an LLM — done.
+
+## Supported Providers
+
+**Data:** New Relic (NRQL via NerdGraph), Google Cloud (Logs, Traces, Metrics, Errors)
+
+**LLM:** Anthropic (Claude), Google (Gemini)
+
+## Uninstall
+
+```bash
+npm uninstall -g tracer-sh
+```
+
+To also remove your local database (settings, sessions, API keys):
+
+```bash
+rm -rf ~/.tracer
+```
+
+## Troubleshooting
+
+| Problem | Fix |
+|---------|-----|
+| `better-sqlite3` build fails | macOS: `xcode-select --install` / Linux: `sudo apt install build-essential python3` |
+| Port in use | `TRACER_PORT=3580 tracer-sh` |
+| No LLM responses | Add an API key in Settings |
+
+## Contributing
+
+Contributions are welcome! There are two main ways to help:
+
+**Report bugs or request features** — [open an issue](https://github.com/sholub-dev/tracer/issues). Include steps to reproduce for bugs, or a clear description for feature requests.
+
+**Submit a code change:**
+
+1. Fork this repo
+2. Create a branch (`git checkout -b fix/my-fix`)
+3. Make your changes and commit
+4. Push to your fork (`git push origin fix/my-fix`)
+5. Open a pull request against `master`
+
+All PRs require approval before merging.
+
+## License
+
+[Elastic License 2.0](https://www.elastic.co/licensing/elastic-license) — free for any use, including internal business use, modification, and redistribution. You may not offer it as a hosted or managed service competing with Tracer.
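The port override from the README's troubleshooting table can be pictured with a small sketch. This is not Tracer's actual startup code, which lives in the bundled server and is not reproduced here. The `resolvePort` helper and its validation rules are hypothetical; only the `TRACER_PORT` variable name and the 3579 default come from the README.

```javascript
// Hypothetical helper: pick the HTTP port from TRACER_PORT, else use the README default.
const DEFAULT_PORT = 3579; // taken from "Open http://localhost:3579" in the README

function resolvePort(env) {
  const raw = env.TRACER_PORT;
  if (raw === undefined || raw === "") return DEFAULT_PORT;
  const port = Number(raw);
  // Assumption: reject anything that is not a valid TCP port instead of guessing.
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new RangeError(`TRACER_PORT must be an integer in 1..65535, got "${raw}"`);
  }
  return port;
}

console.log(resolvePort({}));                      // falls back to the default
console.log(resolvePort({ TRACER_PORT: "3580" })); // uses the override
```

Passing the environment in as an argument, rather than reading `process.env` inside the function, keeps the sketch easy to exercise without touching the real environment.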
package/bin/tracer.mjs
ADDED
@@ -0,0 +1,31 @@
+#!/usr/bin/env node
+
+import { spawnSync } from "node:child_process";
+import { fileURLToPath } from "node:url";
+import { dirname, resolve } from "node:path";
+
+const __dirname = dirname(fileURLToPath(import.meta.url));
+const serverPath = resolve(__dirname, "../packages/server/dist/index.js");
+
+// Must match RESTART_EXIT_CODE in packages/server/src/updater.ts
+const RESTART_EXIT_CODE = 75;
+
+const banner = `
+╔═══════════════════════════════════╗
+║       Tracer Debug Platform       ║
+╚═══════════════════════════════════╝
+`;
+
+console.log(banner);
+
+// Restart loop: if server exits with code 75, it means an update was applied
+while (true) {
+  const result = spawnSync(process.execPath, [serverPath], {
+    stdio: "inherit",
+    env: process.env,
+  });
+  if (result.status !== RESTART_EXIT_CODE) {
+    process.exit(result.status ?? 1);
+  }
+  console.log("\nRestarting after update...\n");
+}
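The restart protocol in the launcher above (exit code 75 means "run me again") can be sketched with the child process abstracted away. The `superviseServer` name, the `runOnce` callback, and the `maxRestarts` cap are all hypothetical additions for illustration; the shipped launcher calls `spawnSync` directly and has no cap.

```javascript
// Sketch of the launcher's restart protocol with the child process abstracted away.
const RESTART_EXIT_CODE = 75; // same value as in bin/tracer.mjs

// runOnce: hypothetical callback that runs the server once and returns its exit status
// (a number, or null if the process was killed by a signal, as spawnSync reports it).
function superviseServer(runOnce, maxRestarts = 10) {
  let restarts = 0;
  while (true) {
    const status = runOnce();
    if (status !== RESTART_EXIT_CODE) {
      // Any other status ends the loop; null (signal) is mapped to 1 like the launcher does.
      return { exitCode: status ?? 1, restarts };
    }
    if (restarts >= maxRestarts) {
      // Assumption: cap restarts so a server that always exits 75 cannot loop forever.
      return { exitCode: 1, restarts };
    }
    restarts += 1;
  }
}

// Example: the "server" asks for two restarts, then exits cleanly.
const statuses = [75, 75, 0];
const outcome = superviseServer(() => statuses.shift());
console.log(outcome); // { exitCode: 0, restarts: 2 }
```

Injecting `runOnce` keeps the loop testable without spawning real processes. Whether a restart cap is desirable for the real launcher is a design question this sketch does not settle.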
package/package.json
ADDED
@@ -0,0 +1,46 @@
+{
+  "name": "tracer-sh",
+  "version": "0.1.0",
+  "type": "module",
+  "description": "Local-first debugging & analysis platform",
+  "license": "SEE LICENSE IN LICENSE",
+  "repository": {
+    "type": "git",
+    "url": "git+https://github.com/sholub-dev/tracer.git"
+  },
+  "packageManager": "pnpm@10.10.0",
+  "engines": {
+    "node": ">=20.0.0"
+  },
+  "bin": {
+    "tracer-sh": "./bin/tracer.mjs"
+  },
+  "files": [
+    "bin/tracer.mjs",
+    "packages/server/dist/",
+    "packages/web/dist/"
+  ],
+  "scripts": {
+    "predev": "pnpm install --frozen-lockfile 2>/dev/null || pnpm install",
+    "dev": "concurrently --kill-others -n server,web -c blue,green \"pnpm --filter @tracer-sh/server dev\" \"pnpm --filter @tracer-sh/web dev\"",
+    "build": "pnpm --filter @tracer-sh/shared build && pnpm --filter @tracer-sh/server build && pnpm --filter @tracer-sh/web build",
+    "start": "node packages/server/dist/index.js",
+    "lint": "pnpm -r --parallel run typecheck",
+    "release": "bin/release.sh",
+    "prepublishOnly": "if [ -z \"${CI:-}\" ]; then echo 'ERROR: Do not publish manually. Use: pnpm release <version>' && exit 1; fi && pnpm build"
+  },
+  "dependencies": {
+    "better-sqlite3": "^12.6.2"
+  },
+  "pnpm": {
+    "onlyBuiltDependencies": [
+      "better-sqlite3",
+      "esbuild"
+    ]
+  },
+  "devDependencies": {
+    "concurrently": "^9.2.1",
+    "tsx": "^4.21.0",
+    "typescript": "^5.9.3"
+  }
+}
package/packages/server/dist/chunk-4VNS5WPM.js
ADDED
@@ -0,0 +1,42 @@
+var __create = Object.create;
+var __defProp = Object.defineProperty;
+var __getOwnPropDesc = Object.getOwnPropertyDescriptor;
+var __getOwnPropNames = Object.getOwnPropertyNames;
+var __getProtoOf = Object.getPrototypeOf;
+var __hasOwnProp = Object.prototype.hasOwnProperty;
+var __require = /* @__PURE__ */ ((x) => typeof require !== "undefined" ? require : typeof Proxy !== "undefined" ? new Proxy(x, {
+  get: (a, b) => (typeof require !== "undefined" ? require : a)[b]
+}) : x)(function(x) {
+  if (typeof require !== "undefined") return require.apply(this, arguments);
+  throw Error('Dynamic require of "' + x + '" is not supported');
+});
+var __commonJS = (cb, mod) => function __require2() {
+  return mod || (0, cb[__getOwnPropNames(cb)[0]])((mod = { exports: {} }).exports, mod), mod.exports;
+};
+var __export = (target, all) => {
+  for (var name in all)
+    __defProp(target, name, { get: all[name], enumerable: true });
+};
+var __copyProps = (to, from, except, desc) => {
+  if (from && typeof from === "object" || typeof from === "function") {
+    for (let key of __getOwnPropNames(from))
+      if (!__hasOwnProp.call(to, key) && key !== except)
+        __defProp(to, key, { get: () => from[key], enumerable: !(desc = __getOwnPropDesc(from, key)) || desc.enumerable });
+  }
+  return to;
+};
+var __toESM = (mod, isNodeMode, target) => (target = mod != null ? __create(__getProtoOf(mod)) : {}, __copyProps(
+  // If the importer is in node compatibility mode or this is not an ESM
+  // file that has been converted to a CommonJS file using a Babel-
+  // compatible transform (i.e. "__esModule" has not been set), then set
+  // "default" to the CommonJS "module.exports" for node compatibility.
+  isNodeMode || !mod || !mod.__esModule ? __defProp(target, "default", { value: mod, enumerable: true }) : target,
+  mod
+));
+
+export {
+  __require,
+  __commonJS,
+  __export,
+  __toESM
+};
package/packages/server/dist/chunk-5IQ4TST5.js
ADDED
@@ -0,0 +1,233 @@
+// src/providers/newrelic/domain-knowledge.ts
+var NR_AUTH_STOP_RULE = `## Authentication Failure \u2014 STOP IMMEDIATELY
+If any query returns an authentication or permission error (e.g. "Invalid API key", "401", "403", "Unauthorized"), **STOP ALL FURTHER TOOL CALLS** and report:
+1. The exact error message received.
+2. That the New Relic API key needs to be checked in Settings.
+Do NOT retry \u2014 auth errors cannot be resolved by the sub-agent.`;
+var NRQL_QUICK_REFERENCE = `## NRQL Reference
+
+### Clauses
+- \`SELECT func(attr) FROM EventType\` \u2014 required. Event types are **case-sensitive** (\`Transaction\` not \`transaction\`).
+- \`WHERE attr op value\` \u2014 filter. Operators: \`=\`, \`!=\`, \`<\`, \`>\`, \`<=\`, \`>=\`, \`IN ('a','b')\`, \`IS [NOT] NULL\`.
+- \`LIKE '%pattern%'\` \u2014 **case-sensitive**, \`%\` wildcard. Leading wildcard \`LIKE '%x'\` is slow.
+- \`RLIKE '(?i)timeout|refused'\` \u2014 regex (RE2 syntax). Must match ENTIRE string for extraction; use \`.*pattern.*\` for partial matching. Use \`(?i)\` flag for case-insensitive matching.
+- \`FACET attr [, attr2]\` \u2014 group by (max 5 attributes). Default LIMIT 10 facet values, max 5000. LIMIT applies per-facet group.
+- \`FACET CASES (WHERE cond AS 'label', ...)\` \u2014 custom grouping buckets. Order matters: first match wins.
+- \`LIMIT n\` \u2014 max rows/facets. Default 10 for FACET, 100 for non-FACET. \`LIMIT MAX\` for maximum allowed.
+- \`SINCE time_expr\` \u2014 always include. NR default is 1 hour if omitted. Supports: \`N hours ago\`, \`today\`, \`yesterday\`, \`'2024-01-15T14:00'\`, epoch ms.
+- \`UNTIL time_expr\` \u2014 end time (default NOW).
+- \`COMPARE WITH N time_unit ago\` \u2014 overlay current vs prior period. Requires SINCE.
+- \`TIMESERIES N time_unit\` or \`TIMESERIES AUTO\` \u2014 time-bucketed series. Max 366 buckets. Be explicit with bucket size \u2014 AUTO can hide short spikes.
+- \`SLIDE BY N time_unit\` \u2014 sliding windows. Cannot use with TIMESERIES AUTO.
+- \`EXTRAPOLATE\` \u2014 compensate for APM event sampling. Only works with: count, average, sum, histogram, rate, percentage, apdex, stddev. Does NOT work with: uniqueCount, percentile, min, max, latest, earliest.
+- \`ORDER BY attr [ASC|DESC]\` \u2014 sort non-aggregation results only (not for FACET queries).
+- \`WITH TIMEZONE 'America/New_York'\` \u2014 affects time display and time functions. Default UTC.
+- Subqueries: \`WHERE x IN (SELECT ... FROM ...)\` or \`FROM (SELECT ... FACET y) WHERE ...\` \u2014 max 3 per query, cannot reference outer query attributes.
+
+### Key Functions
+**Aggregation:** \`count(*)\`, \`average(attr)\`, \`sum\`, \`min\`, \`max\`, \`percentile(attr, 50, 95, 99)\`, \`uniqueCount(attr)\`, \`uniques(attr)\`, \`histogram(attr)\`, \`median(attr)\`, \`stddev\`, \`latest(attr)\`, \`earliest(attr)\`.
+**Rate/trend:** \`rate(aggregator, interval)\` \u2014 frequency per time unit. \`derivative(attr)\` \u2014 rate of change. \`filter(aggregator, WHERE cond)\` \u2014 conditional aggregation in SELECT, e.g. \`filter(count(*), WHERE error IS TRUE)\`. \`percentage(count(*), WHERE cond)\` \u2014 % matching condition.
+**String/extraction:** \`capture(attr, r'.*(?P<name>pattern).*')\` \u2014 RE2 regex extraction. Named groups \`(?P<name>...)\`. Must match FULL string. \`aparse(attr, 'anchor*pattern')\` \u2014 simpler/faster anchor-based extraction. \`concat(a, b)\`, \`lower()\`, \`upper()\`, \`length()\`, \`substring(attr, start, end)\`, \`position(str, sub)\`, \`replace(str, search, repl)\`.
+**Conditional:** \`if(condition, true_val, false_val)\` \u2014 per-row conditional.
+**Time grouping:** \`hourOf(timestamp)\`, \`dateOf()\`, \`weekdayOf()\`, \`dayOfMonthOf()\`, \`monthOf()\` \u2014 UTC by default, use WITH TIMEZONE.
+**Type conversion:** \`numeric(val)\`, \`string(val)\`, \`boolean(val)\`. JSON: \`jsonParse(str_attr)\`, \`mapKeys()\`, \`mapValues()\`.
+**Discovery:** \`keyset() FROM EventType\` \u2014 list all attributes. \`SHOW EVENT TYPES\` \u2014 list all event types.
+
+### Case Sensitivity
+- **Event types**: case-sensitive (\`Transaction\`, not \`transaction\`).
+- **Attribute names**: case-sensitive (\`appName\`, not \`appname\`).
+- **NRQL keywords/functions**: case-insensitive (\`SELECT\`, \`select\`, \`Count()\` all work).
+- **\`=\` and \`LIKE\`**: case-sensitive. For case-insensitive matching, use \`RLIKE '(?i)pattern'\` or normalize with \`lower()\`.`;
+var NR_ANTI_PATTERNS = `## Common Mistakes \u2014 AVOID THESE
+- No \`DISTINCT\` keyword \u2014 use \`uniques(field)\` or \`FACET field\`.
+- No \`GROUP BY\` \u2014 use \`FACET\`.
+- No backslashes in NRQL strings. Backtick dotted field names: \`SELECT \\\`error.message\\\` FROM TransactionError\`.
+- \`WHERE attr != 'value'\` does NOT include rows where attr is NULL. Use \`WHERE attr != 'value' OR attr IS NULL\`.
+- NULL values are excluded from FACET groups.
+- \`count(*)\` counts all rows. \`count(attr)\` counts non-null values only.
+- \`SELECT *\` over long ranges is slow \u2014 always aggregate for > 1 hour.
+- \`Span\` data is sampled \u2014 never use for aggregate metrics (counts, averages). Use \`Transaction\` or \`Metric\` instead.
+- RLIKE is slower than LIKE \u2014 use LIKE when wildcards suffice.
+- \`COMPARE WITH\` + \`percentile()\` returns a JSON object (\`{"99": 0.984375}\`) instead of a plain number. To compare percentile values across periods, run two separate queries with explicit \`SINCE\`/\`UNTIL\` ranges rather than using \`COMPARE WITH\`.`;
+var NR_INSIDE_OUT_DEBUGGING = `## Inside-Out Debugging
+
+**With a specific identifier** (ID, error message, trace ID, user):
+1. Find it \u2014 TransactionError first (\`\\\`error.message\\\` LIKE '%value%'\`, \`request.uri LIKE '%value%'\`). No match \u2192 Transaction \u2192 Log.
+2. Extract context \u2014 traceId, appName, transactionName, error.class, timestamp.
+3. Expand ONLY if needed \u2014 \`FROM Transaction WHERE traceId = '...'\` for the request chain, but only if the error context doesn't already answer the question.
+4. If multiple identifiers surface, investigate 1-2 representative samples. If they show the same pattern, stop \u2014 that IS the pattern.
+
+**Without a specific identifier** (vague symptoms):
+1. Golden Signals \u2014 \`SELECT count(*), average(duration), percentage(count(*), WHERE error IS TRUE) FROM Transaction WHERE appName = '...' SINCE 1 hour ago\` \u2014 get multiple signals in ONE query.
+2. FACET to scope \u2014 drill by appName, transactionName, or error.class to narrow down.
+3. Once you have a specific identifier, switch to the "with identifier" flow above.
+
+**Diagnostic safeguards:**
+- **No data?** If your first 2 queries across different event types return empty, verify data exists: \`SELECT count(*) FROM Transaction SINCE 1 day ago\`. If zero, report no data \u2014 stop investigating.
+
+**Identifier extraction:** \`error.message\` often contains IDs \u2014 extract with \`capture()\` or \`LIKE\` and search across types. \`request.uri\` has entity IDs (e.g. \`/api/users/12345\`). Custom attributes: \`keyset() FROM Transaction\`.
+
+NEVER start with schema discovery (\`keyset()\`, \`SHOW EVENT TYPES\`) when you have a specific value.`;
+var NR_SERVICE_HEALTH_RUNBOOK = `## Service Health Runbook
+
+### Step 0 \u2014 Identify Entity Types
+Before running any checks, determine which entity types exist for the service:
+- **APM application present** \u2192 run the APM Health Checklist below
+- **Browser application present** \u2192 run the Browser Health Checklist below
+- **Both present** \u2192 run both checklists
+- **Neither found** \u2192 report that no monitorable entity was found and stop
+
+---
+
+### APM Health Checklist
+Run all 4 checks for any APM application entity.
+
+**[ ] Check 1 \u2014 Response Time P99**
+Measure the 99th percentile response time for all transactions.
+_Why:_ P99 catches worst-case latency that average metrics hide. Degradation here means the slowest 1% of users are experiencing a significant problem even if the average looks fine.
+_Baseline:_ Compare against the same time range exactly 1 week ago (same day of week).
+_Flag if:_ Current P99 is more than 50% higher than the baseline.
+
+**[ ] Check 2 \u2014 Transaction Error Rate**
+Measure the percentage of all transactions that ended in an error.
+_Why:_ Error rate directly reflects the fraction of user requests that are failing. Even a small uptick is significant because it represents real users hitting errors.
+_Baseline:_ Compare against the same time range 1 week ago (same day of week).
+_Flag if:_ Current rate is more than 50% relatively higher than baseline, OR the absolute increase is more than 1 percentage point when the baseline was near zero.
+
+**[ ] Check 3 \u2014 External Services Errors**
+Measure the error rate of outbound HTTP calls this service makes to external services or APIs.
+_Why:_ A failing downstream dependency will cascade into this service's errors. This distinguishes "our code is broken" from "something we depend on is broken."
+_Applicability:_ Only run this check if the service makes external calls. If it does not, mark as N/A.
+_Baseline:_ Compare non-2xx response rate against the same time range 1 week ago (same day of week).
+_Flag if:_ Non-2xx rate is more than 2\xD7 the baseline, or it newly appears where the baseline was near zero.
+
+**[ ] Check 4 \u2014 Success Transactions Drop**
+Measure the count of transactions that completed successfully (no error).
+_Why:_ A drop in successful transactions signals a blockage or anomaly even when the error rate is stable \u2014 requests may simply not be arriving or being processed at expected volumes.
+_Baseline:_ Compare count against the same time range exactly 1 week ago (same day of week). **Never compare different days of the week \u2014 traffic patterns differ significantly between weekdays and weekends.**
+_Flag if:_ Count is more than 20% lower than the baseline.
+
+---
+
+### Browser Health Checklist
+Run all 5 checks for any Browser application entity.
+
+**[ ] Check 1 \u2014 Browser JS Errors**
+Measure the count of JavaScript errors recorded in users' browsers.
+_Why:_ JS errors silently break features and flows without producing HTTP errors \u2014 they are completely invisible to server-side APM.
+_Baseline:_ Compare count against the same time range 1 week ago (same day of week).
+_Flag if:_ Count is more than 50% higher than baseline.
+
+**[ ] Check 2 \u2014 AJAX Request Error Rate**
+Measure the percentage of AJAX/XHR requests from the browser that received a non-2xx HTTP response.
+_Why:_ Catches API failures from the client's perspective, including calls to third-party APIs or CDN endpoints that server-side APM may miss entirely.
+_Baseline:_ Compare against the same time range 1 week ago (same day of week).
+_Flag if:_ Error rate is more than 50% relatively higher than baseline, or the absolute increase is more than 1 percentage point when baseline was near zero.
+
+**[ ] Check 3 \u2014 Browser LCP P75**
+Measure the 75th percentile of Largest Contentful Paint (LCP) \u2014 the time until the page's primary content is visible to the user.
+_Why:_ LCP is the primary user-perceived load speed signal. P75 means 3 out of 4 users experience this load time or better. A regression here directly impacts perceived performance for the majority of users.
+_Baseline:_ Compare against the same time range 1 week ago (same day of week).
+_Flag if:_ P75 is more than 30% higher (slower) than baseline.
+
+**[ ] Check 4 \u2014 AJAX Response Time P75**
+Measure the 75th percentile of AJAX request round-trip time as measured from the browser.
+_Why:_ Slow AJAX responses degrade interactivity even when the initial page load looks fine. This catches API latency that users feel during active interactions.
+_Baseline:_ Compare against the same time range 1 week ago (same day of week).
+_Flag if:_ P75 is more than 30% higher than baseline.
+
+**[ ] Check 5 \u2014 AJAX Success Requests Drop**
+Measure the count of AJAX requests that completed with a successful (2xx) response.
+_Why:_ A drop can indicate a broken feature preventing users from reaching certain flows, a routing or CDN blockage, or an authentication/session issue causing requests to be rejected upstream.
+_Baseline:_ Compare against the same time range exactly 1 week ago (same day of week). **Never compare different days of the week.**
+_Flag if:_ Count is more than 20% lower than the baseline.
+
+---
+
+### Baseline Comparison Rule
+**Always compare the same day of week, same time range, 7 days prior.**
+- Correct: "Tuesday 14:00\u201315:00" vs "last Tuesday 14:00\u201315:00"
+- **WRONG: comparing Monday vs Tuesday, or weekday vs weekend.** Traffic patterns differ significantly \u2014 cross-day comparisons produce false positives and miss real issues.
+
+---
+
+### Reporting Format
+After completing all applicable checks, report results as a table:
+
+| Check | Current | Baseline (1w ago) | Delta | Status |
+|-------|---------|-------------------|-------|--------|
+| Response Time P99 | ... | ... | +X% | \u2705 Pass / \u26A0\uFE0F Flagged |
+| Transaction Error Rate | ... | ... | +X% | \u2705 Pass / \u26A0\uFE0F Flagged |
+| External Services Errors | ... | ... | ... | \u2705 Pass / \u26A0\uFE0F Flagged / \u2014 N/A |
+| Success Transactions | ... | ... | -X% | \u2705 Pass / \u26A0\uFE0F Flagged |
+
+End with an overall verdict on its own line:
+- \`Service appears healthy.\` \u2014 all checks pass
+- \`X check(s) flagged: [list check names].\` \u2014 one or more checks flagged
+
+If a check could not be evaluated (no baseline data available, service has no external calls, etc.), mark it as **N/A** with a brief note. Never silently skip a check.`;
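The runbook's thresholds and its "same window, 7 days earlier" baseline rule reduce to a little arithmetic, sketched below. None of these functions exist in the package; the names are hypothetical. `NEAR_ZERO_PCT` is an assumption, since the runbook never defines "near zero". Timestamps are epoch milliseconds in UTC, so subtracting exactly seven days is not affected by daylight-saving shifts.

```javascript
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// Shift a [start, end) window back exactly 7 days so weekday and time of day match.
function baselineWindow(startMs, endMs) {
  return { startMs: startMs - WEEK_MS, endMs: endMs - WEEK_MS };
}

// Relative change in percent; null when the baseline is 0 and the ratio is undefined.
function relativeChangePct(current, baseline) {
  if (baseline === 0) return null;
  return ((current - baseline) / baseline) * 100;
}

// "Higher is worse" checks, e.g. Response Time P99 (+50%) or LCP P75 (+30%).
function flagIncrease(current, baseline, thresholdPct) {
  const change = relativeChangePct(current, baseline);
  return change !== null && change > thresholdPct;
}

// "Lower is worse" checks, e.g. Success Transactions (-20%).
function flagDrop(current, baseline, thresholdPct) {
  const change = relativeChangePct(current, baseline);
  return change !== null && change < -thresholdPct;
}

// Error-rate checks, with rates given in percent.
const NEAR_ZERO_PCT = 0.1; // assumption: the runbook does not define "near zero"
function flagErrorRate(currentPct, baselinePct) {
  const relative = flagIncrease(currentPct, baselinePct, 50);
  // Absolute rule: a rise of more than 1 percentage point from a near-zero baseline.
  const absolute = baselinePct <= NEAR_ZERO_PCT && currentPct - baselinePct > 1;
  return relative || absolute;
}

console.log(flagIncrease(900, 500, 50)); // P99 of 900 ms vs 500 ms a week ago
console.log(flagDrop(790, 1000, 20));    // 790 successful transactions vs 1000
console.log(flagErrorRate(1.5, 0));      // error rate appears where there was none
```

The absolute rule covers a zero baseline, where a relative change cannot be computed. This follows the runbook's wording literally as an OR of the two conditions.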
|
|
169
|
+
var NR_QUERY_DEFAULTS = `## Query Defaults
|
|
170
|
+
- Always include \`SINCE\`. Default: \`SINCE 24 hours ago\`. "recent" \u2192 \`SINCE 1 hour ago\`. "today" \u2192 \`SINCE today\`.
|
|
171
|
+
- Default: \`LIMIT 10\`. Increase to 20\u201350 only when needed. Never start with 100+.`;
|
|
172
|
+
var NR_EVENT_TYPES = `## Event Types & Field Reference
|
|
173
|
+
|
|
174
|
+
### CRITICAL: Field Name Mismatches Across Event Types
|
|
175
|
+
| Concept | Transaction/Span | Log | Infrastructure |
|
|
176
|
+
|---------|-----------------|-----|----------------|
|
|
177
|
+
| Trace ID | \`traceId\` (camelCase) | \`trace.id\` (dotted, needs backticks) | \u2014 |
|
|
178
|
+
| Span ID | \`guid\` | \`span.id\` (dotted, needs backticks) | \u2014 |
|
|
179
|
+
| App/Service | \`appName\` | \`entity.name\` | \u2014 |
|
|
180
|
+
| Host | \`host\` | \`hostname\` | \`hostname\` |
|
|
181
|
+
| Txn name | \`name\` | \u2014 | \u2014 |
|
|
182
|
+
| Txn name (error) | \u2014 (use \`transactionName\` in TransactionError) | \u2014 | \u2014 |
|
|
183
|
+
|
|
184
|
+
Same values, different field names. \`FROM Transaction WHERE traceId = 'abc'\` but \`FROM Log WHERE \\\`trace.id\\\` = 'abc'\`.
|
|
185
|
+
|
|
186
|
+
### Event Types
|
|
187
|
+
|
|
188
|
+
**Transaction** \u2014 One event per request per service. ~2000 events/min/instance before sampling.
|
|
189
|
+
Fields: \`duration\`, \`name\`, \`appName\`, \`traceId\`, \`guid\`, \`httpResponseCode\`, \`request.uri\`, \`request.method\`, \`error\` (boolean), \`host\`, \`entity.guid\`, \`databaseDuration\`, \`externalDuration\`, \`parent.app\`/\`parent.type\`/\`parent.transportDuration\`.
|
|
190
|
+
Use for: health checks, error rates, latency, throughput, cross-service tracing.
|
|
191
|
+
|
|
192
|
+
**TransactionError** \u2014 Error details. Separate sampling pool (~100/harvest cycle).
|
|
193
|
+
Fields: \`error.message\`, \`error.class\`, \`error.expected\`, \`transactionName\`, \`request.uri\`, \`traceId\`, \`appName\`. Plus most Transaction fields.
|
|
194
|
+
Key difference: Transaction only has boolean \`error\`; TransactionError has the actual error message/class.
|
|
195
|
+
|
|
196
|
+
**Log** \u2014 Forensic detail. Auto-decorated with trace linking when logs-in-context is enabled.
|
|
197
|
+
Fields: \`message\`, \`level\` / \`log.level\`, \`entity.name\`, \`hostname\`, \`trace.id\`, \`span.id\`, \`entity.guid\`.
|
|
198
|
+
Note: Linking fields (\`trace.id\`, \`span.id\`) require agent logs-in-context. Missing = agent too old or feature not enabled.
|
|
199
|
+
|
|
200
|
+
**Span** \u2014 Sub-transaction operations (DB, HTTP, method timings). **HEAVILY SAMPLED: ~10 traces/min (120 for Java), max 2000 spans/min.** Never use for aggregate counts/averages \u2014 counts will be far too low.
|
|
201
|
+
Fields: \`name\`, \`duration\`, \`category\` (generic/http/datastore/external), \`span.kind\`, \`traceId\`, \`parentId\`, \`nr.entryPoint\`, \`http.url\`, \`http.statusCode\`, \`db.statement\`, \`error.class\`, \`error.message\`.
|
|
202
|
+
Use ONLY for: tracing individual request paths when Transaction's \`duration\` isn't granular enough.
|
|
203
|
+
|
|
204
|
+
**Metric** \u2014 Dimensional metrics. **NEVER SAMPLED** \u2014 always accurate. Use when event sampling is suspected.
|
|
205
|
+
Query: \`FROM Metric SELECT average(apm.service.transaction.duration) WHERE appName = 'x' SINCE 24 hours ago\`.
|
|
206
|
+
|
|
207
|
+
**SystemSample / ProcessSample** \u2014 Infrastructure host/process metrics. **Requires separate infrastructure agent** (not guaranteed with APM).
|
|
208
|
+
Fields: \`cpuPercent\`, \`memoryUsedPercent\`, \`diskUsedPercent\`, \`hostname\`, \`processDisplayName\`.`;
|
|
209
|
+
var NR_CROSS_SIGNAL = `## Cross-Signal Correlation
|
|
210
|
+
- **Transaction \u2192 TransactionError**: Same \`traceId\`. Transaction has boolean \`error\`; TransactionError has \`error.message\`/\`error.class\`.
|
|
211
|
+
- **Transaction \u2192 Log**: \`traceId\` (Transaction) = \`trace.id\` (Log, backtick-quoted). \`appName\` (Transaction) = \`entity.name\` (Log).
|
|
212
|
+
- **Transaction \u2192 Span**: Same \`traceId\`. Use Span ONLY for sub-request breakdown \u2014 never for aggregates (heavily sampled).
|
|
213
|
+
- **Any \u2192 Metric**: When event counts seem low, compare like-for-like per-minute rates: \`FROM Transaction SELECT rate(count(*), 1 minute)\` vs \`FROM Metric SELECT rate(count(apm.service.transaction.duration), 1 minute)\`. If Metric is significantly higher, events are sampled \u2014 add \`EXTRAPOLATE\`.
|
|
214
|
+
- **\`entity.guid\`**: Universal cross-type linker across Transaction, Span, Log, Metric, Infrastructure.
|
|
215
|
+
- **Diagnostic shortcut**: Health \u2192 Transaction | Error cause \u2192 TransactionError | Slow? \u2192 Transaction FACET name, then Span for breakdown | Which service? \u2192 Transaction WHERE traceId | Infra \u2192 SystemSample WHERE hostname = '...'`;
|
|
216
|
+
var NR_DOMAIN_KNOWLEDGE = `${NR_QUERY_DEFAULTS}
|
|
217
|
+
|
|
218
|
+
${NR_EVENT_TYPES}
|
|
219
|
+
|
|
220
|
+
${NR_CROSS_SIGNAL}
|
|
221
|
+
|
|
222
|
+
${NRQL_QUICK_REFERENCE}
|
|
223
|
+
|
|
224
|
+
${NR_ANTI_PATTERNS}
|
|
225
|
+
|
|
226
|
+
${NR_SERVICE_HEALTH_RUNBOOK}`;
|
|
227
|
+
|
|
228
|
+
export {
|
|
229
|
+
NR_AUTH_STOP_RULE,
|
|
230
|
+
NR_INSIDE_OUT_DEBUGGING,
|
|
231
|
+
NR_SERVICE_HEALTH_RUNBOOK,
|
|
232
|
+
NR_DOMAIN_KNOWLEDGE
|
|
233
|
+
};
|
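The baseline rule (same time range, 7 days prior), the 20% drop threshold for Success Transactions, and the verdict lines described in the runbook text above can be sketched as a few small helpers. This is an illustrative sketch only and is not code from the package: `baselineWindow`, `deltaPercent`, `successCountStatus`, and `verdict` are hypothetical names, and treating a missing or zero baseline as N/A is an assumption.

```javascript
// Illustrative sketch only: these helpers are hypothetical and not part of tracer-sh.
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// Shift a time window back exactly 7 days, so the baseline covers the
// same day of week and the same time range (computed on UTC timestamps).
function baselineWindow(start, end) {
  return {
    start: new Date(start.getTime() - WEEK_MS),
    end: new Date(end.getTime() - WEEK_MS),
  };
}

// Percentage change of `current` relative to `baseline`.
// Returns null when there is no usable baseline (assumption: treat as N/A).
function deltaPercent(current, baseline) {
  if (baseline === null || baseline === undefined || baseline === 0) {
    return null;
  }
  return ((current - baseline) / baseline) * 100;
}

// Success Transactions check: flagged when the count is more than 20%
// lower than the baseline; N/A when the baseline is unavailable.
function successCountStatus(current, baseline) {
  const delta = deltaPercent(current, baseline);
  if (delta === null) return "N/A";
  return delta < -20 ? "Flagged" : "Pass";
}

// Build the overall verdict line from per-check results.
// Each result is assumed to look like { check: "name", status: "Pass" | "Flagged" | "N/A" }.
function verdict(results) {
  const flagged = results
    .filter((r) => r.status === "Flagged")
    .map((r) => r.check);
  if (flagged.length === 0) return "Service appears healthy.";
  return `${flagged.length} check(s) flagged: ${flagged.join(", ")}.`;
}
```

A caller could pass the current query window to `baselineWindow`, run the same query over both windows, and feed the two counts to `successCountStatus`. A drop of exactly 20% passes, because the rule says "more than 20% lower".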