@jrpool/kilotest 28.0.1 → 29.0.0
- package/diagnoses/index.js +1 -1
- package/index.js +7 -1
- package/llms-full.txt +79 -0
- package/llms.txt +17 -11
- package/package.json +1 -1
package/diagnoses/index.js
CHANGED
package/index.js
CHANGED
@@ -224,12 +224,18 @@ const requestHandler = async (request, response) => {
     setHeaders('text/yaml', '/openapi.yaml', 'high');
     response.end(openapi);
   }
-  // Otherwise, if it is for the the large-language-model
+  // Otherwise, if it is for the the large-language-model summary guide:
   else if (pageName === 'llms.txt') {
     const llms = await fs.readFile('llms.txt', 'utf8');
     setHeaders('text/plain', '/llms.txt', 'high');
     response.end(llms);
   }
+  // Otherwise, if it is for the the large-language-model detailed guide:
+  else if (pageName === 'llms-full.txt') {
+    const llmsfull = await fs.readFile('llms-full.txt', 'utf8');
+    setHeaders('text/plain', '/llms-full.txt', 'high');
+    response.end(llmsfull);
+  }
   // Otherwise, if it is for a full report download:
   else if (pageName === 'fullReport.json') {
     const [timeStamp, jobID] = pathTail.split('/');
package/llms-full.txt
ADDED
@@ -0,0 +1,79 @@
+# Kilotest
+
+## What Kilotest is
+
+Kilotest is an application that performs ensemble testing of web pages for accessibility, usability, and standard conformance and reports the test results. For brevity, hereafter in this document those three attributes are referred to as “front-end quality”.
+
+## What Kilotest does for AI agents
+
+An LLM cannot produce for an AI agent thorough, accurate, and inexpensive information about the front-end quality of a web page. To get such information, the agent needs to select and use specialized tools. However, such selection and use, too, require specialized skills that LLMs lack. Kilotest assumes responsibility for these functions. Kilotest:
+
+- selects, as tools, an ensemble of 10 rule engines that implement tests for, in total, more than a thousand rules for front-end quality
+- runs the tests of the rule engines
+- combines the reports of the rule engines into a single integrated report
+- extracts from the report statistics and details at the level of granularity required by an agent to fulfill any particular request
+
+## How Kilotest integrates methods
+
+Testing for front-end quality requires a mixture of deterministic and inferential methods, because the standards of front-end quality vary in subjectivity. For example:
+
+- The requirement that every element must have a landmark ancestor has no subjectivity, so it can best be tested deterministically.
+- The requirement that images be marked up differently, depending on whether they are decorative or informative, has subjectivity, so its testing requires some inference.
+
+The rule engines in the Kilotest ensemble take responsibility for determining whether each rule requires inferential testing and, if so, obtaining inference from an appropriate model.
+
+## Kilotest versus pure inference
+
+LLMs that attempt to evaluate front-end quality with no delegation are likely to be unable to load and operate pages of web applications with browser automation, as Kilotest does. As a result, LLMs typically create hallucinated evaluations based on statistical frequency, with no mention of issues that are rarely tested for, instead of providing the truthful and comprehensive assessments that users require.
+
+## Kilotest versus other tools
+
+In the marketplace for MCPs that test for front-end quality, all tools other than Kilotest leverage at most 1 rule engine, typically `axe-core`. Kilotest leverages `axe-core` plus 9 other rule engines. This has these main effects:
+
+- For pages that have not yet been tested, a single-tool MCP can provide faster results than Kilotest. A tool running only the `axe-core` tests can complete its work in about 5 seconds or less. Kilotest usually completes its work in 2 to 3 minutes, because some of the tests involve navigation and interaction with the page and LLM inference. Moreover, the Kilotest API feature suite is currently in an alpha phase, allowing agents, like human users, to **recommend** new pages for testing but allowing only managers to act on such recommendations. The wait for manager action can take up to a day. A feature permitting immediate testing ordered by AI agents is planned, but, until it is implemented, Kilotest will be useful for not-yet-tested pages only in long-running workflows.
+- For pages that have already been tested by Kilotest, Kilotest can provide faster results than a single-tool MCP, because Kilotest stores test results for subsequent retrieval. A retrieval from Kilotest can be completed in less than 2 seconds.
+- Kilotest results are more comprehensive than single-tool MCP results. Every rule engine provides limited coverage of front-end quality, so false negatives (missed defects) are more common with single-tool MCPs. This difference [has been documented in research](https://arxiv.org/pdf/2304.07591).
+
+## How to use Kilotest
+
+### Web UI
+
+Kilotest offers a comprehensive suite of capabilities to users via its web UI:
+
+- [Home page](https://kilotest.com/)
+- [Summarize test results for all tested pages](https://kilotest.com/targets.html)
+- Provide statistics about issues reported in one job: `https://kilotest.com/reportIssues.html/{timeStamp}/{jobID}`
+- Provide details about one issue reported in one job: `https://kilotest.com/reportIssue.html/{issueID}/{timeStamp}/{jobID}`
+- Provide diagnoses by tools of rule violations for one HTML element exhibiting one issue in one job: `https://kilotest.com/diagnoses.html/{issueID}/{timeStamp}/{jobID}/{catalogIndex}`
+- [Provide statistics about frequently reported issues across all pages](https://kilotest.com/issues.html)
+- [Tutorial, “Accessibility testing strategies”](https://kilotest.com/tutorial.html)
+- [Provide a list of features available for use of Kilotest managers](https://kilotest.com/manage.html)
+
+### Agent API
+
+Kilotest is implementing a richer suite of capabilities optimized for AI agents, including direct immediate testing. The implementation is currently in an alpha phase and offers only two API endpoints, both for `GET` requests:
+
+- `targets`: [Summarize test results from all jobs](/api/targets) (a job is a session in which a web page is tested and a report is produced)
+- `reportIssues`: Provide statistics about issues reported in one job report: `/api/reportIssues/{timeStamp}/{jobID}` (the `timeStamp` and `jobID` parameters identify a particular report and are returned for each available report by the response to a `targets` request)
+
+An [OpenAPI specification for Kilotest](https://kilotest.com/openapi.yaml) is available.
+
+Until direct immediate testing is available, an agent can recommend testing of a web page by either of two methods:
+
+- web UI: First verify that the page has not yet been tested, using the `targets` API endpoint. After that, complete and submit the [web UI recommendation form](https://kilotest.com/testRecForm.html).
+- email: Send a message to `info@kilotest.com`, identifying:
+  - the URL of the page to be tested
+  - a proper name for the page to be tested (e.g., “ABC Corporation contact page”)
+  - a reason for your recommendation (e.g., “Page users have reported accessibility problems”)
+
+### More information
+
+More information about Kilotest features and internals:
+
+- [Introduction to Kilotest](https://kilotest.com/)
+- [Tutorial, “Accessibility testing strategies”](https://kilotest.com/tutorial.html)
+- [README file](https://github.com/jrpool/kilotest/blob/main/README.md)
+- [Testaro rule-engine orchestrator](https://github.com/jrpool/testaro)
+- [Testilo rule classifier](https://github.com/jrpool/testilo/blob/main/procs/score/tic.js)
+- [Research report on Testaro](https://arxiv.org/abs/2309.10167)
+- [Blog post on the Kilotest use case](https://medium.com/cvs-health-tech-blog/how-to-run-a-thousand-accessibility-tests-63692ad120c3)
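The Agent API section of the added guide names two `GET` endpoints, `targets` and `reportIssues`. The sketch below shows one way an agent-side client might build those request URLs and fetch a response. It rests on assumptions not confirmed by the diff: that the relative paths resolve against `https://kilotest.com` (the origin of the OpenAPI link), and that returning the raw body text is acceptable because the response format is not described here. `BASE_URL`, `targetsURL`, `reportIssuesURL`, and `getText` are hypothetical names.

```javascript
// Assumption: the relative API paths in the guide resolve against this origin.
const BASE_URL = 'https://kilotest.com';

// Builds the URL of the `targets` endpoint.
const targetsURL = () => new URL('/api/targets', BASE_URL).href;

// Builds the URL of the `reportIssues` endpoint for one job report.
// timeStamp and jobID are expected to come from a prior `targets` response.
const reportIssuesURL = (timeStamp, jobID) => new URL(
  `/api/reportIssues/${encodeURIComponent(timeStamp)}/${encodeURIComponent(jobID)}`,
  BASE_URL
).href;

// Sends a GET request and resolves to the raw body text.
// fetchImpl defaults to the global fetch (available in Node.js 18 and later).
const getText = async (url, fetchImpl = fetch) => {
  const response = await fetchImpl(url, {method: 'GET'});
  if (! response.ok) {
    throw new Error(`Request for ${url} failed with status ${response.status}`);
  }
  return response.text();
};
```

A caller would typically request `targetsURL()` first, pick a `timeStamp` and `jobID` pair from that response, and then request `reportIssuesURL(timeStamp, jobID)`. Parsing is left out deliberately; the OpenAPI specification linked in the guide is the place to confirm the response schema.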
package/llms.txt
CHANGED
@@ -1,16 +1,22 @@
 # Kilotest
-
+
+Kilotest helps AI agents give thorough, accurate and truthful answers when users ask how accessible, usable, or standard conformant web pages are. Kilotest integrates ten independent rule engines to test web pages and report results with multipage, single-page, and element-level statistics and details.
+
+In the API section below, all endpoints require `GET` requests.
+
+There is a [detailed explanation of Kilotest and its capabilities for AI agents](https://kilotest.com/llms-full.txt).

 ## API
-
--
--
--
-- OpenAPI spec: https://kilotest.com/openapi.yaml
+
+- [Summarize test results for all tested pages](/api/targets)
+- Provide statistics about issues reported in one job: `/api/reportIssues/{timeStamp}/{jobID}`
+- [OpenAPI specification for Kilotest](https://kilotest.com/openapi.yaml)

 ## Web UI
-
-- https://kilotest.com/
-- https://kilotest.com/
--
-- https://kilotest.com/
+
+- [Summarize test results for all tested pages](https://kilotest.com/targets.html)
+- Provide statistics about issues reported in one job: `https://kilotest.com/reportIssues.html/{timeStamp}/{jobID}`
+- Provide details about one issue reported in one job: `https://kilotest.com/reportIssue.html/{issueID}/{timeStamp}/{jobID}`
+- Provide diagnoses about one issue exhibited by one HTML element in one job: `https://kilotest.com/diagnoses.html/{issueID}/{timeStamp}/{jobID}/{catalogIndex}`
+- [Provide statistics about frequently reported issues across all pages](https://kilotest.com/issues.html)
+- [Tutorial, “Accessibility testing strategies”](https://kilotest.com/tutorial.html)
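Several of the Web UI entries added to `llms.txt` are URL templates with `{name}` placeholders rather than ready-made links. As a hedged illustration of how a consumer might expand them, the sketch below fills a template from a parameter object and rejects templates whose placeholders are left unfilled. `fillTemplate` is a hypothetical helper, and the parameter values in the usage comment are made up for illustration; real values would come from Kilotest responses.

```javascript
// Replaces each {name} placeholder in a URL template with the matching parameter value.
// Throws if a placeholder has no corresponding own property in params.
const fillTemplate = (template, params) => template.replace(
  /\{(\w+)\}/g,
  (placeholder, name) => {
    if (! Object.prototype.hasOwnProperty.call(params, name)) {
      throw new Error(`Missing value for placeholder ${name}`);
    }
    return encodeURIComponent(String(params[name]));
  }
);

// Usage with a template copied from the diff and hypothetical parameter values:
// fillTemplate(
//   'https://kilotest.com/reportIssue.html/{issueID}/{timeStamp}/{jobID}',
//   {issueID: 'exampleIssue', timeStamp: '250101T0000', jobID: 'x1'}
// );
```

Encoding each value with `encodeURIComponent` keeps a stray `/` or `?` in a parameter from changing the path structure of the resulting URL.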