@jrpool/kilotest 31.2.2 → 33.0.0
- package/AI-TOOL.md +82 -20
- package/IDEAS.md +1 -1
- package/README.md +1 -1
- package/SERVICE.md +1 -1
- package/diagnoses/index.html +1 -1
- package/index.html +12 -12
- package/index.js +286 -286
- package/llms-full.txt +39 -30
- package/mcp.js +96 -0
- package/openapi.yaml +38 -38
- package/package.json +4 -2
- package/pm2.config.js +14 -12
- package/reportIssue/index.js +1 -1
- package/reportIssues/api.js +12 -10
- package/reportIssues/index.js +2 -2
- package/targets/api.js +7 -6
- package/testRecForm/api.js +4 -3
- package/tutorial/index.html +1 -1
- package/util.js +16 -7
package/llms-full.txt
CHANGED
|
@@ -2,19 +2,34 @@
|
|
|
2
2
|
|
|
3
3
|
## What Kilotest is
|
|
4
4
|
|
|
5
|
-
Kilotest is an application that performs ensemble testing of web pages for accessibility, usability, and standard conformance and reports the test results.
|
|
5
|
+
Kilotest is an application that performs ensemble testing of web pages for front-end quality (i.e. accessibility, usability, and standard conformance) and reports the test results.
|
|
6
6
|
|
|
7
|
-
## What Kilotest does for
|
|
7
|
+
## What Kilotest does for language models
|
|
8
8
|
|
|
9
|
-
Kilotest is deployed as a service
|
|
9
|
+
Kilotest is deployed on the public Internet as a service at `kilotest.com`. Kilotest can test only web pages that can be accessed from the public Internet and that are not protected by authentication or other access controls. Kilotest has not implemented any mechanism for testing private, internal, password-protected, or otherwise restricted pages.
|
|
10
10
|
|
|
11
|
-
An LLM
|
|
11
|
+
An LLM can, under some conditions, load web pages and use its knowledge to assess successes and failures of front-end quality. But such an assessment is almost always:
|
|
12
12
|
|
|
13
|
-
-
|
|
13
|
+
- expensive, substituting inference for deterministic rule application
|
|
14
|
+
- fragmentary, because of capability limitations, such as inability to run browsers and interact with web pages
|
|
15
|
+
- inaccurate, because of the tendency to replace missing information with hallucinations
|
|
16
|
+
|
|
17
|
+
To provide a less expensive, more thorough, and more accurate assessment, the agent could ask an ensemble of specialized rule engines to test the page. However, selecting and running such rule engines and consolidating their results is itself expensive and difficult. Kilotest assumes responsibility for these functions. Kilotest:
|
|
18
|
+
|
|
19
|
+
- defines a tool that tests web pages for front-end quality
|
|
20
|
+
- defines tools that report the test results
|
|
21
|
+
- defines connectors to those tools, compatible with major AI platforms
|
|
22
|
+
|
|
23
|
+
The tool that tests web pages for front-end quality has attributes that no other such tool has. It:
|
|
24
|
+
|
|
25
|
+
- selects an ensemble of 10 rule engines that implement tests for, in total, more than a thousand rules for front-end quality
|
|
14
26
|
- runs the tests of the rule engines
|
|
15
27
|
- combines the reports of the rule engines into a single integrated report
|
|
16
28
|
- consolidates the 1000+ rules having 10 different naming systems into 300+ “issues” with a uniform naming system
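The consolidation step in the list above can be pictured with a minimal sketch. The mapping table, the `consolidate` function, the engine names, and the rule identifiers are hypothetical illustrations, not Kilotest's actual data or code. Only the idea comes from the description: many engine-specific rule names collapse into one uniformly named issue.

```javascript
// Hypothetical mapping from engine-specific rule IDs to uniform issue IDs.
// The entries are illustrative assumptions, not Kilotest's real classification.
const issueOfRule = {
  'engineA:image-alt': 'imageNoText',
  'engineB:r2': 'imageNoText',
  'engineC:heading-order': 'headingLevelSkip'
};

// Groups raw rule violations by uniform issue ID and records which engines reported each issue.
const consolidate = violations => {
  const issues = {};
  for (const {engine, ruleID} of violations) {
    // Rules missing from the mapping are collected under a catch-all identifier.
    const issueID = issueOfRule[`${engine}:${ruleID}`] || 'unclassified';
    if (!issues[issueID]) {
      issues[issueID] = {reporters: new Set(), violationCount: 0};
    }
    issues[issueID].reporters.add(engine);
    issues[issueID].violationCount++;
  }
  return issues;
};

// Three violations under three naming systems become two uniformly named issues.
const consolidated = consolidate([
  {engine: 'engineA', ruleID: 'image-alt'},
  {engine: 'engineB', ruleID: 'r2'},
  {engine: 'engineC', ruleID: 'heading-order'}
]);
console.log(Object.keys(consolidated));
```

Keeping the set of reporting engines per issue mirrors the "rule engines reporting the issue" data that the reporting tools return.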
|
|
17
|
-
|
|
29
|
+
|
|
30
|
+
The tools that report test results extract from reports statistics and details at various levels of granularity, as required by an agent to fulfill particular requests.
|
|
31
|
+
|
|
32
|
+
You can infer from this description that the Kilotest tools are (currently) best suited to comprehensive audit-type requests (e.g., “How can I improve this page?”). For web pages about which reports are already available, they are also suited to more specific questions (e.g., “Is the heading structure of this page OK?”). If no report is available for a page, however, recommending that Kilotest test the page is overkill when the purpose is to answer a specific one-issue question.
|
|
18
33
|
|
|
19
34
|
## How Kilotest integrates methods
|
|
20
35
|
|
|
@@ -25,17 +40,13 @@ Testing for front-end quality requires a mixture of deterministic and inferentia
|
|
|
25
40
|
|
|
26
41
|
The rule engines in the Kilotest ensemble take responsibility for determining whether each rule requires inferential testing and, if so, obtaining inference from an appropriate model.
|
|
27
42
|
|
|
28
|
-
## Kilotest versus
|
|
29
|
-
|
|
30
|
-
LLM platforms asked to evaluate front-end quality with no delegation are not all able to load pages of web applications for inspection. When they can do so, they typically cannot operate on the pages with browser automation, as Kilotest does. As a result, LLMs typically give less comprehensive answers than they would if they called Kilotest as a tool. When even browsing to a page is impossible, an LLM is likely to give speculative evaluations with hallucinated examples, based on statistical frequency, with no mention of issues that are rarely tested for, instead of providing the truthful and comprehensive assessments that users require.
|
|
43
|
+
## Kilotest versus other MCP servers
|
|
31
44
|
|
|
32
|
-
|
|
45
|
+
In the marketplace for MCP servers that test for front-end quality, all servers other than that of Kilotest leverage at most 1 rule engine, most often `axe-core`. The Kilotest testing tool leverages `axe-core` plus 9 other rule engines. This has these main effects:
|
|
33
46
|
|
|
34
|
-
|
|
35
|
-
|
|
36
|
-
-
|
|
37
|
-
- For pages that have already been tested by Kilotest, Kilotest can provide faster results than a single-tool MCP server, because Kilotest stores test results for subsequent retrieval. A retrieval from Kilotest can be completed in less than 2 seconds.
|
|
38
|
-
- Kilotest results are more comprehensive than single-tool MCP server results. Every rule engine provides limited coverage of front-end quality, so false negatives (missed defects) are more common with single-tool MCP servers. This difference [has been documented in research](https://arxiv.org/pdf/2304.07591).
|
|
47
|
+
- For web pages that have not yet been tested, a single-rule-engine MCP server can provide faster results than Kilotest. A tool running only the `axe-core` tests can complete its work in about 5 seconds or less. The testing tool of Kilotest usually completes its work in 2 to 3 minutes, because all 10 rule engines are run, and some of the tests involve navigation, interaction with the page, and LLM inference. Moreover, in its current alpha phase, the Kilotest testing tool allows users (both humans and models) to **recommend** new pages for testing but authorizes only Kilotest managers to make the tool proceed with the testing. The wait for manager action can take up to a day. A feature permitting immediate testing ordered by AI agents is planned, but, until it is implemented, Kilotest will be useful for not-yet-tested pages only in long-running workflows.
|
|
48
|
+
- For web pages about which reports are already available, Kilotest can provide faster results than a single-rule-engine MCP server, because Kilotest stores test results for subsequent retrieval. A retrieval from a Kilotest reporting tool can be completed in less than 2 seconds.
|
|
49
|
+
- Kilotest results are more comprehensive than single-rule-engine MCP server results. Every rule engine provides limited coverage of front-end quality, so false negatives (missed defects) are more common with single-rule-engine MCP servers. This difference [has been documented in research](https://arxiv.org/pdf/2304.07591).
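One way to read the trade-offs above, together with the earlier note about one-issue questions, is as a small decision rule for an agent. This is a sketch under assumptions: the function name, its inputs, and the returned labels are hypothetical and are not part of Kilotest.

```javascript
// Hypothetical decision helper for an agent; not part of Kilotest.
// reportAvailable: whether a Kilotest report about the page already exists.
// isComprehensiveAudit: whether the request is audit-type rather than a one-issue question.
// canWait: whether the workflow can tolerate a wait of up to about a day.
const chooseAction = ({reportAvailable, isComprehensiveAudit, canWait}) => {
  // Stored reports can be retrieved quickly, so use them for any kind of question.
  if (reportAvailable) {
    return 'retrieveReport';
  }
  // Recommending a test pays off only for audit-type requests in long-running workflows.
  if (isComprehensiveAudit && canWait) {
    return 'recommendTesting';
  }
  // Otherwise a faster single-rule-engine tool or another method is a better fit.
  return 'useAnotherMethod';
};

console.log(chooseAction({reportAvailable: false, isComprehensiveAudit: true, canWait: true}));
```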
|
|
39
50
|
|
|
40
51
|
## How to use Kilotest
|
|
41
52
|
|
|
@@ -45,9 +56,9 @@ Kilotest offers a comprehensive suite of capabilities to users via its web UI:
|
|
|
45
56
|
|
|
46
57
|
- [Home page](https://kilotest.com/)
|
|
47
58
|
- [Summarize test results for all tested pages](https://kilotest.com/targets.html)
|
|
48
|
-
- Provide statistics about issues reported in one
|
|
49
|
-
- Provide details about one issue reported in one
|
|
50
|
-
- Provide diagnoses by
|
|
59
|
+
- Provide statistics about issues reported in one report: `https://kilotest.com/reportIssues.html/{timeStamp}/{jobID}`
|
|
60
|
+
- Provide details about one issue reported in one report: `https://kilotest.com/reportIssue.html/{issueID}/{timeStamp}/{jobID}`
|
|
61
|
+
- Provide diagnoses by rule engines of rule violations for one HTML element exhibiting one issue in one report: `https://kilotest.com/diagnoses.html/{issueID}/{timeStamp}/{jobID}/{catalogIndex}`
|
|
51
62
|
- [Receive a recommendation to test a not-yet-tested page](https://kilotest.com/testRecForm.html)
|
|
52
63
|
- Receive a recommendation to retest a previously tested page: `https://kilotest.com/retestRecForm.html/{timeStamp}/{jobID}`
|
|
53
64
|
- [Provide statistics about frequently reported issues across all pages](https://kilotest.com/issues.html)
|
|
@@ -57,34 +68,32 @@ Kilotest offers a comprehensive suite of capabilities to users via its web UI:
|
|
|
57
68
|
|
|
58
69
|
### Agent API
|
|
59
70
|
|
|
60
|
-
Kilotest is implementing a richer suite of capabilities optimized for AI agents, including direct immediate testing. The implementation is currently in an alpha phase and offers 3
|
|
71
|
+
Kilotest is implementing a richer suite of capabilities optimized for AI agents, including direct immediate testing. The implementation is currently in an alpha phase and offers 3 tools:
|
|
61
72
|
|
|
62
|
-
- `
|
|
73
|
+
- `summarizeQualityOfAllTestedWebPages`
|
|
63
74
|
- method: `GET`
|
|
64
|
-
- purpose: summarize test results from all
|
|
75
|
+
- purpose: summarize front-end quality test results from all reports (a report contains the records of one session in which a web page is tested)
|
|
65
76
|
- path: `/api/targets`
|
|
66
|
-
- `
|
|
77
|
+
- `describeQualityOfOneWebPage`
|
|
67
78
|
- method: `GET`
|
|
68
|
-
- purpose: provide statistics about issues reported in one
|
|
79
|
+
- purpose: provide statistics about front-end quality issues reported in one report
|
|
69
80
|
- path: `/api/reportIssues/{timeStamp}/{jobID}`
|
|
70
81
|
- parameters
|
|
71
|
-
- `timeStamp`: initial segment of
|
|
72
|
-
- `jobID`: final segment of
|
|
82
|
+
- `timeStamp`: initial segment of report identifier
|
|
83
|
+
- `jobID`: final segment of report identifier
|
|
73
84
|
- source of parameters: response to a `targets` request
|
|
74
|
-
- `
|
|
85
|
+
- `recommendQualityTestingOfOneWebPage`
|
|
75
86
|
- method: `POST`
|
|
76
|
-
- purpose:
|
|
87
|
+
- purpose: recommend quality testing of one web page
|
|
77
88
|
- path: `/api/testRecForm`
|
|
78
89
|
- payload properties
|
|
79
90
|
- `what`: description of the page to be tested
|
|
80
91
|
- `url`: URL of the page to be tested
|
|
81
92
|
- `why`: reason for testing the page
|
|
82
|
-
- how to verify disposition of the recommendation: submit a `
|
|
93
|
+
- how to verify disposition of the recommendation: submit a `summarizeQualityOfAllTestedWebPages` request and inspect the response to determine whether a report on the page is now available
|
|
83
94
|
|
|
84
95
|
An [OpenAPI specification for Kilotest](https://kilotest.com/openapi.yaml) is available.
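The three operations listed above could be addressed roughly as follows. This is a sketch, not a definitive client: the helper names are hypothetical, the base URL `https://kilotest.com` is inferred from the deployment statement, and sending the `POST` payload as JSON is an assumption that should be checked against the OpenAPI specification. The sample `timeStamp` and `jobID` values are placeholders.

```javascript
// Assumed base URL of the deployed service.
const baseURL = 'https://kilotest.com';

// Describes a summarizeQualityOfAllTestedWebPages request.
const summarizeRequest = () => ({method: 'GET', url: `${baseURL}/api/targets`});

// Describes a describeQualityOfOneWebPage request for the report identified by timeStamp and jobID.
const describeRequest = (timeStamp, jobID) => ({
  method: 'GET',
  url: `${baseURL}/api/reportIssues/${encodeURIComponent(timeStamp)}/${encodeURIComponent(jobID)}`
});

// Describes a recommendQualityTestingOfOneWebPage request (the JSON body format is an assumption).
const recommendRequest = (what, url, why) => ({
  method: 'POST',
  url: `${baseURL}/api/testRecForm`,
  headers: {'Content-Type': 'application/json'},
  body: JSON.stringify({what, url, why})
});

console.log(describeRequest('260503T0432', 'x9z').url);
```

A descriptor can then be handed to an HTTP client, for example `fetch(request.url, {method: request.method, headers: request.headers, body: request.body})`.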
|
|
85
96
|
|
|
86
|
-
Until direct immediate testing is available, an agent can recommend testing of a web page with a `testRecForm` request.
|
|
87
|
-
|
|
88
97
|
### More information
|
|
89
98
|
|
|
90
99
|
More information about Kilotest features and internals:
|
package/mcp.js
ADDED
|
@@ -0,0 +1,96 @@
+/*
+  mcp.js
+  Handles MCP (Model Context Protocol) requests for Kilotest tools.
+*/
+
+// IMPORTS
+
+const {McpServer} = require('@modelcontextprotocol/sdk/server/mcp.js');
+const {StreamableHTTPServerTransport} = require('@modelcontextprotocol/sdk/server/streamableHttp.js');
+const {z} = require('zod');
+const {isReportAvailable, isURL} = require('./util');
+const targetsAPI = require('./targets/api');
+const reportIssuesAPI = require('./reportIssues/api');
+const testRecFormAPI = require('./testRecForm/api');
+
+// FUNCTIONS
+
+// Creates and returns an McpServer with Kilotest tools registered.
+const createMCPServer = () => {
+  const server = new McpServer({name: 'Kilotest', version: '1.0.0'});
+  server.registerTool(
+    'summarizeQualityOfAllTestedWebPages',
+    {
+      description: 'Returns summary data from every available Kilotest report about the front-end quality (i.e. accessibility, usability, and standard conformity) of a web page. Before calling describeQualityOfOneWebPage, call this tool to check whether a report about the page is available.',
+      inputSchema: {},
+      annotations: {
+        title: 'Summarize quality of all tested web pages',
+        readOnlyHint: true,
+        idempotentHint: true,
+        destructiveHint: false,
+        openWorldHint: false
+      }
+    },
+    async () => {
+      const result = await targetsAPI.response();
+      return {content: [{type: 'text', text: JSON.stringify(result)}]};
+    }
+  );
+  server.registerTool(
+    'describeQualityOfOneWebPage',
+    {
+      description: 'Returns data from a specified Kilotest report about issues of front-end quality (i.e. accessibility, usability, and standard conformity) of a web page. The required timeStamp and jobID parameters identify the report and are obtained from a summarizeQualityOfAllTestedWebPages response.',
+      inputSchema: {
+        timeStamp: z.string().describe('Report timestamp in YYMMDDTHHMM format, e.g. 260503T0432'),
+        jobID: z.string().describe('Job identifier, e.g. x9z')
+      },
+      annotations: {
+        title: 'Describe the quality of one web page',
+        readOnlyHint: true,
+        idempotentHint: true,
+        destructiveHint: false,
+        openWorldHint: false
+      }
+    },
+    async ({timeStamp, jobID}) => {
+      const result = await reportIssuesAPI.response([timeStamp, jobID]);
+      return {content: [{type: 'text', text: JSON.stringify(result)}]};
+    }
+  );
+  server.registerTool(
+    'recommendQualityTestingOfOneWebPage',
+    {
+      description: 'Recommends a web page for Kilotest to test for front-end quality (i.e. accessibility, usability, and standard conformity). Do not call this tool unless summarizeQualityOfAllTestedWebPages discloses that no report about the page or a related page that satisfies your requirements is available.',
+      inputSchema: {
+        what: z.string().describe('Short description of the page, following the naming conventions visible in the summarizeQualityOfAllTestedWebPages response'),
+        url: z.string().describe('Full HTTPS URL of the page to test'),
+        why: z.string().describe('Reason for recommending this page for testing')
+      },
+      annotations: {
+        title: 'Recommend quality testing of one web page',
+        readOnlyHint: false,
+        idempotentHint: false,
+        destructiveHint: false,
+        openWorldHint: false
+      }
+    },
+    async ({what, url, why}) => {
+      if (!isURL(url)) {
+        return {content: [{type: 'text', text: JSON.stringify({error: 'Invalid URL'})}], isError: true};
+      }
+      if (await isReportAvailable(what, url)) {
+        return {content: [{type: 'text', text: JSON.stringify({error: 'A report about the page is already available'})}], isError: true};
+      }
+      const result = await testRecFormAPI.response(what, url, why);
+      return {content: [{type: 'text', text: JSON.stringify(result)}]};
+    }
+  );
+  return server;
+};
+// Handles an MCP request.
+exports.handleMCP = async (request, response) => {
+  const transport = new StreamableHTTPServerTransport({sessionIdGenerator: undefined});
+  const server = createMCPServer();
+  await server.connect(transport);
+  await transport.handleRequest(request, response);
+};
|
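For orientation, an MCP client invokes one of the tools registered above by sending a JSON-RPC 2.0 `tools/call` message. The following sketch only builds such a message. The `buildToolCall` helper and the request `id` are hypothetical, and the HTTP path at which `handleMCP` is mounted is not shown in this diff, so no endpoint is assumed.

```javascript
// Builds a JSON-RPC 2.0 message that asks an MCP server to call a tool.
// The helper name is a hypothetical illustration; the message shape follows the MCP tools/call method.
const buildToolCall = (id, name, args = {}) => ({
  jsonrpc: '2.0',
  id,
  method: 'tools/call',
  params: {name, arguments: args}
});

// Requests data from the report identified by the sample values used in the tool's schema descriptions.
const message = buildToolCall(1, 'describeQualityOfOneWebPage', {timeStamp: '260503T0432', jobID: 'x9z'});
console.log(JSON.stringify(message));
```

In practice, an MCP client library normally constructs and sends this message, so a hand-built version is mainly useful for debugging.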
package/openapi.yaml
CHANGED
|
@@ -1,7 +1,7 @@
|
|
|
1
1
|
openapi: 3.1.0
|
|
2
2
|
info:
|
|
3
3
|
title: Kilotest Agent API
|
|
4
|
-
description: Kilotest tests web pages for accessibility, usability, and standard conformity using an ensemble of ten independent
|
|
4
|
+
description: Kilotest tests web pages for front-end quality (i.e. accessibility, usability, and standard conformity) using an ensemble of ten independent rule engines that employ rule-based and machine-learning-based methods. This API enables AI agents to recommend web pages for testing, discover available test reports, and retrieve data from reports. For background on Kilotest and the advantages of ensemble testing, visit https://kilotest.com.
|
|
5
5
|
version: 1.0.0
|
|
6
6
|
contact:
|
|
7
7
|
name: Kilotest
|
|
@@ -15,9 +15,9 @@ servers:
|
|
|
15
15
|
paths:
|
|
16
16
|
/api/targets:
|
|
17
17
|
get:
|
|
18
|
-
operationId:
|
|
19
|
-
summary:
|
|
20
|
-
description: Returns summary data about every non-hidden report available from Kilotest, including the name and URL of the tested web page, when the testing was performed, how many accessibility, usability, and standard-conformity issues were reported, and URLs for retrieving more detailed data from the report. This is the first endpoint to call if you want data about a particular web page. The result will tell you whether a report on that page already exists. If so, you can retrieve data from it. If not, you can use the submitWebAccessibilityTestRequest endpoint to recommend the page for testing.
|
|
18
|
+
operationId: summarizeQualityOfAllTestedWebPages
|
|
19
|
+
summary: Summarize quality of all tested web pages
|
|
20
|
+
description: Returns summary data about every non-hidden report available from Kilotest, including the name and URL of the tested web page, when the testing was performed, how many front-end quality (i.e. accessibility, usability, and standard-conformity) issues were reported, and URLs for retrieving more detailed data from the report. This is the first endpoint to call if you want data about a particular web page. The result will tell you whether a report on that page already exists. If so, you can retrieve data from it. If not, you can use the recommendQualityTestingOfOneWebPage operation to recommend the page for testing.
|
|
21
21
|
responses:
|
|
22
22
|
'200':
|
|
23
23
|
description: Summaries of available reports
|
|
@@ -28,9 +28,9 @@ paths:
|
|
|
28
28
|
|
|
29
29
|
/api/testRecForm:
|
|
30
30
|
post:
|
|
31
|
-
operationId:
|
|
32
|
-
summary:
|
|
33
|
-
description:
|
|
31
|
+
operationId: recommendQualityTestingOfOneWebPage
|
|
32
|
+
summary: Recommend quality testing of one web page
|
|
33
|
+
description: Submit a recommendation for Kilotest to test, for the first time, a particular web page for front-end quality (i.e. accessibility, usability, and standard conformity). Recommendations are typically approved and the testing completed within a day, whereupon the results can be found with the summarizeQualityOfAllTestedWebPages operation. Before submitting a recommendation, use the summarizeQualityOfAllTestedWebPages operation to ensure that no report about the page is available, and also to see the stylistic rules for the naming of pages. An attempt to recommend a page for testing will fail if a report about the page is already available.
|
|
34
34
|
requestBody:
|
|
35
35
|
description: Test recommendation specifications
|
|
36
36
|
required: true
|
|
@@ -48,9 +48,9 @@ paths:
|
|
|
48
48
|
|
|
49
49
|
/api/reportIssues/{timeStamp}/{jobID}:
|
|
50
50
|
get:
|
|
51
|
-
operationId:
|
|
52
|
-
summary:
|
|
53
|
-
description:
|
|
51
|
+
operationId: describeQualityOfOneWebPage
|
|
52
|
+
summary: Describe quality of one web page
|
|
53
|
+
description: Get data about the front-end quality issues of a web page from a specified Kilotest report. The data about each issue include its priority, the rule engines that reported it, the number of HTML elements exhibiting it, and URLs for retrieving element-level detail. The timeStamp and jobID parameters identify the report and are available in the response from the summarizeQualityOfAllTestedWebPages operation.
|
|
54
54
|
parameters:
|
|
55
55
|
- name: timeStamp
|
|
56
56
|
in: path
|
|
@@ -81,14 +81,14 @@ paths:
|
|
|
81
81
|
|
|
82
82
|
/api/reportIssue/{issueID}/{timeStamp}/{jobID}:
|
|
83
83
|
get:
|
|
84
|
-
operationId:
|
|
85
|
-
summary:
|
|
86
|
-
description:
|
|
84
|
+
operationId: describeHTMLElementsHavingOneQualityIssue
|
|
85
|
+
summary: Describe HTML elements having one quality issue (NOT YET IMPLEMENTED)
|
|
86
|
+
description: Get issue-specific data from a specified report about the front-end quality (i.e. accessibility, usability, and standard conformity) of a web page. The data describe the issue, all of the HTML elements of the page that have the issue, and, for each such element, URLs for retrieving diagnoses by rule engines of the issue on the element. NOT YET IMPLEMENTED.
|
|
87
87
|
parameters:
|
|
88
88
|
- name: issueID
|
|
89
89
|
in: path
|
|
90
90
|
required: true
|
|
91
|
-
description: Issue identifier (e.g., imageNoText). Available under "issues reported" > priority level > "identifier" in the
|
|
91
|
+
description: Issue identifier (e.g., imageNoText). Available under "issues reported" > priority level > "identifier" in the describeQualityOfOneWebPage response.
|
|
92
92
|
schema:
|
|
93
93
|
type: string
|
|
94
94
|
examples:
|
|
@@ -191,29 +191,29 @@ components:
|
|
|
191
191
|
TestRecFormResponse:
|
|
192
192
|
$ref: '#/components/schemas/CommonResponseFields'
|
|
193
193
|
|
|
194
|
-
|
|
194
|
+
RuleEngineInfo:
|
|
195
195
|
type: object
|
|
196
|
-
description:
|
|
196
|
+
description: A rule engine in the Kilotest ensemble.
|
|
197
197
|
properties:
|
|
198
198
|
identifier:
|
|
199
199
|
type: string
|
|
200
|
-
description: Short programmatic identifier for the
|
|
200
|
+
description: Short programmatic identifier for the rule engine.
|
|
201
201
|
examples:
|
|
202
202
|
- alfa
|
|
203
203
|
name:
|
|
204
204
|
type: string
|
|
205
|
-
description: Display name of the
|
|
205
|
+
description: Display name of the rule engine.
|
|
206
206
|
examples:
|
|
207
207
|
- Alfa
|
|
208
208
|
sponsor:
|
|
209
209
|
type: string
|
|
210
|
-
description: Organization that created, initially sponsored, or now sponsors the
|
|
210
|
+
description: Organization that created, initially sponsored, or now sponsors the rule engine.
|
|
211
211
|
examples:
|
|
212
212
|
- Siteimprove
|
|
213
213
|
|
|
214
|
-
|
|
214
|
+
RuleEngineFailure:
|
|
215
215
|
type: object
|
|
216
|
-
description: A
|
|
216
|
+
description: A rule engine that was unable to complete testing of the page.
|
|
217
217
|
properties:
|
|
218
218
|
name:
|
|
219
219
|
type: string
|
|
@@ -224,9 +224,9 @@ components:
|
|
|
224
224
|
examples:
|
|
225
225
|
- Not enough credits.
|
|
226
226
|
|
|
227
|
-
|
|
227
|
+
RuleEnginesSummary:
|
|
228
228
|
type: object
|
|
229
|
-
description: Count and names of a set of
|
|
229
|
+
description: Count and names of a set of rule engines.
|
|
230
230
|
properties:
|
|
231
231
|
number:
|
|
232
232
|
type: integer
|
|
@@ -276,12 +276,12 @@ components:
|
|
|
276
276
|
type: integer
|
|
277
277
|
number of HTML elements reported as exhibiting issues:
|
|
278
278
|
type: integer
|
|
279
|
-
|
|
280
|
-
$ref: '#/components/schemas/
|
|
281
|
-
|
|
282
|
-
$ref: '#/components/schemas/
|
|
283
|
-
|
|
284
|
-
$ref: '#/components/schemas/
|
|
279
|
+
rule engines that tried to test the page:
|
|
280
|
+
$ref: '#/components/schemas/RuleEnginesSummary'
|
|
281
|
+
rule engines that were unable to test the page:
|
|
282
|
+
$ref: '#/components/schemas/RuleEnginesSummary'
|
|
283
|
+
rule engines that reported issues:
|
|
284
|
+
$ref: '#/components/schemas/RuleEnginesSummary'
|
|
285
285
|
URLs for getting data on the reported issues:
|
|
286
286
|
$ref: '#/components/schemas/NextTierURLs'
|
|
287
287
|
URL for getting the full technical report as JSON:
|
|
@@ -301,7 +301,7 @@ components:
|
|
|
301
301
|
|
|
302
302
|
IssueEntry:
|
|
303
303
|
type: object
|
|
304
|
-
description: Details about a specific
|
|
304
|
+
description: Details about a specific front-end quality issue found on a page.
|
|
305
305
|
properties:
|
|
306
306
|
identifier:
|
|
307
307
|
type: string
|
|
@@ -329,8 +329,8 @@ components:
|
|
|
329
329
|
impact on a user:
|
|
330
330
|
type: string
|
|
331
331
|
description: How this issue is likely to affect users.
|
|
332
|
-
|
|
333
|
-
$ref: '#/components/schemas/
|
|
332
|
+
rule engines reporting the issue:
|
|
333
|
+
$ref: '#/components/schemas/RuleEnginesSummary'
|
|
334
334
|
number of HTML elements reported as exhibiting the issue:
|
|
335
335
|
type: integer
|
|
336
336
|
URLs for details about the issue on the page:
|
|
@@ -379,16 +379,16 @@ components:
|
|
|
379
379
|
URL:
|
|
380
380
|
type: string
|
|
381
381
|
format: uri
|
|
382
|
-
|
|
382
|
+
rule engines that tried to test the page:
|
|
383
383
|
type: array
|
|
384
384
|
items:
|
|
385
|
-
$ref: '#/components/schemas/
|
|
386
|
-
|
|
385
|
+
$ref: '#/components/schemas/RuleEngineInfo'
|
|
386
|
+
rule engines that were unable to test the page:
|
|
387
387
|
type: array
|
|
388
388
|
items:
|
|
389
|
-
$ref: '#/components/schemas/
|
|
390
|
-
|
|
391
|
-
$ref: '#/components/schemas/
|
|
389
|
+
$ref: '#/components/schemas/RuleEngineFailure'
|
|
390
|
+
rule engines that reported issues:
|
|
391
|
+
$ref: '#/components/schemas/RuleEnginesSummary'
|
|
392
392
|
number of issues reported:
|
|
393
393
|
type: object
|
|
394
394
|
properties:
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@jrpool/kilotest",
|
|
3
|
-
"version": "
|
|
3
|
+
"version": "33.0.0",
|
|
4
4
|
"description": "An ensemble testing service with a focus on accessibility",
|
|
5
5
|
"main": "index.js",
|
|
6
6
|
"scripts": {
|
|
@@ -24,8 +24,10 @@
|
|
|
24
24
|
},
|
|
25
25
|
"homepage": "https://github.com/jrpool/kilotest",
|
|
26
26
|
"dependencies": {
|
|
27
|
+
"@modelcontextprotocol/sdk": "*",
|
|
27
28
|
"dotenv": "*",
|
|
28
|
-
"testilo": "*"
|
|
29
|
+
"testilo": "*",
|
|
30
|
+
"zod": "*"
|
|
29
31
|
},
|
|
30
32
|
"devDependencies": {
|
|
31
33
|
"@eslint/css": "^1.0.0",
|
package/pm2.config.js
CHANGED
|
@@ -1,15 +1,17 @@
|
|
|
1
1
|
module.exports = {
|
|
2
|
-
apps: [
|
|
3
|
-
|
|
4
|
-
|
|
5
|
-
|
|
6
|
-
|
|
7
|
-
|
|
8
|
-
|
|
9
|
-
|
|
10
|
-
|
|
11
|
-
|
|
12
|
-
|
|
2
|
+
apps: [
|
|
3
|
+
{
|
|
4
|
+
name: 'kilotest',
|
|
5
|
+
script: 'index.js',
|
|
6
|
+
instances: 1,
|
|
7
|
+
autorestart: true,
|
|
8
|
+
watch: false,
|
|
9
|
+
max_memory_restart: '500M',
|
|
10
|
+
env: {
|
|
11
|
+
NODE_ENV: 'production',
|
|
12
|
+
BASE_PATH: '/',
|
|
13
|
+
DEMO_SSE_DELAY_MS: '100'
|
|
14
|
+
}
|
|
13
15
|
}
|
|
14
|
-
|
|
16
|
+
]
|
|
15
17
|
};
|
package/reportIssue/index.js
CHANGED
|
@@ -82,7 +82,7 @@ const populateQuery = async (issueID, timeStamp, jobID, query) => {
|
|
|
82
82
|
violatorData.reporters = getToolNamesString(violatorData.reporters);
|
|
83
83
|
});
|
|
84
84
|
const reporterCount = query.reporters.size;
|
|
85
|
-
query.reporterCount = reporterCount === 1 ? '1
|
|
85
|
+
query.reporterCount = reporterCount === 1 ? '1 rule engine' : `${reporterCount} rule engines`;
|
|
86
86
|
// Convert the set of issue reporters to a string.
|
|
87
87
|
query.reporters = getToolNamesString(query.reporters);
|
|
88
88
|
// Convert the violator data to an array.
|
package/reportIssues/api.js
CHANGED
|
@@ -29,7 +29,7 @@ const getIssueFacts = (thisHost, timeStamp, jobID, issue) => {
|
|
|
29
29
|
'numeric identifier': wcag
|
|
30
30
|
},
|
|
31
31
|
'impact on a user': why,
|
|
32
|
-
'
|
|
32
|
+
'rule engines reporting the issue': {
|
|
33
33
|
'number': reporterCount,
|
|
34
34
|
'names': reporters.map(tool => tool.toolName)
|
|
35
35
|
},
|
|
@@ -40,7 +40,7 @@ const getIssueFacts = (thisHost, timeStamp, jobID, issue) => {
|
|
|
40
40
|
}
|
|
41
41
|
};
|
|
42
42
|
};
|
|
43
|
-
// Returns a response to a
|
|
43
|
+
// Returns a response to a report-issues request.
|
|
44
44
|
exports.response = async args => {
|
|
45
45
|
const [timeStamp, jobID] = args;
|
|
46
46
|
const reportIsHidden = await isHidden(timeStamp, jobID);
|
|
@@ -63,12 +63,13 @@ exports.response = async args => {
|
|
|
63
63
|
const thisHost = process.env.THIS_KILOTEST_HOST;
|
|
64
64
|
// Get a response.
|
|
65
65
|
const content = {
|
|
66
|
-
summary: `This document fulfills a request made by
|
|
67
|
-
'tool name': 'Kilotest',
|
|
66
|
+
summary: `This document fulfills a request made by a language model to a Kilotest tool. The model requested data from a Kilotest report about the front-end quality (i.e. accessibility, usability, and standard-conformity) of a web page. Kilotest, with the help of Testaro, Testilo, and an ensemble of ten rule engines, performs tests on web pages, using a combination of rule- and machine-learning-based methods, and produces reports. Kilotest exposes several API endpoints to recommend web pages for testing and to obtain information from Kilotest reports. To learn more about Kilotest and the advantages of testing with an ensemble of rule engines, visit the deployed instance of Kilotest (${process.env.DEPLOYED_KILOTEST_HOST}), which contains an introduction on its home page and a tutorial.`,
|
|
67
|
+
'tool collection name': 'Kilotest',
|
|
68
|
+
'tool name': 'describeQualityOfOneWebPage',
|
|
68
69
|
request: {
|
|
69
70
|
'type of request': {
|
|
70
71
|
identifier: 'reportIssues',
|
|
71
|
-
description: '
|
|
72
|
+
description: 'Describe the quality of one web page.'
|
|
72
73
|
},
|
|
73
74
|
method: 'GET',
|
|
74
75
|
URLs: {
|
|
@@ -76,9 +77,10 @@ exports.response = async args => {
|
|
|
76
77
|
'equivalent URL for humans': `${thisHost}/reportIssues.html/${timeStamp}/${jobID}`
|
|
77
78
|
},
|
|
78
79
|
'closest ancestor request': {
|
|
79
|
-
|
|
80
|
+
identifier: 'summarizeQualityOfAllTestedWebPages',
|
|
81
|
+
description: 'Summarize the quality of all tested web pages.',
|
|
80
82
|
URLs: {
|
|
81
|
-
'for you': `${thisHost}/api/targets
|
|
83
|
+
'for you': `${thisHost}/api/targets`,
|
|
82
84
|
'for humans': `${thisHost}/targets.html`
|
|
83
85
|
}
|
|
84
86
|
}
|
|
@@ -96,9 +98,9 @@ exports.response = async args => {
|
|
|
96
98
|
description: what,
|
|
97
99
|
URL: url
|
|
98
100
|
},
|
|
99
|
-
'
|
|
100
|
-
'
|
|
101
|
-
'
|
|
101
|
+
'rule engines that tried to test the page': getToolsFacts(Object.keys(tools)),
|
|
102
|
+
'rule engines that were unable to test the page': preventedTools,
|
|
103
|
+
'rule engines that reported issues': {
|
|
102
104
|
number: reporterCount,
|
|
103
105
|
names: reporters.map(tool => tool.toolName)
|
|
104
106
|
},
|
package/reportIssues/index.js
CHANGED
|
@@ -65,7 +65,7 @@ const populateQuery = async (timeStamp, jobID, query) => {
|
|
|
65
65
|
query.timeStamp = timeStamp;
|
|
66
66
|
query.jobID = jobID;
|
|
67
67
|
// Add reporter information to the query.
|
|
68
|
-
query.reporterCount = reporterCount === 1 ? '1
|
|
68
|
+
query.reporterCount = reporterCount === 1 ? '1 rule engine' : `${reporterCount} rule engines`;
|
|
69
69
|
query.reporters = reporterList;
|
|
70
70
|
// Add a summary of the issues to the query.
|
|
71
71
|
query.issueCount = issueCount === 1 ? '1 issue was' : `${issueCount} issues were`;
|
|
@@ -107,7 +107,7 @@ const populateQuery = async (timeStamp, jobID, query) => {
|
|
|
107
107
|
// Add the issue facts to the lines.
|
|
108
108
|
detailsLines.push(`${margin} <li>Why it matters: ${why}`);
|
|
109
109
|
detailsLines.push(`${margin} <li>Related WCAG standard: ${wcagLink}`);
|
|
110
|
-
const reporterCountString = reporterCount === 1 ? '1
|
|
110
|
+
const reporterCountString = reporterCount === 1 ? '1 rule engine' : `${reporterCount} rule engines`;
|
|
111
111
|
detailsLines.push(
|
|
112
112
|
`${margin} <li>Reported by ${reporterCountString} (${reporterList})</li>`
|
|
113
113
|
);
|
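Note on the two hunks above: both build the count phrase with the same ternary. As a hedged sketch only, a small helper could centralize that logic. The name `countPhrase` and its parameters are hypothetical and do not appear in the package.

```javascript
// Hypothetical helper (not part of the Kilotest source):
// returns a count followed by a singular or plural phrase.
const countPhrase = (count, singular, plural = `${singular}s`) =>
  count === 1 ? `1 ${singular}` : `${count} ${plural}`;

// Usage mirroring the strings built in the hunks above.
console.log(countPhrase(1, 'rule engine')); // 1 rule engine
console.log(countPhrase(3, 'rule engine')); // 3 rule engines
console.log(countPhrase(2, 'issue was', 'issues were')); // 2 issues were
```

The optional third argument covers phrases such as `issue was` / `issues were`, where appending `s` would not produce the plural.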
package/targets/api.js
CHANGED
|
@@ -51,15 +51,15 @@ exports.response = async () => {
|
|
|
51
51
|
URL: url
|
|
52
52
|
},
|
|
53
53
|
'whether a later report about the same page exists': !! superseded,
|
|
54
|
-
'
|
|
54
|
+
'rule engines that tried to test the page': {
|
|
55
55
|
number: toolCount,
|
|
56
56
|
names: toolNames
|
|
57
57
|
},
|
|
58
|
-
'
|
|
58
|
+
'rule engines that were unable to test the page': {
|
|
59
59
|
number: preventedToolCount,
|
|
60
60
|
names: preventedToolNames
|
|
61
61
|
},
|
|
62
|
-
'
|
|
62
|
+
'rule engines that reported issues': {
|
|
63
63
|
number: reporterCount,
|
|
64
64
|
names: reporterNames
|
|
65
65
|
},
|
|
@@ -74,12 +74,13 @@ exports.response = async () => {
|
|
|
74
74
|
}
|
|
75
75
|
// Get a response.
|
|
76
76
|
const content = {
|
|
77
|
-
summary: `This document fulfills a request made by an agent to the Kilotest service. The agent requested data about the web pages that Kilotest had tested for accessibility, usability, and standard-conformity and, for each page, statistics about the results of the tests. Kilotest, with the help of Testaro, Testilo, and an ensemble of ten
|
|
78
|
-
'tool name': 'Kilotest',
|
|
77
|
+
summary: `This document fulfills a request made by an agent to the Kilotest service. The agent requested data about the web pages that Kilotest had tested for accessibility, usability, and standard-conformity and, for each page, statistics about the results of the tests. Kilotest, with the help of Testaro, Testilo, and an ensemble of ten rule engines, performs tests on web pages, using a combination of rule- and machine-learning-based methods, and produces reports. Kilotest exposes API endpoints for agents and web UI URLs for humans to recommend web pages for testing and to obtain information from Kilotest reports. To learn more about Kilotest and the advantages of testing with an ensemble of rule engines, visit the deployed instance of Kilotest (${process.env.DEPLOYED_KILOTEST_HOST}), whose home page contains an introduction and a link to a tutorial.`,
|
|
78
|
+
'tool collection name': 'Kilotest',
|
|
79
|
+
'tool name': 'summarizeQualityOfAllTestedWebPages',
|
|
79
80
|
request: {
|
|
80
81
|
'type of request': {
|
|
81
82
|
identifier: 'targets',
|
|
82
|
-
description: '
|
|
83
|
+
description: 'Summarize the quality of all tested web pages.'
|
|
83
84
|
},
|
|
84
85
|
method: 'GET',
|
|
85
86
|
URLs: {
|
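Note on the renamed keys above: a consumer of the per-page entries in `targets/api.js` would now read the three `rule engines …` properties, each holding `number` and `names`. The following is a minimal sketch under that assumption. The function `summarizeEngines` and the sample data (including the engine names) are invented for illustration, and where these entries sit within the full response is not shown in this diff.

```javascript
// Hypothetical consumer of one per-page entry.
// Only the three key names and the number/names shape come from the diff.
const summarizeEngines = entry => {
  const tried = entry['rule engines that tried to test the page'];
  const prevented = entry['rule engines that were unable to test the page'];
  const reporters = entry['rule engines that reported issues'];
  return `${reporters.number} of ${tried.number} rule engines reported issues; `
    + `${prevented.number} could not test the page.`;
};

// Sample entry invented for illustration only.
const sampleEntry = {
  'rule engines that tried to test the page': {
    number: 3,
    names: ['engineA', 'engineB', 'engineC']
  },
  'rule engines that were unable to test the page': {
    number: 1,
    names: ['engineC']
  },
  'rule engines that reported issues': {
    number: 2,
    names: ['engineA', 'engineB']
  }
};

console.log(summarizeEngines(sampleEntry));
```

Code written against the 31.2.2 key names would need updating, since the property names themselves changed in this release.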
package/testRecForm/api.js
CHANGED
|
@@ -20,12 +20,13 @@ exports.response = async (what, url, why) => {
|
|
|
20
20
|
await updateRecs(what, url, why);
|
|
21
21
|
// Get a response.
|
|
22
22
|
const content = {
|
|
23
|
-
summary: `This response acknowledges a request made by an agent to the Kilotest service. The agent recommended that Kilotest test, for the first time, the ${what} web page at ${url} for accessibility, usability, and standard-conformity. A Kilotest manager usually approves a recommendation within a day. When the recommendation is approved, the testing will be performed and results will become available. You can check for the availability of the results at ${thisHost}/api/targets. Kilotest performs its testing with the help of Testaro, Testilo, and an ensemble of ten
|
|
24
|
-
'tool name': 'Kilotest',
|
|
23
|
+
summary: `This response acknowledges a request made by an agent to the Kilotest service. The agent recommended that Kilotest test, for the first time, the ${what} web page at ${url} for accessibility, usability, and standard-conformity. A Kilotest manager usually approves a recommendation within a day. When the recommendation is approved, the testing will be performed and results will become available. You can check for the availability of the results at ${thisHost}/api/targets. Kilotest performs its testing with the help of Testaro, Testilo, and an ensemble of ten rule engines, using a combination of rule- and machine-learning-based methods. Kilotest exposes several API endpoints for agents and several web UI URLs for humans to obtain information from Kilotest reports. To learn more about Kilotest and the advantages of testing with an ensemble of rule engines, visit the deployed instance of Kilotest (${process.env.DEPLOYED_KILOTEST_HOST}), which contains an introduction on its home page and a tutorial.`,
|
|
24
|
+
'tool collection name': 'Kilotest',
|
|
25
|
+
'tool name': 'recommendQualityTestingOfOneWebPage',
|
|
25
26
|
request: {
|
|
26
27
|
'type of request': {
|
|
27
28
|
identifier: 'testRecForm',
|
|
28
|
-
description: '
|
|
29
|
+
description: 'Recommend quality testing of one web page.'
|
|
29
30
|
},
|
|
30
31
|
method: 'POST',
|
|
31
32
|
payload: {
|
package/tutorial/index.html
CHANGED
|
@@ -433,7 +433,7 @@
|
|
|
433
433
|
<h2>Further reading</h2>
|
|
434
434
|
<ul>
|
|
435
435
|
<li><a href="https://www.w3.org/WAI/WCAG22/Understanding/">Understanding WCAG 2.2</a> — W3C explanations of each success criterion</li>
|
|
436
|
-
<li><a href="https://www.w3.org/WAI/test-evaluate/tools/list/">Web Accessibility Evaluation Tools List</a> — W3C registry of
|
|
436
|
+
<li><a href="https://www.w3.org/WAI/test-evaluate/tools/list/">Web Accessibility Evaluation Tools List</a> — W3C registry of software that performs accessibility testing</li>
|
|
437
437
|
<li><a href="https://www.w3.org/WAI/WCAG22/Understanding/identify-input-purpose">Understanding SC 1.3.5: Identify Input Purpose</a> — detailed guidance on <code>autocomplete</code> requirements</li>
|
|
438
438
|
<li><a href="https://html.spec.whatwg.org/multipage/form-control-infrastructure.html#autofill">HTML Living Standard: Autofill</a> — the definitive list of valid <code>autocomplete</code> tokens</li>
|
|
439
439
|
<li><a href="https://arxiv.org/abs/2304.07591">Accessibility Metatesting: Comparing Nine Testing Tools</a> — research on rule-engine coverage variation</li>
|