@jrpool/kilotest 31.2.4 → 33.0.0

package/llms-full.txt CHANGED
@@ -2,19 +2,34 @@
2
2
 
3
3
  ## What Kilotest is
4
4
 
5
- Kilotest is an application that performs ensemble testing of web pages for accessibility, usability, and standard conformance and reports the test results. For brevity, hereafter in this document those three attributes are referred to as “front-end quality”.
5
+ Kilotest is an application that performs ensemble testing of web pages for front-end quality (i.e. accessibility, usability, and standard conformance) and reports the test results.
6
6
 
7
- ## What Kilotest does for AI agents
7
+ ## What Kilotest does for language models
8
8
 
9
- Kilotest is deployed as a service with a public URL. Kilotest can test only web pages that can be accessed from the public Internet and that are not protected by authentication or other access controls. Kilotest has not implemented any mechanism for testing private, internal, password-protected, or otherwise restricted pages.
9
+ Kilotest is deployed on the public Internet as a service at `kilotest.com`. Kilotest can test only web pages that can be accessed from the public Internet and that are not protected by authentication or other access controls. Kilotest has not implemented any mechanism for testing private, internal, password-protected, or otherwise restricted pages.
10
10
 
11
- An LLM cannot produce for an AI agent thorough, accurate, and inexpensive information about the front-end quality of a web page. To get such information, the agent needs to select and use specialized tools. However, such selection and use, too, require specialized skills that LLMs lack. Kilotest assumes responsibility for these functions. Kilotest:
11
+ An LLM can, under some conditions, load web pages and use its knowledge to assess successes and failures of front-end quality. But such an assessment is almost always:
12
12
 
13
- - selects, as tools, an ensemble of 10 rule engines that implement tests for, in total, more than a thousand rules for front-end quality
13
+ - expensive, substituting inference for deterministic rule application
14
+ - fragmentary, because of capability limitations, such as inability to run browsers and interact with web pages
15
+ - inaccurate, because of the tendency to replace missing information with hallucinations
16
+
17
+ To provide a less expensive, more thorough, and more accurate assessment, the agent could ask an ensemble of specialized rule engines to test the page. However, selecting and running such rule engines and consolidating their results is itself expensive and difficult. Kilotest assumes responsibility for these functions. Kilotest:
18
+
19
+ - defines a tool that tests web pages for front-end quality
20
+ - defines tools that report the test results
21
+ - defines connectors to those tools, compatible with major AI platforms
22
+
23
+ The tool that tests web pages for front-end quality has attributes that no other such tool has. It:
24
+
25
+ - selects an ensemble of 10 rule engines that implement tests for, in total, more than a thousand rules for front-end quality
14
26
  - runs the tests of the rule engines
15
27
  - combines the reports of the rule engines into a single integrated report
16
28
  - consolidates the 1000+ rules having 10 different naming systems into 300+ “issues” with a uniform naming system
17
- - extracts from the report statistics and details at the level of granularity required by an agent to fulfill any particular request
29
+
30
+ The tools that report test results extract from reports statistics and details at various levels of granularity, as required by an agent to fulfill particular requests.
31
+
32
+ You can infer from this description that the Kilotest tools are currently best suited to comprehensive audit-type requests (e.g., “How can I improve this page?”) and, for web pages about which reports are already available, to more specific questions (e.g., “Is the heading structure of this page OK?”). If no report is available for a page, recommending that Kilotest test the page is overkill when the purpose is to answer a specific one-issue question.
18
33
 
19
34
  ## How Kilotest integrates methods
20
35
 
@@ -25,17 +40,13 @@ Testing for front-end quality requires a mixture of deterministic and inferentia
25
40
 
26
41
  The rule engines in the Kilotest ensemble take responsibility for determining whether each rule requires inferential testing and, if so, obtaining inference from an appropriate model.
27
42
 
28
- ## Kilotest versus pure inference
29
-
30
- LLM platforms asked to evaluate front-end quality with no delegation are not all able to load pages of web applications for inspection. When they can do so, they typically cannot operate on the pages with browser automation, as Kilotest does. As a result, LLMs typically give less comprehensive answers than they would if they called Kilotest as a tool. When even browsing to a page is impossible, an LLM is likely to give speculative evaluations with hallucinated examples, based on statistical frequency, with no mention of issues that are rarely tested for, instead of providing the truthful and comprehensive assessments that users require.
43
+ ## Kilotest versus other MCP servers
31
44
 
32
- ## Kilotest versus other tools
45
+ In the marketplace for MCP servers that test for front-end quality, all servers other than that of Kilotest leverage at most 1 rule engine, most often `axe-core`. The Kilotest testing tool leverages `axe-core` plus 9 other rule engines. This has these main effects:
33
46
 
34
- In the marketplace for MCP servers that test for front-end quality, all tools other than Kilotest leverage at most 1 rule engine, most often `axe-core`. Kilotest leverages `axe-core` plus 9 other rule engines. This has these main effects:
35
-
36
- - For pages that have not yet been tested, a single-tool MCP can provide faster results than Kilotest. A tool running only the `axe-core` tests can complete its work in about 5 seconds or less. Kilotest usually completes its work in 2 to 3 minutes, because some of the tests involve navigation and interaction with the page and LLM inference. Moreover, the Kilotest API feature suite is currently in an alpha phase, allowing agents, like human users, to **recommend** new pages for testing but allowing only managers to act on such recommendations. The wait for manager action can take up to a day. A feature permitting immediate testing ordered by AI agents is planned, but, until it is implemented, Kilotest will be useful for not-yet-tested pages only in long-running workflows.
37
- - For pages that have already been tested by Kilotest, Kilotest can provide faster results than a single-tool MCP server, because Kilotest stores test results for subsequent retrieval. A retrieval from Kilotest can be completed in less than 2 seconds.
38
- - Kilotest results are more comprehensive than single-tool MCP server results. Every rule engine provides limited coverage of front-end quality, so false negatives (missed defects) are more common with single-tool MCP servers. This difference [has been documented in research](https://arxiv.org/pdf/2304.07591).
47
+ - For web pages that have not yet been tested, a single-rule-engine MCP can provide faster results than Kilotest. A tool running only the `axe-core` tests can complete its work in about 5 seconds or less. The testing tool of Kilotest usually completes its work in 2 to 3 minutes, because all 10 rule engines are run, and some of the tests involve navigation and interaction with the page and LLM inference. Moreover, in its current alpha phase, the Kilotest testing tool allows users (both humans and models) to **recommend** new pages for testing but authorizes only Kilotest managers to make the tool proceed with the testing. The wait for manager action can take up to a day. A feature permitting immediate testing ordered by AI agents is planned, but, until it is implemented, Kilotest will be useful for not-yet-tested pages only in long-running workflows.
48
+ - For web pages about which reports are already available, Kilotest can provide faster results than a single-rule-engine MCP server, because Kilotest stores test results for subsequent retrieval. A retrieval from a Kilotest reporting tool can be completed in less than 2 seconds.
49
+ - Kilotest results are more comprehensive than single-rule-engine MCP server results. Every rule engine provides limited coverage of front-end quality, so false negatives (missed defects) are more common with single-rule-engine MCP servers. This difference [has been documented in research](https://arxiv.org/pdf/2304.07591).
39
50
 
40
51
  ## How to use Kilotest
41
52
 
@@ -45,9 +56,9 @@ Kilotest offers a comprehensive suite of capabilities to users via its web UI:
45
56
 
46
57
  - [Home page](https://kilotest.com/)
47
58
  - [Summarize test results for all tested pages](https://kilotest.com/targets.html)
48
- - Provide statistics about issues reported in one job: `https://kilotest.com/reportIssues.html/{timeStamp}/{jobID}`
49
- - Provide details about one issue reported in one job: `https://kilotest.com/reportIssue.html/{issueID}/{timeStamp}/{jobID}`
50
- - Provide diagnoses by tools of rule violations for one HTML element exhibiting one issue in one job: `https://kilotest.com/diagnoses.html/{issueID}/{timeStamp}/{jobID}/{catalogIndex}`
59
+ - Provide statistics about issues reported in one report: `https://kilotest.com/reportIssues.html/{timeStamp}/{jobID}`
60
+ - Provide details about one issue reported in one report: `https://kilotest.com/reportIssue.html/{issueID}/{timeStamp}/{jobID}`
61
+ - Provide diagnoses by rule engines of rule violations for one HTML element exhibiting one issue in one report: `https://kilotest.com/diagnoses.html/{issueID}/{timeStamp}/{jobID}/{catalogIndex}`
51
62
  - [Receive a recommendation to test a not-yet-tested page](https://kilotest.com/testRecForm.html)
52
63
  - Receive a recommendation to retest a previously tested page: `https://kilotest.com/retestRecForm.html/{timeStamp}/{jobID}`
53
64
  - [Provide statistics about frequently reported issues across all pages](https://kilotest.com/issues.html)
@@ -57,34 +68,32 @@ Kilotest offers a comprehensive suite of capabilities to users via its web UI:
57
68
 
58
69
  ### Agent API
59
70
 
60
- Kilotest is implementing a richer suite of capabilities optimized for AI agents, including direct immediate testing. The implementation is currently in an alpha phase and offers 3 API endpoints:
71
+ Kilotest is implementing a richer suite of capabilities optimized for AI agents, including direct immediate testing. The implementation is currently in an alpha phase and offers 3 tools:
61
72
 
62
- - `targets`
73
+ - `summarizeQualityOfAllTestedWebPages`
63
74
  - method: `GET`
64
- - purpose: summarize test results from all jobs (a job is a session in which a web page is tested and a report is produced)
75
+ - purpose: summarize front-end quality test results from all reports (a report contains the records of one session in which a web page is tested)
65
76
  - path: `/api/targets`
66
- - `reportIssues`
77
+ - `describeQualityOfOneWebPage`
67
78
  - method: `GET`
68
- - purpose: provide statistics about issues reported in one job report
79
+ - purpose: provide statistics about front-end quality issues reported in one report
69
80
  - path: `/api/reportIssues/{timeStamp}/{jobID}`
70
81
  - parameters
71
- - `timeStamp`: initial segment of job identifier
72
- - `jobID`: final segment of job identifier
82
+ - `timeStamp`: initial segment of report identifier
83
+ - `jobID`: final segment of report identifier
73
84
  - source of parameters: response to a `targets` request
74
- - `testRecForm`
85
+ - `recommendQualityTestingOfOneWebPage`
75
86
  - method: `POST`
76
- - purpose: receive a recommendation to test a not-yet-tested page
87
+ - purpose: recommend quality testing of one web page
77
88
  - path: `/api/testRecForm`
78
89
  - payload properties
79
90
  - `what`: description of the page to be tested
80
91
  - `url`: URL of the page to be tested
81
92
  - `why`: reason for testing the page
82
- - how to verify disposition of the recommendation: submit a `targets` request and inspect the response to determine whether a report on the page is now available
93
+ - how to verify disposition of the recommendation: submit a `summarizeQualityOfAllTestedWebPages` request and inspect the response to determine whether a report on the page is now available
83
94
 
84
95
  An [OpenAPI specification for Kilotest](https://kilotest.com/openapi.yaml) is available.
85
96
 
86
- Until direct immediate testing is available, an agent can recommend testing of a web page with a `testRecForm` request.
87
-
88
97
  ### More information
89
98
 
90
99
  More information about Kilotest features and internals:
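
Reviewer note on the Agent API section of `llms-full.txt` above: the three documented paths can be sketched as request builders. This is a minimal sketch, not Kilotest code. The helper names are hypothetical, the base URL is assumed to be the `https://kilotest.com` host used by the web UI links, and JSON encoding of the `POST` payload is an assumption (the OpenAPI specification is the authority on the request body).

```javascript
// Hypothetical request builders. Only the paths and the property names
// (what, url, why, timeStamp, jobID) come from the documentation above.
const BASE_URL = 'https://kilotest.com'; // assumed API host

// URL for summaries of all available reports (GET /api/targets).
const targetsURL = () => `${BASE_URL}/api/targets`;

// URL for statistics about the issues in one report
// (GET /api/reportIssues/{timeStamp}/{jobID}).
const reportIssuesURL = (timeStamp, jobID) =>
  `${BASE_URL}/api/reportIssues/${encodeURIComponent(timeStamp)}/${encodeURIComponent(jobID)}`;

// URL and fetch() options for a testing recommendation (POST /api/testRecForm).
// The JSON body encoding is an assumption, not confirmed by this diff.
const testRecRequest = (what, url, why) => ({
  url: `${BASE_URL}/api/testRecForm`,
  options: {
    method: 'POST',
    headers: {'Content-Type': 'application/json'},
    body: JSON.stringify({what, url, why})
  }
});
```

An agent would call `targetsURL()` first, take `timeStamp` and `jobID` from that response, and use `testRecRequest` only when no suitable report is listed.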
package/mcp.js ADDED
@@ -0,0 +1,96 @@
1
+ /*
2
+ mcp.js
3
+ Handles MCP (Model Context Protocol) requests for Kilotest tools.
4
+ */
5
+
6
+ // IMPORTS
7
+
8
+ const {McpServer} = require('@modelcontextprotocol/sdk/server/mcp.js');
9
+ const {StreamableHTTPServerTransport} = require('@modelcontextprotocol/sdk/server/streamableHttp.js');
10
+ const {z} = require('zod');
11
+ const {isReportAvailable, isURL} = require('./util');
12
+ const targetsAPI = require('./targets/api');
13
+ const reportIssuesAPI = require('./reportIssues/api');
14
+ const testRecFormAPI = require('./testRecForm/api');
15
+
16
+ // FUNCTIONS
17
+
18
+ // Creates and returns an McpServer with Kilotest tools registered.
19
+ const createMCPServer = () => {
20
+ const server = new McpServer({name: 'Kilotest', version: '1.0.0'});
21
+ server.registerTool(
22
+ 'summarizeQualityOfAllTestedWebPages',
23
+ {
24
+ description: 'Returns summary data from every available Kilotest report about the front-end quality (i.e. accessibility, usability, and standard conformity) of a web page. Before calling describeQualityOfOneWebPage, call this tool to check whether a report about the page is available.',
25
+ inputSchema: {},
26
+ annotations: {
27
+ title: 'Summarize quality of all tested web pages',
28
+ readOnlyHint: true,
29
+ idempotentHint: true,
30
+ destructiveHint: false,
31
+ openWorldHint: false
32
+ }
33
+ },
34
+ async () => {
35
+ const result = await targetsAPI.response();
36
+ return {content: [{type: 'text', text: JSON.stringify(result)}]};
37
+ }
38
+ );
39
+ server.registerTool(
40
+ 'describeQualityOfOneWebPage',
41
+ {
42
+ description: 'Returns data from a specified Kilotest report about issues of front-end quality (i.e. accessibility, usability, and standard conformity) of a web page. The required timeStamp and jobID parameters identify the report and are obtained from a summarizeQualityOfAllTestedWebPages response.',
43
+ inputSchema: {
44
+ timeStamp: z.string().describe('Report timestamp in YYMMDDTHHMM format, e.g. 260503T0432'),
45
+ jobID: z.string().describe('Job identifier, e.g. x9z')
46
+ },
47
+ annotations: {
48
+ title: 'Describe the quality of one web page',
49
+ readOnlyHint: true,
50
+ idempotentHint: true,
51
+ destructiveHint: false,
52
+ openWorldHint: false
53
+ }
54
+ },
55
+ async ({timeStamp, jobID}) => {
56
+ const result = await reportIssuesAPI.response([timeStamp, jobID]);
57
+ return {content: [{type: 'text', text: JSON.stringify(result)}]};
58
+ }
59
+ );
60
+ server.registerTool(
61
+ 'recommendQualityTestingOfOneWebPage',
62
+ {
63
+ description: 'Recommends a web page for Kilotest to test for front-end quality (i.e. accessibility, usability, and standard conformity). Do not call this tool unless summarizeQualityOfAllTestedWebPages discloses that no report about the page or a related page that satisfies your requirements is available.',
64
+ inputSchema: {
65
+ what: z.string().describe('Short description of the page, following the naming conventions visible in the summarizeQualityOfAllTestedWebPages response'),
66
+ url: z.string().describe('Full HTTPS URL of the page to test'),
67
+ why: z.string().describe('Reason for recommending this page for testing')
68
+ },
69
+ annotations: {
70
+ title: 'Recommend quality testing of one web page',
71
+ readOnlyHint: false,
72
+ idempotentHint: false,
73
+ destructiveHint: false,
74
+ openWorldHint: false
75
+ }
76
+ },
77
+ async ({what, url, why}) => {
78
+ if (!isURL(url)) {
79
+ return {content: [{type: 'text', text: JSON.stringify({error: 'Invalid URL'})}], isError: true};
80
+ }
81
+ if (await isReportAvailable(what, url)) {
82
+ return {content: [{type: 'text', text: JSON.stringify({error: 'A report about the page is already available'})}], isError: true};
83
+ }
84
+ const result = await testRecFormAPI.response(what, url, why);
85
+ return {content: [{type: 'text', text: JSON.stringify(result)}]};
86
+ }
87
+ );
88
+ return server;
89
+ };
90
+ // Handles an MCP request.
91
+ exports.handleMCP = async (request, response) => {
92
+ const transport = new StreamableHTTPServerTransport({sessionIdGenerator: undefined});
93
+ const server = createMCPServer();
94
+ await server.connect(transport);
95
+ await transport.handleRequest(request, response);
96
+ };
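
Reviewer note on `mcp.js` above: the input schemas describe `timeStamp` as `YYMMDDTHHMM` and `url` as a full HTTPS URL, but both are typed as plain `z.string()`, and the `isURL` helper from `./util` is not shown in this diff. The following is a minimal sketch of stricter checks that could run before the handlers do any work. Both function names are hypothetical and are not part of Kilotest.

```javascript
// Hypothetical validators, shown only to illustrate the formats that the
// schema descriptions in mcp.js state. The real ./util module may differ.

// Matches the documented YYMMDDTHHMM shape, e.g. 260503T0432.
const TIME_STAMP_PATTERN = /^\d{6}T\d{4}$/;

// Returns whether a value has the documented report-timestamp shape.
const isTimeStamp = value =>
  typeof value === 'string' && TIME_STAMP_PATTERN.test(value);

// Returns whether a value parses as a URL with the https: scheme.
const isHTTPSURL = value => {
  try {
    return new URL(value).protocol === 'https:';
  }
  catch (error) {
    // new URL() throws on input that it cannot parse.
    return false;
  }
};
```

The pattern checks only the shape of the timestamp, not whether the digits form a real date and time.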
package/openapi.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  openapi: 3.1.0
2
2
  info:
3
3
  title: Kilotest Agent API
4
- description: Kilotest tests web pages for accessibility, usability, and standard conformity using an ensemble of ten independent tools that employ rule-based and machine-learning-based methods. This API enables AI agents to recommend web pages for testing, discover available test reports, and retrieve data from reports. For background on Kilotest and the advantages of ensemble testing, visit https://kilotest.com.
4
+ description: Kilotest tests web pages for front-end quality (i.e. accessibility, usability, and standard conformity) using an ensemble of ten independent rule engines that employ rule-based and machine-learning-based methods. This API enables AI agents to recommend web pages for testing, discover available test reports, and retrieve data from reports. For background on Kilotest and the advantages of ensemble testing, visit https://kilotest.com.
5
5
  version: 1.0.0
6
6
  contact:
7
7
  name: Kilotest
@@ -15,9 +15,9 @@ servers:
15
15
  paths:
16
16
  /api/targets:
17
17
  get:
18
- operationId: summarizeAccessibilityOfAllTestedWebPages
19
- summary: Summarizes all available reports
20
- description: Returns summary data about every non-hidden report available from Kilotest, including the name and URL of the tested web page, when the testing was performed, how many accessibility, usability, and standard-conformity issues were reported, and URLs for retrieving more detailed data from the report. This is the first endpoint to call if you want data about a particular web page. The result will tell you whether a report on that page already exists. If so, you can retrieve data from it. If not, you can use the submitWebAccessibilityTestRequest endpoint to recommend the page for testing.
18
+ operationId: summarizeQualityOfAllTestedWebPages
19
+ summary: Summarize quality of all tested web pages
20
+ description: Returns summary data about every non-hidden report available from Kilotest, including the name and URL of the tested web page, when the testing was performed, how many front-end quality (i.e. accessibility, usability, and standard-conformity) issues were reported, and URLs for retrieving more detailed data from the report. This is the first endpoint to call if you want data about a particular web page. The result will tell you whether a report on that page already exists. If so, you can retrieve data from it. If not, you can use the recommendQualityTestingOfOneWebPage operation to recommend the page for testing.
21
21
  responses:
22
22
  '200':
23
23
  description: Summaries of available reports
@@ -28,9 +28,9 @@ paths:
28
28
 
29
29
  /api/testRecForm:
30
30
  post:
31
- operationId: submitWebAccessibilityTestRequest
32
- summary: Receives a new testing recommendation
33
- description: Receives a recommendation for Kilotest to test, for the first time, a particular web page for accessibility, usability, and standard conformity. Recommendations are typically approved and the testing completed within a day, whereupon the results can be found with the summarizeAccessibilityOfAllTestedWebPages operation. Before submitting a recommendation, use the summarizeAccessibilityOfAllTestedWebPages operation to ensure that the page has not yet been tested, and also to see the stylistic rules for the naming of pages. An attempt to recommend an already tested page for testing will fail.
31
+ operationId: recommendQualityTestingOfOneWebPage
32
+ summary: Recommend quality testing of one web page
33
+ description: Submit a recommendation for Kilotest to test, for the first time, a particular web page for front-end quality (i.e. accessibility, usability, and standard conformity). Recommendations are typically approved and the testing completed within a day, whereupon the results can be found with the summarizeQualityOfAllTestedWebPages operation. Before submitting a recommendation, use the summarizeQualityOfAllTestedWebPages operation to ensure that no report about the page is available, and also to see the stylistic rules for the naming of pages. An attempt to recommend a page for testing will fail if a report about the page is already available.
34
34
  requestBody:
35
35
  description: Test recommendation specifications
36
36
  required: true
@@ -48,9 +48,9 @@ paths:
48
48
 
49
49
  /api/reportIssues/{timeStamp}/{jobID}:
50
50
  get:
51
- operationId: listAccessibilityIssuesOnOneWebPage
52
- summary: Gets data on issues from a specific report
53
- description: Returns data about the issues reported in a specific Kilotest report, grouped by priority. The data on each issue include the tools that reported it, the number of HTML elements exhibiting it, and URLs for retrieving element-level detail. The timeStamp and jobID components identify the report and are available in the response from the summarizeAccessibilityOfAllTestedWebPages operation.
51
+ operationId: describeQualityOfOneWebPage
52
+ summary: Describe quality of one web page
53
+ description: Get data about the front-end quality issues described in a specified Kilotest report. The data about each issue include its priority, the rule engines that reported it, the number of HTML elements exhibiting it, and URLs for retrieving element-level detail. The timeStamp and jobID parameters identify the report and are available in the response from the summarizeQualityOfAllTestedWebPages operation.
54
54
  parameters:
55
55
  - name: timeStamp
56
56
  in: path
@@ -81,14 +81,14 @@ paths:
81
81
 
82
82
  /api/reportIssue/{issueID}/{timeStamp}/{jobID}:
83
83
  get:
84
- operationId: listHTMLElementsHavingOneAccessibilityIssue
85
- summary: Gets details about a specific issue in a specific report (NOT YET IMPLEMENTED)
86
- description: Returns details about a single issue within a specific report, including which HTML elements exhibit the issue and, for each such element, URLs for retrieving tool-by-tool diagnoses of the issue on the element. NOT YET IMPLEMENTED.
84
+ operationId: describeHTMLElementsHavingOneQualityIssue
85
+ summary: Describe HTML elements having one quality issue (NOT YET IMPLEMENTED)
86
+ description: Get issue-specific data from a specified report about the front-end quality (i.e. accessibility, usability, and standard conformity) of a web page. The data describe the issue, all of the HTML elements of the page that have the issue, and, for each such element, URLs for retrieving diagnoses by rule engines of the issue on the element. NOT YET IMPLEMENTED.
87
87
  parameters:
88
88
  - name: issueID
89
89
  in: path
90
90
  required: true
91
- description: Issue identifier (e.g., imageNoText). Available under "issues reported" > priority level > "identifier" in the listAccessibilityIssuesOnOneWebPage response.
91
+ description: Issue identifier (e.g., imageNoText). Available under "issues reported" > priority level > "identifier" in the describeQualityOfOneWebPage response.
92
92
  schema:
93
93
  type: string
94
94
  examples:
@@ -191,29 +191,29 @@ components:
191
191
  TestRecFormResponse:
192
192
  $ref: '#/components/schemas/CommonResponseFields'
193
193
 
194
- ToolInfo:
194
+ RuleEngineInfo:
195
195
  type: object
196
- description: An accessibility testing tool in the Kilotest ensemble.
196
+ description: A rule engine in the Kilotest ensemble.
197
197
  properties:
198
198
  identifier:
199
199
  type: string
200
- description: Short programmatic identifier for the tool.
200
+ description: Short programmatic identifier for the rule engine.
201
201
  examples:
202
202
  - alfa
203
203
  name:
204
204
  type: string
205
- description: Display name of the tool.
205
+ description: Display name of the rule engine.
206
206
  examples:
207
207
  - Alfa
208
208
  sponsor:
209
209
  type: string
210
- description: Organization that created, initially sponsored, or now sponsors the tool.
210
+ description: Organization that created, initially sponsored, or now sponsors the rule engine.
211
211
  examples:
212
212
  - Siteimprove
213
213
 
214
- ToolFailure:
214
+ RuleEngineFailure:
215
215
  type: object
216
- description: A tool that was unable to complete testing of the page.
216
+ description: A rule engine that was unable to complete testing of the page.
217
217
  properties:
218
218
  name:
219
219
  type: string
@@ -224,9 +224,9 @@ components:
224
224
  examples:
225
225
  - Not enough credits.
226
226
 
227
- ToolsSummary:
227
+ RuleEnginesSummary:
228
228
  type: object
229
- description: Count and names of a set of tools.
229
+ description: Count and names of a set of rule engines.
230
230
  properties:
231
231
  number:
232
232
  type: integer
@@ -276,12 +276,12 @@ components:
276
276
  type: integer
277
277
  number of HTML elements reported as exhibiting issues:
278
278
  type: integer
279
- tools that tried to test the page:
280
- $ref: '#/components/schemas/ToolsSummary'
281
- tools that were unable to test the page:
282
- $ref: '#/components/schemas/ToolsSummary'
283
- tools that reported issues:
284
- $ref: '#/components/schemas/ToolsSummary'
279
+ rule engines that tried to test the page:
280
+ $ref: '#/components/schemas/RuleEnginesSummary'
281
+ rule engines that were unable to test the page:
282
+ $ref: '#/components/schemas/RuleEnginesSummary'
283
+ rule engines that reported issues:
284
+ $ref: '#/components/schemas/RuleEnginesSummary'
285
285
  URLs for getting data on the reported issues:
286
286
  $ref: '#/components/schemas/NextTierURLs'
287
287
  URL for getting the full technical report as JSON:
@@ -301,7 +301,7 @@ components:
301
301
 
302
302
  IssueEntry:
303
303
  type: object
304
- description: Details about a specific accessibility issue found on a page.
304
+ description: Details about a specific front-end quality issue found on a page.
305
305
  properties:
306
306
  identifier:
307
307
  type: string
@@ -329,8 +329,8 @@ components:
329
329
  impact on a user:
330
330
  type: string
331
331
  description: How this issue is likely to affect users.
332
- tools reporting the issue:
333
- $ref: '#/components/schemas/ToolsSummary'
332
+ rule engines reporting the issue:
333
+ $ref: '#/components/schemas/RuleEnginesSummary'
334
334
  number of HTML elements reported as exhibiting the issue:
335
335
  type: integer
336
336
  URLs for details about the issue on the page:
@@ -379,16 +379,16 @@ components:
379
379
  URL:
380
380
  type: string
381
381
  format: uri
382
- tools that tried to test the page:
382
+ rule engines that tried to test the page:
383
383
  type: array
384
384
  items:
385
- $ref: '#/components/schemas/ToolInfo'
386
- tools that were unable to test the page:
385
+ $ref: '#/components/schemas/RuleEngineInfo'
386
+ rule engines that were unable to test the page:
387
387
  type: array
388
388
  items:
389
- $ref: '#/components/schemas/ToolFailure'
390
- tools that reported issues:
391
- $ref: '#/components/schemas/ToolsSummary'
389
+ $ref: '#/components/schemas/RuleEngineFailure'
390
+ rule engines that reported issues:
391
+ $ref: '#/components/schemas/RuleEnginesSummary'
392
392
  number of issues reported:
393
393
  type: object
394
394
  properties:
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@jrpool/kilotest",
3
- "version": "31.2.4",
3
+ "version": "33.0.0",
4
4
  "description": "An ensemble testing service with a focus on accessibility",
5
5
  "main": "index.js",
6
6
  "scripts": {
@@ -24,8 +24,10 @@
24
24
  },
25
25
  "homepage": "https://github.com/jrpool/kilotest",
26
26
  "dependencies": {
27
+ "@modelcontextprotocol/sdk": "*",
27
28
  "dotenv": "*",
28
- "testilo": "*"
29
+ "testilo": "*",
30
+ "zod": "*"
29
31
  },
30
32
  "devDependencies": {
31
33
  "@eslint/css": "^1.0.0",
package/pm2.config.js CHANGED
@@ -1,15 +1,17 @@
1
1
  module.exports = {
2
- apps: [{
3
- name: 'kilotest',
4
- script: 'index.js',
5
- instances: 1,
6
- autorestart: true,
7
- watch: false,
8
- max_memory_restart: '500M',
9
- env: {
10
- NODE_ENV: 'production',
11
- BASE_PATH: '/',
12
- DEMO_SSE_DELAY_MS: '100'
2
+ apps: [
3
+ {
4
+ name: 'kilotest',
5
+ script: 'index.js',
6
+ instances: 1,
7
+ autorestart: true,
8
+ watch: false,
9
+ max_memory_restart: '500M',
10
+ env: {
11
+ NODE_ENV: 'production',
12
+ BASE_PATH: '/',
13
+ DEMO_SSE_DELAY_MS: '100'
14
+ }
13
15
  }
14
- }]
16
+ ]
15
17
  };
@@ -82,7 +82,7 @@ const populateQuery = async (issueID, timeStamp, jobID, query) => {
82
82
  violatorData.reporters = getToolNamesString(violatorData.reporters);
83
83
  });
84
84
  const reporterCount = query.reporters.size;
85
- query.reporterCount = reporterCount === 1 ? '1 tool' : `${reporterCount} tools`;
85
+ query.reporterCount = reporterCount === 1 ? '1 rule engine' : `${reporterCount} rule engines`;
86
86
  // Convert the set of issue reporters to a string.
87
87
  query.reporters = getToolNamesString(query.reporters);
88
88
  // Convert the violator data to an array.
@@ -29,7 +29,7 @@ const getIssueFacts = (thisHost, timeStamp, jobID, issue) => {
29
29
  'numeric identifier': wcag
30
30
  },
31
31
  'impact on a user': why,
32
- 'tools reporting the issue': {
32
+ 'rule engines reporting the issue': {
33
33
  'number': reporterCount,
34
34
  'names': reporters.map(tool => tool.toolName)
35
35
  },
@@ -40,7 +40,7 @@ const getIssueFacts = (thisHost, timeStamp, jobID, issue) => {
40
40
  }
41
41
  };
42
42
  };
43
- // Returns a response to a target-issues request.
43
+ // Returns a response to a report-issues request.
44
44
  exports.response = async args => {
45
45
  const [timeStamp, jobID] = args;
46
46
  const reportIsHidden = await isHidden(timeStamp, jobID);
@@ -63,12 +63,13 @@ exports.response = async args => {
63
63
  const thisHost = process.env.THIS_KILOTEST_HOST;
64
64
  // Get a response.
65
65
  const content = {
66
- summary: `This document fulfills a request made by an agent to the Kilotest service. The agent requested data from a Kilotest report about the accessibility, usability, and standard-conformity of a web page. Kilotest, with the help of Testaro, Testilo, and an ensemble of ten testing tools, performs tests on web pages, using a combination of rule- and machine-learning-based methods, and produces reports. Kilotest exposes several API endpoints for agents and several web UI URLs for humans to obtain information from Kilotest reports. To learn more about Kilotest and the advantages of testing with an ensemble of tools, visit the deployed instance of Kilotest (${process.env.DEPLOYED_KILOTEST_HOST}), which contains an introduction on its home page and a tutorial.`,
67
- 'tool name': 'Kilotest',
66
+ summary: `This document fulfills a request made by a language model to a Kilotest tool. The model requested data from a Kilotest report about the front-end quality (i.e. accessibility, usability, and standard-conformity) of a web page. Kilotest, with the help of Testaro, Testilo, and an ensemble of ten rule engines, performs tests on web pages, using a combination of rule- and machine-learning-based methods, and produces reports. Kilotest exposes several API endpoints to recommend web pages for testing and to obtain information from Kilotest reports. To learn more about Kilotest and the advantages of testing with an ensemble of rule engines, visit the deployed instance of Kilotest (${process.env.DEPLOYED_KILOTEST_HOST}), which contains an introduction on its home page and a tutorial.`,
67
+ 'tool collection name': 'Kilotest',
68
+ 'tool name': 'describeQualityOfOneWebPage',
68
69
  request: {
69
70
  'type of request': {
70
71
  identifier: 'reportIssues',
71
- description: 'What issues does the specified report describe?'
72
+ description: 'Describe the quality of one web page.'
72
73
  },
73
74
  method: 'GET',
74
75
  URLs: {
@@ -76,9 +77,10 @@ exports.response = async args => {
76
77
  'equivalent URL for humans': `${thisHost}/reportIssues.html/${timeStamp}/${jobID}`
77
78
  },
78
79
  'closest ancestor request': {
79
- description: 'Which web pages are reports available about, and what are the statistics about the issues reported for each page?',
80
+ identifier: 'summarizeQualityOfAllTestedWebPages',
81
+ description: 'Summarize the quality of all tested web pages.',
80
82
  URLs: {
81
- 'for you': `${thisHost}/api/targets.html`,
83
+ 'for you': `${thisHost}/api/targets`,
82
84
  'for humans': `${thisHost}/targets.html`
83
85
  }
84
86
  }
@@ -96,9 +98,9 @@ exports.response = async args => {
96
98
  description: what,
97
99
  URL: url
98
100
  },
99
- 'tools that tried to test the page': getToolsFacts(Object.keys(tools)),
100
- 'tools that were unable to test the page': preventedTools,
101
- 'tools that reported issues': {
101
+ 'rule engines that tried to test the page': getToolsFacts(Object.keys(tools)),
102
+ 'rule engines that were unable to test the page': preventedTools,
103
+ 'rule engines that reported issues': {
102
104
  number: reporterCount,
103
105
  names: reporters.map(tool => tool.toolName)
104
106
  },
@@ -65,7 +65,7 @@ const populateQuery = async (timeStamp, jobID, query) => {
65
65
  query.timeStamp = timeStamp;
66
66
  query.jobID = jobID;
67
67
  // Add reporter information to the query.
68
- query.reporterCount = reporterCount === 1 ? '1 tool' : `${reporterCount} tools`;
68
+ query.reporterCount = reporterCount === 1 ? '1 rule engine' : `${reporterCount} rule engines`;
69
69
  query.reporters = reporterList;
70
70
  // Add a summary of the issues to the query.
71
71
  query.issueCount = issueCount === 1 ? '1 issue was' : `${issueCount} issues were`;
@@ -107,7 +107,7 @@ const populateQuery = async (timeStamp, jobID, query) => {
107
107
  // Add the issue facts to the lines.
108
108
  detailsLines.push(`${margin} <li>Why it matters: ${why}`);
109
109
  detailsLines.push(`${margin} <li>Related WCAG standard: ${wcagLink}`);
110
- const reporterCountString = reporterCount === 1 ? '1 tool' : `${reporterCount} tools`;
110
+ const reporterCountString = reporterCount === 1 ? '1 rule engine' : `${reporterCount} rule engines`;
111
111
  detailsLines.push(
112
112
  `${margin} <li>Reported by ${reporterCountString} (${reporterList})</li>`
113
113
  );
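Note: the singular/plural ternary for counts appears in three of the hunks above. A minimal sketch of how it could be factored out, assuming a helper named `countPhrase` (hypothetical, not part of the Kilotest source):

```javascript
// Hypothetical helper, not present in the package: returns a count followed by
// the singular noun when the count is 1, and by the plural noun otherwise.
const countPhrase = (count, singular, plural = `${singular}s`) =>
  count === 1 ? `1 ${singular}` : `${count} ${plural}`;

// Same result as the ternary used for query.reporterCount.
console.log(countPhrase(3, 'rule engine')); // prints "3 rule engines"

// An explicit plural covers phrases such as the issue count.
console.log(countPhrase(1, 'issue was', 'issues were')); // prints "1 issue was"
```

Whether such a helper is worth adding depends on how many call sites share the pattern; the diff itself only renames the noun in place.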
package/targets/api.js CHANGED
@@ -51,15 +51,15 @@ exports.response = async () => {
51
51
  URL: url
52
52
  },
53
53
  'whether a later report about the same page exists': !! superseded,
54
- 'tools that tried to test the page': {
54
+ 'rule engines that tried to test the page': {
55
55
  number: toolCount,
56
56
  names: toolNames
57
57
  },
58
- 'tools that were unable to test the page': {
58
+ 'rule engines that were unable to test the page': {
59
59
  number: preventedToolCount,
60
60
  names: preventedToolNames
61
61
  },
62
- 'tools that reported issues': {
62
+ 'rule engines that reported issues': {
63
63
  number: reporterCount,
64
64
  names: reporterNames
65
65
  },
@@ -74,12 +74,13 @@ exports.response = async () => {
74
74
  }
75
75
  // Get a response.
76
76
  const content = {
77
- summary: `This document fulfills a request made by an agent to the Kilotest service. The agent requested data about the web pages that Kilotest had tested for accessibility, usability, and standard-conformity and, for each page, statistics about the results of the tests. Kilotest, with the help of Testaro, Testilo, and an ensemble of ten testing tools, performs tests on web pages, using a combination of rule- and machine-learning-based methods, and produces reports. Kilotest exposes API endpoints for agents and web UI URLs for humans to recommend web pages for testing and to obtain information from Kilotest reports. To learn more about Kilotest and the advantages of testing with an ensemble of tools, visit the deployed instance of Kilotest (${process.env.DEPLOYED_KILOTEST_HOST}), whose home page contains an introduction and a link to a tutorial.`,
78
- 'tool name': 'Kilotest',
77
+ summary: `This document fulfills a request made by an agent to the Kilotest service. The agent requested data about the web pages that Kilotest had tested for accessibility, usability, and standard-conformity and, for each page, statistics about the results of the tests. Kilotest, with the help of Testaro, Testilo, and an ensemble of ten rule engines, performs tests on web pages, using a combination of rule- and machine-learning-based methods, and produces reports. Kilotest exposes API endpoints for agents and web UI URLs for humans to recommend web pages for testing and to obtain information from Kilotest reports. To learn more about Kilotest and the advantages of testing with an ensemble of rule engines, visit the deployed instance of Kilotest (${process.env.DEPLOYED_KILOTEST_HOST}), whose home page contains an introduction and a link to a tutorial.`,
78
+ 'tool collection name': 'Kilotest',
79
+ 'tool name': 'summarizeQualityOfAllTestedWebPages',
79
80
  request: {
80
81
  'type of request': {
81
82
  identifier: 'targets',
82
- description: 'Give me summary data about each available report.'
83
+ description: 'Summarize the quality of all tested web pages.'
83
84
  },
84
85
  method: 'GET',
85
86
  URLs: {
@@ -20,12 +20,13 @@ exports.response = async (what, url, why) => {
20
20
  await updateRecs(what, url, why);
21
21
  // Get a response.
22
22
  const content = {
23
- summary: `This response acknowledges a request made by an agent to the Kilotest service. The agent recommended that Kilotest test, for the first time, the ${what} web page at ${url} for accessibility, usability, and standard-conformity. A Kilotest manager usually approves a recommendation within a day. When the recommendation is approved, the testing will be performed and results will become available. You can check for the availability of the results at ${thisHost}/api/targets. Kilotest performs its testing with the help of Testaro, Testilo, and an ensemble of ten testing tools, using a combination of rule- and machine-learning-based methods. Kilotest exposes several API endpoints for agents and several web UI URLs for humans to obtain information from Kilotest reports. To learn more about Kilotest and the advantages of testing with an ensemble of tools, visit the deployed instance of Kilotest (${process.env.DEPLOYED_KILOTEST_HOST}), which contains an introduction on its home page and a tutorial.`,
24
- 'tool name': 'Kilotest',
23
+ summary: `This response acknowledges a request made by an agent to the Kilotest service. The agent recommended that Kilotest test, for the first time, the ${what} web page at ${url} for accessibility, usability, and standard-conformity. A Kilotest manager usually approves a recommendation within a day. When the recommendation is approved, the testing will be performed and results will become available. You can check for the availability of the results at ${thisHost}/api/targets. Kilotest performs its testing with the help of Testaro, Testilo, and an ensemble of ten rule engines, using a combination of rule- and machine-learning-based methods. Kilotest exposes several API endpoints for agents and several web UI URLs for humans to obtain information from Kilotest reports. To learn more about Kilotest and the advantages of testing with an ensemble of rule engines, visit the deployed instance of Kilotest (${process.env.DEPLOYED_KILOTEST_HOST}), which contains an introduction on its home page and a tutorial.`,
24
+ 'tool collection name': 'Kilotest',
25
+ 'tool name': 'recommendQualityTestingOfOneWebPage',
25
26
  request: {
26
27
  'type of request': {
27
28
  identifier: 'testRecForm',
28
- description: 'I recommend that Kilotest test a particular web page.'
29
+ description: 'Recommend quality testing of one web page.'
29
30
  },
30
31
  method: 'POST',
31
32
  payload: {
@@ -433,7 +433,7 @@
433
433
  <h2>Further reading</h2>
434
434
  <ul>
435
435
  <li><a href="https://www.w3.org/WAI/WCAG22/Understanding/">Understanding WCAG 2.2</a> — W3C explanations of each success criterion</li>
436
- <li><a href="https://www.w3.org/WAI/test-evaluate/tools/list/">Web Accessibility Evaluation Tools List</a> — W3C registry of rule-engine tools</li>
436
+ <li><a href="https://www.w3.org/WAI/test-evaluate/tools/list/">Web Accessibility Evaluation Tools List</a> — W3C registry of software that performs accessibility testing</li>
437
437
  <li><a href="https://www.w3.org/WAI/WCAG22/Understanding/identify-input-purpose">Understanding SC 1.3.5: Identify Input Purpose</a> — detailed guidance on <code>autocomplete</code> requirements</li>
438
438
  <li><a href="https://html.spec.whatwg.org/multipage/form-control-infrastructure.html#autofill">HTML Living Standard: Autofill</a> — the definitive list of valid <code>autocomplete</code> tokens</li>
439
439
  <li><a href="https://arxiv.org/abs/2304.07591">Accessibility Metatesting: Comparing Nine Testing Tools</a> — research on rule-engine coverage variation</li>