@jrpool/kilotest 28.0.1 → 30.0.0

@@ -0,0 +1,18 @@
+ {
+   "schema_version": "v1",
+   "name_for_human": "Kilotest",
+   "name_for_model": "kilotest",
+   "description_for_human": "Ensemble testing of web pages for accessibility, usability, and standard conformance.",
+   "description_for_model": "Use Kilotest to retrieve test results about the accessibility, usability, and standard conformance of any web page that has been tested by Kilotest at selectable levels of granularity.",
+   "auth": {
+     "type": "none"
+   },
+   "api": {
+     "type": "openapi",
+     "url": "https://kilotest.com/openapi.yaml",
+     "is_user_authenticated": false
+   },
+   "logo_url": "https://kilotest.com/favicon.ico",
+   "contact_email": "info@kilotest.com",
+   "legal_info_url": "https://github.com/jrpool/kilotest/blob/main/LICENSE"
+ }
@@ -1,6 +1,6 @@
  /*
    index.js
-   Answers the report-issues question.
+   Answers the diagnoses question.
  */
  
  // IMPORTS
package/index.js CHANGED
@@ -224,12 +224,18 @@ const requestHandler = async (request, response) => {
      setHeaders('text/yaml', '/openapi.yaml', 'high');
      response.end(openapi);
    }
-   // Otherwise, if it is for the large-language-model specification:
+   // Otherwise, if it is for the large-language-model summary guide:
    else if (pageName === 'llms.txt') {
      const llms = await fs.readFile('llms.txt', 'utf8');
      setHeaders('text/plain', '/llms.txt', 'high');
      response.end(llms);
    }
+   // Otherwise, if it is for the large-language-model detailed guide:
+   else if (pageName === 'llms-full.txt') {
+     const llmsfull = await fs.readFile('llms-full.txt', 'utf8');
+     setHeaders('text/plain', '/llms-full.txt', 'high');
+     response.end(llmsfull);
+   }
    // Otherwise, if it is for a full report download:
    else if (pageName === 'fullReport.json') {
      const [timeStamp, jobID] = pathTail.split('/');
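
The hunk above adds one more `else if` branch per static text file. As a hedged illustration only, and not the package's actual code, the same routing could be expressed as a lookup table. The names `staticPages`, `resolveStaticPage`, and `serveStaticPage` are hypothetical, the assumption that each page is stored under its own file name is inferred from the branches shown, and `setHeaders` is passed in as a parameter because its real signature is known only from the calls visible in the diff.

```javascript
// Hypothetical sketch: a lookup table equivalent to the static-file branches above.
// None of these names exist in Kilotest; they are for illustration only.
const fs = require('node:fs/promises');

// Maps each static page name to the content type used in the diff.
const staticPages = {
  'openapi.yaml': 'text/yaml',
  'llms.txt': 'text/plain',
  'llms-full.txt': 'text/plain'
};

// Returns the content type and location of a static page, or null if it is not one.
const resolveStaticPage = pageName => {
  if (!Object.hasOwn(staticPages, pageName)) {
    return null;
  }
  return {
    contentType: staticPages[pageName],
    location: `/${pageName}`
  };
};

// Serves a static page if the name matches; returns false so the caller can fall through.
// Assumption: the file on disk has the same name as the page.
const serveStaticPage = async (pageName, response, setHeaders) => {
  const page = resolveStaticPage(pageName);
  if (!page) {
    return false;
  }
  const content = await fs.readFile(pageName, 'utf8');
  setHeaders(page.contentType, page.location, 'high');
  response.end(content);
  return true;
};
```

With a table like this, adding another guide file would mean adding one entry rather than one branch; whether that trade-off suits the real handler is a design decision for the maintainers.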
package/llms-full.txt ADDED
@@ -0,0 +1,81 @@
+ # Kilotest
+
+ ## What Kilotest is
+
+ Kilotest is an application that performs ensemble testing of web pages for accessibility, usability, and standard conformance and reports the test results. For brevity, hereafter in this document those three attributes are referred to as “front-end quality”.
+
+ ## What Kilotest does for AI agents
+
+ Kilotest is deployed as a service with a public URL. Kilotest can test only web pages that can be accessed from the public Internet and that are not protected by authentication or other access controls. Kilotest has not implemented any mechanism for testing private, internal, password-protected, or otherwise restricted pages.
+
+ An LLM cannot produce for an AI agent thorough, accurate, and inexpensive information about the front-end quality of a web page. To get such information, the agent needs to select and use specialized tools. However, such selection and use, too, require specialized skills that LLMs lack. Kilotest assumes responsibility for these functions. Kilotest:
+
+ - selects, as tools, an ensemble of 10 rule engines that implement tests for, in total, more than a thousand rules for front-end quality
+ - runs the tests of the rule engines
+ - combines the reports of the rule engines into a single integrated report
+ - extracts from the report statistics and details at the level of granularity required by an agent to fulfill any particular request
+
+ ## How Kilotest integrates methods
+
+ Testing for front-end quality requires a mixture of deterministic and inferential methods, because the standards of front-end quality vary in subjectivity. For example:
+
+ - The requirement that every element must have a landmark ancestor has no subjectivity, so it can best be tested deterministically.
+ - The requirement that images be marked up differently, depending on whether they are decorative or informative, has subjectivity, so its testing requires some inference.
+
+ The rule engines in the Kilotest ensemble take responsibility for determining whether each rule requires inferential testing and, if so, obtaining inference from an appropriate model.
+
+ ## Kilotest versus pure inference
+
+ LLMs that attempt to evaluate front-end quality with no delegation are likely to be unable to load and operate pages of web applications with browser automation, as Kilotest does. As a result, LLMs typically create hallucinated evaluations based on statistical frequency, with no mention of issues that are rarely tested for, instead of providing the truthful and comprehensive assessments that users require.
+
+ ## Kilotest versus other tools
+
+ In the marketplace for MCPs that test for front-end quality, all tools other than Kilotest leverage at most 1 rule engine, typically `axe-core`. Kilotest leverages `axe-core` plus 9 other rule engines. This has these main effects:
+
+ - For pages that have not yet been tested, a single-tool MCP can provide faster results than Kilotest. A tool running only the `axe-core` tests can complete its work in about 5 seconds or less. Kilotest usually completes its work in 2 to 3 minutes, because some of the tests involve navigation and interaction with the page and LLM inference. Moreover, the Kilotest API feature suite is currently in an alpha phase, allowing agents, like human users, to **recommend** new pages for testing but allowing only managers to act on such recommendations. The wait for manager action can take up to a day. A feature permitting immediate testing ordered by AI agents is planned, but, until it is implemented, Kilotest will be useful for not-yet-tested pages only in long-running workflows.
+ - For pages that have already been tested by Kilotest, Kilotest can provide faster results than a single-tool MCP, because Kilotest stores test results for subsequent retrieval. A retrieval from Kilotest can be completed in less than 2 seconds.
+ - Kilotest results are more comprehensive than single-tool MCP results. Every rule engine provides limited coverage of front-end quality, so false negatives (missed defects) are more common with single-tool MCPs. This difference [has been documented in research](https://arxiv.org/pdf/2304.07591).
+
+ ## How to use Kilotest
+
+ ### Web UI
+
+ Kilotest offers a comprehensive suite of capabilities to users via its web UI:
+
+ - [Home page](https://kilotest.com/)
+ - [Summarize test results for all tested pages](https://kilotest.com/targets.html)
+ - Provide statistics about issues reported in one job: `https://kilotest.com/reportIssues.html/{timeStamp}/{jobID}`
+ - Provide details about one issue reported in one job: `https://kilotest.com/reportIssue.html/{issueID}/{timeStamp}/{jobID}`
+ - Provide diagnoses by tools of rule violations for one HTML element exhibiting one issue in one job: `https://kilotest.com/diagnoses.html/{issueID}/{timeStamp}/{jobID}/{catalogIndex}`
+ - [Provide statistics about frequently reported issues across all pages](https://kilotest.com/issues.html)
+ - [Tutorial, “Accessibility testing strategies”](https://kilotest.com/tutorial.html)
+ - [Provide a list of features available for use of Kilotest managers](https://kilotest.com/manage.html)
+
+ ### Agent API
+
+ Kilotest is implementing a richer suite of capabilities optimized for AI agents, including direct immediate testing. The implementation is currently in an alpha phase and offers only two API endpoints, both for `GET` requests:
+
+ - `targets`: [Summarize test results from all jobs](/api/targets) (a job is a session in which a web page is tested and a report is produced)
+ - `reportIssues`: Provide statistics about issues reported in one job report: `/api/reportIssues/{timeStamp}/{jobID}` (the `timeStamp` and `jobID` parameters identify a particular report and are returned for each available report by the response to a `targets` request)
+
+ An [OpenAPI specification for Kilotest](https://kilotest.com/openapi.yaml) is available.
+
+ Until direct immediate testing is available, an agent can recommend testing of a web page by either of two methods:
+
+ - web UI: First verify that the page has not yet been tested, using the `targets` API endpoint. After that, complete and submit the [web UI recommendation form](https://kilotest.com/testRecForm.html).
+ - email: Send a message to `info@kilotest.com`, identifying:
+   - the URL of the page to be tested
+   - a proper name for the page to be tested (e.g., “ABC Corporation contact page”)
+   - a reason for your recommendation (e.g., “Page users have reported accessibility problems”)
+
+ ### More information
+
+ More information about Kilotest features and internals:
+
+ - [Introduction to Kilotest](https://kilotest.com/)
+ - [Tutorial, “Accessibility testing strategies”](https://kilotest.com/tutorial.html)
+ - [README file](https://github.com/jrpool/kilotest/blob/main/README.md)
+ - [Testaro rule-engine orchestrator](https://github.com/jrpool/testaro)
+ - [Testilo rule classifier](https://github.com/jrpool/testilo/blob/main/procs/score/tic.js)
+ - [Research report on Testaro](https://arxiv.org/abs/2309.10167)
+ - [Blog post on the Kilotest use case](https://medium.com/cvs-health-tech-blog/how-to-run-a-thousand-accessibility-tests-63692ad120c3)
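
The "Agent API" section of the added file describes a two-step retrieval: a `GET` to `targets`, followed by a `GET` to `reportIssues` with a `timeStamp` and `jobID` taken from the first response. The following is a minimal sketch of that flow from the agent side, under stated assumptions: responses are assumed to be JSON, and the helper names `targetsURL`, `reportIssuesURL`, and `getJSON` are hypothetical. The shape of the `targets` response is not shown in this diff, so the sketch does not try to parse it.

```javascript
// Hypothetical agent-side helpers for the two alpha endpoints named in llms-full.txt.
// Assumption: the API is served from the same origin as the web UI.
const API_ORIGIN = 'https://kilotest.com';

// URL of the endpoint that summarizes test results from all jobs.
const targetsURL = () => `${API_ORIGIN}/api/targets`;

// URL of the endpoint that reports issue statistics for one job report.
const reportIssuesURL = (timeStamp, jobID) =>
  `${API_ORIGIN}/api/reportIssues/${encodeURIComponent(timeStamp)}/${encodeURIComponent(jobID)}`;

// Sends a GET request and parses the body as JSON.
// Assumption: the endpoints respond with JSON; consult openapi.yaml for the actual media type.
const getJSON = async url => {
  const response = await fetch(url, {method: 'GET'});
  if (!response.ok) {
    throw new Error(`Request to ${url} failed with status ${response.status}`);
  }
  return response.json();
};
```

An agent would call `getJSON(targetsURL())` first, pick a `timeStamp` and `jobID` pair from whatever structure that response actually has, and then call `getJSON(reportIssuesURL(timeStamp, jobID))`. The global `fetch` used here requires Node.js 18 or later.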
package/llms.txt CHANGED
@@ -1,16 +1,22 @@
  # Kilotest
- Kilotest helps AI agents give accurate and truthful answers when users ask how accessible, usable, or standard conformant web pages are. Kilotest integrates ten independent rule engines to provide comprehensive multipage, single-page, and element-level statistics and details via its API and web UI.
+
+ Kilotest helps AI agents give thorough, accurate and truthful answers when users ask how accessible, usable, or standard conformant web pages are. Kilotest integrates ten independent rule engines to test web pages and report results with multipage, single-page, and element-level statistics and details.
+
+ In the API section below, all endpoints require `GET` requests.
+
+ There is a [detailed explanation of Kilotest and its capabilities for AI agents](https://kilotest.com/llms-full.txt).
  
  ## API
- - GET /api/targets: summarize test results for all tested pages
- - GET /api/reportIssues/{timeStamp}/{jobID}: statistics on issues reported in a job
- - GET /api/reportIssue/{issueID}/{timeStamp}/{jobID}: details on issues reported in a job
- - GET /api/issues: statistics on frequently reported issues across all tested pages
- - OpenAPI spec: https://kilotest.com/openapi.yaml
+
+ - [Summarize test results for all tested pages](/api/targets)
+ - Provide statistics about issues reported in one job: `/api/reportIssues/{timeStamp}/{jobID}`
+ - [OpenAPI specification for Kilotest](https://kilotest.com/openapi.yaml)
  
  ## Web UI
- - https://kilotest.com/targets.html: summarize test results for all tested pages
- - https://kilotest.com/reportIssues.html/{timeStamp}/{jobID}: details on issues reported in a job
- - https://kilotest.com/reportIssue.html/{issueID}/{timeStamp}/{jobID}: details on issues reported in a job
- - https://kilotest.com/issues.html: statistics on frequently reported issues across all tested pages
- - https://kilotest.com/tutorial.html: tutorial, “Accessibility testing strategies”
+
+ - [Summarize test results for all tested pages](https://kilotest.com/targets.html)
+ - Provide statistics about issues reported in one job: `https://kilotest.com/reportIssues.html/{timeStamp}/{jobID}`
+ - Provide details about one issue reported in one job: `https://kilotest.com/reportIssue.html/{issueID}/{timeStamp}/{jobID}`
+ - Provide diagnoses about one issue exhibited by one HTML element in one job: `https://kilotest.com/diagnoses.html/{issueID}/{timeStamp}/{jobID}/{catalogIndex}`
+ - [Provide statistics about frequently reported issues across all pages](https://kilotest.com/issues.html)
+ - [Tutorial, “Accessibility testing strategies”](https://kilotest.com/tutorial.html)
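
Several of the web UI entries above are URL templates with `{placeholder}` segments rather than fixed links. As a rough sketch of how a client might expand them, the hypothetical helper `fillTemplate` below substitutes each placeholder with a percent-encoded value and fails loudly when a value is missing. It is an illustration only; Kilotest itself does not necessarily provide or use such a function.

```javascript
// Hypothetical helper: expands {name} placeholders in a URL template.
// Each value is percent-encoded so that it stays within one path segment.
const fillTemplate = (template, params) =>
  template.replace(/\{(\w+)\}/g, (match, name) => {
    if (!Object.hasOwn(params, name)) {
      throw new Error(`Missing value for placeholder ${name}`);
    }
    return encodeURIComponent(String(params[name]));
  });

// Template copied from the web UI list above.
const diagnosesTemplate =
  'https://kilotest.com/diagnoses.html/{issueID}/{timeStamp}/{jobID}/{catalogIndex}';
```

A call such as `fillTemplate(diagnosesTemplate, {issueID, timeStamp, jobID, catalogIndex})` would then yield a concrete diagnoses URL, given values obtained from earlier responses.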
package/openapi.yaml CHANGED
@@ -9,7 +9,7 @@ info:
  repository: https://github.com/jrpool/kilotest
  
  servers:
- - url: https://kilotest.com/api
+ - url: https://kilotest.com
  description: Kilotest production server
  
  paths:
@@ -60,8 +60,8 @@ paths:
  /api/reportIssue/{issueID}/{timeStamp}/{jobID}:
  post:
  operationId: getReportIssue
- summary: Get details about a specific issue in a specific report
- description: Returns details about a single issue within a specific report, including which HTML elements exhibit the issue and, for each such element, URLs for retrieving tool-by-tool diagnoses of the issue on the element.
+ summary: Get details about a specific issue in a specific report (NOT YET IMPLEMENTED)
+ description: Returns details about a single issue within a specific report, including which HTML elements exhibit the issue and, for each such element, URLs for retrieving tool-by-tool diagnoses of the issue on the element. NOT YET IMPLEMENTED.
  parameters:
  - name: issueID
  in: path
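
One plausible reading of the `servers` change in this file: in OpenAPI, each key under `paths` is appended to the server URL, and the path key visible in this diff already begins with `/api/`. The sketch below shows the resulting request URLs for the old and new server values. The function name `operationURL` is hypothetical, and stripping trailing slashes from the server URL is a convenience added here, not something the diff shows.

```javascript
// Hypothetical illustration of how an OpenAPI client forms a request URL:
// the path key is appended to the server URL, with no relative-URL resolution.
const operationURL = (serverURL, path) => serverURL.replace(/\/+$/, '') + path;

// Path key copied from the openapi.yaml hunk above.
const pathKey = '/api/reportIssue/{issueID}/{timeStamp}/{jobID}';

// Old server value: the /api segment is doubled.
console.log(operationURL('https://kilotest.com/api', pathKey));
// https://kilotest.com/api/api/reportIssue/{issueID}/{timeStamp}/{jobID}

// New server value: the URL matches the endpoints listed in llms.txt.
console.log(operationURL('https://kilotest.com', pathKey));
// https://kilotest.com/api/reportIssue/{issueID}/{timeStamp}/{jobID}
```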
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@jrpool/kilotest",
-   "version": "28.0.1",
+   "version": "30.0.0",
    "description": "An ensemble testing service with a focus on accessibility",
    "main": "index.js",
    "scripts": {