@jrpool/kilotest 31.0.1 → 31.2.0

package/AI-TOOL.md ADDED
@@ -0,0 +1,66 @@
1
+ # Kilotest as an AI Tool
2
+
3
+ ## Introduction
4
+
5
+ Until 2026 Kilotest was intended, and implemented, as a web application for human users.
6
+
7
+ Beginning in May 2026, it [became evident](https://github.com/jrpool/kilotest/issues/2) that Kilotest could also act as a tool for use by language models. Language models are asked for help in all domains, including the domain of software quality. When asked about the front-end quality (accessibility, usability, and standard conformity) of specific web pages, models gave answers without the use of tools, and the answers were inferior: simplistic, speculative, and fabricated. If a model were to use Kilotest as a tool, the model could give comprehensive, authoritative, grounded, and factual answers. Every reported defect could be documented and ascribed to one or more specific tools in the Kilotest ensemble.
8
+
9
+ Given the potential of Kilotest as an AI tool and the expected continued growth in the share of questions that are directed to language models, a decision was made to make Kilotest discoverable and usable as a tool for language models.
10
+
11
+ ## Internal additions
12
+
13
+ The internal changes that have been made in the Kilotest codebase to support the use of Kilotest as an AI tool are:
14
+
15
+ - An API, consisting of:
16
+ - Additions to `index.js`.
17
+ - Addition to `util.js`.
18
+ - Within directories providing API functionality:
19
+ - `api.js` modules.
20
+ - `util.js` modules, if needed, containing resources shared by the `index.js` and `api.js` modules in those directories.
21
+ - Additions to the `env.example` file.
22
+ - An `llms.txt` file and an `llms-full.txt` file, documenting the use of Kilotest by language models, conforming to the [llms-txt](https://llmstxt.org/) specification.
23
+ - An `openapi.yaml` file, documenting the Kilotest API, conforming to the [OpenAPI specification](https://spec.openapis.org/oas/v3.1.0).
24
+ - A `sitemap.xml` file.
25
+ - A `researchAgent.js` file, testing the API.
26
+
27
+ ## External actions
28
+
29
+ The external actions that have been taken to support the use of Kilotest as an AI tool are:
30
+
31
+ - A [pull request](https://github.com/public-apis/public-apis/pull/6346/changes) to add Kilotest to the list of public APIs in the `public-apis` repository.
32
+ - A [request](https://rapidapi.com/studio/api_91f2ce07-2572-48bd-a34d-ff01ed6cd039/publish/general) to add Kilotest to the Rapid API Hub.
33
+ - An [issue](https://github.com/APIs-guru/openapi-directory/issues/2677) to add Kilotest to `openapi-directory`.
34
+ - A [pull request](https://github.com/w3c/wai-evaluation-tools-list/pull/1153) to add Kilotest to the WAI evaluation tools list.
35
+ - [Configuration of Claude Desktop](https://github.com/ivo-toby/mcp-openapi-server#option-1-using-with-claude-desktop-stdio-transport) on the local development host to let Claude models find Kilotest.
36
+
37
+ ## Use cases
38
+
39
+ The rationale for Kilotest as a tool for language models is set forth in the `llms-full.txt` file and is not repeated here.
40
+
41
+ Some common anticipated use cases for this role are:
42
+
43
+ - A user of a web-builder platform with responsibility for a website asks an AI platform for help in creating or improving the website.
44
+ - A prospective customer of a web development service asks an AI platform to evaluate the quality of websites in the portfolios of candidate vendors.
45
+ - A professional web developer within an organization asks an AI platform for a code review.
46
+ - A risk-management professional within an organization asks an AI platform to report on any web accessibility defects that could expose the organization to prosecution or civil litigation.
47
+ - A person who depends on web accessibility because of disabilities asks an AI platform to provide documentary support for a complaint to the owner of a website about accessibility defects.
48
+ - A disability-rights advocate or attorney concerned with inaccessibility in a particular industry asks an AI platform to perform a front-end-quality comparison of some websites in that industry.
49
+
50
+ ## Strategy
51
+
52
+ ### Ease of use
53
+
54
+ Any use of Kilotest as a tool of language models will fail for some of the anticipated users if they are obligated to be aware of Kilotest and to translate that awareness into instructions or documentation. Therefore, for success in all the use cases, users should be able to tell models what is wanted and rely on models to figure out whether they need tools and, if so, which ones and how to use them.
55
+
56
+ ### Increments
57
+
58
+ The standardization of tool discovery and utilization by and for language models and AI platforms is partial. Major differences in protocols exist among model and platform families. Therefore, small testable increments of improvement in the tool functionality of Kilotest can best be defined by model and platform family. For example, working on discoverability first and then on usability would not be effective, because both are complex and testability would be postponed until both were complete. Instead, a coherent model and platform family should be identified and any and all improvements to make use cases successful within that family should be made and tested, before development proceeds to the next family.
59
+
60
+ One benefit of this type of incrementalism is that, after the first increment succeeds, it becomes possible to make and test external changes publicizing the fact that a particular platform-model combination delivers unprecedentedly comprehensive, truthful, and low-cost answers to questions about front-end web quality.
61
+
62
+ Another benefit is that subsequent increments can be defined incrementally rather than in advance. Lessons learned from the work on each increment can inform the choice of what to work on next.
63
+
64
+ #### Increment 1
65
+
66
+ In the first increment, the objective is to make Kilotest automatically discovered and used by Anthropic Claude models when used within the Claude Desktop application. That investigation is under way. Results will be summarized here.
package/eslint.config.mjs CHANGED
@@ -8,7 +8,7 @@ import { defineConfig } from "eslint/config";
8
8
  export default defineConfig([
9
9
  {
10
10
  ignores: [
11
- "DEVELOPMENT.md",
11
+ "IDEAS.md",
12
12
  "package-lock.json"
13
13
  ]
14
14
  },
package/index.js CHANGED
@@ -171,6 +171,17 @@ const checkBalancesForAlerts = async report => {
171
171
  }
172
172
  }
173
173
  };
174
+ // Minifies a URL.
175
+ const minifyURL = url => url.replace(/www\.|\/$/g, '');
176
+ // Returns whether a report on a page is available.
177
+ const isReportAvailable = async (what, url) => {
178
+ const logs = await getLogs();
179
+ const whats = logs.map(log => log.what);
180
+ const urls = logs.map(log => log.url);
181
+ const miniURLs = urls.map(url => minifyURL(url));
182
+ const miniURL = minifyURL(url);
183
+ return whats.includes(what) || miniURLs.includes(miniURL);
184
+ };
174
185
  // Handles a request.
175
186
  const requestHandler = async (request, response) => {
176
187
  // Sets response headers.
@@ -214,13 +225,6 @@ const requestHandler = async (request, response) => {
214
225
  setHeaders('text/html', '/index.html', 'medium');
215
226
  response.end(homePage);
216
227
  }
217
- // Otherwise, if it is for the AI plugin specification:
218
- else if (pathname === '/.well-known/ai-plugin.json') {
219
- const aiPlugin = await fs.readFile('.well-known/ai-plugin.json', 'utf8');
220
- // Serve it.
221
- setHeaders('application/json', '/.well-known/ai-plugin.json', 'low');
222
- response.end(aiPlugin);
223
- }
224
228
  // Otherwise, if it is for the crawler specification:
225
229
  else if (pageName === 'robots.txt') {
226
230
  const robots = await fs.readFile('robots.txt', 'utf8');
@@ -457,20 +461,28 @@ const requestHandler = async (request, response) => {
457
461
  const {what, url, why} = postData;
458
462
  // If the request is valid:
459
463
  if (what && url.startsWith('https://') && why) {
460
- // Serve headers for a response.
461
- setHeaders('text/html', `${pathname}${search}`, 'high');
462
- // Get the answer data.
463
- const answerData = await require(path.join(__dirname, 'testRec', 'index'))
464
- .answer(what, url, why);
465
- // If they are valid:
466
- if (answerData.status === 'ok') {
467
- // Serve the answer page.
468
- response.end(answerData.answerPage);
464
+ // If a report on the page is already available:
465
+ if (await isReportAvailable(what, url)) {
466
+ // Report the error.
467
+ await serveError({message: 'ERROR: Page has already been tested'}, response, true);
469
468
  }
470
- // Otherwise, i.e. if they are invalid:
469
+ // Otherwise, i.e. if no report on the page is available:
471
470
  else {
472
- // Report the error.
473
- await serveError({message: answerData.error}, response, true);
471
+ // Serve headers for a response.
472
+ setHeaders('text/html', `${pathname}${search}`, 'high');
473
+ // Get the answer data.
474
+ const answerData = await require(path.join(__dirname, 'testRec', 'index'))
475
+ .answer(what, url, why);
476
+ // If they are valid:
477
+ if (answerData.status === 'ok') {
478
+ // Serve the answer page.
479
+ response.end(answerData.answerPage);
480
+ }
481
+ // Otherwise, i.e. if they are invalid:
482
+ else {
483
+ // Report the error.
484
+ await serveError({message: answerData.error}, response, true);
485
+ }
474
486
  }
475
487
  }
476
488
  // Otherwise, i.e. if the request is invalid:
@@ -683,19 +695,12 @@ const requestHandler = async (request, response) => {
683
695
  // Otherwise, if the first segment is the test recommendation service:
684
696
  else if (segments[0] === 'testRecForm') {
685
697
  const {what, url, why} = postData;
686
- const logs = await getLogs();
687
- const whats = logs.map(log => log.what);
688
- const urls = logs.map(log => log.url);
689
698
  // If the payload is a valid test recommendation:
690
699
  if (what && isURL(url) && why) {
691
- // If an available report has the same URL or the same page description:
692
- if (whats.includes(what) || urls.includes(url)) {
700
+ // If a report on the page is already available:
701
+ if (await isReportAvailable(what, url)) {
693
702
  // Report this.
694
- await serveError(
695
- 'ERROR: A report with the same page description or URL is already available',
696
- response,
697
- false
698
- );
703
+ await serveError({message: 'ERROR: A report on the page is already available'}, response, false);
699
704
  }
700
705
  // Otherwise, i.e. if no report on the page is available:
701
706
  else {
@@ -707,6 +712,11 @@ const requestHandler = async (request, response) => {
707
712
  response.end(JSON.stringify(responseData));
708
713
  }
709
714
  }
715
+ // Otherwise, i.e. if it is not a valid test recommendation:
716
+ else {
717
+ // Report this.
718
+ await serveError({message: 'ERROR: Invalid test recommendation'}, response, false);
719
+ }
710
720
  }
711
721
  // Otherwise, i.e. if the request is invalid:
712
722
  else {
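The duplicate check that this diff adds to `index.js` can be tried in isolation. The following is a minimal sketch, not the package's code: `minifyURL` is copied from the diff, but `isReportAvailableIn` is a hypothetical pure variant of `isReportAvailable` that takes the logs as an argument instead of calling `getLogs()`, and the sample log entry is invented.

```javascript
// Copied from the diff: strips "www." and a trailing slash from a URL.
const minifyURL = url => url.replace(/www\.|\/$/g, '');

// Hypothetical pure variant of isReportAvailable (an assumption for this sketch):
// the logs are passed in rather than fetched with getLogs().
const isReportAvailableIn = (logs, what, url) => {
  const miniURL = minifyURL(url);
  // A report counts as available if either the description or the minified URL matches.
  return logs.some(log => log.what === what || minifyURL(log.url) === miniURL);
};

// Invented sample data, shaped like the {what, url} log entries used in the diff.
const sampleLogs = [
  {what: 'Example Home', url: 'https://www.example.com/'}
];

// Same page, written without "www." and without the trailing slash.
console.log(isReportAvailableIn(sampleLogs, 'Another Name', 'https://example.com')); // true
// Different URL, but the same page description.
console.log(isReportAvailableIn(sampleLogs, 'Example Home', 'https://other.example.org')); // true
// Neither the description nor the URL matches.
console.log(isReportAvailableIn(sampleLogs, 'Another Name', 'https://example.org')); // false
```

Because the pattern is not anchored to the host, `minifyURL` also strips a `www.` that appears later in the URL, for example in a path segment, so two distinct URLs that differ only in that way would be treated as the same page.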
package/llms-full.txt CHANGED
@@ -27,7 +27,7 @@ The rule engines in the Kilotest ensemble take responsibility for determining wh
27
27
 
28
28
  ## Kilotest versus pure inference
29
29
 
30
- LLMs that attempt to evaluate front-end quality with no delegation are likely to be unable to load and operate pages of web applications with browser automation, as Kilotest does. As a result, LLMs typically create hallucinated evaluations based on statistical frequency, with no mention of issues that are rarely tested for, instead of providing the truthful and comprehensive assessments that users require.
30
+ LLM platforms asked to evaluate front-end quality with no delegation are not all able to load pages of web applications for inspection. When they can do so, they typically cannot operate on the pages with browser automation, as Kilotest does. As a result, LLMs typically give less comprehensive answers than they would if they called Kilotest as a tool. When even browsing to a page is impossible, an LLM is likely to give speculative evaluations with hallucinated examples, based on statistical frequency, with no mention of issues that are rarely tested for, instead of providing the truthful and comprehensive assessments that users require.
31
31
 
32
32
  ## Kilotest versus other tools
33
33
 
package/openapi.yaml CHANGED
@@ -13,11 +13,24 @@ servers:
13
13
  description: Kilotest production server
14
14
 
15
15
  paths:
16
+ /api/targets:
17
+ get:
18
+ operationId: summarizeAccessibilityOfAllTestedWebPages
19
+ summary: Summarizes all available reports
20
+ description: Returns summary data about every non-hidden report available from Kilotest, including the name and URL of the tested web page, when the testing was performed, how many accessibility, usability, and standard-conformity issues were reported, and URLs for retrieving more detailed data from the report. This is the first endpoint to call if you want data about a particular web page. The result will tell you whether a report on that page already exists. If so, you can retrieve data from it. If not, you can use the submitWebAccessibilityTestRequest endpoint to recommend the page for testing.
21
+ responses:
22
+ '200':
23
+ description: Summaries of available reports
24
+ content:
25
+ application/json:
26
+ schema:
27
+ $ref: '#/components/schemas/TargetsResponse'
28
+
16
29
  /api/testRecForm:
17
30
  post:
18
- operationId: testRecForm
19
- summary: Receives a testing recommendation
20
- description: Receives a recommendation for Kilotest to test a particular web page.
31
+ operationId: submitWebAccessibilityTestRequest
32
+ summary: Receives a new testing recommendation
33
+ description: Receives a recommendation for Kilotest to test, for the first time, a particular web page for accessibility, usability, and standard conformity. Recommendations are typically approved and the testing completed within a day, whereupon the results can be found with the summarizeAccessibilityOfAllTestedWebPages operation. Before submitting a recommendation, use the summarizeAccessibilityOfAllTestedWebPages operation to ensure that the page has not yet been tested, and also to see the stylistic rules for the naming of pages. An attempt to recommend an already tested page for testing will fail.
21
34
  requestBody:
22
35
  description: Test recommendation specifications
23
36
  required: true
@@ -33,24 +46,11 @@ paths:
33
46
  schema:
34
47
  $ref: '#/components/schemas/TestRecFormResponse'
35
48
 
36
- /api/targets:
37
- get:
38
- operationId: targets
39
- summary: Summarizes all available reports
40
- description: Returns summary data about every non-hidden report available from Kilotest, including the name and URL of the tested web page, when the testing was performed, how many issues were reported, and URLs for retrieving more detailed data from the report.
41
- responses:
42
- '200':
43
- description: Summaries of available reports
44
- content:
45
- application/json:
46
- schema:
47
- $ref: '#/components/schemas/TargetsResponse'
48
-
49
49
  /api/reportIssues/{timeStamp}/{jobID}:
50
50
  get:
51
- operationId: getReportIssues
51
+ operationId: listAccessibilityIssuesOnOneWebPage
52
52
  summary: Gets data on issues from a specific report
53
- description: Returns data about the issues reported in a specific Kilotest report, grouped by priority. The data on each issue include the tools that reported it, the number of HTML elements exhibiting it, and URLs for retrieving element-level detail. The timeStamp and jobID components identify the report and are available in the targets response.
53
+ description: Returns data about the issues reported in a specific Kilotest report, grouped by priority. The data on each issue include the tools that reported it, the number of HTML elements exhibiting it, and URLs for retrieving element-level detail. The timeStamp and jobID components identify the report and are available in the response from the summarizeAccessibilityOfAllTestedWebPages operation.
54
54
  parameters:
55
55
  - name: timeStamp
56
56
  in: path
@@ -81,14 +81,14 @@ paths:
81
81
 
82
82
  /api/reportIssue/{issueID}/{timeStamp}/{jobID}:
83
83
  get:
84
- operationId: getReportIssue
84
+ operationId: listHTMLElementsHavingOneAccessibilityIssue
85
85
  summary: Gets details about a specific issue in a specific report (NOT YET IMPLEMENTED)
86
86
  description: Returns details about a single issue within a specific report, including which HTML elements exhibit the issue and, for each such element, URLs for retrieving tool-by-tool diagnoses of the issue on the element. NOT YET IMPLEMENTED.
87
87
  parameters:
88
88
  - name: issueID
89
89
  in: path
90
90
  required: true
91
- description: Issue identifier (e.g., imageNoText). Available under "issues reported" > priority level > "identifier" in the getReportIssues response.
91
+ description: Issue identifier (e.g., imageNoText). Available under "issues reported" > priority level > "identifier" in the listAccessibilityIssuesOnOneWebPage response.
92
92
  schema:
93
93
  type: string
94
94
  examples:
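The revised operation descriptions imply a workflow for agents: call `summarizeAccessibilityOfAllTestedWebPages` first, then either retrieve issues from an existing report or submit a recommendation. The sketch below illustrates that decision without any network access. `chooseNextStep` is a hypothetical helper, the sample response is invented, and the response fields it reads (`available reports`, `identifier`, `tested web page`, `URL`) are assumed from their use in `researchAgent.js` rather than from the `TargetsResponse` schema, which this diff does not show.

```javascript
// Same normalization as minifyURL in index.js.
const minifyURL = url => url.replace(/www\.|\/$/g, '');

// Hypothetical helper: given the parsed body of a GET /api/targets response and
// a page URL, returns the operation an agent would call next.
const chooseNextStep = (targets, url) => {
  const wanted = minifyURL(url);
  const report = targets['available reports'].find(
    candidate => minifyURL(candidate['tested web page'].URL) === wanted
  );
  // If the page has already been tested, retrieve the issues from its report.
  if (report) {
    const [timeStamp, jobID] = report.identifier.split('-');
    return {
      operationId: 'listAccessibilityIssuesOnOneWebPage',
      method: 'GET',
      path: `/api/reportIssues/${timeStamp}/${jobID}`
    };
  }
  // Otherwise, recommend the page for testing.
  return {
    operationId: 'submitWebAccessibilityTestRequest',
    method: 'POST',
    path: '/api/testRecForm'
  };
};

// Invented sample response; the identifier value is a placeholder.
const sampleTargets = {
  'available reports': [
    {identifier: '260101T1200-abc12', 'tested web page': {URL: 'https://www.example.com/'}}
  ]
};

console.log(chooseNextStep(sampleTargets, 'https://example.com').path);
console.log(chooseNextStep(sampleTargets, 'https://untested.example.org').path);
```

Checking locally before posting mirrors the server-side rule that an attempt to recommend an already tested page fails, so an agent following this order should rarely see that error.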
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@jrpool/kilotest",
3
- "version": "31.0.1",
3
+ "version": "31.2.0",
4
4
  "description": "An ensemble testing service with a focus on accessibility",
5
5
  "main": "index.js",
6
6
  "scripts": {
package/researchAgent.js CHANGED
@@ -65,15 +65,16 @@ const requestService = async () => {
65
65
  const content = chunks.join('');
66
66
  try {
67
67
  // Output it.
68
- const contentObj = JSON.parse(content);
69
- console.log(JSON.stringify(contentObj, null, 2));
68
+ const targetsObj = JSON.parse(content);
69
+ console.log(JSON.stringify(targetsObj, null, 2));
70
+ const reports = targetsObj['available reports'];
70
71
  // Get the IDs of the available reports.
71
- const reportIDs = contentObj['available reports'].map(report => report.identifier);
72
+ const reportIDs = reports.map(report => report.identifier);
72
73
  // Choose one at random.
73
74
  const [timeStamp, jobID] = reportIDs[Math.floor(Math.random() * reportIDs.length)]
74
75
  .split('-');
75
76
  const path = `/api/reportIssues/${timeStamp}/${jobID}`;
76
- console.log('======================');
77
+ console.log('======================');
77
78
  console.log(`About to submit ${scheme} request as JSON on port ${port} to ${host}${path}`);
78
79
  const requestOptions = getRequestOptions(path);
79
80
  // Submit an issues request for it.
@@ -98,13 +99,13 @@ const requestService = async () => {
98
99
  // Output it.
99
100
  const contentObj = JSON.parse(content);
100
101
  console.log(JSON.stringify(contentObj, null, 2));
101
- const testRecPath = `/api/testRecForm`;
102
+ const testRecGoodPath = `/api/testRecForm`;
102
103
  console.log('======================');
103
104
  console.log(
104
- `About to submit ${scheme} POST request as JSON on port ${port} to ${host}${testRecPath}`
105
+ `About to submit good ${scheme} POST request as JSON on port ${port} to ${host}${testRecGoodPath}`
105
106
  );
106
- const testRecOptions = getRequestOptions(testRecPath, 'POST');
107
- // Submit a test recommendation.
107
+ const testRecOptions = getRequestOptions(testRecGoodPath, 'POST');
108
+ // Submit a good test recommendation.
108
109
  client.request(testRecOptions, response => {
109
110
  // Initialize a collection of data from the response.
110
111
  const chunks = [];
@@ -126,14 +127,59 @@ const requestService = async () => {
126
127
  // Output it.
127
128
  const contentObj = JSON.parse(content);
128
129
  console.log(JSON.stringify(contentObj, null, 2));
130
+ const testRecBadPath = `/api/testRecForm`;
131
+ console.log('======================');
132
+ console.log(
133
+ `About to submit bad ${scheme} POST request as JSON on port ${port} to ${host}${testRecBadPath}`
134
+ );
135
+ const testRecOptions = getRequestOptions(testRecBadPath, 'POST');
136
+ // Submit a bad test recommendation.
137
+ client.request(testRecOptions, response => {
138
+ // Initialize a collection of data from the response.
139
+ const chunks = [];
140
+ response
141
+ // If the response throws an error:
142
+ .on('error', async error => {
143
+ // Report it.
144
+ console.log(error.message);
145
+ })
146
+ // If the response delivers data:
147
+ .on('data', chunk => {
148
+ // Add them to the collection.
149
+ chunks.push(chunk);
150
+ })
151
+ // When the response is completed:
152
+ .on('end', async () => {
153
+ const content = chunks.join('');
154
+ try {
155
+ // Output it.
156
+ const contentObj = JSON.parse(content);
157
+ console.log(JSON.stringify(contentObj, null, 2));
158
+ }
159
+ catch (error) {
160
+ console.log(error.message);
161
+ console.log(
162
+ `Test recommendation response content: ${content || 'No content'}`
163
+ );
164
+ }
165
+ })
166
+ })
167
+ // Finish sending the bad test recommendation request.
168
+ .end(JSON.stringify({
169
+ what: 'Page Wrongly Recommended',
170
+ url: reports[Math.floor(Math.random() * reports.length)]
171
+ ['tested web page']
172
+ .URL,
173
+ why: 'This URL has already been tested'
174
+ }));
129
175
  }
130
176
  catch (error) {
131
177
  console.log(error.message);
132
178
  console.log(`Test recommendation response content: ${content || 'No content'}`);
133
179
  }
134
- });
180
+ })
135
181
  })
136
- // Finish sending the test recommendation request.
182
+ // Finish sending the good test recommendation request.
137
183
  .end(JSON.stringify({
138
184
  what: 'Page Not Already Tested',
139
185
  url: 'https://pagenotalreadytested.info',
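Each request in `researchAgent.js` repeats the same steps on completion: join the chunks, parse the result as JSON, and report the raw content if parsing fails. As an illustration only, that pattern could be factored into one function. `parseResponseBody` is a hypothetical name that does not appear in the package, and the sample chunks are invented.

```javascript
// Hypothetical helper (an assumption for this sketch): combines the chunks
// collected from a response and tries to parse them as JSON.
const parseResponseBody = chunks => {
  const content = chunks.join('');
  try {
    return {ok: true, data: JSON.parse(content)};
  }
  catch (error) {
    // Mirror the script's fallback of reporting the raw content or 'No content'.
    return {ok: false, message: error.message, content: content || 'No content'};
  }
};

// Invented sample chunks, standing in for the 'data' events of a response.
const good = parseResponseBody(['{"available reports"', ': []}']);
const bad = parseResponseBody([]);

console.log(good.ok, good.data['available reports'].length); // true 0
console.log(bad.ok, bad.content); // false No content
```

With a helper of this kind, each `end` handler would only need to branch on `ok`, which might make the nested good-then-bad recommendation requests easier to follow.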
package/targets/api.js CHANGED
@@ -9,8 +9,7 @@ const {
9
9
  getLogs,
10
10
  getNowStamp,
11
11
  getRandomString,
12
- getReportData,
13
- researchAgents
12
+ getReportData
14
13
  } = require('../util');
15
14
 
16
15
  // CONSTANTS
@@ -20,7 +20,7 @@ exports.response = async (what, url, why) => {
20
20
  await updateRecs(what, url, why);
21
21
  // Get a response.
22
22
  const content = {
23
- summary: `This response acknowledges a request made by an agent to the Kilotest service. The agent recommended that Kilotest test the ${what} web page at ${url} for accessibility, usability, and standard-conformity. A Kilotest manager usually approves a recommendation within a day. When the recommendation is approved, the testing will be performed and results will become available. You can check for the availability of the results at ${thisHost}/api/targets. Kilotest performs its testing with the help of Testaro, Testilo, and an ensemble of ten testing tools, using a combination of rule- and machine-learning-based methods. Kilotest exposes several API endpoints for agents and several web UI URLs for humans to obtain information from Kilotest reports. To learn more about Kilotest and the advangages of testing with an ensemble of tools, visit the deployed instance of Kilotest (${process.env.DEPLOYED_KILOTEST_HOST}), which contains an introduction on its home page and a tutorial.`,
23
+ summary: `This response acknowledges a request made by an agent to the Kilotest service. The agent recommended that Kilotest test, for the first time, the ${what} web page at ${url} for accessibility, usability, and standard-conformity. A Kilotest manager usually approves a recommendation within a day. When the recommendation is approved, the testing will be performed and results will become available. You can check for the availability of the results at ${thisHost}/api/targets. Kilotest performs its testing with the help of Testaro, Testilo, and an ensemble of ten testing tools, using a combination of rule- and machine-learning-based methods. Kilotest exposes several API endpoints for agents and several web UI URLs for humans to obtain information from Kilotest reports. To learn more about Kilotest and the advantages of testing with an ensemble of tools, visit the deployed instance of Kilotest (${process.env.DEPLOYED_KILOTEST_HOST}), which contains an introduction on its home page and a tutorial.`,
24
24
  'tool name': 'Kilotest',
25
25
  request: {
26
26
  'type of request': {
@@ -1,18 +0,0 @@
1
- {
2
- "schema_version": "v1",
3
- "name_for_human": "Kilotest",
4
- "name_for_model": "kilotest",
5
- "description_for_human": "Ensemble testing of web pages for accessibility, usability, and standard conformance.",
6
- "description_for_model": "Use Kilotest to retrieve test results about the accessibility, usability, and standard conformance of any web page that has been tested by Kilotest at selectable levels of granularity.",
7
- "auth": {
8
- "type": "none"
9
- },
10
- "api": {
11
- "type": "openapi",
12
- "url": "https://kilotest.com/openapi.yaml",
13
- "is_user_authenticated": false
14
- },
15
- "logo_url": "https://kilotest.com/favicon.ico",
16
- "contact_email": "info@kilotest.com",
17
- "legal_info_url": "https://github.com/jrpool/kilotest/blob/main/LICENSE"
18
- }
File without changes