@effect-ak/tg-bot-api 0.9.3 → 1.3.0

Files changed (3)
  1. package/dist/index.d.ts +3733 -0
  2. package/package.json +16 -8
  3. package/readme.md +260 -4
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@effect-ak/tg-bot-api",
-   "version": "0.9.3",
+   "version": "1.3.0",
    "type": "module",
    "description": "TypeScript types for Telegram Bot Api and Telegram Mini Apps",
    "license": "MIT",
@@ -13,13 +13,9 @@
      "name": "Aleksandr Kondaurov",
      "email": "kondaurov.dev@gmail.com"
    },
-   "scripts": {
-     "build": "tsup",
-     "typecheck": "tsc"
-   },
    "repository": {
      "type": "git",
-     "url": "https://github.com/effect-ak/tg-bot-client",
+     "url": "https://github.com/kondaurovDev/tg-bot-client",
      "directory": "packages/api"
    },
    "bugs": {
@@ -41,5 +37,17 @@
      "dist/*.js",
      "dist/*.cjs",
      "dist/*.d.ts"
-   ]
- }
+   ],
+   "devDependencies": {
+     "node-html-parser": "^7.0.1",
+     "ts-morph": "^27.0.2"
+   },
+   "scripts": {
+     "build": "tsup",
+     "typecheck": "tsc",
+     "gen": "pnpm run gen:bot:api && pnpm run gen:webapp:api",
+     "gen:bot:api": "MODULE_NAME=bot_api tsx ./codegen/main.ts",
+     "gen:webapp:api": "MODULE_NAME=webapp tsx ./codegen/main.ts",
+     "test": "vitest run"
+   }
+ }
package/readme.md CHANGED
@@ -1,17 +1,16 @@
- ![Telegram Bot API](https://img.shields.io/badge/BotApi-9.2-blue?link=)
+ [![NPM Version](https://img.shields.io/npm/v/%40effect-ak%2Ftg-bot-api)](https://www.npmjs.com/package/@effect-ak/tg-bot-api)
+ ![Telegram Bot API](https://img.shields.io/badge/BotApi-9.4-blue?link=)
  ![Telegram WebApp](https://img.shields.io/badge/Telegram.WebApp-9.1-blue?link=)
- [![OpenAPI](https://img.shields.io/badge/OpenAPI-3.1-blue.svg)](https://effect-ak.github.io/telegram-bot-api/)

  ## Highlights:

  - **Complete and Up-to-Date Telegram Bot API**: The entire API is generated from [the official documentation](https://core.telegram.org/bots/api) using a [code generator](./codegen/main.ts), ensuring this client remains in sync and supports every method and type provided by the **Telegram Bot API**.
  - **[Types for Webapps](#webapps-typings)** Types that describe `Telegram.WebApp`. Created by [code generator](./codegen/main.ts) as well.
- - **[ChatBot runner](#chatbot-runner)**: Focus on the logic of your chat bot
  - **Type Mapping**: Types from the documentation are converted to TypeScript types:
    - `Integer` → `number`
    - `True` → `boolean`
    - `String or Number` → `string | number`
-   - Enumerated types, such as `"Type of the chat can be either private”, group”, supergroup or channel"`, are converted to a standard union of literal types `"private" | "group" | "supergroup" | "channel"`
+   - Enumerated types, such as `"Type of the chat can be either "private", "group", "supergroup" or "channel""`, are converted to a standard union of literal types `"private" | "group" | "supergroup" | "channel"`
    - And more...

  ## Webapps typings
  ## Webapps typings
@@ -35,3 +34,260 @@ const saveData = () => {
  })
  }
  ```
+
+ ## Code generation
+
+ Scrapes the official Telegram documentation HTML and generates:
+
+ - **TypeScript types** for Bot API and Mini Apps (`src/specification/`)
+ - **Markdown docs** with method/type reference pages (`docs/`)
+
+ ### Pipeline
+
+ ```
+ core.telegram.org/bots/api core.telegram.org/bots/webapps
+ | |
+ fetch & cache fetch & cache
+ (input/api.html) (input/webapp.html)
+ | |
+ v v
+ DocPage WebAppPage
+ | |
+ ┌──────┴──────┐ |
+ v v v
+ ExtractedType ExtractedMethod ExtractedWebApp
+ | | | |
+ v v v v
+ types.ts api.ts webapp.ts (types)
+ | |
+ └──────┬──────┘
+ v
+ docs/ (markdown)
+ ```
+
+ 1. **Fetch** — `PageProviderService` downloads HTML from `core.telegram.org` and caches it locally in `input/`.
+
+ 2. **Parse** — `DocPage` / `WebAppPage` parse the HTML into a DOM tree (`node-html-parser`). Each entity is located by its `<a class="anchor">` tag.
+
+ 3. **Extract entities** — for every `<h4>` heading on the page the extraction pipeline deterministically converts HTML into typed structures. See [Extraction semantics](#extraction-semantics) below for the full details.
+
+ 4. **Generate TypeScript** — `BotApiCodeWriterService` and `WebAppCodeWriterService` use `ts-morph` to emit `.ts` files with interfaces, type aliases, and method signatures.
+
+ 5. **Generate Markdown** — `MarkdownWriterService` converts the same extracted data into browsable `.md` files with cross-linked types.
+
+ ### Usage
+
+ ```bash
+ # generate everything (Bot API types + docs + Mini Apps types)
+ pnpm gen
+
+ # or individually
+ pnpm gen:bot:api
+ pnpm gen:webapp:api
+ ```
+
+ ### Tests
+
+ ```bash
+ pnpm test
+ ```
+
+ Tests use cached HTML fixtures from `input/` — no network requests during test runs after the first download.
+
+ ## Extraction semantics
+
+ The Telegram Bot API documentation page (`core.telegram.org/bots/api`) is a single long HTML page with a regular, predictable structure. Every type and method is defined as a sequence of sibling HTML elements under its heading. The codegen exploits this regularity to extract everything deterministically, without any heuristic guessing.
+
+ ### Page layout: H3 groups and H4 entities
+
+ ```
+ <h3>Getting updates</h3>                                  ← group name (section header)
+ <h4><a class="anchor" name="getUpdates"/>getUpdates</h4>  ← entity
+ <p>...</p>                                                ← description paragraph(s)
+ <table>...</table>                                        ← field definitions (or <ul> for union types)
+ <h4><a class="anchor" name="update"/>Update</h4>
+ <p>...</p>
+ <table>...</table>
+ ```
+
+ The batch extractor (`entities.ts`) walks all `<h3>` and `<h4>` nodes in document order:
+ - **`<h3>`** sets the current section group (e.g. "Getting updates", "Available types").
+ - **`<h4>`** is an individual entity — a type or a method.
+
+ ### Locating a single entity
+
+ Every `<h4>` heading contains an `<a class="anchor" name="...">` child. This gives a stable, lowercase slug for the entity (e.g. `name="sendmessage"`). To look up a specific entity, `DocPage.getEntity()` does:
+
+ ```ts
+ this.node.querySelector(`a.anchor[name="${name.toLowerCase()}"]`)
+ ```
+
+ then extracts from the anchor's parent `<h4>` node.
+
+ ### Type vs Method — first character case
+
+ There is no HTML attribute that distinguishes a type from a method. Instead, Telegram consistently uses:
+
+ - **Uppercase first letter** → Type definition (e.g. `User`, `Chat`, `Message`)
+ - **Lowercase first letter** → API method (e.g. `sendMessage`, `getUpdates`, `setWebhook`)
+
+ The `isComplexType` / `startsWithUpperCase` helper makes this check. It is the single decision point: uppercase entities are extracted as `ExtractedType`, lowercase as `ExtractedMethod`.
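Editorial sketch (not part of the package): the first-character rule above could look roughly like the following. Only the name `startsWithUpperCase` comes from the README; its body, `EntityKind`, and `classifyEntity` are hypothetical and written for illustration.

```typescript
// Hypothetical result type; the package itself speaks of ExtractedType / ExtractedMethod.
type EntityKind = "type" | "method";

// Assumed behavior: true when the first character is an ASCII uppercase letter.
function startsWithUpperCase(name: string): boolean {
  return /^[A-Z]/.test(name);
}

// Single decision point: uppercase heading means type, anything else means method.
function classifyEntity(name: string): EntityKind {
  return startsWithUpperCase(name) ? "type" : "method";
}
```

Under these assumptions `classifyEntity("User")` yields `"type"` and `classifyEntity("sendMessage")` yields `"method"`.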
135
+
136
+ ### Entity structure: walking siblings from H4
137
+
138
+ Starting from the `<h4>` node, the extractor walks `nextElementSibling` to collect:
139
+
140
+ 1. **`<p>` paragraphs** — entity description, and (for methods) the return type sentence.
141
+ 2. **`<table>`** — field/parameter definitions → produces `EntityFields` (an object type with named properties).
142
+ 3. **`<ul>`** — union type members → produces `NormalType` (a union `A | B | C`).
143
+
144
+ The walk stops when it hits another `<h4>` (next entity), a `<table>`, or a `<ul>`. A safety limit of 10 sibling steps prevents runaway walks on malformed HTML.
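Editorial sketch (not part of the package): a bounded sibling walk with the stop conditions described above. `ElementLike` is a minimal stand-in for a DOM element so the example has no dependencies; `walkFromHeading`, `WalkResult`, and `chain` are hypothetical names.

```typescript
// Minimal stand-in for a DOM element; the real codegen works on parsed HTML nodes.
interface ElementLike {
  tagName: string;
  nextElementSibling: ElementLike | null;
}

interface WalkResult {
  paragraphs: ElementLike[];      // <p> nodes seen before the walk stopped
  definition: ElementLike | null; // the <table> or <ul>, if one was reached
}

const MAX_STEPS = 10; // safety limit mentioned in the README

function walkFromHeading(h4: ElementLike): WalkResult {
  const paragraphs: ElementLike[] = [];
  let current = h4.nextElementSibling;
  for (let step = 0; step < MAX_STEPS && current !== null; step++) {
    const tag = current.tagName.toUpperCase();
    if (tag === "H4") break; // next entity starts here
    if (tag === "TABLE" || tag === "UL") return { paragraphs, definition: current };
    if (tag === "P") paragraphs.push(current);
    current = current.nextElementSibling;
  }
  return { paragraphs, definition: null };
}

// Demo helper: builds a sibling chain from tag names and returns the first node.
function chain(tags: string[]): ElementLike {
  if (tags.length === 0) throw new Error("chain() needs at least one tag");
  let head: ElementLike | null = null;
  for (let i = tags.length - 1; i >= 0; i--) {
    head = { tagName: tags[i], nextElementSibling: head };
  }
  return head as ElementLike;
}
```

A result with `definition: null` corresponds to the case where no `<table>` or `<ul>` was found before the next heading or the step limit.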
145
+
146
+ ### Description and return type extraction
147
+
148
+ Description `<p>` paragraphs are split on `. ` (period + space) or `.<br>` into individual sentences. Each sentence is stripped of HTML tags to get plain text.
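Editorial sketch (not part of the package): one way to split a description paragraph into plain-text sentences as described above. `stripTags` and `splitSentences` are hypothetical names, and dropping a trailing period from each sentence is an assumption made here for tidy output.

```typescript
// Removes anything that looks like an HTML tag and trims whitespace.
function stripTags(html: string): string {
  return html.replace(/<[^>]*>/g, "").trim();
}

// Splits on ". " or ".<br>" and returns non-empty plain-text sentences.
function splitSentences(paragraphHtml: string): string[] {
  return paragraphHtml
    .split(/\. |\.<br\s*\/?>/)
    .map((part) => stripTags(part).replace(/\.$/, "")) // assumption: drop a trailing period
    .filter((sentence) => sentence.length > 0);
}

const exampleSentences = splitSentences(
  'Use this method to send text messages. On success, the sent <a href="#message">Message</a> is returned.'
);
// exampleSentences holds two entries: the description and the return sentence.
```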
+
+ For **methods**, return type sentences are identified by three patterns:
+
+ | Pattern | Example |
+ |---------|---------|
+ | Starts with `"On success"` | "On success, a Message object is returned" |
+ | Starts with `"Returns "` | "Returns an Array of Update objects" |
+ | Ends with `"is returned"` | "The sent Message is returned" |
+
+ Type names are extracted from `<a>` and `<em>` tags inside these sentences using the regex `/\w+(?=<\/(a|em)>)/g` — this matches the word just before a closing `</a>` or `</em>` tag. Only uppercase-initial names pass (primitives like "true" are skipped).
+
+ Array return types are detected by checking for `"an array of <TypeName>"` (case-insensitive) in the plain text.
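Editorial sketch (not part of the package): the three sentence patterns, the closing-tag regex, and the array check combined into one function. `ReturnInfo`, `isReturnSentence`, and `extractReturnInfo` are hypothetical names; ignoring a trailing period before the `"is returned"` check is an assumption.

```typescript
interface ReturnInfo {
  typeNames: string[]; // uppercase-initial names found before </a> or </em>
  isArray: boolean;    // true when the text says "an array of <TypeName>"
}

function isReturnSentence(plain: string): boolean {
  const text = plain.trim().replace(/\.$/, ""); // assumption: ignore a trailing period
  return (
    text.startsWith("On success") ||
    text.startsWith("Returns ") ||
    text.endsWith("is returned")
  );
}

function extractReturnInfo(sentenceHtml: string): ReturnInfo | null {
  const plain = sentenceHtml.replace(/<[^>]*>/g, "");
  if (!isReturnSentence(plain)) return null;
  // Word immediately before a closing </a> or </em> tag.
  const candidates = sentenceHtml.match(/\w+(?=<\/(a|em)>)/g) || [];
  const typeNames = candidates.filter((name) => /^[A-Z]/.test(name));
  const isArray = /an array of \w+/i.test(plain);
  return { typeNames, isArray };
}

const updatesInfo = extractReturnInfo(
  'Returns an Array of <a href="#update">Update</a> objects.'
);
// updatesInfo describes an array return of the Update type.
```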
+
+ ### Table parsing: 3-column vs 4-column
+
+ Telegram uses two table layouts:
+
+ **4-column tables** (method parameters):
+
+ | Field | Type | Required | Description |
+ |-------|------|----------|-------------|
+ | chat_id | Integer or String | Yes | Unique identifier... |
+ | text | String | Yes | Text of the message... |
+ | parse_mode | String | Optional | Mode for parsing... |
+
+ Columns: `name` (0), `type` (1), `required` (2), `description` (3).
+ Column 2 is either `"Yes"` or `"Optional"`.
+
+ **3-column tables** (type fields):
+
+ | Field | Type | Description |
+ |-------|------|-------------|
+ | message_id | Integer | Unique message identifier... |
+ | from | User | Optional. Sender of the message... |
+
+ Columns: `name` (0), `type` (1), `description` (2).
+ There is no explicit "Required" column — instead, optional fields have their description start with `"Optional."`. The extractor checks:
+
+ ```ts
+ if (all.length == 3) {
+   const isOptional = description[0].includes("Optional")
+   required = !isOptional
+ }
+ ```
+
+ Required fields are sorted before optional ones in the output.
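Editorial sketch (not part of the package): handling both table layouts on plain cell strings, then ordering required fields first. `FieldDef`, `parseRow`, and `sortRequiredFirst` are hypothetical names, and the 3-column check uses `startsWith("Optional")` on the whole description, which is a simplification of the snippet quoted above.

```typescript
interface FieldDef {
  name: string;
  type: string;
  required: boolean;
  description: string;
}

// Converts one table row (already reduced to cell text) into a field definition.
function parseRow(cells: string[]): FieldDef {
  if (cells.length === 4) {
    const [name, type, requiredCell, description] = cells;
    return { name, type, required: requiredCell.trim() === "Yes", description };
  }
  if (cells.length === 3) {
    const [name, type, description] = cells;
    // Assumption: optional fields begin their description with "Optional".
    return { name, type, required: !description.trim().startsWith("Optional"), description };
  }
  throw new Error(`Unexpected column count: ${cells.length}`);
}

// Stable ordering: required fields first, original order otherwise preserved.
function sortRequiredFirst(fields: FieldDef[]): FieldDef[] {
  return fields.slice().sort((a, b) => Number(b.required) - Number(a.required));
}

const sendMessageFields = sortRequiredFirst([
  parseRow(["chat_id", "Integer or String", "Yes", "Unique identifier"]),
  parseRow(["parse_mode", "String", "Optional", "Mode for parsing"]),
  parseRow(["text", "String", "Yes", "Text of the message"]),
]);
// Order of names after sorting: chat_id, text, parse_mode
```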
195
+
196
+ ### Union types from `<ul>` lists
197
+
198
+ Some types are defined not by a table but by a bulleted list:
199
+
200
+ ```html
201
+ <h4>ChatMemberStatus</h4>
202
+ <p>This object represents ...</p>
203
+ <ul>
204
+ <li>ChatMemberOwner</li>
205
+ <li>ChatMemberAdministrator</li>
206
+ <li>ChatMemberMember</li>
207
+ ...
208
+ </ul>
209
+ ```
210
+
211
+ Each `<li>` text becomes one arm of the TypeScript union:
212
+
213
+ ```ts
214
+ type ChatMemberStatus = ChatMemberOwner | ChatMemberAdministrator | ChatMemberMember | ...
215
+ ```
216
+
217
+ ### Pseudo-type mapping
218
+
219
+ Telegram uses its own type names (pseudo-types). The mapping is straightforward:
220
+
221
+ | Telegram pseudo-type | TypeScript |
222
+ |---------------------|------------|
223
+ | `String` | `string` |
224
+ | `Integer`, `Int` | `number` |
225
+ | `Float` | `number` |
226
+ | `Boolean`, `True`, `False` | `boolean` |
227
+ | `Array of X` | `X[]` |
228
+ | `X or Y` | `X \| Y` |
229
+ | `InputFile` | `{ file_content: Uint8Array, file_name: string }` |
230
+
231
+ Nested arrays (`Array of Array of PhotoSize`) are handled by counting `Array of` occurrences and appending the matching number of `[]` brackets.
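Editorial sketch (not part of the package): the mapping table and the `Array of` counting rule as a small string-to-string converter. `PSEUDO_TO_TS`, `mapName`, and `toTypeScript` are hypothetical names; wrapping a union in parentheses before appending `[]` is an assumption about how mixed cases would be printed.

```typescript
const PSEUDO_TO_TS: Record<string, string> = {
  String: "string",
  Integer: "number",
  Int: "number",
  Float: "number",
  Boolean: "boolean",
  True: "boolean",
  False: "boolean",
  InputFile: "{ file_content: Uint8Array, file_name: string }",
};

// Maps a single pseudo-type name; unknown names (e.g. Message) pass through unchanged.
function mapName(name: string): string {
  return Object.prototype.hasOwnProperty.call(PSEUDO_TO_TS, name) ? PSEUDO_TO_TS[name] : name;
}

function toTypeScript(pseudoType: string): string {
  // Count "Array of " prefixes, then strip them to get the element type.
  const depth = (pseudoType.match(/Array of /g) || []).length;
  const inner = pseudoType.replace(/Array of /g, "").trim();
  const members = inner.split(" or ").map((part) => mapName(part.trim()));
  const union = members.join(" | ");
  if (depth === 0) return union;
  const base = members.length > 1 ? `(${union})` : union; // assumption: parenthesize unions
  return base + "[]".repeat(depth);
}
```

With these definitions, `toTypeScript("Array of Array of PhotoSize")` produces `PhotoSize[][]` and `toTypeScript("Integer or String")` produces `number | string`.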
+
+ ### Enum extraction from descriptions
+
+ Some `String` fields actually represent a finite set of values. The descriptions contain natural-language hints:
+
+ > Type of the emoji, currently one of "dice", "darts", "bowling", "basketball", "football", "slot_machine"
+
+ The extractor looks for **indicator keywords**: `"must be"`, `"always"`, `"one of"`, `"can be"`. When found, it extracts quoted values using a regex that handles both straight quotes (`"`) and Unicode curly quotes (`\u201C`/`\u201D`), which appear in different parts of the Telegram docs.
+
+ These are emitted as string literal unions:
+
+ ```ts
+ emoji: "dice" | "darts" | "bowling" | "basketball" | "football" | "slot_machine"
+ ```
+
+ The `parse_mode` field is a special case — it is always overridden to `"HTML" | "MarkdownV2"` regardless of the description text.
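Editorial sketch (not part of the package): keyword-gated extraction of quoted values, including curly quotes and the `parse_mode` special case. `INDICATORS`, `extractEnum`, and `toUnion` are hypothetical names, and the regex is one possible pattern, not necessarily the one the codegen uses.

```typescript
const INDICATORS = ["must be", "always", "one of", "can be"];

// Returns the literal values for a field, or null when the description has no enum hint.
function extractEnum(fieldName: string, description: string): string[] | null {
  if (fieldName === "parse_mode") return ["HTML", "MarkdownV2"]; // documented special case
  const lower = description.toLowerCase();
  if (!INDICATORS.some((keyword) => lower.includes(keyword))) return null;

  // Straight quote or left curly quote, then content, then straight or right curly quote.
  const quoted = /["\u201C]([^"\u201C\u201D]+)["\u201D]/g;
  const values: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = quoted.exec(description)) !== null) {
    if (values.indexOf(match[1]) === -1) values.push(match[1]);
  }
  return values.length > 0 ? values : null;
}

// Renders values as a TypeScript string-literal union.
function toUnion(values: string[]): string {
  return values.map((value) => JSON.stringify(value)).join(" | ");
}

const emojiValues = extractEnum(
  "emoji",
  'Type of the emoji, currently one of "dice", "darts", "bowling", "basketball", "football", "slot_machine"'
);
```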
+
+ ### The `Function` pseudo-type (Mini Apps)
+
+ The Mini Apps documentation uses `Function` as a pseudo-type for callable fields. When the table says:
+
+ | Field | Type | Description |
+ |-------|------|-------------|
+ | sendData() | Function | ... |
+ | isVersionAtLeast() | Function | ... |
+
+ The extractor detects `pseudoType == "Function"` and converts:
+ - Fields ending with `()` → function signatures (return type derived from the entity name)
+ - The field name is trimmed to remove the `()` suffix
+
+ Since the documentation doesn't specify parameter types for these functions, most webapp methods have **manual type overrides** in `typeOverrides` (see `type-system.ts`).
+
+ ### Manual overrides
+
+ Some types cannot be inferred from the HTML structure alone. Three override mechanisms handle these:
+
+ 1. **`typeOverrides`** — per-entity, per-field type replacements. Used extensively for Mini Apps methods (callbacks, complex parameter types) and a few Bot API fields like `allowed_updates` and `sendChatAction.action`.
+
+ 2. **`typeAliasOverrides`** — for types that have no `<table>` or `<ul>` (e.g. `InputFile`). When `findTypeNode` hits the next `<h4>` without finding a type definition, the override provides the TypeScript type.
+
+ 3. **`returnTypeOverrides`** — for methods whose return type sentence is ambiguous or missing (e.g. `sendMediaGroup` returns `Message[]`).
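Editorial sketch (not part of the package): how a per-entity, per-field override table might be consulted before falling back to the inferred type. The README names `typeOverrides` and `returnTypeOverrides` but does not show their shape, so the nested-record layout, the sample entries, and both resolver functions below are assumptions.

```typescript
// Hypothetical shape: entity name -> field name -> TypeScript type text.
type FieldOverrides = Record<string, Record<string, string>>;

// Illustrative entries only; the real tables live in the package's codegen sources.
const sampleTypeOverrides: FieldOverrides = {
  sendChatAction: { action: '"typing" | "upload_photo"' },
};
const sampleReturnTypeOverrides: Record<string, string> = {
  sendMediaGroup: "Message[]",
};

// Prefer the override for (entity, field); otherwise keep what extraction inferred.
function resolveFieldType(
  entity: string,
  field: string,
  inferred: string,
  overrides: FieldOverrides
): string {
  const perEntity = overrides[entity];
  const override = perEntity !== undefined ? perEntity[field] : undefined;
  return override !== undefined ? override : inferred;
}

// Same idea for return types, keyed by method name only.
function resolveReturnType(
  method: string,
  inferred: string,
  overrides: Record<string, string>
): string {
  const override = overrides[method];
  return override !== undefined ? override : inferred;
}
```

Keeping overrides as data that wraps the inferred result means the extraction itself stays deterministic, and every manual exception is visible in one place.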
+
+ ## Module map
+
+ ```
+ codegen/
+ ├── main.ts               entry point, reads MODULE_NAME env var
+ ├── runtime.ts            Effect runtimes (DI wiring)
+ ├── types.ts              shared type aliases (HtmlElement, TsSourceFile)
+
+ ├── scrape/
+ │   ├── type-system.ts    type representation & conversion (NormalType, EntityFields)
+ │   ├── entity.ts         single-entity extraction from HTML nodes
+ │   ├── page.ts           DocPage & WebAppPage — HTML page models
+ │   └── entities.ts       batch extraction — walks full page, collects all types & methods
+
+ └── service/
+     ├── index.ts          barrel re-exports
+     ├── page-provider.ts  fetches & caches documentation HTML
+     ├── code-writers.ts   ts-morph TypeScript code generation
+     └── markdown.ts       markdown docs generation
+ ```