monadic-chat 0.1.1
- checksums.yaml +7 -0
- data/.rspec +3 -0
- data/CHANGELOG.md +9 -0
- data/Gemfile +4 -0
- data/Gemfile.lock +172 -0
- data/LICENSE.txt +21 -0
- data/README.md +652 -0
- data/Rakefile +12 -0
- data/apps/chat/chat.json +4 -0
- data/apps/chat/chat.md +42 -0
- data/apps/chat/chat.rb +79 -0
- data/apps/code/code.json +4 -0
- data/apps/code/code.md +42 -0
- data/apps/code/code.rb +77 -0
- data/apps/novel/novel.json +4 -0
- data/apps/novel/novel.md +36 -0
- data/apps/novel/novel.rb +77 -0
- data/apps/translate/translate.json +4 -0
- data/apps/translate/translate.md +37 -0
- data/apps/translate/translate.rb +81 -0
- data/assets/github.css +1036 -0
- data/assets/pigments-default.css +69 -0
- data/bin/monadic-chat +122 -0
- data/doc/img/code-example-time-html.png +0 -0
- data/doc/img/code-example-time.png +0 -0
- data/doc/img/example-translation.png +0 -0
- data/doc/img/how-research-mode-works.svg +1 -0
- data/doc/img/input-acess-token.png +0 -0
- data/doc/img/langacker-2001.svg +41 -0
- data/doc/img/linguistic-html.png +0 -0
- data/doc/img/monadic-chat-main-menu.png +0 -0
- data/doc/img/monadic-chat.svg +13 -0
- data/doc/img/readme-example-beatles-html.png +0 -0
- data/doc/img/readme-example-beatles.png +0 -0
- data/doc/img/research-mode-template.svg +198 -0
- data/doc/img/select-app-menu.png +0 -0
- data/doc/img/select-feature-menu.png +0 -0
- data/doc/img/state-monad.svg +154 -0
- data/doc/img/syntree-sample.png +0 -0
- data/lib/monadic_app.rb +115 -0
- data/lib/monadic_chat/console.rb +29 -0
- data/lib/monadic_chat/formatting.rb +110 -0
- data/lib/monadic_chat/helper.rb +72 -0
- data/lib/monadic_chat/interaction.rb +41 -0
- data/lib/monadic_chat/internals.rb +269 -0
- data/lib/monadic_chat/menu.rb +189 -0
- data/lib/monadic_chat/open_ai.rb +150 -0
- data/lib/monadic_chat/parameters.rb +109 -0
- data/lib/monadic_chat/version.rb +5 -0
- data/lib/monadic_chat.rb +190 -0
- data/monadic_chat.gemspec +54 -0
- data/samples/linguistic/linguistic.json +17 -0
- data/samples/linguistic/linguistic.md +39 -0
- data/samples/linguistic/linguistic.rb +74 -0
- metadata +343 -0
data/README.md
ADDED
@@ -0,0 +1,652 @@
<p align="center"><img src="./doc/img/monadic-chat.svg" width="500px"/></p>

<p align="center"><b>Highly configurable CLI client app for OpenAI chat/text-completion API</b></p>

<p align="center">
<img src="https://user-images.githubusercontent.com/18207/224493072-9720b341-c70d-43b9-b996-ba7e9a7a6806.gif" width="900" />
</p>

> **Note**
> This software is *under active development*, and the latest version may behave slightly differently from what this documentation describes. The specifications may change in the future.

## Table of Contents

<!-- vim-markdown-toc GFM -->

* [Introduction](#introduction)
* [Dependencies](#dependencies)
* [Installation](#installation)
* [Using RubyGems](#using-rubygems)
* [Clone the GitHub Repository](#clone-the-github-repository)
* [Usage](#usage)
* [Authentication](#authentication)
* [Select Main Menu Item](#select-main-menu-item)
* [Roles](#roles)
* [System-Wide Functions](#system-wide-functions)
* [Apps](#apps)
* [Chat](#chat)
* [Code](#code)
* [Novel](#novel)
* [Translate](#translate)
* [Modes](#modes)
* [Normal Mode](#normal-mode)
* [Research Mode](#research-mode)
* [What is Research Mode?](#what-is-research-mode)
* [How Research Mode Works](#how-research-mode-works)
* [Accumulator](#accumulator)
* [Reducer](#reducer)
* [Creating New App](#creating-new-app)
* [Folder/File Structure](#folderfile-structure)
* [Reducer Code](#reducer-code)
* [Template for `Normal` Mode](#template-for-normal-mode)
* [Template for `Research` Mode](#template-for-research-mode)
* [What is Monadic about Monadic Chat?](#what-is-monadic-about-monadic-chat)
* [Unit, Map, and Join](#unit-map-and-join)
* [Discourse Management Object](#discourse-management-object)
* [Future Plans](#future-plans)
* [Bibliographical Data](#bibliographical-data)
* [Acknowledgments](#acknowledgments)
* [Contributing](#contributing)
* [Author](#author)
* [License](#license)

<!-- vim-markdown-toc -->

## Introduction

**Monadic Chat** is a command-line client application that uses OpenAI's Text Completion API and Chat API to enable ChatGPT-style conversations with OpenAI's artificial intelligence system.

The conversation with the AI can be saved in a JSON file, and the saved JSON file can be loaded later to continue the conversation. The conversation data can also be converted to HTML and displayed in a web browser.

Monadic Chat comes with four apps (`Chat`, `Code`, `Novel`, and `Translate`). Each can generate a different kind of text through interactive conversation between the user and OpenAI's large-scale language model. Users can also create new apps.

## Dependencies

- Ruby 2.6.10 or greater
- OpenAI API Token
- A command line terminal app such as:
    - Terminal or [iTerm2](https://iterm2.com/) (macOS)
    - [Windows Terminal](https://apps.microsoft.com/store/detail/windows-terminal) (Windows 11)
    - GNOME Terminal (Linux)
    - [Alacritty](https://alacritty.org/) (Multi-platform)

## Installation

### Using RubyGems

Execute the following command in an environment where Ruby 2.6 or higher is installed.

```text
gem install monadic-chat
```

### Clone the GitHub Repository

Alternatively, clone the code from the GitHub repository and follow the steps below. At this time, you must take this option to create a new app for Monadic Chat.

1. Clone the repo

```text
git clone https://github.com/yohasebe/monadic-chat.git
```

2. Install dependencies

```text
cd monadic-chat
bundle update
```

3. Grant permission to the executable

```text
chmod +x ./bin/monadic-chat
```

4. Run the executable

```text
./bin/monadic-chat
```

## Usage

### Authentication

When you start Monadic Chat with the `monadic-chat` command for the first time, you will be asked for an OpenAI access token. If you do not have one, create an account on the [OpenAI](https://platform.openai.com/) website and obtain an access token.

If the environment variable `OPENAI_API_KEY` is set in the system, its value will be used automatically.

<br />

<kbd><img src="./doc/img/input-acess-token.png" width="700px" style="border: thin solid darkgray;"/></kbd>

<br />

Once the correct access token is verified, the access token is saved in the configuration file below and will automatically be used the next time the app is started.

`$HOME/monadic_chat.conf`
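
The lookup order described above can be sketched roughly as follows. This is only an illustration, not the app's actual code: the helper name `resolve_access_token` is hypothetical, and the sketch assumes the configuration file is JSON with an `access_token` key, which this document does not specify.

```ruby
require "json"

# Path named in the text above.
CONFIG_PATH = File.join(Dir.home, "monadic_chat.conf")

# Hypothetical helper: returns the token, or nil when none can be found.
# Assumption: the config file is JSON with an "access_token" key.
def resolve_access_token(env: ENV, config_path: CONFIG_PATH)
  # 1. Use the environment variable when it is set and non-empty.
  token = env["OPENAI_API_KEY"]
  return token unless token.nil? || token.empty?

  # 2. Otherwise fall back to a previously saved configuration file.
  return nil unless File.exist?(config_path)

  JSON.parse(File.read(config_path))["access_token"]
rescue JSON::ParserError
  nil # A malformed file is treated the same as a missing one.
end
```

When the helper returns `nil`, a client would prompt for the token interactively, as in the screenshot above.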

### Select Main Menu Item

Upon successful authentication, a menu to select a specific app will appear. Each app generates different types of text through an interactive chat-style conversation between the user and the AI. Four apps are available by default: [`chat`](#chat), [`code`](#code), [`novel`](#novel), and [`translate`](#translate).

Selecting the `mode` menu item allows you to change the [modes](#modes) from `normal` to `research` and vice versa.

Selecting `readme` will take you to the README on the GitHub repository (the document you are looking at now). Selecting `quit` will exit Monadic Chat.

<br />

<kbd><img src="./doc/img/select-app-menu.png" width="700px" style="border: thin solid darkgray;"/></kbd>

<br />

In the main menu, you can use the cursor keys and the enter key to make a selection. You can also narrow down the choices each time you type a letter.

### Roles

Each message in the conversation is labeled with one of three roles: `User`, `GPT`, or `System`.

- `User`: messages from the user of the Monadic Chat app (that's you!)
- `GPT`: messages from the OpenAI large-scale language model
- `System`: messages from the Monadic Chat system

### System-Wide Functions

You can call up the function menu anytime. To invoke the function menu, type `help` or `menu`.

<br />

<kbd><img src="./doc/img/select-feature-menu.png" width="700px" style="border: thin solid darkgray;"/></kbd>

<br />

In the function menu, you can use the cursor keys and the enter key to make a selection. You can also narrow down the choices each time you type a letter. Some functions are given multiple names, so typing on the keyboard quickly locates the necessary function.

**params/settings/config**

You can set parameters to be sent to OpenAI's APIs. The items that can be set are listed below.

- `model`
- `max_tokens`
- `temperature`
- `top_p`
- `frequency_penalty`
- `presence_penalty`

For detailed information on each parameter, please refer to OpenAI's [API Documentation](https://platform.openai.com/docs/). The default value of each parameter depends on the individual "mode" and "app."
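
As a rough illustration, the six parameters can be pictured as a plain Ruby hash. The values below are placeholders rather than the defaults of any particular app, and `valid_params?` is a hypothetical check based on the value ranges given in OpenAI's API reference.

```ruby
# Illustrative parameter set; the values are placeholders, not the app defaults.
EXAMPLE_PARAMS = {
  "model" => "gpt-3.5-turbo",
  "max_tokens" => 1000,
  "temperature" => 0.3,
  "top_p" => 1.0,
  "frequency_penalty" => 0.0,
  "presence_penalty" => 0.0
}.freeze

# Hypothetical sanity check using the ranges documented by OpenAI.
def valid_params?(params)
  params["max_tokens"].positive? &&
    params["temperature"].between?(0.0, 2.0) &&
    params["top_p"].between?(0.0, 1.0) &&
    params["frequency_penalty"].between?(-2.0, 2.0) &&
    params["presence_penalty"].between?(-2.0, 2.0)
end
```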

**data/context**

In `normal` mode, this function only displays the conversation history between User and GPT. In `research` mode, metadata values (e.g., topics, language being used, number of turns) are presented.

In `research` mode, it may take a while (usually several seconds) after the `data/context` command is executed before the data is displayed. This is because in `research` mode, even after displaying a direct response to user input, there may be a process running in the background that retrieves the context data and reconstructs it.

**html**

All the information retrievable by running the `data/context` function can be presented in HTML. The HTML file is automatically opened in the default web browser.

<br />

<kbd><img src="./doc/img/linguistic-html.png" width="700px" style="border: thin solid darkgray;"/></kbd>

<br />

The generated HTML is saved in the user's home directory (`$HOME`) with the file name `monadic_chat.html`. The file contents are not updated automatically; run the `html` command whenever you need an up-to-date version. HTML data is written to this file regardless of the app.

In `research` mode, it may take several seconds to several minutes after the `html` command is executed before the actual HTML is displayed. This is because in `research` mode, even after displaying a direct response to user input, there may be a process running in the background that retrieves and reconstructs the context data, requiring the system to wait for it to finish.

**reset**

You can reset all the conversation history (messages by both User and GPT). Note that API parameter settings will be reset to default as well.

**save and load**

The conversation history (messages by both User and GPT, and metadata in `research` mode) can be saved as a JSON file in a specified path. Note that a file saved in `research` mode can only be loaded by the same app that saved it.

**clear/clean**

Selecting this scrolls and clears the screen so that the cursor is at the top.

**readme/documentation**

The README page on the GitHub repository (the document you are looking at now) will be opened.

**exit/bye/quit**

Selecting this will exit the current app and return to the main menu.

## Apps

### Chat

Monadic Chat's `chat` app is the most basic and generic of the apps offered by default.

<br />

<kbd><img src="./doc/img/readme-example-beatles.png" width="700px" /></kbd>

<kbd><img src="./doc/img/readme-example-beatles-html.png" width="700px" /></kbd>

<br />

In the `chat` app, OpenAI's large-scale language model acts as a competent assistant that can do anything. It can write computer code, create fiction and poetry texts, and translate texts from one language into another. Of course, it can also engage in casual or academic discussions on specific topics. As with ChatGPT, there can be many variations in the content of the conversation.

- [`normal` mode template for `chat` app in JSON](https://github.com/yohasebe/monadic-chat/blob/main/apps/chat/chat.json)
- [`research` mode template for `chat` app in Markdown](https://github.com/yohasebe/monadic-chat/blob/main/apps/chat/chat.md)

### Code

Monadic Chat's `code` is designed to be an app that can write computer code for you.

<br />

<kbd><img src="./doc/img/code-example-time.png" width="700px" /></kbd>

<kbd><img src="./doc/img/code-example-time-html.png" width="700px" /></kbd>

<br />

In the `code` app, OpenAI's GPT behaves as a competent software engineer. The main difference from the `chat` app is that the `temperature` parameter is set to `0.0` so that as little randomness as possible is introduced to the responses. Syntax highlighting is applied (where possible) to the program code in the result message. The same applies to the output via the `html` command available from the functions menu.

- [`normal` mode template for `code` app in JSON](https://github.com/yohasebe/monadic-chat/blob/main/apps/code/code.json)
- [`research` mode template for `code` app in Markdown](https://github.com/yohasebe/monadic-chat/blob/main/apps/code/code.md)

### Novel

Monadic Chat's `novel` is designed to help you develop novel plots; the app instructs OpenAI's GPT model to write text based on a topic, theme, or brief description of an event indicated in the user prompt. Each new response is based on what was generated in previous responses. The interactive nature of the app allows the user to control the plot development rather than having an AI agent create a new novel all at once.

- [`normal` mode template for `novel` app in JSON](https://github.com/yohasebe/monadic-chat/blob/main/apps/novel/novel.json)
- [`research` mode template for `novel` app in Markdown](https://github.com/yohasebe/monadic-chat/blob/main/apps/novel/novel.md)

### Translate

Monadic Chat's `translate` is an app that helps translate text written in one language into another. Rather than translating the entire text simultaneously, the app allows users to work sentence by sentence or paragraph by paragraph.

To specify a preferred translation for an expression, enclose the original expression in a pair of brackets [ ] in the source text and put the preferred translation in a pair of parentheses ( ) right after it.
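
To make the notation concrete, here is a small sketch that extracts such annotations from a source text. In the app itself the annotated text is simply passed to the model; the regular expression and the helper names `preferred_translations` and `strip_annotations` are illustrative assumptions.

```ruby
# Matches "[original expression](preferred translation)" pairs.
ANNOTATION = /\[([^\]]+)\]\(([^)]+)\)/.freeze

# Returns a hash mapping each original expression to its preferred translation.
def preferred_translations(source)
  source.scan(ANNOTATION).to_h
end

# Returns the source text with the annotations removed (original wording kept).
def strip_annotations(source)
  source.gsub(ANNOTATION) { Regexp.last_match(1) }
end

text = "Her work is [state of the art](estado del arte)."
pairs = preferred_translations(text) # {"state of the art" => "estado del arte"}
plain = strip_annotations(text)      # "Her work is state of the art."
```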

<br />

<kbd><img src="./doc/img/example-translation.png" width="700px" /></kbd>

<br />

Sometimes, however, problematic translations are created. The user can "save" the set of source and target texts and make any necessary corrections. The same unwanted expressions can be prevented or avoided later by providing the corrected translation data to the app.

- [`normal` mode template for `translate` app in JSON](https://github.com/yohasebe/monadic-chat/blob/main/apps/translate/translate.json)
- [`research` mode template for `translate` app in Markdown](https://github.com/yohasebe/monadic-chat/blob/main/apps/translate/translate.md)

## Modes

Monadic Chat has two modes. The `normal` mode utilizes OpenAI's chat API to achieve ChatGPT-like functionality. It is suitable for using a large language model as a competent companion for various pragmatic purposes. On the other hand, the `research` mode utilizes OpenAI's text-completion API. This mode allows for acquiring metadata in the background while receiving the primary response at each conversation turn. It may be especially useful for researchers exploring the possibilities of large-scale language models and their applications.

### Normal Mode

The default language model for `normal` mode is `gpt-3.5-turbo`.

In the default configuration, the dialogue messages are reduced after ten turns by deleting the oldest ones (but not the messages that the `system` role gave as instructions).
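
The trimming rule above might look like the following minimal sketch. It is not the app's own reducer: the function name `trim_messages` is hypothetical, a "turn" is assumed to be one user message plus one assistant reply, and `system` messages are assumed to sit at the head of the list.

```ruby
MAX_TURNS = 10 # the default number of turns mentioned above

# Keep every "system" message and only the newest exchanges.
def trim_messages(messages, max_turns: MAX_TURNS)
  system_msgs, dialogue = messages.partition { |m| m["role"] == "system" }
  # One turn is counted as two messages: user input plus assistant reply.
  system_msgs + dialogue.last(max_turns * 2)
end
```

With twelve completed turns, the two oldest exchanges are dropped and the `system` instruction stays in place.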

### Research Mode

The default language model for `research` mode is `text-davinci-003`.

Although the text-completion API is not a system optimized for chat-style dialogue, it can be used to realize a dialogue system with a mechanism that keeps track of the conversation history in a monadic structure. Such a mechanism also has the advantage of retrieving metadata at each dialogue turn.

By default, when the number of tokens in the response from the GPT (which increases with each iteration because of the conversation history) reaches a specific value, the oldest message is deleted.

If you wish to specify how the conversation history is handled as the interaction with the GPT model unfolds, you can write a `Proc` object containing Ruby code. Since various metadata are available in this mode, finer-grained control is possible.
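
A token-based rule of this kind could be written as a `Proc` along the following lines. This is a sketch under assumptions: the threshold value is made up, and the way the app invokes the `Proc` is not shown in this section, so the calling convention here (parsed JSON object in, pruned object out) is hypothetical.

```ruby
MAX_TOKENS = 2000 # hypothetical threshold

# Hypothetical reducer: takes the parsed JSON object and returns a pruned copy.
TOKEN_REDUCER = proc do |obj|
  pruned = obj.dup
  if obj["tokens"].to_i > MAX_TOKENS && obj["messages"].size > 1
    # Drop the oldest exchange; the original object is left untouched.
    pruned["messages"] = obj["messages"].drop(1)
  end
  pruned
end
```

Because the `Proc` receives the whole object, it could equally inspect other metadata properties before deciding what to delete.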

> **Warning**
> The `research` mode is not intended for general use but for research purposes. You may not get the expected results depending on the template design, parameter settings, and reducer settings. Adjustments in such cases require technical knowledge of OpenAI's text completion API or Ruby.

## What is Research Mode?

Monadic Chat's `research` mode has the following advantages:

- In `research` mode, each turn of the conversation can capture **metadata** as well as the **primary responses**
- You can define the **accumulator** and **reducer** mechanism and control the **flow** of the conversation
- It has structural features that mimic the **monadic** nature of natural language discourse

There are some drawbacks, however:

- It uses OpenAI's `text-davinci-003` model. The response text from this model is less detailed than in the `normal` mode that uses `gpt-3.5-turbo`.
- After displaying a response message from GPT, contextual information is processed in the background, which can cause lag when displaying conversation history in the command line screen or HTML output.
- Templates for `research` mode are larger and more complex, requiring more effort to create and fine-tune.
- `Research` mode requires more extensive input/output data and consumes more tokens than `normal` mode.
- The text-completion API used in `research` mode is more expensive than the chat API used in `normal` mode.

For these reasons, `normal` mode is recommended for casual use as an alternative CLI to ChatGPT. Nevertheless, as described below, the research mode makes Monadic Chat distinctly different from other GPT client applications.

### How Research Mode Works

The following is a schematic of the process flow in the `research` mode.

<br />

<img src="./doc/img/how-research-mode-works.svg" width="900px"/>

<br />

The terms in bold in the figure may require more explanation.

- **Input** is a string entered by the user on the command line. The input is filled in the `{{PROMPT}}` placeholder in the template and is sent to the API.
- The **template** contains conversation data in JSON format and instructions on how the text-completion API should update this data. More details are given in the [Creating New App](#creating-new-app) section below.
- The term **prompt** can be used in two ways: in one sense, it means text input from the user. In the figure above, however, "prompt" refers to the contents of the template as a whole, which is sent to the API.
- The response to the user's input is referred to as **output**. Both input and output are contained in the returned JSON object, which is structured according to the instructions specified in the template.
- The JSON object contains a list of the conversation history, referred to as the **accum** (accumulator) in the figure. Each turn of the conversation increases the messages stored in the accumulator.
- A Monadic Chat app must define a **reducer** to prevent the accumulator from growing excessively.
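
The flow in the list above can be condensed into a single-turn sketch. Everything here is illustrative: `research_turn` is a hypothetical name, `complete` stands in for the text-completion request (a stub is used so the example runs offline), and the real app also writes the updated JSON object back into the template for the next turn, which is omitted.

```ruby
require "json"

# One conversation turn: input -> prompt -> JSON object -> output and reduced state.
def research_turn(template, input, complete:, reducer:)
  prompt = template.gsub("{{PROMPT}}") { input } # fill in the user input
  state = JSON.parse(complete.call(prompt))      # JSON object returned by the model
  output = state["response"]                     # primary response shown to the user
  [output, reducer.call(state)]                  # reducer keeps the accumulator small
end

# Stub standing in for the API call; it returns a fixed JSON object.
fake_api = lambda do |_prompt|
  JSON.generate({
    "prompt" => "Can I ask something?",
    "response" => "Sure!",
    "messages" => [{ "user" => "Can I ask something?", "assistant" => "Sure!" }]
  })
end

output, state = research_turn("NEW PROMPT: {{PROMPT}}", "Can I ask something?",
                              complete: fake_api, reducer: ->(s) { s })
```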

### Accumulator

`Normal` mode uses OpenAI's chat API, where the following basic structure is used for conversation history management.

```json
{"messages": [
  {"role": "system", "content": "You are a friendly but professional consultant who answers various questions ... "},
  {"role": "user", "content": "Can I ask something?"},
  {"role": "assistant", "content": "Sure!"}
]}
```

The accumulator in `research` mode also looks like this.

### Reducer

The reducer mechanism must be implemented in Ruby code for each application. In many cases, it is sufficient to keep the size of the accumulator within a specific range by deleting old messages when a certain number of conversation turns are reached. Other possible implementations include the following.

**Example 1**

- Retrieve the current conversation topic as metadata at each turn and delete old exchanges if the conversation topic has changed.
- The metadata about the conversation topic is retained in list form even if old messages are deleted.
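
Example 1 could be sketched as follows. The `topics` property and the function name are assumptions made for illustration: the sketch supposes that the template asks the model to append the current topic to a `topics` list at every turn.

```ruby
# Hypothetical reducer for Example 1.
# Assumption: obj["topics"] lists the topic of each turn, newest last.
def reduce_on_topic_change(obj)
  topics = obj.fetch("topics", [])
  changed = topics.size >= 2 && topics[-1] != topics[-2]
  return obj unless changed

  # Drop earlier exchanges but keep the complete topic history.
  obj.merge("messages" => obj["messages"].last(1))
end
```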

**Example 2**

- After a certain number of turns, the reducer writes the history of the conversation up to that point to an external file and deletes it from the accumulator.
- A summary of the deleted content is returned to the accumulator as an annotation message by the `system`, and the conversation continues with that summary information as context.
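
Example 2 might be approached as in the sketch below. The function name, the `max_turns` default, the one-JSON-document-per-line archive format, and the shape of the annotation message are all assumptions; `summarize` is an injected callable (for instance, a separate API request) so that the sketch stays runnable without network access.

```ruby
require "json"

# Hypothetical reducer for Example 2.
def archive_and_summarize(obj, archive_path:, summarize:, max_turns: 5)
  messages = obj["messages"]
  return obj if messages.size <= max_turns

  # Append the full history to an external file, one JSON document per line.
  File.open(archive_path, "a") { |f| f.puts(JSON.generate(messages)) }

  # Replace the history with a single annotation carrying the summary.
  note = { "system" => "Summary of earlier conversation: #{summarize.call(messages)}" }
  obj.merge("messages" => [note])
end
```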

The Ruby implementation of the "reducer" mechanism for each default app can be found below:

- [`apps/chat/chat.rb`](https://github.com/yohasebe/monadic-chat/blob/main/apps/chat/chat.rb)
- [`apps/code/code.rb`](https://github.com/yohasebe/monadic-chat/blob/main/apps/code/code.rb)
- [`apps/novel/novel.rb`](https://github.com/yohasebe/monadic-chat/blob/main/apps/novel/novel.rb)
- [`apps/translate/translate.rb`](https://github.com/yohasebe/monadic-chat/blob/main/apps/translate/translate.rb)

## Creating New App

This section describes how users can create their own original Monadic Chat apps.

As an example, let us create an app named `linguistic`. It will perform all of the following on the user input at once:

- Return the result of syntactic parsing of the input as a primary response.
- Classify syntactic types of the input ("declarative," "interrogative," "imperative," "exclamatory," etc.)
- Perform sentiment analysis of the input ("happy," "sad," "troubled," etc.)
- Write text summarizing all the user input up to that point.

The specifications for Monadic Chat's command-line user interface for this app are as follows.

- The text to be parsed must be enclosed in double quotes to prevent the GPT model from misinterpreting it as some instruction.
- Parsed data will be formatted in Penn Treebank format. However, square brackets [ ] are used instead of parentheses ( ).
- The parsed data is returned as Markdown inline code enclosed in backticks (` `).

The use of square brackets (instead of parentheses) in the notation of syntactic analysis here is to conform to the format of [RSyntaxTree](https://yohasebe.com/rsyntaxtree), a tree-drawing program for linguistic research developed by the author of Monadic Chat.
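
For readers unfamiliar with the notation, the relationship between the two bracketing styles can be shown in a few lines. This sketch is only an aid for checking model output; the helper names `to_square_brackets` and `balanced?` are hypothetical and are not part of Monadic Chat or RSyntaxTree.

```ruby
# Convert Penn Treebank parentheses to square brackets.
def to_square_brackets(tree)
  tree.tr("()", "[]")
end

# Check that every opening bracket has a matching closing bracket.
def balanced?(tree)
  depth = 0
  tree.each_char do |ch|
    depth += 1 if ch == "["
    depth -= 1 if ch == "]"
    return false if depth.negative? # a close appeared before its open
  end
  depth.zero?
end

converted = to_square_brackets("(S (NP We) (VP (V left)))")
```

A check such as `balanced?` can be handy because a language model occasionally emits one closing bracket too many or too few.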

<img src="./doc/img/syntree-sample.png" width="300px" />

The sample app we create in this section is stored in the [`sample_app`](https://github.com/yohasebe/monadic-chat/tree/main/sample_app) folder in the repository.

### Folder/File Structure

New Monadic Chat apps must be placed inside the `apps` folder. The folders and files for default apps `chat`, `code`, `novel`, and `translate` are also in this folder.

```text
apps
├── chat
│   ├── chat.json
│   ├── chat.md
│   └── chat.rb
├── code
│   ├── code.json
│   ├── code.md
│   └── code.rb
├── novel
│   ├── novel.json
│   ├── novel.md
│   └── novel.rb
└── translate
    ├── translate.json
    ├── translate.md
    └── translate.rb
```

Notice in the listing above that three files with the same name but different extensions (`.rb`, `.json`, and `.md`) are stored under each of the four default app folders. Similarly, when creating a new app, you create these three types of files under a folder with the same name as the app name.

```text
apps
└── linguistic
    ├── linguistic.json
    ├── linguistic.md
    └── linguistic.rb
```

The purpose of each file is as follows.

- `linguistic.rb`: Ruby code to define the "reducer"
- `linguistic.json`: JSON template describing GPT behavior in `normal` mode
- `linguistic.md`: Markdown template describing GPT behavior in `research` mode

The `.rb` file is required, but you may create both `.json` and `.md` files, or only one of them.

Template files whose names begin with `_` are ignored. If a folder has a name beginning with `_`, all its contents are ignored.
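
The naming rules above can be summarized in a short discovery routine. This is a sketch of the rules as stated here, not the loader Monadic Chat actually uses; the function name and the shape of the returned hash are assumptions.

```ruby
# Hypothetical discovery routine for the `apps` folder.
def discover_apps(apps_dir)
  Dir.children(apps_dir).sort.each_with_object({}) do |name, apps|
    dir = File.join(apps_dir, name)
    next unless File.directory?(dir)
    next if name.start_with?("_") # folders beginning with "_" are ignored

    reducer = File.join(dir, "#{name}.rb")
    next unless File.exist?(reducer) # the .rb file is required

    # Only files named exactly like the app count, so "_name.md" is skipped.
    templates = %w[json md].select { |ext| File.exist?(File.join(dir, "#{name}.#{ext}")) }
    next if templates.empty? # at least one template is needed

    apps[name] = { reducer: reducer, templates: templates }
  end
end
```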

### Reducer Code

We do not need to make the reducer do anything special for the current purposes. So, let's copy the code from the default `chat` app and make a minor modification, such as changing the class name so that it matches the app name. We save it as `apps/linguistic/linguistic.rb`.

### Template for `Normal` Mode

In `normal` mode, achieving all the necessary functions shown earlier is impossible, or at least very difficult. All we do here is display the results of syntactic analysis and define a user interface. Create a JSON file `apps/linguistic/linguistic.json` and save it with the following contents:

```json
{"messages": [
  {"role": "system",
   "content": "You are a syntactic parser for natural languages. Analyze the given input sentence from the user and execute a syntactic parsing. Give your response in a variation of the penn treebank format, but use brackets [ ] instead of parentheses ( ). Also, give your response in a markdown code span. The sentence must always be parsed if the user's input sentence is enclosed in double quotes."},
  {"role": "user", "content": "\"We saw a beautiful sunset.\""},
  {"role": "assistant",
   "content": "`[S [NP We] [VP [V saw] [NP [Det a] [N' [Adj beautiful] [N sunset] ] ] ] ]`"},
  {"role": "user", "content": "\"We didn't take a picture.\"" },
  {"role": "assistant",
   "content": "`[S [NP We] [IP [I didn't] [VP [V take] [NP [Det a] [N picture] ] ] ] ]`"}
]}
```

The data structure here is no different from that specified in the [OpenAI Chat API](https://platform.openai.com/docs/guides/chat). The `normal` mode of Monadic Chat is just a client application that uses this API to achieve ChatGPT-like functionality on the command line.
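
To illustrate how such a template could be turned into a request body, here is a minimal sketch. The function name `build_chat_request` is hypothetical and the HTTP call itself is omitted; only the `model` and `messages` fields of the chat API request are shown.

```ruby
require "json"

# Build the request body for one turn: template messages plus the new user input.
def build_chat_request(template_json, user_input, model: "gpt-3.5-turbo")
  template = JSON.parse(template_json)
  messages = template["messages"] + [{ "role" => "user", "content" => user_input }]
  { "model" => model, "messages" => messages }
end
```

The few-shot examples in the template travel with every request, which is what steers the model toward the bracketed output format.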

### Template for `Research` Mode

The template in `research` mode is a Markdown file consisting of five sections. The role and content of each section are shown in the following figure.

<br />

<img src="./doc/img/research-mode-template.svg" width="500px"/>

<br />

Below we will look at the `research` mode template for the `linguistic` app, section by section.

**Main Section**

<div style="highlight highlight-source-gfm"><pre style="white-space : pre-wrap !important;">You are a natural language syntactic/semantic/pragmatic analyzer. Analyze the new prompt from the user below and execute a syntactic parsing. Give your response in a variation of the penn treebank format, but use brackets [ ] instead of parentheses ( ). Also, give your response in a markdown code span. The sentence must always be parsed if the user's input sentence is enclosed in double quotes. Create a response to the following new prompt from the user and set your response to the "response" property of the JSON object below. All prompts by "user" in the "messages" property are continuous in content.
</pre></div>

The text here largely matches the instruction message given by the `system` in the template for the `normal` mode. However, note that it also contains an instruction that the response from GPT should be presented in the form of a JSON object, as shown in one of the following sections.

**New Prompt**

```markdown
NEW PROMPT: {{PROMPT}}
```

Monadic Chat replaces `{{PROMPT}}` with input from the user when sending templates through the API.
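
The substitution can be sketched in one line of Ruby. The helper name `fill_prompt` is hypothetical, and whether the app uses `gsub` internally is an assumption; the point of the sketch is that the block form inserts the user input literally.

```ruby
TEMPLATE_LINE = "NEW PROMPT: {{PROMPT}}"

def fill_prompt(template, input)
  # With a replacement string, sequences such as \0 or \1 in the input would be
  # read as back-references; the block form inserts the input unchanged.
  template.gsub("{{PROMPT}}") { input }
end

filled = fill_prompt(TEMPLATE_LINE, "\"We didn't have a camera.\"")
```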

**JSON Object**

```json
{
  "prompt": "\"We didn't have a camera.\"",
  "response": "`[S [NP We] [VP [V didn't] [VP [V have] [NP [Det a] [N camera] ] ] ] ]`\n\n###\n\n",
  "mode": "linguistic",
  "turns": 3,
  "sentence_type": ["declarative"],
  "sentiment": ["sad"],
  "summary": "The user saw a beautiful sunset, but did not take a picture because the user did not have a camera.",
  "tokens": 351,
  "messages": [{"user": "\"We saw a beautiful sunset.\"", "assistant": "`[S [NP We] [VP [V saw] [NP [Det a] [N' [Adj beautiful] [N sunset] ] ] ] ]`\n\n###\n\n" },
    {"user": "\"We didn't take a picture.\"", "assistant": "`[S [NP We] [IP [I didn't] [VP [V take] [NP [Det a] [N picture] ] ] ] ]`\n\n###\n\n" },
    {"user": "\"We didn't have a camera.\"", "assistant": "`[S [NP We] [IP [I didn't] [VP [V have] [NP [Det a] [N camera] ] ] ] ]`\n\n###\n\n" }
  ]
}
```

This is the core of the `research` mode template.

Note that the entire `research` mode template is written in Markdown format, so the above JSON object is actually separated from the rest of the template by a code fence, as shown below.

```json
{
  "prompt": ...
  ...
  "messages": ...
}
```

The required properties of this JSON object are `prompt`, `response`, and `messages`. Other properties are optional. The format of the `messages` property is similar to that of the `normal` mode (i.e., OpenAI's chat API). The only difference is that it is structured as a list of objects whose keys are `user` and `assistant`, which makes it easier to describe.
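
As a rough illustration of the two points above, the following Ruby sketch checks for the required properties and expands the `user`/`assistant` pairs into role/content messages of the kind used by the chat API. The helper names `missing_keys` and `to_chat_messages` are assumptions made for this example and are not taken from the Monadic Chat source.

```ruby
require "json"

# The three properties named as required in the text above.
REQUIRED_KEYS = %w[prompt response messages].freeze

# Hypothetical helper: list the required properties that are missing.
def missing_keys(obj)
  REQUIRED_KEYS.reject { |key| obj.key?(key) }
end

# Hypothetical helper: expand each {"user", "assistant"} pair into two
# role/content messages.
def to_chat_messages(obj)
  obj["messages"].flat_map do |turn|
    [
      { "role" => "user", "content" => turn["user"] },
      { "role" => "assistant", "content" => turn["assistant"] }
    ]
  end
end

obj = JSON.parse(<<~EOS)
  {
    "prompt": "Hi",
    "response": "Hello",
    "messages": [{"user": "Hi", "assistant": "Hello"}]
  }
EOS

puts missing_keys(obj).empty?                            # prints: true
puts to_chat_messages(obj).map { |m| m["role"] }.join(", ")  # prints: user, assistant
```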

The JSON object in the `research` mode template is saved as `monadic_chat.json` in the user’s home directory (`$HOME`). The file is overwritten every time the JSON object is updated. Note that this JSON file is created for logging purposes only (so the data is not pretty-printed). Modifying its content does not affect the processes carried out by the app.

**Content Requirements**

```markdown
Make sure the following content requirements are all fulfilled:

- keep the value of the "mode" property at "linguistic"
- set the new prompt to the "prompt" property
- create your response to the new prompt in accordance with the "messages" and set it to "response"
- insert both the new prompt and the response after all the existing items in the "messages"
- analyze the new prompt's sentence type and set a sentence type value such as "interrogative", "imperative", "exclamatory", or "declarative" to the "sentence_type" property
- analyze the new prompt's sentiment and set one or more sentiment types such as "happy", "excited", "troubled", "upset", or "sad" to the "sentiment" property
- summarize the user's messages so far and update the "summary" property with a text of fewer than 100 words
- update the value of "tokens" with the number of tokens of the resulting JSON object
- increment the value of "turns" by 1 and update the property so that the value of "turns" equals the number of the items in the "messages" of the resulting JSON object
```

Note that all the properties of the JSON object above are mentioned so that GPT can update them accordingly.

**Formal Requirements**

```markdown
Make sure the following formal requirements are all fulfilled:

- do not use invalid characters in the JSON object
- escape double quotes and other special characters in the text values in the resulting JSON object

Add "\n\n###\n\n" at the end of the "response" value.

Wrap the JSON object with "<JSON>\n" and "\n</JSON>".
```

This section specifies the format of the response returned through the API. JSON is essentially text data, and some characters must be escaped appropriately.

Because of their importance, two formal requirements are stated as independent sentences rather than as list items. This is necessary because the language model available in OpenAI’s text-completion API is subject to some indeterminacy (even when the `temperature` parameter is `0.0`).

- To ensure that a valid JSON object is retrieved, Monadic Chat requires `<JSON> ... </JSON>` tags to enclose the whole JSON data.
- Monadic Chat requires that the primary response from GPT end with the string `\n\n###\n\n`. This is part of the mechanism to detect when the response string has reached the end so that it can be displayed on the terminal as soon as possible.
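
The two requirements above suggest how a client might post-process the raw completion text. The following Ruby sketch is an assumption about one possible approach, not the code used by Monadic Chat; `extract_json` and `display_text` are hypothetical names.

```ruby
require "json"

# Terminator string that the template asks GPT to append to "response".
TERMINATOR = "\n\n###\n\n"

# Hypothetical helper: pull out and parse the text between <JSON> and
# </JSON>. Returns nil if the tags are missing or the JSON is invalid.
def extract_json(text)
  match = text.match(%r{<JSON>\s*(.*?)\s*</JSON>}m)
  return nil unless match

  JSON.parse(match[1])
rescue JSON::ParserError
  nil
end

# Hypothetical helper: drop the terminator before showing the response.
def display_text(response)
  response.delete_suffix(TERMINATOR)
end

raw = %(<JSON>\n{"prompt": "Hi", "response": "Hello\\n\\n###\\n\\n", "messages": []}\n</JSON>)
obj = extract_json(raw)
puts display_text(obj["response"])  # prints: Hello
```

Returning `nil` on failure lets the caller decide whether to retry the request, which seems a reasonable choice given the indeterminacy mentioned above.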

## What is Monadic about Monadic Chat?

A monad is a type of data structure in functional programming (leaving aside for the moment the notion of the monad in mathematical category theory). An element with a monadic structure can be manipulated in a certain way to change its internal data. However, no matter how much the internal data changes, the external structure of the monadic process remains the same and can be manipulated in the same way as it was at first.

Many such monadic processes surround us, and natural language discourse is one of them. A "chat" between a human user and an AI agent can be thought of as a form of natural language discourse, which is monadic in nature. If so, an application that provides an interactive interface to a large-scale language model, such as ChatGPT, would most naturally be designed in a "functional" way, considering the monadic nature of natural language discourse.

### Unit, Map, and Join

Many “functional” programming languages, such as Haskell, have monads as a core feature. Monadic Chat, however, is developed in Ruby, which does not. Ruby was chosen because it makes it easier for users to write their own apps. Ruby is not classified as a "functional language" per se. Still, Monadic Chat has the following three features required of a monad, and in this sense, it can be considered "monadic."

- **unit**: a monadic process has a means of taking data and enclosing it in a monadic structure
- **map**: a monadic process has a means of performing some operation on the data inside a monadic structure and returning the result in a monadic structure
- **join**: a monadic process has a means of flattening a structure with multiple monadic layers into a single monadic layer
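
To make the three operations listed above concrete, here is a minimal Ruby sketch. The `Box` class is purely hypothetical and is not part of Monadic Chat; it only illustrates what unit, map, and join mean for a generic wrapper.

```ruby
# Box is a hypothetical, minimal monad-like wrapper used for illustration.
class Box
  attr_reader :value

  def initialize(value)
    @value = value
  end

  # unit: enclose plain data in the structure.
  def self.unit(value)
    new(value)
  end

  # map: apply an operation to the inner data and re-wrap the result.
  def map
    Box.unit(yield(value))
  end

  # join: flatten Box(Box(x)) into Box(x); a single layer is left as is.
  def join
    value.is_a?(Box) ? value : self
  end
end

nested = Box.unit(1).map { |n| Box.unit(n + 1) } # Box(Box(2))
flat = nested.join                               # Box(2)
puts flat.value                                  # prints: 2
```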

### Discourse Management Object

In Monadic Chat's `research` mode, the discourse management object described in JSON serves as an environment to keep a conversation going between the user and the large language model. Any sample/past interaction data can be wrapped inside such an environment (***unit***).

The interaction between the user and the AI can be interpreted as an operation on the *discourse world* built in the previous conversational exchanges. Monadic Chat updates the discourse world by retrieving the conversation history embedded in the template and performing operations responding to user input (***map***).

Responses from OpenAI's language model APIs (chat API and text-completion API) are also returned in the same JSON format. The main conversational response content is wrapped within this environment. If the whole object were treated as the conversational response to the user input, the discourse management object would involve a nested structure, which could continue infinitely. Therefore, Monadic Chat extracts only the necessary values from the response object and reassembles the (single-layered) discourse management object using them (***join***).
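
The cycle described above can be sketched with plain Ruby hashes. Everything in this example is an assumption made for illustration: `fake_model` stands in for the API call, and `discourse_unit`, `discourse_map`, and `discourse_join` are hypothetical names, not functions from Monadic Chat.

```ruby
# Hypothetical stand-in for the language model: it receives the current
# discourse object and a new prompt, and returns a whole new object.
def fake_model(discourse, prompt)
  reply = "echo: #{prompt}"
  {
    "prompt" => prompt,
    "response" => reply,
    "messages" => discourse["messages"] + [{ "user" => prompt, "assistant" => reply }],
    "extra" => "anything else the model happened to return"
  }
end

# unit: wrap past interaction data in a discourse object.
def discourse_unit(messages)
  { "prompt" => "", "response" => "", "messages" => messages }
end

# map: run an operation on the current object. The reply comes back as a
# whole object, so storing it as the response nests one level deeper.
def discourse_map(discourse, prompt)
  discourse.merge("response" => fake_model(discourse, prompt))
end

# join: take only the needed values from the nested reply and rebuild a
# single-layered discourse object.
def discourse_join(nested)
  nested["response"].slice("prompt", "response", "messages")
end

state = discourse_unit([])
state = discourse_join(discourse_map(state, "Hello"))
state = discourse_join(discourse_map(state, "Bye"))
puts state["messages"].size  # prints: 2
puts state["response"]       # prints: echo: Bye
```

However many turns are taken, `state` keeps the same single-layered shape, which is the property the text attributes to the discourse management object.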

<br />

<img src="./doc/img/state-monad.svg" width="700px" />

<br />

The architecture of the `research` mode of Monadic Chat--with its capability of generating and managing metadata properties inside a monadic structure--is parallel to the architecture of natural language discourse in general: They both can be seen as a kind of "state monad" (Hasebe 2021).

## Future Plans

- Add more test cases to verify command-line user interaction behavior
- Improve the error handling mechanism to catch incorrect responses from GPT
- Develop a DSL to define templates in a more efficient and systematic manner
- Develop scaffolding capabilities to build new apps quickly

## Bibliographical Data

Please use one of the following BibTeX entries when referring to Monadic Chat or the underlying concepts.

```
@inproceedings{hasebe_2023j,
  author = {長谷部陽一郎},
  title = {Monadic Chat:テキスト補完APIで文脈を保持するためのフレームワーク},
  booktitle = {言語処理学会第29回年次大会発表論文集},
  url = {https://www.anlp.jp/proceedings/annual_meeting/2023/pdf_dir/Q12-9.pdf},
  year = {2023},
  pages = {3138--3143}
}

@inproceedings{hasebe_2023e,
  author = {Yoichiro Hasebe},
  title = {Monadic Chat: Framework for managing context with text completion API},
  booktitle = {Proceedings of the 29th annual meeting of the Association for Natural Language Processing},
  url = {https://www.anlp.jp/proceedings/annual_meeting/2023/pdf_dir/Q12-9.pdf},
  year = {2023},
  pages = {3138--3143}
}

@phdthesis{hasebe_2021,
  author = {Yoichiro Hasebe},
  title = {An Integrated Approach to Discourse Connectives as Grammatical Constructions},
  school = {Kyoto University},
  url = {https://repository.kulib.kyoto-u.ac.jp/dspace/bitstream/2433/261627/2/dnink00969.pdf},
  year = {2021}
}

```

## Acknowledgments

This work was partially supported by JSPS KAKENHI Grant Number JP18K00670.

## Contributing

Bug reports and pull requests are welcome on GitHub at [https://github.com/yohasebe/monadic_chat](https://github.com/yohasebe/monadic_chat).

## Author

Yoichiro HASEBE

[yohasebe@gmail.com](mailto:yohasebe@gmail.com)

## License

The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
data/Rakefile
ADDED
data/apps/chat/chat.json
ADDED
@@ -0,0 +1,4 @@
{"messages": [
{"role": "system",
"content": "You are a friendly but professional consultant who answers various questions, writes computer program code, makes decent suggestions, and gives helpful advice in response to a prompt from the user. If the prompt is not clear enough, ask the user to rephrase it. You are able to empathize with the user; insert an emoji (displayable on the terminal screen) that you deem appropriate for the user's input at the beginning of your response. If the user input is sentimentally neutral, pick up any emoji that matches the topic."}
]}