@lobehub/chat 1.2.13 → 1.2.14

@@ -1 +1,139 @@
1
- TODO
1
+ ---
2
+ title: OpenAI GPT Series Tools Calling Evaluation
3
+ description: >-
4
+ Test the tool calling (Function Calling) capabilities of OpenAI GPT series models
5
+ (GPT 3.5-turbo / GPT-4 / GPT-4o) with LobeChat and present the evaluation results
6
+ tags:
7
+ - Tools Calling
8
+ - Benchmark
9
+ - Function Calling
10
+ - Tool Calling
11
+ - Plugins
12
+ ---
13
+
14
+ # OpenAI GPT Series Tool Calling
15
+
16
+ Overview of the Tool Calling capabilities of OpenAI GPT series models:
17
+
18
+ | Model | Tool Calling Support | Streaming | Parallel | Simple Instruction Score | Complex Instruction Score |
19
+ | --- | --- | --- | --- | --- | --- |
20
+ | GPT-3.5-turbo | ✅ | ✅ | ✅ | 🌟🌟🌟 | 🌟 |
21
+ | GPT-4-turbo | ✅ | ✅ | ✅ | 🌟🌟 | 🌟🌟 |
22
+ | GPT-4o | ✅ | ✅ | ✅ | 🌟🌟🌟 | 🌟🌟 |
23
+
24
+ <Callout type={'info'}>
25
+ For testing instructions, see [Tools Calling - Evaluation Task
26
+ Introduction](/en/docs/usage/tools-calling#evaluation-task-introduction)
27
+ </Callout>
28
+
29
+ ## GPT-3.5 Turbo
30
+
31
+ ### Simple Instruction Call: Weather Inquiry
32
+
33
+ Test Instruction: Instruction ①
34
+
35
+ <Video src="https://github.com/lobehub/lobe-chat/assets/28616219/65901ee2-78b8-4f56-9e0d-6407c484f434" />
36
+
37
+ <Image
38
+ alt="Tool Calling for Simple Instruction in GPT 3.5 Turbo"
39
+ src="https://github.com/lobehub/lobe-chat/assets/28616219/1251dfc0-d1c4-4c3d-825e-dd6205793d53"
40
+ />
41
+
42
+ <details>
43
+ <summary>Streaming Tool Calling Raw Output:</summary>
44
+
45
+ </details>
46
+
47
+ ### Complex Instruction Call: Text-to-Image
48
+
49
+ Test Instruction: Instruction ②
50
+
51
+ <Video src="https://github.com/lobehub/lobe-chat/assets/28616219/2047665f-ab22-4da7-a390-0fb4ec5a2a14" />
52
+
53
+ <Image
54
+ alt="Tool Calling for Complex Instruction in GPT 3.5 Turbo"
55
+ src="https://github.com/lobehub/lobe-chat/assets/28616219/125ad028-a621-4433-b5fa-321f8fd76302"
56
+ />
57
+
58
+ <details>
59
+ <summary>Streaming Tool Calling Raw Output:</summary>
60
+
61
+ </details>
62
+
63
+ ## GPT-4 Turbo
64
+
65
+ ### Simple Instruction Call: Weather Inquiry
66
+
67
+ Test Instruction: Instruction ①
68
+
69
+ Unlike GPT-3.5 Turbo, GPT-4 Turbo did not reply with "okay" when making the tool call, and this behavior stayed the same across multiple tests. On this compound instruction it therefore follows instructions less well than GPT-3.5 Turbo, although its other two capabilities remain good.
70
+
71
+ It is also possible that GPT-4 Turbo has more "autonomy" and decides that it does not need to output this "okay".
72
+
73
+ <Video src="https://github.com/lobehub/lobe-chat/assets/28616219/f865d91b-b84a-4258-ae09-9d1e15eeb43d" />
74
+
75
+ <Image
76
+ alt="Tool Calling for Simple Instruction in GPT-4 Turbo"
77
+ src="https://github.com/lobehub/lobe-chat/assets/28616219/19298693-7a9b-4b54-9e28-c46b541b4f41"
78
+ />
79
+
80
+ <details>
81
+ <summary>Streaming Tool Calling Raw Output:</summary>
82
+
83
+ </details>
84
+
85
+ ### Complex Instruction Call: Text-to-Image
86
+
87
+ Test Instruction: Instruction ②
88
+
89
+ <Video src="https://github.com/lobehub/lobe-chat/assets/28616219/69989faf-9b98-41ec-ba51-40cc3545d8d1" />
90
+
91
+ <Image
92
+ alt="Tool Calling for Complex Instruction in GPT-4 Turbo"
93
+ src="https://github.com/lobehub/lobe-chat/assets/28616219/8329c1b2-5e36-4457-946c-ce3781b05afd"
94
+ />
95
+
96
+ <details>
97
+ <summary>Streaming Tool Calling Raw Output:</summary>
98
+
99
+ </details>
100
+
101
+ ## GPT-4o
102
+
103
+ ### Simple Instruction Call: Weather Inquiry
104
+
105
+ Test Instruction: Instruction ①
106
+
107
+ Like GPT-3.5 Turbo, GPT-4o follows the compound instruction very well in the simple instruction call.
108
+
109
+ <Video src="https://github.com/lobehub/lobe-chat/assets/28616219/c77b65ab-0854-4e1f-a25b-ff43275bd318" />
110
+
111
+ <Image
112
+ alt="Tool Calling for Simple Instruction in GPT-4o"
113
+ src="https://github.com/lobehub/lobe-chat/assets/28616219/e5d6214f-f628-4064-a330-cbd7c5d474ac"
114
+ />
115
+
116
+ <details>
117
+ <summary>Streaming Tool Calling Raw Output:</summary>
118
+
119
+ </details>
120
+
121
+ ### Complex Instruction Call: Text-to-Image
122
+
123
+ Test Instruction: Instruction ②
124
+
125
+ <Video src="https://github.com/lobehub/lobe-chat/assets/28616219/714bd86a-3b58-4941-8323-186c3fa4c6ea" />
126
+
127
+ <Image
128
+ alt="Tool Calling for Complex Instruction in GPT-4o"
129
+ src="https://github.com/lobehub/lobe-chat/assets/28616219/8329c1b2-5e36-4457-946c-ce3781b05afd"
130
+ />
131
+
132
+ <details>
133
+ <summary>Streaming Tool Calling Raw Output:</summary>
134
+
135
+ ```yml
136
+
137
+ ```
138
+
139
+ </details>
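
The Streaming and Parallel columns in the overview table refer to tool calls that arrive as incremental fragments and to several calls issued in a single response. As a rough illustration only (this is not LobeChat's implementation), the sketch below merges such fragments by index, assuming the delta shape used by OpenAI's Chat Completions streaming (`index`, `id`, `function.name`, `function.arguments`). The type names, the `getWeather` tool, the call ids, and the arguments are all made up for the example and are not raw output from the tests above.

```typescript
// Hypothetical types mirroring the shape of streamed tool-call deltas.
interface ToolCallDelta {
  index: number; // which parallel call this fragment belongs to
  id?: string; // usually present only on the first fragment of a call
  function?: { name?: string; arguments?: string };
}

interface ToolCall {
  id: string;
  name: string;
  arguments: string; // JSON text, complete only once the stream has ended
}

// Merge fragments by index: the first fragment for an index carries the id
// and function name, later fragments append pieces of the arguments string.
function mergeToolCallDeltas(deltas: ToolCallDelta[]): ToolCall[] {
  const calls: ToolCall[] = [];
  for (const delta of deltas) {
    if (!calls[delta.index]) {
      calls[delta.index] = { id: "", name: "", arguments: "" };
    }
    const call = calls[delta.index];
    if (delta.id) call.id = delta.id;
    if (delta.function?.name) call.name = delta.function.name;
    if (delta.function?.arguments) call.arguments += delta.function.arguments;
  }
  // Drop any holes left by indices that never appeared.
  return calls.filter(Boolean);
}

// Illustrative input: two parallel calls whose fragments are interleaved.
const sampleDeltas: ToolCallDelta[] = [
  { index: 0, id: "call_a", function: { name: "getWeather", arguments: "" } },
  { index: 0, function: { arguments: '{"city":' } },
  { index: 1, id: "call_b", function: { name: "getWeather", arguments: "" } },
  { index: 0, function: { arguments: '"Hangzhou"}' } },
  { index: 1, function: { arguments: '{"city":"Beijing"}' } },
];

const merged = mergeToolCallDeltas(sampleDeltas);
for (const call of merged) {
  // Arguments are parsed only after all fragments have been joined.
  console.log(call.id, call.name, JSON.parse(call.arguments));
}
```

Parsing the arguments only after the stream finishes is deliberate: an individual fragment such as `{"city":` is not valid JSON on its own.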