@botpress/zai 2.2.0 → 2.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -763,3 +763,145 @@
763
763
  {"test":"group > large arrays and chunking > should handle array that requires multiple group window slides-7","key":"63a629bf","input":"[{\"input\":{\"messages\":[{\"content\":\"**Existing Groups (prefer reusing these):**\\n- Group Item 0\\n- Group Item 1\\n- Group Item 2\\n- Group Item 3\\n- Group Item 4\\n- Group Item 5\\n- Group Item 6\\n- Group Item 7\\n- Group Item 8\\n- Group Item 9\\n\\n**Elements (■0 to ■49):**\\n■0: item 0■\\n■1: item 1■\\n■2: item 2■\\n■3: item 3■\\n■4: item 4■\\n■5: item 5■\\n■6: item 6■\\n■7: item 7■\\n■8: item 8■\\n■9: item 9■\\n■10: item 0■\\n■11: item 1■\\n■12: item 2■\\n■13: item 3■\\n■14: item 4■\\n■15: item 5■\\n■16: item 6■\\n■17: item 7■\\n■18: item 8■\\n■19: item 9■\\n■20: item 0■\\n■21: item 1■\\n■22: item 2■\\n■23: item 3■\\n■24: item 4■\\n■25: item 5■\\n■26: item 6■\\n■27: item 7■\\n■28: item 8■\\n■29: item 9■\\n■30: item 0■\\n■31: item 1■\\n■32: item 2■\\n■33: item 3■\\n■34: item 4■\\n■35: item 5■\\n■36: item 6■\\n■37: item 7■\\n■38: item 8■\\n■39: item 9■\\n■40: item 0■\\n■41: item 1■\\n■42: item 2■\\n■43: item 3■\\n■44: item 4■\\n■45: item 5■\\n■46: item 6■\\n■47: item 7■\\n■48: item 8■\\n■49: item 9■\\n\\n**Task:** For each element, output one line with its group label.\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.group\",\"promptSource\":\"zai:zai.group:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are grouping elements into cohesive groups.\\n\\n**Instructions:** Group identical or very similar items together\\n\\n\\n**Important:**\\n- Each element gets exactly ONE group label\\n- Use EXACT SAME label for similar items (case-sensitive)\\n- Create new descriptive labels when needed\\n\\n**Output Format:**\\nOne line per element:\\n■0:Group Label■\\n■1:Group Label■\\n■END■\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXWYHTBLBZ1G5A6UGT1Izvg2eyExv","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:Group Item 0■ \n■1:Group Item 1■ \n■2:Group Item 2■ \n■3:Group Item 3■ \n■4:Group Item 4■ \n■5:Group Item 5■ \n■6:Group Item 6■ \n■7:Group Item 7■ \n■8:Group Item 8■ \n■9:Group Item 9■ \n■10:Group Item 0■ \n■11:Group Item 1■ \n■12:Group Item 2■ \n■13:Group Item 3■ \n■14:Group Item 4■ \n■15:Group Item 5■ \n■16:Group Item 6■ \n■17:Group Item 7■ \n■18:Group Item 8■ \n■19:Group Item 9■ \n■20:Group Item 0■ \n■21:Group Item 1■ \n■22:Group Item 2■ \n■23:Group Item 3■ \n■24:Group Item 4■ \n■25:Group Item 5■ \n■26:Group Item 6■ \n■27:Group Item 7■ \n■28:Group Item 8■ \n■29:Group Item 9■ \n■30:Group Item 0■ \n■31:Group Item 1■ \n■32:Group Item 2■ \n■33:Group Item 3■ \n■34:Group Item 4■ \n■35:Group Item 5■ \n■36:Group Item 6■ \n■37:Group Item 7■ \n■38:Group Item 8■ \n■39:Group Item 9■ \n■40:Group Item 0■ \n■41:Group Item 1■ \n■42:Group Item 2■ \n■43:Group Item 3■ \n■44:Group Item 4■ \n■45:Group Item 5■ \n■46:Group Item 6■ \n■47:Group Item 7■ \n■48:Group Item 8■ \n■49:Group Item 9■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":588,"inputCost":0.0014700000000000002,"outputTokens":453,"outputCost":0.00453},"botpress":{"cost":0.006}},"meta":{"cached":false}}}
764
764
  {"test":"group > large arrays and chunking > should handle array that requires multiple group window slides-6","key":"63a629bf","input":"[{\"input\":{\"messages\":[{\"content\":\"**Existing Groups (prefer reusing these):**\\n- Group Item 0\\n- Group Item 1\\n- Group Item 2\\n- Group Item 3\\n- Group Item 4\\n- Group Item 5\\n- Group Item 6\\n- Group Item 7\\n- Group Item 8\\n- Group Item 9\\n\\n**Elements (■0 to ■49):**\\n■0: item 0■\\n■1: item 1■\\n■2: item 2■\\n■3: item 3■\\n■4: item 4■\\n■5: item 5■\\n■6: item 6■\\n■7: item 7■\\n■8: item 8■\\n■9: item 9■\\n■10: item 0■\\n■11: item 1■\\n■12: item 2■\\n■13: item 3■\\n■14: item 4■\\n■15: item 5■\\n■16: item 6■\\n■17: item 7■\\n■18: item 8■\\n■19: item 9■\\n■20: item 0■\\n■21: item 1■\\n■22: item 2■\\n■23: item 3■\\n■24: item 4■\\n■25: item 5■\\n■26: item 6■\\n■27: item 7■\\n■28: item 8■\\n■29: item 9■\\n■30: item 0■\\n■31: item 1■\\n■32: item 2■\\n■33: item 3■\\n■34: item 4■\\n■35: item 5■\\n■36: item 6■\\n■37: item 7■\\n■38: item 8■\\n■39: item 9■\\n■40: item 0■\\n■41: item 1■\\n■42: item 2■\\n■43: item 3■\\n■44: item 4■\\n■45: item 5■\\n■46: item 6■\\n■47: item 7■\\n■48: item 8■\\n■49: item 9■\\n\\n**Task:** For each element, output one line with its group label.\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.group\",\"promptSource\":\"zai:zai.group:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are grouping elements into cohesive groups.\\n\\n**Instructions:** Group identical or very similar items together\\n\\n\\n**Important:**\\n- Each element gets exactly ONE group label\\n- Use EXACT SAME label for similar items (case-sensitive)\\n- Create new descriptive labels when needed\\n\\n**Output Format:**\\nOne line per element:\\n■0:Group Label■\\n■1:Group Label■\\n■END■\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXWYHhP1EGL11sVTtYlvfyNp0LV7o","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:Group Item 0■ \n■1:Group Item 1■ \n■2:Group Item 2■ \n■3:Group Item 3■ \n■4:Group Item 4■ \n■5:Group Item 5■ \n■6:Group Item 6■ \n■7:Group Item 7■ \n■8:Group Item 8■ \n■9:Group Item 9■ \n■10:Group Item 0■ \n■11:Group Item 1■ \n■12:Group Item 2■ \n■13:Group Item 3■ \n■14:Group Item 4■ \n■15:Group Item 5■ \n■16:Group Item 6■ \n■17:Group Item 7■ \n■18:Group Item 8■ \n■19:Group Item 9■ \n■20:Group Item 0■ \n■21:Group Item 1■ \n■22:Group Item 2■ \n■23:Group Item 3■ \n■24:Group Item 4■ \n■25:Group Item 5■ \n■26:Group Item 6■ \n■27:Group Item 7■ \n■28:Group Item 8■ \n■29:Group Item 9■ \n■30:Group Item 0■ \n■31:Group Item 1■ \n■32:Group Item 2■ \n■33:Group Item 3■ \n■34:Group Item 4■ \n■35:Group Item 5■ \n■36:Group Item 6■ \n■37:Group Item 7■ \n■38:Group Item 8■ \n■39:Group Item 9■ \n■40:Group Item 0■ \n■41:Group Item 1■ \n■42:Group Item 2■ \n■43:Group Item 3■ \n■44:Group Item 4■ \n■45:Group Item 5■ \n■46:Group Item 6■ \n■47:Group Item 7■ \n■48:Group Item 8■ \n■49:Group Item 9■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":588,"inputCost":0.0014700000000000002,"outputTokens":453,"outputCost":0.00453},"botpress":{"cost":0.006}},"meta":{"cached":false}}}
765
765
  {"test":"group > large arrays and chunking > should handle array that requires multiple group window slides-5","key":"63a629bf","input":"[{\"input\":{\"messages\":[{\"content\":\"**Existing Groups (prefer reusing these):**\\n- Group Item 0\\n- Group Item 1\\n- Group Item 2\\n- Group Item 3\\n- Group Item 4\\n- Group Item 5\\n- Group Item 6\\n- Group Item 7\\n- Group Item 8\\n- Group Item 9\\n\\n**Elements (■0 to ■49):**\\n■0: item 0■\\n■1: item 1■\\n■2: item 2■\\n■3: item 3■\\n■4: item 4■\\n■5: item 5■\\n■6: item 6■\\n■7: item 7■\\n■8: item 8■\\n■9: item 9■\\n■10: item 0■\\n■11: item 1■\\n■12: item 2■\\n■13: item 3■\\n■14: item 4■\\n■15: item 5■\\n■16: item 6■\\n■17: item 7■\\n■18: item 8■\\n■19: item 9■\\n■20: item 0■\\n■21: item 1■\\n■22: item 2■\\n■23: item 3■\\n■24: item 4■\\n■25: item 5■\\n■26: item 6■\\n■27: item 7■\\n■28: item 8■\\n■29: item 9■\\n■30: item 0■\\n■31: item 1■\\n■32: item 2■\\n■33: item 3■\\n■34: item 4■\\n■35: item 5■\\n■36: item 6■\\n■37: item 7■\\n■38: item 8■\\n■39: item 9■\\n■40: item 0■\\n■41: item 1■\\n■42: item 2■\\n■43: item 3■\\n■44: item 4■\\n■45: item 5■\\n■46: item 6■\\n■47: item 7■\\n■48: item 8■\\n■49: item 9■\\n\\n**Task:** For each element, output one line with its group label.\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.group\",\"promptSource\":\"zai:zai.group:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are grouping elements into cohesive groups.\\n\\n**Instructions:** Group identical or very similar items together\\n\\n\\n**Important:**\\n- Each element gets exactly ONE group label\\n- Use EXACT SAME label for similar items (case-sensitive)\\n- Create new descriptive labels when needed\\n\\n**Output Format:**\\nOne line per element:\\n■0:Group Label■\\n■1:Group Label■\\n■END■\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXWYH0yHmQYz64QWkji2nOskM5uNv","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:Group Item 0■ \n■1:Group Item 1■ \n■2:Group Item 2■ \n■3:Group Item 3■ \n■4:Group Item 4■ \n■5:Group Item 5■ \n■6:Group Item 6■ \n■7:Group Item 7■ \n■8:Group Item 8■ \n■9:Group Item 9■ \n■10:Group Item 0■ \n■11:Group Item 1■ \n■12:Group Item 2■ \n■13:Group Item 3■ \n■14:Group Item 4■ \n■15:Group Item 5■ \n■16:Group Item 6■ \n■17:Group Item 7■ \n■18:Group Item 8■ \n■19:Group Item 9■ \n■20:Group Item 0■ \n■21:Group Item 1■ \n■22:Group Item 2■ \n■23:Group Item 3■ \n■24:Group Item 4■ \n■25:Group Item 5■ \n■26:Group Item 6■ \n■27:Group Item 7■ \n■28:Group Item 8■ \n■29:Group Item 9■ \n■30:Group Item 0■ \n■31:Group Item 1■ \n■32:Group Item 2■ \n■33:Group Item 3■ \n■34:Group Item 4■ \n■35:Group Item 5■ \n■36:Group Item 6■ \n■37:Group Item 7■ \n■38:Group Item 8■ \n■39:Group Item 9■ \n■40:Group Item 0■ \n■41:Group Item 1■ \n■42:Group Item 2■ \n■43:Group Item 3■ \n■44:Group Item 4■ \n■45:Group Item 5■ \n■46:Group Item 6■ \n■47:Group Item 7■ \n■48:Group Item 8■ \n■49:Group Item 9■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":588,"inputCost":0.0014700000000000002,"outputTokens":453,"outputCost":0.00453},"botpress":{"cost":0.006}},"meta":{"cached":false}}}
766
+ {"test":"zai.rate > string instructions > should rate a single item with string instructions-1","key":"5acfacea","input":"[{\"input\":{\"messages\":[{\"content\":\"Generate 3-5 evaluation criteria for: \\\"how complete is this student record?\\\"\\n\\nFor each criterion, provide 5 labels (very_bad, bad, average, good, very_good) with brief descriptions.\\n\\nOutput format (JSON):\\n{\\n \\\"criterion1_name\\\": {\\n \\\"very_bad\\\": \\\"description\\\",\\n \\\"bad\\\": \\\"description\\\",\\n \\\"average\\\": \\\"description\\\",\\n \\\"good\\\": \\\"description\\\",\\n \\\"very_good\\\": \\\"description\\\"\\n },\\n \\\"criterion2_name\\\": { ... }\\n}\\n\\nKeep criterion names short (1-2 words, lowercase, use underscores).\\nKeep descriptions brief (5-10 words each).\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating evaluation criteria for rating items on a 1-5 scale.\\nEach criterion must have exactly 5 labels: very_bad (1), bad (2), average (3), good (4), very_good (5).\\nOutput valid JSON only.\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXBKuqR48SwAlZGLQOnC1lMhb2H","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"```json\n{\n \"basic_information\": {\n \"very_bad\": \"Missing all personal info fields.\",\n \"bad\": \"Most personal info fields are incomplete.\",\n \"average\": \"Some key personal info is complete.\",\n \"good\": \"Most personal info is filled accurately.\",\n \"very_good\": \"All personal info fields are complete.\"\n },\n \"academic_details\": {\n \"very_bad\": \"No academic records are provided.\",\n \"bad\": \"Few academic details are incomplete.\",\n \"average\": \"Some academic records are missing.\",\n \"good\": \"Most academic records are complete.\",\n \"very_good\": \"All academic details are complete.\"\n },\n \"contact_information\": {\n \"very_bad\": \"No contact details provided.\",\n \"bad\": \"Most contact fields are blank.\",\n \"average\": \"Some contact details are incomplete.\",\n \"good\": \"Most contact details are provided.\",\n \"very_good\": \"All contact fields are complete.\"\n },\n \"documents_uploaded\": {\n \"very_bad\": \"No required documents uploaded.\",\n \"bad\": \"Only a few documents uploaded.\",\n \"average\": \"Some essential documents are missing.\",\n \"good\": \"Most required documents are uploaded.\",\n \"very_good\": \"All required documents are uploaded.\"\n }\n}\n```","index":0,"stopReason":"stop"}],"usage":{"inputTokens":192,"inputCost":0.00048000000000000007,"outputTokens":268,"outputCost":0.00268},"botpress":{"cost":0.00316}},"meta":{"cached":false}}}
767
+ {"test":"zai.rate > string instructions > should rate a single item with string instructions-2","key":"3d46d526","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■0):\\n■0: {\\\"name\\\":\\\"Alice Johnson\\\",\\\"studentId\\\":\\\"STU-001\\\",\\\"email\\\":\\\"alice@school.edu\\\",\\\"phone\\\":\\\"+1-555-0100\\\",\\\"address\\\":\\\"123 Main St, City, State 12345\\\",\\\"enrollmentDate\\\":\\\"2024-01-15\\\",\\\"gpa\\\":3.8,\\\"credits\\\":120}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**basic_information**:\\n - very_bad (1): Missing all personal info fields.\\n - bad (2): Most personal info fields are incomplete.\\n - average (3): Some key personal info is complete.\\n - good (4): Most personal info is filled accurately.\\n - very_good (5): All personal info fields are complete.\\n\\n**academic_details**:\\n - very_bad (1): No academic records are provided.\\n - bad (2): Few academic details are incomplete.\\n - average (3): Some academic records are missing.\\n - good (4): Most academic records are complete.\\n - very_good (5): All academic details are complete.\\n\\n**contact_information**:\\n - very_bad (1): No contact details provided.\\n - bad (2): Most contact fields are blank.\\n - average (3): Some contact details are incomplete.\\n - good (4): Most contact details are provided.\\n - very_good (5): All contact fields are complete.\\n\\n**documents_uploaded**:\\n - very_bad (1): No required documents uploaded.\\n - bad (2): Only a few documents uploaded.\\n - average (3): Some essential documents are missing.\\n - good (4): Most required documents are uploaded.\\n - very_good (5): All required documents are uploaded.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■0)\\n- Use exact criterion names: basic_information, academic_details, contact_information, documents_uploaded\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXEP5O8BxlLANTYXAnnLDouVFBG","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:basic_information=very_good;academic_details=very_good;contact_information=very_good;documents_uploaded=very_bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":540,"inputCost":0.00135,"outputTokens":31,"outputCost":0.00031},"botpress":{"cost":0.00166}},"meta":{"cached":false}}}
768
+ {"test":"zai.rate > string instructions > should rate multiple items with clear differences-2","key":"89f4051a","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■2):\\n■0: {\\\"name\\\":\\\"Alice Johnson\\\",\\\"studentId\\\":\\\"STU-001\\\",\\\"email\\\":\\\"alice@school.edu\\\",\\\"phone\\\":\\\"+1-555-0100\\\",\\\"address\\\":\\\"123 Main St\\\",\\\"enrollmentDate\\\":\\\"2024-01-15\\\"}■\\n■1: {\\\"name\\\":\\\"Bob\\\",\\\"studentId\\\":null,\\\"email\\\":null,\\\"phone\\\":null,\\\"address\\\":null,\\\"enrollmentDate\\\":null}■\\n■2: {\\\"name\\\":\\\"Carol Smith\\\",\\\"studentId\\\":\\\"STU-003\\\",\\\"email\\\":\\\"carol@school.edu\\\",\\\"phone\\\":null,\\\"address\\\":null,\\\"enrollmentDate\\\":\\\"2024-02-01\\\"}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**basic_information**:\\n - very_bad (1): Missing all personal info fields.\\n - bad (2): Most personal info fields are incomplete.\\n - average (3): Some key personal info is complete.\\n - good (4): Most personal info is filled accurately.\\n - very_good (5): All personal info fields are complete.\\n\\n**academic_details**:\\n - very_bad (1): No academic records are provided.\\n - bad (2): Few academic details are incomplete.\\n - average (3): Some academic records are missing.\\n - good (4): Most academic records are complete.\\n - very_good (5): All academic details are complete.\\n\\n**contact_information**:\\n - very_bad (1): No contact details provided.\\n - bad (2): Most contact fields are blank.\\n - average (3): Some contact details are incomplete.\\n - good (4): Most contact details are provided.\\n - very_good (5): All contact fields are complete.\\n\\n**documents_uploaded**:\\n - very_bad (1): No required documents uploaded.\\n - bad (2): Only a few documents uploaded.\\n - average (3): Some essential documents are missing.\\n - good (4): Most required documents are uploaded.\\n - very_good (5): All required documents are uploaded.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■2)\\n- Use exact criterion names: basic_information, academic_details, contact_information, documents_uploaded\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXFPV1DGBhdbe42nKYYFu0c2YGX","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:basic_information=very_good;academic_details=very_bad;contact_information=very_good;documents_uploaded=very_bad■ \n■1:basic_information=very_bad;academic_details=very_bad;contact_information=very_bad;documents_uploaded=very_bad■ \n■2:basic_information=good;academic_details=very_bad;contact_information=average;documents_uploaded=very_bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":601,"inputCost":0.0015025000000000001,"outputTokens":85,"outputCost":0.0008500000000000001},"botpress":{"cost":0.0023525000000000004}},"meta":{"cached":false}}}
769
+ {"test":"zai.rate > string instructions > should provide detailed results with string instructions-1","key":"569f869e","input":"[{\"input\":{\"messages\":[{\"content\":\"Generate 3-5 evaluation criteria for: \\\"how complete is this form submission?\\\"\\n\\nFor each criterion, provide 5 labels (very_bad, bad, average, good, very_good) with brief descriptions.\\n\\nOutput format (JSON):\\n{\\n \\\"criterion1_name\\\": {\\n \\\"very_bad\\\": \\\"description\\\",\\n \\\"bad\\\": \\\"description\\\",\\n \\\"average\\\": \\\"description\\\",\\n \\\"good\\\": \\\"description\\\",\\n \\\"very_good\\\": \\\"description\\\"\\n },\\n \\\"criterion2_name\\\": { ... }\\n}\\n\\nKeep criterion names short (1-2 words, lowercase, use underscores).\\nKeep descriptions brief (5-10 words each).\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating evaluation criteria for rating items on a 1-5 scale.\\nEach criterion must have exactly 5 labels: very_bad (1), bad (2), average (3), good (4), very_good (5).\\nOutput valid JSON only.\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXGFdt3Yy62bRPmc7KxQ0mOg4A2","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"```json\n{\n \"field_completion\": {\n \"very_bad\": \"Most fields are missing or blank.\",\n \"bad\": \"Several key fields are incomplete.\",\n \"average\": \"Some fields are missing or incomplete.\",\n \"good\": \"Most fields are filled out properly.\",\n \"very_good\": \"All fields are fully and properly completed.\"\n },\n \"accuracy\": {\n \"very_bad\": \"Contains numerous errors or inconsistencies.\",\n \"bad\": \"Several errors or mismatched information.\",\n \"average\": \"Some minor mistakes or inconsistencies.\",\n \"good\": \"Few small errors, mostly accurate.\",\n \"very_good\": \"Completely accurate with no errors.\"\n },\n \"clarity\": {\n \"very_bad\": \"Form is unclear and hard to understand.\",\n \"bad\": \"Many unclear responses or confusing answers.\",\n \"average\": \"Some answers are unclear but understandable.\",\n \"good\": \"Mostly clear and easy to understand.\",\n \"very_good\": \"Entirely clear and straightforward.\"\n },\n \"required_fields\": {\n \"very_bad\": \"Required fields are mostly blank or missing.\",\n \"bad\": \"Several required fields left blank.\",\n \"average\": \"Some required fields are incomplete.\",\n \"good\": \"Most required fields are completed.\",\n \"very_good\": \"All required fields are fully completed.\"\n },\n \"document_uploads\": {\n \"very_bad\": \"No required documents uploaded.\",\n \"bad\": \"Few required documents are uploaded.\",\n \"average\": \"Some required documents are missing.\",\n \"good\": \"Most required documents are uploaded.\",\n \"very_good\": \"All required documents are uploaded.\"\n }\n}\n```","index":0,"stopReason":"stop"}],"usage":{"inputTokens":192,"inputCost":0.00048000000000000007,"outputTokens":348,"outputCost":0.0034800000000000005},"botpress":{"cost":0.003960000000000001}},"meta":{"cached":false}}}
770
+ {"test":"zai.rate > string instructions > should provide detailed results with string instructions-2","key":"cda4080a","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■0):\\n■0: {\\\"firstName\\\":\\\"John\\\",\\\"lastName\\\":\\\"Doe\\\",\\\"email\\\":\\\"john.doe@example.com\\\",\\\"phone\\\":\\\"+1-555-0123\\\",\\\"address\\\":\\\"456 Oak Ave\\\",\\\"city\\\":\\\"Springfield\\\",\\\"state\\\":\\\"IL\\\",\\\"zipCode\\\":\\\"62701\\\"}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**field_completion**:\\n - very_bad (1): Most fields are missing or blank.\\n - bad (2): Several key fields are incomplete.\\n - average (3): Some fields are missing or incomplete.\\n - good (4): Most fields are filled out properly.\\n - very_good (5): All fields are fully and properly completed.\\n\\n**accuracy**:\\n - very_bad (1): Contains numerous errors or inconsistencies.\\n - bad (2): Several errors or mismatched information.\\n - average (3): Some minor mistakes or inconsistencies.\\n - good (4): Few small errors, mostly accurate.\\n - very_good (5): Completely accurate with no errors.\\n\\n**clarity**:\\n - very_bad (1): Form is unclear and hard to understand.\\n - bad (2): Many unclear responses or confusing answers.\\n - average (3): Some answers are unclear but understandable.\\n - good (4): Mostly clear and easy to understand.\\n - very_good (5): Entirely clear and straightforward.\\n\\n**required_fields**:\\n - very_bad (1): Required fields are mostly blank or missing.\\n - bad (2): Several required fields left blank.\\n - average (3): Some required fields are incomplete.\\n - good (4): Most required fields are completed.\\n - very_good (5): All required fields are fully completed.\\n\\n**document_uploads**:\\n - very_bad (1): No required documents uploaded.\\n - bad (2): Few required documents are uploaded.\\n - average (3): Some required documents are missing.\\n - good (4): Most required documents are uploaded.\\n - very_good (5): All required documents are uploaded.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■0)\\n- Use exact criterion names: field_completion, accuracy, clarity, required_fields, document_uploads\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXLSLfmiAuoQNajpmLkdtkvGLk8","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:field_completion=very_good;accuracy=very_good;clarity=very_good;required_fields=very_good;document_uploads=very_bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":607,"inputCost":0.0015175000000000002,"outputTokens":37,"outputCost":0.00037000000000000005},"botpress":{"cost":0.0018875000000000003}},"meta":{"cached":false}}}
771
+ {"test":"zai.rate > record instructions > should rate items with fixed criteria-1","key":"f72ae1d8","input":"[{\"input\":{\"messages\":[{\"content\":\"For these evaluation criteria, provide 5 labels (very_bad, bad, average, good, very_good) with brief descriptions for each:\\n\\n- length: password length (12+ chars = very_good, 8-11 = good, 6-7 = average, 4-5 = bad, <4 = very_bad)\\n- complexity: character variety (uppercase+lowercase+numbers+symbols = very_good, 3 types = good, 2 types = average, 1 type = bad)\\n- strength: overall password strength against brute force attacks\\n\\nOutput format (JSON):\\n{\\n \\\"length\\\": {\\n \\\"very_bad\\\": \\\"description\\\",\\n \\\"bad\\\": \\\"description\\\",\\n \\\"average\\\": \\\"description\\\",\\n \\\"good\\\": \\\"description\\\",\\n \\\"very_good\\\": \\\"description\\\"\\n }\\n ...\\n}\\n\\nKeep descriptions brief (5-10 words each).\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating evaluation criteria for rating items on a 1-5 scale.\\nEach criterion must have exactly 5 labels: very_bad (1), bad (2), average (3), good (4), very_good (5).\\nOutput valid JSON only.\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXMtjpK0sWPZinvNRRlqNbrb9o6","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"```json\n{\n \"length\": {\n \"very_bad\": \"Less than 4 characters long.\",\n \"bad\": \"Between 4 and 5 characters long.\",\n \"average\": \"Between 6 and 7 characters long.\",\n \"good\": \"Between 8 and 11 characters long.\",\n \"very_good\": \"12 or more characters long.\"\n },\n \"complexity\": {\n \"very_bad\": \"Contains only one character type.\",\n \"bad\": \"Contains two character types.\",\n \"average\": \"Contains three character types.\",\n \"good\": \"Uppercase, lowercase, numbers, or symbols.\",\n \"very_good\": \"All: uppercase, lowercase, numbers, symbols.\"\n },\n \"strength\": {\n \"very_bad\": \"Extremely easy to brute-force.\",\n \"bad\": \"Weak and easily cracked.\",\n \"average\": \"Moderately strong, but not secure.\",\n \"good\": \"Strong, secure against most attacks.\",\n \"very_good\": \"Extremely strong, highly resistant to attacks.\"\n }\n}\n```","index":0,"stopReason":"stop"}],"usage":{"inputTokens":241,"inputCost":0.0006025000000000001,"outputTokens":221,"outputCost":0.00221},"botpress":{"cost":0.0028125000000000003}},"meta":{"cached":false}}}
772
+ {"test":"zai.rate > record instructions > should rate items with fixed criteria-2","key":"f071e452","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■1):\\n■0: {\\\"password\\\":\\\"Tr0ng!P@ssw0rd#2024\\\",\\\"length\\\":20,\\\"hasUppercase\\\":true,\\\"hasLowercase\\\":true,\\\"hasNumbers\\\":true,\\\"hasSymbols\\\":true}■\\n■1: {\\\"password\\\":\\\"weak\\\",\\\"length\\\":4,\\\"hasUppercase\\\":false,\\\"hasLowercase\\\":true,\\\"hasNumbers\\\":false,\\\"hasSymbols\\\":false}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**length**:\\n - very_bad (1): Less than 4 characters long.\\n - bad (2): Between 4 and 5 characters long.\\n - average (3): Between 6 and 7 characters long.\\n - good (4): Between 8 and 11 characters long.\\n - very_good (5): 12 or more characters long.\\n\\n**complexity**:\\n - very_bad (1): Contains only one character type.\\n - bad (2): Contains two character types.\\n - average (3): Contains three character types.\\n - good (4): Uppercase, lowercase, numbers, or symbols.\\n - very_good (5): All: uppercase, lowercase, numbers, symbols.\\n\\n**strength**:\\n - very_bad (1): Extremely easy to brute-force.\\n - bad (2): Weak and easily cracked.\\n - average (3): Moderately strong, but not secure.\\n - good (4): Strong, secure against most attacks.\\n - very_good (5): Extremely strong, highly resistant to attacks.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■1)\\n- Use exact criterion names: length, complexity, strength\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXPxH4bWHJODYLdc0MUchmC1Def","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:length=very_good;complexity=very_good;strength=very_good■ \n■1:length=bad;complexity=very_bad;strength=very_bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":498,"inputCost":0.001245,"outputTokens":38,"outputCost":0.00038},"botpress":{"cost":0.0016250000000000001}},"meta":{"cached":false}}}
773
+ {"test":"zai.rate > record instructions > should rate with single criterion-1","key":"f2a3edc6","input":"[{\"input\":{\"messages\":[{\"content\":\"For these evaluation criteria, provide 5 labels (very_bad, bad, average, good, very_good) with brief descriptions for each:\\n\\n- validity: email format validity (has @ symbol, domain, and TLD)\\n\\nOutput format (JSON):\\n{\\n \\\"validity\\\": {\\n \\\"very_bad\\\": \\\"description\\\",\\n \\\"bad\\\": \\\"description\\\",\\n \\\"average\\\": \\\"description\\\",\\n \\\"good\\\": \\\"description\\\",\\n \\\"very_good\\\": \\\"description\\\"\\n }\\n \\n}\\n\\nKeep descriptions brief (5-10 words each).\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating evaluation criteria for rating items on a 1-5 scale.\\nEach criterion must have exactly 5 labels: very_bad (1), bad (2), average (3), good (4), very_good (5).\\nOutput valid JSON only.\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXQyOnbsTdkTuzCfGckrup99g59","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"```json\n{\n \"validity\": {\n \"very_bad\": \"Completely invalid; missing @ or domain.\",\n \"bad\": \"Incorrect format; partial domain or TLD.\",\n \"average\": \"Mostly valid but minor format issues.\",\n \"good\": \"Valid format; minor improvements possible.\",\n \"very_good\": \"Perfectly valid email format.\"\n }\n}\n```","index":0,"stopReason":"stop"}],"usage":{"inputTokens":171,"inputCost":0.00042750000000000004,"outputTokens":79,"outputCost":0.00079},"botpress":{"cost":0.0012175}},"meta":{"cached":false}}}
774
+ {"test":"zai.rate > record instructions > should rate with single criterion-2","key":"d70e8028","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■1):\\n■0: {\\\"email\\\":\\\"valid.email@example.com\\\",\\\"hasAt\\\":true,\\\"hasDomain\\\":true,\\\"hasTLD\\\":true}■\\n■1: {\\\"email\\\":\\\"invalid-email\\\",\\\"hasAt\\\":false,\\\"hasDomain\\\":false,\\\"hasTLD\\\":false}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**validity**:\\n - very_bad (1): Completely invalid; missing @ or domain.\\n - bad (2): Incorrect format; partial domain or TLD.\\n - average (3): Mostly valid but minor format issues.\\n - good (4): Valid format; minor improvements possible.\\n - very_good (5): Perfectly valid email format.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■1)\\n- Use exact criterion names: validity\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXRVaOKSznFYAspSm3j3x9FaQbt","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:validity=very_good■ \n■1:validity=very_bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":320,"inputCost":0.0008,"outputTokens":23,"outputCost":0.00023},"botpress":{"cost":0.00103}},"meta":{"cached":false}}}
775
+ {"test":"zai.rate > record instructions > should provide detailed results with record instructions-1","key":"66aabae3","input":"[{\"input\":{\"messages\":[{\"content\":\"For these evaluation criteria, provide 5 labels (very_bad, bad, average, good, very_good) with brief descriptions for each:\\n\\n- naming: filename follows conventions (has extension, no special chars)\\n- size: file size is reasonable (not too large or too small)\\n\\nOutput format (JSON):\\n{\\n \\\"naming\\\": {\\n \\\"very_bad\\\": \\\"description\\\",\\n \\\"bad\\\": \\\"description\\\",\\n \\\"average\\\": \\\"description\\\",\\n \\\"good\\\": \\\"description\\\",\\n \\\"very_good\\\": \\\"description\\\"\\n }\\n ...\\n}\\n\\nKeep descriptions brief (5-10 words each).\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating evaluation criteria for rating items on a 1-5 scale.\\nEach criterion must have exactly 5 labels: very_bad (1), bad (2), average (3), good (4), very_good (5).\\nOutput valid JSON only.\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXSNU8j4lMIhWwb9XOOxzCg82vx","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"```json\n{\n \"naming\": {\n \"very_bad\": \"No extension, special chars, unclear name.\",\n \"bad\": \"Has issues: missing extension or special chars.\",\n \"average\": \"Mostly follows rules but minor inconsistencies.\",\n \"good\": \"Follows conventions with clear, valid naming.\",\n \"very_good\": \"Perfectly adheres to naming conventions.\"\n },\n \"size\": {\n \"very_bad\": \"File size is unusably large or tiny.\",\n \"bad\": \"Size is problematic but still accessible.\",\n \"average\": \"Reasonable size but not ideal.\",\n \"good\": \"Appropriate file size for typical use.\",\n \"very_good\": \"Optimized size: efficient and ideal.\"\n }\n}\n```","index":0,"stopReason":"stop"}],"usage":{"inputTokens":184,"inputCost":0.00046,"outputTokens":157,"outputCost":0.0015700000000000002},"botpress":{"cost":0.00203}},"meta":{"cached":false}}}
776
+ {"test":"zai.rate > record instructions > should provide detailed results with record instructions-2","key":"ef4ed0a6","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■0):\\n■0: {\\\"filename\\\":\\\"document.pdf\\\",\\\"sizeBytes\\\":1024000,\\\"hasExtension\\\":true,\\\"validName\\\":true,\\\"reasonableSize\\\":true}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**naming**:\\n - very_bad (1): No extension, special chars, unclear name.\\n - bad (2): Has issues: missing extension or special chars.\\n - average (3): Mostly follows rules but minor inconsistencies.\\n - good (4): Follows conventions with clear, valid naming.\\n - very_good (5): Perfectly adheres to naming conventions.\\n\\n**size**:\\n - very_bad (1): File size is unusably large or tiny.\\n - bad (2): Size is problematic but still accessible.\\n - average (3): Reasonable size but not ideal.\\n - good (4): Appropriate file size for typical use.\\n - very_good (5): Optimized size: efficient and ideal.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■0)\\n- Use exact criterion names: naming, size\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXVmkHr1AOEA1PMdimW5BixpoxS","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:naming=very_good;size=good■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":379,"inputCost":0.0009475000000000001,"outputTokens":16,"outputCost":0.00016},"botpress":{"cost":0.0011075000000000002}},"meta":{"cached":false}}}
777
+ {"test":"zai.rate > large arrays > should rate many items efficiently with parallelization-1","key":"285b57bb","input":"[{\"input\":{\"messages\":[{\"content\":\"Generate 3-5 evaluation criteria for: \\\"how complete are the required fields?\\\"\\n\\nFor each criterion, provide 5 labels (very_bad, bad, average, good, very_good) with brief descriptions.\\n\\nOutput format (JSON):\\n{\\n \\\"criterion1_name\\\": {\\n \\\"very_bad\\\": \\\"description\\\",\\n \\\"bad\\\": \\\"description\\\",\\n \\\"average\\\": \\\"description\\\",\\n \\\"good\\\": \\\"description\\\",\\n \\\"very_good\\\": \\\"description\\\"\\n },\\n \\\"criterion2_name\\\": { ... }\\n}\\n\\nKeep criterion names short (1-2 words, lowercase, use underscores).\\nKeep descriptions brief (5-10 words each).\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating evaluation criteria for rating items on a 1-5 scale.\\nEach criterion must have exactly 5 labels: very_bad (1), bad (2), average (3), good (4), very_good (5).\\nOutput valid JSON only.\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXVa3K3wTvXasfysWLy83SuaKGo","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"```json\n{\n \"field_presence\": {\n \"very_bad\": \"Most fields are completely missing.\",\n \"bad\": \"Several required fields are incomplete.\",\n \"average\": \"Some fields are complete, others missing.\",\n \"good\": \"Most required fields are complete.\",\n \"very_good\": \"All fields are complete with no omissions.\"\n },\n \"data_accuracy\": {\n \"very_bad\": \"Field data is inaccurate or invalid.\",\n \"bad\": \"Many fields contain incorrect data.\",\n \"average\": \"Some fields have minor inaccuracies.\",\n \"good\": \"Most field data is accurate.\",\n \"very_good\": \"All field data is fully accurate.\"\n },\n \"format_compliance\": {\n \"very_bad\": \"Field formatting ignores requirements.\",\n \"bad\": \"Frequent formatting errors in fields.\",\n \"average\": \"Occasional non-compliance with formats.\",\n \"good\": \"Most fields follow correct formatting.\",\n \"very_good\": \"All fields meet specified formats.\"\n },\n \"field_relevance\": {\n \"very_bad\": \"Fields provided are irrelevant to requirements.\",\n \"bad\": \"Many fields are irrelevant or redundant.\",\n \"average\": \"Some fields are contextually relevant.\",\n \"good\": \"Most fields align with the context.\",\n \"very_good\": \"All fields are fully relevant and appropriate.\"\n }\n}\n```","index":0,"stopReason":"stop"}],"usage":{"inputTokens":192,"inputCost":0.00048000000000000007,"outputTokens":282,"outputCost":0.00282},"botpress":{"cost":0.0033}},"meta":{"cached":false}}}
778
+ {"test":"zai.rate > large arrays > should rate many items efficiently with parallelization-5","key":"341659dc","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■49):\\n■0: {\\\"id\\\":150,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■1: {\\\"id\\\":151,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■2: {\\\"id\\\":152,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■3: {\\\"id\\\":153,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■4: {\\\"id\\\":154,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■5: {\\\"id\\\":155,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■6: {\\\"id\\\":156,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■7: {\\\"id\\\":157,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■8: {\\\"id\\\":158,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■9: {\\\"id\\\":159,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■10: {\\\"id\\\":160,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■11: {\\\"id\\\":161,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■12: {\\\"id\\\":162,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■13: {\\\"id\\\":163,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■14: {\\\"id\\\":164,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■15: {\\\"id\\\":165,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■16: {\\\"id\\\":166,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■17: {\\\"id\\\":167,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■18: {\\\"id\\\":168,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■19: {\\\"id\\\":169,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■20: {\\\"id\\\":170,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■21: {\\\"id\\\":171,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■22: {\\\"id\\\":172,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■23: {\\\"id\\\":173,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■24: {\\\"id\\\":174,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■25: {\\\"id\\\":175,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■26: {\\\"id\\\":176,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■27: {\\\"id\\\":177,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■28: {\\\"id\\\":178,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■29: {\\\"id\\\":179,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■30: {\\\"id\\\":180,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■31: {\\\"id\\\":181,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■32: {\\\"id\\\":182,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■33: {\\\"id\\\":183,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■34: {\\\"id\\\":184,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■35: {\\\"id\\\":185,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■36: {\\\"id\\\":186,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■37: {\\\"id\\\":187,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■38: {\\\"id\\\":188,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■39: {\\\"id\\\":189,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■40: {\\\"id\\\":190,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■41: {\\\"id\\\":191,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■42: {\\\"id\\\":192,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■43: {\\\"id\\\":193,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■44: {\\\"id\\\":194,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■45: {\\\"id\\\":195,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■46: {\\\"id\\\":196,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■47: {\\\"id\\\":197,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■48: {\\\"id\\\":198,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■49: {\\\"id\\\":199,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**field_presence**:\\n - very_bad (1): Most fields are completely missing.\\n - bad (2): Several required fields are incomplete.\\n - average (3): Some fields are complete, others missing.\\n - good (4): Most required fields are complete.\\n - very_good (5): All fields are complete with no omissions.\\n\\n**data_accuracy**:\\n - very_bad (1): Field data is inaccurate or invalid.\\n - bad (2): Many fields contain incorrect data.\\n - average (3): Some fields have minor inaccuracies.\\n - good (4): Most field data is accurate.\\n - very_good (5): All field data is fully accurate.\\n\\n**format_compliance**:\\n - very_bad (1): Field formatting ignores requirements.\\n - bad (2): Frequent formatting errors in fields.\\n - average (3): Occasional non-compliance with formats.\\n - good (4): Most fields follow correct formatting.\\n - very_good (5): All fields meet specified formats.\\n\\n**field_relevance**:\\n - very_bad (1): Fields provided are irrelevant to requirements.\\n - bad (2): Many fields are irrelevant or redundant.\\n - average (3): Some fields are contextually relevant.\\n - good (4): Most fields align with the context.\\n - very_good (5): All fields are fully relevant and appropriate.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■49)\\n- Use exact criterion names: field_presence, data_accuracy, format_compliance, field_relevance\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXZPfA5SWJCdGYjPOXRKAdYSB0v","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■1:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■2:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■3:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■4:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■5:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■6:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■7:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■8:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■9:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■10:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■11:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■12:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■13:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■14:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■15:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■16:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■17:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■18:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■19:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■20:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■21:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■22:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■23:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■24:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■25:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■26:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■27:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■28:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■29:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■30:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■31:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■32:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■33:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■34:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■35:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■36:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■37:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■38:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■39:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■40:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■41:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■42:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■43:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■44:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■45:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■46:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■47:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■48:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■49:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":2135,"inputCost":0.0053375,"outputTokens":1503,"outputCost":0.015030000000000002},"botpress":{"cost":0.020367500000000004}},"meta":{"cached":false}}}
779
+ {"test":"zai.rate > large arrays > should rate many items efficiently with parallelization-8","key":"b4ffcc22","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■49):\\n■0: {\\\"id\\\":300,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■1: {\\\"id\\\":301,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■2: {\\\"id\\\":302,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■3: {\\\"id\\\":303,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■4: {\\\"id\\\":304,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■5: {\\\"id\\\":305,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■6: {\\\"id\\\":306,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■7: {\\\"id\\\":307,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■8: {\\\"id\\\":308,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■9: {\\\"id\\\":309,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■10: {\\\"id\\\":310,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■11: {\\\"id\\\":311,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■12: {\\\"id\\\":312,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■13: {\\\"id\\\":313,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■14: {\\\"id\\\":314,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■15: {\\\"id\\\":315,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■16: {\\\"id\\\":316,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■17: {\\\"id\\\":317,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■18: {\\\"id\\\":318,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■19: {\\\"id\\\":319,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■20: {\\\"id\\\":320,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■21: {\\\"id\\\":321,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■22: {\\\"id\\\":322,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■23: {\\\"id\\\":323,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■24: {\\\"id\\\":324,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■25: {\\\"id\\\":325,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■26: {\\\"id\\\":326,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■27: {\\\"id\\\":327,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■28: {\\\"id\\\":328,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■29: {\\\"id\\\":329,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■30: {\\\"id\\\":330,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■31: {\\\"id\\\":331,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■32: {\\\"id\\\":332,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■33: {\\\"id\\\":333,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■34: {\\\"id\\\":334,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■35: {\\\"id\\\":335,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■36: {\\\"id\\\":336,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■37: {\\\"id\\\":337,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■38: {\\\"id\\\":338,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■39: {\\\"id\\\":339,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■40: {\\\"id\\\":340,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■41: {\\\"id\\\":341,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■42: {\\\"id\\\":342,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■43: {\\\"id\\\":343,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■44: {\\\"id\\\":344,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■45: {\\\"id\\\":345,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■46: {\\\"id\\\":346,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■47: {\\\"id\\\":347,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■48: {\\\"id\\\":348,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■49: {\\\"id\\\":349,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**field_presence**:\\n - very_bad (1): Most fields are completely missing.\\n - bad (2): Several required fields are incomplete.\\n - average (3): Some fields are complete, others missing.\\n - good (4): Most required fields are complete.\\n - very_good (5): All fields are complete with no omissions.\\n\\n**data_accuracy**:\\n - very_bad (1): Field data is inaccurate or invalid.\\n - bad (2): Many fields contain incorrect data.\\n - average (3): Some fields have minor inaccuracies.\\n - good (4): Most field data is accurate.\\n - very_good (5): All field data is fully accurate.\\n\\n**format_compliance**:\\n - very_bad (1): Field formatting ignores requirements.\\n - bad (2): Frequent formatting errors in fields.\\n - average (3): Occasional non-compliance with formats.\\n - good (4): Most fields follow correct formatting.\\n - very_good (5): All fields meet specified formats.\\n\\n**field_relevance**:\\n - very_bad (1): Fields provided are irrelevant to requirements.\\n - bad (2): Many fields are irrelevant or redundant.\\n - average (3): Some fields are contextually relevant.\\n - good (4): Most fields align with the context.\\n - very_good (5): All fields are fully relevant and appropriate.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■49)\\n- Use exact criterion names: field_presence, data_accuracy, format_compliance, field_relevance\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXZBBrPzEXPvqHDphF3sT4vmLJJ","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■1:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■2:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■3:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■4:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■5:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■6:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■7:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■8:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■9:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■10:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■11:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■12:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■13:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■14:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■15:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■16:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■17:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■18:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■19:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■20:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■21:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■22:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■23:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■24:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■25:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■26:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■27:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■28:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■29:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■30:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■31:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■32:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■33:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■34:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■35:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■36:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■37:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■38:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■39:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■40:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■41:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■42:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■43:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■44:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■45:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■46:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■47:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■48:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■49:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":2135,"inputCost":0.0053375,"outputTokens":1503,"outputCost":0.015030000000000002},"botpress":{"cost":0.020367500000000004}},"meta":{"cached":false}}}
780
+ {"test":"zai.rate > large arrays > should rate many items efficiently with parallelization-3","key":"a16683da","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■49):\\n■0: {\\\"id\\\":50,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■1: {\\\"id\\\":51,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■2: {\\\"id\\\":52,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■3: {\\\"id\\\":53,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■4: {\\\"id\\\":54,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■5: {\\\"id\\\":55,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■6: {\\\"id\\\":56,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■7: {\\\"id\\\":57,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■8: {\\\"id\\\":58,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■9: {\\\"id\\\":59,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■10: {\\\"id\\\":60,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■11: {\\\"id\\\":61,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■12: {\\\"id\\\":62,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■13: {\\\"id\\\":63,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■14: {\\\"id\\\":64,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■15: {\\\"id\\\":65,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■16: {\\\"id\\\":66,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■17: {\\\"id\\\":67,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■18: {\\\"id\\\":68,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■19: {\\\"id\\\":69,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■20: {\\\"id\\\":70,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■21: {\\\"id\\\":71,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■22: {\\\"id\\\":72,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■23: {\\\"id\\\":73,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■24: {\\\"id\\\":74,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■25: {\\\"id\\\":75,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■26: {\\\"id\\\":76,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■27: {\\\"id\\\":77,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■28: {\\\"id\\\":78,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■29: {\\\"id\\\":79,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■30: {\\\"id\\\":80,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■31: {\\\"id\\\":81,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■32: {\\\"id\\\":82,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■33: {\\\"id\\\":83,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■34: {\\\"id\\\":84,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■35: {\\\"id\\\":85,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■36: {\\\"id\\\":86,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■37: {\\\"id\\\":87,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■38: {\\\"id\\\":88,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■39: {\\\"id\\\":89,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■40: {\\\"id\\\":90,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■41: {\\\"id\\\":91,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■42: {\\\"id\\\":92,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■43: {\\\"id\\\":93,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■44: {\\\"id\\\":94,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■45: {\\\"id\\\":95,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■46: {\\\"id\\\":96,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■47: {\\\"id\\\":97,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■48: {\\\"id\\\":98,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■49: {\\\"id\\\":99,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**field_presence**:\\n - very_bad (1): Most fields are completely missing.\\n - bad (2): Several required fields are incomplete.\\n - average (3): Some fields are complete, others missing.\\n - good (4): Most required fields are complete.\\n - very_good (5): All fields are complete with no omissions.\\n\\n**data_accuracy**:\\n - very_bad (1): Field data is inaccurate or invalid.\\n - bad (2): Many fields contain incorrect data.\\n - average (3): Some fields have minor inaccuracies.\\n - good (4): Most field data is accurate.\\n - very_good (5): All field data is fully accurate.\\n\\n**format_compliance**:\\n - very_bad (1): Field formatting ignores requirements.\\n - bad (2): Frequent formatting errors in fields.\\n - average (3): Occasional non-compliance with formats.\\n - good (4): Most fields follow correct formatting.\\n - very_good (5): All fields meet specified formats.\\n\\n**field_relevance**:\\n - very_bad (1): Fields provided are irrelevant to requirements.\\n - bad (2): Many fields are irrelevant or redundant.\\n - average (3): Some fields are contextually relevant.\\n - good (4): Most fields align with the context.\\n - very_good (5): All fields are fully relevant and appropriate.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■49)\\n- Use exact criterion names: field_presence, data_accuracy, format_compliance, field_relevance\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXaR62wgJ6IhJQMapKSnI4mep8i","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■1:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■2:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■3:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■4:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■5:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■6:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■7:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■8:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■9:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■10:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■11:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■12:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■13:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■14:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■15:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■16:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■17:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■18:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■19:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■20:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■21:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■22:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■23:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■24:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■25:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■26:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■27:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■28:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■29:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■30:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■31:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■32:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■33:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■34:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■35:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■36:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■37:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■38:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■39:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■40:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■41:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■42:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■43:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■44:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■45:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■46:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■47:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■48:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■49:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":2135,"inputCost":0.0053375,"outputTokens":1503,"outputCost":0.015030000000000002},"botpress":{"cost":0.020367500000000004}},"meta":{"cached":false}}}
781
+ {"test":"zai.rate > large arrays > should rate many items efficiently with parallelization-10","key":"357ec26c","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■49):\\n■0: {\\\"id\\\":400,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■1: {\\\"id\\\":401,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■2: {\\\"id\\\":402,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■3: {\\\"id\\\":403,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■4: {\\\"id\\\":404,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■5: {\\\"id\\\":405,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■6: {\\\"id\\\":406,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■7: {\\\"id\\\":407,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■8: {\\\"id\\\":408,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■9: {\\\"id\\\":409,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■10: {\\\"id\\\":410,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■11: {\\\"id\\\":411,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■12: {\\\"id\\\":412,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■13: {\\\"id\\\":413,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■14: {\\\"id\\\":414,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■15: {\\\"id\\\":415,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■16: {\\\"id\\\":416,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■17: {\\\"id\\\":417,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■18: {\\\"id\\\":418,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■19: {\\\"id\\\":419,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■20: {\\\"id\\\":420,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■21: {\\\"id\\\":421,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■22: {\\\"id\\\":422,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■23: {\\\"id\\\":423,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■24: {\\\"id\\\":424,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■25: {\\\"id\\\":425,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■26: {\\\"id\\\":426,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■27: {\\\"id\\\":427,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■28: {\\\"id\\\":428,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■29: {\\\"id\\\":429,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■30: {\\\"id\\\":430,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■31: {\\\"id\\\":431,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■32: {\\\"id\\\":432,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■33: {\\\"id\\\":433,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■34: {\\\"id\\\":434,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■35: {\\\"id\\\":435,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■36: {\\\"id\\\":436,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■37: {\\\"id\\\":437,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■38: {\\\"id\\\":438,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■39: {\\\"id\\\":439,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■40: {\\\"id\\\":440,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■41: {\\\"id\\\":441,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■42: {\\\"id\\\":442,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■43: {\\\"id\\\":443,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■44: {\\\"id\\\":444,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■45: {\\\"id\\\":445,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■46: {\\\"id\\\":446,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■47: {\\\"id\\\":447,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■48: {\\\"id\\\":448,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■49: {\\\"id\\\":449,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**field_presence**:\\n - very_bad (1): Most fields are completely missing.\\n - bad (2): Several required fields are incomplete.\\n - average (3): Some fields are complete, others missing.\\n - good (4): Most required fields are complete.\\n - very_good (5): All fields are complete with no omissions.\\n\\n**data_accuracy**:\\n - very_bad (1): Field data is inaccurate or invalid.\\n - bad (2): Many fields contain incorrect data.\\n - average (3): Some fields have minor inaccuracies.\\n - good (4): Most field data is accurate.\\n - very_good (5): All field data is fully accurate.\\n\\n**format_compliance**:\\n - very_bad (1): Field formatting ignores requirements.\\n - bad (2): Frequent formatting errors in fields.\\n - average (3): Occasional non-compliance with formats.\\n - good (4): Most fields follow correct formatting.\\n - very_good (5): All fields meet specified formats.\\n\\n**field_relevance**:\\n - very_bad (1): Fields provided are irrelevant to requirements.\\n - bad (2): Many fields are irrelevant or redundant.\\n - average (3): Some fields are contextually relevant.\\n - good (4): Most fields align with the context.\\n - very_good (5): All fields are fully relevant and appropriate.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■49)\\n- Use exact criterion names: field_presence, data_accuracy, format_compliance, field_relevance\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXZ1SMjE8ZlDU5XGxGi1Cc9EsbB","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■1:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■2:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■3:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■4:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■5:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■6:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■7:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■8:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■9:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■10:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■11:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■12:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■13:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■14:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■15:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■16:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■17:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■18:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■19:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■20:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■21:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■22:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■23:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■24:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■25:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■26:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■27:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■28:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■29:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■30:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■31:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■32:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■33:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■34:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■35:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■36:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■37:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■38:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■39:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■40:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■41:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■42:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■43:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■44:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■45:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■46:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■47:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■48:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■49:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":2135,"inputCost":0.0053375,"outputTokens":1503,"outputCost":0.015030000000000002},"botpress":{"cost":0.020367500000000004}},"meta":{"cached":false}}}
782
+ {"test":"zai.rate > large arrays > should rate many items efficiently with parallelization-6","key":"3480d5d8","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■49):\\n■0: {\\\"id\\\":200,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■1: {\\\"id\\\":201,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■2: {\\\"id\\\":202,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■3: {\\\"id\\\":203,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■4: {\\\"id\\\":204,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■5: {\\\"id\\\":205,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■6: {\\\"id\\\":206,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■7: {\\\"id\\\":207,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■8: {\\\"id\\\":208,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■9: {\\\"id\\\":209,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■10: {\\\"id\\\":210,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■11: {\\\"id\\\":211,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■12: {\\\"id\\\":212,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■13: {\\\"id\\\":213,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■14: {\\\"id\\\":214,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■15: {\\\"id\\\":215,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■16: {\\\"id\\\":216,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■17: {\\\"id\\\":217,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■18: {\\\"id\\\":218,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■19: {\\\"id\\\":219,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■20: {\\\"id\\\":220,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■21: {\\\"id\\\":221,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■22: {\\\"id\\\":222,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■23: {\\\"id\\\":223,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■24: {\\\"id\\\":224,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■25: {\\\"id\\\":225,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■26: {\\\"id\\\":226,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■27: {\\\"id\\\":227,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■28: {\\\"id\\\":228,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■29: {\\\"id\\\":229,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■30: {\\\"id\\\":230,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■31: {\\\"id\\\":231,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■32: {\\\"id\\\":232,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■33: {\\\"id\\\":233,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■34: {\\\"id\\\":234,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■35: {\\\"id\\\":235,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■36: {\\\"id\\\":236,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■37: {\\\"id\\\":237,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■38: {\\\"id\\\":238,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■39: {\\\"id\\\":239,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■40: {\\\"id\\\":240,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■41: {\\\"id\\\":241,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■42: {\\\"id\\\":242,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■43: {\\\"id\\\":243,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■44: {\\\"id\\\":244,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■45: {\\\"id\\\":245,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■46: {\\\"id\\\":246,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■47: {\\\"id\\\":247,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■48: {\\\"id\\\":248,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■49: {\\\"id\\\":249,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**field_presence**:\\n - very_bad (1): Most fields are completely missing.\\n - bad (2): Several required fields are incomplete.\\n - average (3): Some fields are complete, others missing.\\n - good (4): Most required fields are complete.\\n - very_good (5): All fields are complete with no omissions.\\n\\n**data_accuracy**:\\n - very_bad (1): Field data is inaccurate or invalid.\\n - bad (2): Many fields contain incorrect data.\\n - average (3): Some fields have minor inaccuracies.\\n - good (4): Most field data is accurate.\\n - very_good (5): All field data is fully accurate.\\n\\n**format_compliance**:\\n - very_bad (1): Field formatting ignores requirements.\\n - bad (2): Frequent formatting errors in fields.\\n - average (3): Occasional non-compliance with formats.\\n - good (4): Most fields follow correct formatting.\\n - very_good (5): All fields meet specified formats.\\n\\n**field_relevance**:\\n - very_bad (1): Fields provided are irrelevant to requirements.\\n - bad (2): Many fields are irrelevant or redundant.\\n - average (3): Some fields are contextually relevant.\\n - good (4): Most fields align with the context.\\n - very_good (5): All fields are fully relevant and appropriate.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■49)\\n- Use exact criterion names: field_presence, data_accuracy, format_compliance, field_relevance\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXabjDSzlGdnmiihxHW7DCIaY9V","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■1:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■2:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■3:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■4:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■5:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■6:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■7:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■8:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■9:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■10:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■11:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■12:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■13:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■14:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■15:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■16:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■17:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■18:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■19:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■20:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■21:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■22:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■23:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■24:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■25:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■26:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■27:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■28:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■29:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■30:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■31:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■32:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■33:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■34:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■35:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■36:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■37:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■38:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■39:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■40:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■41:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■42:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■43:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■44:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■45:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■46:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■47:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■48:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■49:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":2135,"inputCost":0.0053375,"outputTokens":1503,"outputCost":0.015030000000000002},"botpress":{"cost":0.020367500000000004}},"meta":{"cached":false}}}
783
+ {"test":"zai.rate > large arrays > should rate many items efficiently with parallelization-11","key":"b5933cba","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■49):\\n■0: {\\\"id\\\":450,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■1: {\\\"id\\\":451,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■2: {\\\"id\\\":452,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■3: {\\\"id\\\":453,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■4: {\\\"id\\\":454,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■5: {\\\"id\\\":455,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■6: {\\\"id\\\":456,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■7: {\\\"id\\\":457,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■8: {\\\"id\\\":458,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■9: {\\\"id\\\":459,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■10: {\\\"id\\\":460,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■11: {\\\"id\\\":461,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■12: {\\\"id\\\":462,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■13: {\\\"id\\\":463,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■14: {\\\"id\\\":464,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■15: {\\\"id\\\":465,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■16: {\\\"id\\\":466,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■17: {\\\"id\\\":467,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■18: {\\\"id\\\":468,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■19: {\\\"id\\\":469,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■20: {\\\"id\\\":470,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■21: {\\\"id\\\":471,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■22: {\\\"id\\\":472,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■23: {\\\"id\\\":473,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■24: {\\\"id\\\":474,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■25: {\\\"id\\\":475,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■26: {\\\"id\\\":476,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■27: {\\\"id\\\":477,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■28: {\\\"id\\\":478,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■29: {\\\"id\\\":479,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■30: {\\\"id\\\":480,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■31: {\\\"id\\\":481,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■32: {\\\"id\\\":482,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■33: {\\\"id\\\":483,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■34: {\\\"id\\\":484,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■35: {\\\"id\\\":485,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■36: {\\\"id\\\":486,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■37: {\\\"id\\\":487,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■38: {\\\"id\\\":488,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■39: {\\\"id\\\":489,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■40: {\\\"id\\\":490,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■41: {\\\"id\\\":491,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■42: {\\\"id\\\":492,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■43: {\\\"id\\\":493,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■44: {\\\"id\\\":494,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■45: {\\\"id\\\":495,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■46: {\\\"id\\\":496,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■47: {\\\"id\\\":497,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■48: {\\\"id\\\":498,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■49: {\\\"id\\\":499,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**field_presence**:\\n - very_bad (1): Most fields are completely missing.\\n - bad (2): Several required fields are incomplete.\\n - average (3): Some fields are complete, others missing.\\n - good (4): Most required fields are complete.\\n - very_good (5): All fields are complete with no omissions.\\n\\n**data_accuracy**:\\n - very_bad (1): Field data is inaccurate or invalid.\\n - bad (2): Many fields contain incorrect data.\\n - average (3): Some fields have minor inaccuracies.\\n - good (4): Most field data is accurate.\\n - very_good (5): All field data is fully accurate.\\n\\n**format_compliance**:\\n - very_bad (1): Field formatting ignores requirements.\\n - bad (2): Frequent formatting errors in fields.\\n - average (3): Occasional non-compliance with formats.\\n - good (4): Most fields follow correct formatting.\\n - very_good (5): All fields meet specified formats.\\n\\n**field_relevance**:\\n - very_bad (1): Fields provided are irrelevant to requirements.\\n - bad (2): Many fields are irrelevant or redundant.\\n - average (3): Some fields are contextually relevant.\\n - good (4): Most fields align with the context.\\n - very_good (5): All fields are fully relevant and appropriate.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■49)\\n- Use exact criterion names: field_presence, data_accuracy, format_compliance, field_relevance\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXaWWkAexMYuIyzzll2J1WUIiV8","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■1:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■2:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■3:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■4:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■5:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■6:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■7:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■8:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■9:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■10:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■11:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■12:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■13:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■14:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■15:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■16:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■17:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■18:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■19:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■20:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■21:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■22:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■23:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■24:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■25:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■26:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■27:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■28:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■29:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■30:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■31:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■32:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■33:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■34:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■35:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■36:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■37:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■38:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■39:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■40:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■41:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■42:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■43:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■44:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■45:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■46:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■47:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■48:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■49:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":2135,"inputCost":0.0053375,"outputTokens":1503,"outputCost":0.015030000000000002},"botpress":{"cost":0.020367500000000004}},"meta":{"cached":false}}}
784
+ {"test":"zai.rate > large arrays > should rate many items efficiently with parallelization-7","key":"b4955026","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■49):\\n■0: {\\\"id\\\":250,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■1: {\\\"id\\\":251,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■2: {\\\"id\\\":252,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■3: {\\\"id\\\":253,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■4: {\\\"id\\\":254,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■5: {\\\"id\\\":255,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■6: {\\\"id\\\":256,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■7: {\\\"id\\\":257,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■8: {\\\"id\\\":258,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■9: {\\\"id\\\":259,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■10: {\\\"id\\\":260,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■11: {\\\"id\\\":261,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■12: {\\\"id\\\":262,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■13: {\\\"id\\\":263,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■14: {\\\"id\\\":264,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■15: {\\\"id\\\":265,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■16: {\\\"id\\\":266,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■17: {\\\"id\\\":267,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■18: {\\\"id\\\":268,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■19: {\\\"id\\\":269,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■20: {\\\"id\\\":270,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■21: {\\\"id\\\":271,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■22: {\\\"id\\\":272,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■23: {\\\"id\\\":273,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■24: {\\\"id\\\":274,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■25: {\\\"id\\\":275,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■26: {\\\"id\\\":276,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■27: {\\\"id\\\":277,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■28: {\\\"id\\\":278,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■29: {\\\"id\\\":279,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■30: {\\\"id\\\":280,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■31: {\\\"id\\\":281,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■32: {\\\"id\\\":282,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■33: {\\\"id\\\":283,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■34: {\\\"id\\\":284,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■35: {\\\"id\\\":285,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■36: {\\\"id\\\":286,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■37: {\\\"id\\\":287,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■38: {\\\"id\\\":288,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■39: {\\\"id\\\":289,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■40: {\\\"id\\\":290,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■41: {\\\"id\\\":291,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■42: {\\\"id\\\":292,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■43: {\\\"id\\\":293,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■44: {\\\"id\\\":294,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■45: {\\\"id\\\":295,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■46: {\\\"id\\\":296,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■47: {\\\"id\\\":297,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■48: {\\\"id\\\":298,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■49: {\\\"id\\\":299,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**field_presence**:\\n - very_bad (1): Most fields are completely missing.\\n - bad (2): Several required fields are incomplete.\\n - average (3): Some fields are complete, others missing.\\n - good (4): Most required fields are complete.\\n - very_good (5): All fields are complete with no omissions.\\n\\n**data_accuracy**:\\n - very_bad (1): Field data is inaccurate or invalid.\\n - bad (2): Many fields contain incorrect data.\\n - average (3): Some fields have minor inaccuracies.\\n - good (4): Most field data is accurate.\\n - very_good (5): All field data is fully accurate.\\n\\n**format_compliance**:\\n - very_bad (1): Field formatting ignores requirements.\\n - bad (2): Frequent formatting errors in fields.\\n - average (3): Occasional non-compliance with formats.\\n - good (4): Most fields follow correct formatting.\\n - very_good (5): All fields meet specified formats.\\n\\n**field_relevance**:\\n - very_bad (1): Fields provided are irrelevant to requirements.\\n - bad (2): Many fields are irrelevant or redundant.\\n - average (3): Some fields are contextually relevant.\\n - good (4): Most fields align with the context.\\n - very_good (5): All fields are fully relevant and appropriate.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■49)\\n- Use exact criterion names: field_presence, data_accuracy, format_compliance, field_relevance\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXZXPQWYEQ7zArVy6cKBd6YiPyQ","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■1:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■2:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■3:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■4:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■5:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■6:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■7:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■8:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■9:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■10:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■11:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■12:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■13:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■14:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■15:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■16:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■17:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■18:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■19:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■20:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■21:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■22:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■23:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■24:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■25:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■26:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■27:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■28:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■29:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■30:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■31:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■32:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■33:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■34:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■35:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■36:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■37:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■38:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■39:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■40:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■41:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■42:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■43:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■44:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■45:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■46:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■47:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■48:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■49:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":2135,"inputCost":0.0053375,"outputTokens":1503,"outputCost":0.015030000000000002},"botpress":{"cost":0.020367500000000004}},"meta":{"cached":false}}}
785
+ {"test":"zai.rate > large arrays > should rate many items efficiently with parallelization-4","key":"b401df8e","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■49):\\n■0: {\\\"id\\\":100,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■1: {\\\"id\\\":101,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■2: {\\\"id\\\":102,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■3: {\\\"id\\\":103,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■4: {\\\"id\\\":104,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■5: {\\\"id\\\":105,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■6: {\\\"id\\\":106,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■7: {\\\"id\\\":107,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■8: {\\\"id\\\":108,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■9: {\\\"id\\\":109,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■10: {\\\"id\\\":110,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■11: {\\\"id\\\":111,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■12: {\\\"id\\\":112,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■13: {\\\"id\\\":113,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■14: {\\\"id\\\":114,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■15: {\\\"id\\\":115,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■16: {\\\"id\\\":116,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■17: {\\\"id\\\":117,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■18: {\\\"id\\\":118,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■19: {\\\"id\\\":119,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■20: {\\\"id\\\":120,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■21: {\\\"id\\\":121,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■22: {\\\"id\\\":122,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■23: {\\\"id\\\":123,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■24: {\\\"id\\\":124,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■25: {\\\"id\\\":125,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■26: {\\\"id\\\":126,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■27: {\\\"id\\\":127,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■28: {\\\"id\\\":128,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■29: {\\\"id\\\":129,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■30: {\\\"id\\\":130,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■31: {\\\"id\\\":131,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■32: {\\\"id\\\":132,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■33: {\\\"id\\\":133,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■34: {\\\"id\\\":134,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■35: {\\\"id\\\":135,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■36: {\\\"id\\\":136,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■37: {\\\"id\\\":137,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■38: {\\\"id\\\":138,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■39: {\\\"id\\\":139,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■40: {\\\"id\\\":140,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■41: {\\\"id\\\":141,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■42: {\\\"id\\\":142,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■43: {\\\"id\\\":143,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■44: {\\\"id\\\":144,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■45: {\\\"id\\\":145,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■46: {\\\"id\\\":146,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■47: {\\\"id\\\":147,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■48: {\\\"id\\\":148,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■49: {\\\"id\\\":149,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**field_presence**:\\n - very_bad (1): Most fields are completely missing.\\n - bad (2): Several required fields are incomplete.\\n - average (3): Some fields are complete, others missing.\\n - good (4): Most required fields are complete.\\n - very_good (5): All fields are complete with no omissions.\\n\\n**data_accuracy**:\\n - very_bad (1): Field data is inaccurate or invalid.\\n - bad (2): Many fields contain incorrect data.\\n - average (3): Some fields have minor inaccuracies.\\n - good (4): Most field data is accurate.\\n - very_good (5): All field data is fully accurate.\\n\\n**format_compliance**:\\n - very_bad (1): Field formatting ignores requirements.\\n - bad (2): Frequent formatting errors in fields.\\n - average (3): Occasional non-compliance with formats.\\n - good (4): Most fields follow correct formatting.\\n - very_good (5): All fields meet specified formats.\\n\\n**field_relevance**:\\n - very_bad (1): Fields provided are irrelevant to requirements.\\n - bad (2): Many fields are irrelevant or redundant.\\n - average (3): Some fields are contextually relevant.\\n - good (4): Most fields align with the context.\\n - very_good (5): All fields are fully relevant and appropriate.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■49)\\n- Use exact criterion names: field_presence, data_accuracy, format_compliance, field_relevance\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXaXeBfDXd59tTJDpAQTS6vIPDK","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■1:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■2:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■3:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■4:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■5:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■6:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■7:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■8:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■9:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■10:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■11:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■12:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■13:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■14:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■15:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■16:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■17:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■18:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■19:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■20:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■21:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■22:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■23:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■24:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■25:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■26:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■27:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■28:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■29:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■30:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■31:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■32:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■33:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■34:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■35:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■36:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■37:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■38:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■39:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■40:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■41:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■42:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■43:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■44:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■45:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■46:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■47:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■48:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■49:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":2135,"inputCost":0.0053375,"outputTokens":1503,"outputCost":0.015030000000000002},"botpress":{"cost":0.020367500000000004}},"meta":{"cached":false}}}
786
+ {"test":"zai.rate > large arrays > should rate many items efficiently with parallelization-9","key":"35144670","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■49):\\n■0: {\\\"id\\\":350,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■1: {\\\"id\\\":351,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■2: {\\\"id\\\":352,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■3: {\\\"id\\\":353,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■4: {\\\"id\\\":354,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■5: {\\\"id\\\":355,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■6: {\\\"id\\\":356,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■7: {\\\"id\\\":357,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■8: {\\\"id\\\":358,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■9: {\\\"id\\\":359,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■10: {\\\"id\\\":360,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■11: {\\\"id\\\":361,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■12: {\\\"id\\\":362,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■13: {\\\"id\\\":363,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■14: {\\\"id\\\":364,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■15: {\\\"id\\\":365,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■16: {\\\"id\\\":366,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■17: {\\\"id\\\":367,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■18: {\\\"id\\\":368,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■19: {\\\"id\\\":369,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■20: {\\\"id\\\":370,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■21: {\\\"id\\\":371,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■22: {\\\"id\\\":372,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■23: {\\\"id\\\":373,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■24: {\\\"id\\\":374,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■25: {\\\"id\\\":375,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■26: {\\\"id\\\":376,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■27: {\\\"id\\\":377,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■28: {\\\"id\\\":378,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■29: {\\\"id\\\":379,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■30: {\\\"id\\\":380,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■31: {\\\"id\\\":381,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■32: {\\\"id\\\":382,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■33: {\\\"id\\\":383,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■34: {\\\"id\\\":384,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■35: {\\\"id\\\":385,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■36: {\\\"id\\\":386,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■37: {\\\"id\\\":387,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■38: {\\\"id\\\":388,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■39: {\\\"id\\\":389,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■40: {\\\"id\\\":390,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■41: {\\\"id\\\":391,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■42: {\\\"id\\\":392,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■43: {\\\"id\\\":393,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■44: {\\\"id\\\":394,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■45: {\\\"id\\\":395,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■46: {\\\"id\\\":396,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■47: {\\\"id\\\":397,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■48: {\\\"id\\\":398,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■49: {\\\"id\\\":399,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**field_presence**:\\n - very_bad (1): Most fields are completely missing.\\n - bad (2): Several required fields are incomplete.\\n - average (3): Some fields are complete, others missing.\\n - good (4): Most required fields are complete.\\n - very_good (5): All fields are complete with no omissions.\\n\\n**data_accuracy**:\\n - very_bad (1): Field data is inaccurate or invalid.\\n - bad (2): Many fields contain incorrect data.\\n - average (3): Some fields have minor inaccuracies.\\n - good (4): Most field data is accurate.\\n - very_good (5): All field data is fully accurate.\\n\\n**format_compliance**:\\n - very_bad (1): Field formatting ignores requirements.\\n - bad (2): Frequent formatting errors in fields.\\n - average (3): Occasional non-compliance with formats.\\n - good (4): Most fields follow correct formatting.\\n - very_good (5): All fields meet specified formats.\\n\\n**field_relevance**:\\n - very_bad (1): Fields provided are irrelevant to requirements.\\n - bad (2): Many fields are irrelevant or redundant.\\n - average (3): Some fields are contextually relevant.\\n - good (4): Most fields align with the context.\\n - very_good (5): All fields are fully relevant and appropriate.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■49)\\n- Use exact criterion names: field_presence, data_accuracy, format_compliance, field_relevance\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXZ90vdfD4xzG0BnM2YfmnjQKWW","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■1:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■2:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■3:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■4:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■5:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■6:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■7:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■8:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■9:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■10:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■11:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■12:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■13:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■14:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■15:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■16:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■17:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■18:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■19:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■20:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■21:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■22:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■23:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■24:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■25:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■26:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■27:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■28:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■29:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■30:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■31:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■32:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■33:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■34:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■35:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■36:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■37:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■38:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■39:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■40:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■41:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■42:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■43:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■44:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■45:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■46:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■47:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■48:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■49:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":2135,"inputCost":0.0053375,"outputTokens":1503,"outputCost":0.015030000000000002},"botpress":{"cost":0.020367500000000004}},"meta":{"cached":false}}}
787
+ {"test":"zai.rate > large arrays > should rate many items efficiently with parallelization-2","key":"5162c374","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■49):\\n■0: {\\\"id\\\":0,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■1: {\\\"id\\\":1,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■2: {\\\"id\\\":2,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■3: {\\\"id\\\":3,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■4: {\\\"id\\\":4,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■5: {\\\"id\\\":5,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■6: {\\\"id\\\":6,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■7: {\\\"id\\\":7,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■8: {\\\"id\\\":8,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■9: {\\\"id\\\":9,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■10: {\\\"id\\\":10,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■11: {\\\"id\\\":11,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■12: {\\\"id\\\":12,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■13: {\\\"id\\\":13,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■14: {\\\"id\\\":14,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■15: {\\\"id\\\":15,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■16: {\\\"id\\\":16,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■17: {\\\"id\\\":17,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■18: {\\\"id\\\":18,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■19: {\\\"id\\\":19,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■20: {\\\"id\\\":20,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■21: {\\\"id\\\":21,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■22: {\\\"id\\\":22,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■23: {\\\"id\\\":23,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■24: {\\\"id\\\":24,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■25: {\\\"id\\\":25,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■26: {\\\"id\\\":26,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■27: {\\\"id\\\":27,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■28: {\\\"id\\\":28,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■29: {\\\"id\\\":29,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■30: {\\\"id\\\":30,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■31: {\\\"id\\\":31,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■32: {\\\"id\\\":32,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■33: {\\\"id\\\":33,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■34: {\\\"id\\\":34,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■35: {\\\"id\\\":35,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■36: {\\\"id\\\":36,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■37: {\\\"id\\\":37,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■38: {\\\"id\\\":38,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■39: {\\\"id\\\":39,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■40: {\\\"id\\\":40,\\\"requiredField1\\\":\\\"present\\\",\\\"requiredField2\\\":\\\"present\\\",\\\"requiredField3\\\":\\\"present\\\",\\\"optionalField\\\":\\\"present\\\"}■\\n■41: {\\\"id\\\":41,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■42: {\\\"id\\\":42,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■43: {\\\"id\\\":43,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■44: {\\\"id\\\":44,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■45: {\\\"id\\\":45,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■46: {\\\"id\\\":46,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■47: {\\\"id\\\":47,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■48: {\\\"id\\\":48,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n■49: {\\\"id\\\":49,\\\"requiredField1\\\":null,\\\"requiredField2\\\":null,\\\"requiredField3\\\":null,\\\"optionalField\\\":null}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**field_presence**:\\n - very_bad (1): Most fields are completely missing.\\n - bad (2): Several required fields are incomplete.\\n - average (3): Some fields are complete, others missing.\\n - good (4): Most required fields are complete.\\n - very_good (5): All fields are complete with no omissions.\\n\\n**data_accuracy**:\\n - very_bad (1): Field data is inaccurate or invalid.\\n - bad (2): Many fields contain incorrect data.\\n - average (3): Some fields have minor inaccuracies.\\n - good (4): Most field data is accurate.\\n - very_good (5): All field data is fully accurate.\\n\\n**format_compliance**:\\n - very_bad (1): Field formatting ignores requirements.\\n - bad (2): Frequent formatting errors in fields.\\n - average (3): Occasional non-compliance with formats.\\n - good (4): Most fields follow correct formatting.\\n - very_good (5): All fields meet specified formats.\\n\\n**field_relevance**:\\n - very_bad (1): Fields provided are irrelevant to requirements.\\n - bad (2): Many fields are irrelevant or redundant.\\n - average (3): Some fields are contextually relevant.\\n - good (4): Most fields align with the context.\\n - very_good (5): All fields are fully relevant and appropriate.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■49)\\n- Use exact criterion names: field_presence, data_accuracy, format_compliance, field_relevance\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXaNBkiQADLEfwB2YM9czFZhGHd","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■1:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■2:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■3:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■4:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■5:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■6:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■7:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■8:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■9:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■10:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■11:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■12:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■13:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■14:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■15:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■16:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■17:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■18:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■19:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■20:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■21:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■22:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■23:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■24:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■25:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■26:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■27:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■28:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■29:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■30:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■31:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■32:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■33:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■34:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■35:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■36:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■37:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■38:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■39:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■40:field_presence=very_good;data_accuracy=very_good;format_compliance=very_good;field_relevance=very_good■ \n■41:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■42:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■43:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■44:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■45:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■46:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■47:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■48:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n■49:field_presence=very_bad;data_accuracy=very_bad;format_compliance=very_bad;field_relevance=very_bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":2135,"inputCost":0.0053375,"outputTokens":1503,"outputCost":0.015030000000000002},"botpress":{"cost":0.020367500000000004}},"meta":{"cached":false}}}
788
+ {"test":"zai.rate > large arrays > should maintain order with large parallel chunks-1","key":"69f6c473","input":"[{\"input\":{\"messages\":[{\"content\":\"For these evaluation criteria, provide 5 labels (very_bad, bad, average, good, very_good) with brief descriptions for each:\\n\\n- id_presence: does it have an id field?\\n- value_presence: does it have a value field?\\n\\nOutput format (JSON):\\n{\\n \\\"id_presence\\\": {\\n \\\"very_bad\\\": \\\"description\\\",\\n \\\"bad\\\": \\\"description\\\",\\n \\\"average\\\": \\\"description\\\",\\n \\\"good\\\": \\\"description\\\",\\n \\\"very_good\\\": \\\"description\\\"\\n }\\n ...\\n}\\n\\nKeep descriptions brief (5-10 words each).\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating evaluation criteria for rating items on a 1-5 scale.\\nEach criterion must have exactly 5 labels: very_bad (1), bad (2), average (3), good (4), very_good (5).\\nOutput valid JSON only.\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXhr2IF3PP1F7CRyASEzgLlnqqt","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"```json\n{\n \"id_presence\": {\n \"very_bad\": \"No id field present.\",\n \"bad\": \"id field exists but incorrect format.\",\n \"average\": \"id field present but partially complete.\",\n \"good\": \"id field present and mostly accurate.\",\n \"very_good\": \"id field present and perfectly accurate.\"\n },\n \"value_presence\": {\n \"very_bad\": \"No value field present.\",\n \"bad\": \"value field exists but incorrect format.\",\n \"average\": \"value field present but incomplete.\",\n \"good\": \"value field present and mostly valid.\",\n \"very_good\": \"value field present and perfectly valid.\"\n }\n}\n```","index":0,"stopReason":"stop"}],"usage":{"inputTokens":177,"inputCost":0.0004425,"outputTokens":141,"outputCost":0.00141},"botpress":{"cost":0.0018525}},"meta":{"cached":false}}}
789
+ {"test":"zai.rate > large arrays > should maintain order with large parallel chunks-6","key":"290abc3","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■49):\\n■0: {\\\"id\\\":200,\\\"value\\\":400,\\\"isEven\\\":true}■\\n■1: {\\\"id\\\":201,\\\"value\\\":402,\\\"isEven\\\":false}■\\n■2: {\\\"id\\\":202,\\\"value\\\":404,\\\"isEven\\\":true}■\\n■3: {\\\"id\\\":203,\\\"value\\\":406,\\\"isEven\\\":false}■\\n■4: {\\\"id\\\":204,\\\"value\\\":408,\\\"isEven\\\":true}■\\n■5: {\\\"id\\\":205,\\\"value\\\":410,\\\"isEven\\\":false}■\\n■6: {\\\"id\\\":206,\\\"value\\\":412,\\\"isEven\\\":true}■\\n■7: {\\\"id\\\":207,\\\"value\\\":414,\\\"isEven\\\":false}■\\n■8: {\\\"id\\\":208,\\\"value\\\":416,\\\"isEven\\\":true}■\\n■9: {\\\"id\\\":209,\\\"value\\\":418,\\\"isEven\\\":false}■\\n■10: {\\\"id\\\":210,\\\"value\\\":420,\\\"isEven\\\":true}■\\n■11: {\\\"id\\\":211,\\\"value\\\":422,\\\"isEven\\\":false}■\\n■12: {\\\"id\\\":212,\\\"value\\\":424,\\\"isEven\\\":true}■\\n■13: {\\\"id\\\":213,\\\"value\\\":426,\\\"isEven\\\":false}■\\n■14: {\\\"id\\\":214,\\\"value\\\":428,\\\"isEven\\\":true}■\\n■15: {\\\"id\\\":215,\\\"value\\\":430,\\\"isEven\\\":false}■\\n■16: {\\\"id\\\":216,\\\"value\\\":432,\\\"isEven\\\":true}■\\n■17: {\\\"id\\\":217,\\\"value\\\":434,\\\"isEven\\\":false}■\\n■18: {\\\"id\\\":218,\\\"value\\\":436,\\\"isEven\\\":true}■\\n■19: {\\\"id\\\":219,\\\"value\\\":438,\\\"isEven\\\":false}■\\n■20: {\\\"id\\\":220,\\\"value\\\":440,\\\"isEven\\\":true}■\\n■21: {\\\"id\\\":221,\\\"value\\\":442,\\\"isEven\\\":false}■\\n■22: {\\\"id\\\":222,\\\"value\\\":444,\\\"isEven\\\":true}■\\n■23: {\\\"id\\\":223,\\\"value\\\":446,\\\"isEven\\\":false}■\\n■24: {\\\"id\\\":224,\\\"value\\\":448,\\\"isEven\\\":true}■\\n■25: {\\\"id\\\":225,\\\"value\\\":450,\\\"isEven\\\":false}■\\n■26: {\\\"id\\\":226,\\\"value\\\":452,\\\"isEven\\\":true}■\\n■27: {\\\"id\\\":227,\\\"value\\\":454,\\\"isEven\\\":false}■\\n■28: {\\\"id\\\":228,\\\"value\\\":456,\\\"isEven\\\":true}■\\n■29: {\\\"id\\\":229,\\\"value\\\":458,\\\"isEven\\\":false}■\\n■30: {\\\"id\\\":230,\\\"value\\\":460,\\\"isEven\\\":true}■\\n■31: {\\\"id\\\":231,\\\"value\\\":462,\\\"isEven\\\":false}■\\n■32: {\\\"id\\\":232,\\\"value\\\":464,\\\"isEven\\\":true}■\\n■33: {\\\"id\\\":233,\\\"value\\\":466,\\\"isEven\\\":false}■\\n■34: {\\\"id\\\":234,\\\"value\\\":468,\\\"isEven\\\":true}■\\n■35: {\\\"id\\\":235,\\\"value\\\":470,\\\"isEven\\\":false}■\\n■36: {\\\"id\\\":236,\\\"value\\\":472,\\\"isEven\\\":true}■\\n■37: {\\\"id\\\":237,\\\"value\\\":474,\\\"isEven\\\":false}■\\n■38: {\\\"id\\\":238,\\\"value\\\":476,\\\"isEven\\\":true}■\\n■39: {\\\"id\\\":239,\\\"value\\\":478,\\\"isEven\\\":false}■\\n■40: {\\\"id\\\":240,\\\"value\\\":480,\\\"isEven\\\":true}■\\n■41: {\\\"id\\\":241,\\\"value\\\":482,\\\"isEven\\\":false}■\\n■42: {\\\"id\\\":242,\\\"value\\\":484,\\\"isEven\\\":true}■\\n■43: {\\\"id\\\":243,\\\"value\\\":486,\\\"isEven\\\":false}■\\n■44: {\\\"id\\\":244,\\\"value\\\":488,\\\"isEven\\\":true}■\\n■45: {\\\"id\\\":245,\\\"value\\\":490,\\\"isEven\\\":false}■\\n■46: {\\\"id\\\":246,\\\"value\\\":492,\\\"isEven\\\":true}■\\n■47: {\\\"id\\\":247,\\\"value\\\":494,\\\"isEven\\\":false}■\\n■48: {\\\"id\\\":248,\\\"value\\\":496,\\\"isEven\\\":true}■\\n■49: {\\\"id\\\":249,\\\"value\\\":498,\\\"isEven\\\":false}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**id_presence**:\\n - very_bad (1): No id field present.\\n - bad (2): id field exists but incorrect format.\\n - average (3): id field present but partially complete.\\n - good (4): id field present and mostly accurate.\\n - very_good (5): id field present and perfectly accurate.\\n\\n**value_presence**:\\n - very_bad (1): No value field present.\\n - bad (2): value field exists but incorrect format.\\n - average (3): value field present but incomplete.\\n - good (4): value field present and mostly valid.\\n - very_good (5): value field present and perfectly valid.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■49)\\n- Use exact criterion names: id_presence, value_presence\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXjE8cdqeO7deO4ElBStD9jUFhO","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:id_presence=very_good;value_presence=very_good■ \n■1:id_presence=very_good;value_presence=very_good■ \n■2:id_presence=very_good;value_presence=very_good■ \n■3:id_presence=very_good;value_presence=very_good■ \n■4:id_presence=very_good;value_presence=very_good■ \n■5:id_presence=very_good;value_presence=very_good■ \n■6:id_presence=very_good;value_presence=very_good■ \n■7:id_presence=very_good;value_presence=very_good■ \n■8:id_presence=very_good;value_presence=very_good■ \n■9:id_presence=very_good;value_presence=very_good■ \n■10:id_presence=very_good;value_presence=very_good■ \n■11:id_presence=very_good;value_presence=very_good■ \n■12:id_presence=very_good;value_presence=very_good■ \n■13:id_presence=very_good;value_presence=very_good■ \n■14:id_presence=very_good;value_presence=very_good■ \n■15:id_presence=very_good;value_presence=very_good■ \n■16:id_presence=very_good;value_presence=very_good■ \n■17:id_presence=very_good;value_presence=very_good■ \n■18:id_presence=very_good;value_presence=very_good■ \n■19:id_presence=very_good;value_presence=very_good■ \n■20:id_presence=very_good;value_presence=very_good■ \n■21:id_presence=very_good;value_presence=very_good■ \n■22:id_presence=very_good;value_presence=very_good■ \n■23:id_presence=very_good;value_presence=very_good■ \n■24:id_presence=very_good;value_presence=very_good■ \n■25:id_presence=very_good;value_presence=very_good■ \n■26:id_presence=very_good;value_presence=very_good■ \n■27:id_presence=very_good;value_presence=very_good■ \n■28:id_presence=very_good;value_presence=very_good■ \n■29:id_presence=very_good;value_presence=very_good■ \n■30:id_presence=very_good;value_presence=very_good■ \n■31:id_presence=very_good;value_presence=very_good■ \n■32:id_presence=very_good;value_presence=very_good■ \n■33:id_presence=very_good;value_presence=very_good■ \n■34:id_presence=very_good;value_presence=very_good■ \n■35:id_presence=very_good;value_presence=very_good■ \n■36:id_presence=very_good;value_presence=very_good■ \n■37:id_presence=very_good;value_presence=very_good■ \n■38:id_presence=very_good;value_presence=very_good■ \n■39:id_presence=very_good;value_presence=very_good■ \n■40:id_presence=very_good;value_presence=very_good■ \n■41:id_presence=very_good;value_presence=very_good■ \n■42:id_presence=very_good;value_presence=very_good■ \n■43:id_presence=very_good;value_presence=very_good■ \n■44:id_presence=very_good;value_presence=very_good■ \n■45:id_presence=very_good;value_presence=very_good■ \n■46:id_presence=very_good;value_presence=very_good■ \n■47:id_presence=very_good;value_presence=very_good■ \n■48:id_presence=very_good;value_presence=very_good■ \n■49:id_presence=very_good;value_presence=very_good■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":1283,"inputCost":0.0032075000000000003,"outputTokens":753,"outputCost":0.00753},"botpress":{"cost":0.0107375}},"meta":{"cached":false}}}
790
+ {"test":"zai.rate > large arrays > should maintain order with large parallel chunks-3","key":"a5947ed9","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■49):\\n■0: {\\\"id\\\":50,\\\"value\\\":100,\\\"isEven\\\":true}■\\n■1: {\\\"id\\\":51,\\\"value\\\":102,\\\"isEven\\\":false}■\\n■2: {\\\"id\\\":52,\\\"value\\\":104,\\\"isEven\\\":true}■\\n■3: {\\\"id\\\":53,\\\"value\\\":106,\\\"isEven\\\":false}■\\n■4: {\\\"id\\\":54,\\\"value\\\":108,\\\"isEven\\\":true}■\\n■5: {\\\"id\\\":55,\\\"value\\\":110,\\\"isEven\\\":false}■\\n■6: {\\\"id\\\":56,\\\"value\\\":112,\\\"isEven\\\":true}■\\n■7: {\\\"id\\\":57,\\\"value\\\":114,\\\"isEven\\\":false}■\\n■8: {\\\"id\\\":58,\\\"value\\\":116,\\\"isEven\\\":true}■\\n■9: {\\\"id\\\":59,\\\"value\\\":118,\\\"isEven\\\":false}■\\n■10: {\\\"id\\\":60,\\\"value\\\":120,\\\"isEven\\\":true}■\\n■11: {\\\"id\\\":61,\\\"value\\\":122,\\\"isEven\\\":false}■\\n■12: {\\\"id\\\":62,\\\"value\\\":124,\\\"isEven\\\":true}■\\n■13: {\\\"id\\\":63,\\\"value\\\":126,\\\"isEven\\\":false}■\\n■14: {\\\"id\\\":64,\\\"value\\\":128,\\\"isEven\\\":true}■\\n■15: {\\\"id\\\":65,\\\"value\\\":130,\\\"isEven\\\":false}■\\n■16: {\\\"id\\\":66,\\\"value\\\":132,\\\"isEven\\\":true}■\\n■17: {\\\"id\\\":67,\\\"value\\\":134,\\\"isEven\\\":false}■\\n■18: {\\\"id\\\":68,\\\"value\\\":136,\\\"isEven\\\":true}■\\n■19: {\\\"id\\\":69,\\\"value\\\":138,\\\"isEven\\\":false}■\\n■20: {\\\"id\\\":70,\\\"value\\\":140,\\\"isEven\\\":true}■\\n■21: {\\\"id\\\":71,\\\"value\\\":142,\\\"isEven\\\":false}■\\n■22: {\\\"id\\\":72,\\\"value\\\":144,\\\"isEven\\\":true}■\\n■23: {\\\"id\\\":73,\\\"value\\\":146,\\\"isEven\\\":false}■\\n■24: {\\\"id\\\":74,\\\"value\\\":148,\\\"isEven\\\":true}■\\n■25: {\\\"id\\\":75,\\\"value\\\":150,\\\"isEven\\\":false}■\\n■26: {\\\"id\\\":76,\\\"value\\\":152,\\\"isEven\\\":true}■\\n■27: {\\\"id\\\":77,\\\"value\\\":154,\\\"isEven\\\":false}■\\n■28: {\\\"id\\\":78,\\\"value\\\":156,\\\"isEven\\\":true}■\\n■29: {\\\"id\\\":79,\\\"value\\\":158,\\\"isEven\\\":false}■\\n■30: {\\\"id\\\":80,\\\"value\\\":160,\\\"isEven\\\":true}■\\n■31: {\\\"id\\\":81,\\\"value\\\":162,\\\"isEven\\\":false}■\\n■32: {\\\"id\\\":82,\\\"value\\\":164,\\\"isEven\\\":true}■\\n■33: {\\\"id\\\":83,\\\"value\\\":166,\\\"isEven\\\":false}■\\n■34: {\\\"id\\\":84,\\\"value\\\":168,\\\"isEven\\\":true}■\\n■35: {\\\"id\\\":85,\\\"value\\\":170,\\\"isEven\\\":false}■\\n■36: {\\\"id\\\":86,\\\"value\\\":172,\\\"isEven\\\":true}■\\n■37: {\\\"id\\\":87,\\\"value\\\":174,\\\"isEven\\\":false}■\\n■38: {\\\"id\\\":88,\\\"value\\\":176,\\\"isEven\\\":true}■\\n■39: {\\\"id\\\":89,\\\"value\\\":178,\\\"isEven\\\":false}■\\n■40: {\\\"id\\\":90,\\\"value\\\":180,\\\"isEven\\\":true}■\\n■41: {\\\"id\\\":91,\\\"value\\\":182,\\\"isEven\\\":false}■\\n■42: {\\\"id\\\":92,\\\"value\\\":184,\\\"isEven\\\":true}■\\n■43: {\\\"id\\\":93,\\\"value\\\":186,\\\"isEven\\\":false}■\\n■44: {\\\"id\\\":94,\\\"value\\\":188,\\\"isEven\\\":true}■\\n■45: {\\\"id\\\":95,\\\"value\\\":190,\\\"isEven\\\":false}■\\n■46: {\\\"id\\\":96,\\\"value\\\":192,\\\"isEven\\\":true}■\\n■47: {\\\"id\\\":97,\\\"value\\\":194,\\\"isEven\\\":false}■\\n■48: {\\\"id\\\":98,\\\"value\\\":196,\\\"isEven\\\":true}■\\n■49: {\\\"id\\\":99,\\\"value\\\":198,\\\"isEven\\\":false}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**id_presence**:\\n - very_bad (1): No id field present.\\n - bad (2): id field exists but incorrect format.\\n - average (3): id field present but partially complete.\\n - good (4): id field present and mostly accurate.\\n - very_good (5): id field present and perfectly accurate.\\n\\n**value_presence**:\\n - very_bad (1): No value field present.\\n - bad (2): value field exists but incorrect format.\\n - average (3): value field present but incomplete.\\n - good (4): value field present and mostly valid.\\n - very_good (5): value field present and perfectly valid.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■49)\\n- Use exact criterion names: id_presence, value_presence\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXjwH6rSn3KnZPO0VgYL01uS2A5","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:id_presence=very_good;value_presence=very_good■ \n■1:id_presence=very_good;value_presence=very_good■ \n■2:id_presence=very_good;value_presence=very_good■ \n■3:id_presence=very_good;value_presence=very_good■ \n■4:id_presence=very_good;value_presence=very_good■ \n■5:id_presence=very_good;value_presence=very_good■ \n■6:id_presence=very_good;value_presence=very_good■ \n■7:id_presence=very_good;value_presence=very_good■ \n■8:id_presence=very_good;value_presence=very_good■ \n■9:id_presence=very_good;value_presence=very_good■ \n■10:id_presence=very_good;value_presence=very_good■ \n■11:id_presence=very_good;value_presence=very_good■ \n■12:id_presence=very_good;value_presence=very_good■ \n■13:id_presence=very_good;value_presence=very_good■ \n■14:id_presence=very_good;value_presence=very_good■ \n■15:id_presence=very_good;value_presence=very_good■ \n■16:id_presence=very_good;value_presence=very_good■ \n■17:id_presence=very_good;value_presence=very_good■ \n■18:id_presence=very_good;value_presence=very_good■ \n■19:id_presence=very_good;value_presence=very_good■ \n■20:id_presence=very_good;value_presence=very_good■ \n■21:id_presence=very_good;value_presence=very_good■ \n■22:id_presence=very_good;value_presence=very_good■ \n■23:id_presence=very_good;value_presence=very_good■ \n■24:id_presence=very_good;value_presence=very_good■ \n■25:id_presence=very_good;value_presence=very_good■ \n■26:id_presence=very_good;value_presence=very_good■ \n■27:id_presence=very_good;value_presence=very_good■ \n■28:id_presence=very_good;value_presence=very_good■ \n■29:id_presence=very_good;value_presence=very_good■ \n■30:id_presence=very_good;value_presence=very_good■ \n■31:id_presence=very_good;value_presence=very_good■ \n■32:id_presence=very_good;value_presence=very_good■ \n■33:id_presence=very_good;value_presence=very_good■ \n■34:id_presence=very_good;value_presence=very_good■ \n■35:id_presence=very_good;value_presence=very_good■ \n■36:id_presence=very_good;value_presence=very_good■ \n■37:id_presence=very_good;value_presence=very_good■ \n■38:id_presence=very_good;value_presence=very_good■ \n■39:id_presence=very_good;value_presence=very_good■ \n■40:id_presence=very_good;value_presence=very_good■ \n■41:id_presence=very_good;value_presence=very_good■ \n■42:id_presence=very_good;value_presence=very_good■ \n■43:id_presence=very_good;value_presence=very_good■ \n■44:id_presence=very_good;value_presence=very_good■ \n■45:id_presence=very_good;value_presence=very_good■ \n■46:id_presence=very_good;value_presence=very_good■ \n■47:id_presence=very_good;value_presence=very_good■ \n■48:id_presence=very_good;value_presence=very_good■ \n■49:id_presence=very_good;value_presence=very_good■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":1283,"inputCost":0.0032075000000000003,"outputTokens":753,"outputCost":0.00753},"botpress":{"cost":0.0107375}},"meta":{"cached":false}}}
791
+ {"test":"zai.rate > large arrays > should maintain order with large parallel chunks-4","key":"a29f6be3","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■49):\\n■0: {\\\"id\\\":100,\\\"value\\\":200,\\\"isEven\\\":true}■\\n■1: {\\\"id\\\":101,\\\"value\\\":202,\\\"isEven\\\":false}■\\n■2: {\\\"id\\\":102,\\\"value\\\":204,\\\"isEven\\\":true}■\\n■3: {\\\"id\\\":103,\\\"value\\\":206,\\\"isEven\\\":false}■\\n■4: {\\\"id\\\":104,\\\"value\\\":208,\\\"isEven\\\":true}■\\n■5: {\\\"id\\\":105,\\\"value\\\":210,\\\"isEven\\\":false}■\\n■6: {\\\"id\\\":106,\\\"value\\\":212,\\\"isEven\\\":true}■\\n■7: {\\\"id\\\":107,\\\"value\\\":214,\\\"isEven\\\":false}■\\n■8: {\\\"id\\\":108,\\\"value\\\":216,\\\"isEven\\\":true}■\\n■9: {\\\"id\\\":109,\\\"value\\\":218,\\\"isEven\\\":false}■\\n■10: {\\\"id\\\":110,\\\"value\\\":220,\\\"isEven\\\":true}■\\n■11: {\\\"id\\\":111,\\\"value\\\":222,\\\"isEven\\\":false}■\\n■12: {\\\"id\\\":112,\\\"value\\\":224,\\\"isEven\\\":true}■\\n■13: {\\\"id\\\":113,\\\"value\\\":226,\\\"isEven\\\":false}■\\n■14: {\\\"id\\\":114,\\\"value\\\":228,\\\"isEven\\\":true}■\\n■15: {\\\"id\\\":115,\\\"value\\\":230,\\\"isEven\\\":false}■\\n■16: {\\\"id\\\":116,\\\"value\\\":232,\\\"isEven\\\":true}■\\n■17: {\\\"id\\\":117,\\\"value\\\":234,\\\"isEven\\\":false}■\\n■18: {\\\"id\\\":118,\\\"value\\\":236,\\\"isEven\\\":true}■\\n■19: {\\\"id\\\":119,\\\"value\\\":238,\\\"isEven\\\":false}■\\n■20: {\\\"id\\\":120,\\\"value\\\":240,\\\"isEven\\\":true}■\\n■21: {\\\"id\\\":121,\\\"value\\\":242,\\\"isEven\\\":false}■\\n■22: {\\\"id\\\":122,\\\"value\\\":244,\\\"isEven\\\":true}■\\n■23: {\\\"id\\\":123,\\\"value\\\":246,\\\"isEven\\\":false}■\\n■24: {\\\"id\\\":124,\\\"value\\\":248,\\\"isEven\\\":true}■\\n■25: {\\\"id\\\":125,\\\"value\\\":250,\\\"isEven\\\":false}■\\n■26: {\\\"id\\\":126,\\\"value\\\":252,\\\"isEven\\\":true}■\\n■27: {\\\"id\\\":127,\\\"value\\\":254,\\\"isEven\\\":false}■\\n■28: {\\\"id\\\":128,\\\"value\\\":256,\\\"isEven\\\":true}■\\n■29: {\\\"id\\\":129,\\\"value\\\":258,\\\"isEven\\\":false}■\\n■30: {\\\"id\\\":130,\\\"value\\\":260,\\\"isEven\\\":true}■\\n■31: {\\\"id\\\":131,\\\"value\\\":262,\\\"isEven\\\":false}■\\n■32: {\\\"id\\\":132,\\\"value\\\":264,\\\"isEven\\\":true}■\\n■33: {\\\"id\\\":133,\\\"value\\\":266,\\\"isEven\\\":false}■\\n■34: {\\\"id\\\":134,\\\"value\\\":268,\\\"isEven\\\":true}■\\n■35: {\\\"id\\\":135,\\\"value\\\":270,\\\"isEven\\\":false}■\\n■36: {\\\"id\\\":136,\\\"value\\\":272,\\\"isEven\\\":true}■\\n■37: {\\\"id\\\":137,\\\"value\\\":274,\\\"isEven\\\":false}■\\n■38: {\\\"id\\\":138,\\\"value\\\":276,\\\"isEven\\\":true}■\\n■39: {\\\"id\\\":139,\\\"value\\\":278,\\\"isEven\\\":false}■\\n■40: {\\\"id\\\":140,\\\"value\\\":280,\\\"isEven\\\":true}■\\n■41: {\\\"id\\\":141,\\\"value\\\":282,\\\"isEven\\\":false}■\\n■42: {\\\"id\\\":142,\\\"value\\\":284,\\\"isEven\\\":true}■\\n■43: {\\\"id\\\":143,\\\"value\\\":286,\\\"isEven\\\":false}■\\n■44: {\\\"id\\\":144,\\\"value\\\":288,\\\"isEven\\\":true}■\\n■45: {\\\"id\\\":145,\\\"value\\\":290,\\\"isEven\\\":false}■\\n■46: {\\\"id\\\":146,\\\"value\\\":292,\\\"isEven\\\":true}■\\n■47: {\\\"id\\\":147,\\\"value\\\":294,\\\"isEven\\\":false}■\\n■48: {\\\"id\\\":148,\\\"value\\\":296,\\\"isEven\\\":true}■\\n■49: {\\\"id\\\":149,\\\"value\\\":298,\\\"isEven\\\":false}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**id_presence**:\\n - very_bad (1): No id field present.\\n - bad (2): id field exists but incorrect format.\\n - average (3): id field present but partially complete.\\n - good (4): id field present and mostly accurate.\\n - very_good (5): id field present and perfectly accurate.\\n\\n**value_presence**:\\n - very_bad (1): No value field present.\\n - bad (2): value field exists but incorrect format.\\n - average (3): value field present but incomplete.\\n - good (4): value field present and mostly valid.\\n - very_good (5): value field present and perfectly valid.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■49)\\n- Use exact criterion names: id_presence, value_presence\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXjbDjJjcPc2c6EwbJDlPvsqDiB","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:id_presence=very_good;value_presence=very_good■ \n■1:id_presence=very_good;value_presence=very_good■ \n■2:id_presence=very_good;value_presence=very_good■ \n■3:id_presence=very_good;value_presence=very_good■ \n■4:id_presence=very_good;value_presence=very_good■ \n■5:id_presence=very_good;value_presence=very_good■ \n■6:id_presence=very_good;value_presence=very_good■ \n■7:id_presence=very_good;value_presence=very_good■ \n■8:id_presence=very_good;value_presence=very_good■ \n■9:id_presence=very_good;value_presence=very_good■ \n■10:id_presence=very_good;value_presence=very_good■ \n■11:id_presence=very_good;value_presence=very_good■ \n■12:id_presence=very_good;value_presence=very_good■ \n■13:id_presence=very_good;value_presence=very_good■ \n■14:id_presence=very_good;value_presence=very_good■ \n■15:id_presence=very_good;value_presence=very_good■ \n■16:id_presence=very_good;value_presence=very_good■ \n■17:id_presence=very_good;value_presence=very_good■ \n■18:id_presence=very_good;value_presence=very_good■ \n■19:id_presence=very_good;value_presence=very_good■ \n■20:id_presence=very_good;value_presence=very_good■ \n■21:id_presence=very_good;value_presence=very_good■ \n■22:id_presence=very_good;value_presence=very_good■ \n■23:id_presence=very_good;value_presence=very_good■ \n■24:id_presence=very_good;value_presence=very_good■ \n■25:id_presence=very_good;value_presence=very_good■ \n■26:id_presence=very_good;value_presence=very_good■ \n■27:id_presence=very_good;value_presence=very_good■ \n■28:id_presence=very_good;value_presence=very_good■ \n■29:id_presence=very_good;value_presence=very_good■ \n■30:id_presence=very_good;value_presence=very_good■ \n■31:id_presence=very_good;value_presence=very_good■ \n■32:id_presence=very_good;value_presence=very_good■ \n■33:id_presence=very_good;value_presence=very_good■ \n■34:id_presence=very_good;value_presence=very_good■ \n■35:id_presence=very_good;value_presence=very_good■ \n■36:id_presence=very_good;value_presence=very_good■ \n■37:id_presence=very_good;value_presence=very_good■ \n■38:id_presence=very_good;value_presence=very_good■ \n■39:id_presence=very_good;value_presence=very_good■ \n■40:id_presence=very_good;value_presence=very_good■ \n■41:id_presence=very_good;value_presence=very_good■ \n■42:id_presence=very_good;value_presence=very_good■ \n■43:id_presence=very_good;value_presence=very_good■ \n■44:id_presence=very_good;value_presence=very_good■ \n■45:id_presence=very_good;value_presence=very_good■ \n■46:id_presence=very_good;value_presence=very_good■ \n■47:id_presence=very_good;value_presence=very_good■ \n■48:id_presence=very_good;value_presence=very_good■ \n■49:id_presence=very_good;value_presence=very_good■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":1283,"inputCost":0.0032075000000000003,"outputTokens":753,"outputCost":0.00753},"botpress":{"cost":0.0107375}},"meta":{"cached":false}}}
792
+ {"test":"zai.rate > large arrays > should maintain order with large parallel chunks-2","key":"32326399","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■49):\\n■0: {\\\"id\\\":0,\\\"value\\\":0,\\\"isEven\\\":true}■\\n■1: {\\\"id\\\":1,\\\"value\\\":2,\\\"isEven\\\":false}■\\n■2: {\\\"id\\\":2,\\\"value\\\":4,\\\"isEven\\\":true}■\\n■3: {\\\"id\\\":3,\\\"value\\\":6,\\\"isEven\\\":false}■\\n■4: {\\\"id\\\":4,\\\"value\\\":8,\\\"isEven\\\":true}■\\n■5: {\\\"id\\\":5,\\\"value\\\":10,\\\"isEven\\\":false}■\\n■6: {\\\"id\\\":6,\\\"value\\\":12,\\\"isEven\\\":true}■\\n■7: {\\\"id\\\":7,\\\"value\\\":14,\\\"isEven\\\":false}■\\n■8: {\\\"id\\\":8,\\\"value\\\":16,\\\"isEven\\\":true}■\\n■9: {\\\"id\\\":9,\\\"value\\\":18,\\\"isEven\\\":false}■\\n■10: {\\\"id\\\":10,\\\"value\\\":20,\\\"isEven\\\":true}■\\n■11: {\\\"id\\\":11,\\\"value\\\":22,\\\"isEven\\\":false}■\\n■12: {\\\"id\\\":12,\\\"value\\\":24,\\\"isEven\\\":true}■\\n■13: {\\\"id\\\":13,\\\"value\\\":26,\\\"isEven\\\":false}■\\n■14: {\\\"id\\\":14,\\\"value\\\":28,\\\"isEven\\\":true}■\\n■15: {\\\"id\\\":15,\\\"value\\\":30,\\\"isEven\\\":false}■\\n■16: {\\\"id\\\":16,\\\"value\\\":32,\\\"isEven\\\":true}■\\n■17: {\\\"id\\\":17,\\\"value\\\":34,\\\"isEven\\\":false}■\\n■18: {\\\"id\\\":18,\\\"value\\\":36,\\\"isEven\\\":true}■\\n■19: {\\\"id\\\":19,\\\"value\\\":38,\\\"isEven\\\":false}■\\n■20: {\\\"id\\\":20,\\\"value\\\":40,\\\"isEven\\\":true}■\\n■21: {\\\"id\\\":21,\\\"value\\\":42,\\\"isEven\\\":false}■\\n■22: {\\\"id\\\":22,\\\"value\\\":44,\\\"isEven\\\":true}■\\n■23: {\\\"id\\\":23,\\\"value\\\":46,\\\"isEven\\\":false}■\\n■24: {\\\"id\\\":24,\\\"value\\\":48,\\\"isEven\\\":true}■\\n■25: {\\\"id\\\":25,\\\"value\\\":50,\\\"isEven\\\":false}■\\n■26: {\\\"id\\\":26,\\\"value\\\":52,\\\"isEven\\\":true}■\\n■27: {\\\"id\\\":27,\\\"value\\\":54,\\\"isEven\\\":false}■\\n■28: {\\\"id\\\":28,\\\"value\\\":56,\\\"isEven\\\":true}■\\n■29: {\\\"id\\\":29,\\\"value\\\":58,\\\"isEven\\\":false}■\\n■30: {\\\"id\\\":30,\\\"value\\\":60,\\\"isEven\\\":true}■\\n■31: {\\\"id\\\":31,\\\"value\\\":62,\\\"isEven\\\":false}■\\n■32: {\\\"id\\\":32,\\\"value\\\":64,\\\"isEven\\\":true}■\\n■33: {\\\"id\\\":33,\\\"value\\\":66,\\\"isEven\\\":false}■\\n■34: {\\\"id\\\":34,\\\"value\\\":68,\\\"isEven\\\":true}■\\n■35: {\\\"id\\\":35,\\\"value\\\":70,\\\"isEven\\\":false}■\\n■36: {\\\"id\\\":36,\\\"value\\\":72,\\\"isEven\\\":true}■\\n■37: {\\\"id\\\":37,\\\"value\\\":74,\\\"isEven\\\":false}■\\n■38: {\\\"id\\\":38,\\\"value\\\":76,\\\"isEven\\\":true}■\\n■39: {\\\"id\\\":39,\\\"value\\\":78,\\\"isEven\\\":false}■\\n■40: {\\\"id\\\":40,\\\"value\\\":80,\\\"isEven\\\":true}■\\n■41: {\\\"id\\\":41,\\\"value\\\":82,\\\"isEven\\\":false}■\\n■42: {\\\"id\\\":42,\\\"value\\\":84,\\\"isEven\\\":true}■\\n■43: {\\\"id\\\":43,\\\"value\\\":86,\\\"isEven\\\":false}■\\n■44: {\\\"id\\\":44,\\\"value\\\":88,\\\"isEven\\\":true}■\\n■45: {\\\"id\\\":45,\\\"value\\\":90,\\\"isEven\\\":false}■\\n■46: {\\\"id\\\":46,\\\"value\\\":92,\\\"isEven\\\":true}■\\n■47: {\\\"id\\\":47,\\\"value\\\":94,\\\"isEven\\\":false}■\\n■48: {\\\"id\\\":48,\\\"value\\\":96,\\\"isEven\\\":true}■\\n■49: {\\\"id\\\":49,\\\"value\\\":98,\\\"isEven\\\":false}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**id_presence**:\\n - very_bad (1): No id field present.\\n - bad (2): id field exists but incorrect format.\\n - average (3): id field present but partially complete.\\n - good (4): id field present and mostly accurate.\\n - very_good (5): id field present and perfectly accurate.\\n\\n**value_presence**:\\n - very_bad (1): No value field present.\\n - bad (2): value field exists but incorrect format.\\n - average (3): value field present but incomplete.\\n - good (4): value field present and mostly valid.\\n - very_good (5): value field present and perfectly valid.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■49)\\n- Use exact criterion names: id_presence, value_presence\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXjUFlx8oNeoSeWalN0IzDcA03Z","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:id_presence=very_good;value_presence=very_good■ \n■1:id_presence=very_good;value_presence=very_good■ \n■2:id_presence=very_good;value_presence=very_good■ \n■3:id_presence=very_good;value_presence=very_good■ \n■4:id_presence=very_good;value_presence=very_good■ \n■5:id_presence=very_good;value_presence=very_good■ \n■6:id_presence=very_good;value_presence=very_good■ \n■7:id_presence=very_good;value_presence=very_good■ \n■8:id_presence=very_good;value_presence=very_good■ \n■9:id_presence=very_good;value_presence=very_good■ \n■10:id_presence=very_good;value_presence=very_good■ \n■11:id_presence=very_good;value_presence=very_good■ \n■12:id_presence=very_good;value_presence=very_good■ \n■13:id_presence=very_good;value_presence=very_good■ \n■14:id_presence=very_good;value_presence=very_good■ \n■15:id_presence=very_good;value_presence=very_good■ \n■16:id_presence=very_good;value_presence=very_good■ \n■17:id_presence=very_good;value_presence=very_good■ \n■18:id_presence=very_good;value_presence=very_good■ \n■19:id_presence=very_good;value_presence=very_good■ \n■20:id_presence=very_good;value_presence=very_good■ \n■21:id_presence=very_good;value_presence=very_good■ \n■22:id_presence=very_good;value_presence=very_good■ \n■23:id_presence=very_good;value_presence=very_good■ \n■24:id_presence=very_good;value_presence=very_good■ \n■25:id_presence=very_good;value_presence=very_good■ \n■26:id_presence=very_good;value_presence=very_good■ \n■27:id_presence=very_good;value_presence=very_good■ \n■28:id_presence=very_good;value_presence=very_good■ \n■29:id_presence=very_good;value_presence=very_good■ \n■30:id_presence=very_good;value_presence=very_good■ \n■31:id_presence=very_good;value_presence=very_good■ \n■32:id_presence=very_good;value_presence=very_good■ \n■33:id_presence=very_good;value_presence=very_good■ \n■34:id_presence=very_good;value_presence=very_good■ \n■35:id_presence=very_good;value_presence=very_good■ \n■36:id_presence=very_good;value_presence=very_good■ \n■37:id_presence=very_good;value_presence=very_good■ \n■38:id_presence=very_good;value_presence=very_good■ \n■39:id_presence=very_good;value_presence=very_good■ \n■40:id_presence=very_good;value_presence=very_good■ \n■41:id_presence=very_good;value_presence=very_good■ \n■42:id_presence=very_good;value_presence=very_good■ \n■43:id_presence=very_good;value_presence=very_good■ \n■44:id_presence=very_good;value_presence=very_good■ \n■45:id_presence=very_good;value_presence=very_good■ \n■46:id_presence=very_good;value_presence=very_good■ \n■47:id_presence=very_good;value_presence=very_good■ \n■48:id_presence=very_good;value_presence=very_good■ \n■49:id_presence=very_good;value_presence=very_good■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":1283,"inputCost":0.0032075000000000003,"outputTokens":753,"outputCost":0.00753},"botpress":{"cost":0.0107375}},"meta":{"cached":false}}}
793
+ {"test":"zai.rate > large arrays > should maintain order with large parallel chunks-5","key":"5d85f563","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■49):\\n■0: {\\\"id\\\":150,\\\"value\\\":300,\\\"isEven\\\":true}■\\n■1: {\\\"id\\\":151,\\\"value\\\":302,\\\"isEven\\\":false}■\\n■2: {\\\"id\\\":152,\\\"value\\\":304,\\\"isEven\\\":true}■\\n■3: {\\\"id\\\":153,\\\"value\\\":306,\\\"isEven\\\":false}■\\n■4: {\\\"id\\\":154,\\\"value\\\":308,\\\"isEven\\\":true}■\\n■5: {\\\"id\\\":155,\\\"value\\\":310,\\\"isEven\\\":false}■\\n■6: {\\\"id\\\":156,\\\"value\\\":312,\\\"isEven\\\":true}■\\n■7: {\\\"id\\\":157,\\\"value\\\":314,\\\"isEven\\\":false}■\\n■8: {\\\"id\\\":158,\\\"value\\\":316,\\\"isEven\\\":true}■\\n■9: {\\\"id\\\":159,\\\"value\\\":318,\\\"isEven\\\":false}■\\n■10: {\\\"id\\\":160,\\\"value\\\":320,\\\"isEven\\\":true}■\\n■11: {\\\"id\\\":161,\\\"value\\\":322,\\\"isEven\\\":false}■\\n■12: {\\\"id\\\":162,\\\"value\\\":324,\\\"isEven\\\":true}■\\n■13: {\\\"id\\\":163,\\\"value\\\":326,\\\"isEven\\\":false}■\\n■14: {\\\"id\\\":164,\\\"value\\\":328,\\\"isEven\\\":true}■\\n■15: {\\\"id\\\":165,\\\"value\\\":330,\\\"isEven\\\":false}■\\n■16: {\\\"id\\\":166,\\\"value\\\":332,\\\"isEven\\\":true}■\\n■17: {\\\"id\\\":167,\\\"value\\\":334,\\\"isEven\\\":false}■\\n■18: {\\\"id\\\":168,\\\"value\\\":336,\\\"isEven\\\":true}■\\n■19: {\\\"id\\\":169,\\\"value\\\":338,\\\"isEven\\\":false}■\\n■20: {\\\"id\\\":170,\\\"value\\\":340,\\\"isEven\\\":true}■\\n■21: {\\\"id\\\":171,\\\"value\\\":342,\\\"isEven\\\":false}■\\n■22: {\\\"id\\\":172,\\\"value\\\":344,\\\"isEven\\\":true}■\\n■23: {\\\"id\\\":173,\\\"value\\\":346,\\\"isEven\\\":false}■\\n■24: {\\\"id\\\":174,\\\"value\\\":348,\\\"isEven\\\":true}■\\n■25: {\\\"id\\\":175,\\\"value\\\":350,\\\"isEven\\\":false}■\\n■26: {\\\"id\\\":176,\\\"value\\\":352,\\\"isEven\\\":true}■\\n■27: {\\\"id\\\":177,\\\"value\\\":354,\\\"isEven\\\":false}■\\n■28: {\\\"id\\\":178,\\\"value\\\":356,\\\"isEven\\\":true}■\\n■29: {\\\"id\\\":179,\\\"value\\\":358,\\\"isEven\\\":false}■\\n■30: {\\\"id\\\":180,\\\"value\\\":360,\\\"isEven\\\":true}■\\n■31: {\\\"id\\\":181,\\\"value\\\":362,\\\"isEven\\\":false}■\\n■32: {\\\"id\\\":182,\\\"value\\\":364,\\\"isEven\\\":true}■\\n■33: {\\\"id\\\":183,\\\"value\\\":366,\\\"isEven\\\":false}■\\n■34: {\\\"id\\\":184,\\\"value\\\":368,\\\"isEven\\\":true}■\\n■35: {\\\"id\\\":185,\\\"value\\\":370,\\\"isEven\\\":false}■\\n■36: {\\\"id\\\":186,\\\"value\\\":372,\\\"isEven\\\":true}■\\n■37: {\\\"id\\\":187,\\\"value\\\":374,\\\"isEven\\\":false}■\\n■38: {\\\"id\\\":188,\\\"value\\\":376,\\\"isEven\\\":true}■\\n■39: {\\\"id\\\":189,\\\"value\\\":378,\\\"isEven\\\":false}■\\n■40: {\\\"id\\\":190,\\\"value\\\":380,\\\"isEven\\\":true}■\\n■41: {\\\"id\\\":191,\\\"value\\\":382,\\\"isEven\\\":false}■\\n■42: {\\\"id\\\":192,\\\"value\\\":384,\\\"isEven\\\":true}■\\n■43: {\\\"id\\\":193,\\\"value\\\":386,\\\"isEven\\\":false}■\\n■44: {\\\"id\\\":194,\\\"value\\\":388,\\\"isEven\\\":true}■\\n■45: {\\\"id\\\":195,\\\"value\\\":390,\\\"isEven\\\":false}■\\n■46: {\\\"id\\\":196,\\\"value\\\":392,\\\"isEven\\\":true}■\\n■47: {\\\"id\\\":197,\\\"value\\\":394,\\\"isEven\\\":false}■\\n■48: {\\\"id\\\":198,\\\"value\\\":396,\\\"isEven\\\":true}■\\n■49: {\\\"id\\\":199,\\\"value\\\":398,\\\"isEven\\\":false}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**id_presence**:\\n - very_bad (1): No id field present.\\n - bad (2): id field exists but incorrect format.\\n - average (3): id field present but partially complete.\\n - good (4): id field present and mostly accurate.\\n - very_good (5): id field present and perfectly accurate.\\n\\n**value_presence**:\\n - very_bad (1): No value field present.\\n - bad (2): value field exists but incorrect format.\\n - average (3): value field present but incomplete.\\n - good (4): value field present and mostly valid.\\n - very_good (5): value field present and perfectly valid.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■49)\\n- Use exact criterion names: id_presence, value_presence\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXjJUpRvTJsEczYIB24nOACwg47","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:id_presence=very_good;value_presence=very_good■ \n■1:id_presence=very_good;value_presence=very_good■ \n■2:id_presence=very_good;value_presence=very_good■ \n■3:id_presence=very_good;value_presence=very_good■ \n■4:id_presence=very_good;value_presence=very_good■ \n■5:id_presence=very_good;value_presence=very_good■ \n■6:id_presence=very_good;value_presence=very_good■ \n■7:id_presence=very_good;value_presence=very_good■ \n■8:id_presence=very_good;value_presence=very_good■ \n■9:id_presence=very_good;value_presence=very_good■ \n■10:id_presence=very_good;value_presence=very_good■ \n■11:id_presence=very_good;value_presence=very_good■ \n■12:id_presence=very_good;value_presence=very_good■ \n■13:id_presence=very_good;value_presence=very_good■ \n■14:id_presence=very_good;value_presence=very_good■ \n■15:id_presence=very_good;value_presence=very_good■ \n■16:id_presence=very_good;value_presence=very_good■ \n■17:id_presence=very_good;value_presence=very_good■ \n■18:id_presence=very_good;value_presence=very_good■ \n■19:id_presence=very_good;value_presence=very_good■ \n■20:id_presence=very_good;value_presence=very_good■ \n■21:id_presence=very_good;value_presence=very_good■ \n■22:id_presence=very_good;value_presence=very_good■ \n■23:id_presence=very_good;value_presence=very_good■ \n■24:id_presence=very_good;value_presence=very_good■ \n■25:id_presence=very_good;value_presence=very_good■ \n■26:id_presence=very_good;value_presence=very_good■ \n■27:id_presence=very_good;value_presence=very_good■ \n■28:id_presence=very_good;value_presence=very_good■ \n■29:id_presence=very_good;value_presence=very_good■ \n■30:id_presence=very_good;value_presence=very_good■ \n■31:id_presence=very_good;value_presence=very_good■ \n■32:id_presence=very_good;value_presence=very_good■ \n■33:id_presence=very_good;value_presence=very_good■ \n■34:id_presence=very_good;value_presence=very_good■ \n■35:id_presence=very_good;value_presence=very_good■ \n■36:id_presence=very_good;value_presence=very_good■ \n■37:id_presence=very_good;value_presence=very_good■ \n■38:id_presence=very_good;value_presence=very_good■ \n■39:id_presence=very_good;value_presence=very_good■ \n■40:id_presence=very_good;value_presence=very_good■ \n■41:id_presence=very_good;value_presence=very_good■ \n■42:id_presence=very_good;value_presence=very_good■ \n■43:id_presence=very_good;value_presence=very_good■ \n■44:id_presence=very_good;value_presence=very_good■ \n■45:id_presence=very_good;value_presence=very_good■ \n■46:id_presence=very_good;value_presence=very_good■ \n■47:id_presence=very_good;value_presence=very_good■ \n■48:id_presence=very_good;value_presence=very_good■ \n■49:id_presence=very_good;value_presence=very_good■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":1283,"inputCost":0.0032075000000000003,"outputTokens":753,"outputCost":0.00753},"botpress":{"cost":0.0107375}},"meta":{"cached":false}}}
794
+ {"test":"zai.rate > large arrays > should maintain order with large parallel chunks-7","key":"bd773543","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■49):\\n■0: {\\\"id\\\":250,\\\"value\\\":500,\\\"isEven\\\":true}■\\n■1: {\\\"id\\\":251,\\\"value\\\":502,\\\"isEven\\\":false}■\\n■2: {\\\"id\\\":252,\\\"value\\\":504,\\\"isEven\\\":true}■\\n■3: {\\\"id\\\":253,\\\"value\\\":506,\\\"isEven\\\":false}■\\n■4: {\\\"id\\\":254,\\\"value\\\":508,\\\"isEven\\\":true}■\\n■5: {\\\"id\\\":255,\\\"value\\\":510,\\\"isEven\\\":false}■\\n■6: {\\\"id\\\":256,\\\"value\\\":512,\\\"isEven\\\":true}■\\n■7: {\\\"id\\\":257,\\\"value\\\":514,\\\"isEven\\\":false}■\\n■8: {\\\"id\\\":258,\\\"value\\\":516,\\\"isEven\\\":true}■\\n■9: {\\\"id\\\":259,\\\"value\\\":518,\\\"isEven\\\":false}■\\n■10: {\\\"id\\\":260,\\\"value\\\":520,\\\"isEven\\\":true}■\\n■11: {\\\"id\\\":261,\\\"value\\\":522,\\\"isEven\\\":false}■\\n■12: {\\\"id\\\":262,\\\"value\\\":524,\\\"isEven\\\":true}■\\n■13: {\\\"id\\\":263,\\\"value\\\":526,\\\"isEven\\\":false}■\\n■14: {\\\"id\\\":264,\\\"value\\\":528,\\\"isEven\\\":true}■\\n■15: {\\\"id\\\":265,\\\"value\\\":530,\\\"isEven\\\":false}■\\n■16: {\\\"id\\\":266,\\\"value\\\":532,\\\"isEven\\\":true}■\\n■17: {\\\"id\\\":267,\\\"value\\\":534,\\\"isEven\\\":false}■\\n■18: {\\\"id\\\":268,\\\"value\\\":536,\\\"isEven\\\":true}■\\n■19: {\\\"id\\\":269,\\\"value\\\":538,\\\"isEven\\\":false}■\\n■20: {\\\"id\\\":270,\\\"value\\\":540,\\\"isEven\\\":true}■\\n■21: {\\\"id\\\":271,\\\"value\\\":542,\\\"isEven\\\":false}■\\n■22: {\\\"id\\\":272,\\\"value\\\":544,\\\"isEven\\\":true}■\\n■23: {\\\"id\\\":273,\\\"value\\\":546,\\\"isEven\\\":false}■\\n■24: {\\\"id\\\":274,\\\"value\\\":548,\\\"isEven\\\":true}■\\n■25: {\\\"id\\\":275,\\\"value\\\":550,\\\"isEven\\\":false}■\\n■26: {\\\"id\\\":276,\\\"value\\\":552,\\\"isEven\\\":true}■\\n■27: {\\\"id\\\":277,\\\"value\\\":554,\\\"isEven\\\":false}■\\n■28: {\\\"id\\\":278,\\\"value\\\":556,\\\"isEven\\\":true}■\\n■29: {\\\"id\\\":279,\\\"value\\\":558,\\\"isEven\\\":false}■\\n■30: {\\\"id\\\":280,\\\"value\\\":560,\\\"isEven\\\":true}■\\n■31: {\\\"id\\\":281,\\\"value\\\":562,\\\"isEven\\\":false}■\\n■32: {\\\"id\\\":282,\\\"value\\\":564,\\\"isEven\\\":true}■\\n■33: {\\\"id\\\":283,\\\"value\\\":566,\\\"isEven\\\":false}■\\n■34: {\\\"id\\\":284,\\\"value\\\":568,\\\"isEven\\\":true}■\\n■35: {\\\"id\\\":285,\\\"value\\\":570,\\\"isEven\\\":false}■\\n■36: {\\\"id\\\":286,\\\"value\\\":572,\\\"isEven\\\":true}■\\n■37: {\\\"id\\\":287,\\\"value\\\":574,\\\"isEven\\\":false}■\\n■38: {\\\"id\\\":288,\\\"value\\\":576,\\\"isEven\\\":true}■\\n■39: {\\\"id\\\":289,\\\"value\\\":578,\\\"isEven\\\":false}■\\n■40: {\\\"id\\\":290,\\\"value\\\":580,\\\"isEven\\\":true}■\\n■41: {\\\"id\\\":291,\\\"value\\\":582,\\\"isEven\\\":false}■\\n■42: {\\\"id\\\":292,\\\"value\\\":584,\\\"isEven\\\":true}■\\n■43: {\\\"id\\\":293,\\\"value\\\":586,\\\"isEven\\\":false}■\\n■44: {\\\"id\\\":294,\\\"value\\\":588,\\\"isEven\\\":true}■\\n■45: {\\\"id\\\":295,\\\"value\\\":590,\\\"isEven\\\":false}■\\n■46: {\\\"id\\\":296,\\\"value\\\":592,\\\"isEven\\\":true}■\\n■47: {\\\"id\\\":297,\\\"value\\\":594,\\\"isEven\\\":false}■\\n■48: {\\\"id\\\":298,\\\"value\\\":596,\\\"isEven\\\":true}■\\n■49: {\\\"id\\\":299,\\\"value\\\":598,\\\"isEven\\\":false}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**id_presence**:\\n - very_bad (1): No id field present.\\n - bad (2): id field exists but incorrect format.\\n - average (3): id field present but partially complete.\\n - good (4): id field present and mostly accurate.\\n - very_good (5): id field present and perfectly accurate.\\n\\n**value_presence**:\\n - very_bad (1): No value field present.\\n - bad (2): value field exists but incorrect format.\\n - average (3): value field present but incomplete.\\n - good (4): value field present and mostly valid.\\n - very_good (5): value field present and perfectly valid.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■49)\\n- Use exact criterion names: id_presence, value_presence\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXj1V1y7AFiIcb48zCQpzVtT9Dt","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:id_presence=very_good;value_presence=very_good■ \n■1:id_presence=very_good;value_presence=very_good■ \n■2:id_presence=very_good;value_presence=very_good■ \n■3:id_presence=very_good;value_presence=very_good■ \n■4:id_presence=very_good;value_presence=very_good■ \n■5:id_presence=very_good;value_presence=very_good■ \n■6:id_presence=very_good;value_presence=very_good■ \n■7:id_presence=very_good;value_presence=very_good■ \n■8:id_presence=very_good;value_presence=very_good■ \n■9:id_presence=very_good;value_presence=very_good■ \n■10:id_presence=very_good;value_presence=very_good■ \n■11:id_presence=very_good;value_presence=very_good■ \n■12:id_presence=very_good;value_presence=very_good■ \n■13:id_presence=very_good;value_presence=very_good■ \n■14:id_presence=very_good;value_presence=very_good■ \n■15:id_presence=very_good;value_presence=very_good■ \n■16:id_presence=very_good;value_presence=very_good■ \n■17:id_presence=very_good;value_presence=very_good■ \n■18:id_presence=very_good;value_presence=very_good■ \n■19:id_presence=very_good;value_presence=very_good■ \n■20:id_presence=very_good;value_presence=very_good■ \n■21:id_presence=very_good;value_presence=very_good■ \n■22:id_presence=very_good;value_presence=very_good■ \n■23:id_presence=very_good;value_presence=very_good■ \n■24:id_presence=very_good;value_presence=very_good■ \n■25:id_presence=very_good;value_presence=very_good■ \n■26:id_presence=very_good;value_presence=very_good■ \n■27:id_presence=very_good;value_presence=very_good■ \n■28:id_presence=very_good;value_presence=very_good■ \n■29:id_presence=very_good;value_presence=very_good■ \n■30:id_presence=very_good;value_presence=very_good■ \n■31:id_presence=very_good;value_presence=very_good■ \n■32:id_presence=very_good;value_presence=very_good■ \n■33:id_presence=very_good;value_presence=very_good■ \n■34:id_presence=very_good;value_presence=very_good■ \n■35:id_presence=very_good;value_presence=very_good■ \n■36:id_presence=very_good;value_presence=very_good■ \n■37:id_presence=very_good;value_presence=very_good■ \n■38:id_presence=very_good;value_presence=very_good■ \n■39:id_presence=very_good;value_presence=very_good■ \n■40:id_presence=very_good;value_presence=very_good■ \n■41:id_presence=very_good;value_presence=very_good■ \n■42:id_presence=very_good;value_presence=very_good■ \n■43:id_presence=very_good;value_presence=very_good■ \n■44:id_presence=very_good;value_presence=very_good■ \n■45:id_presence=very_good;value_presence=very_good■ \n■46:id_presence=very_good;value_presence=very_good■ \n■47:id_presence=very_good;value_presence=very_good■ \n■48:id_presence=very_good;value_presence=very_good■ \n■49:id_presence=very_good;value_presence=very_good■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":1283,"inputCost":0.0032075000000000003,"outputTokens":753,"outputCost":0.00753},"botpress":{"cost":0.0107375}},"meta":{"cached":false}}}
795
+ {"test":"zai.rate > large arrays > should handle chunking with record instructions-1","key":"bcb16946","input":"[{\"input\":{\"messages\":[{\"content\":\"For these evaluation criteria, provide 5 labels (very_bad, bad, average, good, very_good) with brief descriptions for each:\\n\\n- evenness: is the ID an even number?\\n- divisibility: is the ID divisible by 5?\\n\\nOutput format (JSON):\\n{\\n \\\"evenness\\\": {\\n \\\"very_bad\\\": \\\"description\\\",\\n \\\"bad\\\": \\\"description\\\",\\n \\\"average\\\": \\\"description\\\",\\n \\\"good\\\": \\\"description\\\",\\n \\\"very_good\\\": \\\"description\\\"\\n }\\n ...\\n}\\n\\nKeep descriptions brief (5-10 words each).\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating evaluation criteria for rating items on a 1-5 scale.\\nEach criterion must have exactly 5 labels: very_bad (1), bad (2), average (3), good (4), very_good (5).\\nOutput valid JSON only.\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXnkgjTK1q2fY4YaOnPmbMJV0TY","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"```json\n{\n \"evenness\": {\n \"very_bad\": \"Rarely an even number.\",\n \"bad\": \"Occasionally an even number.\",\n \"average\": \"Even and odd numbers equal.\",\n \"good\": \"Usually an even number.\",\n \"very_good\": \"Always an even number.\"\n },\n \"divisibility\": {\n \"very_bad\": \"Rarely divisible by 5.\",\n \"bad\": \"Occasionally divisible by 5.\",\n \"average\": \"Divisible by 5 half the time.\",\n \"good\": \"Usually divisible by 5.\",\n \"very_good\": \"Always divisible by 5.\"\n }\n}\n```","index":0,"stopReason":"stop"}],"usage":{"inputTokens":178,"inputCost":0.00044500000000000003,"outputTokens":142,"outputCost":0.00142},"botpress":{"cost":0.0018650000000000001}},"meta":{"cached":false}}}
796
+ {"test":"zai.rate > large arrays > should handle chunking with record instructions-3","key":"3fbc22a5","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■24):\\n■0: {\\\"id\\\":50,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":true}■\\n■1: {\\\"id\\\":51,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■2: {\\\"id\\\":52,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■3: {\\\"id\\\":53,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■4: {\\\"id\\\":54,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■5: {\\\"id\\\":55,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":true}■\\n■6: {\\\"id\\\":56,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■7: {\\\"id\\\":57,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■8: {\\\"id\\\":58,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■9: {\\\"id\\\":59,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■10: {\\\"id\\\":60,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":true}■\\n■11: {\\\"id\\\":61,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■12: {\\\"id\\\":62,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■13: {\\\"id\\\":63,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■14: {\\\"id\\\":64,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■15: {\\\"id\\\":65,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":true}■\\n■16: {\\\"id\\\":66,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■17: {\\\"id\\\":67,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■18: {\\\"id\\\":68,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■19: {\\\"id\\\":69,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■20: {\\\"id\\\":70,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":true}■\\n■21: {\\\"id\\\":71,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■22: {\\\"id\\\":72,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■23: {\\\"id\\\":73,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■24: {\\\"id\\\":74,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**evenness**:\\n - very_bad (1): Rarely an even number.\\n - bad (2): Occasionally an even number.\\n - average (3): Even and odd numbers equal.\\n - good (4): Usually an even number.\\n - very_good (5): Always an even number.\\n\\n**divisibility**:\\n - very_bad (1): Rarely divisible by 5.\\n - bad (2): Occasionally divisible by 5.\\n - average (3): Divisible by 5 half the time.\\n - good (4): Usually divisible by 5.\\n - very_good (5): Always divisible by 5.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■24)\\n- Use exact criterion names: evenness, divisibility\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXofV2PaWHxqwOx2MjhLkIeJDtj","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:evenness=very_good;divisibility=very_good■ \n■1:evenness=very_bad;divisibility=very_bad■ \n■2:evenness=very_good;divisibility=very_bad■ \n■3:evenness=very_bad;divisibility=very_bad■ \n■4:evenness=very_good;divisibility=very_bad■ \n■5:evenness=very_bad;divisibility=very_good■ \n■6:evenness=very_good;divisibility=very_bad■ \n■7:evenness=very_bad;divisibility=very_bad■ \n■8:evenness=very_good;divisibility=very_bad■ \n■9:evenness=very_bad;divisibility=very_bad■ \n■10:evenness=very_good;divisibility=very_good■ \n■11:evenness=very_bad;divisibility=very_bad■ \n■12:evenness=very_good;divisibility=very_bad■ \n■13:evenness=very_bad;divisibility=very_bad■ \n■14:evenness=very_good;divisibility=very_bad■ \n■15:evenness=very_bad;divisibility=very_good■ \n■16:evenness=very_good;divisibility=very_bad■ \n■17:evenness=very_bad;divisibility=very_bad■ \n■18:evenness=very_good;divisibility=very_bad■ \n■19:evenness=very_bad;divisibility=very_bad■ \n■20:evenness=very_good;divisibility=very_good■ \n■21:evenness=very_bad;divisibility=very_bad■ \n■22:evenness=very_good;divisibility=very_bad■ \n■23:evenness=very_bad;divisibility=very_bad■ \n■24:evenness=very_good;divisibility=very_bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":905,"inputCost":0.0022625,"outputTokens":428,"outputCost":0.00428},"botpress":{"cost":0.0065425}},"meta":{"cached":false}}}
797
+ {"test":"zai.rate > large arrays > should handle chunking with record instructions-2","key":"26c4eab7","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■49):\\n■0: {\\\"id\\\":0,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":true}■\\n■1: {\\\"id\\\":1,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■2: {\\\"id\\\":2,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■3: {\\\"id\\\":3,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■4: {\\\"id\\\":4,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■5: {\\\"id\\\":5,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":true}■\\n■6: {\\\"id\\\":6,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■7: {\\\"id\\\":7,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■8: {\\\"id\\\":8,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■9: {\\\"id\\\":9,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■10: {\\\"id\\\":10,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":true}■\\n■11: {\\\"id\\\":11,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■12: {\\\"id\\\":12,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■13: {\\\"id\\\":13,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■14: {\\\"id\\\":14,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■15: {\\\"id\\\":15,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":true}■\\n■16: {\\\"id\\\":16,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■17: {\\\"id\\\":17,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■18: {\\\"id\\\":18,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■19: {\\\"id\\\":19,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■20: {\\\"id\\\":20,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":true}■\\n■21: {\\\"id\\\":21,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■22: {\\\"id\\\":22,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■23: {\\\"id\\\":23,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■24: {\\\"id\\\":24,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■25: {\\\"id\\\":25,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":true}■\\n■26: {\\\"id\\\":26,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■27: {\\\"id\\\":27,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■28: {\\\"id\\\":28,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■29: {\\\"id\\\":29,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■30: {\\\"id\\\":30,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":true}■\\n■31: {\\\"id\\\":31,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■32: {\\\"id\\\":32,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■33: {\\\"id\\\":33,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■34: {\\\"id\\\":34,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■35: {\\\"id\\\":35,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":true}■\\n■36: {\\\"id\\\":36,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■37: {\\\"id\\\":37,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■38: {\\\"id\\\":38,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■39: {\\\"id\\\":39,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■40: {\\\"id\\\":40,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":true}■\\n■41: {\\\"id\\\":41,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■42: {\\\"id\\\":42,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■43: {\\\"id\\\":43,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■44: {\\\"id\\\":44,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■45: {\\\"id\\\":45,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":true}■\\n■46: {\\\"id\\\":46,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■47: {\\\"id\\\":47,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n■48: {\\\"id\\\":48,\\\"isEven\\\":true,\\\"isDivisibleBy5\\\":false}■\\n■49: {\\\"id\\\":49,\\\"isEven\\\":false,\\\"isDivisibleBy5\\\":false}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**evenness**:\\n - very_bad (1): Rarely an even number.\\n - bad (2): Occasionally an even number.\\n - average (3): Even and odd numbers equal.\\n - good (4): Usually an even number.\\n - very_good (5): Always an even number.\\n\\n**divisibility**:\\n - very_bad (1): Rarely divisible by 5.\\n - bad (2): Occasionally divisible by 5.\\n - average (3): Divisible by 5 half the time.\\n - good (4): Usually divisible by 5.\\n - very_good (5): Always divisible by 5.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■49)\\n- Use exact criterion names: evenness, divisibility\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXohcZFFd6kMPFFMf9DJiDU2XPk","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:evenness=very_good;divisibility=very_good■ \n■1:evenness=very_bad;divisibility=very_bad■ \n■2:evenness=very_good;divisibility=very_bad■ \n■3:evenness=very_bad;divisibility=very_bad■ \n■4:evenness=very_good;divisibility=very_bad■ \n■5:evenness=very_bad;divisibility=very_good■ \n■6:evenness=very_good;divisibility=very_bad■ \n■7:evenness=very_bad;divisibility=very_bad■ \n■8:evenness=very_good;divisibility=very_bad■ \n■9:evenness=very_bad;divisibility=very_bad■ \n■10:evenness=very_good;divisibility=very_good■ \n■11:evenness=very_bad;divisibility=very_bad■ \n■12:evenness=very_good;divisibility=very_bad■ \n■13:evenness=very_bad;divisibility=very_bad■ \n■14:evenness=very_good;divisibility=very_bad■ \n■15:evenness=very_bad;divisibility=very_good■ \n■16:evenness=very_good;divisibility=very_bad■ \n■17:evenness=very_bad;divisibility=very_bad■ \n■18:evenness=very_good;divisibility=very_bad■ \n■19:evenness=very_bad;divisibility=very_bad■ \n■20:evenness=very_good;divisibility=very_good■ \n■21:evenness=very_bad;divisibility=very_bad■ \n■22:evenness=very_good;divisibility=very_bad■ \n■23:evenness=very_bad;divisibility=very_bad■ \n■24:evenness=very_good;divisibility=very_bad■ \n■25:evenness=very_bad;divisibility=very_good■ \n■26:evenness=very_good;divisibility=very_bad■ \n■27:evenness=very_bad;divisibility=very_bad■ \n■28:evenness=very_good;divisibility=very_bad■ \n■29:evenness=very_bad;divisibility=very_bad■ \n■30:evenness=very_good;divisibility=very_good■ \n■31:evenness=very_bad;divisibility=very_bad■ \n■32:evenness=very_good;divisibility=very_bad■ \n■33:evenness=very_bad;divisibility=very_bad■ \n■34:evenness=very_good;divisibility=very_bad■ \n■35:evenness=very_bad;divisibility=very_good■ \n■36:evenness=very_good;divisibility=very_bad■ \n■37:evenness=very_bad;divisibility=very_bad■ \n■38:evenness=very_good;divisibility=very_bad■ \n■39:evenness=very_bad;divisibility=very_bad■ \n■40:evenness=very_good;divisibility=very_good■ \n■41:evenness=very_bad;divisibility=very_bad■ \n■42:evenness=very_good;divisibility=very_bad■ \n■43:evenness=very_bad;divisibility=very_bad■ \n■44:evenness=very_good;divisibility=very_bad■ \n■45:evenness=very_bad;divisibility=very_good■ \n■46:evenness=very_good;divisibility=very_bad■ \n■47:evenness=very_bad;divisibility=very_bad■ \n■48:evenness=very_good;divisibility=very_bad■ \n■49:evenness=very_bad;divisibility=very_bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":1480,"inputCost":0.0037,"outputTokens":853,"outputCost":0.008530000000000001},"botpress":{"cost":0.012230000000000001}},"meta":{"cached":false}}}
798
+ {"test":"zai.rate > edge cases > should handle single item edge case-1","key":"5afefa89","input":"[{\"input\":{\"messages\":[{\"content\":\"Generate 3-5 evaluation criteria for: \\\"is this item complete?\\\"\\n\\nFor each criterion, provide 5 labels (very_bad, bad, average, good, very_good) with brief descriptions.\\n\\nOutput format (JSON):\\n{\\n \\\"criterion1_name\\\": {\\n \\\"very_bad\\\": \\\"description\\\",\\n \\\"bad\\\": \\\"description\\\",\\n \\\"average\\\": \\\"description\\\",\\n \\\"good\\\": \\\"description\\\",\\n \\\"very_good\\\": \\\"description\\\"\\n },\\n \\\"criterion2_name\\\": { ... }\\n}\\n\\nKeep criterion names short (1-2 words, lowercase, use underscores).\\nKeep descriptions brief (5-10 words each).\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating evaluation criteria for rating items on a 1-5 scale.\\nEach criterion must have exactly 5 labels: very_bad (1), bad (2), average (3), good (4), very_good (5).\\nOutput valid JSON only.\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXtrujx6CdwjUyNNFiLgYNAQVpm","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"```json\n{\n \"content_coverage\": {\n \"very_bad\": \"Missing most essential parts or sections.\",\n \"bad\": \"Significant gaps in key sections exist.\",\n \"average\": \"Most parts present but incomplete.\",\n \"good\": \"Minor areas lacking but largely complete.\",\n \"very_good\": \"All content is thorough and complete.\"\n },\n \"accuracy\": {\n \"very_bad\": \"Contains frequent and critical inaccuracies.\",\n \"bad\": \"Several notable inaccuracies are present.\",\n \"average\": \"Few inaccuracies but not significant.\",\n \"good\": \"Accurate with minor discrepancies.\",\n \"very_good\": \"Completely accurate and error-free.\"\n },\n \"clarity\": {\n \"very_bad\": \"Confusing and almost impossible to understand.\",\n \"bad\": \"Difficult to follow or understand.\",\n \"average\": \"Somewhat clear but needs improvement.\",\n \"good\": \"Mostly clear with minor confusion.\",\n \"very_good\": \"Crystal clear and easy to follow.\"\n },\n \"format_compliance\": {\n \"very_bad\": \"Does not follow required formatting at all.\",\n \"bad\": \"Major formatting issues affect readability.\",\n \"average\": \"Partially follows the required format.\",\n \"good\": \"Mostly adheres to the required format.\",\n \"very_good\": \"Completely follows the required format.\"\n }\n}\n```","index":0,"stopReason":"stop"}],"usage":{"inputTokens":190,"inputCost":0.00047500000000000005,"outputTokens":284,"outputCost":0.00284},"botpress":{"cost":0.0033150000000000002}},"meta":{"cached":false}}}
799
+ {"test":"zai.rate > edge cases > should handle single item edge case-2","key":"893c1fd8","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■0):\\n■0: {\\\"status\\\":\\\"complete\\\",\\\"fields\\\":10,\\\"validated\\\":true}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**content_coverage**:\\n - very_bad (1): Missing most essential parts or sections.\\n - bad (2): Significant gaps in key sections exist.\\n - average (3): Most parts present but incomplete.\\n - good (4): Minor areas lacking but largely complete.\\n - very_good (5): All content is thorough and complete.\\n\\n**accuracy**:\\n - very_bad (1): Contains frequent and critical inaccuracies.\\n - bad (2): Several notable inaccuracies are present.\\n - average (3): Few inaccuracies but not significant.\\n - good (4): Accurate with minor discrepancies.\\n - very_good (5): Completely accurate and error-free.\\n\\n**clarity**:\\n - very_bad (1): Confusing and almost impossible to understand.\\n - bad (2): Difficult to follow or understand.\\n - average (3): Somewhat clear but needs improvement.\\n - good (4): Mostly clear with minor confusion.\\n - very_good (5): Crystal clear and easy to follow.\\n\\n**format_compliance**:\\n - very_bad (1): Does not follow required formatting at all.\\n - bad (2): Major formatting issues affect readability.\\n - average (3): Partially follows the required format.\\n - good (4): Mostly adheres to the required format.\\n - very_good (5): Completely follows the required format.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■0)\\n- Use exact criterion names: content_coverage, accuracy, clarity, format_compliance\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXvLMFSddgIRnYjAWVbirfDLm5R","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:content_coverage=very_good;accuracy=very_good;clarity=very_good;format_compliance=very_good■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":499,"inputCost":0.0012475000000000001,"outputTokens":32,"outputCost":0.00032},"botpress":{"cost":0.0015675}},"meta":{"cached":false}}}
800
+ {"test":"zai.rate > edge cases > should handle items with clear binary outcomes-1","key":"55580f8e","input":"[{\"input\":{\"messages\":[{\"content\":\"Generate 3-5 evaluation criteria for: \\\"how healthy is this codebase status?\\\"\\n\\nFor each criterion, provide 5 labels (very_bad, bad, average, good, very_good) with brief descriptions.\\n\\nOutput format (JSON):\\n{\\n \\\"criterion1_name\\\": {\\n \\\"very_bad\\\": \\\"description\\\",\\n \\\"bad\\\": \\\"description\\\",\\n \\\"average\\\": \\\"description\\\",\\n \\\"good\\\": \\\"description\\\",\\n \\\"very_good\\\": \\\"description\\\"\\n },\\n \\\"criterion2_name\\\": { ... }\\n}\\n\\nKeep criterion names short (1-2 words, lowercase, use underscores).\\nKeep descriptions brief (5-10 words each).\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating evaluation criteria for rating items on a 1-5 scale.\\nEach criterion must have exactly 5 labels: very_bad (1), bad (2), average (3), good (4), very_good (5).\\nOutput valid JSON only.\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXXwSCD2Nai11Bkr58izenofTgnu","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"```json\n{\n \"code_quality\": {\n \"very_bad\": \"Riddled with bugs and broken logic.\",\n \"bad\": \"Frequent bugs; poor syntax and design.\",\n \"average\": \"Usable with some bugs and inconsistencies.\",\n \"good\": \"Few bugs; code is well-structured.\",\n \"very_good\": \"No bugs; clean and efficient design.\"\n },\n \"documentation\": {\n \"very_bad\": \"No documentation; no clarity at all.\",\n \"bad\": \"Minimal and outdated documentation.\",\n \"average\": \"Basic documentation; some unclear areas.\",\n \"good\": \"Clear and mostly up-to-date documentation.\",\n \"very_good\": \"Comprehensive and always up-to-date docs.\"\n },\n \"test_coverage\": {\n \"very_bad\": \"No tests or extremely poor test coverage.\",\n \"bad\": \"Few tests cover core functionality barely.\",\n \"average\": \"Moderate test coverage; some gaps remain.\",\n \"good\": \"Test coverage is high with rare gaps.\",\n \"very_good\": \"Extensive, robust, and complete test coverage.\"\n },\n \"dependency_health\": {\n \"very_bad\": \"Outdated and insecure dependencies used.\",\n \"bad\": \"Mostly outdated, some insecure dependencies.\",\n \"average\": \"Mixed: some current, some outdated dependencies.\",\n \"good\": \"Most dependencies are current and secure.\",\n \"very_good\": \"All dependencies are updated and safe.\"\n },\n \"scalability\": {\n \"very_bad\": \"Cannot handle increases in usage or data.\",\n \"bad\": \"Struggles with moderate increases in usage.\",\n \"average\": \"Handles growth, but needs optimization.\",\n \"good\": \"Supports moderate growth; easily adaptable.\",\n \"very_good\": \"Highly scalable for any growth or needs.\"\n }\n}\n```","index":0,"stopReason":"stop"}],"usage":{"inputTokens":193,"inputCost":0.0004825,"outputTokens":381,"outputCost":0.0038100000000000005},"botpress":{"cost":0.0042925}},"meta":{"cached":false}}}
801
+ {"test":"zai.rate > edge cases > should handle items with clear binary outcomes-2","key":"86bc3f35","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■1):\\n■0: {\\\"testsPassing\\\":true,\\\"buildSuccessful\\\":true,\\\"lintClean\\\":true,\\\"coverageAbove80\\\":true}■\\n■1: {\\\"testsPassing\\\":false,\\\"buildSuccessful\\\":false,\\\"lintClean\\\":false,\\\"coverageAbove80\\\":false}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**code_quality**:\\n - very_bad (1): Riddled with bugs and broken logic.\\n - bad (2): Frequent bugs; poor syntax and design.\\n - average (3): Usable with some bugs and inconsistencies.\\n - good (4): Few bugs; code is well-structured.\\n - very_good (5): No bugs; clean and efficient design.\\n\\n**documentation**:\\n - very_bad (1): No documentation; no clarity at all.\\n - bad (2): Minimal and outdated documentation.\\n - average (3): Basic documentation; some unclear areas.\\n - good (4): Clear and mostly up-to-date documentation.\\n - very_good (5): Comprehensive and always up-to-date docs.\\n\\n**test_coverage**:\\n - very_bad (1): No tests or extremely poor test coverage.\\n - bad (2): Few tests cover core functionality barely.\\n - average (3): Moderate test coverage; some gaps remain.\\n - good (4): Test coverage is high with rare gaps.\\n - very_good (5): Extensive, robust, and complete test coverage.\\n\\n**dependency_health**:\\n - very_bad (1): Outdated and insecure dependencies used.\\n - bad (2): Mostly outdated, some insecure dependencies.\\n - average (3): Mixed: some current, some outdated dependencies.\\n - good (4): Most dependencies are current and secure.\\n - very_good (5): All dependencies are updated and safe.\\n\\n**scalability**:\\n - very_bad (1): Cannot handle increases in usage or data.\\n - bad (2): Struggles with moderate increases in usage.\\n - average (3): Handles growth, but needs optimization.\\n - good (4): Supports moderate growth; easily adaptable.\\n - very_good (5): Highly scalable for any growth or needs.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■1)\\n- Use exact criterion names: code_quality, documentation, test_coverage, dependency_health, scalability\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXY1sX0Y8UI9JmtbAIemJDyucqLq","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:code_quality=very_good;documentation=average;test_coverage=very_good;dependency_health=average;scalability=good■ \n■1:code_quality=very_bad;documentation=very_bad;test_coverage=very_bad;dependency_health=very_bad;scalability=very_bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":637,"inputCost":0.0015925000000000002,"outputTokens":70,"outputCost":0.0007000000000000001},"botpress":{"cost":0.0022925000000000003}},"meta":{"cached":false}}}
802
+ {"test":"zai.rate > options > should respect tokensPerItem option-1","key":"80b07f63","input":"[{\"input\":{\"messages\":[{\"content\":\"Generate 3-5 evaluation criteria for: \\\"does this have a description?\\\"\\n\\nFor each criterion, provide 5 labels (very_bad, bad, average, good, very_good) with brief descriptions.\\n\\nOutput format (JSON):\\n{\\n \\\"criterion1_name\\\": {\\n \\\"very_bad\\\": \\\"description\\\",\\n \\\"bad\\\": \\\"description\\\",\\n \\\"average\\\": \\\"description\\\",\\n \\\"good\\\": \\\"description\\\",\\n \\\"very_good\\\": \\\"description\\\"\\n },\\n \\\"criterion2_name\\\": { ... }\\n}\\n\\nKeep criterion names short (1-2 words, lowercase, use underscores).\\nKeep descriptions brief (5-10 words each).\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating evaluation criteria for rating items on a 1-5 scale.\\nEach criterion must have exactly 5 labels: very_bad (1), bad (2), average (3), good (4), very_good (5).\\nOutput valid JSON only.\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXY2YQsAZ23xQOn9XkqQ0D0sd3eT","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"```json\n{\n \"clarity\": {\n \"very_bad\": \"Description is missing or unreadable.\",\n \"bad\": \"Description is unclear and confusing.\",\n \"average\": \"Description is somewhat clear but vague.\",\n \"good\": \"Description is mostly clear and understandable.\",\n \"very_good\": \"Description is fully clear and precise.\"\n },\n \"detail\": {\n \"very_bad\": \"No meaningful details are included.\",\n \"bad\": \"Minimal details; very incomplete.\",\n \"average\": \"Some important details are provided.\",\n \"good\": \"Most relevant details are present.\",\n \"very_good\": \"Description has all necessary details.\"\n },\n \"relevance\": {\n \"very_bad\": \"Description is irrelevant to the item.\",\n \"bad\": \"Description is mostly off-topic or unrelated.\",\n \"average\": \"Description is partly relevant but incomplete.\",\n \"good\": \"Description is mostly relevant and on-topic.\",\n \"very_good\": \"Description aligns perfectly with the item.\"\n },\n \"length\": {\n \"very_bad\": \"Description is nonexistent or extremely short.\",\n \"bad\": \"Description is too short to be useful.\",\n \"average\": \"Description is of acceptable length.\",\n \"good\": \"Description is adequately detailed and sized.\",\n \"very_good\": \"Description is perfect in length and depth.\"\n },\n \"grammar\": {\n \"very_bad\": \"Description is riddled with errors.\",\n \"bad\": \"Description has frequent grammatical mistakes.\",\n \"average\": \"Description has minor grammatical issues.\",\n \"good\": \"Description is grammatically correct.\",\n \"very_good\": \"Description is flawless and well-written.\"\n }\n}\n```","index":0,"stopReason":"stop"}],"usage":{"inputTokens":191,"inputCost":0.00047750000000000006,"outputTokens":347,"outputCost":0.0034700000000000004},"botpress":{"cost":0.0039475000000000005}},"meta":{"cached":false}}}
803
+ {"test":"zai.rate > options > should respect tokensPerItem option-2","key":"a3d17791","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■0):\\n■0: {\\\"description\\\":\\\"AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**clarity**:\\n - very_bad (1): Description is missing or unreadable.\\n - bad (2): Description is unclear and confusing.\\n - average (3): Description is somewhat clear but vague.\\n - good (4): Description is mostly clear and understandable.\\n - very_good (5): Description is fully clear and precise.\\n\\n**detail**:\\n - very_bad (1): No meaningful details are included.\\n - bad (2): Minimal details; very incomplete.\\n - average (3): Some important details are provided.\\n - good (4): Most relevant details are present.\\n - very_good (5): Description has all necessary details.\\n\\n**relevance**:\\n - very_bad (1): Description is irrelevant to the item.\\n - bad (2): Description is mostly off-topic or unrelated.\\n - average (3): Description is partly relevant but incomplete.\\n - good (4): Description is mostly relevant and on-topic.\\n - very_good (5): Description aligns perfectly with the item.\\n\\n**length**:\\n - very_bad (1): Description is nonexistent or extremely short.\\n - bad (2): Description is too short to be useful.\\n - average (3): Description is of acceptable length.\\n - good (4): Description is adequately detailed and sized.\\n - very_good (5): Description is perfect in length and depth.\\n\\n**grammar**:\\n - very_bad (1): Description is riddled with errors.\\n - bad (2): Description has frequent grammatical mistakes.\\n - average (3): Description has minor grammatical issues.\\n - good (4): Description is grammatically correct.\\n - very_good (5): Description is flawless and well-written.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■0)\\n- Use exact criterion names: clarity, detail, relevance, length, grammar\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXY5RXpqDEw0VV4prr5YsbpcHnTA","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:clarity=very_bad;detail=very_bad;relevance=very_bad;length=very_bad;grammar=very_bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":604,"inputCost":0.00151,"outputTokens":34,"outputCost":0.00034},"botpress":{"cost":0.00185}},"meta":{"cached":false}}}
804
+ {"test":"zai.rate > options > should respect maxItemsPerChunk option-1","key":"2fef57ae","input":"[{\"input\":{\"messages\":[{\"content\":\"Generate 3-5 evaluation criteria for: \\\"is the id field present?\\\"\\n\\nFor each criterion, provide 5 labels (very_bad, bad, average, good, very_good) with brief descriptions.\\n\\nOutput format (JSON):\\n{\\n \\\"criterion1_name\\\": {\\n \\\"very_bad\\\": \\\"description\\\",\\n \\\"bad\\\": \\\"description\\\",\\n \\\"average\\\": \\\"description\\\",\\n \\\"good\\\": \\\"description\\\",\\n \\\"very_good\\\": \\\"description\\\"\\n },\\n \\\"criterion2_name\\\": { ... }\\n}\\n\\nKeep criterion names short (1-2 words, lowercase, use underscores).\\nKeep descriptions brief (5-10 words each).\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating evaluation criteria for rating items on a 1-5 scale.\\nEach criterion must have exactly 5 labels: very_bad (1), bad (2), average (3), good (4), very_good (5).\\nOutput valid JSON only.\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXY6c653NH5rPw96AKRt453IYH7i","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"```json\n{\n \"field_existence\": {\n \"very_bad\": \"id field is always missing.\",\n \"bad\": \"id field is mostly missing.\",\n \"average\": \"id field is present half the time.\",\n \"good\": \"id field is mostly present.\",\n \"very_good\": \"id field is always present.\"\n },\n \"field_consistency\": {\n \"very_bad\": \"id field presence is highly inconsistent.\",\n \"bad\": \"id field presence is often inconsistent.\",\n \"average\": \"id field presence is somewhat consistent.\",\n \"good\": \"id field presence is mostly consistent.\",\n \"very_good\": \"id field presence is completely consistent.\"\n },\n \"field_format\": {\n \"very_bad\": \"id field has severe formatting issues.\",\n \"bad\": \"id field often has formatting issues.\",\n \"average\": \"id field sometimes has formatting issues.\",\n \"good\": \"id field rarely has formatting issues.\",\n \"very_good\": \"id field is correctly formatted.\"\n }\n}\n```","index":0,"stopReason":"stop"}],"usage":{"inputTokens":191,"inputCost":0.00047750000000000006,"outputTokens":214,"outputCost":0.00214},"botpress":{"cost":0.0026175}},"meta":{"cached":false}}}
805
+ {"test":"zai.rate > options > should respect maxItemsPerChunk option-5","key":"4a808d24","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■0):\\n■0: {\\\"id\\\":9,\\\"isPresent\\\":true}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**field_existence**:\\n - very_bad (1): id field is always missing.\\n - bad (2): id field is mostly missing.\\n - average (3): id field is present half the time.\\n - good (4): id field is mostly present.\\n - very_good (5): id field is always present.\\n\\n**field_consistency**:\\n - very_bad (1): id field presence is highly inconsistent.\\n - bad (2): id field presence is often inconsistent.\\n - average (3): id field presence is somewhat consistent.\\n - good (4): id field presence is mostly consistent.\\n - very_good (5): id field presence is completely consistent.\\n\\n**field_format**:\\n - very_bad (1): id field has severe formatting issues.\\n - bad (2): id field often has formatting issues.\\n - average (3): id field sometimes has formatting issues.\\n - good (4): id field rarely has formatting issues.\\n - very_good (5): id field is correctly formatted.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■0)\\n- Use exact criterion names: field_existence, field_consistency, field_format\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXY8MZjEgbUIz2ONGB1YGPyRiEdA","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:field_existence=very_good;field_consistency=very_good;field_format=very_good■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":428,"inputCost":0.00107,"outputTokens":27,"outputCost":0.00027},"botpress":{"cost":0.00134}},"meta":{"cached":false}}}
806
+ {"test":"zai.rate > options > should respect maxItemsPerChunk option-4","key":"5ce6999d","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■2):\\n■0: {\\\"id\\\":6,\\\"isPresent\\\":true}■\\n■1: {\\\"id\\\":7,\\\"isPresent\\\":true}■\\n■2: {\\\"id\\\":8,\\\"isPresent\\\":true}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**field_existence**:\\n - very_bad (1): id field is always missing.\\n - bad (2): id field is mostly missing.\\n - average (3): id field is present half the time.\\n - good (4): id field is mostly present.\\n - very_good (5): id field is always present.\\n\\n**field_consistency**:\\n - very_bad (1): id field presence is highly inconsistent.\\n - bad (2): id field presence is often inconsistent.\\n - average (3): id field presence is somewhat consistent.\\n - good (4): id field presence is mostly consistent.\\n - very_good (5): id field presence is completely consistent.\\n\\n**field_format**:\\n - very_bad (1): id field has severe formatting issues.\\n - bad (2): id field often has formatting issues.\\n - average (3): id field sometimes has formatting issues.\\n - good (4): id field rarely has formatting issues.\\n - very_good (5): id field is correctly formatted.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■2)\\n- Use exact criterion names: field_existence, field_consistency, field_format\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXY8khy792FuY5Mb7T7Gx7OuVOJm","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:field_existence=very_good;field_consistency=very_good;field_format=very_good■ \n■1:field_existence=very_good;field_consistency=very_good;field_format=very_good■ \n■2:field_existence=very_good;field_consistency=very_good;field_format=very_good■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":458,"inputCost":0.0011450000000000002,"outputTokens":75,"outputCost":0.00075},"botpress":{"cost":0.0018950000000000002}},"meta":{"cached":false}}}
807
+ {"test":"zai.rate > options > should respect maxItemsPerChunk option-2","key":"3be7318b","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■2):\\n■0: {\\\"id\\\":0,\\\"isPresent\\\":true}■\\n■1: {\\\"id\\\":1,\\\"isPresent\\\":true}■\\n■2: {\\\"id\\\":2,\\\"isPresent\\\":true}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**field_existence**:\\n - very_bad (1): id field is always missing.\\n - bad (2): id field is mostly missing.\\n - average (3): id field is present half the time.\\n - good (4): id field is mostly present.\\n - very_good (5): id field is always present.\\n\\n**field_consistency**:\\n - very_bad (1): id field presence is highly inconsistent.\\n - bad (2): id field presence is often inconsistent.\\n - average (3): id field presence is somewhat consistent.\\n - good (4): id field presence is mostly consistent.\\n - very_good (5): id field presence is completely consistent.\\n\\n**field_format**:\\n - very_bad (1): id field has severe formatting issues.\\n - bad (2): id field often has formatting issues.\\n - average (3): id field sometimes has formatting issues.\\n - good (4): id field rarely has formatting issues.\\n - very_good (5): id field is correctly formatted.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■2)\\n- Use exact criterion names: field_existence, field_consistency, field_format\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXY8YkpjL47v3OcKm1CKXMuMPlwr","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:field_existence=very_good;field_consistency=very_good;field_format=very_good■ \n■1:field_existence=very_good;field_consistency=very_good;field_format=very_good■ \n■2:field_existence=very_good;field_consistency=very_good;field_format=very_good■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":458,"inputCost":0.0011450000000000002,"outputTokens":75,"outputCost":0.00075},"botpress":{"cost":0.0018950000000000002}},"meta":{"cached":false}}}
808
+ {"test":"zai.rate > options > should respect maxItemsPerChunk option-3","key":"cc66e594","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■2):\\n■0: {\\\"id\\\":3,\\\"isPresent\\\":true}■\\n■1: {\\\"id\\\":4,\\\"isPresent\\\":true}■\\n■2: {\\\"id\\\":5,\\\"isPresent\\\":true}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**field_existence**:\\n - very_bad (1): id field is always missing.\\n - bad (2): id field is mostly missing.\\n - average (3): id field is present half the time.\\n - good (4): id field is mostly present.\\n - very_good (5): id field is always present.\\n\\n**field_consistency**:\\n - very_bad (1): id field presence is highly inconsistent.\\n - bad (2): id field presence is often inconsistent.\\n - average (3): id field presence is somewhat consistent.\\n - good (4): id field presence is mostly consistent.\\n - very_good (5): id field presence is completely consistent.\\n\\n**field_format**:\\n - very_bad (1): id field has severe formatting issues.\\n - bad (2): id field often has formatting issues.\\n - average (3): id field sometimes has formatting issues.\\n - good (4): id field rarely has formatting issues.\\n - very_good (5): id field is correctly formatted.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■2)\\n- Use exact criterion names: field_existence, field_consistency, field_format\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXY8fri3hQppEkYKF5xnfVd6DU9m","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:field_existence=very_good;field_consistency=very_good;field_format=very_good■ \n■1:field_existence=very_good;field_consistency=very_good;field_format=very_good■ \n■2:field_existence=very_good;field_consistency=very_good;field_format=very_good■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":458,"inputCost":0.0011450000000000002,"outputTokens":75,"outputCost":0.00075},"botpress":{"cost":0.0018950000000000002}},"meta":{"cached":false}}}
809
+ {"test":"zai.rate > progress tracking > should emit progress events-1","key":"d73ed527","input":"[{\"input\":{\"messages\":[{\"content\":\"Generate 3-5 evaluation criteria for: \\\"rate these\\\"\\n\\nFor each criterion, provide 5 labels (very_bad, bad, average, good, very_good) with brief descriptions.\\n\\nOutput format (JSON):\\n{\\n \\\"criterion1_name\\\": {\\n \\\"very_bad\\\": \\\"description\\\",\\n \\\"bad\\\": \\\"description\\\",\\n \\\"average\\\": \\\"description\\\",\\n \\\"good\\\": \\\"description\\\",\\n \\\"very_good\\\": \\\"description\\\"\\n },\\n \\\"criterion2_name\\\": { ... }\\n}\\n\\nKeep criterion names short (1-2 words, lowercase, use underscores).\\nKeep descriptions brief (5-10 words each).\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating evaluation criteria for rating items on a 1-5 scale.\\nEach criterion must have exactly 5 labels: very_bad (1), bad (2), average (3), good (4), very_good (5).\\nOutput valid JSON only.\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXY9bMn6UneGL9DvzbisTVryvjnM","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"```json\n{\n \"quality\": {\n \"very_bad\": \"Extremely poor craftsmanship or materials.\",\n \"bad\": \"Below average with noticeable flaws.\",\n \"average\": \"Decent but unremarkable quality.\",\n \"good\": \"Well-made with minor imperfections.\",\n \"very_good\": \"Exceptional quality and attention to detail.\"\n },\n \"design\": {\n \"very_bad\": \"Unappealing or poorly thought-out.\",\n \"bad\": \"Could use significant improvements.\",\n \"average\": \"Functional but not eye-catching.\",\n \"good\": \"Attractive and well-executed design.\",\n \"very_good\": \"Beautiful and highly creative design.\"\n },\n \"usability\": {\n \"very_bad\": \"Confusing or impossible to use.\",\n \"bad\": \"Challenging and inefficient functionality.\",\n \"average\": \"Usable but somewhat inconvenient.\",\n \"good\": \"Easy and straightforward to use.\",\n \"very_good\": \"Highly intuitive and effortless to use.\"\n },\n \"value\": {\n \"very_bad\": \"Not worth the price at all.\",\n \"bad\": \"Overpriced for what it offers.\",\n \"average\": \"Fair value for the cost.\",\n \"good\": \"Good deal for the quality provided.\",\n \"very_good\": \"Excellent value, exceeds expectations.\"\n },\n \"durability\": {\n \"very_bad\": \"Breaks or deteriorates instantly.\",\n \"bad\": \"Short lifespan or unreliable.\",\n \"average\": \"Moderate durability, average use lifespan.\",\n \"good\": \"Sturdy and lasts a long time.\",\n \"very_good\": \"Extremely durable under any condition.\"\n }\n}\n```","index":0,"stopReason":"stop"}],"usage":{"inputTokens":188,"inputCost":0.00047000000000000004,"outputTokens":347,"outputCost":0.0034700000000000004},"botpress":{"cost":0.003940000000000001}},"meta":{"cached":false}}}
810
+ {"test":"zai.rate > progress tracking > should emit progress events-2","key":"98aacd15","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■19):\\n■0: {\\\"id\\\":0,\\\"value\\\":0}■\\n■1: {\\\"id\\\":1,\\\"value\\\":1}■\\n■2: {\\\"id\\\":2,\\\"value\\\":2}■\\n■3: {\\\"id\\\":3,\\\"value\\\":3}■\\n■4: {\\\"id\\\":4,\\\"value\\\":4}■\\n■5: {\\\"id\\\":5,\\\"value\\\":5}■\\n■6: {\\\"id\\\":6,\\\"value\\\":6}■\\n■7: {\\\"id\\\":7,\\\"value\\\":7}■\\n■8: {\\\"id\\\":8,\\\"value\\\":8}■\\n■9: {\\\"id\\\":9,\\\"value\\\":9}■\\n■10: {\\\"id\\\":10,\\\"value\\\":10}■\\n■11: {\\\"id\\\":11,\\\"value\\\":11}■\\n■12: {\\\"id\\\":12,\\\"value\\\":12}■\\n■13: {\\\"id\\\":13,\\\"value\\\":13}■\\n■14: {\\\"id\\\":14,\\\"value\\\":14}■\\n■15: {\\\"id\\\":15,\\\"value\\\":15}■\\n■16: {\\\"id\\\":16,\\\"value\\\":16}■\\n■17: {\\\"id\\\":17,\\\"value\\\":17}■\\n■18: {\\\"id\\\":18,\\\"value\\\":18}■\\n■19: {\\\"id\\\":19,\\\"value\\\":19}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**quality**:\\n - very_bad (1): Extremely poor craftsmanship or materials.\\n - bad (2): Below average with noticeable flaws.\\n - average (3): Decent but unremarkable quality.\\n - good (4): Well-made with minor imperfections.\\n - very_good (5): Exceptional quality and attention to detail.\\n\\n**design**:\\n - very_bad (1): Unappealing or poorly thought-out.\\n - bad (2): Could use significant improvements.\\n - average (3): Functional but not eye-catching.\\n - good (4): Attractive and well-executed design.\\n - very_good (5): Beautiful and highly creative design.\\n\\n**usability**:\\n - very_bad (1): Confusing or impossible to use.\\n - bad (2): Challenging and inefficient functionality.\\n - average (3): Usable but somewhat inconvenient.\\n - good (4): Easy and straightforward to use.\\n - very_good (5): Highly intuitive and effortless to use.\\n\\n**value**:\\n - very_bad (1): Not worth the price at all.\\n - bad (2): Overpriced for what it offers.\\n - average (3): Fair value for the cost.\\n - good (4): Good deal for the quality provided.\\n - very_good (5): Excellent value, exceeds expectations.\\n\\n**durability**:\\n - very_bad (1): Breaks or deteriorates instantly.\\n - bad (2): Short lifespan or unreliable.\\n - average (3): Moderate durability, average use lifespan.\\n - good (4): Sturdy and lasts a long time.\\n - very_good (5): Extremely durable under any condition.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■19)\\n- Use exact criterion names: quality, design, usability, value, durability\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXYD7sFjK51mZ2yWU7Ae6WXsxazc","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:quality=very_bad;design=very_bad;usability=very_bad;value=very_bad;durability=very_bad■ \n■1:quality=very_bad;design=very_bad;usability=very_bad;value=very_bad;durability=very_bad■ \n■2:quality=bad;design=bad;usability=bad;value=bad;durability=bad■ \n■3:quality=bad;design=bad;usability=bad;value=bad;durability=bad■ \n■4:quality=average;design=average;usability=average;value=average;durability=average■ \n■5:quality=average;design=average;usability=average;value=average;durability=average■ \n■6:quality=good;design=good;usability=good;value=good;durability=good■ \n■7:quality=good;design=good;usability=good;value=good;durability=good■ \n■8:quality=very_good;design=very_good;usability=very_good;value=very_good;durability=very_good■ \n■9:quality=very_good;design=very_good;usability=very_good;value=very_good;durability=very_good■ \n■10:quality=very_good;design=very_good;usability=very_good;value=very_good;durability=very_good■ \n■11:quality=very_good;design=very_good;usability=very_good;value=very_good;durability=very_good■ \n■12:quality=very_good;design=very_good;usability=very_good;value=very_good;durability=very_good■ \n■13:quality=very_good;design=very_good;usability=very_good;value=very_good;durability=very_good■ \n■14:quality=very_good;design=very_good;usability=very_good;value=very_good;durability=very_good■ \n■15:quality=very_good;design=very_good;usability=very_good;value=very_good;durability=very_good■ \n■16:quality=very_good;design=very_good;usability=very_good;value=very_good;durability=very_good■ \n■17:quality=very_good;design=very_good;usability=very_good;value=very_good;durability=very_good■ \n■18:quality=very_good;design=very_good;usability=very_good;value=very_good;durability=very_good■ \n■19:quality=very_good;design=very_good;usability=very_good;value=very_good;durability=very_good■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":824,"inputCost":0.00206,"outputTokens":593,"outputCost":0.00593},"botpress":{"cost":0.00799}},"meta":{"cached":false}}}
811
+ {"test":"zai.rate > abort signal > should support abort signal-3","key":"cb234f81","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■49):\\n■0: {\\\"id\\\":50}■\\n■1: {\\\"id\\\":51}■\\n■2: {\\\"id\\\":52}■\\n■3: {\\\"id\\\":53}■\\n■4: {\\\"id\\\":54}■\\n■5: {\\\"id\\\":55}■\\n■6: {\\\"id\\\":56}■\\n■7: {\\\"id\\\":57}■\\n■8: {\\\"id\\\":58}■\\n■9: {\\\"id\\\":59}■\\n■10: {\\\"id\\\":60}■\\n■11: {\\\"id\\\":61}■\\n■12: {\\\"id\\\":62}■\\n■13: {\\\"id\\\":63}■\\n■14: {\\\"id\\\":64}■\\n■15: {\\\"id\\\":65}■\\n■16: {\\\"id\\\":66}■\\n■17: {\\\"id\\\":67}■\\n■18: {\\\"id\\\":68}■\\n■19: {\\\"id\\\":69}■\\n■20: {\\\"id\\\":70}■\\n■21: {\\\"id\\\":71}■\\n■22: {\\\"id\\\":72}■\\n■23: {\\\"id\\\":73}■\\n■24: {\\\"id\\\":74}■\\n■25: {\\\"id\\\":75}■\\n■26: {\\\"id\\\":76}■\\n■27: {\\\"id\\\":77}■\\n■28: {\\\"id\\\":78}■\\n■29: {\\\"id\\\":79}■\\n■30: {\\\"id\\\":80}■\\n■31: {\\\"id\\\":81}■\\n■32: {\\\"id\\\":82}■\\n■33: {\\\"id\\\":83}■\\n■34: {\\\"id\\\":84}■\\n■35: {\\\"id\\\":85}■\\n■36: {\\\"id\\\":86}■\\n■37: {\\\"id\\\":87}■\\n■38: {\\\"id\\\":88}■\\n■39: {\\\"id\\\":89}■\\n■40: {\\\"id\\\":90}■\\n■41: {\\\"id\\\":91}■\\n■42: {\\\"id\\\":92}■\\n■43: {\\\"id\\\":93}■\\n■44: {\\\"id\\\":94}■\\n■45: {\\\"id\\\":95}■\\n■46: {\\\"id\\\":96}■\\n■47: {\\\"id\\\":97}■\\n■48: {\\\"id\\\":98}■\\n■49: {\\\"id\\\":99}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**quality**:\\n - very_bad (1): Extremely poor craftsmanship or materials.\\n - bad (2): Below average with noticeable flaws.\\n - average (3): Decent but unremarkable quality.\\n - good (4): Well-made with minor imperfections.\\n - very_good (5): Exceptional quality and attention to detail.\\n\\n**design**:\\n - very_bad (1): Unappealing or poorly thought-out.\\n - bad (2): Could use significant improvements.\\n - average (3): Functional but not eye-catching.\\n - good (4): Attractive and well-executed design.\\n - very_good (5): Beautiful and highly creative design.\\n\\n**usability**:\\n - very_bad (1): Confusing or impossible to use.\\n - bad (2): Challenging and inefficient functionality.\\n - average (3): Usable but somewhat inconvenient.\\n - good (4): Easy and straightforward to use.\\n - very_good (5): Highly intuitive and effortless to use.\\n\\n**value**:\\n - very_bad (1): Not worth the price at all.\\n - bad (2): Overpriced for what it offers.\\n - average (3): Fair value for the cost.\\n - good (4): Good deal for the quality provided.\\n - very_good (5): Excellent value, exceeds expectations.\\n\\n**durability**:\\n - very_bad (1): Breaks or deteriorates instantly.\\n - bad (2): Short lifespan or unreliable.\\n - average (3): Moderate durability, average use lifespan.\\n - good (4): Sturdy and lasts a long time.\\n - very_good (5): Extremely durable under any condition.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■49)\\n- Use exact criterion names: quality, design, usability, value, durability\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXYG0soZ4abv36hQY7GsPe0NDLXh","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:quality=average;design=average;usability=average;value=average;durability=average■ \n■1:quality=average;design=average;usability=average;value=average;durability=average■ \n■2:quality=average;design=average;usability=average;value=average;durability=average■ \n■3:quality=average;design=average;usability=average;value=average;durability=average■ \n■4:quality=average;design=average;usability=average;value=average;durability=average■ \n■5:quality=average;design=average;usability=average;value=average;durability=average■ \n■6:quality=average;design=average;usability=average;value=average;durability=average■ \n■7:quality=average;design=average;usability=average;value=average;durability=average■ \n■8:quality=average;design=average;usability=average;value=average;durability=average■ \n■9:quality=average;design=average;usability=average;value=average;durability=average■ \n■10:quality=average;design=average;usability=average;value=average;durability=average■ \n■11:quality=average;design=average;usability=average;value=average;durability=average■ \n■12:quality=average;design=average;usability=average;value=average;durability=average■ \n■13:quality=average;design=average;usability=average;value=average;durability=average■ \n■14:quality=average;design=average;usability=average;value=average;durability=average■ \n■15:quality=average;design=average;usability=average;value=average;durability=average■ \n■16:quality=average;design=average;usability=average;value=average;durability=average■ \n■17:quality=average;design=average;usability=average;value=average;durability=average■ \n■18:quality=average;design=average;usability=average;value=average;durability=average■ \n■19:quality=average;design=average;usability=average;value=average;durability=average■ \n■20:quality=average;design=average;usability=average;value=average;durability=average■ \n■21:quality=average;design=average;usability=average;value=average;durability=average■ \n■22:quality=average;design=average;usability=average;value=average;durability=average■ \n■23:quality=average;design=average;usability=average;value=average;durability=average■ \n■24:quality=average;design=average;usability=average;value=average;durability=average■ \n■25:quality=average;design=average;usability=average;value=average;durability=average■ \n■26:quality=average;design=average;usability=average;value=average;durability=average■ \n■27:quality=average;design=average;usability=average;value=average;durability=average■ \n■28:quality=average;design=average;usability=average;value=average;durability=average■ \n■29:quality=average;design=average;usability=average;value=average;durability=average■ \n■30:quality=average;design=average;usability=average;value=average;durability=average■ \n■31:quality=average;design=average;usability=average;value=average;durability=average■ \n■32:quality=average;design=average;usability=average;value=average;durability=average■ \n■33:quality=average;design=average;usability=average;value=average;durability=average■ \n■34:quality=average;design=average;usability=average;value=average;durability=average■ \n■35:quality=average;design=average;usability=average;value=average;durability=average■ \n■36:quality=average;design=average;usability=average;value=average;durability=average■ \n■37:quality=average;design=average;usability=average;value=average;durability=average■ \n■38:quality=average;design=average;usability=average;value=average;durability=average■ \n■39:quality=average;design=average;usability=average;value=average;durability=average■ \n■40:quality=average;design=average;usability=average;value=average;durability=average■ \n■41:quality=average;design=average;usability=average;value=average;durability=average■ \n■42:quality=average;design=average;usability=average;value=average;durability=average■ \n■43:quality=average;design=average;usability=average;value=average;durability=average■ \n■44:quality=average;design=average;usability=average;value=average;durability=average■ \n■45:quality=average;design=average;usability=average;value=average;durability=average■ \n■46:quality=average;design=average;usability=average;value=average;durability=average■ \n■47:quality=average;design=average;usability=average;value=average;durability=average■ \n■48:quality=average;design=average;usability=average;value=average;durability=average■ \n■49:quality=average;design=average;usability=average;value=average;durability=average■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":1044,"inputCost":0.0026100000000000003,"outputTokens":1303,"outputCost":0.013030000000000002},"botpress":{"cost":0.01564}},"meta":{"cached":false}}}
812
+ {"test":"zai.rate > abort signal > should support abort signal-2","key":"63781747","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■49):\\n■0: {\\\"id\\\":0}■\\n■1: {\\\"id\\\":1}■\\n■2: {\\\"id\\\":2}■\\n■3: {\\\"id\\\":3}■\\n■4: {\\\"id\\\":4}■\\n■5: {\\\"id\\\":5}■\\n■6: {\\\"id\\\":6}■\\n■7: {\\\"id\\\":7}■\\n■8: {\\\"id\\\":8}■\\n■9: {\\\"id\\\":9}■\\n■10: {\\\"id\\\":10}■\\n■11: {\\\"id\\\":11}■\\n■12: {\\\"id\\\":12}■\\n■13: {\\\"id\\\":13}■\\n■14: {\\\"id\\\":14}■\\n■15: {\\\"id\\\":15}■\\n■16: {\\\"id\\\":16}■\\n■17: {\\\"id\\\":17}■\\n■18: {\\\"id\\\":18}■\\n■19: {\\\"id\\\":19}■\\n■20: {\\\"id\\\":20}■\\n■21: {\\\"id\\\":21}■\\n■22: {\\\"id\\\":22}■\\n■23: {\\\"id\\\":23}■\\n■24: {\\\"id\\\":24}■\\n■25: {\\\"id\\\":25}■\\n■26: {\\\"id\\\":26}■\\n■27: {\\\"id\\\":27}■\\n■28: {\\\"id\\\":28}■\\n■29: {\\\"id\\\":29}■\\n■30: {\\\"id\\\":30}■\\n■31: {\\\"id\\\":31}■\\n■32: {\\\"id\\\":32}■\\n■33: {\\\"id\\\":33}■\\n■34: {\\\"id\\\":34}■\\n■35: {\\\"id\\\":35}■\\n■36: {\\\"id\\\":36}■\\n■37: {\\\"id\\\":37}■\\n■38: {\\\"id\\\":38}■\\n■39: {\\\"id\\\":39}■\\n■40: {\\\"id\\\":40}■\\n■41: {\\\"id\\\":41}■\\n■42: {\\\"id\\\":42}■\\n■43: {\\\"id\\\":43}■\\n■44: {\\\"id\\\":44}■\\n■45: {\\\"id\\\":45}■\\n■46: {\\\"id\\\":46}■\\n■47: {\\\"id\\\":47}■\\n■48: {\\\"id\\\":48}■\\n■49: {\\\"id\\\":49}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**quality**:\\n - very_bad (1): Extremely poor craftsmanship or materials.\\n - bad (2): Below average with noticeable flaws.\\n - average (3): Decent but unremarkable quality.\\n - good (4): Well-made with minor imperfections.\\n - very_good (5): Exceptional quality and attention to detail.\\n\\n**design**:\\n - very_bad (1): Unappealing or poorly thought-out.\\n - bad (2): Could use significant improvements.\\n - average (3): Functional but not eye-catching.\\n - good (4): Attractive and well-executed design.\\n - very_good (5): Beautiful and highly creative design.\\n\\n**usability**:\\n - very_bad (1): Confusing or impossible to use.\\n - bad (2): Challenging and inefficient functionality.\\n - average (3): Usable but somewhat inconvenient.\\n - good (4): Easy and straightforward to use.\\n - very_good (5): Highly intuitive and effortless to use.\\n\\n**value**:\\n - very_bad (1): Not worth the price at all.\\n - bad (2): Overpriced for what it offers.\\n - average (3): Fair value for the cost.\\n - good (4): Good deal for the quality provided.\\n - very_good (5): Excellent value, exceeds expectations.\\n\\n**durability**:\\n - very_bad (1): Breaks or deteriorates instantly.\\n - bad (2): Short lifespan or unreliable.\\n - average (3): Moderate durability, average use lifespan.\\n - good (4): Sturdy and lasts a long time.\\n - very_good (5): Extremely durable under any condition.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■49)\\n- Use exact criterion names: quality, design, usability, value, durability\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXYGikZKmt60RvVd8rJ45wIqOBpY","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:quality=average;design=average;usability=average;value=average;durability=average■ \n■1:quality=average;design=average;usability=average;value=average;durability=average■ \n■2:quality=average;design=average;usability=average;value=average;durability=average■ \n■3:quality=average;design=average;usability=average;value=average;durability=average■ \n■4:quality=average;design=average;usability=average;value=average;durability=average■ \n■5:quality=average;design=average;usability=average;value=average;durability=average■ \n■6:quality=average;design=average;usability=average;value=average;durability=average■ \n■7:quality=average;design=average;usability=average;value=average;durability=average■ \n■8:quality=average;design=average;usability=average;value=average;durability=average■ \n■9:quality=average;design=average;usability=average;value=average;durability=average■ \n■10:quality=average;design=average;usability=average;value=average;durability=average■ \n■11:quality=average;design=average;usability=average;value=average;durability=average■ \n■12:quality=average;design=average;usability=average;value=average;durability=average■ \n■13:quality=average;design=average;usability=average;value=average;durability=average■ \n■14:quality=average;design=average;usability=average;value=average;durability=average■ \n■15:quality=average;design=average;usability=average;value=average;durability=average■ \n■16:quality=average;design=average;usability=average;value=average;durability=average■ \n■17:quality=average;design=average;usability=average;value=average;durability=average■ \n■18:quality=average;design=average;usability=average;value=average;durability=average■ \n■19:quality=average;design=average;usability=average;value=average;durability=average■ \n■20:quality=average;design=average;usability=average;value=average;durability=average■ \n■21:quality=average;design=average;usability=average;value=average;durability=average■ \n■22:quality=average;design=average;usability=average;value=average;durability=average■ \n■23:quality=average;design=average;usability=average;value=average;durability=average■ \n■24:quality=average;design=average;usability=average;value=average;durability=average■ \n■25:quality=average;design=average;usability=average;value=average;durability=average■ \n■26:quality=average;design=average;usability=average;value=average;durability=average■ \n■27:quality=average;design=average;usability=average;value=average;durability=average■ \n■28:quality=average;design=average;usability=average;value=average;durability=average■ \n■29:quality=average;design=average;usability=average;value=average;durability=average■ \n■30:quality=average;design=average;usability=average;value=average;durability=average■ \n■31:quality=average;design=average;usability=average;value=average;durability=average■ \n■32:quality=average;design=average;usability=average;value=average;durability=average■ \n■33:quality=average;design=average;usability=average;value=average;durability=average■ \n■34:quality=average;design=average;usability=average;value=average;durability=average■ \n■35:quality=average;design=average;usability=average;value=average;durability=average■ \n■36:quality=average;design=average;usability=average;value=average;durability=average■ \n■37:quality=average;design=average;usability=average;value=average;durability=average■ \n■38:quality=average;design=average;usability=average;value=average;durability=average■ \n■39:quality=average;design=average;usability=average;value=average;durability=average■ \n■40:quality=average;design=average;usability=average;value=average;durability=average■ \n■41:quality=average;design=average;usability=average;value=average;durability=average■ \n■42:quality=average;design=average;usability=average;value=average;durability=average■ \n■43:quality=average;design=average;usability=average;value=average;durability=average■ \n■44:quality=average;design=average;usability=average;value=average;durability=average■ \n■45:quality=average;design=average;usability=average;value=average;durability=average■ \n■46:quality=average;design=average;usability=average;value=average;durability=average■ \n■47:quality=average;design=average;usability=average;value=average;durability=average■ \n■48:quality=average;design=average;usability=average;value=average;durability=average■ \n■49:quality=average;design=average;usability=average;value=average;durability=average■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":1044,"inputCost":0.0026100000000000003,"outputTokens":1303,"outputCost":0.013030000000000002},"botpress":{"cost":0.01564}},"meta":{"cached":false}}}
813
+ {"test":"zai.rate > objective numerical criteria > should rate based on clear numerical thresholds-1","key":"54f8cef8","input":"[{\"input\":{\"messages\":[{\"content\":\"For these evaluation criteria, provide 5 labels (very_bad, bad, average, good, very_good) with brief descriptions for each:\\n\\n- cpu_health: CPU usage health (0-50% = very_good, 51-70% = good, 71-80% = average, 81-90% = bad, 90%+ = very_bad)\\n- memory_health: Memory usage health (0-50% = very_good, 51-70% = good, 71-80% = average, 81-90% = bad, 90%+ = very_bad)\\n- disk_health: Disk usage health (0-50% = very_good, 51-70% = good, 71-80% = average, 81-90% = bad, 90%+ = very_bad)\\n\\nOutput format (JSON):\\n{\\n \\\"cpu_health\\\": {\\n \\\"very_bad\\\": \\\"description\\\",\\n \\\"bad\\\": \\\"description\\\",\\n \\\"average\\\": \\\"description\\\",\\n \\\"good\\\": \\\"description\\\",\\n \\\"very_good\\\": \\\"description\\\"\\n }\\n ...\\n}\\n\\nKeep descriptions brief (5-10 words each).\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating evaluation criteria for rating items on a 1-5 scale.\\nEach criterion must have exactly 5 labels: very_bad (1), bad (2), average (3), good (4), very_good (5).\\nOutput valid JSON only.\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXYLIq1kbowUIfUhidd7d0XO761S","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"```json\n{\n \"cpu_health\": {\n \"very_bad\": \"CPU usage above 90%, critical risk.\",\n \"bad\": \"CPU usage 81-90%, heavy stress.\",\n \"average\": \"CPU usage 71-80%, moderate load.\",\n \"good\": \"CPU usage 51-70%, low stress.\",\n \"very_good\": \"CPU usage 0-50%, optimal.\"\n },\n \"memory_health\": {\n \"very_bad\": \"Memory usage above 90%, critical risk.\",\n \"bad\": \"Memory usage 81-90%, approaching max capacity.\",\n \"average\": \"Memory usage 71-80%, moderate consumption.\",\n \"good\": \"Memory usage 51-70%, stable performance.\",\n \"very_good\": \"Memory usage 0-50%, plenty of free space.\"\n },\n \"disk_health\": {\n \"very_bad\": \"Disk usage above 90%, near full.\",\n \"bad\": \"Disk usage 81-90%, limited free space.\",\n \"average\": \"Disk usage 71-80%, manageable capacity.\",\n \"good\": \"Disk usage 51-70%, stable space available.\",\n \"very_good\": \"Disk usage 0-50%, ample free space.\"\n }\n}\n```","index":0,"stopReason":"stop"}],"usage":{"inputTokens":299,"inputCost":0.0007475,"outputTokens":263,"outputCost":0.0026300000000000004},"botpress":{"cost":0.0033775000000000003}},"meta":{"cached":false}}}
814
+ {"test":"zai.rate > objective numerical criteria > should rate based on clear numerical thresholds-2","key":"34f038b7","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■2):\\n■0: {\\\"cpu\\\":95,\\\"memory\\\":90,\\\"disk\\\":85,\\\"uptime\\\":30}■\\n■1: {\\\"cpu\\\":50,\\\"memory\\\":55,\\\"disk\\\":60,\\\"uptime\\\":99.9}■\\n■2: {\\\"cpu\\\":75,\\\"memory\\\":70,\\\"disk\\\":65,\\\"uptime\\\":95}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**cpu_health**:\\n - very_bad (1): CPU usage above 90%, critical risk.\\n - bad (2): CPU usage 81-90%, heavy stress.\\n - average (3): CPU usage 71-80%, moderate load.\\n - good (4): CPU usage 51-70%, low stress.\\n - very_good (5): CPU usage 0-50%, optimal.\\n\\n**memory_health**:\\n - very_bad (1): Memory usage above 90%, critical risk.\\n - bad (2): Memory usage 81-90%, approaching max capacity.\\n - average (3): Memory usage 71-80%, moderate consumption.\\n - good (4): Memory usage 51-70%, stable performance.\\n - very_good (5): Memory usage 0-50%, plenty of free space.\\n\\n**disk_health**:\\n - very_bad (1): Disk usage above 90%, near full.\\n - bad (2): Disk usage 81-90%, limited free space.\\n - average (3): Disk usage 71-80%, manageable capacity.\\n - good (4): Disk usage 51-70%, stable space available.\\n - very_good (5): Disk usage 0-50%, ample free space.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■2)\\n- Use exact criterion names: cpu_health, memory_health, disk_health\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXYNQdyELi8xjRGoIH0giTcMtifb","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:cpu_health=very_bad;memory_health=bad;disk_health=bad■ \n■1:cpu_health=very_good;memory_health=good;disk_health=good■ \n■2:cpu_health=average;memory_health=good;disk_health=good■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":531,"inputCost":0.0013275000000000001,"outputTokens":62,"outputCost":0.00062},"botpress":{"cost":0.0019475}},"meta":{"cached":false}}}
815
+ {"test":"zai.rate > objective numerical criteria > should rate based on count criteria-1","key":"8a539d14","input":"[{\"input\":{\"messages\":[{\"content\":\"For these evaluation criteria, provide 5 labels (very_bad, bad, average, good, very_good) with brief descriptions for each:\\n\\n- issue_management: open issues count (0-10 = very_good, 11-50 = good, 51-100 = average, 101-200 = bad, 200+ = very_bad)\\n- community: contributor count (50+ = very_good, 20-49 = good, 10-19 = average, 5-9 = bad, <5 = very_bad)\\n- popularity: star count (1000+ = very_good, 500-999 = good, 100-499 = average, 50-99 = bad, <50 = very_bad)\\n\\nOutput format (JSON):\\n{\\n \\\"issue_management\\\": {\\n \\\"very_bad\\\": \\\"description\\\",\\n \\\"bad\\\": \\\"description\\\",\\n \\\"average\\\": \\\"description\\\",\\n \\\"good\\\": \\\"description\\\",\\n \\\"very_good\\\": \\\"description\\\"\\n }\\n ...\\n}\\n\\nKeep descriptions brief (5-10 words each).\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating evaluation criteria for rating items on a 1-5 scale.\\nEach criterion must have exactly 5 labels: very_bad (1), bad (2), average (3), good (4), very_good (5).\\nOutput valid JSON only.\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXYOl3Ktp25wvyOYFz9938HsRW2N","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"```json\n{\n \"issue_management\": {\n \"very_bad\": \"200+ open issues.\",\n \"bad\": \"101-200 open issues.\",\n \"average\": \"51-100 open issues.\",\n \"good\": \"11-50 open issues.\",\n \"very_good\": \"0-10 open issues.\"\n },\n \"community\": {\n \"very_bad\": \"Less than 5 contributors.\",\n \"bad\": \"5-9 contributors.\",\n \"average\": \"10-19 contributors.\",\n \"good\": \"20-49 contributors.\",\n \"very_good\": \"50+ contributors.\"\n },\n \"popularity\": {\n \"very_bad\": \"Less than 50 stars.\",\n \"bad\": \"50-99 stars.\",\n \"average\": \"100-499 stars.\",\n \"good\": \"500-999 stars.\",\n \"very_good\": \"1000+ stars.\"\n }\n}\n```","index":0,"stopReason":"stop"}],"usage":{"inputTokens":277,"inputCost":0.0006925000000000001,"outputTokens":190,"outputCost":0.0019000000000000002},"botpress":{"cost":0.0025925}},"meta":{"cached":false}}}
816
+ {"test":"zai.rate > objective numerical criteria > should rate based on count criteria-2","key":"f8402a57","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■1):\\n■0: {\\\"name\\\":\\\"repo-a\\\",\\\"openIssues\\\":2,\\\"contributors\\\":50,\\\"stars\\\":1000}■\\n■1: {\\\"name\\\":\\\"repo-b\\\",\\\"openIssues\\\":200,\\\"contributors\\\":1,\\\"stars\\\":5}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**issue_management**:\\n - very_bad (1): 200+ open issues.\\n - bad (2): 101-200 open issues.\\n - average (3): 51-100 open issues.\\n - good (4): 11-50 open issues.\\n - very_good (5): 0-10 open issues.\\n\\n**community**:\\n - very_bad (1): Less than 5 contributors.\\n - bad (2): 5-9 contributors.\\n - average (3): 10-19 contributors.\\n - good (4): 20-49 contributors.\\n - very_good (5): 50+ contributors.\\n\\n**popularity**:\\n - very_bad (1): Less than 50 stars.\\n - bad (2): 50-99 stars.\\n - average (3): 100-499 stars.\\n - good (4): 500-999 stars.\\n - very_good (5): 1000+ stars.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■1)\\n- Use exact criterion names: issue_management, community, popularity\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXYS6Jtj7TXPEMsJMllAsbox3wTL","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:issue_management=very_good;community=very_good;popularity=very_good■ \n■1:issue_management=very_bad;community=very_bad;popularity=very_bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":447,"inputCost":0.0011175,"outputTokens":45,"outputCost":0.00045000000000000004},"botpress":{"cost":0.0015675}},"meta":{"cached":false}}}
817
+ {"test":"zai.rate > abort signal > should support abort signal-1","key":"c41a7e87","input":"[{\"input\":{\"messages\":[{\"content\":\"Generate 3-5 evaluation criteria for: \\\"rate these items\\\"\\n\\nFor each criterion, provide 5 labels (very_bad, bad, average, good, very_good) with brief descriptions.\\n\\nOutput format (JSON):\\n{\\n \\\"criterion1_name\\\": {\\n \\\"very_bad\\\": \\\"description\\\",\\n \\\"bad\\\": \\\"description\\\",\\n \\\"average\\\": \\\"description\\\",\\n \\\"good\\\": \\\"description\\\",\\n \\\"very_good\\\": \\\"description\\\"\\n },\\n \\\"criterion2_name\\\": { ... }\\n}\\n\\nKeep criterion names short (1-2 words, lowercase, use underscores).\\nKeep descriptions brief (5-10 words each).\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating evaluation criteria for rating items on a 1-5 scale.\\nEach criterion must have exactly 5 labels: very_bad (1), bad (2), average (3), good (4), very_good (5).\\nOutput valid JSON only.\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXYuguWYzUV12suw1xk9bTW5YPwl","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"```json\n{\n \"quality\": {\n \"very_bad\": \"Poorly made, easily breaks or fails.\",\n \"bad\": \"Subpar materials or inconsistent performance.\",\n \"average\": \"Acceptable quality, but not impressive.\",\n \"good\": \"Well-made with reliable performance.\",\n \"very_good\": \"High-quality, durable, and excellent performance.\"\n },\n \"design\": {\n \"very_bad\": \"Unattractive or poorly thought-out design.\",\n \"bad\": \"Below-average aesthetics or inconvenient layout.\",\n \"average\": \"Simple design with neutral appeal.\",\n \"good\": \"Visually appealing and functional design.\",\n \"very_good\": \"Outstanding design with great usability.\"\n },\n \"value\": {\n \"very_bad\": \"Overpriced with little to no benefit.\",\n \"bad\": \"Poor value for the cost.\",\n \"average\": \"Fair price for what it offers.\",\n \"good\": \"Reasonably priced for good quality.\",\n \"very_good\": \"Excellent offering for its price.\"\n },\n \"usability\": {\n \"very_bad\": \"Difficult or impossible to use effectively.\",\n \"bad\": \"Requires effort, minimal ease of use.\",\n \"average\": \"Usable with some minor inconveniences.\",\n \"good\": \"Easy and straightforward to use.\",\n \"very_good\": \"Highly intuitive and user-friendly.\"\n },\n \"durability\": {\n \"very_bad\": \"Breaks or wears out almost immediately.\",\n \"bad\": \"Lacks longevity, breaks with moderate use.\",\n \"average\": \"Decent lifespan, but not exceptional.\",\n \"good\": \"Reliable and lasts a reasonable time.\",\n \"very_good\": \"Extremely durable under frequent use.\"\n }\n}\n```","index":0,"stopReason":"stop"}],"usage":{"inputTokens":189,"inputCost":0.00047250000000000005,"outputTokens":361,"outputCost":0.0036100000000000004},"botpress":{"cost":0.0040825}},"meta":{"cached":false}}}
818
+ {"test":"zai.learn.rate > learns rating patterns from examples with string instructions-1","key":"91c74fed","input":"[{\"input\":{\"messages\":[{\"content\":\"Generate 3-5 evaluation criteria for: \\\"how healthy is this server?\\\"\\n\\nFor each criterion, provide 5 labels (very_bad, bad, average, good, very_good) with brief descriptions.\\n\\nOutput format (JSON):\\n{\\n \\\"criterion1_name\\\": {\\n \\\"very_bad\\\": \\\"description\\\",\\n \\\"bad\\\": \\\"description\\\",\\n \\\"average\\\": \\\"description\\\",\\n \\\"good\\\": \\\"description\\\",\\n \\\"very_good\\\": \\\"description\\\"\\n },\\n \\\"criterion2_name\\\": { ... }\\n}\\n\\nKeep criterion names short (1-2 words, lowercase, use underscores).\\nKeep descriptions brief (5-10 words each).\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:zai/rate\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating evaluation criteria for rating items on a 1-5 scale.\\nEach criterion must have exactly 5 labels: very_bad (1), bad (2), average (3), good (4), very_good (5).\\nOutput valid JSON only.\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXjPyGuSMppWmdjz0j0cA6uHdk3H","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"```json\n{\n \"uptime\": {\n \"very_bad\": \"Server crashes frequently, severe downtime.\",\n \"bad\": \"Server unstable, moderate downtime.\",\n \"average\": \"Occasional downtime, but mostly stable.\",\n \"good\": \"Minimal downtime, stable performance.\",\n \"very_good\": \"No downtime, excellent stability.\"\n },\n \"response_time\": {\n \"very_bad\": \"Very high latency, highly unresponsive.\",\n \"bad\": \"Noticeable delays in server responses.\",\n \"average\": \"Average response time, acceptable delays.\",\n \"good\": \"Quick response, minimal latency.\",\n \"very_good\": \"Instant responses, lightning-fast performance.\"\n },\n \"resource_usage\": {\n \"very_bad\": \"Severely overloaded, resources consistently maxed out.\",\n \"bad\": \"High usage, occasional overloads.\",\n \"average\": \"Balanced usage, occasional spikes.\",\n \"good\": \"Efficient usage, rare spikes.\",\n \"very_good\": \"Optimal resource use, highly efficient.\"\n },\n \"security\": {\n \"very_bad\": \"Multiple breaches, severe vulnerabilities.\",\n \"bad\": \"Frequent vulnerabilities, moderate risks.\",\n \"average\": \"Some vulnerabilities, manageable risks.\",\n \"good\": \"Good protection, minor vulnerabilities.\",\n \"very_good\": \"Highly secure, no vulnerabilities.\"\n },\n \"error_rate\": {\n \"very_bad\": \"Frequent critical errors, constant issues.\",\n \"bad\": \"Moderate errors, persistent problems.\",\n \"average\": \"Occasional issues, manageable disruptions.\",\n \"good\": \"Rare errors, smooth operation.\",\n \"very_good\": \"Almost no errors, flawless operation.\"\n }\n}\n```","index":0,"stopReason":"stop"}],"usage":{"inputTokens":191,"inputCost":0.00047750000000000006,"outputTokens":352,"outputCost":0.00352},"botpress":{"cost":0.0039975}},"meta":{"cached":false}}}
819
+ {"test":"zai.learn.rate > learns rating patterns from examples with string instructions-2","key":"757e767a","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■0):\\n■0: {\\\"cpu\\\":25,\\\"memory\\\":30,\\\"disk\\\":20}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:zai/rate\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**uptime**:\\n - very_bad (1): Server crashes frequently, severe downtime.\\n - bad (2): Server unstable, moderate downtime.\\n - average (3): Occasional downtime, but mostly stable.\\n - good (4): Minimal downtime, stable performance.\\n - very_good (5): No downtime, excellent stability.\\n\\n**response_time**:\\n - very_bad (1): Very high latency, highly unresponsive.\\n - bad (2): Noticeable delays in server responses.\\n - average (3): Average response time, acceptable delays.\\n - good (4): Quick response, minimal latency.\\n - very_good (5): Instant responses, lightning-fast performance.\\n\\n**resource_usage**:\\n - very_bad (1): Severely overloaded, resources consistently maxed out.\\n - bad (2): High usage, occasional overloads.\\n - average (3): Balanced usage, occasional spikes.\\n - good (4): Efficient usage, rare spikes.\\n - very_good (5): Optimal resource use, highly efficient.\\n\\n**security**:\\n - very_bad (1): Multiple breaches, severe vulnerabilities.\\n - bad (2): Frequent vulnerabilities, moderate risks.\\n - average (3): Some vulnerabilities, manageable risks.\\n - good (4): Good protection, minor vulnerabilities.\\n - very_good (5): Highly secure, no vulnerabilities.\\n\\n**error_rate**:\\n - very_bad (1): Frequent critical errors, constant issues.\\n - bad (2): Moderate errors, persistent problems.\\n - average (3): Occasional issues, manageable disruptions.\\n - good (4): Rare errors, smooth operation.\\n - very_good (5): Almost no errors, flawless operation.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■0)\\n- Use exact criterion names: uptime, response_time, resource_usage, security, error_rate\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXjTjsSZjQsqq8BJWB4K1kAKWDbe","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:uptime=good;response_time=good;resource_usage=very_good;security=average;error_rate=good■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":570,"inputCost":0.001425,"outputTokens":32,"outputCost":0.00032},"botpress":{"cost":0.001745}},"meta":{"cached":false}}}
820
+ {"test":"zai.learn.rate > learns rating patterns from examples with record instructions-1","key":"2a40711","input":"[{\"input\":{\"messages\":[{\"content\":\"For these evaluation criteria, provide 5 labels (very_bad, bad, average, good, very_good) with brief descriptions for each:\\n\\n- length: password length (12+ = very_good, 8-11 = good, 6-7 = average, 4-5 = bad, <4 = very_bad)\\n- complexity: character variety (all types = very_good)\\n- strength: overall password strength\\n\\nOutput format (JSON):\\n{\\n \\\"length\\\": {\\n \\\"very_bad\\\": \\\"description\\\",\\n \\\"bad\\\": \\\"description\\\",\\n \\\"average\\\": \\\"description\\\",\\n \\\"good\\\": \\\"description\\\",\\n \\\"very_good\\\": \\\"description\\\"\\n }\\n ...\\n}\\n\\nKeep descriptions brief (5-10 words each).\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:zai/rate\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating evaluation criteria for rating items on a 1-5 scale.\\nEach criterion must have exactly 5 labels: very_bad (1), bad (2), average (3), good (4), very_good (5).\\nOutput valid JSON only.\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXjV3GVRk8a41zoB2h5PYEHsX4J0","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"```json\n{\n \"length\": {\n \"very_bad\": \"Less than 4 characters.\",\n \"bad\": \"4 to 5 characters.\",\n \"average\": \"6 to 7 characters.\",\n \"good\": \"8 to 11 characters.\",\n \"very_good\": \"12 or more characters.\"\n },\n \"complexity\": {\n \"very_bad\": \"Only one type of character.\",\n \"bad\": \"Two types of characters.\",\n \"average\": \"Three types of characters.\",\n \"good\": \"Four types, moderate variety.\",\n \"very_good\": \"All types: upper, lower, digits, symbols.\"\n },\n \"strength\": {\n \"very_bad\": \"Very easy to guess or crack.\",\n \"bad\": \"Weak, low resistance to attacks.\",\n \"average\": \"Moderate resistance, could be stronger.\",\n \"good\": \"Secure, meets most recommendations.\",\n \"very_good\": \"Highly secure, difficult to crack.\"\n }\n}\n```","index":0,"stopReason":"stop"}],"usage":{"inputTokens":212,"inputCost":0.0005300000000000001,"outputTokens":206,"outputCost":0.00206},"botpress":{"cost":0.0025900000000000003}},"meta":{"cached":false}}}
821
+ {"test":"zai.learn.rate > learns rating patterns from examples with record instructions-2","key":"94e4ba35","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■1):\\n■0: {\\\"password\\\":\\\"MyStr0ng!Pass\\\",\\\"length\\\":13,\\\"hasAll\\\":true}■\\n■1: {\\\"password\\\":\\\"bad\\\",\\\"length\\\":3,\\\"hasAll\\\":false}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:zai/rate\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**length**:\\n - very_bad (1): Less than 4 characters.\\n - bad (2): 4 to 5 characters.\\n - average (3): 6 to 7 characters.\\n - good (4): 8 to 11 characters.\\n - very_good (5): 12 or more characters.\\n\\n**complexity**:\\n - very_bad (1): Only one type of character.\\n - bad (2): Two types of characters.\\n - average (3): Three types of characters.\\n - good (4): Four types, moderate variety.\\n - very_good (5): All types: upper, lower, digits, symbols.\\n\\n**strength**:\\n - very_bad (1): Very easy to guess or crack.\\n - bad (2): Weak, low resistance to attacks.\\n - average (3): Moderate resistance, could be stronger.\\n - good (4): Secure, meets most recommendations.\\n - very_good (5): Highly secure, difficult to crack.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■1)\\n- Use exact criterion names: length, complexity, strength\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXjX1Ple1N940njfp8GOLks6DbtU","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:length=very_good;complexity=very_good;strength=very_good■ \n■1:length=very_bad;complexity=very_bad;strength=very_bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":446,"inputCost":0.001115,"outputTokens":39,"outputCost":0.00039000000000000005},"botpress":{"cost":0.0015050000000000003}},"meta":{"cached":false}}}
822
+ {"test":"zai.learn.rate > saves and retrieves rating examples correctly-1","key":"92825093","input":"[{\"input\":{\"messages\":[{\"content\":\"Generate 3-5 evaluation criteria for: \\\"how good is this data quality?\\\"\\n\\nFor each criterion, provide 5 labels (very_bad, bad, average, good, very_good) with brief descriptions.\\n\\nOutput format (JSON):\\n{\\n \\\"criterion1_name\\\": {\\n \\\"very_bad\\\": \\\"description\\\",\\n \\\"bad\\\": \\\"description\\\",\\n \\\"average\\\": \\\"description\\\",\\n \\\"good\\\": \\\"description\\\",\\n \\\"very_good\\\": \\\"description\\\"\\n },\\n \\\"criterion2_name\\\": { ... }\\n}\\n\\nKeep criterion names short (1-2 words, lowercase, use underscores).\\nKeep descriptions brief (5-10 words each).\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:zai/rate\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating evaluation criteria for rating items on a 1-5 scale.\\nEach criterion must have exactly 5 labels: very_bad (1), bad (2), average (3), good (4), very_good (5).\\nOutput valid JSON only.\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXjZdvXzsENR9yOwuw2jro1QY9lg","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"```json\n{\n \"accuracy\": {\n \"very_bad\": \"Frequent errors and inaccurate information.\",\n \"bad\": \"Many errors with some accurate data.\",\n \"average\": \"Occasional errors, mostly accurate data.\",\n \"good\": \"Rare errors, data is accurate.\",\n \"very_good\": \"No errors, entirely accurate data.\"\n },\n \"completeness\": {\n \"very_bad\": \"Large amounts of missing data.\",\n \"bad\": \"Significant gaps in data coverage.\",\n \"average\": \"Some incomplete data but usable.\",\n \"good\": \"Minor gaps, mostly complete.\",\n \"very_good\": \"All data is complete and intact.\"\n },\n \"consistency\": {\n \"very_bad\": \"Data values frequently conflict.\",\n \"bad\": \"Frequent inconsistencies in data.\",\n \"average\": \"Some data conflicts but manageable.\",\n \"good\": \"Data is mostly consistent.\",\n \"very_good\": \"Data is fully consistent throughout.\"\n },\n \"timeliness\": {\n \"very_bad\": \"Severely outdated and irrelevant data.\",\n \"bad\": \"Usually late updates, partially outdated.\",\n \"average\": \"Somewhat recent, occasional delays.\",\n \"good\": \"Data is up-to-date with rare delays.\",\n \"very_good\": \"Fully up-to-date and timely data.\"\n },\n \"relevance\": {\n \"very_bad\": \"Most data is unrelated to needs.\",\n \"bad\": \"Data has limited relevance to needs.\",\n \"average\": \"Somewhat relevant, partially useful data.\",\n \"good\": \"Data is relevant and useful.\",\n \"very_good\": \"Highly relevant and perfectly suited data.\"\n }\n}\n```","index":0,"stopReason":"stop"}],"usage":{"inputTokens":192,"inputCost":0.00048000000000000007,"outputTokens":352,"outputCost":0.00352},"botpress":{"cost":0.004}},"meta":{"cached":false}}}
823
+ {"test":"zai.learn.rate > saves and retrieves rating examples correctly-2","key":"a4bbf2af","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■0):\\n■0: {\\\"completeness\\\":100,\\\"accuracy\\\":95,\\\"timeliness\\\":90}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:zai/rate\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**accuracy**:\\n - very_bad (1): Frequent errors and inaccurate information.\\n - bad (2): Many errors with some accurate data.\\n - average (3): Occasional errors, mostly accurate data.\\n - good (4): Rare errors, data is accurate.\\n - very_good (5): No errors, entirely accurate data.\\n\\n**completeness**:\\n - very_bad (1): Large amounts of missing data.\\n - bad (2): Significant gaps in data coverage.\\n - average (3): Some incomplete data but usable.\\n - good (4): Minor gaps, mostly complete.\\n - very_good (5): All data is complete and intact.\\n\\n**consistency**:\\n - very_bad (1): Data values frequently conflict.\\n - bad (2): Frequent inconsistencies in data.\\n - average (3): Some data conflicts but manageable.\\n - good (4): Data is mostly consistent.\\n - very_good (5): Data is fully consistent throughout.\\n\\n**timeliness**:\\n - very_bad (1): Severely outdated and irrelevant data.\\n - bad (2): Usually late updates, partially outdated.\\n - average (3): Somewhat recent, occasional delays.\\n - good (4): Data is up-to-date with rare delays.\\n - very_good (5): Fully up-to-date and timely data.\\n\\n**relevance**:\\n - very_bad (1): Most data is unrelated to needs.\\n - bad (2): Data has limited relevance to needs.\\n - average (3): Somewhat relevant, partially useful data.\\n - good (4): Data is relevant and useful.\\n - very_good (5): Highly relevant and perfectly suited data.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■0)\\n- Use exact criterion names: accuracy, completeness, consistency, timeliness, relevance\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXjc2vQ0zFG7XB1Sjb6hjKkH9FtH","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:accuracy=very_good;completeness=very_good;consistency=good;timeliness=good;relevance=average■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":572,"inputCost":0.00143,"outputTokens":34,"outputCost":0.00034},"botpress":{"cost":0.00177}},"meta":{"cached":false}}}
824
+ {"test":"zai.learn.rate > learns rating patterns from examples with string instructions-4","key":"5f7e28a3","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■1):\\n■0: {\\\"cpu\\\":30,\\\"memory\\\":35,\\\"disk\\\":25}■\\n■1: {\\\"cpu\\\":92,\\\"memory\\\":88,\\\"disk\\\":90}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:zai/rate\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**uptime**:\\n - very_bad (1): Server crashes frequently, severe downtime.\\n - bad (2): Server unstable, moderate downtime.\\n - average (3): Occasional downtime, but mostly stable.\\n - good (4): Minimal downtime, stable performance.\\n - very_good (5): No downtime, excellent stability.\\n\\n**response_time**:\\n - very_bad (1): Very high latency, highly unresponsive.\\n - bad (2): Noticeable delays in server responses.\\n - average (3): Average response time, acceptable delays.\\n - good (4): Quick response, minimal latency.\\n - very_good (5): Instant responses, lightning-fast performance.\\n\\n**resource_usage**:\\n - very_bad (1): Severely overloaded, resources consistently maxed out.\\n - bad (2): High usage, occasional overloads.\\n - average (3): Balanced usage, occasional spikes.\\n - good (4): Efficient usage, rare spikes.\\n - very_good (5): Optimal resource use, highly efficient.\\n\\n**security**:\\n - very_bad (1): Multiple breaches, severe vulnerabilities.\\n - bad (2): Frequent vulnerabilities, moderate risks.\\n - average (3): Some vulnerabilities, manageable risks.\\n - good (4): Good protection, minor vulnerabilities.\\n - very_good (5): Highly secure, no vulnerabilities.\\n\\n**error_rate**:\\n - very_bad (1): Frequent critical errors, constant issues.\\n - bad (2): Moderate errors, persistent problems.\\n - average (3): Occasional issues, manageable disruptions.\\n - good (4): Rare errors, smooth operation.\\n - very_good (5): Almost no errors, flawless operation.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■1)\\n- Use exact criterion names: uptime, response_time, resource_usage, security, error_rate\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXkHe8IwTNvD3iypWhSLlaojV7J5","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:uptime=good;response_time=good;resource_usage=good;security=good;error_rate=good■ \n■1:uptime=bad;response_time=bad;resource_usage=very_bad;security=average;error_rate=bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":588,"inputCost":0.0014700000000000002,"outputTokens":60,"outputCost":0.0006000000000000001},"botpress":{"cost":0.0020700000000000002}},"meta":{"cached":false}}}
825
+ {"test":"zai.learn.rate > learns counterintuitive rating pattern that LLM cannot guess-1","key":"eae2d15f","input":"[{\"input\":{\"messages\":[{\"content\":\"Generate 3-5 evaluation criteria for: \\\"rate this word quality\\\"\\n\\nFor each criterion, provide 5 labels (very_bad, bad, average, good, very_good) with brief descriptions.\\n\\nOutput format (JSON):\\n{\\n \\\"criterion1_name\\\": {\\n \\\"very_bad\\\": \\\"description\\\",\\n \\\"bad\\\": \\\"description\\\",\\n \\\"average\\\": \\\"description\\\",\\n \\\"good\\\": \\\"description\\\",\\n \\\"very_good\\\": \\\"description\\\"\\n },\\n \\\"criterion2_name\\\": { ... }\\n}\\n\\nKeep criterion names short (1-2 words, lowercase, use underscores).\\nKeep descriptions brief (5-10 words each).\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:zai/rate\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating evaluation criteria for rating items on a 1-5 scale.\\nEach criterion must have exactly 5 labels: very_bad (1), bad (2), average (3), good (4), very_good (5).\\nOutput valid JSON only.\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXpYJ0zAxqCVA8JpTx7SmDRgIY5X","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"```json\n{\n \"clarity\": {\n \"very_bad\": \"Word is extremely confusing or unclear.\",\n \"bad\": \"Word is difficult to understand or vague.\",\n \"average\": \"Word is somewhat clear but imperfect.\",\n \"good\": \"Word is clear and easy to understand.\",\n \"very_good\": \"Word is exceptionally clear and precise.\"\n },\n \"relevance\": {\n \"very_bad\": \"Word is completely unrelated to context.\",\n \"bad\": \"Word has limited connection to the context.\",\n \"average\": \"Word is somewhat related to the context.\",\n \"good\": \"Word is relevant to the topic or context.\",\n \"very_good\": \"Word is highly relevant and fits perfectly.\"\n },\n \"originality\": {\n \"very_bad\": \"Word is extremely overused or unoriginal.\",\n \"bad\": \"Word is moderately overused or common.\",\n \"average\": \"Word is ordinary but not overused.\",\n \"good\": \"Word is creative and somewhat unique.\",\n \"very_good\": \"Word is highly original and innovative.\"\n },\n \"impact\": {\n \"very_bad\": \"Word has no emotional or meaningful effect.\",\n \"bad\": \"Word has little impact or resonance.\",\n \"average\": \"Word has moderate impact or significance.\",\n \"good\": \"Word has strong emotional or meaningful value.\",\n \"very_good\": \"Word is memorable and leaves a big impact.\"\n },\n \"versatility\": {\n \"very_bad\": \"Word has no adaptability to different uses.\",\n \"bad\": \"Word is limited in usage or application.\",\n \"average\": \"Word is fairly useful in some contexts.\",\n \"good\": \"Word is flexible and applicable in many contexts.\",\n \"very_good\": \"Word is highly versatile in multiple settings.\"\n }\n}\n```","index":0,"stopReason":"stop"}],"usage":{"inputTokens":190,"inputCost":0.00047500000000000005,"outputTokens":376,"outputCost":0.0037600000000000003},"botpress":{"cost":0.004235}},"meta":{"cached":false}}}
826
+ {"test":"zai.learn.rate > learns counterintuitive rating pattern that LLM cannot guess-2","key":"f129fc38","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rate (■0 to ■3):\\n■0: {\\\"word\\\":\\\"sky\\\"}■\\n■1: {\\\"word\\\":\\\"moon\\\"}■\\n■2: {\\\"word\\\":\\\"stars\\\"}■\\n■3: {\\\"word\\\":\\\"planet\\\"}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:zai/rate\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**clarity**:\\n - very_bad (1): Word is extremely confusing or unclear.\\n - bad (2): Word is difficult to understand or vague.\\n - average (3): Word is somewhat clear but imperfect.\\n - good (4): Word is clear and easy to understand.\\n - very_good (5): Word is exceptionally clear and precise.\\n\\n**relevance**:\\n - very_bad (1): Word is completely unrelated to context.\\n - bad (2): Word has limited connection to the context.\\n - average (3): Word is somewhat related to the context.\\n - good (4): Word is relevant to the topic or context.\\n - very_good (5): Word is highly relevant and fits perfectly.\\n\\n**originality**:\\n - very_bad (1): Word is extremely overused or unoriginal.\\n - bad (2): Word is moderately overused or common.\\n - average (3): Word is ordinary but not overused.\\n - good (4): Word is creative and somewhat unique.\\n - very_good (5): Word is highly original and innovative.\\n\\n**impact**:\\n - very_bad (1): Word has no emotional or meaningful effect.\\n - bad (2): Word has little impact or resonance.\\n - average (3): Word has moderate impact or significance.\\n - good (4): Word has strong emotional or meaningful value.\\n - very_good (5): Word is memorable and leaves a big impact.\\n\\n**versatility**:\\n - very_bad (1): Word has no adaptability to different uses.\\n - bad (2): Word is limited in usage or application.\\n - average (3): Word is fairly useful in some contexts.\\n - good (4): Word is flexible and applicable in many contexts.\\n - very_good (5): Word is highly versatile in multiple settings.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■3)\\n- Use exact criterion names: clarity, relevance, originality, impact, versatility\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXpdpnSnqqDZFklOYqcYvDTol0g0","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:clarity=very_good;relevance=good;originality=average;impact=good;versatility=good■ \n■1:clarity=very_good;relevance=very_good;originality=average;impact=good;versatility=good■ \n■2:clarity=very_good;relevance=very_good;originality=average;impact=good;versatility=good■ \n■3:clarity=very_good;relevance=good;originality=average;impact=good;versatility=good■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":618,"inputCost":0.0015450000000000001,"outputTokens":121,"outputCost":0.0012100000000000001},"botpress":{"cost":0.0027550000000000005}},"meta":{"cached":false}}}
827
+ {"test":"zai.learn.rate > learns counterintuitive rating pattern that LLM cannot guess-2","key":"fbfd33f1","input":"[{\"input\":{\"messages\":[{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"word\\\":\\\"house\\\"}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"word\\\":\\\"car\\\"}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"word\\\":\\\"cat\\\"}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"word\\\":\\\"book\\\"}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"word\\\":\\\"dogs\\\"}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Items to rate (■0 to ■3):\\n■0: {\\\"word\\\":\\\"sky\\\"}■\\n■1: {\\\"word\\\":\\\"moon\\\"}■\\n■2: {\\\"word\\\":\\\"stars\\\"}■\\n■3: {\\\"word\\\":\\\"planet\\\"}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:zai/rate\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**clarity**:\\n - very_bad (1): Word is extremely confusing or unclear.\\n - bad (2): Word is difficult to understand or vague.\\n - average (3): Word is somewhat clear but imperfect.\\n - good (4): Word is clear and easy to understand.\\n - very_good (5): Word is exceptionally clear and precise.\\n\\n**relevance**:\\n - very_bad (1): Word is completely unrelated to context.\\n - bad (2): Word has limited connection to the context.\\n - average (3): Word is somewhat related to the context.\\n - good (4): Word is relevant to the topic or context.\\n - very_good (5): Word is highly relevant and fits perfectly.\\n\\n**originality**:\\n - very_bad (1): Word is extremely overused or unoriginal.\\n - bad (2): Word is moderately overused or common.\\n - average (3): Word is ordinary but not overused.\\n - good (4): Word is creative and somewhat unique.\\n - very_good (5): Word is highly original and innovative.\\n\\n**impact**:\\n - very_bad (1): Word has no emotional or meaningful effect.\\n - bad (2): Word has little impact or resonance.\\n - average (3): Word has moderate impact or significance.\\n - good (4): Word has strong emotional or meaningful value.\\n - very_good (5): Word is memorable and leaves a big impact.\\n\\n**versatility**:\\n - very_bad (1): Word has no adaptability to different uses.\\n - bad (2): Word is limited in usage or application.\\n - average (3): Word is fairly useful in some contexts.\\n - good (4): Word is flexible and applicable in many contexts.\\n - very_good (5): Word is highly versatile in multiple settings.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■3)\\n- Use exact criterion names: clarity, relevance, originality, impact, versatility\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXraSXUcFP52AGRflzXOHBty5TBL","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:clarity=very_good;relevance=average;originality=average;impact=good;versatility=good■ \n■1:clarity=very_good;relevance=average;originality=average;impact=good;versatility=good■ \n■2:clarity=very_good;relevance=average;originality=good;impact=very_good;versatility=good■ \n■3:clarity=very_good;relevance=average;originality=average;impact=good;versatility=good■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":818,"inputCost":0.0020450000000000004,"outputTokens":120,"outputCost":0.0012000000000000001},"botpress":{"cost":0.0032450000000000005}},"meta":{"cached":false}}}
828
+ {"test":"zai.learn.rate > learns rating patterns from examples with record instructions-2","key":"84269865","input":"[{\"input\":{\"messages\":[{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"password\\\":\\\"weak\\\",\\\"length\\\":4,\\\"hasAll\\\":false}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:length=very_bad;complexity=very_bad;strength=very_bad■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"password\\\":\\\"Str0ng!P@ss#2024\\\",\\\"length\\\":16,\\\"hasAll\\\":true}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:length=very_good;complexity=very_good;strength=very_good■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Items to rate (■0 to ■1):\\n■0: {\\\"password\\\":\\\"MyStr0ng!Pass\\\",\\\"length\\\":13,\\\"hasAll\\\":true}■\\n■1: {\\\"password\\\":\\\"bad\\\",\\\"length\\\":3,\\\"hasAll\\\":false}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:zai/rate\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**length**:\\n - very_bad (1): Less than 4 characters.\\n - bad (2): 4 to 5 characters.\\n - average (3): 6 to 7 characters.\\n - good (4): 8 to 11 characters.\\n - very_good (5): 12 or more characters.\\n\\n**complexity**:\\n - very_bad (1): Only one type of character.\\n - bad (2): Two types of characters.\\n - average (3): Three types of characters.\\n - good (4): Four types, moderate variety.\\n - very_good (5): All types: upper, lower, digits, symbols.\\n\\n**strength**:\\n - very_bad (1): Very easy to guess or crack.\\n - bad (2): Weak, low resistance to attacks.\\n - average (3): Moderate resistance, could be stronger.\\n - good (4): Secure, meets most recommendations.\\n - very_good (5): Highly secure, difficult to crack.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■1)\\n- Use exact criterion names: length, complexity, strength\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXrpmYmX6Mi2uSa7JEcwBtdmBJm0","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:length=very_good;complexity=very_good;strength=very_good■ \n■1:length=very_bad;complexity=very_bad;strength=very_bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":579,"inputCost":0.0014475000000000002,"outputTokens":39,"outputCost":0.00039000000000000005},"botpress":{"cost":0.0018375000000000002}},"meta":{"cached":false}}}
829
+ {"test":"zai.learn.rate > learns counterintuitive rating pattern that LLM cannot guess-2","key":"3a63ad2e","input":"[{\"input\":{\"messages\":[{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"word\\\":\\\"house\\\"}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"word\\\":\\\"car\\\"}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"word\\\":\\\"cat\\\"}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"word\\\":\\\"book\\\"}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"word\\\":\\\"table\\\"}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Items to rate (■0 to ■3):\\n■0: {\\\"word\\\":\\\"sky\\\"}■\\n■1: {\\\"word\\\":\\\"moon\\\"}■\\n■2: {\\\"word\\\":\\\"stars\\\"}■\\n■3: {\\\"word\\\":\\\"planet\\\"}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:zai/rate\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**clarity**:\\n - very_bad (1): Word is extremely confusing or unclear.\\n - bad (2): Word is difficult to understand or vague.\\n - average (3): Word is somewhat clear but imperfect.\\n - good (4): Word is clear and easy to understand.\\n - very_good (5): Word is exceptionally clear and precise.\\n\\n**relevance**:\\n - very_bad (1): Word is completely unrelated to context.\\n - bad (2): Word has limited connection to the context.\\n - average (3): Word is somewhat related to the context.\\n - good (4): Word is relevant to the topic or context.\\n - very_good (5): Word is highly relevant and fits perfectly.\\n\\n**originality**:\\n - very_bad (1): Word is extremely overused or unoriginal.\\n - bad (2): Word is moderately overused or common.\\n - average (3): Word is ordinary but not overused.\\n - good (4): Word is creative and somewhat unique.\\n - very_good (5): Word is highly original and innovative.\\n\\n**impact**:\\n - very_bad (1): Word has no emotional or meaningful effect.\\n - bad (2): Word has little impact or resonance.\\n - average (3): Word has moderate impact or significance.\\n - good (4): Word has strong emotional or meaningful value.\\n - very_good (5): Word is memorable and leaves a big impact.\\n\\n**versatility**:\\n - very_bad (1): Word has no adaptability to different uses.\\n - bad (2): Word is limited in usage or application.\\n - average (3): Word is fairly useful in some contexts.\\n - good (4): Word is flexible and applicable in many contexts.\\n - very_good (5): Word is highly versatile in multiple settings.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■3)\\n- Use exact criterion names: clarity, relevance, originality, impact, versatility\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXs9TeHjJwX6MrFrtmPGgcjV6dUy","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:clarity=very_good;relevance=average;originality=average;impact=good;versatility=good■ \n■1:clarity=very_good;relevance=average;originality=average;impact=good;versatility=good■ \n■2:clarity=very_good;relevance=average;originality=good;impact=very_good;versatility=good■ \n■3:clarity=very_good;relevance=average;originality=average;impact=good;versatility=good■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":818,"inputCost":0.0020450000000000004,"outputTokens":120,"outputCost":0.0012000000000000001},"botpress":{"cost":0.0032450000000000005}},"meta":{"cached":false}}}
830
+ {"test":"zai.learn.group > learns counterintuitive grouping pattern from examples-1","key":"cf27cc74","input":"[{\"input\":{\"messages\":[{\"content\":\"Expert Example - Elements to group:\\n■0: {\\\"value\\\":12}■\\n■1: {\\\"value\\\":10}■\\n■2: {\\\"value\\\":11}■\\n\\nGroup each element.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■undefined:Alpha■\\n■undefined:Beta■\\n■undefined:Gamma■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: 12 mod 3 = 0 (Alpha), 10 mod 3 = 1 (Beta), 11 mod 3 = 2 (Gamma)\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Elements to group:\\n■0: {\\\"value\\\":2}■\\n■1: {\\\"value\\\":5}■\\n■2: {\\\"value\\\":8}■\\n\\nGroup each element.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■undefined:Gamma■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: Numbers with remainder 2 when divided by 3 go to Gamma group\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Elements to group:\\n■0: {\\\"value\\\":1}■\\n■1: {\\\"value\\\":4}■\\n■2: {\\\"value\\\":7}■\\n\\nGroup each element.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■undefined:Beta■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: Numbers with remainder 1 when divided by 3 go to Beta group\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Elements to group:\\n■0: {\\\"value\\\":3}■\\n■1: {\\\"value\\\":6}■\\n■2: {\\\"value\\\":9}■\\n\\nGroup each element.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■undefined:Alpha■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: Numbers divisible by 3 (remainder 0) go to Alpha group\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"**Elements (■0 to ■4):**\\n■0: {\\\"value\\\":15}■\\n■1: {\\\"value\\\":16}■\\n■2: {\\\"value\\\":17}■\\n■3: {\\\"value\\\":18}■\\n■4: {\\\"value\\\":19}■\\n\\n**Task:** For each element, output one line with its group label.\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.group\",\"promptSource\":\"zai:zai.group:zai/group\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are grouping elements into cohesive groups.\\n\\n**Instructions:** group these numbers\\n\\n\\n**Important:**\\n- Each element gets exactly ONE group label\\n- Use EXACT SAME label for similar items (case-sensitive)\\n- Create new descriptive labels when needed\\n\\n**Output Format:**\\nOne line per element:\\n■0:Group Label■\\n■1:Group Label■\\n■END■\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXxbwV9Ip3HdmmdqDwioh9KtralX","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:Alpha■ \n■1:Beta■ \n■2:Gamma■ \n■3:Alpha■ \n■4:Beta■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":512,"inputCost":0.00128,"outputTokens":33,"outputCost":0.00033000000000000005},"botpress":{"cost":0.00161}},"meta":{"cached":false}}}
831
+ {"test":"zai.learn.group > learns counterintuitive grouping pattern from examples-2","key":"a894594c","input":"[{\"input\":{\"messages\":[{\"content\":\"Expert Example - Elements to group:\\n■0: {\\\"value\\\":12}■\\n■1: {\\\"value\\\":10}■\\n■2: {\\\"value\\\":11}■\\n\\nGroup each element.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■undefined:Alpha■\\n■undefined:Beta■\\n■undefined:Gamma■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: 12 mod 3 = 0 (Alpha), 10 mod 3 = 1 (Beta), 11 mod 3 = 2 (Gamma)\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Elements to group:\\n■0: {\\\"value\\\":2}■\\n■1: {\\\"value\\\":5}■\\n■2: {\\\"value\\\":8}■\\n\\nGroup each element.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■undefined:Gamma■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: Numbers with remainder 2 when divided by 3 go to Gamma group\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Elements to group:\\n■0: {\\\"value\\\":1}■\\n■1: {\\\"value\\\":4}■\\n■2: {\\\"value\\\":7}■\\n\\nGroup each element.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■undefined:Beta■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: Numbers with remainder 1 when divided by 3 go to Beta group\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Elements to group:\\n■0: {\\\"value\\\":3}■\\n■1: {\\\"value\\\":6}■\\n■2: {\\\"value\\\":9}■\\n\\nGroup each element.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■undefined:Alpha■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: Numbers divisible by 3 (remainder 0) go to Alpha group\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"**Existing Groups (prefer reusing these):**\\n- Alpha\\n- Beta\\n- Gamma\\n\\n**Elements (■0 to ■4):**\\n■0: {\\\"value\\\":15}■\\n■1: {\\\"value\\\":16}■\\n■2: {\\\"value\\\":17}■\\n■3: {\\\"value\\\":18}■\\n■4: {\\\"value\\\":19}■\\n\\n**Task:** For each element, output one line with its group label.\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.group\",\"promptSource\":\"zai:zai.group:zai/group\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are grouping elements into cohesive groups.\\n\\n**Instructions:** group these numbers\\n\\n\\n**Important:**\\n- Each element gets exactly ONE group label\\n- Use EXACT SAME label for similar items (case-sensitive)\\n- Create new descriptive labels when needed\\n\\n**Output Format:**\\nOne line per element:\\n■0:Group Label■\\n■1:Group Label■\\n■END■\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXxdmtakeBxHkowkEkO2yolzw3yW","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:Alpha■ \n■1:Beta■ \n■2:Gamma■ \n■3:Alpha■ \n■4:Beta■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":531,"inputCost":0.0013275000000000001,"outputTokens":33,"outputCost":0.00033000000000000005},"botpress":{"cost":0.0016575000000000001}},"meta":{"cached":false}}}
832
+ {"test":"zai.learn.group > learns custom grouping criteria from examples-1","key":"d7872eb9","input":"[{\"input\":{\"messages\":[{\"content\":\"Expert Example - Elements to group:\\n■0: {\\\"name\\\":\\\"Mike\\\"}■\\n■1: {\\\"name\\\":\\\"Rachel\\\"}■\\n\\nGroup each element.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■undefined:First Half■\\n■undefined:Second Half■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: Mike (M) → First Half, Rachel (R) → Second Half\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Elements to group:\\n■0: {\\\"name\\\":\\\"Alice\\\"}■\\n■1: {\\\"name\\\":\\\"Bob\\\"}■\\n■2: {\\\"name\\\":\\\"Charlie\\\"}■\\n\\nGroup each element.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■undefined:First Half■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: Names starting with A-M go to First Half\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Elements to group:\\n■0: {\\\"name\\\":\\\"Nancy\\\"}■\\n■1: {\\\"name\\\":\\\"Oscar\\\"}■\\n■2: {\\\"name\\\":\\\"Paula\\\"}■\\n\\nGroup each element.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■undefined:Second Half■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: Names starting with N-Z go to Second Half\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"**Elements (■0 to ■3):**\\n■0: {\\\"name\\\":\\\"David\\\"}■\\n■1: {\\\"name\\\":\\\"Zoe\\\"}■\\n■2: {\\\"name\\\":\\\"Emma\\\"}■\\n■3: {\\\"name\\\":\\\"Victor\\\"}■\\n\\n**Task:** For each element, output one line with its group label.\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.group\",\"promptSource\":\"zai:zai.group:zai/group\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are grouping elements into cohesive groups.\\n\\n**Instructions:** group these names\\n\\n\\n**Important:**\\n- Each element gets exactly ONE group label\\n- Use EXACT SAME label for similar items (case-sensitive)\\n- Create new descriptive labels when needed\\n\\n**Output Format:**\\nOne line per element:\\n■0:Group Label■\\n■1:Group Label■\\n■END■\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXxgA58PD0IL1tj3FFs34UEbuOfy","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:First Half■ \n■1:Second Half■ \n■2:First Half■ \n■3:Second Half■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":384,"inputCost":0.0009600000000000001,"outputTokens":31,"outputCost":0.00031},"botpress":{"cost":0.00127}},"meta":{"cached":false}}}
833
+ {"test":"zai.learn.group > learns custom grouping criteria from examples-2","key":"2ed58909","input":"[{\"input\":{\"messages\":[{\"content\":\"Expert Example - Elements to group:\\n■0: {\\\"name\\\":\\\"Mike\\\"}■\\n■1: {\\\"name\\\":\\\"Rachel\\\"}■\\n\\nGroup each element.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■undefined:First Half■\\n■undefined:Second Half■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: Mike (M) → First Half, Rachel (R) → Second Half\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Elements to group:\\n■0: {\\\"name\\\":\\\"Alice\\\"}■\\n■1: {\\\"name\\\":\\\"Bob\\\"}■\\n■2: {\\\"name\\\":\\\"Charlie\\\"}■\\n\\nGroup each element.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■undefined:First Half■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: Names starting with A-M go to First Half\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Elements to group:\\n■0: {\\\"name\\\":\\\"Nancy\\\"}■\\n■1: {\\\"name\\\":\\\"Oscar\\\"}■\\n■2: {\\\"name\\\":\\\"Paula\\\"}■\\n\\nGroup each element.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■undefined:Second Half■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: Names starting with N-Z go to Second Half\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"**Existing Groups (prefer reusing these):**\\n- First Half\\n- Second Half\\n\\n**Elements (■0 to ■3):**\\n■0: {\\\"name\\\":\\\"David\\\"}■\\n■1: {\\\"name\\\":\\\"Zoe\\\"}■\\n■2: {\\\"name\\\":\\\"Emma\\\"}■\\n■3: {\\\"name\\\":\\\"Victor\\\"}■\\n\\n**Task:** For each element, output one line with its group label.\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.group\",\"promptSource\":\"zai:zai.group:zai/group\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are grouping elements into cohesive groups.\\n\\n**Instructions:** group these names\\n\\n\\n**Important:**\\n- Each element gets exactly ONE group label\\n- Use EXACT SAME label for similar items (case-sensitive)\\n- Create new descriptive labels when needed\\n\\n**Output Format:**\\nOne line per element:\\n■0:Group Label■\\n■1:Group Label■\\n■END■\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXXxh4YOB9eJAsPCRYo6sO2HpiF2t","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:First Half■ \n■1:Second Half■ \n■2:First Half■ \n■3:Second Half■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":402,"inputCost":0.001005,"outputTokens":31,"outputCost":0.00031},"botpress":{"cost":0.001315}},"meta":{"cached":false}}}
834
+ {"test":"zai.sort > animal sorting > should sort animals by speed (ascending - slowest to fastest)-1","key":"441e4806","input":"[{\"input\":{\"messages\":[{\"content\":\"Analyze this sorting instruction: \\\"from slowest to fastest\\\"\\n\\nSample items to be sorted:\\n■0: {\\\"name\\\":\\\"Cheetah\\\",\\\"type\\\":\\\"land\\\"}\\n■1: {\\\"name\\\":\\\"Sloth\\\",\\\"type\\\":\\\"land\\\"}\\n■2: {\\\"name\\\":\\\"Horse\\\",\\\"type\\\":\\\"land\\\"}\\n■3: {\\\"name\\\":\\\"Snail\\\",\\\"type\\\":\\\"land\\\"}\\n■4: {\\\"name\\\":\\\"Human\\\",\\\"type\\\":\\\"land\\\"}\\n\\nCreate 1-3 sorting criteria with ordered label arrays (3-10 labels each).\\n\\n**CRITICAL RULES**:\\n1. Labels are single words, lowercase, no spaces, use underscores\\n2. Labels are ordered from FIRST to LAST in sorted result\\n3. If instruction says \\\"from X to Y\\\": first label represents X, last label represents Y\\n4. If instruction says \\\"prioritize\\\" or \\\"highest/lowest priority\\\":\\n - First label = HIGHEST priority (top of todo list)\\n - Last label = LOWEST priority (bottom of todo list)\\n\\nExamples:\\n\\n\\\"from slowest to fastest\\\" → first=slowest, last=fastest\\n■speed■\\nvery_slow;slow;medium;fast;very_fast\\n■END■\\n\\n\\\"from most dangerous to least dangerous\\\" → first=most dangerous, last=least dangerous\\n■danger■\\nextremely_dangerous;very_dangerous;dangerous;moderate;slightly_dangerous;harmless\\n■END■\\n\\n\\\"from least urgent (spam) to most urgent (bills)\\\" → first=spam, last=bills\\n■urgency■\\nspam;promotional;normal;important;urgent;critical\\n■END■\\n\\n\\\"prioritize: highest priority=open old tickets; lowest priority=closed\\\" → first=high priority, last=low priority\\n■status■\\nopen_old;open_recent;closed\\n■age■\\noldest;old;recent;new\\n■END■\\n\\nOutput format:\\n■criterion_name■\\nlabel1;label2;label3;label4\\n■END■\\n\\nUse 3-10 labels per criterion. Labels should be intuitive and match the domain.\\nKeep criterion names short (1-2 words, lowercase, underscores).\\n\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating sorting criteria with ordered label arrays.\\n\\nCRITICAL: Output ordered labels from FIRST to LAST position in sorted result.\\n- Labels are single words, lowercase, underscores only\\n- 3-10 labels per criterion\\n- Order matters: first label = appears first, last label = appears last\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYvnD5jdcdI2knbIFHyyKUVLD8Gm","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■speed■ \nvery_slow;slow;moderate;fast;very_fast \n■END■ \n\n■mobility■ \nimmobile;sluggish;walking;running;sprinting \n■END■ \n\n■energy■ \nlow_energy;moderate_energy;high_energy;maximum_energy \n■END■ ","index":0,"stopReason":"stop"}],"usage":{"inputTokens":522,"inputCost":0.0013050000000000002,"outputTokens":64,"outputCost":0.00064},"botpress":{"cost":0.0019450000000000001}},"meta":{"cached":true}}}
835
+ {"test":"zai.sort > animal sorting > should sort animals by speed (ascending - slowest to fastest)-2","key":"8e3c9100","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■7):\\n■0: {\\\"name\\\":\\\"Cheetah\\\",\\\"type\\\":\\\"land\\\"}■\\n■1: {\\\"name\\\":\\\"Sloth\\\",\\\"type\\\":\\\"land\\\"}■\\n■2: {\\\"name\\\":\\\"Horse\\\",\\\"type\\\":\\\"land\\\"}■\\n■3: {\\\"name\\\":\\\"Snail\\\",\\\"type\\\":\\\"land\\\"}■\\n■4: {\\\"name\\\":\\\"Human\\\",\\\"type\\\":\\\"land\\\"}■\\n■5: {\\\"name\\\":\\\"Turtle\\\",\\\"type\\\":\\\"land\\\"}■\\n■6: {\\\"name\\\":\\\"Lion\\\",\\\"type\\\":\\\"land\\\"}■\\n■7: {\\\"name\\\":\\\"Elephant\\\",\\\"type\\\":\\\"land\\\"}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**speed**: very_slow;slow;moderate;fast;very_fast\\n**mobility**: immobile;sluggish;walking;running;sprinting\\n**energy**: low_energy;moderate_energy;high_energy;maximum_energy\\n\\nInstructions: \\\"from slowest to fastest\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■7)\\n- Use exact criterion names: speed, mobility, energy\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYvo6bjZW2oBgIh8gDu82ySvdy3t","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:speed=very_fast;mobility=sprinting;energy=maximum_energy■ \n■1:speed=slow;mobility=sluggish;energy=low_energy■ \n■2:speed=fast;mobility=running;energy=high_energy■ \n■3:speed=very_slow;mobility=immobile;energy=low_energy■ \n■4:speed=moderate;mobility=walking;energy=moderate_energy■ \n■5:speed=slow;mobility=walking;energy=low_energy■ \n■6:speed=fast;mobility=running;energy=high_energy■ \n■7:speed=moderate;mobility=walking;energy=moderate_energy■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":384,"inputCost":0.0009600000000000001,"outputTokens":156,"outputCost":0.0015600000000000002},"botpress":{"cost":0.0025200000000000005}},"meta":{"cached":true}}}
836
+ {"test":"zai.sort > animal sorting > should sort animals by speed (ascending - slowest to fastest)-3","key":"f818fe16","input":"[{\"input\":{\"messages\":[{\"content\":\"Items with identical scores (need tie-breaking):\\n■0: {\\\"name\\\":\\\"Horse\\\",\\\"type\\\":\\\"land\\\"}■\\n■1: {\\\"name\\\":\\\"Lion\\\",\\\"type\\\":\\\"land\\\"}■\\n\\nOrder them from first to last.\\nOutput format: ■index■ (one per line)\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are breaking a tie between items with identical total scores.\\n\\nInstructions: from slowest to fastest\\n\\nCriteria:\\n- speed: very_slow;slow;moderate;fast;very_fast\\n- mobility: immobile;sluggish;walking;running;sprinting\\n- energy: low_energy;moderate_energy;high_energy;maximum_energy\\n\\nOrder these 2 items from FIRST to LAST based on the instructions.\\nEarlier labels in each criterion should come FIRST.\\n\\nOutput format:\\n■original_index■\\n■original_index■\\n■END■\\n\\nOutput the indices in the order they should appear (first item at top).\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYvqQhenlRGjbA8cH0VwyvJBmuJ7","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0■\n■1■\n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":194,"inputCost":0.000485,"outputTokens":11,"outputCost":0.00011},"botpress":{"cost":0.000595}},"meta":{"cached":true}}}
837
+ {"test":"zai.sort > animal sorting > should sort animals by speed (ascending - slowest to fastest)-4","key":"2e05791b","input":"[{\"input\":{\"messages\":[{\"content\":\"Items with identical scores (need tie-breaking):\\n■0: {\\\"name\\\":\\\"Human\\\",\\\"type\\\":\\\"land\\\"}■\\n■1: {\\\"name\\\":\\\"Elephant\\\",\\\"type\\\":\\\"land\\\"}■\\n\\nOrder them from first to last.\\nOutput format: ■index■ (one per line)\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are breaking a tie between items with identical total scores.\\n\\nInstructions: from slowest to fastest\\n\\nCriteria:\\n- speed: very_slow;slow;moderate;fast;very_fast\\n- mobility: immobile;sluggish;walking;running;sprinting\\n- energy: low_energy;moderate_energy;high_energy;maximum_energy\\n\\nOrder these 2 items from FIRST to LAST based on the instructions.\\nEarlier labels in each criterion should come FIRST.\\n\\nOutput format:\\n■original_index■\\n■original_index■\\n■END■\\n\\nOutput the indices in the order they should appear (first item at top).\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYvq0H0LQ7zzYtUd00I3Yuvwv6dZ","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"To break the tie, I need the attributes for \"speed,\" \"mobility,\" and \"energy\" for these items (Human and Elephant). Could you provide them?","index":0,"stopReason":"stop"}],"usage":{"inputTokens":195,"inputCost":0.00048750000000000003,"outputTokens":34,"outputCost":0.00034},"botpress":{"cost":0.0008275000000000001}},"meta":{"cached":true}}}
838
+ {"test":"zai.sort > animal sorting > should sort animals by speed (descending - fastest to slowest)-1","key":"b35a5e24","input":"[{\"input\":{\"messages\":[{\"content\":\"Analyze this sorting instruction: \\\"from fastest to slowest\\\"\\n\\nSample items to be sorted:\\n■0: {\\\"name\\\":\\\"Snail\\\",\\\"type\\\":\\\"land\\\"}\\n■1: {\\\"name\\\":\\\"Cheetah\\\",\\\"type\\\":\\\"land\\\"}\\n■2: {\\\"name\\\":\\\"Turtle\\\",\\\"type\\\":\\\"land\\\"}\\n■3: {\\\"name\\\":\\\"Horse\\\",\\\"type\\\":\\\"land\\\"}\\n■4: {\\\"name\\\":\\\"Human\\\",\\\"type\\\":\\\"land\\\"}\\n\\nCreate 1-3 sorting criteria with ordered label arrays (3-10 labels each).\\n\\n**CRITICAL RULES**:\\n1. Labels are single words, lowercase, no spaces, use underscores\\n2. Labels are ordered from FIRST to LAST in sorted result\\n3. If instruction says \\\"from X to Y\\\": first label represents X, last label represents Y\\n4. If instruction says \\\"prioritize\\\" or \\\"highest/lowest priority\\\":\\n - First label = HIGHEST priority (top of todo list)\\n - Last label = LOWEST priority (bottom of todo list)\\n\\nExamples:\\n\\n\\\"from slowest to fastest\\\" → first=slowest, last=fastest\\n■speed■\\nvery_slow;slow;medium;fast;very_fast\\n■END■\\n\\n\\\"from most dangerous to least dangerous\\\" → first=most dangerous, last=least dangerous\\n■danger■\\nextremely_dangerous;very_dangerous;dangerous;moderate;slightly_dangerous;harmless\\n■END■\\n\\n\\\"from least urgent (spam) to most urgent (bills)\\\" → first=spam, last=bills\\n■urgency■\\nspam;promotional;normal;important;urgent;critical\\n■END■\\n\\n\\\"prioritize: highest priority=open old tickets; lowest priority=closed\\\" → first=high priority, last=low priority\\n■status■\\nopen_old;open_recent;closed\\n■age■\\noldest;old;recent;new\\n■END■\\n\\nOutput format:\\n■criterion_name■\\nlabel1;label2;label3;label4\\n■END■\\n\\nUse 3-10 labels per criterion. Labels should be intuitive and match the domain.\\nKeep criterion names short (1-2 words, lowercase, underscores).\\n\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating sorting criteria with ordered label arrays.\\n\\nCRITICAL: Output ordered labels from FIRST to LAST position in sorted result.\\n- Labels are single words, lowercase, underscores only\\n- 3-10 labels per criterion\\n- Order matters: first label = appears first, last label = appears last\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYvrY4WTGacr81pufa4XE0qHKXYc","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■speed■ \nfastest;very_fast;fast;moderate;slow;very_slow;slowest \n■END■ \n\n■animal_speed■ \ncheetah;horse;human;turtle;snail \n■END■ ","index":0,"stopReason":"stop"}],"usage":{"inputTokens":522,"inputCost":0.0013050000000000002,"outputTokens":50,"outputCost":0.0005},"botpress":{"cost":0.0018050000000000002}},"meta":{"cached":true}}}
839
+ {"test":"zai.sort > animal sorting > should sort animals by speed (descending - fastest to slowest)-2","key":"77f04820","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■6):\\n■0: {\\\"name\\\":\\\"Snail\\\",\\\"type\\\":\\\"land\\\"}■\\n■1: {\\\"name\\\":\\\"Cheetah\\\",\\\"type\\\":\\\"land\\\"}■\\n■2: {\\\"name\\\":\\\"Turtle\\\",\\\"type\\\":\\\"land\\\"}■\\n■3: {\\\"name\\\":\\\"Horse\\\",\\\"type\\\":\\\"land\\\"}■\\n■4: {\\\"name\\\":\\\"Human\\\",\\\"type\\\":\\\"land\\\"}■\\n■5: {\\\"name\\\":\\\"Sloth\\\",\\\"type\\\":\\\"land\\\"}■\\n■6: {\\\"name\\\":\\\"Lion\\\",\\\"type\\\":\\\"land\\\"}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**speed**: fastest;very_fast;fast;moderate;slow;very_slow;slowest\\n**animal_speed**: cheetah;horse;human;turtle;snail\\n\\nInstructions: \\\"from fastest to slowest\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■6)\\n- Use exact criterion names: speed, animal_speed\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYvtQIt0QdhHA2CZw7w77yrAtu8s","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:speed=very_slow;animal_speed=snail■ \n■1:speed=fastest;animal_speed=cheetah■ \n■2:speed=slow;animal_speed=turtle■ \n■3:speed=very_fast;animal_speed=horse■ \n■4:speed=moderate;animal_speed=human■ \n■5:speed=slowest;animal_speed=snail■ \n■6:speed=fast;animal_speed=horse■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":357,"inputCost":0.0008925000000000001,"outputTokens":104,"outputCost":0.0010400000000000001},"botpress":{"cost":0.0019325000000000002}},"meta":{"cached":true}}}
840
+ {"test":"zai.sort > animal sorting > should sort animals by danger level (ascending - least to most dangerous)-1","key":"8022faa2","input":"[{\"input\":{\"messages\":[{\"content\":\"Analyze this sorting instruction: \\\"from least dangerous to most dangerous\\\"\\n\\nSample items to be sorted:\\n■0: {\\\"name\\\":\\\"Grizzly Bear\\\",\\\"habitat\\\":\\\"forest\\\"}\\n■1: {\\\"name\\\":\\\"Rabbit\\\",\\\"habitat\\\":\\\"grassland\\\"}\\n■2: {\\\"name\\\":\\\"Hippopotamus\\\",\\\"habitat\\\":\\\"river\\\"}\\n■3: {\\\"name\\\":\\\"Deer\\\",\\\"habitat\\\":\\\"forest\\\"}\\n■4: {\\\"name\\\":\\\"Great White Shark\\\",\\\"habitat\\\":\\\"ocean\\\"}\\n\\nCreate 1-3 sorting criteria with ordered label arrays (3-10 labels each).\\n\\n**CRITICAL RULES**:\\n1. Labels are single words, lowercase, no spaces, use underscores\\n2. Labels are ordered from FIRST to LAST in sorted result\\n3. If instruction says \\\"from X to Y\\\": first label represents X, last label represents Y\\n4. If instruction says \\\"prioritize\\\" or \\\"highest/lowest priority\\\":\\n - First label = HIGHEST priority (top of todo list)\\n - Last label = LOWEST priority (bottom of todo list)\\n\\nExamples:\\n\\n\\\"from slowest to fastest\\\" → first=slowest, last=fastest\\n■speed■\\nvery_slow;slow;medium;fast;very_fast\\n■END■\\n\\n\\\"from most dangerous to least dangerous\\\" → first=most dangerous, last=least dangerous\\n■danger■\\nextremely_dangerous;very_dangerous;dangerous;moderate;slightly_dangerous;harmless\\n■END■\\n\\n\\\"from least urgent (spam) to most urgent (bills)\\\" → first=spam, last=bills\\n■urgency■\\nspam;promotional;normal;important;urgent;critical\\n■END■\\n\\n\\\"prioritize: highest priority=open old tickets; lowest priority=closed\\\" → first=high priority, last=low priority\\n■status■\\nopen_old;open_recent;closed\\n■age■\\noldest;old;recent;new\\n■END■\\n\\nOutput format:\\n■criterion_name■\\nlabel1;label2;label3;label4\\n■END■\\n\\nUse 3-10 labels per criterion. Labels should be intuitive and match the domain.\\nKeep criterion names short (1-2 words, lowercase, underscores).\\n\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating sorting criteria with ordered label arrays.\\n\\nCRITICAL: Output ordered labels from FIRST to LAST position in sorted result.\\n- Labels are single words, lowercase, underscores only\\n- 3-10 labels per criterion\\n- Order matters: first label = appears first, last label = appears last\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYvuT5iHj3O3O1xtbkeLEpOFlcCC","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■danger_level■ \nharmless;slight;moderate;dangerous;very_dangerous;extremely_dangerous \n■END■ \n\n■predator_type■ \nnon_predator;occasional_predator;apex_predator \n■END■ \n\n■animal_size■ \nvery_small;small;medium;large;very_large;massive \n■END■ ","index":0,"stopReason":"stop"}],"usage":{"inputTokens":534,"inputCost":0.001335,"outputTokens":81,"outputCost":0.0008100000000000001},"botpress":{"cost":0.002145}},"meta":{"cached":true}}}
841
+ {"test":"zai.sort > animal sorting > should sort animals by danger level (ascending - least to most dangerous)-2","key":"d05c26e2","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■8):\\n■0: {\\\"name\\\":\\\"Grizzly Bear\\\",\\\"habitat\\\":\\\"forest\\\"}■\\n■1: {\\\"name\\\":\\\"Rabbit\\\",\\\"habitat\\\":\\\"grassland\\\"}■\\n■2: {\\\"name\\\":\\\"Hippopotamus\\\",\\\"habitat\\\":\\\"river\\\"}■\\n■3: {\\\"name\\\":\\\"Deer\\\",\\\"habitat\\\":\\\"forest\\\"}■\\n■4: {\\\"name\\\":\\\"Great White Shark\\\",\\\"habitat\\\":\\\"ocean\\\"}■\\n■5: {\\\"name\\\":\\\"Butterfly\\\",\\\"habitat\\\":\\\"garden\\\"}■\\n■6: {\\\"name\\\":\\\"Crocodile\\\",\\\"habitat\\\":\\\"swamp\\\"}■\\n■7: {\\\"name\\\":\\\"Lion\\\",\\\"habitat\\\":\\\"savanna\\\"}■\\n■8: {\\\"name\\\":\\\"Lamb\\\",\\\"habitat\\\":\\\"farm\\\"}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**danger_level**: harmless;slight;moderate;dangerous;very_dangerous;extremely_dangerous\\n**predator_type**: non_predator;occasional_predator;apex_predator\\n**animal_size**: very_small;small;medium;large;very_large;massive\\n\\nInstructions: \\\"from least dangerous to most dangerous\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■8)\\n- Use exact criterion names: danger_level, predator_type, animal_size\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYvwbKyZ2AHHhwVzNjV00AgcPskP","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:danger_level=very_dangerous;predator_type=occasional_predator;animal_size=large■ \n■1:danger_level=harmless;predator_type=non_predator;animal_size=small■ \n■2:danger_level=dangerous;predator_type=occasional_predator;animal_size=very_large■ \n■3:danger_level=slight;predator_type=non_predator;animal_size=medium■ \n■4:danger_level=extremely_dangerous;predator_type=apex_predator;animal_size=very_large■ \n■5:danger_level=harmless;predator_type=non_predator;animal_size=very_small■ \n■6:danger_level=very_dangerous;predator_type=apex_predator;animal_size=large■ \n■7:danger_level=very_dangerous;predator_type=apex_predator;animal_size=large■ \n■8:danger_level=slight;predator_type=non_predator;animal_size=small■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":438,"inputCost":0.001095,"outputTokens":229,"outputCost":0.0022900000000000004},"botpress":{"cost":0.0033850000000000004}},"meta":{"cached":true}}}
842
+ {"test":"zai.sort > animal sorting > should sort animals by danger level (ascending - least to most dangerous)-4","key":"5bad8867","input":"[{\"input\":{\"messages\":[{\"content\":\"Items with identical scores (need tie-breaking):\\n■0: {\\\"name\\\":\\\"Crocodile\\\",\\\"habitat\\\":\\\"swamp\\\"}■\\n■1: {\\\"name\\\":\\\"Lion\\\",\\\"habitat\\\":\\\"savanna\\\"}■\\n\\nOrder them from first to last.\\nOutput format: ■index■ (one per line)\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are breaking a tie between items with identical total scores.\\n\\nInstructions: from least dangerous to most dangerous\\n\\nCriteria:\\n- danger_level: harmless;slight;moderate;dangerous;very_dangerous;extremely_dangerous\\n- predator_type: non_predator;occasional_predator;apex_predator\\n- animal_size: very_small;small;medium;large;very_large;massive\\n\\nOrder these 2 items from FIRST to LAST based on the instructions.\\nEarlier labels in each criterion should come FIRST.\\n\\nOutput format:\\n■original_index■\\n■original_index■\\n■END■\\n\\nOutput the indices in the order they should appear (first item at top).\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYvycVe84rjbn8Rlchq1EkAXXaRY","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"To break the tie, I need the details about their respective danger_level, predator_type, and animal_size, as provided in the instructions. Without this information, I cannot accurately order them based on the given criteria. Could you provide these details for both the Crocodile and the Lion?","index":0,"stopReason":"stop"}],"usage":{"inputTokens":218,"inputCost":0.000545,"outputTokens":58,"outputCost":0.00058},"botpress":{"cost":0.0011250000000000001}},"meta":{"cached":true}}}
843
+ {"test":"zai.sort > animal sorting > should sort animals by danger level (ascending - least to most dangerous)-3","key":"5992d006","input":"[{\"input\":{\"messages\":[{\"content\":\"Items with identical scores (need tie-breaking):\\n■0: {\\\"name\\\":\\\"Grizzly Bear\\\",\\\"habitat\\\":\\\"forest\\\"}■\\n■1: {\\\"name\\\":\\\"Hippopotamus\\\",\\\"habitat\\\":\\\"river\\\"}■\\n\\nOrder them from first to last.\\nOutput format: ■index■ (one per line)\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are breaking a tie between items with identical total scores.\\n\\nInstructions: from least dangerous to most dangerous\\n\\nCriteria:\\n- danger_level: harmless;slight;moderate;dangerous;very_dangerous;extremely_dangerous\\n- predator_type: non_predator;occasional_predator;apex_predator\\n- animal_size: very_small;small;medium;large;very_large;massive\\n\\nOrder these 2 items from FIRST to LAST based on the instructions.\\nEarlier labels in each criterion should come FIRST.\\n\\nOutput format:\\n■original_index■\\n■original_index■\\n■END■\\n\\nOutput the indices in the order they should appear (first item at top).\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYvxJiJHaJTIZO3AhH2l5kSkkmze","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"To break the tie, we use the criteria provided: danger_level, predator_type, and animal_size. Since no specific attributes are assigned to these animals, we assign typical values based on general knowledge.\n\n### Grizzly Bear:\n- danger_level: very_dangerous\n- predator_type: apex_predator\n- animal_size: large\n\n### Hippopotamus:\n- danger_level: extremely_dangerous\n- predator_type: occasional_predator\n- animal_size: massive\n\n### Tie-Breaking Analysis:\n1. **danger_level**: Hippopotamus (extremely_dangerous) ranks higher than Grizzly Bear (very_dangerous).\n2. **predator_type**: Grizzly Bear (apex_predator) ranks higher than Hippopotamus (occasional_predator), but this comes after danger_level.\n3. **animal_size**: Hippopotamus (massive) ranks higher than Grizzly Bear (large), following the previous criteria.\n\nBased on these comparisons, the Hippopotamus is more dangerous than the Grizzly Bear.\n\n### Final Order:\n■0■ \n■1■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":218,"inputCost":0.000545,"outputTokens":230,"outputCost":0.0023000000000000004},"botpress":{"cost":0.0028450000000000003}},"meta":{"cached":true}}}
844
+ {"test":"zai.sort > animal sorting > should sort animals by danger level (descending - most to least dangerous)-1","key":"389b88f6","input":"[{\"input\":{\"messages\":[{\"content\":\"Analyze this sorting instruction: \\\"from most dangerous to least dangerous\\\"\\n\\nSample items to be sorted:\\n■0: {\\\"name\\\":\\\"Rabbit\\\",\\\"type\\\":\\\"mammal\\\"}\\n■1: {\\\"name\\\":\\\"Lion\\\",\\\"type\\\":\\\"mammal\\\"}\\n■2: {\\\"name\\\":\\\"Butterfly\\\",\\\"type\\\":\\\"insect\\\"}\\n■3: {\\\"name\\\":\\\"Crocodile\\\",\\\"type\\\":\\\"reptile\\\"}\\n■4: {\\\"name\\\":\\\"Deer\\\",\\\"type\\\":\\\"mammal\\\"}\\n\\nCreate 1-3 sorting criteria with ordered label arrays (3-10 labels each).\\n\\n**CRITICAL RULES**:\\n1. Labels are single words, lowercase, no spaces, use underscores\\n2. Labels are ordered from FIRST to LAST in sorted result\\n3. If instruction says \\\"from X to Y\\\": first label represents X, last label represents Y\\n4. If instruction says \\\"prioritize\\\" or \\\"highest/lowest priority\\\":\\n - First label = HIGHEST priority (top of todo list)\\n - Last label = LOWEST priority (bottom of todo list)\\n\\nExamples:\\n\\n\\\"from slowest to fastest\\\" → first=slowest, last=fastest\\n■speed■\\nvery_slow;slow;medium;fast;very_fast\\n■END■\\n\\n\\\"from most dangerous to least dangerous\\\" → first=most dangerous, last=least dangerous\\n■danger■\\nextremely_dangerous;very_dangerous;dangerous;moderate;slightly_dangerous;harmless\\n■END■\\n\\n\\\"from least urgent (spam) to most urgent (bills)\\\" → first=spam, last=bills\\n■urgency■\\nspam;promotional;normal;important;urgent;critical\\n■END■\\n\\n\\\"prioritize: highest priority=open old tickets; lowest priority=closed\\\" → first=high priority, last=low priority\\n■status■\\nopen_old;open_recent;closed\\n■age■\\noldest;old;recent;new\\n■END■\\n\\nOutput format:\\n■criterion_name■\\nlabel1;label2;label3;label4\\n■END■\\n\\nUse 3-10 labels per criterion. Labels should be intuitive and match the domain.\\nKeep criterion names short (1-2 words, lowercase, underscores).\\n\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating sorting criteria with ordered label arrays.\\n\\nCRITICAL: Output ordered labels from FIRST to LAST position in sorted result.\\n- Labels are single words, lowercase, underscores only\\n- 3-10 labels per criterion\\n- Order matters: first label = appears first, last label = appears last\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYvz58vAlm3RZNVqkg0w7F1vpoZu","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■danger■ \nextremely_dangerous;very_dangerous;moderately_dangerous;slightly_dangerous;mostly_harmless;harmless \n■END■ \n\n■threat_level■ \ndeadly;predator;aggressive;neutral;cautious;docile \n■END■ \n\n■risk■ \nmaximum;high;moderate;low;minimal;none \n■END■ ","index":0,"stopReason":"stop"}],"usage":{"inputTokens":533,"inputCost":0.0013325000000000001,"outputTokens":87,"outputCost":0.0008700000000000001},"botpress":{"cost":0.0022025000000000005}},"meta":{"cached":true}}}
845
+ {"test":"zai.sort > animal sorting > should sort animals by danger level (descending - most to least dangerous)-2","key":"4852d9c","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■6):\\n■0: {\\\"name\\\":\\\"Rabbit\\\",\\\"type\\\":\\\"mammal\\\"}■\\n■1: {\\\"name\\\":\\\"Lion\\\",\\\"type\\\":\\\"mammal\\\"}■\\n■2: {\\\"name\\\":\\\"Butterfly\\\",\\\"type\\\":\\\"insect\\\"}■\\n■3: {\\\"name\\\":\\\"Crocodile\\\",\\\"type\\\":\\\"reptile\\\"}■\\n■4: {\\\"name\\\":\\\"Deer\\\",\\\"type\\\":\\\"mammal\\\"}■\\n■5: {\\\"name\\\":\\\"Cobra\\\",\\\"type\\\":\\\"reptile\\\"}■\\n■6: {\\\"name\\\":\\\"Squirrel\\\",\\\"type\\\":\\\"mammal\\\"}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**danger**: extremely_dangerous;very_dangerous;moderately_dangerous;slightly_dangerous;mostly_harmless;harmless\\n**threat_level**: deadly;predator;aggressive;neutral;cautious;docile\\n**risk**: maximum;high;moderate;low;minimal;none\\n\\nInstructions: \\\"from most dangerous to least dangerous\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■6)\\n- Use exact criterion names: danger, threat_level, risk\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYw1pob7HSmoWGJVVEb5Dz2coWyV","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:danger=harmless;threat_level=docile;risk=none■ \n■1:danger=very_dangerous;threat_level=predator;risk=high■ \n■2:danger=harmless;threat_level=docile;risk=none■ \n■3:danger=extremely_dangerous;threat_level=deadly;risk=maximum■ \n■4:danger=slightly_dangerous;threat_level=cautious;risk=low■ \n■5:danger=very_dangerous;threat_level=aggressive;risk=high■ \n■6:danger=mostly_harmless;threat_level=neutral;risk=minimal■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":407,"inputCost":0.0010175000000000002,"outputTokens":154,"outputCost":0.0015400000000000001},"botpress":{"cost":0.0025575000000000003}},"meta":{"cached":true}}}
846
+ {"test":"zai.sort > animal sorting > should sort animals by danger level (descending - most to least dangerous)-3","key":"cebe775f","input":"[{\"input\":{\"messages\":[{\"content\":\"Items with identical scores (need tie-breaking):\\n■0: {\\\"name\\\":\\\"Rabbit\\\",\\\"type\\\":\\\"mammal\\\"}■\\n■1: {\\\"name\\\":\\\"Butterfly\\\",\\\"type\\\":\\\"insect\\\"}■\\n\\nOrder them from first to last.\\nOutput format: ■index■ (one per line)\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are breaking a tie between items with identical total scores.\\n\\nInstructions: from most dangerous to least dangerous\\n\\nCriteria:\\n- danger: extremely_dangerous;very_dangerous;moderately_dangerous;slightly_dangerous;mostly_harmless;harmless\\n- threat_level: deadly;predator;aggressive;neutral;cautious;docile\\n- risk: maximum;high;moderate;low;minimal;none\\n\\nOrder these 2 items from FIRST to LAST based on the instructions.\\nEarlier labels in each criterion should come FIRST.\\n\\nOutput format:\\n■original_index■\\n■original_index■\\n■END■\\n\\nOutput the indices in the order they should appear (first item at top).\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYw2pK2vJDgchZMMSg0lW2DSm56L","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0■ \n■1■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":220,"inputCost":0.00055,"outputTokens":11,"outputCost":0.00011},"botpress":{"cost":0.00066}},"meta":{"cached":true}}}
847
+ {"test":"zai.sort > email priority sorting > should sort emails by priority (spam to urgent bills)-1","key":"1ca130f3","input":"[{\"input\":{\"messages\":[{\"content\":\"Analyze this sorting instruction: \\\"from least urgent (spam) to most urgent (bills to pay)\\\"\\n\\nSample items to be sorted:\\n■0: {\\\"subject\\\":\\\"Electricity bill due tomorrow\\\",\\\"from\\\":\\\"utility@electric.com\\\",\\\"date\\\":\\\"2024-01-15\\\"}\\n■1: {\\\"subject\\\":\\\"Hot singles in your area!\\\",\\\"from\\\":\\\"spam@xyz.com\\\",\\\"date\\\":\\\"2024-01-14\\\"}\\n■2: {\\\"subject\\\":\\\"Team meeting notes\\\",\\\"from\\\":\\\"colleague@work.com\\\",\\\"date\\\":\\\"2024-01-15\\\"}\\n■3: {\\\"subject\\\":\\\"Your credit card payment is overdue\\\",\\\"from\\\":\\\"bank@chase.com\\\",\\\"date\\\":\\\"2024-01-15\\\"}\\n■4: {\\\"subject\\\":\\\"Get rich quick scheme\\\",\\\"from\\\":\\\"scam@fake.com\\\",\\\"date\\\":\\\"2024-01-13\\\"}\\n\\nCreate 1-3 sorting criteria with ordered label arrays (3-10 labels each).\\n\\n**CRITICAL RULES**:\\n1. Labels are single words, lowercase, no spaces, use underscores\\n2. Labels are ordered from FIRST to LAST in sorted result\\n3. If instruction says \\\"from X to Y\\\": first label represents X, last label represents Y\\n4. If instruction says \\\"prioritize\\\" or \\\"highest/lowest priority\\\":\\n - First label = HIGHEST priority (top of todo list)\\n - Last label = LOWEST priority (bottom of todo list)\\n\\nExamples:\\n\\n\\\"from slowest to fastest\\\" → first=slowest, last=fastest\\n■speed■\\nvery_slow;slow;medium;fast;very_fast\\n■END■\\n\\n\\\"from most dangerous to least dangerous\\\" → first=most dangerous, last=least dangerous\\n■danger■\\nextremely_dangerous;very_dangerous;dangerous;moderate;slightly_dangerous;harmless\\n■END■\\n\\n\\\"from least urgent (spam) to most urgent (bills)\\\" → first=spam, last=bills\\n■urgency■\\nspam;promotional;normal;important;urgent;critical\\n■END■\\n\\n\\\"prioritize: highest priority=open old tickets; lowest priority=closed\\\" → first=high priority, last=low priority\\n■status■\\nopen_old;open_recent;closed\\n■age■\\noldest;old;recent;new\\n■END■\\n\\nOutput format:\\n■criterion_name■\\nlabel1;label2;label3;label4\\n■END■\\n\\nUse 3-10 labels per criterion. Labels should be intuitive and match the domain.\\nKeep criterion names short (1-2 words, lowercase, underscores).\\n\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating sorting criteria with ordered label arrays.\\n\\nCRITICAL: Output ordered labels from FIRST to LAST position in sorted result.\\n- Labels are single words, lowercase, underscores only\\n- 3-10 labels per criterion\\n- Order matters: first label = appears first, last label = appears last\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYw3IwBRXqsi7tHQVvwWJOZ6QCp9","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■urgency■ \nspam;promotional;routine;important;urgent;bills_to_pay \n■END■ \n\n■category■ \nspam;scams;general_updates;work_related;financial \n■END■ \n\n■priority■ \nlow;moderate;high;critical \n■END■ ","index":0,"stopReason":"stop"}],"usage":{"inputTokens":609,"inputCost":0.0015225000000000002,"outputTokens":62,"outputCost":0.00062},"botpress":{"cost":0.0021425000000000003}},"meta":{"cached":true}}}
848
+ {"test":"zai.sort > email priority sorting > should sort emails by priority (spam to urgent bills)-2","key":"fb90b15a","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■8):\\n■0: {\\\"subject\\\":\\\"Electricity bill due tomorrow\\\",\\\"from\\\":\\\"utility@electric.com\\\",\\\"date\\\":\\\"2024-01-15\\\"}■\\n■1: {\\\"subject\\\":\\\"Hot singles in your area!\\\",\\\"from\\\":\\\"spam@xyz.com\\\",\\\"date\\\":\\\"2024-01-14\\\"}■\\n■2: {\\\"subject\\\":\\\"Team meeting notes\\\",\\\"from\\\":\\\"colleague@work.com\\\",\\\"date\\\":\\\"2024-01-15\\\"}■\\n■3: {\\\"subject\\\":\\\"Your credit card payment is overdue\\\",\\\"from\\\":\\\"bank@chase.com\\\",\\\"date\\\":\\\"2024-01-15\\\"}■\\n■4: {\\\"subject\\\":\\\"Get rich quick scheme\\\",\\\"from\\\":\\\"scam@fake.com\\\",\\\"date\\\":\\\"2024-01-13\\\"}■\\n■5: {\\\"subject\\\":\\\"Newsletter: Weekly updates\\\",\\\"from\\\":\\\"newsletter@blog.com\\\",\\\"date\\\":\\\"2024-01-14\\\"}■\\n■6: {\\\"subject\\\":\\\"Rent payment due in 2 days\\\",\\\"from\\\":\\\"landlord@property.com\\\",\\\"date\\\":\\\"2024-01-15\\\"}■\\n■7: {\\\"subject\\\":\\\"Congratulations! You won the lottery!\\\",\\\"from\\\":\\\"notreal@scam.com\\\",\\\"date\\\":\\\"2024-01-12\\\"}■\\n■8: {\\\"subject\\\":\\\"Project deadline reminder\\\",\\\"from\\\":\\\"manager@work.com\\\",\\\"date\\\":\\\"2024-01-15\\\"}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**urgency**: spam;promotional;routine;important;urgent;bills_to_pay\\n**category**: spam;scams;general_updates;work_related;financial\\n**priority**: low;moderate;high;critical\\n\\nInstructions: \\\"from least urgent (spam) to most urgent (bills to pay)\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■8)\\n- Use exact criterion names: urgency, category, priority\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYw4T5Ebeco2PVPC3PXpU42jCggn","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:urgency=bills_to_pay;category=financial;priority=critical■ \n■1:urgency=spam;category=spam;priority=low■ \n■2:urgency=routine;category=work_related;priority=moderate■ \n■3:urgency=urgent;category=financial;priority=critical■ \n■4:urgency=spam;category=scams;priority=low■ \n■5:urgency=promotional;category=general_updates;priority=low■ \n■6:urgency=urgent;category=financial;priority=critical■ \n■7:urgency=spam;category=scams;priority=low■ \n■8:urgency=important;category=work_related;priority=high■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":548,"inputCost":0.0013700000000000001,"outputTokens":163,"outputCost":0.0016300000000000002},"botpress":{"cost":0.003}},"meta":{"cached":true}}}
849
+ {"test":"zai.sort > email priority sorting > should sort emails by priority (spam to urgent bills)-3","key":"a2330f5e","input":"[{\"input\":{\"messages\":[{\"content\":\"Items with identical scores (need tie-breaking):\\n■0: {\\\"subject\\\":\\\"Your credit card payment is overdue\\\",\\\"from\\\":\\\"bank@chase.com\\\",\\\"date\\\":\\\"2024-01-15\\\"}■\\n■1: {\\\"subject\\\":\\\"Rent payment due in 2 days\\\",\\\"from\\\":\\\"landlord@property.com\\\",\\\"date\\\":\\\"2024-01-15\\\"}■\\n\\nOrder them from first to last.\\nOutput format: ■index■ (one per line)\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are breaking a tie between items with identical total scores.\\n\\nInstructions: from least urgent (spam) to most urgent (bills to pay)\\n\\nCriteria:\\n- urgency: spam;promotional;routine;important;urgent;bills_to_pay\\n- category: spam;scams;general_updates;work_related;financial\\n- priority: low;moderate;high;critical\\n\\nOrder these 2 items from FIRST to LAST based on the instructions.\\nEarlier labels in each criterion should come FIRST.\\n\\nOutput format:\\n■original_index■\\n■original_index■\\n■END■\\n\\nOutput the indices in the order they should appear (first item at top).\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYw60j2ctxBQIkQyEzYTPv4S5oWQ","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■1■ \n■0■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":237,"inputCost":0.0005925,"outputTokens":11,"outputCost":0.00011},"botpress":{"cost":0.0007025}},"meta":{"cached":true}}}
850
+ {"test":"zai.sort > email priority sorting > should sort emails by priority (spam to urgent bills)-4","key":"9fa7d40f","input":"[{\"input\":{\"messages\":[{\"content\":\"Items with identical scores (need tie-breaking):\\n■0: {\\\"subject\\\":\\\"Get rich quick scheme\\\",\\\"from\\\":\\\"scam@fake.com\\\",\\\"date\\\":\\\"2024-01-13\\\"}■\\n■1: {\\\"subject\\\":\\\"Congratulations! You won the lottery!\\\",\\\"from\\\":\\\"notreal@scam.com\\\",\\\"date\\\":\\\"2024-01-12\\\"}■\\n\\nOrder them from first to last.\\nOutput format: ■index■ (one per line)\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are breaking a tie between items with identical total scores.\\n\\nInstructions: from least urgent (spam) to most urgent (bills to pay)\\n\\nCriteria:\\n- urgency: spam;promotional;routine;important;urgent;bills_to_pay\\n- category: spam;scams;general_updates;work_related;financial\\n- priority: low;moderate;high;critical\\n\\nOrder these 2 items from FIRST to LAST based on the instructions.\\nEarlier labels in each criterion should come FIRST.\\n\\nOutput format:\\n■original_index■\\n■original_index■\\n■END■\\n\\nOutput the indices in the order they should appear (first item at top).\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYw6RbnOUPmKStVlOVWLrakxLwkw","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0■ \n■1■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":237,"inputCost":0.0005925,"outputTokens":11,"outputCost":0.00011},"botpress":{"cost":0.0007025}},"meta":{"cached":true}}}
851
+ {"test":"zai.sort > email priority sorting > should sort emails by priority (urgent bills to spam)-1","key":"525771c6","input":"[{\"input\":{\"messages\":[{\"content\":\"Analyze this sorting instruction: \\\"from most urgent (bills to pay) to least urgent (spam)\\\"\\n\\nSample items to be sorted:\\n■0: {\\\"subject\\\":\\\"Newsletter subscription\\\",\\\"from\\\":\\\"news@site.com\\\",\\\"date\\\":\\\"2024-01-14\\\"}\\n■1: {\\\"subject\\\":\\\"Insurance payment required\\\",\\\"from\\\":\\\"insurance@company.com\\\",\\\"date\\\":\\\"2024-01-15\\\"}\\n■2: {\\\"subject\\\":\\\"Win a free iPhone!\\\",\\\"from\\\":\\\"spam@ads.com\\\",\\\"date\\\":\\\"2024-01-13\\\"}\\n■3: {\\\"subject\\\":\\\"Mortgage payment due\\\",\\\"from\\\":\\\"bank@mortgage.com\\\",\\\"date\\\":\\\"2024-01-16\\\"}\\n■4: {\\\"subject\\\":\\\"Social media notification\\\",\\\"from\\\":\\\"noreply@social.com\\\",\\\"date\\\":\\\"2024-01-15\\\"}\\n\\nCreate 1-3 sorting criteria with ordered label arrays (3-10 labels each).\\n\\n**CRITICAL RULES**:\\n1. Labels are single words, lowercase, no spaces, use underscores\\n2. Labels are ordered from FIRST to LAST in sorted result\\n3. If instruction says \\\"from X to Y\\\": first label represents X, last label represents Y\\n4. If instruction says \\\"prioritize\\\" or \\\"highest/lowest priority\\\":\\n - First label = HIGHEST priority (top of todo list)\\n - Last label = LOWEST priority (bottom of todo list)\\n\\nExamples:\\n\\n\\\"from slowest to fastest\\\" → first=slowest, last=fastest\\n■speed■\\nvery_slow;slow;medium;fast;very_fast\\n■END■\\n\\n\\\"from most dangerous to least dangerous\\\" → first=most dangerous, last=least dangerous\\n■danger■\\nextremely_dangerous;very_dangerous;dangerous;moderate;slightly_dangerous;harmless\\n■END■\\n\\n\\\"from least urgent (spam) to most urgent (bills)\\\" → first=spam, last=bills\\n■urgency■\\nspam;promotional;normal;important;urgent;critical\\n■END■\\n\\n\\\"prioritize: highest priority=open old tickets; lowest priority=closed\\\" → first=high priority, last=low priority\\n■status■\\nopen_old;open_recent;closed\\n■age■\\noldest;old;recent;new\\n■END■\\n\\nOutput format:\\n■criterion_name■\\nlabel1;label2;label3;label4\\n■END■\\n\\nUse 3-10 labels per criterion. Labels should be intuitive and match the domain.\\nKeep criterion names short (1-2 words, lowercase, underscores).\\n\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating sorting criteria with ordered label arrays.\\n\\nCRITICAL: Output ordered labels from FIRST to LAST position in sorted result.\\n- Labels are single words, lowercase, underscores only\\n- 3-10 labels per criterion\\n- Order matters: first label = appears first, last label = appears last\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYw6nYoFJQgeVcOeslvGtN1ieSAg","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■urgency■ \nbills;important;notifications;promotions;spam \n■END■ \n\n■priority■ \ncritical;high;medium;low;junk \n■END■ \n\n■type■ \nfinancial;important;general;social;advertisement \n■END■ ","index":0,"stopReason":"stop"}],"usage":{"inputTokens":602,"inputCost":0.001505,"outputTokens":58,"outputCost":0.00058},"botpress":{"cost":0.002085}},"meta":{"cached":true}}}
852
+ {"test":"zai.sort > email priority sorting > should sort emails by priority (urgent bills to spam)-2","key":"c71a6880","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■4):\\n■0: {\\\"subject\\\":\\\"Newsletter subscription\\\",\\\"from\\\":\\\"news@site.com\\\",\\\"date\\\":\\\"2024-01-14\\\"}■\\n■1: {\\\"subject\\\":\\\"Insurance payment required\\\",\\\"from\\\":\\\"insurance@company.com\\\",\\\"date\\\":\\\"2024-01-15\\\"}■\\n■2: {\\\"subject\\\":\\\"Win a free iPhone!\\\",\\\"from\\\":\\\"spam@ads.com\\\",\\\"date\\\":\\\"2024-01-13\\\"}■\\n■3: {\\\"subject\\\":\\\"Mortgage payment due\\\",\\\"from\\\":\\\"bank@mortgage.com\\\",\\\"date\\\":\\\"2024-01-16\\\"}■\\n■4: {\\\"subject\\\":\\\"Social media notification\\\",\\\"from\\\":\\\"noreply@social.com\\\",\\\"date\\\":\\\"2024-01-15\\\"}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**urgency**: bills;important;notifications;promotions;spam\\n**priority**: critical;high;medium;low;junk\\n**type**: financial;important;general;social;advertisement\\n\\nInstructions: \\\"from most urgent (bills to pay) to least urgent (spam)\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■4)\\n- Use exact criterion names: urgency, priority, type\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYw76qcCBFylQvuqGavXbZDqY2aJ","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:urgency=promotions;priority=low;type=advertisement■ \n■1:urgency=bills;priority=high;type=financial■ \n■2:urgency=spam;priority=junk;type=advertisement■ \n■3:urgency=bills;priority=critical;type=financial■ \n■4:urgency=notifications;priority=medium;type=social■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":413,"inputCost":0.0010325,"outputTokens":91,"outputCost":0.0009100000000000001},"botpress":{"cost":0.0019425000000000002}},"meta":{"cached":true}}}
853
+ {"test":"zai.sort > help desk ticket sorting > should sort tickets by urgency (based on ARR and customer tenure)-1","key":"1f5f26c8","input":"[{\"input\":{\"messages\":[{\"content\":\"Analyze this sorting instruction: \\\"from least urgent to most urgent (consider both ARR and how long they have been customers - high ARR + long tenure = most urgent)\\\"\\n\\nSample items to be sorted:\\n■0: {\\\"id\\\":\\\"T-001\\\",\\\"customer\\\":\\\"StartupCo\\\",\\\"issue\\\":\\\"Login not working\\\",\\\"arr\\\":5000,\\\"customerSince\\\":\\\"2023-01-15\\\",\\\"status\\\":\\\"open\\\",\\\"createdAt\\\":\\\"2024-01-15T10:00:00Z\\\"}\\n■1: {\\\"id\\\":\\\"T-002\\\",\\\"customer\\\":\\\"EnterpriseCorp\\\",\\\"issue\\\":\\\"System down - production outage\\\",\\\"arr\\\":500000,\\\"customerSince\\\":\\\"2020-03-10\\\",\\\"status\\\":\\\"open\\\",\\\"createdAt\\\":\\\"2024-01-15T09:00:00Z\\\"}\\n■2: {\\\"id\\\":\\\"T-003\\\",\\\"customer\\\":\\\"MediumBiz\\\",\\\"issue\\\":\\\"Feature request for dashboard\\\",\\\"arr\\\":50000,\\\"customerSince\\\":\\\"2022-06-20\\\",\\\"status\\\":\\\"open\\\",\\\"createdAt\\\":\\\"2024-01-15T11:00:00Z\\\"}\\n■3: {\\\"id\\\":\\\"T-004\\\",\\\"customer\\\":\\\"NewCustomer\\\",\\\"issue\\\":\\\"Cannot access account\\\",\\\"arr\\\":2000,\\\"customerSince\\\":\\\"2024-01-01\\\",\\\"status\\\":\\\"open\\\",\\\"createdAt\\\":\\\"2024-01-15T08:00:00Z\\\"}\\n■4: {\\\"id\\\":\\\"T-005\\\",\\\"customer\\\":\\\"LoyalClient\\\",\\\"issue\\\":\\\"Critical bug affecting workflow\\\",\\\"arr\\\":120000,\\\"customerSince\\\":\\\"2019-01-01\\\",\\\"status\\\":\\\"open\\\",\\\"createdAt\\\":\\\"2024-01-15T10:30:00Z\\\"}\\n\\nCreate 1-3 sorting criteria with ordered label arrays (3-10 labels each).\\n\\n**CRITICAL RULES**:\\n1. Labels are single words, lowercase, no spaces, use underscores\\n2. Labels are ordered from FIRST to LAST in sorted result\\n3. If instruction says \\\"from X to Y\\\": first label represents X, last label represents Y\\n4. If instruction says \\\"prioritize\\\" or \\\"highest/lowest priority\\\":\\n - First label = HIGHEST priority (top of todo list)\\n - Last label = LOWEST priority (bottom of todo list)\\n\\nExamples:\\n\\n\\\"from slowest to fastest\\\" → first=slowest, last=fastest\\n■speed■\\nvery_slow;slow;medium;fast;very_fast\\n■END■\\n\\n\\\"from most dangerous to least dangerous\\\" → first=most dangerous, last=least dangerous\\n■danger■\\nextremely_dangerous;very_dangerous;dangerous;moderate;slightly_dangerous;harmless\\n■END■\\n\\n\\\"from least urgent (spam) to most urgent (bills)\\\" → first=spam, last=bills\\n■urgency■\\nspam;promotional;normal;important;urgent;critical\\n■END■\\n\\n\\\"prioritize: highest priority=open old tickets; lowest priority=closed\\\" → first=high priority, last=low priority\\n■status■\\nopen_old;open_recent;closed\\n■age■\\noldest;old;recent;new\\n■END■\\n\\nOutput format:\\n■criterion_name■\\nlabel1;label2;label3;label4\\n■END■\\n\\nUse 3-10 labels per criterion. Labels should be intuitive and match the domain.\\nKeep criterion names short (1-2 words, lowercase, underscores).\\n\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating sorting criteria with ordered label arrays.\\n\\nCRITICAL: Output ordered labels from FIRST to LAST position in sorted result.\\n- Labels are single words, lowercase, underscores only\\n- 3-10 labels per criterion\\n- Order matters: first label = appears first, last label = appears last\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYw94JikY9q1leqK5rad1fi8rB4B","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■urgency■ \nlow_arr_short_tenure;low_arr_medium_tenure;low_arr_long_tenure;medium_arr_short_tenure;medium_arr_medium_tenure;medium_arr_long_tenure;high_arr_short_tenure;high_arr_medium_tenure;high_arr_long_tenure \n■END■ \n\n■arr_tenure■ \nminimal_investment;low_value_recent;low_value_established;moderate_value_recent;moderate_value_established;high_value_loyal;highest_value_loyal \n■END■ \n\n■priority■ \nleast_impactful;minor_issue;functional_support;workflow_blocker;production_outage;critical_outage \n■END■ ","index":0,"stopReason":"stop"}],"usage":{"inputTokens":769,"inputCost":0.0019225000000000002,"outputTokens":149,"outputCost":0.0014900000000000002},"botpress":{"cost":0.0034125000000000006}},"meta":{"cached":true}}}
854
+ {"test":"zai.sort > help desk ticket sorting > should sort tickets by urgency (based on ARR and customer tenure)-2","key":"ecf8bd0c","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■4):\\n■0: {\\\"id\\\":\\\"T-001\\\",\\\"customer\\\":\\\"StartupCo\\\",\\\"issue\\\":\\\"Login not working\\\",\\\"arr\\\":5000,\\\"customerSince\\\":\\\"2023-01-15\\\",\\\"status\\\":\\\"open\\\",\\\"createdAt\\\":\\\"2024-01-15T10:00:00Z\\\"}■\\n■1: {\\\"id\\\":\\\"T-002\\\",\\\"customer\\\":\\\"EnterpriseCorp\\\",\\\"issue\\\":\\\"System down - production outage\\\",\\\"arr\\\":500000,\\\"customerSince\\\":\\\"2020-03-10\\\",\\\"status\\\":\\\"open\\\",\\\"createdAt\\\":\\\"2024-01-15T09:00:00Z\\\"}■\\n■2: {\\\"id\\\":\\\"T-003\\\",\\\"customer\\\":\\\"MediumBiz\\\",\\\"issue\\\":\\\"Feature request for dashboard\\\",\\\"arr\\\":50000,\\\"customerSince\\\":\\\"2022-06-20\\\",\\\"status\\\":\\\"open\\\",\\\"createdAt\\\":\\\"2024-01-15T11:00:00Z\\\"}■\\n■3: {\\\"id\\\":\\\"T-004\\\",\\\"customer\\\":\\\"NewCustomer\\\",\\\"issue\\\":\\\"Cannot access account\\\",\\\"arr\\\":2000,\\\"customerSince\\\":\\\"2024-01-01\\\",\\\"status\\\":\\\"open\\\",\\\"createdAt\\\":\\\"2024-01-15T08:00:00Z\\\"}■\\n■4: {\\\"id\\\":\\\"T-005\\\",\\\"customer\\\":\\\"LoyalClient\\\",\\\"issue\\\":\\\"Critical bug affecting workflow\\\",\\\"arr\\\":120000,\\\"customerSince\\\":\\\"2019-01-01\\\",\\\"status\\\":\\\"open\\\",\\\"createdAt\\\":\\\"2024-01-15T10:30:00Z\\\"}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**urgency**: low_arr_short_tenure;low_arr_medium_tenure;low_arr_long_tenure;medium_arr_short_tenure;medium_arr_medium_tenure;medium_arr_long_tenure;high_arr_short_tenure;high_arr_medium_tenure;high_arr_long_tenure\\n**arr_tenure**: minimal_investment;low_value_recent;low_value_established;moderate_value_recent;moderate_value_established;high_value_loyal;highest_value_loyal\\n**priority**: least_impactful;minor_issue;functional_support;workflow_blocker;production_outage;critical_outage\\n\\nInstructions: \\\"from least urgent to most urgent (consider both ARR and how long they have been customers - high ARR + long tenure = most urgent)\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■4)\\n- Use exact criterion names: urgency, arr_tenure, priority\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwBzpdL3iTw3l0F3ScVcNUhABSE","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:urgency=low_arr_short_tenure;arr_tenure=low_value_recent■ \n■1:urgency=high_arr_long_tenure;arr_tenure=highest_value_loyal■ \n■2:urgency=medium_arr_medium_tenure;arr_tenure=moderate_value_recent■ \n■3:urgency=low_arr_short_tenure;arr_tenure=minimal_investment■ \n■4:urgency=high_arr_long_tenure;arr_tenure=high_value_loyal■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":675,"inputCost":0.0016875000000000002,"outputTokens":122,"outputCost":0.0012200000000000002},"botpress":{"cost":0.0029075000000000004}},"meta":{"cached":true}}}
855
+ {"test":"zai.sort > help desk ticket sorting > should sort tickets by creation date (oldest to newest)-1","key":"90bce7f0","input":"[{\"input\":{\"messages\":[{\"content\":\"Analyze this sorting instruction: \\\"from oldest to newest by creation date\\\"\\n\\nSample items to be sorted:\\n■0: {\\\"id\\\":\\\"T-101\\\",\\\"title\\\":\\\"Bug in checkout\\\",\\\"createdAt\\\":\\\"2024-01-15T14:00:00Z\\\",\\\"status\\\":\\\"open\\\"}\\n■1: {\\\"id\\\":\\\"T-102\\\",\\\"title\\\":\\\"Feature request\\\",\\\"createdAt\\\":\\\"2024-01-10T09:00:00Z\\\",\\\"status\\\":\\\"open\\\"}\\n■2: {\\\"id\\\":\\\"T-103\\\",\\\"title\\\":\\\"Login issue\\\",\\\"createdAt\\\":\\\"2024-01-18T11:00:00Z\\\",\\\"status\\\":\\\"open\\\"}\\n■3: {\\\"id\\\":\\\"T-104\\\",\\\"title\\\":\\\"Payment failed\\\",\\\"createdAt\\\":\\\"2024-01-12T16:30:00Z\\\",\\\"status\\\":\\\"open\\\"}\\n■4: {\\\"id\\\":\\\"T-105\\\",\\\"title\\\":\\\"UI glitch\\\",\\\"createdAt\\\":\\\"2024-01-08T08:00:00Z\\\",\\\"status\\\":\\\"open\\\"}\\n\\nCreate 1-3 sorting criteria with ordered label arrays (3-10 labels each).\\n\\n**CRITICAL RULES**:\\n1. Labels are single words, lowercase, no spaces, use underscores\\n2. Labels are ordered from FIRST to LAST in sorted result\\n3. If instruction says \\\"from X to Y\\\": first label represents X, last label represents Y\\n4. If instruction says \\\"prioritize\\\" or \\\"highest/lowest priority\\\":\\n - First label = HIGHEST priority (top of todo list)\\n - Last label = LOWEST priority (bottom of todo list)\\n\\nExamples:\\n\\n\\\"from slowest to fastest\\\" → first=slowest, last=fastest\\n■speed■\\nvery_slow;slow;medium;fast;very_fast\\n■END■\\n\\n\\\"from most dangerous to least dangerous\\\" → first=most dangerous, last=least dangerous\\n■danger■\\nextremely_dangerous;very_dangerous;dangerous;moderate;slightly_dangerous;harmless\\n■END■\\n\\n\\\"from least urgent (spam) to most urgent (bills)\\\" → first=spam, last=bills\\n■urgency■\\nspam;promotional;normal;important;urgent;critical\\n■END■\\n\\n\\\"prioritize: highest priority=open old tickets; lowest priority=closed\\\" → first=high priority, last=low priority\\n■status■\\nopen_old;open_recent;closed\\n■age■\\noldest;old;recent;new\\n■END■\\n\\nOutput format:\\n■criterion_name■\\nlabel1;label2;label3;label4\\n■END■\\n\\nUse 3-10 labels per criterion. Labels should be intuitive and match the domain.\\nKeep criterion names short (1-2 words, lowercase, underscores).\\n\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating sorting criteria with ordered label arrays.\\n\\nCRITICAL: Output ordered labels from FIRST to LAST position in sorted result.\\n- Labels are single words, lowercase, underscores only\\n- 3-10 labels per criterion\\n- Order matters: first label = appears first, last label = appears last\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwD1mwnEUfgVO4AVXTTUgf5W1lz","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■creation_date■ \noldest;older;recent;newer;newest \n■END■ ","index":0,"stopReason":"stop"}],"usage":{"inputTokens":641,"inputCost":0.0016025000000000002,"outputTokens":22,"outputCost":0.00022},"botpress":{"cost":0.0018225000000000001}},"meta":{"cached":true}}}
856
+ {"test":"zai.sort > help desk ticket sorting > should sort tickets by creation date (oldest to newest)-2","key":"b0f5ef95","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■4):\\n■0: {\\\"id\\\":\\\"T-101\\\",\\\"title\\\":\\\"Bug in checkout\\\",\\\"createdAt\\\":\\\"2024-01-15T14:00:00Z\\\",\\\"status\\\":\\\"open\\\"}■\\n■1: {\\\"id\\\":\\\"T-102\\\",\\\"title\\\":\\\"Feature request\\\",\\\"createdAt\\\":\\\"2024-01-10T09:00:00Z\\\",\\\"status\\\":\\\"open\\\"}■\\n■2: {\\\"id\\\":\\\"T-103\\\",\\\"title\\\":\\\"Login issue\\\",\\\"createdAt\\\":\\\"2024-01-18T11:00:00Z\\\",\\\"status\\\":\\\"open\\\"}■\\n■3: {\\\"id\\\":\\\"T-104\\\",\\\"title\\\":\\\"Payment failed\\\",\\\"createdAt\\\":\\\"2024-01-12T16:30:00Z\\\",\\\"status\\\":\\\"open\\\"}■\\n■4: {\\\"id\\\":\\\"T-105\\\",\\\"title\\\":\\\"UI glitch\\\",\\\"createdAt\\\":\\\"2024-01-08T08:00:00Z\\\",\\\"status\\\":\\\"open\\\"}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**creation_date**: oldest;older;recent;newer;newest\\n\\nInstructions: \\\"from oldest to newest by creation date\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■4)\\n- Use exact criterion names: creation_date\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwEsqP9hkWORSuGpYNmUtM7fdDM","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:creation_date=newer■ \n■1:creation_date=older■ \n■2:creation_date=newest■ \n■3:creation_date=recent■ \n■4:creation_date=oldest■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":421,"inputCost":0.0010525,"outputTokens":49,"outputCost":0.0004900000000000001},"botpress":{"cost":0.0015425}},"meta":{"cached":true}}}
857
+ {"test":"zai.sort > help desk ticket sorting > should sort tickets by creation date (newest to oldest)-1","key":"a964f44c","input":"[{\"input\":{\"messages\":[{\"content\":\"Analyze this sorting instruction: \\\"from newest to oldest by creation date\\\"\\n\\nSample items to be sorted:\\n■0: {\\\"id\\\":\\\"T-201\\\",\\\"title\\\":\\\"Issue A\\\",\\\"createdAt\\\":\\\"2024-01-10T10:00:00Z\\\",\\\"status\\\":\\\"open\\\"}\\n■1: {\\\"id\\\":\\\"T-202\\\",\\\"title\\\":\\\"Issue B\\\",\\\"createdAt\\\":\\\"2024-01-15T12:00:00Z\\\",\\\"status\\\":\\\"open\\\"}\\n■2: {\\\"id\\\":\\\"T-203\\\",\\\"title\\\":\\\"Issue C\\\",\\\"createdAt\\\":\\\"2024-01-08T09:00:00Z\\\",\\\"status\\\":\\\"open\\\"}\\n■3: {\\\"id\\\":\\\"T-204\\\",\\\"title\\\":\\\"Issue D\\\",\\\"createdAt\\\":\\\"2024-01-20T14:00:00Z\\\",\\\"status\\\":\\\"open\\\"}\\n\\nCreate 1-3 sorting criteria with ordered label arrays (3-10 labels each).\\n\\n**CRITICAL RULES**:\\n1. Labels are single words, lowercase, no spaces, use underscores\\n2. Labels are ordered from FIRST to LAST in sorted result\\n3. If instruction says \\\"from X to Y\\\": first label represents X, last label represents Y\\n4. If instruction says \\\"prioritize\\\" or \\\"highest/lowest priority\\\":\\n - First label = HIGHEST priority (top of todo list)\\n - Last label = LOWEST priority (bottom of todo list)\\n\\nExamples:\\n\\n\\\"from slowest to fastest\\\" → first=slowest, last=fastest\\n■speed■\\nvery_slow;slow;medium;fast;very_fast\\n■END■\\n\\n\\\"from most dangerous to least dangerous\\\" → first=most dangerous, last=least dangerous\\n■danger■\\nextremely_dangerous;very_dangerous;dangerous;moderate;slightly_dangerous;harmless\\n■END■\\n\\n\\\"from least urgent (spam) to most urgent (bills)\\\" → first=spam, last=bills\\n■urgency■\\nspam;promotional;normal;important;urgent;critical\\n■END■\\n\\n\\\"prioritize: highest priority=open old tickets; lowest priority=closed\\\" → first=high priority, last=low priority\\n■status■\\nopen_old;open_recent;closed\\n■age■\\noldest;old;recent;new\\n■END■\\n\\nOutput format:\\n■criterion_name■\\nlabel1;label2;label3;label4\\n■END■\\n\\nUse 3-10 labels per criterion. Labels should be intuitive and match the domain.\\nKeep criterion names short (1-2 words, lowercase, underscores).\\n\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating sorting criteria with ordered label arrays.\\n\\nCRITICAL: Output ordered labels from FIRST to LAST position in sorted result.\\n- Labels are single words, lowercase, underscores only\\n- 3-10 labels per criterion\\n- Order matters: first label = appears first, last label = appears last\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwF99lIMDsk2PrHcjtQMacUYUFL","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■creation_date■ \nnewest;recent;older;oldest \n■END■ ","index":0,"stopReason":"stop"}],"usage":{"inputTokens":604,"inputCost":0.00151,"outputTokens":19,"outputCost":0.00019},"botpress":{"cost":0.0017000000000000001}},"meta":{"cached":true}}}
858
+ {"test":"zai.sort > help desk ticket sorting > should sort tickets by creation date (newest to oldest)-2","key":"d995212d","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■3):\\n■0: {\\\"id\\\":\\\"T-201\\\",\\\"title\\\":\\\"Issue A\\\",\\\"createdAt\\\":\\\"2024-01-10T10:00:00Z\\\",\\\"status\\\":\\\"open\\\"}■\\n■1: {\\\"id\\\":\\\"T-202\\\",\\\"title\\\":\\\"Issue B\\\",\\\"createdAt\\\":\\\"2024-01-15T12:00:00Z\\\",\\\"status\\\":\\\"open\\\"}■\\n■2: {\\\"id\\\":\\\"T-203\\\",\\\"title\\\":\\\"Issue C\\\",\\\"createdAt\\\":\\\"2024-01-08T09:00:00Z\\\",\\\"status\\\":\\\"open\\\"}■\\n■3: {\\\"id\\\":\\\"T-204\\\",\\\"title\\\":\\\"Issue D\\\",\\\"createdAt\\\":\\\"2024-01-20T14:00:00Z\\\",\\\"status\\\":\\\"open\\\"}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**creation_date**: newest;recent;older;oldest\\n\\nInstructions: \\\"from newest to oldest by creation date\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■3)\\n- Use exact criterion names: creation_date\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwGx7LeqDQQbKOvyAuCL0tJLmd5","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:creation_date=older■ \n■1:creation_date=recent■ \n■2:creation_date=oldest■ \n■3:creation_date=newest■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":379,"inputCost":0.0009475000000000001,"outputTokens":40,"outputCost":0.0004},"botpress":{"cost":0.0013475000000000002}},"meta":{"cached":true}}}
859
+ {"test":"zai.sort > help desk ticket sorting > should sort tickets by status and time open (priority: open+old, then open+new, then closed)-1","key":"38443329","input":"[{\"input\":{\"messages\":[{\"content\":\"Analyze this sorting instruction: \\\"prioritize by status and age: highest priority = open tickets that have been open longest; lowest priority = closed tickets\\\"\\n\\nSample items to be sorted:\\n■0: {\\\"id\\\":\\\"T-301\\\",\\\"title\\\":\\\"Old open ticket\\\",\\\"status\\\":\\\"open\\\",\\\"createdAt\\\":\\\"2024-01-05T10:00:00Z\\\",\\\"resolvedAt\\\":null}\\n■1: {\\\"id\\\":\\\"T-302\\\",\\\"title\\\":\\\"Recently closed\\\",\\\"status\\\":\\\"closed\\\",\\\"createdAt\\\":\\\"2024-01-10T10:00:00Z\\\",\\\"resolvedAt\\\":\\\"2024-01-18T15:00:00Z\\\"}\\n■2: {\\\"id\\\":\\\"T-303\\\",\\\"title\\\":\\\"Very old open ticket\\\",\\\"status\\\":\\\"open\\\",\\\"createdAt\\\":\\\"2024-01-01T08:00:00Z\\\",\\\"resolvedAt\\\":null}\\n■3: {\\\"id\\\":\\\"T-304\\\",\\\"title\\\":\\\"New open ticket\\\",\\\"status\\\":\\\"open\\\",\\\"createdAt\\\":\\\"2024-01-18T14:00:00Z\\\",\\\"resolvedAt\\\":null}\\n■4: {\\\"id\\\":\\\"T-305\\\",\\\"title\\\":\\\"Old closed ticket\\\",\\\"status\\\":\\\"closed\\\",\\\"createdAt\\\":\\\"2024-01-03T09:00:00Z\\\",\\\"resolvedAt\\\":\\\"2024-01-15T10:00:00Z\\\"}\\n\\nCreate 1-3 sorting criteria with ordered label arrays (3-10 labels each).\\n\\n**CRITICAL RULES**:\\n1. Labels are single words, lowercase, no spaces, use underscores\\n2. Labels are ordered from FIRST to LAST in sorted result\\n3. If instruction says \\\"from X to Y\\\": first label represents X, last label represents Y\\n4. If instruction says \\\"prioritize\\\" or \\\"highest/lowest priority\\\":\\n - First label = HIGHEST priority (top of todo list)\\n - Last label = LOWEST priority (bottom of todo list)\\n\\nExamples:\\n\\n\\\"from slowest to fastest\\\" → first=slowest, last=fastest\\n■speed■\\nvery_slow;slow;medium;fast;very_fast\\n■END■\\n\\n\\\"from most dangerous to least dangerous\\\" → first=most dangerous, last=least dangerous\\n■danger■\\nextremely_dangerous;very_dangerous;dangerous;moderate;slightly_dangerous;harmless\\n■END■\\n\\n\\\"from least urgent (spam) to most urgent (bills)\\\" → first=spam, last=bills\\n■urgency■\\nspam;promotional;normal;important;urgent;critical\\n■END■\\n\\n\\\"prioritize: highest priority=open old tickets; lowest priority=closed\\\" → first=high priority, last=low priority\\n■status■\\nopen_old;open_recent;closed\\n■age■\\noldest;old;recent;new\\n■END■\\n\\nOutput format:\\n■criterion_name■\\nlabel1;label2;label3;label4\\n■END■\\n\\nUse 3-10 labels per criterion. Labels should be intuitive and match the domain.\\nKeep criterion names short (1-2 words, lowercase, underscores).\\n\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating sorting criteria with ordered label arrays.\\n\\nCRITICAL: Output ordered labels from FIRST to LAST position in sorted result.\\n- Labels are single words, lowercase, underscores only\\n- 3-10 labels per criterion\\n- Order matters: first label = appears first, last label = appears last\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwMOV6JzvTt3cMc2o2hNVvsVD1J","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■status■ \nopen_longest;open_older;open_recent;closed_recent;closed_older \n■END■ \n\n■age■ \nvery_old;old;recent;newest \n■END■ \n\n■priority■ \ncritical;high;medium;low;none \n■END■ ","index":0,"stopReason":"stop"}],"usage":{"inputTokens":710,"inputCost":0.001775,"outputTokens":62,"outputCost":0.00062},"botpress":{"cost":0.002395}},"meta":{"cached":true}}}
860
+ {"test":"zai.sort > help desk ticket sorting > should sort tickets by status and time open (priority: open+old, then open+new, then closed)-2","key":"5a1966be","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■5):\\n■0: {\\\"id\\\":\\\"T-301\\\",\\\"title\\\":\\\"Old open ticket\\\",\\\"status\\\":\\\"open\\\",\\\"createdAt\\\":\\\"2024-01-05T10:00:00Z\\\",\\\"resolvedAt\\\":null}■\\n■1: {\\\"id\\\":\\\"T-302\\\",\\\"title\\\":\\\"Recently closed\\\",\\\"status\\\":\\\"closed\\\",\\\"createdAt\\\":\\\"2024-01-10T10:00:00Z\\\",\\\"resolvedAt\\\":\\\"2024-01-18T15:00:00Z\\\"}■\\n■2: {\\\"id\\\":\\\"T-303\\\",\\\"title\\\":\\\"Very old open ticket\\\",\\\"status\\\":\\\"open\\\",\\\"createdAt\\\":\\\"2024-01-01T08:00:00Z\\\",\\\"resolvedAt\\\":null}■\\n■3: {\\\"id\\\":\\\"T-304\\\",\\\"title\\\":\\\"New open ticket\\\",\\\"status\\\":\\\"open\\\",\\\"createdAt\\\":\\\"2024-01-18T14:00:00Z\\\",\\\"resolvedAt\\\":null}■\\n■4: {\\\"id\\\":\\\"T-305\\\",\\\"title\\\":\\\"Old closed ticket\\\",\\\"status\\\":\\\"closed\\\",\\\"createdAt\\\":\\\"2024-01-03T09:00:00Z\\\",\\\"resolvedAt\\\":\\\"2024-01-15T10:00:00Z\\\"}■\\n■5: {\\\"id\\\":\\\"T-306\\\",\\\"title\\\":\\\"Medium age open ticket\\\",\\\"status\\\":\\\"open\\\",\\\"createdAt\\\":\\\"2024-01-12T11:00:00Z\\\",\\\"resolvedAt\\\":null}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**status**: open_longest;open_older;open_recent;closed_recent;closed_older\\n**age**: very_old;old;recent;newest\\n**priority**: critical;high;medium;low;none\\n\\nInstructions: \\\"prioritize by status and age: highest priority = open tickets that have been open longest; lowest priority = closed tickets\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■5)\\n- Use exact criterion names: status, age, priority\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwNRHZKHbkran47r2PZXVZ6xFZe","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:status=open_longest;age=old■ \n■1:status=closed_recent;age=newest■ \n■2:status=open_longest;age=very_old■ \n■3:status=open_recent;age=newest■ \n■4:status=closed_older;age=old■ \n■5:status=open_older;age=recent■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":571,"inputCost":0.0014275000000000002,"outputTokens":82,"outputCost":0.0008200000000000001},"botpress":{"cost":0.0022475000000000004}},"meta":{"cached":true}}}
861
+ {"test":"zai.sort > help desk ticket sorting > should sort tickets by status and time open (priority: open+old, then open+new, then closed)-3","key":"f168512","input":"[{\"input\":{\"messages\":[{\"content\":\"Items with identical scores (need tie-breaking):\\n■0: {\\\"id\\\":\\\"T-304\\\",\\\"title\\\":\\\"New open ticket\\\",\\\"status\\\":\\\"open\\\",\\\"createdAt\\\":\\\"2024-01-18T14:00:00Z\\\",\\\"resolvedAt\\\":null}■\\n■1: {\\\"id\\\":\\\"T-305\\\",\\\"title\\\":\\\"Old closed ticket\\\",\\\"status\\\":\\\"closed\\\",\\\"createdAt\\\":\\\"2024-01-03T09:00:00Z\\\",\\\"resolvedAt\\\":\\\"2024-01-15T10:00:00Z\\\"}■\\n\\nOrder them from first to last.\\nOutput format: ■index■ (one per line)\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are breaking a tie between items with identical total scores.\\n\\nInstructions: prioritize by status and age: highest priority = open tickets that have been open longest; lowest priority = closed tickets\\n\\nCriteria:\\n- status: open_longest;open_older;open_recent;closed_recent;closed_older\\n- age: very_old;old;recent;newest\\n- priority: critical;high;medium;low;none\\n\\nOrder these 2 items from FIRST to LAST based on the instructions.\\nEarlier labels in each criterion should come FIRST.\\n\\nOutput format:\\n■original_index■\\n■original_index■\\n■END■\\n\\nOutput the indices in the order they should appear (first item at top).\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwOHYHSNxPDsEDAibGjWIcvL0ct","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0■ \n■1■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":282,"inputCost":0.000705,"outputTokens":11,"outputCost":0.00011},"botpress":{"cost":0.000815}},"meta":{"cached":true}}}
862
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-1","key":"33597ed5","input":"[{\"input\":{\"messages\":[{\"content\":\"Analyze this sorting instruction: \\\"from smallest value to largest value\\\"\\n\\nSample items to be sorted:\\n■0: {\\\"id\\\":0,\\\"value\\\":23}\\n■1: {\\\"id\\\":1,\\\"value\\\":40}\\n■2: {\\\"id\\\":2,\\\"value\\\":57}\\n■3: {\\\"id\\\":3,\\\"value\\\":74}\\n■4: {\\\"id\\\":4,\\\"value\\\":91}\\n\\nCreate 1-3 sorting criteria with ordered label arrays (3-10 labels each).\\n\\n**CRITICAL RULES**:\\n1. Labels are single words, lowercase, no spaces, use underscores\\n2. Labels are ordered from FIRST to LAST in sorted result\\n3. If instruction says \\\"from X to Y\\\": first label represents X, last label represents Y\\n4. If instruction says \\\"prioritize\\\" or \\\"highest/lowest priority\\\":\\n - First label = HIGHEST priority (top of todo list)\\n - Last label = LOWEST priority (bottom of todo list)\\n\\nExamples:\\n\\n\\\"from slowest to fastest\\\" → first=slowest, last=fastest\\n■speed■\\nvery_slow;slow;medium;fast;very_fast\\n■END■\\n\\n\\\"from most dangerous to least dangerous\\\" → first=most dangerous, last=least dangerous\\n■danger■\\nextremely_dangerous;very_dangerous;dangerous;moderate;slightly_dangerous;harmless\\n■END■\\n\\n\\\"from least urgent (spam) to most urgent (bills)\\\" → first=spam, last=bills\\n■urgency■\\nspam;promotional;normal;important;urgent;critical\\n■END■\\n\\n\\\"prioritize: highest priority=open old tickets; lowest priority=closed\\\" → first=high priority, last=low priority\\n■status■\\nopen_old;open_recent;closed\\n■age■\\noldest;old;recent;new\\n■END■\\n\\nOutput format:\\n■criterion_name■\\nlabel1;label2;label3;label4\\n■END■\\n\\nUse 3-10 labels per criterion. Labels should be intuitive and match the domain.\\nKeep criterion names short (1-2 words, lowercase, underscores).\\n\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating sorting criteria with ordered label arrays.\\n\\nCRITICAL: Output ordered labels from FIRST to LAST position in sorted result.\\n- Labels are single words, lowercase, underscores only\\n- 3-10 labels per criterion\\n- Order matters: first label = appears first, last label = appears last\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwPFmL9stRzSubDlikaM11SJg3F","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■value■ \nsmallest;small;medium;large;largest \n■END■ \n\n■range■ \nlow;below_average;average;above_average;high \n■END■ \n\n■scale■ \nminimum;low;moderate;high;maximum \n■END■ ","index":0,"stopReason":"stop"}],"usage":{"inputTokens":519,"inputCost":0.0012975,"outputTokens":58,"outputCost":0.00058},"botpress":{"cost":0.0018775}},"meta":{"cached":true}}}
863
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-3","key":"4fe8ea17","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■49):\\n■0: {\\\"id\\\":50,\\\"value\\\":873}■\\n■1: {\\\"id\\\":51,\\\"value\\\":890}■\\n■2: {\\\"id\\\":52,\\\"value\\\":907}■\\n■3: {\\\"id\\\":53,\\\"value\\\":924}■\\n■4: {\\\"id\\\":54,\\\"value\\\":941}■\\n■5: {\\\"id\\\":55,\\\"value\\\":958}■\\n■6: {\\\"id\\\":56,\\\"value\\\":975}■\\n■7: {\\\"id\\\":57,\\\"value\\\":992}■\\n■8: {\\\"id\\\":58,\\\"value\\\":9}■\\n■9: {\\\"id\\\":59,\\\"value\\\":26}■\\n■10: {\\\"id\\\":60,\\\"value\\\":43}■\\n■11: {\\\"id\\\":61,\\\"value\\\":60}■\\n■12: {\\\"id\\\":62,\\\"value\\\":77}■\\n■13: {\\\"id\\\":63,\\\"value\\\":94}■\\n■14: {\\\"id\\\":64,\\\"value\\\":111}■\\n■15: {\\\"id\\\":65,\\\"value\\\":128}■\\n■16: {\\\"id\\\":66,\\\"value\\\":145}■\\n■17: {\\\"id\\\":67,\\\"value\\\":162}■\\n■18: {\\\"id\\\":68,\\\"value\\\":179}■\\n■19: {\\\"id\\\":69,\\\"value\\\":196}■\\n■20: {\\\"id\\\":70,\\\"value\\\":213}■\\n■21: {\\\"id\\\":71,\\\"value\\\":230}■\\n■22: {\\\"id\\\":72,\\\"value\\\":247}■\\n■23: {\\\"id\\\":73,\\\"value\\\":264}■\\n■24: {\\\"id\\\":74,\\\"value\\\":281}■\\n■25: {\\\"id\\\":75,\\\"value\\\":298}■\\n■26: {\\\"id\\\":76,\\\"value\\\":315}■\\n■27: {\\\"id\\\":77,\\\"value\\\":332}■\\n■28: {\\\"id\\\":78,\\\"value\\\":349}■\\n■29: {\\\"id\\\":79,\\\"value\\\":366}■\\n■30: {\\\"id\\\":80,\\\"value\\\":383}■\\n■31: {\\\"id\\\":81,\\\"value\\\":400}■\\n■32: {\\\"id\\\":82,\\\"value\\\":417}■\\n■33: {\\\"id\\\":83,\\\"value\\\":434}■\\n■34: {\\\"id\\\":84,\\\"value\\\":451}■\\n■35: {\\\"id\\\":85,\\\"value\\\":468}■\\n■36: {\\\"id\\\":86,\\\"value\\\":485}■\\n■37: {\\\"id\\\":87,\\\"value\\\":502}■\\n■38: {\\\"id\\\":88,\\\"value\\\":519}■\\n■39: {\\\"id\\\":89,\\\"value\\\":536}■\\n■40: {\\\"id\\\":90,\\\"value\\\":553}■\\n■41: {\\\"id\\\":91,\\\"value\\\":570}■\\n■42: {\\\"id\\\":92,\\\"value\\\":587}■\\n■43: {\\\"id\\\":93,\\\"value\\\":604}■\\n■44: {\\\"id\\\":94,\\\"value\\\":621}■\\n■45: {\\\"id\\\":95,\\\"value\\\":638}■\\n■46: {\\\"id\\\":96,\\\"value\\\":655}■\\n■47: {\\\"id\\\":97,\\\"value\\\":672}■\\n■48: {\\\"id\\\":98,\\\"value\\\":689}■\\n■49: {\\\"id\\\":99,\\\"value\\\":706}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**value**: smallest;small;medium;large;largest\\n**range**: low;below_average;average;above_average;high\\n**scale**: minimum;low;moderate;high;maximum\\n\\nInstructions: \\\"from smallest value to largest value\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■49)\\n- Use exact criterion names: value, range, scale\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwQSzU3DoGiw80kb7EFnkHxPDGD","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"To rank the items based on their \"value\", we will categorize and assign labels for each criterion (value, range, scale) using the defined sequences:\n\n### Criteria Definitions:\n- **value**: smallest;small;medium;large;largest\n- **range**: low;below_average;average;above_average;high\n- **scale**: minimum;low;moderate;high;maximum\n\n### Reasoning:\n1. Based on the range of \"value\" (9 to 992), the dataset will be divided into categories ensuring equal distribution among the 5 labels for each criterion. Values are grouped proportionally in ascending order.\n2. Labels are distributed to reflect the change from the lowest rank (\"smallest\", \"low\", \"minimum\") to the highest (\"largest\", \"high\", \"maximum\").\n\n---\n\nOutput:\n```\n■0:value=medium;range=average;scale=moderate■\n■1:value=medium;range=average;scale=moderate■\n■2:value=medium;range=above_average;scale=high■\n■3:value=large;range=above_average;scale=high■\n■4:value=large;range=above_average;scale=high■\n■5:value=large;range=above_average;scale=high■\n■6:value=large;range=high;scale=high■\n■7:value=largest;range=high;scale=maximum■\n■8:value=smallest;range=low;scale=minimum■\n■9:value=smallest;range=low;scale=low■\n■10:value=small;range=low;scale=low■\n■11:value=small;range=low;scale=low■\n■12:value=small;range=below_average;scale=low■\n■13:value=small;range=below_average;scale=low■\n■14:value=small;range=below_average;scale=low■\n■15:value=small;range=below_average;scale=low■\n■16:value=small;range=below_average;scale=moderate■\n■17:value=medium;range=average;scale=moderate■\n■18:value=medium;range=average;scale=moderate■\n■19:value=medium;range=average;scale=moderate■\n■20:value=medium;range=average;scale=moderate■\n■21:value=medium;range=above_average;scale=moderate■\n■22:value=medium;range=average;scale=moderate■\n■23:value-large. above-muted","index":0,"stopReason":"stop"}],"usage":{"inputTokens":960,"inputCost":0.0024000000000000002,"outputTokens":544,"outputCost":0.00544},"botpress":{"cost":0.00784}},"meta":{"cached":true}}}
864
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-2","key":"3c9497e4","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■49):\\n■0: {\\\"id\\\":0,\\\"value\\\":23}■\\n■1: {\\\"id\\\":1,\\\"value\\\":40}■\\n■2: {\\\"id\\\":2,\\\"value\\\":57}■\\n■3: {\\\"id\\\":3,\\\"value\\\":74}■\\n■4: {\\\"id\\\":4,\\\"value\\\":91}■\\n■5: {\\\"id\\\":5,\\\"value\\\":108}■\\n■6: {\\\"id\\\":6,\\\"value\\\":125}■\\n■7: {\\\"id\\\":7,\\\"value\\\":142}■\\n■8: {\\\"id\\\":8,\\\"value\\\":159}■\\n■9: {\\\"id\\\":9,\\\"value\\\":176}■\\n■10: {\\\"id\\\":10,\\\"value\\\":193}■\\n■11: {\\\"id\\\":11,\\\"value\\\":210}■\\n■12: {\\\"id\\\":12,\\\"value\\\":227}■\\n■13: {\\\"id\\\":13,\\\"value\\\":244}■\\n■14: {\\\"id\\\":14,\\\"value\\\":261}■\\n■15: {\\\"id\\\":15,\\\"value\\\":278}■\\n■16: {\\\"id\\\":16,\\\"value\\\":295}■\\n■17: {\\\"id\\\":17,\\\"value\\\":312}■\\n■18: {\\\"id\\\":18,\\\"value\\\":329}■\\n■19: {\\\"id\\\":19,\\\"value\\\":346}■\\n■20: {\\\"id\\\":20,\\\"value\\\":363}■\\n■21: {\\\"id\\\":21,\\\"value\\\":380}■\\n■22: {\\\"id\\\":22,\\\"value\\\":397}■\\n■23: {\\\"id\\\":23,\\\"value\\\":414}■\\n■24: {\\\"id\\\":24,\\\"value\\\":431}■\\n■25: {\\\"id\\\":25,\\\"value\\\":448}■\\n■26: {\\\"id\\\":26,\\\"value\\\":465}■\\n■27: {\\\"id\\\":27,\\\"value\\\":482}■\\n■28: {\\\"id\\\":28,\\\"value\\\":499}■\\n■29: {\\\"id\\\":29,\\\"value\\\":516}■\\n■30: {\\\"id\\\":30,\\\"value\\\":533}■\\n■31: {\\\"id\\\":31,\\\"value\\\":550}■\\n■32: {\\\"id\\\":32,\\\"value\\\":567}■\\n■33: {\\\"id\\\":33,\\\"value\\\":584}■\\n■34: {\\\"id\\\":34,\\\"value\\\":601}■\\n■35: {\\\"id\\\":35,\\\"value\\\":618}■\\n■36: {\\\"id\\\":36,\\\"value\\\":635}■\\n■37: {\\\"id\\\":37,\\\"value\\\":652}■\\n■38: {\\\"id\\\":38,\\\"value\\\":669}■\\n■39: {\\\"id\\\":39,\\\"value\\\":686}■\\n■40: {\\\"id\\\":40,\\\"value\\\":703}■\\n■41: {\\\"id\\\":41,\\\"value\\\":720}■\\n■42: {\\\"id\\\":42,\\\"value\\\":737}■\\n■43: {\\\"id\\\":43,\\\"value\\\":754}■\\n■44: {\\\"id\\\":44,\\\"value\\\":771}■\\n■45: {\\\"id\\\":45,\\\"value\\\":788}■\\n■46: {\\\"id\\\":46,\\\"value\\\":805}■\\n■47: {\\\"id\\\":47,\\\"value\\\":822}■\\n■48: {\\\"id\\\":48,\\\"value\\\":839}■\\n■49: {\\\"id\\\":49,\\\"value\\\":856}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**value**: smallest;small;medium;large;largest\\n**range**: low;below_average;average;above_average;high\\n**scale**: minimum;low;moderate;high;maximum\\n\\nInstructions: \\\"from smallest value to largest value\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■49)\\n- Use exact criterion names: value, range, scale\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwQ5oa09Lvwe4ezL19xMu3nXQx9","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"To rank these items using the given labeled scales (\"value\", \"range\", \"scale\"), I will break down the values evenly into the corresponding labels. Since the range of \"value\" goes from 23 (■0) to 856 (■49), we divide the range into five equal segments to assign the labels. Here's how I approach it:\n\n- \"value\":\n - smallest (23-199)\n - small (200-374)\n - medium (375-549)\n - large (550-724)\n - largest (725-856)\n\n- \"range\":\n - low (23-199)\n - below_average (200-374)\n - average (375-549)\n - above_average (550-724)\n - high (725-856)\n\n- \"scale\":\n - minimum (23-199)\n - low (200-374)\n - moderate (375-549)\n - high (550-724)\n - maximum (725-856)\n\nNow we apply the labels for each item (■0 to ■49):\n\n```\n■0:value=smallest;range=low;scale=minimum■\n■1:value=smallest;range=low;scale=minimum■\n■2:value=smallest;range=low;scale=minimum■\n■3:value=smallest;range=low;scale=minimum■\n■4:value=smallest;range=low;scale=minimum■\n■5:value=smallest;range=low;scale=minimum■\n■6:value=smallest;range=low;scale=minimum■\n■7:value=smallest;range=low;scale=minimum■\n■8:value=smallest;range=low;scale=minimum■\n■9:value=smallest;range=low;scale=minimum■\n■10:value=smallest;range=low;scale=minimum■\n■11:value=small;range=below_average;scale=low■\n■12:value=small;range=below_average;scale=low■\n■13:value=small;range=below_average;scale=low■\n■14:value=small;range=below_average;scale=low■\n■15:value=small;range=below_average;scale=low■\n■16:value=small;range=below_average;scale=low■\n■17:value=small;range=below_average;scale=low■\n■18:value=small;range=below_average;scale=low■\n■19:value=small;range=below_average;scale=low■\n■20:value=small;range=below_average;scale=low■\n■21:value=small;range=below_average;scale=low■\n■22:value=medium;range=average;scale=moderate■\n■23:value=medium;range=average;scale=moderate■\n■24:value=medium;range=average;scale=moderate■\n■25:value=medium;range=average;scale=moderate■\n■26:value=medium;range=average;scale=moderate■\n■27:value=medium;range=average;scale=moderate■\n■28:value=medium;range=average;scale=moderate■\n■29:value=medium;range=average;scale=moderate■\n■30:value=medium;range=average;scale=moderate■\n■31:value=medium;range=average;scale=moderate■\n■32:value=medium;range=average;scale=moderate■\n■33:value=large;range=above_average;scale=high■\n■34:value=large;range=above_average;scale=high■\n■35:value=large;range=above_average;scale=high■\n■36:value=large;range=above_average;scale=high■\n■37:value=large;range=above_average;scale=high■\n■38:value=large;range=above_average;scale=high■\n■39:value=large;range=above_average;scale=high■\n■40:value=large;range=above_average;scale=high■\n■41:value=large;range=above_average;scale=high■\n■42:value=large;range=above_average;scale=high■\n■43:value=largest;range=high;scale=maximum■\n■44:value=largest;range=high;scale=maximum■\n■45:value=largest;range=high;scale=maximum■\n■46:value=largest;range=high;scale=maximum■\n■47:value=largest;range=high;scale=maximum■\n■48:value=largest;range=high;scale=maximum■\n■49:value=largest;range=high;scale=maximum■\n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":960,"inputCost":0.0024000000000000002,"outputTokens":1018,"outputCost":0.010180000000000002},"botpress":{"cost":0.012580000000000001}},"meta":{"cached":true}}}
865
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-11","key":"93b7fe43","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■49):\\n■0: {\\\"id\\\":450,\\\"value\\\":673}■\\n■1: {\\\"id\\\":451,\\\"value\\\":690}■\\n■2: {\\\"id\\\":452,\\\"value\\\":707}■\\n■3: {\\\"id\\\":453,\\\"value\\\":724}■\\n■4: {\\\"id\\\":454,\\\"value\\\":741}■\\n■5: {\\\"id\\\":455,\\\"value\\\":758}■\\n■6: {\\\"id\\\":456,\\\"value\\\":775}■\\n■7: {\\\"id\\\":457,\\\"value\\\":792}■\\n■8: {\\\"id\\\":458,\\\"value\\\":809}■\\n■9: {\\\"id\\\":459,\\\"value\\\":826}■\\n■10: {\\\"id\\\":460,\\\"value\\\":843}■\\n■11: {\\\"id\\\":461,\\\"value\\\":860}■\\n■12: {\\\"id\\\":462,\\\"value\\\":877}■\\n■13: {\\\"id\\\":463,\\\"value\\\":894}■\\n■14: {\\\"id\\\":464,\\\"value\\\":911}■\\n■15: {\\\"id\\\":465,\\\"value\\\":928}■\\n■16: {\\\"id\\\":466,\\\"value\\\":945}■\\n■17: {\\\"id\\\":467,\\\"value\\\":962}■\\n■18: {\\\"id\\\":468,\\\"value\\\":979}■\\n■19: {\\\"id\\\":469,\\\"value\\\":996}■\\n■20: {\\\"id\\\":470,\\\"value\\\":13}■\\n■21: {\\\"id\\\":471,\\\"value\\\":30}■\\n■22: {\\\"id\\\":472,\\\"value\\\":47}■\\n■23: {\\\"id\\\":473,\\\"value\\\":64}■\\n■24: {\\\"id\\\":474,\\\"value\\\":81}■\\n■25: {\\\"id\\\":475,\\\"value\\\":98}■\\n■26: {\\\"id\\\":476,\\\"value\\\":115}■\\n■27: {\\\"id\\\":477,\\\"value\\\":132}■\\n■28: {\\\"id\\\":478,\\\"value\\\":149}■\\n■29: {\\\"id\\\":479,\\\"value\\\":166}■\\n■30: {\\\"id\\\":480,\\\"value\\\":183}■\\n■31: {\\\"id\\\":481,\\\"value\\\":200}■\\n■32: {\\\"id\\\":482,\\\"value\\\":217}■\\n■33: {\\\"id\\\":483,\\\"value\\\":234}■\\n■34: {\\\"id\\\":484,\\\"value\\\":251}■\\n■35: {\\\"id\\\":485,\\\"value\\\":268}■\\n■36: {\\\"id\\\":486,\\\"value\\\":285}■\\n■37: {\\\"id\\\":487,\\\"value\\\":302}■\\n■38: {\\\"id\\\":488,\\\"value\\\":319}■\\n■39: {\\\"id\\\":489,\\\"value\\\":336}■\\n■40: {\\\"id\\\":490,\\\"value\\\":353}■\\n■41: {\\\"id\\\":491,\\\"value\\\":370}■\\n■42: {\\\"id\\\":492,\\\"value\\\":387}■\\n■43: {\\\"id\\\":493,\\\"value\\\":404}■\\n■44: {\\\"id\\\":494,\\\"value\\\":421}■\\n■45: {\\\"id\\\":495,\\\"value\\\":438}■\\n■46: {\\\"id\\\":496,\\\"value\\\":455}■\\n■47: {\\\"id\\\":497,\\\"value\\\":472}■\\n■48: {\\\"id\\\":498,\\\"value\\\":489}■\\n■49: {\\\"id\\\":499,\\\"value\\\":506}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**value**: smallest;small;medium;large;largest\\n**range**: low;below_average;average;above_average;high\\n**scale**: minimum;low;moderate;high;maximum\\n\\nInstructions: \\\"from smallest value to largest value\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■49)\\n- Use exact criterion names: value, range, scale\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwQztfI3q7LlBADMZIqIYhJ60g0","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"To rank each item, we will assign labels for the criteria \"value,\" \"range,\" and \"scale\" based on the `value` of each item, from smallest to largest.\n\n- **value (smallest → largest):**\n - smallest: Lowest values\n - small: Below-average values\n - medium: Middle-range values\n - large: High-range values\n - largest: Highest values\n\n- **range (low → high):**\n - low: Lowest range (`value` closest to minimum)\n - below_average: Slightly higher but still below average\n - average: Around the median `value`\n - above_average: Higher-than-average `value`\n - high: Highest range (`value` closest to maximum)\n\n- **scale (minimum → maximum):**\n - minimum: Smallest possible `values`\n - low: Lower but not the smallest range\n - moderate: Mid-range `values`\n - high: Larger values\n - maximum: Largest possible `values`\n\nHere is the sorted and ranked output:\n\n```\n■0:value=medium;range=average;scale=moderate■\n■1:value=medium;range=average;scale=moderate■\n■2:value=medium;range=above_average;scale=moderate■\n■3:value=medium;range=above_average;scale=moderate■\n■4:value=medium;range=above_average;scale=moderate■\n■5:value=large;range=above_average;scale=high■\n■6:value=large;range=above_average;scale=high■\n■7:value=large;range=high;scale=high■\n■8:value=large;range=high;scale=high■\n■9:value=large;range=high;scale=high■\n■10:value=large;range=high;scale=high■\n■11:value=large;range=high;scale=high■\n■12:value=large;range=high;scale=high■\n■13:value=large;range=high;scale=high■\n■14:value=largest;range=high;scale=maximum■\n■15:value=largest;range=high;scale=maximum■\n■16:value=largest;range=high;scale=maximum■\n■17:value=largest;range=high;scale=maximum■\n■18:value=largest;range=high;scale=maximum■\n■19:value=largest;range=high;scale=maximum■\n■20:value=smallest;range=low;scale=minimum■\n■21:value=smallest;range=low;scale=minimum■\n■22:value=smallest;range=low;scale=minimum■\n■23:value=smallest;range=low;scale=minimum■\n■24:value=smallest;range=low;scale=low■\n■25:value=smallest;range=low;scale=low■\n■26:value=small;range=low;scale=low■\n■27:value=small;range=low;scale=low■\n■28:value=small;range=below_average;scale=low■\n■29:value=small;range=below_average;scale=low■\n■30:value=small;range=below_average;scale=low■\n■31:value=small;range=below_average;scale=low■\n■32:value=small;range=below_average;scale=low■\n■33:value=small;range=average;scale=moderate■\n■34:value=medium;range=average;scale=moderate■\n■35:value=medium;range=average;scale=moderate■\n■36:value=medium;range=average;scale=moderate■\n■37:value=medium;range=average;scale=moderate■\n■38:value=medium;range=average;scale=moderate■\n■39:value=medium;range=above_average;scale=moderate■\n■40:value=medium;range=above_average;scale=moderate■\n■41:value=medium;range=above_average;scale=moderate■\n■42:value=medium;range=above_average;scale=moderate■\n■43:value=medium;range=above_average;scale=moderate■\n■44:value=medium;range=high;scale=high■\n■45:value=large;range=high;scale=high■\n■46:value=large;range=high;scale=high■\n■47:value=large;range=high;scale=high■\n■48:value=large;range=high;scale=high■\n■49:value=large;range=high;scale=high■\n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":960,"inputCost":0.0024000000000000002,"outputTokens":1009,"outputCost":0.01009},"botpress":{"cost":0.012490000000000001}},"meta":{"cached":true}}}
866
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-8","key":"d56db2e6","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■49):\\n■0: {\\\"id\\\":300,\\\"value\\\":123}■\\n■1: {\\\"id\\\":301,\\\"value\\\":140}■\\n■2: {\\\"id\\\":302,\\\"value\\\":157}■\\n■3: {\\\"id\\\":303,\\\"value\\\":174}■\\n■4: {\\\"id\\\":304,\\\"value\\\":191}■\\n■5: {\\\"id\\\":305,\\\"value\\\":208}■\\n■6: {\\\"id\\\":306,\\\"value\\\":225}■\\n■7: {\\\"id\\\":307,\\\"value\\\":242}■\\n■8: {\\\"id\\\":308,\\\"value\\\":259}■\\n■9: {\\\"id\\\":309,\\\"value\\\":276}■\\n■10: {\\\"id\\\":310,\\\"value\\\":293}■\\n■11: {\\\"id\\\":311,\\\"value\\\":310}■\\n■12: {\\\"id\\\":312,\\\"value\\\":327}■\\n■13: {\\\"id\\\":313,\\\"value\\\":344}■\\n■14: {\\\"id\\\":314,\\\"value\\\":361}■\\n■15: {\\\"id\\\":315,\\\"value\\\":378}■\\n■16: {\\\"id\\\":316,\\\"value\\\":395}■\\n■17: {\\\"id\\\":317,\\\"value\\\":412}■\\n■18: {\\\"id\\\":318,\\\"value\\\":429}■\\n■19: {\\\"id\\\":319,\\\"value\\\":446}■\\n■20: {\\\"id\\\":320,\\\"value\\\":463}■\\n■21: {\\\"id\\\":321,\\\"value\\\":480}■\\n■22: {\\\"id\\\":322,\\\"value\\\":497}■\\n■23: {\\\"id\\\":323,\\\"value\\\":514}■\\n■24: {\\\"id\\\":324,\\\"value\\\":531}■\\n■25: {\\\"id\\\":325,\\\"value\\\":548}■\\n■26: {\\\"id\\\":326,\\\"value\\\":565}■\\n■27: {\\\"id\\\":327,\\\"value\\\":582}■\\n■28: {\\\"id\\\":328,\\\"value\\\":599}■\\n■29: {\\\"id\\\":329,\\\"value\\\":616}■\\n■30: {\\\"id\\\":330,\\\"value\\\":633}■\\n■31: {\\\"id\\\":331,\\\"value\\\":650}■\\n■32: {\\\"id\\\":332,\\\"value\\\":667}■\\n■33: {\\\"id\\\":333,\\\"value\\\":684}■\\n■34: {\\\"id\\\":334,\\\"value\\\":701}■\\n■35: {\\\"id\\\":335,\\\"value\\\":718}■\\n■36: {\\\"id\\\":336,\\\"value\\\":735}■\\n■37: {\\\"id\\\":337,\\\"value\\\":752}■\\n■38: {\\\"id\\\":338,\\\"value\\\":769}■\\n■39: {\\\"id\\\":339,\\\"value\\\":786}■\\n■40: {\\\"id\\\":340,\\\"value\\\":803}■\\n■41: {\\\"id\\\":341,\\\"value\\\":820}■\\n■42: {\\\"id\\\":342,\\\"value\\\":837}■\\n■43: {\\\"id\\\":343,\\\"value\\\":854}■\\n■44: {\\\"id\\\":344,\\\"value\\\":871}■\\n■45: {\\\"id\\\":345,\\\"value\\\":888}■\\n■46: {\\\"id\\\":346,\\\"value\\\":905}■\\n■47: {\\\"id\\\":347,\\\"value\\\":922}■\\n■48: {\\\"id\\\":348,\\\"value\\\":939}■\\n■49: {\\\"id\\\":349,\\\"value\\\":956}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**value**: smallest;small;medium;large;largest\\n**range**: low;below_average;average;above_average;high\\n**scale**: minimum;low;moderate;high;maximum\\n\\nInstructions: \\\"from smallest value to largest value\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■49)\\n- Use exact criterion names: value, range, scale\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwRXT7EsGCPjwNulMysvEdbLXRq","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"To assign labels for each criterion, the main given data is \"value\". Using normalized scaling and the provided label options, I'll determine the corresponding labels for each criterion. Here's how the values are distributed approximately:\n\n- **value**: smallest → largest\n - Range distribution is divided into equal parts across the dataset based on the total range (123 to 956). \n - Values will align with labels (smallest, small, medium, large, largest).\n\n- **range**: low → high\n - \"range\" uses a similar distribution to align with normalized data spread ranges.\n\n- **scale**: minimum → maximum\n - Scale mapping aligns with the above normalized distribution.\n\nNow, I'll calculate for all 50 items:\n\n■0:value=smallest;range=low;scale=minimum■ \n■1:value=smallest;range=low;scale=minimum■ \n■2:value=smallest;range=low;scale=minimum■ \n■3:value=smallest;range=low;scale=minimum■ \n■4:value=small;range=low;scale=minimum■ \n■5:value=small;range=low;scale=minimum■ \n■6:value=small;range=low;scale=low■ \n■7:value=small;range=low;scale=low■ \n■8:value=small;range=below_average;scale=low■ \n■9:value=small;range=below_average;scale=low■ \n■10:value=medium;range=below_average;scale=moderate■ \n■11:value=medium;range=below_average;scale=moderate■ \n■12:value=medium;range=average;scale=moderate■ \n■13:value=medium;range=average;scale=moderate■ \n■14:value=medium;range=average;scale=moderate■ \n■15:value=medium;range=average;scale=moderate■ \n■16:value=medium;range=average;scale=moderate■ \n■17:value=medium;range=above_average;scale=high■ \n■18:value=medium;range=above_average;scale=high■ \n■19:value=medium;range=above_average;scale=high■ \n■20:value=large;range=above_average;scale=high■ \n■21:value=large;range=above_average;scale=high■ \n■22:value=large;range=high;scale=maximum■ \n■23:value=large;range=high;scale=maximum■ \n■24:value=large;range=high;scale=maximum■ \n■25:value=large;range=high;scale=maximum■ \n■26:value=large;range=high;scale=maximum■ \n■27:value=largest;range=high;scale=maximum■ \n■28:value=largest;range=high;scale=maximum■ \n■29:value=largest;range=high;scale=maximum■ \n■30:value=largest;range=high;scale=maximum■ \n■31:value=largest;range=high;scale=maximum■ \n■32:value=largest;range=high;scale=maximum■ \n■33:value=largest;range=high;scale=maximum■ \n■34:value=largest;range=high;scale=maximum■ \n■35:value=largest;range=high;scale=maximum■ \n■36:value=largest;range=high;scale=maximum■ \n■37:value=largest;range=high;scale=maximum■ \n■38:value=largest;range=high;scale=maximum■ \n■39:value=largest;range=high;scale=maximum■ \n■40:value=largest;range=high;scale=maximum■ \n■41:value=largest;range=high;scale=maximum■ \n■42:value=largest;range=high;scale=maximum■ \n■43:value=largest;range=high;scale=maximum■ \n■44:value=largest;range=high;scale=maximum■ \n■45:value=largest;range=high;scale=maximum■ \n■46:value=largest;range=high;scale=maximum■ \n■47:value=largest;range=high;scale=maximum■ \n■48:value=largest;range=high;scale=maximum■ \n■49:value=largest;range=high;scale=maximum■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":960,"inputCost":0.0024000000000000002,"outputTokens":922,"outputCost":0.00922},"botpress":{"cost":0.011620000000000002}},"meta":{"cached":true}}}
867
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-9","key":"4c8b28a9","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■49):\\n■0: {\\\"id\\\":350,\\\"value\\\":973}■\\n■1: {\\\"id\\\":351,\\\"value\\\":990}■\\n■2: {\\\"id\\\":352,\\\"value\\\":7}■\\n■3: {\\\"id\\\":353,\\\"value\\\":24}■\\n■4: {\\\"id\\\":354,\\\"value\\\":41}■\\n■5: {\\\"id\\\":355,\\\"value\\\":58}■\\n■6: {\\\"id\\\":356,\\\"value\\\":75}■\\n■7: {\\\"id\\\":357,\\\"value\\\":92}■\\n■8: {\\\"id\\\":358,\\\"value\\\":109}■\\n■9: {\\\"id\\\":359,\\\"value\\\":126}■\\n■10: {\\\"id\\\":360,\\\"value\\\":143}■\\n■11: {\\\"id\\\":361,\\\"value\\\":160}■\\n■12: {\\\"id\\\":362,\\\"value\\\":177}■\\n■13: {\\\"id\\\":363,\\\"value\\\":194}■\\n■14: {\\\"id\\\":364,\\\"value\\\":211}■\\n■15: {\\\"id\\\":365,\\\"value\\\":228}■\\n■16: {\\\"id\\\":366,\\\"value\\\":245}■\\n■17: {\\\"id\\\":367,\\\"value\\\":262}■\\n■18: {\\\"id\\\":368,\\\"value\\\":279}■\\n■19: {\\\"id\\\":369,\\\"value\\\":296}■\\n■20: {\\\"id\\\":370,\\\"value\\\":313}■\\n■21: {\\\"id\\\":371,\\\"value\\\":330}■\\n■22: {\\\"id\\\":372,\\\"value\\\":347}■\\n■23: {\\\"id\\\":373,\\\"value\\\":364}■\\n■24: {\\\"id\\\":374,\\\"value\\\":381}■\\n■25: {\\\"id\\\":375,\\\"value\\\":398}■\\n■26: {\\\"id\\\":376,\\\"value\\\":415}■\\n■27: {\\\"id\\\":377,\\\"value\\\":432}■\\n■28: {\\\"id\\\":378,\\\"value\\\":449}■\\n■29: {\\\"id\\\":379,\\\"value\\\":466}■\\n■30: {\\\"id\\\":380,\\\"value\\\":483}■\\n■31: {\\\"id\\\":381,\\\"value\\\":500}■\\n■32: {\\\"id\\\":382,\\\"value\\\":517}■\\n■33: {\\\"id\\\":383,\\\"value\\\":534}■\\n■34: {\\\"id\\\":384,\\\"value\\\":551}■\\n■35: {\\\"id\\\":385,\\\"value\\\":568}■\\n■36: {\\\"id\\\":386,\\\"value\\\":585}■\\n■37: {\\\"id\\\":387,\\\"value\\\":602}■\\n■38: {\\\"id\\\":388,\\\"value\\\":619}■\\n■39: {\\\"id\\\":389,\\\"value\\\":636}■\\n■40: {\\\"id\\\":390,\\\"value\\\":653}■\\n■41: {\\\"id\\\":391,\\\"value\\\":670}■\\n■42: {\\\"id\\\":392,\\\"value\\\":687}■\\n■43: {\\\"id\\\":393,\\\"value\\\":704}■\\n■44: {\\\"id\\\":394,\\\"value\\\":721}■\\n■45: {\\\"id\\\":395,\\\"value\\\":738}■\\n■46: {\\\"id\\\":396,\\\"value\\\":755}■\\n■47: {\\\"id\\\":397,\\\"value\\\":772}■\\n■48: {\\\"id\\\":398,\\\"value\\\":789}■\\n■49: {\\\"id\\\":399,\\\"value\\\":806}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**value**: smallest;small;medium;large;largest\\n**range**: low;below_average;average;above_average;high\\n**scale**: minimum;low;moderate;high;maximum\\n\\nInstructions: \\\"from smallest value to largest value\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■49)\\n- Use exact criterion names: value, range, scale\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwRkogJ83BjwDoawSnRygudpBg6","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"To determine the ranks for each item using the specified labeled scales, we will split the range of possible values (from smallest to largest) into corresponding sections for each scale (value, range, scale) and assign labels accordingly.\n\n---\n\n### Step-by-step process:\n1. Use the **value** scale: smallest, small, medium, large, and largest.\n2. Use the **range** scale: low, below_average, average, above_average, and high.\n3. Use the **scale** scale: minimum, low, moderate, high, and maximum.\n4. Partition the values between 7 (lowest) and 990 (highest) to assign labels.\n \n---\n\n### Assigning thresholds\nFor 50 items, it's easier to divide into 5 categories for each criterion, meaning each category spans about 200 units (from 7 to 990):\n- **value criterion thresholds** (adjusted for item spreads):\n - smallest: 7–206\n - small: 207–405\n - medium: 406–604\n - large: 605–803\n - largest: 804–990\n\n- **range criterion thresholds**:\n - low: 7–206\n - below_average: 207–405\n - average: 406–604\n - above_average: 605–803\n - high: 804–990\n\n- **scale criterion thresholds**:\n - minimum: 7–206\n - low: 207–405\n - moderate: 406–604\n - high: 605–803\n - maximum: 804–990\n\nNow, let's assign appropriate labels for each item.\n\n---\n\n### Ranked Items\n■0:value=largest;range=high;scale=maximum■ \n■1:value=largest;range=high;scale=maximum■ \n■2:value=smallest;range=low;scale=minimum■ \n■3:value=smallest;range=low;scale=minimum■ \n■4:value=smallest;range=low;scale=minimum■ \n■5:value=smallest;range=low;scale=minimum■ \n■6:value=smallest;range=low;scale=minimum■ \n■7:value=smallest;range=low;scale=minimum■ \n■8:value=smallest;range=low;scale=minimum■ \n■9:value=smallest;range=low;scale=minimum■ \n■10:value=smallest;range=low;scale=minimum■ \n■11:value=smallest;range=low;scale=minimum■ \n■12:value=smallest;range=low;scale=minimum■ \n■13:value=smallest;range=low;scale=minimum■ \n■14:value=smallest;range=low;scale=minimum■ \n■15:value=smallest;range=low;scale=minimum■ \n■16:value=small;range=below_average;scale=low■ \n■17:value=small;range=below_average;scale=low■ \n■18:value=small;range=below_average;scale=low■ \n■19:value=small;range=below_average;scale=low■ \n■20:value=small;range=below_average;scale=low■ \n■21:value=small;range=below_average;scale=low■ \n■22:value=small;range=below_average;scale=low■ \n■23:value=small;range=below_average;scale=low■ \n■24:value=medium;range=average;scale=moderate■ \n■25:value=medium;range=average;scale=moderate■ \n■26:value=medium;range=average;scale=moderate■ \n■27:value=medium;range=average;scale=moderate■ \n■28:value=medium;range=average;scale=moderate■ \n■29:value=medium;range=average;scale=moderate■ \n■30:value=medium;range=average;scale=moderate■ \n■31:value=medium;range=average;scale=moderate■ \n■32:value=medium;range=average;scale=moderate■ \n■33:value=medium;range=average;scale=moderate■ \n■34:value=medium;range=average;scale=moderate■ \n■35:value=large;range=above_average;scale=high■ \n■36:value=large;range=above_average;scale=high■ \n■37:value=large;range=above_average;scale=high■ \n■38:value=large;range=above_average;scale=high■ \n■39:value=large;range=above_average;scale=high■ \n■40:value=large;range=above_average;scale=high■ \n■41:value=large;range=above_average;scale=high■ \n■42:value=large;range=above_average;scale=high■ \n■43:value=large;range=above_average;scale=high■ \n■44:value=large;range=above_average;scale=high■ \n■45:value=large;range=above_average;scale=high■ \n■46:value=largest;range=high;scale=maximum■ \n■47:value=largest;range=high;scale=maximum■ \n■48:value=largest;range=high;scale=maximum■ \n■49:value=largest;range=high;scale=maximum■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":960,"inputCost":0.0024000000000000002,"outputTokens":1151,"outputCost":0.011510000000000001},"botpress":{"cost":0.013910000000000002}},"meta":{"cached":true}}}
868
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-10","key":"9c495650","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■49):\\n■0: {\\\"id\\\":400,\\\"value\\\":823}■\\n■1: {\\\"id\\\":401,\\\"value\\\":840}■\\n■2: {\\\"id\\\":402,\\\"value\\\":857}■\\n■3: {\\\"id\\\":403,\\\"value\\\":874}■\\n■4: {\\\"id\\\":404,\\\"value\\\":891}■\\n■5: {\\\"id\\\":405,\\\"value\\\":908}■\\n■6: {\\\"id\\\":406,\\\"value\\\":925}■\\n■7: {\\\"id\\\":407,\\\"value\\\":942}■\\n■8: {\\\"id\\\":408,\\\"value\\\":959}■\\n■9: {\\\"id\\\":409,\\\"value\\\":976}■\\n■10: {\\\"id\\\":410,\\\"value\\\":993}■\\n■11: {\\\"id\\\":411,\\\"value\\\":10}■\\n■12: {\\\"id\\\":412,\\\"value\\\":27}■\\n■13: {\\\"id\\\":413,\\\"value\\\":44}■\\n■14: {\\\"id\\\":414,\\\"value\\\":61}■\\n■15: {\\\"id\\\":415,\\\"value\\\":78}■\\n■16: {\\\"id\\\":416,\\\"value\\\":95}■\\n■17: {\\\"id\\\":417,\\\"value\\\":112}■\\n■18: {\\\"id\\\":418,\\\"value\\\":129}■\\n■19: {\\\"id\\\":419,\\\"value\\\":146}■\\n■20: {\\\"id\\\":420,\\\"value\\\":163}■\\n■21: {\\\"id\\\":421,\\\"value\\\":180}■\\n■22: {\\\"id\\\":422,\\\"value\\\":197}■\\n■23: {\\\"id\\\":423,\\\"value\\\":214}■\\n■24: {\\\"id\\\":424,\\\"value\\\":231}■\\n■25: {\\\"id\\\":425,\\\"value\\\":248}■\\n■26: {\\\"id\\\":426,\\\"value\\\":265}■\\n■27: {\\\"id\\\":427,\\\"value\\\":282}■\\n■28: {\\\"id\\\":428,\\\"value\\\":299}■\\n■29: {\\\"id\\\":429,\\\"value\\\":316}■\\n■30: {\\\"id\\\":430,\\\"value\\\":333}■\\n■31: {\\\"id\\\":431,\\\"value\\\":350}■\\n■32: {\\\"id\\\":432,\\\"value\\\":367}■\\n■33: {\\\"id\\\":433,\\\"value\\\":384}■\\n■34: {\\\"id\\\":434,\\\"value\\\":401}■\\n■35: {\\\"id\\\":435,\\\"value\\\":418}■\\n■36: {\\\"id\\\":436,\\\"value\\\":435}■\\n■37: {\\\"id\\\":437,\\\"value\\\":452}■\\n■38: {\\\"id\\\":438,\\\"value\\\":469}■\\n■39: {\\\"id\\\":439,\\\"value\\\":486}■\\n■40: {\\\"id\\\":440,\\\"value\\\":503}■\\n■41: {\\\"id\\\":441,\\\"value\\\":520}■\\n■42: {\\\"id\\\":442,\\\"value\\\":537}■\\n■43: {\\\"id\\\":443,\\\"value\\\":554}■\\n■44: {\\\"id\\\":444,\\\"value\\\":571}■\\n■45: {\\\"id\\\":445,\\\"value\\\":588}■\\n■46: {\\\"id\\\":446,\\\"value\\\":605}■\\n■47: {\\\"id\\\":447,\\\"value\\\":622}■\\n■48: {\\\"id\\\":448,\\\"value\\\":639}■\\n■49: {\\\"id\\\":449,\\\"value\\\":656}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**value**: smallest;small;medium;large;largest\\n**range**: low;below_average;average;above_average;high\\n**scale**: minimum;low;moderate;high;maximum\\n\\nInstructions: \\\"from smallest value to largest value\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■49)\\n- Use exact criterion names: value, range, scale\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwRnjV5U0P7abCAkTI0KlfhD3gS","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"To rank each item according to the criteria in ascending order (from smallest to largest), here's the labeled result:\n\n■0:value=large;range=above_average■ \n■1:value=large;range=above_average■ \n■2:value=large;range=above_average■ \n■3:value=large;range=above_average■ \n■4:value=large;range=above_average■ \n■5:value=large;range=above_average■ \n■6:value=large;range=high■ \n■7:value=large;range=high■ \n■8:value=large;range=high■ \n■9:value=large;range=high■ \n■10:value=large;range=high■ \n■11:value=smallest;range=low■ \n■12:value=smallest;range=low■   \n■12:value=b-above-low-present -Query-a-test_SS","index":0,"stopReason":"stop"}],"usage":{"inputTokens":960,"inputCost":0.0024000000000000002,"outputTokens":187,"outputCost":0.0018700000000000001},"botpress":{"cost":0.00427}},"meta":{"cached":true}}}
869
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-7","key":"e5c0c739","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■49):\\n■0: {\\\"id\\\":250,\\\"value\\\":273}■\\n■1: {\\\"id\\\":251,\\\"value\\\":290}■\\n■2: {\\\"id\\\":252,\\\"value\\\":307}■\\n■3: {\\\"id\\\":253,\\\"value\\\":324}■\\n■4: {\\\"id\\\":254,\\\"value\\\":341}■\\n■5: {\\\"id\\\":255,\\\"value\\\":358}■\\n■6: {\\\"id\\\":256,\\\"value\\\":375}■\\n■7: {\\\"id\\\":257,\\\"value\\\":392}■\\n■8: {\\\"id\\\":258,\\\"value\\\":409}■\\n■9: {\\\"id\\\":259,\\\"value\\\":426}■\\n■10: {\\\"id\\\":260,\\\"value\\\":443}■\\n■11: {\\\"id\\\":261,\\\"value\\\":460}■\\n■12: {\\\"id\\\":262,\\\"value\\\":477}■\\n■13: {\\\"id\\\":263,\\\"value\\\":494}■\\n■14: {\\\"id\\\":264,\\\"value\\\":511}■\\n■15: {\\\"id\\\":265,\\\"value\\\":528}■\\n■16: {\\\"id\\\":266,\\\"value\\\":545}■\\n■17: {\\\"id\\\":267,\\\"value\\\":562}■\\n■18: {\\\"id\\\":268,\\\"value\\\":579}■\\n■19: {\\\"id\\\":269,\\\"value\\\":596}■\\n■20: {\\\"id\\\":270,\\\"value\\\":613}■\\n■21: {\\\"id\\\":271,\\\"value\\\":630}■\\n■22: {\\\"id\\\":272,\\\"value\\\":647}■\\n■23: {\\\"id\\\":273,\\\"value\\\":664}■\\n■24: {\\\"id\\\":274,\\\"value\\\":681}■\\n■25: {\\\"id\\\":275,\\\"value\\\":698}■\\n■26: {\\\"id\\\":276,\\\"value\\\":715}■\\n■27: {\\\"id\\\":277,\\\"value\\\":732}■\\n■28: {\\\"id\\\":278,\\\"value\\\":749}■\\n■29: {\\\"id\\\":279,\\\"value\\\":766}■\\n■30: {\\\"id\\\":280,\\\"value\\\":783}■\\n■31: {\\\"id\\\":281,\\\"value\\\":800}■\\n■32: {\\\"id\\\":282,\\\"value\\\":817}■\\n■33: {\\\"id\\\":283,\\\"value\\\":834}■\\n■34: {\\\"id\\\":284,\\\"value\\\":851}■\\n■35: {\\\"id\\\":285,\\\"value\\\":868}■\\n■36: {\\\"id\\\":286,\\\"value\\\":885}■\\n■37: {\\\"id\\\":287,\\\"value\\\":902}■\\n■38: {\\\"id\\\":288,\\\"value\\\":919}■\\n■39: {\\\"id\\\":289,\\\"value\\\":936}■\\n■40: {\\\"id\\\":290,\\\"value\\\":953}■\\n■41: {\\\"id\\\":291,\\\"value\\\":970}■\\n■42: {\\\"id\\\":292,\\\"value\\\":987}■\\n■43: {\\\"id\\\":293,\\\"value\\\":4}■\\n■44: {\\\"id\\\":294,\\\"value\\\":21}■\\n■45: {\\\"id\\\":295,\\\"value\\\":38}■\\n■46: {\\\"id\\\":296,\\\"value\\\":55}■\\n■47: {\\\"id\\\":297,\\\"value\\\":72}■\\n■48: {\\\"id\\\":298,\\\"value\\\":89}■\\n■49: {\\\"id\\\":299,\\\"value\\\":106}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**value**: smallest;small;medium;large;largest\\n**range**: low;below_average;average;above_average;high\\n**scale**: minimum;low;moderate;high;maximum\\n\\nInstructions: \\\"from smallest value to largest value\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■49)\\n- Use exact criterion names: value, range, scale\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwQsI03IRLiWRMhSzWp5UBP76km","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"To rank the items, I will classify values into the labeled scales (`value`, `range`, and `scale`) based on ordered categories. The categorization will be determined based on a logical distribution of values within the range provided.\n\nFirst, let's analyze the range of the `value` field:\n- Minimum value: 4\n- Maximum value: 987\n- Range: 987 - 4 = 983\n\nI will distribute these values into categories (`smallest`, `small`, `medium`, `large`, `largest`) and assign corresponding labels for `range` (`low`, `below_average`, `average`, `above_average`, `high`) and `scale` (`minimum`, `low`, `moderate`, `high`, `maximum`). The boundaries will fall approximately like this:\n\n| Category | Value Range |\n|----------------|-----------------|\n| smallest | 4 - 198 |\n| small | 199 - 392 |\n| medium | 393 - 586 |\n| large | 587 - 780 |\n| largest | 781 - 987 |\n\nWith this distribution, here is the ranked output for each item:\n\n■0:value=small;range=low■ \n■1:value=small;range=low■ \n■2:value=small;range=low■ \n■3:value=small;range=below_average■ \n■4:value=small;range=below_average■ \n■5:value=small;range=below_average■ \n■6:value=small;range=below_average■ \n■7:value=small;range=below_average■ \n■8:value=medium;range=average■ \n■9:value=medium;range=average■ \n■10:value=medium;range=average■ \n■11:value=medium;range=average■ \n■12:value=medium;range=average■ \n■13:value=medium;range=average■ \n■14:value=medium;range=average■ \n■15:value=medium;range=average■ \n■16:value=medium;range=above_average■ \n■17:value=medium;range=above_average■ \n■18:value=medium;range=above_average■ \n■19:value=medium;range=above_average■ \n■20:value=medium;range=above_average■ \n■21:value=large;range=high■ \n■22:value=large;range=high■ \n■23:value=large;range=high■ \n■24:value=large;range=high■ \n■25:value=large;range=high■ \n■26:value=large;range=high■ \n■27:value=large;range=high■ \n■28:value=largest;range=high■ \n■29:value=largest;range=high■ \n■30:value=largest;range=high■ \n■31:value=largest;range=high■ \n■32:value=largest;range=high■ \n■33:value=largest;range=high■ \n■34:value=largest;range=high■ \n■35:value=largest;range=high■ \n■36:value=largest;range=high■ \n■37:value=largest;range=highest■>\n\n\nremaining","index":0,"stopReason":"stop"}],"usage":{"inputTokens":960,"inputCost":0.0024000000000000002,"outputTokens":672,"outputCost":0.00672},"botpress":{"cost":0.00912}},"meta":{"cached":true}}}
870
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-6","key":"3ba76bc2","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■49):\\n■0: {\\\"id\\\":200,\\\"value\\\":423}■\\n■1: {\\\"id\\\":201,\\\"value\\\":440}■\\n■2: {\\\"id\\\":202,\\\"value\\\":457}■\\n■3: {\\\"id\\\":203,\\\"value\\\":474}■\\n■4: {\\\"id\\\":204,\\\"value\\\":491}■\\n■5: {\\\"id\\\":205,\\\"value\\\":508}■\\n■6: {\\\"id\\\":206,\\\"value\\\":525}■\\n■7: {\\\"id\\\":207,\\\"value\\\":542}■\\n■8: {\\\"id\\\":208,\\\"value\\\":559}■\\n■9: {\\\"id\\\":209,\\\"value\\\":576}■\\n■10: {\\\"id\\\":210,\\\"value\\\":593}■\\n■11: {\\\"id\\\":211,\\\"value\\\":610}■\\n■12: {\\\"id\\\":212,\\\"value\\\":627}■\\n■13: {\\\"id\\\":213,\\\"value\\\":644}■\\n■14: {\\\"id\\\":214,\\\"value\\\":661}■\\n■15: {\\\"id\\\":215,\\\"value\\\":678}■\\n■16: {\\\"id\\\":216,\\\"value\\\":695}■\\n■17: {\\\"id\\\":217,\\\"value\\\":712}■\\n■18: {\\\"id\\\":218,\\\"value\\\":729}■\\n■19: {\\\"id\\\":219,\\\"value\\\":746}■\\n■20: {\\\"id\\\":220,\\\"value\\\":763}■\\n■21: {\\\"id\\\":221,\\\"value\\\":780}■\\n■22: {\\\"id\\\":222,\\\"value\\\":797}■\\n■23: {\\\"id\\\":223,\\\"value\\\":814}■\\n■24: {\\\"id\\\":224,\\\"value\\\":831}■\\n■25: {\\\"id\\\":225,\\\"value\\\":848}■\\n■26: {\\\"id\\\":226,\\\"value\\\":865}■\\n■27: {\\\"id\\\":227,\\\"value\\\":882}■\\n■28: {\\\"id\\\":228,\\\"value\\\":899}■\\n■29: {\\\"id\\\":229,\\\"value\\\":916}■\\n■30: {\\\"id\\\":230,\\\"value\\\":933}■\\n■31: {\\\"id\\\":231,\\\"value\\\":950}■\\n■32: {\\\"id\\\":232,\\\"value\\\":967}■\\n■33: {\\\"id\\\":233,\\\"value\\\":984}■\\n■34: {\\\"id\\\":234,\\\"value\\\":1}■\\n■35: {\\\"id\\\":235,\\\"value\\\":18}■\\n■36: {\\\"id\\\":236,\\\"value\\\":35}■\\n■37: {\\\"id\\\":237,\\\"value\\\":52}■\\n■38: {\\\"id\\\":238,\\\"value\\\":69}■\\n■39: {\\\"id\\\":239,\\\"value\\\":86}■\\n■40: {\\\"id\\\":240,\\\"value\\\":103}■\\n■41: {\\\"id\\\":241,\\\"value\\\":120}■\\n■42: {\\\"id\\\":242,\\\"value\\\":137}■\\n■43: {\\\"id\\\":243,\\\"value\\\":154}■\\n■44: {\\\"id\\\":244,\\\"value\\\":171}■\\n■45: {\\\"id\\\":245,\\\"value\\\":188}■\\n■46: {\\\"id\\\":246,\\\"value\\\":205}■\\n■47: {\\\"id\\\":247,\\\"value\\\":222}■\\n■48: {\\\"id\\\":248,\\\"value\\\":239}■\\n■49: {\\\"id\\\":249,\\\"value\\\":256}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**value**: smallest;small;medium;large;largest\\n**range**: low;below_average;average;above_average;high\\n**scale**: minimum;low;moderate;high;maximum\\n\\nInstructions: \\\"from smallest value to largest value\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■49)\\n- Use exact criterion names: value, range, scale\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwQz5IEmssqU9bRgSG06hJmZkVp","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"To rank the provided items using the labeled scales, each item is analyzed by its \"value\". The values are scaled, and judgments based on \"value\", \"range\", and \"scale\" are completed. Here's the sorted ranking from smallest to largest value:\n\n■0:value=medium;range=below_average■ \n■1:value=medium;range=below_average■ \n■2:value=below majority","index":0,"stopReason":"stop"}],"usage":{"inputTokens":960,"inputCost":0.0024000000000000002,"outputTokens":81,"outputCost":0.0008100000000000001},"botpress":{"cost":0.00321}},"meta":{"cached":true}}}
871
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-5","key":"3014b8e5","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■49):\\n■0: {\\\"id\\\":150,\\\"value\\\":573}■\\n■1: {\\\"id\\\":151,\\\"value\\\":590}■\\n■2: {\\\"id\\\":152,\\\"value\\\":607}■\\n■3: {\\\"id\\\":153,\\\"value\\\":624}■\\n■4: {\\\"id\\\":154,\\\"value\\\":641}■\\n■5: {\\\"id\\\":155,\\\"value\\\":658}■\\n■6: {\\\"id\\\":156,\\\"value\\\":675}■\\n■7: {\\\"id\\\":157,\\\"value\\\":692}■\\n■8: {\\\"id\\\":158,\\\"value\\\":709}■\\n■9: {\\\"id\\\":159,\\\"value\\\":726}■\\n■10: {\\\"id\\\":160,\\\"value\\\":743}■\\n■11: {\\\"id\\\":161,\\\"value\\\":760}■\\n■12: {\\\"id\\\":162,\\\"value\\\":777}■\\n■13: {\\\"id\\\":163,\\\"value\\\":794}■\\n■14: {\\\"id\\\":164,\\\"value\\\":811}■\\n■15: {\\\"id\\\":165,\\\"value\\\":828}■\\n■16: {\\\"id\\\":166,\\\"value\\\":845}■\\n■17: {\\\"id\\\":167,\\\"value\\\":862}■\\n■18: {\\\"id\\\":168,\\\"value\\\":879}■\\n■19: {\\\"id\\\":169,\\\"value\\\":896}■\\n■20: {\\\"id\\\":170,\\\"value\\\":913}■\\n■21: {\\\"id\\\":171,\\\"value\\\":930}■\\n■22: {\\\"id\\\":172,\\\"value\\\":947}■\\n■23: {\\\"id\\\":173,\\\"value\\\":964}■\\n■24: {\\\"id\\\":174,\\\"value\\\":981}■\\n■25: {\\\"id\\\":175,\\\"value\\\":998}■\\n■26: {\\\"id\\\":176,\\\"value\\\":15}■\\n■27: {\\\"id\\\":177,\\\"value\\\":32}■\\n■28: {\\\"id\\\":178,\\\"value\\\":49}■\\n■29: {\\\"id\\\":179,\\\"value\\\":66}■\\n■30: {\\\"id\\\":180,\\\"value\\\":83}■\\n■31: {\\\"id\\\":181,\\\"value\\\":100}■\\n■32: {\\\"id\\\":182,\\\"value\\\":117}■\\n■33: {\\\"id\\\":183,\\\"value\\\":134}■\\n■34: {\\\"id\\\":184,\\\"value\\\":151}■\\n■35: {\\\"id\\\":185,\\\"value\\\":168}■\\n■36: {\\\"id\\\":186,\\\"value\\\":185}■\\n■37: {\\\"id\\\":187,\\\"value\\\":202}■\\n■38: {\\\"id\\\":188,\\\"value\\\":219}■\\n■39: {\\\"id\\\":189,\\\"value\\\":236}■\\n■40: {\\\"id\\\":190,\\\"value\\\":253}■\\n■41: {\\\"id\\\":191,\\\"value\\\":270}■\\n■42: {\\\"id\\\":192,\\\"value\\\":287}■\\n■43: {\\\"id\\\":193,\\\"value\\\":304}■\\n■44: {\\\"id\\\":194,\\\"value\\\":321}■\\n■45: {\\\"id\\\":195,\\\"value\\\":338}■\\n■46: {\\\"id\\\":196,\\\"value\\\":355}■\\n■47: {\\\"id\\\":197,\\\"value\\\":372}■\\n■48: {\\\"id\\\":198,\\\"value\\\":389}■\\n■49: {\\\"id\\\":199,\\\"value\\\":406}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**value**: smallest;small;medium;large;largest\\n**range**: low;below_average;average;above_average;high\\n**scale**: minimum;low;moderate;high;maximum\\n\\nInstructions: \\\"from smallest value to largest value\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■49)\\n- Use exact criterion names: value, range, scale\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwRaYB9RDBlwAJ2oX7mTkyoUfpF","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"To rank these items in ascending order, we will assign labels based on their \"value\" values. The labels for each criterion are distributed as evenly as possible across the range of values (from 15 to 998). Here's the labeled ranking:\n\n■0:value=medium;range=average;scale=moderate■ \n■1:value=medium;range=average;scale=moderate■ \n■2:value=medium;range=average;scale=moderate■ \n■3:value=medium;range=average;scale=moderate■ \n■4:value=medium;range=average;scale=moderate■ \n■5:value=medium;range=average;scale=moderate■ \n■6:value=large;range=above_average;scale=high■ \n■7:value=large;range=above_average;scale=high■ \n■8:value=large;range=above_average;scale=high■ \n■9:value=large;range=above_average;scale=high■ \n■10:value=large;range=above_average;scale=high■ \n■11:value=large;range=above_average;scale=high■ \n■12:value=large;range=above_average;scale=high■ \n■13:value=large;range=above_average;scale=high■ \n■14:value=large;range=above_average;scale=high■ \n■15:value=large;range=high;scale=high■ \n■16:value=large;range=high;scale=high■ \n■17:value=large;range=high;scale=maximum■ \n■18:value=large;range=high;scale=maximum■ \n■19:value=largest;range=high;scale=maximum■ \n■20:value=largest;range=high;scale=maximum■ \n■21:value=largest;range=high;scale=maximum■ \n■22:value=largest;range=high;scale=maximum■ \n■23:value=largest;range=high;scale=maximum■ \n■24:value=largest;range=high;scale=maximum■ \n■25:value=largest;range=high;scale=maximum■ \n■26:value=smallest;range=low;scale=minimum■ \n■27:value=smallest;range=low;scale=minimum■ \n■28:value=smallest;range=low;scale=minimum■ \n■29:value=smallest;range=low;scale=minimum■ \n■30:value=smallest;range=low;scale=minimum■ \n■31:value=small;range=below_average;scale=low■ \n■32:value=small;range=below_average;scale=low■ \n■33:value=small;range=below_average;scale=low■ \n■34:value=small;range=below_average;scale=low■ \n■35:value=small;range=below_average;scale=low■ \n■36:value=small;range=below_average;scale=low■ \n■37:value=small;range=below_average;scale=low■ \n■38:value=medium;range=average;scale=moderate■ \n■39:value=medium;range=average;scale=moderate■ \n■40:value=medium;range=average;scale=moderate■ \n■41:value=medium;range=average;scale=moderate■ \n■42:value=medium;range=average;scale=moderate■ \n■43:value=medium;range=average;scale=moderate■ \n---","index":0,"stopReason":"stop"}],"usage":{"inputTokens":960,"inputCost":0.0024000000000000002,"outputTokens":743,"outputCost":0.007430000000000001},"botpress":{"cost":0.009830000000000002}},"meta":{"cached":true}}}
872
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-4","key":"821d87d8","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■49):\\n■0: {\\\"id\\\":100,\\\"value\\\":723}■\\n■1: {\\\"id\\\":101,\\\"value\\\":740}■\\n■2: {\\\"id\\\":102,\\\"value\\\":757}■\\n■3: {\\\"id\\\":103,\\\"value\\\":774}■\\n■4: {\\\"id\\\":104,\\\"value\\\":791}■\\n■5: {\\\"id\\\":105,\\\"value\\\":808}■\\n■6: {\\\"id\\\":106,\\\"value\\\":825}■\\n■7: {\\\"id\\\":107,\\\"value\\\":842}■\\n■8: {\\\"id\\\":108,\\\"value\\\":859}■\\n■9: {\\\"id\\\":109,\\\"value\\\":876}■\\n■10: {\\\"id\\\":110,\\\"value\\\":893}■\\n■11: {\\\"id\\\":111,\\\"value\\\":910}■\\n■12: {\\\"id\\\":112,\\\"value\\\":927}■\\n■13: {\\\"id\\\":113,\\\"value\\\":944}■\\n■14: {\\\"id\\\":114,\\\"value\\\":961}■\\n■15: {\\\"id\\\":115,\\\"value\\\":978}■\\n■16: {\\\"id\\\":116,\\\"value\\\":995}■\\n■17: {\\\"id\\\":117,\\\"value\\\":12}■\\n■18: {\\\"id\\\":118,\\\"value\\\":29}■\\n■19: {\\\"id\\\":119,\\\"value\\\":46}■\\n■20: {\\\"id\\\":120,\\\"value\\\":63}■\\n■21: {\\\"id\\\":121,\\\"value\\\":80}■\\n■22: {\\\"id\\\":122,\\\"value\\\":97}■\\n■23: {\\\"id\\\":123,\\\"value\\\":114}■\\n■24: {\\\"id\\\":124,\\\"value\\\":131}■\\n■25: {\\\"id\\\":125,\\\"value\\\":148}■\\n■26: {\\\"id\\\":126,\\\"value\\\":165}■\\n■27: {\\\"id\\\":127,\\\"value\\\":182}■\\n■28: {\\\"id\\\":128,\\\"value\\\":199}■\\n■29: {\\\"id\\\":129,\\\"value\\\":216}■\\n■30: {\\\"id\\\":130,\\\"value\\\":233}■\\n■31: {\\\"id\\\":131,\\\"value\\\":250}■\\n■32: {\\\"id\\\":132,\\\"value\\\":267}■\\n■33: {\\\"id\\\":133,\\\"value\\\":284}■\\n■34: {\\\"id\\\":134,\\\"value\\\":301}■\\n■35: {\\\"id\\\":135,\\\"value\\\":318}■\\n■36: {\\\"id\\\":136,\\\"value\\\":335}■\\n■37: {\\\"id\\\":137,\\\"value\\\":352}■\\n■38: {\\\"id\\\":138,\\\"value\\\":369}■\\n■39: {\\\"id\\\":139,\\\"value\\\":386}■\\n■40: {\\\"id\\\":140,\\\"value\\\":403}■\\n■41: {\\\"id\\\":141,\\\"value\\\":420}■\\n■42: {\\\"id\\\":142,\\\"value\\\":437}■\\n■43: {\\\"id\\\":143,\\\"value\\\":454}■\\n■44: {\\\"id\\\":144,\\\"value\\\":471}■\\n■45: {\\\"id\\\":145,\\\"value\\\":488}■\\n■46: {\\\"id\\\":146,\\\"value\\\":505}■\\n■47: {\\\"id\\\":147,\\\"value\\\":522}■\\n■48: {\\\"id\\\":148,\\\"value\\\":539}■\\n■49: {\\\"id\\\":149,\\\"value\\\":556}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**value**: smallest;small;medium;large;largest\\n**range**: low;below_average;average;above_average;high\\n**scale**: minimum;low;moderate;high;maximum\\n\\nInstructions: \\\"from smallest value to largest value\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■49)\\n- Use exact criterion names: value, range, scale\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwQbcRlVY0Fh8rbX12q3dbkioxn","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"To rank the items, we allocate one label from each criterion (value, range, scale) to every item based on its value. Lower values correspond to labels closer to the start of each scale, while higher values correspond to labels near the end.\n\nLet’s process and categorize accordingly.\n\n---\n\n■0:value=medium;range=average;scale=moderate■ \n■1:value=medium;range=average;scale=moderate■ \n■2:value=medium;range=average;scale=moderate■ \n■3:value=medium;range=average;scale=moderate■ \n■4:value=medium;range=average;scale=moderate■ \n■5:value=medium;range=above_average;scale=moderate■ \n■6:value=medium;range=above_average;scale=moderate■ \n■7:value=large;range=above_average;scale=high■ \n■8:value=large;range=above_average;scale=high■ \n■9:value=large;range=above_average;scale=high■ \n■10:value=large;range=high;scale=high■ \n■11:value=large;range=high;scale=high■ \n■12:value=largest;range=high;scale=maximum■ \n■13:value=largest;range=high;scale=maximum■ \n■14:value=largest;range=high;scale=maximum■ \n■15:value=largest;range=high;scale=maximum■ \n■16:value=largest;range=high;scale=maximum■ \n■17:value=smallest;range=low;scale=minimum■ \n■18:value=smallest;range=low;scale=minimum■ \n■19:value=smallest;range=low;scale=minimum■ \n■20:value=smallest;range=low;scale=minimum■ \n■21:value=smallest;range=low;scale=low■ \n■22:value=small;range=low;scale=low■ \n■23:value=small;range=below_average;scale=low■ \n■24:value=small;range=below_average;scale=low■ \n■25:value=small;range=below_average;scale=low■ \n■26:value=small;range=below_average;scale=low■ \n■27:value=small;range=below_average;scale=low■ \n■28:value=small;range=average;scale=moderate■ \n■29:value=medium;range=average;scale=moderate■ \n■30:value=medium;range=average;scale=moderate■ \n■31:value=medium;range=average;scale=moderate■ \n■32:value=medium;range=average;scale=moderate■ \n■33:value=medium;range=average;scale=moderate■ \n■34:value=medium;range=average;scale=moderate■ \n■35:value=medium;range=average;scale=moderate■ \n■36:value=medium;range=average;scale=moderate■ \n■37:value=medium;range=above_average;scale=moderate■ \n■38:value=medium;range=above_average;scale=moderate■ \n■39:value=medium;range=above_average;scale=moderate■ \n■40:value=medium;range=above_average;scale=moderate■ \n■41:value=medium;range=above_average;scale=moderate■ \n■42:value=large;range=above_average;scale=high■ \n■43:value=large;range=above_average;scale=high■ \n■44:value=large;range=above_average;scale=high■ \n■45:value=large;range=above_average;scale=high■ \n■46:value=large;range=high;scale=high■ \n■47:value=large;range=high;scale=high■ \n■48:value=largest;range=high;scale=maximum■ \n■49:value=largest;range=high;scale=maximum■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":960,"inputCost":0.0024000000000000002,"outputTokens":856,"outputCost":0.00856},"botpress":{"cost":0.010960000000000001}},"meta":{"cached":true}}}
873
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-17","key":"b3164d30","input":"[{\"input\":{\"messages\":[{\"content\":\"Items with identical scores (need tie-breaking):\\n■0: {\\\"id\\\":52,\\\"value\\\":907}■\\n■1: {\\\"id\\\":278,\\\"value\\\":749}■\\n■2: {\\\"id\\\":279,\\\"value\\\":766}■\\n■3: {\\\"id\\\":280,\\\"value\\\":783}■\\n■4: {\\\"id\\\":281,\\\"value\\\":800}■\\n■5: {\\\"id\\\":282,\\\"value\\\":817}■\\n■6: {\\\"id\\\":283,\\\"value\\\":834}■\\n■7: {\\\"id\\\":284,\\\"value\\\":851}■\\n■8: {\\\"id\\\":285,\\\"value\\\":868}■\\n■9: {\\\"id\\\":286,\\\"value\\\":885}■\\n■10: {\\\"id\\\":317,\\\"value\\\":412}■\\n■11: {\\\"id\\\":318,\\\"value\\\":429}■\\n■12: {\\\"id\\\":319,\\\"value\\\":446}■\\n\\nOrder them from first to last.\\nOutput format: ■index■ (one per line)\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are breaking a tie between items with identical total scores.\\n\\nInstructions: from smallest value to largest value\\n\\nCriteria:\\n- value: smallest;small;medium;large;largest\\n- range: low;below_average;average;above_average;high\\n- scale: minimum;low;moderate;high;maximum\\n\\nOrder these 13 items from FIRST to LAST based on the instructions.\\nEarlier labels in each criterion should come FIRST.\\n\\nOutput format:\\n■original_index■\\n■original_index■\\n■END■\\n\\nOutput the indices in the order they should appear (first item at top).\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwbKMwQ6nO4dhcR7yGNot4ySmjQ","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■10■ \n■11■ \n■12■ \n■1■ \n■2■ \n■3■ \n■4■ \n■5■ \n■6■ \n■7■ \n■8■ \n■9■ \n■0■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":343,"inputCost":0.0008575000000000001,"outputTokens":55,"outputCost":0.00055},"botpress":{"cost":0.0014075000000000001}},"meta":{"cached":true}}}
874
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-18","key":"71d92b08","input":"[{\"input\":{\"messages\":[{\"content\":\"Items with identical scores (need tie-breaking):\\n■0: {\\\"id\\\":56,\\\"value\\\":975}■\\n■1: {\\\"id\\\":110,\\\"value\\\":893}■\\n■2: {\\\"id\\\":111,\\\"value\\\":910}■\\n■3: {\\\"id\\\":146,\\\"value\\\":505}■\\n■4: {\\\"id\\\":147,\\\"value\\\":522}■\\n■5: {\\\"id\\\":165,\\\"value\\\":828}■\\n■6: {\\\"id\\\":166,\\\"value\\\":845}■\\n■7: {\\\"id\\\":457,\\\"value\\\":792}■\\n■8: {\\\"id\\\":458,\\\"value\\\":809}■\\n■9: {\\\"id\\\":459,\\\"value\\\":826}■\\n■10: {\\\"id\\\":460,\\\"value\\\":843}■\\n■11: {\\\"id\\\":461,\\\"value\\\":860}■\\n■12: {\\\"id\\\":462,\\\"value\\\":877}■\\n■13: {\\\"id\\\":463,\\\"value\\\":894}■\\n■14: {\\\"id\\\":495,\\\"value\\\":438}■\\n■15: {\\\"id\\\":496,\\\"value\\\":455}■\\n■16: {\\\"id\\\":497,\\\"value\\\":472}■\\n■17: {\\\"id\\\":498,\\\"value\\\":489}■\\n■18: {\\\"id\\\":499,\\\"value\\\":506}■\\n\\nOrder them from first to last.\\nOutput format: ■index■ (one per line)\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are breaking a tie between items with identical total scores.\\n\\nInstructions: from smallest value to largest value\\n\\nCriteria:\\n- value: smallest;small;medium;large;largest\\n- range: low;below_average;average;above_average;high\\n- scale: minimum;low;moderate;high;maximum\\n\\nOrder these 19 items from FIRST to LAST based on the instructions.\\nEarlier labels in each criterion should come FIRST.\\n\\nOutput format:\\n■original_index■\\n■original_index■\\n■END■\\n\\nOutput the indices in the order they should appear (first item at top).\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwbSDoXsPlJLKnUUrwC8WtbIXYC","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■14■ \n■15■ \n■16■ \n■17■ \n■3■ \n■18■ \n■4■ \n■7■ \n■8■ \n■9■ \n■5■ \n■10■ \n■6■ \n■11■ \n■12■ \n■1■ \n■13■ \n■2■ \n■0■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":427,"inputCost":0.0010675,"outputTokens":79,"outputCost":0.00079},"botpress":{"cost":0.0018575000000000002}},"meta":{"cached":true}}}
875
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-15","key":"7809c51b","input":"[{\"input\":{\"messages\":[{\"content\":\"Items with identical scores (need tie-breaking):\\n■0: {\\\"id\\\":33,\\\"value\\\":584}■\\n■1: {\\\"id\\\":34,\\\"value\\\":601}■\\n■2: {\\\"id\\\":35,\\\"value\\\":618}■\\n■3: {\\\"id\\\":36,\\\"value\\\":635}■\\n■4: {\\\"id\\\":37,\\\"value\\\":652}■\\n■5: {\\\"id\\\":38,\\\"value\\\":669}■\\n■6: {\\\"id\\\":39,\\\"value\\\":686}■\\n■7: {\\\"id\\\":40,\\\"value\\\":703}■\\n■8: {\\\"id\\\":41,\\\"value\\\":720}■\\n■9: {\\\"id\\\":42,\\\"value\\\":737}■\\n■10: {\\\"id\\\":53,\\\"value\\\":924}■\\n■11: {\\\"id\\\":54,\\\"value\\\":941}■\\n■12: {\\\"id\\\":55,\\\"value\\\":958}■\\n■13: {\\\"id\\\":107,\\\"value\\\":842}■\\n■14: {\\\"id\\\":108,\\\"value\\\":859}■\\n■15: {\\\"id\\\":109,\\\"value\\\":876}■\\n■16: {\\\"id\\\":142,\\\"value\\\":437}■\\n■17: {\\\"id\\\":143,\\\"value\\\":454}■\\n■18: {\\\"id\\\":144,\\\"value\\\":471}■\\n■19: {\\\"id\\\":145,\\\"value\\\":488}■\\n■20: {\\\"id\\\":156,\\\"value\\\":675}■\\n■21: {\\\"id\\\":157,\\\"value\\\":692}■\\n■22: {\\\"id\\\":158,\\\"value\\\":709}■\\n■23: {\\\"id\\\":159,\\\"value\\\":726}■\\n■24: {\\\"id\\\":160,\\\"value\\\":743}■\\n■25: {\\\"id\\\":161,\\\"value\\\":760}■\\n■26: {\\\"id\\\":162,\\\"value\\\":777}■\\n■27: {\\\"id\\\":163,\\\"value\\\":794}■\\n■28: {\\\"id\\\":164,\\\"value\\\":811}■\\n■29: {\\\"id\\\":320,\\\"value\\\":463}■\\n■30: {\\\"id\\\":321,\\\"value\\\":480}■\\n■31: {\\\"id\\\":385,\\\"value\\\":568}■\\n■32: {\\\"id\\\":386,\\\"value\\\":585}■\\n■33: {\\\"id\\\":387,\\\"value\\\":602}■\\n■34: {\\\"id\\\":388,\\\"value\\\":619}■\\n■35: {\\\"id\\\":389,\\\"value\\\":636}■\\n■36: {\\\"id\\\":390,\\\"value\\\":653}■\\n■37: {\\\"id\\\":391,\\\"value\\\":670}■\\n■38: {\\\"id\\\":392,\\\"value\\\":687}■\\n■39: {\\\"id\\\":393,\\\"value\\\":704}■\\n■40: {\\\"id\\\":394,\\\"value\\\":721}■\\n■41: {\\\"id\\\":395,\\\"value\\\":738}■\\n■42: {\\\"id\\\":455,\\\"value\\\":758}■\\n■43: {\\\"id\\\":456,\\\"value\\\":775}■\\n■44: {\\\"id\\\":494,\\\"value\\\":421}■\\n\\nOrder them from first to last.\\nOutput format: ■index■ (one per line)\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are breaking a tie between items with identical total scores.\\n\\nInstructions: from smallest value to largest value\\n\\nCriteria:\\n- value: smallest;small;medium;large;largest\\n- range: low;below_average;average;above_average;high\\n- scale: minimum;low;moderate;high;maximum\\n\\nOrder these 45 items from FIRST to LAST based on the instructions.\\nEarlier labels in each criterion should come FIRST.\\n\\nOutput format:\\n■original_index■\\n■original_index■\\n■END■\\n\\nOutput the indices in the order they should appear (first item at top).\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwbdXwnYElKD4A1VPSzxukKoEJM","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■44■ \n■16■ \n■17■ \n■29■ \n■18■ \n■30■ \n■19■ \n■0■ \n■31■ \n■32■ \n■1■ \n■33■ \n■2■ \n■34■ \n■3■ \n■35■ \n■4■ \n■36■ \n■20■ \n■37■ \n■6■ \n■38■ \n■7■ \n■39■ \n■8■ \n■40■ \n■9■ \n■41■ \n■42■ \n■25■ \n■43■ \n■26■ \n■27■ \n■28■ \n■13■ \n■14■ \n■15■ \n■10■ \n■11■ \n■12■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":791,"inputCost":0.0019775,"outputTokens":163,"outputCost":0.0016300000000000002},"botpress":{"cost":0.0036075000000000005}},"meta":{"cached":true}}}
876
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-20","key":"c52f9e57","input":"[{\"input\":{\"messages\":[{\"content\":\"Items with identical scores (need tie-breaking):\\n■0: {\\\"id\\\":60,\\\"value\\\":43}■\\n■1: {\\\"id\\\":61,\\\"value\\\":60}■\\n■2: {\\\"id\\\":122,\\\"value\\\":97}■\\n■3: {\\\"id\\\":253,\\\"value\\\":324}■\\n■4: {\\\"id\\\":254,\\\"value\\\":341}■\\n■5: {\\\"id\\\":255,\\\"value\\\":358}■\\n■6: {\\\"id\\\":256,\\\"value\\\":375}■\\n■7: {\\\"id\\\":257,\\\"value\\\":392}■\\n■8: {\\\"id\\\":306,\\\"value\\\":225}■\\n■9: {\\\"id\\\":307,\\\"value\\\":242}■\\n■10: {\\\"id\\\":476,\\\"value\\\":115}■\\n■11: {\\\"id\\\":477,\\\"value\\\":132}■\\n\\nOrder them from first to last.\\nOutput format: ■index■ (one per line)\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are breaking a tie between items with identical total scores.\\n\\nInstructions: from smallest value to largest value\\n\\nCriteria:\\n- value: smallest;small;medium;large;largest\\n- range: low;below_average;average;above_average;high\\n- scale: minimum;low;moderate;high;maximum\\n\\nOrder these 12 items from FIRST to LAST based on the instructions.\\nEarlier labels in each criterion should come FIRST.\\n\\nOutput format:\\n■original_index■\\n■original_index■\\n■END■\\n\\nOutput the indices in the order they should appear (first item at top).\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwbqIhKIVm4CJErXMDbGl1Zdxmc","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0■ \n■1■ \n■10■ \n■11■ \n■2■ \n■8■ \n■9■ \n■3■ \n■4■ \n■5■ \n■6■ \n■7■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":329,"inputCost":0.0008225000000000001,"outputTokens":51,"outputCost":0.00051},"botpress":{"cost":0.0013325000000000001}},"meta":{"cached":true}}}
877
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-21","key":"fc962a75","input":"[{\"input\":{\"messages\":[{\"content\":\"Items with identical scores (need tie-breaking):\\n■0: {\\\"id\\\":66,\\\"value\\\":145}■\\n■1: {\\\"id\\\":258,\\\"value\\\":409}■\\n■2: {\\\"id\\\":259,\\\"value\\\":426}■\\n■3: {\\\"id\\\":260,\\\"value\\\":443}■\\n■4: {\\\"id\\\":261,\\\"value\\\":460}■\\n■5: {\\\"id\\\":262,\\\"value\\\":477}■\\n■6: {\\\"id\\\":263,\\\"value\\\":494}■\\n■7: {\\\"id\\\":264,\\\"value\\\":511}■\\n■8: {\\\"id\\\":265,\\\"value\\\":528}■\\n\\nOrder them from first to last.\\nOutput format: ■index■ (one per line)\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are breaking a tie between items with identical total scores.\\n\\nInstructions: from smallest value to largest value\\n\\nCriteria:\\n- value: smallest;small;medium;large;largest\\n- range: low;below_average;average;above_average;high\\n- scale: minimum;low;moderate;high;maximum\\n\\nOrder these 9 items from FIRST to LAST based on the instructions.\\nEarlier labels in each criterion should come FIRST.\\n\\nOutput format:\\n■original_index■\\n■original_index■\\n■END■\\n\\nOutput the indices in the order they should appear (first item at top).\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwbWcUy8fo8U6MfgWlqMOT2gT3s","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0■ \n■1■ \n■2■ \n■3■ \n■4■ \n■5■ \n■6■ \n■7■ \n■8■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":287,"inputCost":0.0007175,"outputTokens":39,"outputCost":0.00039000000000000005},"botpress":{"cost":0.0011075}},"meta":{"cached":true}}}
878
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-19","key":"768571e8","input":"[{\"input\":{\"messages\":[{\"content\":\"Items with identical scores (need tie-breaking):\\n■0: {\\\"id\\\":59,\\\"value\\\":26}■\\n■1: {\\\"id\\\":121,\\\"value\\\":80}■\\n■2: {\\\"id\\\":250,\\\"value\\\":273}■\\n■3: {\\\"id\\\":251,\\\"value\\\":290}■\\n■4: {\\\"id\\\":252,\\\"value\\\":307}■\\n■5: {\\\"id\\\":304,\\\"value\\\":191}■\\n■6: {\\\"id\\\":305,\\\"value\\\":208}■\\n■7: {\\\"id\\\":474,\\\"value\\\":81}■\\n■8: {\\\"id\\\":475,\\\"value\\\":98}■\\n\\nOrder them from first to last.\\nOutput format: ■index■ (one per line)\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are breaking a tie between items with identical total scores.\\n\\nInstructions: from smallest value to largest value\\n\\nCriteria:\\n- value: smallest;small;medium;large;largest\\n- range: low;below_average;average;above_average;high\\n- scale: minimum;low;moderate;high;maximum\\n\\nOrder these 9 items from FIRST to LAST based on the instructions.\\nEarlier labels in each criterion should come FIRST.\\n\\nOutput format:\\n■original_index■\\n■original_index■\\n■END■\\n\\nOutput the indices in the order they should appear (first item at top).\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwbeD0TL5nMIg4vLRzX8nHZjSal","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"To order these based on the given criteria, prioritize the smallest \"value\" first. \n\nResult:\n\n■0■ \n■1■ \n■7■ \n■8■ \n■5■ \n■6■ \n■2■ \n■3■ \n■4■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":287,"inputCost":0.0007175,"outputTokens":59,"outputCost":0.00059},"botpress":{"cost":0.0013075}},"meta":{"cached":true}}}
879
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-14","key":"16f536a1","input":"[{\"input\":{\"messages\":[{\"content\":\"Items with identical scores (need tie-breaking):\\n■0: {\\\"id\\\":22,\\\"value\\\":397}■\\n■1: {\\\"id\\\":23,\\\"value\\\":414}■\\n■2: {\\\"id\\\":24,\\\"value\\\":431}■\\n■3: {\\\"id\\\":25,\\\"value\\\":448}■\\n■4: {\\\"id\\\":26,\\\"value\\\":465}■\\n■5: {\\\"id\\\":27,\\\"value\\\":482}■\\n■6: {\\\"id\\\":28,\\\"value\\\":499}■\\n■7: {\\\"id\\\":29,\\\"value\\\":516}■\\n■8: {\\\"id\\\":30,\\\"value\\\":533}■\\n■9: {\\\"id\\\":31,\\\"value\\\":550}■\\n■10: {\\\"id\\\":32,\\\"value\\\":567}■\\n■11: {\\\"id\\\":50,\\\"value\\\":873}■\\n■12: {\\\"id\\\":51,\\\"value\\\":890}■\\n■13: {\\\"id\\\":67,\\\"value\\\":162}■\\n■14: {\\\"id\\\":68,\\\"value\\\":179}■\\n■15: {\\\"id\\\":69,\\\"value\\\":196}■\\n■16: {\\\"id\\\":70,\\\"value\\\":213}■\\n■17: {\\\"id\\\":72,\\\"value\\\":247}■\\n■18: {\\\"id\\\":73,\\\"value\\\":264}■\\n■19: {\\\"id\\\":74,\\\"value\\\":281}■\\n■20: {\\\"id\\\":75,\\\"value\\\":298}■\\n■21: {\\\"id\\\":76,\\\"value\\\":315}■\\n■22: {\\\"id\\\":77,\\\"value\\\":332}■\\n■23: {\\\"id\\\":78,\\\"value\\\":349}■\\n■24: {\\\"id\\\":79,\\\"value\\\":366}■\\n■25: {\\\"id\\\":80,\\\"value\\\":383}■\\n■26: {\\\"id\\\":81,\\\"value\\\":400}■\\n■27: {\\\"id\\\":82,\\\"value\\\":417}■\\n■28: {\\\"id\\\":83,\\\"value\\\":434}■\\n■29: {\\\"id\\\":84,\\\"value\\\":451}■\\n■30: {\\\"id\\\":85,\\\"value\\\":468}■\\n■31: {\\\"id\\\":86,\\\"value\\\":485}■\\n■32: {\\\"id\\\":87,\\\"value\\\":502}■\\n■33: {\\\"id\\\":88,\\\"value\\\":519}■\\n■34: {\\\"id\\\":89,\\\"value\\\":536}■\\n■35: {\\\"id\\\":90,\\\"value\\\":553}■\\n■36: {\\\"id\\\":91,\\\"value\\\":570}■\\n■37: {\\\"id\\\":92,\\\"value\\\":587}■\\n■38: {\\\"id\\\":93,\\\"value\\\":604}■\\n■39: {\\\"id\\\":94,\\\"value\\\":621}■\\n■40: {\\\"id\\\":95,\\\"value\\\":638}■\\n■41: {\\\"id\\\":96,\\\"value\\\":655}■\\n■42: {\\\"id\\\":97,\\\"value\\\":672}■\\n■43: {\\\"id\\\":98,\\\"value\\\":689}■\\n■44: {\\\"id\\\":99,\\\"value\\\":706}■\\n■45: {\\\"id\\\":100,\\\"value\\\":723}■\\n■46: {\\\"id\\\":101,\\\"value\\\":740}■\\n■47: {\\\"id\\\":102,\\\"value\\\":757}■\\n■48: {\\\"id\\\":103,\\\"value\\\":774}■\\n■49: {\\\"id\\\":104,\\\"value\\\":791}■\\n■50: {\\\"id\\\":129,\\\"value\\\":216}■\\n■51: {\\\"id\\\":130,\\\"value\\\":233}■\\n■52: {\\\"id\\\":131,\\\"value\\\":250}■\\n■53: {\\\"id\\\":132,\\\"value\\\":267}■\\n■54: {\\\"id\\\":133,\\\"value\\\":284}■\\n■55: {\\\"id\\\":134,\\\"value\\\":301}■\\n■56: {\\\"id\\\":135,\\\"value\\\":318}■\\n■57: {\\\"id\\\":136,\\\"value\\\":335}■\\n■58: {\\\"id\\\":150,\\\"value\\\":573}■\\n■59: {\\\"id\\\":151,\\\"value\\\":590}■\\n■60: {\\\"id\\\":152,\\\"value\\\":607}■\\n■61: {\\\"id\\\":153,\\\"value\\\":624}■\\n■62: {\\\"id\\\":154,\\\"value\\\":641}■\\n■63: {\\\"id\\\":155,\\\"value\\\":658}■\\n■64: {\\\"id\\\":188,\\\"value\\\":219}■\\n■65: {\\\"id\\\":189,\\\"value\\\":236}■\\n■66: {\\\"id\\\":190,\\\"value\\\":253}■\\n■67: {\\\"id\\\":191,\\\"value\\\":270}■\\n■68: {\\\"id\\\":192,\\\"value\\\":287}■\\n■69: {\\\"id\\\":193,\\\"value\\\":304}■\\n■70: {\\\"id\\\":194,\\\"value\\\":321}■\\n■71: {\\\"id\\\":195,\\\"value\\\":338}■\\n■72: {\\\"id\\\":196,\\\"value\\\":355}■\\n■73: {\\\"id\\\":197,\\\"value\\\":372}■\\n■74: {\\\"id\\\":198,\\\"value\\\":389}■\\n■75: {\\\"id\\\":199,\\\"value\\\":406}■\\n■76: {\\\"id\\\":202,\\\"value\\\":457}■\\n■77: {\\\"id\\\":203,\\\"value\\\":474}■\\n■78: {\\\"id\\\":204,\\\"value\\\":491}■\\n■79: {\\\"id\\\":205,\\\"value\\\":508}■\\n■80: {\\\"id\\\":206,\\\"value\\\":525}■\\n■81: {\\\"id\\\":207,\\\"value\\\":542}■\\n■82: {\\\"id\\\":208,\\\"value\\\":559}■\\n■83: {\\\"id\\\":209,\\\"value\\\":576}■\\n■84: {\\\"id\\\":210,\\\"value\\\":593}■\\n■85: {\\\"id\\\":211,\\\"value\\\":610}■\\n■86: {\\\"id\\\":212,\\\"value\\\":627}■\\n■87: {\\\"id\\\":213,\\\"value\\\":644}■\\n■88: {\\\"id\\\":214,\\\"value\\\":661}■\\n■89: {\\\"id\\\":215,\\\"value\\\":678}■\\n■90: {\\\"id\\\":216,\\\"value\\\":695}■\\n■91: {\\\"id\\\":217,\\\"value\\\":712}■\\n■92: {\\\"id\\\":218,\\\"value\\\":729}■\\n■93: {\\\"id\\\":219,\\\"value\\\":746}■\\n■94: {\\\"id\\\":220,\\\"value\\\":763}■\\n■95: {\\\"id\\\":221,\\\"value\\\":780}■\\n■96: {\\\"id\\\":222,\\\"value\\\":797}■\\n■97: {\\\"id\\\":223,\\\"value\\\":814}■\\n■98: {\\\"id\\\":224,\\\"value\\\":831}■\\n■99: {\\\"id\\\":225,\\\"value\\\":848}■\\n■100: {\\\"id\\\":226,\\\"value\\\":865}■\\n■101: {\\\"id\\\":227,\\\"value\\\":882}■\\n■102: {\\\"id\\\":228,\\\"value\\\":899}■\\n■103: {\\\"id\\\":229,\\\"value\\\":916}■\\n■104: {\\\"id\\\":230,\\\"value\\\":933}■\\n■105: {\\\"id\\\":231,\\\"value\\\":950}■\\n■106: {\\\"id\\\":232,\\\"value\\\":967}■\\n■107: {\\\"id\\\":233,\\\"value\\\":984}■\\n■108: {\\\"id\\\":234,\\\"value\\\":1}■\\n■109: {\\\"id\\\":235,\\\"value\\\":18}■\\n■110: {\\\"id\\\":236,\\\"value\\\":35}■\\n■111: {\\\"id\\\":237,\\\"value\\\":52}■\\n■112: {\\\"id\\\":238,\\\"value\\\":69}■\\n■113: {\\\"id\\\":239,\\\"value\\\":86}■\\n■114: {\\\"id\\\":240,\\\"value\\\":103}■\\n■115: {\\\"id\\\":241,\\\"value\\\":120}■\\n■116: {\\\"id\\\":242,\\\"value\\\":137}■\\n■117: {\\\"id\\\":243,\\\"value\\\":154}■\\n■118: {\\\"id\\\":244,\\\"value\\\":171}■\\n■119: {\\\"id\\\":245,\\\"value\\\":188}■\\n■120: {\\\"id\\\":246,\\\"value\\\":205}■\\n■121: {\\\"id\\\":247,\\\"value\\\":222}■\\n■122: {\\\"id\\\":248,\\\"value\\\":239}■\\n■123: {\\\"id\\\":249,\\\"value\\\":256}■\\n■124: {\\\"id\\\":287,\\\"value\\\":902}■\\n■125: {\\\"id\\\":288,\\\"value\\\":919}■\\n■126: {\\\"id\\\":289,\\\"value\\\":936}■\\n■127: {\\\"id\\\":290,\\\"value\\\":953}■\\n■128: {\\\"id\\\":291,\\\"value\\\":970}■\\n■129: {\\\"id\\\":292,\\\"value\\\":987}■\\n■130: {\\\"id\\\":293,\\\"value\\\":4}■\\n■131: {\\\"id\\\":294,\\\"value\\\":21}■\\n■132: {\\\"id\\\":295,\\\"value\\\":38}■\\n■133: {\\\"id\\\":296,\\\"value\\\":55}■\\n■134: {\\\"id\\\":297,\\\"value\\\":72}■\\n■135: {\\\"id\\\":298,\\\"value\\\":89}■\\n■136: {\\\"id\\\":299,\\\"value\\\":106}■\\n■137: {\\\"id\\\":312,\\\"value\\\":327}■\\n■138: {\\\"id\\\":313,\\\"value\\\":344}■\\n■139: {\\\"id\\\":314,\\\"value\\\":361}■\\n■140: {\\\"id\\\":315,\\\"value\\\":378}■\\n■141: {\\\"id\\\":316,\\\"value\\\":395}■\\n■142: {\\\"id\\\":374,\\\"value\\\":381}■\\n■143: {\\\"id\\\":375,\\\"value\\\":398}■\\n■144: {\\\"id\\\":376,\\\"value\\\":415}■\\n■145: {\\\"id\\\":377,\\\"value\\\":432}■\\n■146: {\\\"id\\\":378,\\\"value\\\":449}■\\n■147: {\\\"id\\\":379,\\\"value\\\":466}■\\n■148: {\\\"id\\\":380,\\\"value\\\":483}■\\n■149: {\\\"id\\\":381,\\\"value\\\":500}■\\n■150: {\\\"id\\\":382,\\\"value\\\":517}■\\n■151: {\\\"id\\\":383,\\\"value\\\":534}■\\n■152: {\\\"id\\\":384,\\\"value\\\":551}■\\n■153: {\\\"id\\\":400,\\\"value\\\":823}■\\n■154: {\\\"id\\\":401,\\\"value\\\":840}■\\n■155: {\\\"id\\\":402,\\\"value\\\":857}■\\n■156: {\\\"id\\\":403,\\\"value\\\":874}■\\n■157: {\\\"id\\\":404,\\\"value\\\":891}■\\n■158: {\\\"id\\\":405,\\\"value\\\":908}■\\n■159: {\\\"id\\\":413,\\\"value\\\":44}■\\n■160: {\\\"id\\\":414,\\\"value\\\":61}■\\n■161: {\\\"id\\\":415,\\\"value\\\":78}■\\n■162: {\\\"id\\\":416,\\\"value\\\":95}■\\n■163: {\\\"id\\\":417,\\\"value\\\":112}■\\n■164: {\\\"id\\\":418,\\\"value\\\":129}■\\n■165: {\\\"id\\\":419,\\\"value\\\":146}■\\n■166: {\\\"id\\\":420,\\\"value\\\":163}■\\n■167: {\\\"id\\\":421,\\\"value\\\":180}■\\n■168: {\\\"id\\\":422,\\\"value\\\":197}■\\n■169: {\\\"id\\\":423,\\\"value\\\":214}■\\n■170: {\\\"id\\\":424,\\\"value\\\":231}■\\n■171: {\\\"id\\\":425,\\\"value\\\":248}■\\n■172: {\\\"id\\\":426,\\\"value\\\":265}■\\n■173: {\\\"id\\\":427,\\\"value\\\":282}■\\n■174: {\\\"id\\\":428,\\\"value\\\":299}■\\n■175: {\\\"id\\\":429,\\\"value\\\":316}■\\n■176: {\\\"id\\\":430,\\\"value\\\":333}■\\n■177: {\\\"id\\\":431,\\\"value\\\":350}■\\n■178: {\\\"id\\\":432,\\\"value\\\":367}■\\n■179: {\\\"id\\\":433,\\\"value\\\":384}■\\n■180: {\\\"id\\\":434,\\\"value\\\":401}■\\n■181: {\\\"id\\\":435,\\\"value\\\":418}■\\n■182: {\\\"id\\\":436,\\\"value\\\":435}■\\n■183: {\\\"id\\\":437,\\\"value\\\":452}■\\n■184: {\\\"id\\\":438,\\\"value\\\":469}■\\n■185: {\\\"id\\\":439,\\\"value\\\":486}■\\n■186: {\\\"id\\\":440,\\\"value\\\":503}■\\n■187: {\\\"id\\\":441,\\\"value\\\":520}■\\n■188: {\\\"id\\\":442,\\\"value\\\":537}■\\n■189: {\\\"id\\\":443,\\\"value\\\":554}■\\n■190: {\\\"id\\\":444,\\\"value\\\":571}■\\n■191: {\\\"id\\\":445,\\\"value\\\":588}■\\n■192: {\\\"id\\\":446,\\\"value\\\":605}■\\n■193: {\\\"id\\\":447,\\\"value\\\":622}■\\n■194: {\\\"id\\\":448,\\\"value\\\":639}■\\n■195: {\\\"id\\\":449,\\\"value\\\":656}■\\n■196: {\\\"id\\\":450,\\\"value\\\":673}■\\n■197: {\\\"id\\\":451,\\\"value\\\":690}■\\n■198: {\\\"id\\\":484,\\\"value\\\":251}■\\n■199: {\\\"id\\\":485,\\\"value\\\":268}■\\n■200: {\\\"id\\\":486,\\\"value\\\":285}■\\n■201: {\\\"id\\\":487,\\\"value\\\":302}■\\n■202: {\\\"id\\\":488,\\\"value\\\":319}■\\n\\nOrder them from first to last.\\nOutput format: ■index■ (one per line)\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are breaking a tie between items with identical total scores.\\n\\nInstructions: from smallest value to largest value\\n\\nCriteria:\\n- value: smallest;small;medium;large;largest\\n- range: low;below_average;average;above_average;high\\n- scale: minimum;low;moderate;high;maximum\\n\\nOrder these 203 items from FIRST to LAST based on the instructions.\\nEarlier labels in each criterion should come FIRST.\\n\\nOutput format:\\n■original_index■\\n■original_index■\\n■END■\\n\\nOutput the indices in the order they should appear (first item at top).\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwbJUVBXyPYA2j7I7KPDitnhtsZ","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■108■ \n■130■ \n■109■ \n■131■ \n■110■ \n■132■ \n■111■ \n■133■ \n■112■ \n■134■ \n■113■ \n■135■ \n■114■ \n■136■ \n■115■ \n■137■ \n■116■ \n■165■ \n■167■ \n■119■ \n■245■ \n■120■ \n■246■ \n■169■ \n■423■ \n■50■ \n■188■ \n■64■ \n■121■ \n■247■ \n■171■ \n■425■ \n■198■ \n■249■ \n■190■ \n■67■ \n■66■ \n■484■ \n■129■ \n■131■ \n■191■ \n■195■ \n■71■ \n■136■ \n■429■ \n■123■ \n■121■ \n■433■ \n■72■ \n■231■ \n■429■ \n■93■","index":0,"stopReason":"stop"}],"usage":{"inputTokens":3003,"inputCost":0.0075075,"outputTokens":207,"outputCost":0.0020700000000000002},"botpress":{"cost":0.009577500000000001}},"meta":{"cached":true}}}
880
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-13","key":"c7a99104","input":"[{\"input\":{\"messages\":[{\"content\":\"Items with identical scores (need tie-breaking):\\n■0: {\\\"id\\\":11,\\\"value\\\":210}■\\n■1: {\\\"id\\\":12,\\\"value\\\":227}■\\n■2: {\\\"id\\\":13,\\\"value\\\":244}■\\n■3: {\\\"id\\\":14,\\\"value\\\":261}■\\n■4: {\\\"id\\\":15,\\\"value\\\":278}■\\n■5: {\\\"id\\\":16,\\\"value\\\":295}■\\n■6: {\\\"id\\\":17,\\\"value\\\":312}■\\n■7: {\\\"id\\\":18,\\\"value\\\":329}■\\n■8: {\\\"id\\\":19,\\\"value\\\":346}■\\n■9: {\\\"id\\\":20,\\\"value\\\":363}■\\n■10: {\\\"id\\\":21,\\\"value\\\":380}■\\n■11: {\\\"id\\\":62,\\\"value\\\":77}■\\n■12: {\\\"id\\\":63,\\\"value\\\":94}■\\n■13: {\\\"id\\\":64,\\\"value\\\":111}■\\n■14: {\\\"id\\\":65,\\\"value\\\":128}■\\n■15: {\\\"id\\\":123,\\\"value\\\":114}■\\n■16: {\\\"id\\\":124,\\\"value\\\":131}■\\n■17: {\\\"id\\\":125,\\\"value\\\":148}■\\n■18: {\\\"id\\\":126,\\\"value\\\":165}■\\n■19: {\\\"id\\\":127,\\\"value\\\":182}■\\n■20: {\\\"id\\\":181,\\\"value\\\":100}■\\n■21: {\\\"id\\\":182,\\\"value\\\":117}■\\n■22: {\\\"id\\\":183,\\\"value\\\":134}■\\n■23: {\\\"id\\\":184,\\\"value\\\":151}■\\n■24: {\\\"id\\\":185,\\\"value\\\":168}■\\n■25: {\\\"id\\\":186,\\\"value\\\":185}■\\n■26: {\\\"id\\\":187,\\\"value\\\":202}■\\n■27: {\\\"id\\\":200,\\\"value\\\":423}■\\n■28: {\\\"id\\\":201,\\\"value\\\":440}■\\n■29: {\\\"id\\\":308,\\\"value\\\":259}■\\n■30: {\\\"id\\\":309,\\\"value\\\":276}■\\n■31: {\\\"id\\\":366,\\\"value\\\":245}■\\n■32: {\\\"id\\\":367,\\\"value\\\":262}■\\n■33: {\\\"id\\\":368,\\\"value\\\":279}■\\n■34: {\\\"id\\\":369,\\\"value\\\":296}■\\n■35: {\\\"id\\\":370,\\\"value\\\":313}■\\n■36: {\\\"id\\\":371,\\\"value\\\":330}■\\n■37: {\\\"id\\\":372,\\\"value\\\":347}■\\n■38: {\\\"id\\\":373,\\\"value\\\":364}■\\n■39: {\\\"id\\\":478,\\\"value\\\":149}■\\n■40: {\\\"id\\\":479,\\\"value\\\":166}■\\n■41: {\\\"id\\\":480,\\\"value\\\":183}■\\n■42: {\\\"id\\\":481,\\\"value\\\":200}■\\n■43: {\\\"id\\\":482,\\\"value\\\":217}■\\n\\nOrder them from first to last.\\nOutput format: ■index■ (one per line)\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are breaking a tie between items with identical total scores.\\n\\nInstructions: from smallest value to largest value\\n\\nCriteria:\\n- value: smallest;small;medium;large;largest\\n- range: low;below_average;average;above_average;high\\n- scale: minimum;low;moderate;high;maximum\\n\\nOrder these 44 items from FIRST to LAST based on the instructions.\\nEarlier labels in each criterion should come FIRST.\\n\\nOutput format:\\n■original_index■\\n■original_index■\\n■END■\\n\\nOutput the indices in the order they should appear (first item at top).\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwbwnI8TIkMXV2AfDU7od1s7tQv","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"Here’s the ordered output based on the stated criteria:\n\n■11■ \n■12■ \n■20■ \n■13■ \n■15■ \n■21■ \n■14■ \n■16■ \n■22■ \n■17■ \n■23■ \n■39■ \n■18■ \n■24■ \n■40■ \n■19■ \n■25■ \n■41■ \n■26■ \n■42■ \n■0■ \n■43■ \n■1■ \n■2■ \n■31■ \n■3■ \n■29■ \n■32■ \n■4■ \n■30■ \n■33■ \n■5■ \n■34■ \n■6■ \n■35■ \n■7■ \n■36■ \n■8■ \n■37■ \n■9■ \n■38■ \n■10■ \n■27■ \n■28■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":777,"inputCost":0.0019425000000000002,"outputTokens":190,"outputCost":0.0019000000000000002},"botpress":{"cost":0.0038425000000000004}},"meta":{"cached":true}}}
881
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-12","key":"c739f90d","input":"[{\"input\":{\"messages\":[{\"content\":\"Items with identical scores (need tie-breaking):\\n■0: {\\\"id\\\":0,\\\"value\\\":23}■\\n■1: {\\\"id\\\":1,\\\"value\\\":40}■\\n■2: {\\\"id\\\":2,\\\"value\\\":57}■\\n■3: {\\\"id\\\":3,\\\"value\\\":74}■\\n■4: {\\\"id\\\":4,\\\"value\\\":91}■\\n■5: {\\\"id\\\":5,\\\"value\\\":108}■\\n■6: {\\\"id\\\":6,\\\"value\\\":125}■\\n■7: {\\\"id\\\":7,\\\"value\\\":142}■\\n■8: {\\\"id\\\":8,\\\"value\\\":159}■\\n■9: {\\\"id\\\":9,\\\"value\\\":176}■\\n■10: {\\\"id\\\":10,\\\"value\\\":193}■\\n■11: {\\\"id\\\":58,\\\"value\\\":9}■\\n■12: {\\\"id\\\":117,\\\"value\\\":12}■\\n■13: {\\\"id\\\":118,\\\"value\\\":29}■\\n■14: {\\\"id\\\":119,\\\"value\\\":46}■\\n■15: {\\\"id\\\":120,\\\"value\\\":63}■\\n■16: {\\\"id\\\":176,\\\"value\\\":15}■\\n■17: {\\\"id\\\":177,\\\"value\\\":32}■\\n■18: {\\\"id\\\":178,\\\"value\\\":49}■\\n■19: {\\\"id\\\":179,\\\"value\\\":66}■\\n■20: {\\\"id\\\":180,\\\"value\\\":83}■\\n■21: {\\\"id\\\":300,\\\"value\\\":123}■\\n■22: {\\\"id\\\":301,\\\"value\\\":140}■\\n■23: {\\\"id\\\":302,\\\"value\\\":157}■\\n■24: {\\\"id\\\":303,\\\"value\\\":174}■\\n■25: {\\\"id\\\":352,\\\"value\\\":7}■\\n■26: {\\\"id\\\":353,\\\"value\\\":24}■\\n■27: {\\\"id\\\":354,\\\"value\\\":41}■\\n■28: {\\\"id\\\":355,\\\"value\\\":58}■\\n■29: {\\\"id\\\":356,\\\"value\\\":75}■\\n■30: {\\\"id\\\":357,\\\"value\\\":92}■\\n■31: {\\\"id\\\":358,\\\"value\\\":109}■\\n■32: {\\\"id\\\":359,\\\"value\\\":126}■\\n■33: {\\\"id\\\":360,\\\"value\\\":143}■\\n■34: {\\\"id\\\":361,\\\"value\\\":160}■\\n■35: {\\\"id\\\":362,\\\"value\\\":177}■\\n■36: {\\\"id\\\":363,\\\"value\\\":194}■\\n■37: {\\\"id\\\":364,\\\"value\\\":211}■\\n■38: {\\\"id\\\":365,\\\"value\\\":228}■\\n■39: {\\\"id\\\":411,\\\"value\\\":10}■\\n■40: {\\\"id\\\":412,\\\"value\\\":27}■\\n■41: {\\\"id\\\":470,\\\"value\\\":13}■\\n■42: {\\\"id\\\":471,\\\"value\\\":30}■\\n■43: {\\\"id\\\":472,\\\"value\\\":47}■\\n■44: {\\\"id\\\":473,\\\"value\\\":64}■\\n\\nOrder them from first to last.\\nOutput format: ■index■ (one per line)\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are breaking a tie between items with identical total scores.\\n\\nInstructions: from smallest value to largest value\\n\\nCriteria:\\n- value: smallest;small;medium;large;largest\\n- range: low;below_average;average;above_average;high\\n- scale: minimum;low;moderate;high;maximum\\n\\nOrder these 45 items from FIRST to LAST based on the instructions.\\nEarlier labels in each criterion should come FIRST.\\n\\nOutput format:\\n■original_index■\\n■original_index■\\n■END■\\n\\nOutput the indices in the order they should appear (first item at top).\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwbaC4N814f2t8DSTKU6MqTImKx","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■25■ \n■11■ \n■39■ \n■12■ \n■41■ \n■16■ \n■0■ \n■26■ \n■13■ \n■40■ \n■42■ \n■1■ \n■354■ \n■17■ \n■27■ \n■119■ \n■14■ \n■47■ \n■178■ \n■28■ \n■120■ \n■19■ \n■179■ \n■29■ \n■3■ \n■20■ \n■30■ \n■4■ \n■183■ \n■357■ \n■5■ \n■358■ \n■432■ \n■7■ \n■359■ \n■8■ \n■34■ \n■361■ \n■35■ \n■362■ \n■100■ \n■363■ \n■140■ \n■364■ \n■40■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":791,"inputCost":0.0019775,"outputTokens":183,"outputCost":0.0018300000000000002},"botpress":{"cost":0.0038075}},"meta":{"cached":true}}}
882
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-16","key":"bb7e4886","input":"[{\"input\":{\"messages\":[{\"content\":\"Items with identical scores (need tie-breaking):\\n■0: {\\\"id\\\":43,\\\"value\\\":754}■\\n■1: {\\\"id\\\":44,\\\"value\\\":771}■\\n■2: {\\\"id\\\":45,\\\"value\\\":788}■\\n■3: {\\\"id\\\":46,\\\"value\\\":805}■\\n■4: {\\\"id\\\":47,\\\"value\\\":822}■\\n■5: {\\\"id\\\":48,\\\"value\\\":839}■\\n■6: {\\\"id\\\":49,\\\"value\\\":856}■\\n■7: {\\\"id\\\":57,\\\"value\\\":992}■\\n■8: {\\\"id\\\":112,\\\"value\\\":927}■\\n■9: {\\\"id\\\":113,\\\"value\\\":944}■\\n■10: {\\\"id\\\":114,\\\"value\\\":961}■\\n■11: {\\\"id\\\":115,\\\"value\\\":978}■\\n■12: {\\\"id\\\":116,\\\"value\\\":995}■\\n■13: {\\\"id\\\":148,\\\"value\\\":539}■\\n■14: {\\\"id\\\":149,\\\"value\\\":556}■\\n■15: {\\\"id\\\":169,\\\"value\\\":896}■\\n■16: {\\\"id\\\":170,\\\"value\\\":913}■\\n■17: {\\\"id\\\":171,\\\"value\\\":930}■\\n■18: {\\\"id\\\":172,\\\"value\\\":947}■\\n■19: {\\\"id\\\":173,\\\"value\\\":964}■\\n■20: {\\\"id\\\":174,\\\"value\\\":981}■\\n■21: {\\\"id\\\":175,\\\"value\\\":998}■\\n■22: {\\\"id\\\":327,\\\"value\\\":582}■\\n■23: {\\\"id\\\":328,\\\"value\\\":599}■\\n■24: {\\\"id\\\":329,\\\"value\\\":616}■\\n■25: {\\\"id\\\":330,\\\"value\\\":633}■\\n■26: {\\\"id\\\":331,\\\"value\\\":650}■\\n■27: {\\\"id\\\":332,\\\"value\\\":667}■\\n■28: {\\\"id\\\":333,\\\"value\\\":684}■\\n■29: {\\\"id\\\":334,\\\"value\\\":701}■\\n■30: {\\\"id\\\":335,\\\"value\\\":718}■\\n■31: {\\\"id\\\":336,\\\"value\\\":735}■\\n■32: {\\\"id\\\":337,\\\"value\\\":752}■\\n■33: {\\\"id\\\":338,\\\"value\\\":769}■\\n■34: {\\\"id\\\":339,\\\"value\\\":786}■\\n■35: {\\\"id\\\":340,\\\"value\\\":803}■\\n■36: {\\\"id\\\":341,\\\"value\\\":820}■\\n■37: {\\\"id\\\":342,\\\"value\\\":837}■\\n■38: {\\\"id\\\":343,\\\"value\\\":854}■\\n■39: {\\\"id\\\":344,\\\"value\\\":871}■\\n■40: {\\\"id\\\":345,\\\"value\\\":888}■\\n■41: {\\\"id\\\":346,\\\"value\\\":905}■\\n■42: {\\\"id\\\":347,\\\"value\\\":922}■\\n■43: {\\\"id\\\":348,\\\"value\\\":939}■\\n■44: {\\\"id\\\":349,\\\"value\\\":956}■\\n■45: {\\\"id\\\":350,\\\"value\\\":973}■\\n■46: {\\\"id\\\":351,\\\"value\\\":990}■\\n■47: {\\\"id\\\":396,\\\"value\\\":755}■\\n■48: {\\\"id\\\":397,\\\"value\\\":772}■\\n■49: {\\\"id\\\":398,\\\"value\\\":789}■\\n■50: {\\\"id\\\":399,\\\"value\\\":806}■\\n■51: {\\\"id\\\":464,\\\"value\\\":911}■\\n■52: {\\\"id\\\":465,\\\"value\\\":928}■\\n■53: {\\\"id\\\":466,\\\"value\\\":945}■\\n■54: {\\\"id\\\":467,\\\"value\\\":962}■\\n■55: {\\\"id\\\":468,\\\"value\\\":979}■\\n■56: {\\\"id\\\":469,\\\"value\\\":996}■\\n\\nOrder them from first to last.\\nOutput format: ■index■ (one per line)\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are breaking a tie between items with identical total scores.\\n\\nInstructions: from smallest value to largest value\\n\\nCriteria:\\n- value: smallest;small;medium;large;largest\\n- range: low;below_average;average;above_average;high\\n- scale: minimum;low;moderate;high;maximum\\n\\nOrder these 57 items from FIRST to LAST based on the instructions.\\nEarlier labels in each criterion should come FIRST.\\n\\nOutput format:\\n■original_index■\\n■original_index■\\n■END■\\n\\nOutput the indices in the order they should appear (first item at top).\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwb1nGlxAQMxKsQOIVQoQ703Kk1","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■13■ \n■14■ \n■22■ \n■23■ \n■24■ \n■25■ \n■26■ \n■27■ \n■28■ \n■29■ \n■30■ \n■31■ \n■0■ \n■32■ \n■47■ \n■1■ \n■33■ \n■48■ \n■2■ \n■34■ \n■49■ \n■3■ \n■35■ \n■50■ \n■4■ \n■36■ \n■37■ \n■5■ \n■38■ \n■6■ \n■39■ \n■40■ \n■15■ \n■51■ \n■41■ \n■16■ \n■46■ \n■42■ \n■8■ \n■52■ \n■17■ \n■43■ \n■9■ \n■53■ \n■18■ \n■44■ \n■10■ \n■54■ \n■19■ \n■45■ \n■11■ \n■55■ \n■20■ \n■56■ \n■12■ \n■21■ \n■7■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":959,"inputCost":0.0023975000000000003,"outputTokens":231,"outputCost":0.00231},"botpress":{"cost":0.0047075}},"meta":{"cached":true}}}
883
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-22","key":"fcdb55be","input":"[{\"input\":{\"messages\":[{\"content\":\"Items with identical scores (need tie-breaking):\\n■0: {\\\"id\\\":71,\\\"value\\\":230}■\\n■1: {\\\"id\\\":105,\\\"value\\\":808}■\\n■2: {\\\"id\\\":106,\\\"value\\\":825}■\\n■3: {\\\"id\\\":137,\\\"value\\\":352}■\\n■4: {\\\"id\\\":138,\\\"value\\\":369}■\\n■5: {\\\"id\\\":139,\\\"value\\\":386}■\\n■6: {\\\"id\\\":140,\\\"value\\\":403}■\\n■7: {\\\"id\\\":141,\\\"value\\\":420}■\\n■8: {\\\"id\\\":271,\\\"value\\\":630}■\\n■9: {\\\"id\\\":272,\\\"value\\\":647}■\\n■10: {\\\"id\\\":273,\\\"value\\\":664}■\\n■11: {\\\"id\\\":274,\\\"value\\\":681}■\\n■12: {\\\"id\\\":275,\\\"value\\\":698}■\\n■13: {\\\"id\\\":276,\\\"value\\\":715}■\\n■14: {\\\"id\\\":277,\\\"value\\\":732}■\\n■15: {\\\"id\\\":406,\\\"value\\\":925}■\\n■16: {\\\"id\\\":407,\\\"value\\\":942}■\\n■17: {\\\"id\\\":408,\\\"value\\\":959}■\\n■18: {\\\"id\\\":409,\\\"value\\\":976}■\\n■19: {\\\"id\\\":410,\\\"value\\\":993}■\\n■20: {\\\"id\\\":452,\\\"value\\\":707}■\\n■21: {\\\"id\\\":453,\\\"value\\\":724}■\\n■22: {\\\"id\\\":454,\\\"value\\\":741}■\\n■23: {\\\"id\\\":489,\\\"value\\\":336}■\\n■24: {\\\"id\\\":490,\\\"value\\\":353}■\\n■25: {\\\"id\\\":491,\\\"value\\\":370}■\\n■26: {\\\"id\\\":492,\\\"value\\\":387}■\\n■27: {\\\"id\\\":493,\\\"value\\\":404}■\\n\\nOrder them from first to last.\\nOutput format: ■index■ (one per line)\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are breaking a tie between items with identical total scores.\\n\\nInstructions: from smallest value to largest value\\n\\nCriteria:\\n- value: smallest;small;medium;large;largest\\n- range: low;below_average;average;above_average;high\\n- scale: minimum;low;moderate;high;maximum\\n\\nOrder these 28 items from FIRST to LAST based on the instructions.\\nEarlier labels in each criterion should come FIRST.\\n\\nOutput format:\\n■original_index■\\n■original_index■\\n■END■\\n\\nOutput the indices in the order they should appear (first item at top).\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwc4loUnzy1Cc5NLslVzcfIwO9C","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0■ \n■23■ \n■3■ \n■24■ \n■4■ \n■25■ \n■5■ \n■26■ \n■6■ \n■27■ \n■7■ \n■8■ \n■9■ \n■10■ \n■11■ \n■12■ \n■20■ \n■13■ \n■21■ \n■14■ \n■22■ \n■2■ \n■1■ \n■15■ \n■16■ \n■17■ \n■18■ \n■19■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":553,"inputCost":0.0013825,"outputTokens":115,"outputCost":0.0011500000000000002},"botpress":{"cost":0.0025325}},"meta":{"cached":true}}}
884
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-23","key":"4dad7174","input":"[{\"input\":{\"messages\":[{\"content\":\"Items with identical scores (need tie-breaking):\\n■0: {\\\"id\\\":128,\\\"value\\\":199}■\\n■1: {\\\"id\\\":266,\\\"value\\\":545}■\\n■2: {\\\"id\\\":267,\\\"value\\\":562}■\\n■3: {\\\"id\\\":268,\\\"value\\\":579}■\\n■4: {\\\"id\\\":269,\\\"value\\\":596}■\\n■5: {\\\"id\\\":270,\\\"value\\\":613}■\\n■6: {\\\"id\\\":310,\\\"value\\\":293}■\\n■7: {\\\"id\\\":311,\\\"value\\\":310}■\\n■8: {\\\"id\\\":483,\\\"value\\\":234}■\\n\\nOrder them from first to last.\\nOutput format: ■index■ (one per line)\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are breaking a tie between items with identical total scores.\\n\\nInstructions: from smallest value to largest value\\n\\nCriteria:\\n- value: smallest;small;medium;large;largest\\n- range: low;below_average;average;above_average;high\\n- scale: minimum;low;moderate;high;maximum\\n\\nOrder these 9 items from FIRST to LAST based on the instructions.\\nEarlier labels in each criterion should come FIRST.\\n\\nOutput format:\\n■original_index■\\n■original_index■\\n■END■\\n\\nOutput the indices in the order they should appear (first item at top).\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwc0yRAehXTh7mrkdm2GS21JStH","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0■ \n■8■ \n■6■ \n■7■ \n■1■ \n■2■ \n■3■ \n■4■ \n■5■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":287,"inputCost":0.0007175,"outputTokens":39,"outputCost":0.00039000000000000005},"botpress":{"cost":0.0011075}},"meta":{"cached":true}}}
885
+ {"test":"zai.sort > edge cases > should handle large array (500 items)-24","key":"e0d82af7","input":"[{\"input\":{\"messages\":[{\"content\":\"Items with identical scores (need tie-breaking):\\n■0: {\\\"id\\\":167,\\\"value\\\":862}■\\n■1: {\\\"id\\\":168,\\\"value\\\":879}■\\n■2: {\\\"id\\\":322,\\\"value\\\":497}■\\n■3: {\\\"id\\\":323,\\\"value\\\":514}■\\n■4: {\\\"id\\\":324,\\\"value\\\":531}■\\n■5: {\\\"id\\\":325,\\\"value\\\":548}■\\n■6: {\\\"id\\\":326,\\\"value\\\":565}■\\n\\nOrder them from first to last.\\nOutput format: ■index■ (one per line)\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are breaking a tie between items with identical total scores.\\n\\nInstructions: from smallest value to largest value\\n\\nCriteria:\\n- value: smallest;small;medium;large;largest\\n- range: low;below_average;average;above_average;high\\n- scale: minimum;low;moderate;high;maximum\\n\\nOrder these 7 items from FIRST to LAST based on the instructions.\\nEarlier labels in each criterion should come FIRST.\\n\\nOutput format:\\n■original_index■\\n■original_index■\\n■END■\\n\\nOutput the indices in the order they should appear (first item at top).\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwczT7DTN3XGM7XuY431jQ4ahzD","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"To break the tie and order the items based on \"value\" (from smallest to largest):\n\n■2■ \n■3■ \n■4■ \n■5■ \n■6■ \n■0■ \n■1■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":259,"inputCost":0.0006475000000000001,"outputTokens":50,"outputCost":0.0005},"botpress":{"cost":0.0011475}},"meta":{"cached":true}}}
886
+ {"test":"zai.sort > edge cases > should handle identical items-1","key":"f9e66ece","input":"[{\"input\":{\"messages\":[{\"content\":\"Analyze this sorting instruction: \\\"alphabetically by name\\\"\\n\\nSample items to be sorted:\\n■0: {\\\"name\\\":\\\"Apple\\\",\\\"color\\\":\\\"red\\\"}\\n■1: {\\\"name\\\":\\\"Apple\\\",\\\"color\\\":\\\"red\\\"}\\n■2: {\\\"name\\\":\\\"Apple\\\",\\\"color\\\":\\\"red\\\"}\\n\\nCreate 1-3 sorting criteria with ordered label arrays (3-10 labels each).\\n\\n**CRITICAL RULES**:\\n1. Labels are single words, lowercase, no spaces, use underscores\\n2. Labels are ordered from FIRST to LAST in sorted result\\n3. If instruction says \\\"from X to Y\\\": first label represents X, last label represents Y\\n4. If instruction says \\\"prioritize\\\" or \\\"highest/lowest priority\\\":\\n - First label = HIGHEST priority (top of todo list)\\n - Last label = LOWEST priority (bottom of todo list)\\n\\nExamples:\\n\\n\\\"from slowest to fastest\\\" → first=slowest, last=fastest\\n■speed■\\nvery_slow;slow;medium;fast;very_fast\\n■END■\\n\\n\\\"from most dangerous to least dangerous\\\" → first=most dangerous, last=least dangerous\\n■danger■\\nextremely_dangerous;very_dangerous;dangerous;moderate;slightly_dangerous;harmless\\n■END■\\n\\n\\\"from least urgent (spam) to most urgent (bills)\\\" → first=spam, last=bills\\n■urgency■\\nspam;promotional;normal;important;urgent;critical\\n■END■\\n\\n\\\"prioritize: highest priority=open old tickets; lowest priority=closed\\\" → first=high priority, last=low priority\\n■status■\\nopen_old;open_recent;closed\\n■age■\\noldest;old;recent;new\\n■END■\\n\\nOutput format:\\n■criterion_name■\\nlabel1;label2;label3;label4\\n■END■\\n\\nUse 3-10 labels per criterion. Labels should be intuitive and match the domain.\\nKeep criterion names short (1-2 words, lowercase, underscores).\\n\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating sorting criteria with ordered label arrays.\\n\\nCRITICAL: Output ordered labels from FIRST to LAST position in sorted result.\\n- Labels are single words, lowercase, underscores only\\n- 3-10 labels per criterion\\n- Order matters: first label = appears first, last label = appears last\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYweNy3VA35RtlkTRSKYVYZrHXgW","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■name_order■ \napple;banana;cherry;date;elderberry;fig;grape;honeydew;kiwi;lemon \n■END■ ","index":0,"stopReason":"stop"}],"usage":{"inputTokens":493,"inputCost":0.0012325,"outputTokens":36,"outputCost":0.00036},"botpress":{"cost":0.0015925000000000002}},"meta":{"cached":true}}}
887
+ {"test":"zai.sort > edge cases > should handle identical items-2","key":"815e42f6","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■2):\\n■0: {\\\"name\\\":\\\"Apple\\\",\\\"color\\\":\\\"red\\\"}■\\n■1: {\\\"name\\\":\\\"Apple\\\",\\\"color\\\":\\\"red\\\"}■\\n■2: {\\\"name\\\":\\\"Apple\\\",\\\"color\\\":\\\"red\\\"}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**name_order**: apple;banana;cherry;date;elderberry;fig;grape;honeydew;kiwi;lemon\\n\\nInstructions: \\\"alphabetically by name\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■2)\\n- Use exact criterion names: name_order\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwgHvrxrss5bny5Q8YHmXwcourL","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:name_order=apple■ \n■1:name_order=apple■ \n■2:name_order=apple■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":284,"inputCost":0.00071,"outputTokens":27,"outputCost":0.00027},"botpress":{"cost":0.00098}},"meta":{"cached":true}}}
888
+ {"test":"zai.sort > edge cases > should handle identical items-3","key":"59e5ce49","input":"[{\"input\":{\"messages\":[{\"content\":\"Items with identical scores (need tie-breaking):\\n■0: {\\\"name\\\":\\\"Apple\\\",\\\"color\\\":\\\"red\\\"}■\\n■1: {\\\"name\\\":\\\"Apple\\\",\\\"color\\\":\\\"red\\\"}■\\n■2: {\\\"name\\\":\\\"Apple\\\",\\\"color\\\":\\\"red\\\"}■\\n\\nOrder them from first to last.\\nOutput format: ■index■ (one per line)\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are breaking a tie between items with identical total scores.\\n\\nInstructions: alphabetically by name\\n\\nCriteria:\\n- name_order: apple;banana;cherry;date;elderberry;fig;grape;honeydew;kiwi;lemon\\n\\nOrder these 3 items from FIRST to LAST based on the instructions.\\nEarlier labels in each criterion should come FIRST.\\n\\nOutput format:\\n■original_index■\\n■original_index■\\n■END■\\n\\nOutput the indices in the order they should appear (first item at top).\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwgDuXzRhT83AbKLu0mi8FMYs8v","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0■ \n■1■ \n■2■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":190,"inputCost":0.00047500000000000005,"outputTokens":15,"outputCost":0.00015000000000000001},"botpress":{"cost":0.0006250000000000001}},"meta":{"cached":true}}}
889
+ {"test":"zai.sort > token truncation > should handle items with very large content by truncating-1","key":"96ad1b6d","input":"[{\"input\":{\"messages\":[{\"content\":\"Analyze this sorting instruction: \\\"from lowest to highest priority\\\"\\n\\nSample items to be sorted:\\n■0: {\\\"id\\\":\\\"doc-1\\\",\\\"priority\\\":\\\"low\\\",\\\"content\\\":\\\"This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. Priority: LOW. This should be sorted first.\\\"}\\n■1: {\\\"id\\\":\\\"doc-2\\\",\\\"priority\\\":\\\"high\\\",\\\"content\\\":\\\"This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. Priority: HIGH. This should be sorted last.\\\"}\\n■2: {\\\"id\\\":\\\"doc-3\\\",\\\"priority\\\":\\\"medium\\\",\\\"content\\\":\\\"This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. Priority: MEDIUM. This should be in the middle.\\\"}\\n\\nCreate 1-3 sorting criteria with ordered label arrays (3-10 labels each).\\n\\n**CRITICAL RULES**:\\n1. Labels are single words, lowercase, no spaces, use underscores\\n2. Labels are ordered from FIRST to LAST in sorted result\\n3. If instruction says \\\"from X to Y\\\": first label represents X, last label represents Y\\n4. If instruction says \\\"prioritize\\\" or \\\"highest/lowest priority\\\":\\n - First label = HIGHEST priority (top of todo list)\\n - Last label = LOWEST priority (bottom of todo list)\\n\\nExamples:\\n\\n\\\"from slowest to fastest\\\" → first=slowest, last=fastest\\n■speed■\\nvery_slow;slow;medium;fast;very_fast\\n■END■\\n\\n\\\"from most dangerous to least dangerous\\\" → first=most dangerous, last=least dangerous\\n■danger■\\nextremely_dangerous;very_dangerous;dangerous;moderate;slightly_dangerous;harmless\\n■END■\\n\\n\\\"from least urgent (spam) to most urgent (bills)\\\" → first=spam, last=bills\\n■urgency■\\nspam;promotional;normal;important;urgent;critical\\n■END■\\n\\n\\\"prioritize: highest priority=open old tickets; lowest priority=closed\\\" → first=high priority, last=low priority\\n■status■\\nopen_old;open_recent;closed\\n■age■\\noldest;old;recent;new\\n■END■\\n\\nOutput format:\\n■criterion_name■\\nlabel1;label2;label3;label4\\n■END■\\n\\nUse 3-10 labels per criterion. Labels should be intuitive and match the domain.\\nKeep criterion names short (1-2 words, lowercase, underscores).\\n\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating sorting criteria with ordered label arrays.\\n\\nCRITICAL: Output ordered labels from FIRST to LAST position in sorted result.\\n- Labels are single words, lowercase, underscores only\\n- 3-10 labels per criterion\\n- Order matters: first label = appears first, last label = appears last\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwizP1YoTgyAuniwu7u4OlHumQS","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■priority■ \nlow;medium;high \n■END■ ","index":0,"stopReason":"stop"}],"usage":{"inputTokens":11041,"inputCost":0.027602500000000002,"outputTokens":14,"outputCost":0.00014000000000000001},"botpress":{"cost":0.027742500000000003}},"meta":{"cached":true}}}
890
+ {"test":"zai.sort > token truncation > should handle items with very large content by truncating-2","key":"f8ba9b54","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■2):\\n■0: {\\\"id\\\":\\\"doc-1\\\",\\\"priority\\\":\\\"low\\\",\\\"content\\\":\\\"This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a low priority document. This is a■\\n■1: {\\\"id\\\":\\\"doc-2\\\",\\\"priority\\\":\\\"high\\\",\\\"content\\\":\\\"This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a high priority document. This is a■\\n■2: {\\\"id\\\":\\\"doc-3\\\",\\\"priority\\\":\\\"medium\\\",\\\"content\\\":\\\"This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a medium priority document. This is a■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**priority**: low;medium;high\\n\\nInstructions: \\\"from lowest to highest priority\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■2)\\n- Use exact criterion names: priority\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwjfnWYxeB8LUosA36MtdRUnAoX","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:priority=low■ \n■1:priority=high■ \n■2:priority=medium■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":535,"inputCost":0.0013375000000000001,"outputTokens":27,"outputCost":0.00027},"botpress":{"cost":0.0016075000000000002}},"meta":{"cached":true}}}
891
+ {"test":"zai.sort > token truncation > should sort articles by length with token truncation-1","key":"1c62abd4","input":"[{\"input\":{\"messages\":[{\"content\":\"Analyze this sorting instruction: \\\"from shortest to longest based on word count\\\"\\n\\nSample items to be sorted:\\n■0: {\\\"title\\\":\\\"Short Article\\\",\\\"content\\\":\\\"This is a short article with minimal content.\\\",\\\"wordCount\\\":8}\\n■1: {\\\"title\\\":\\\"Very Long Article\\\",\\\"content\\\":\\\"This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. It has tons of text and goes on forever.\\\",\\\"wordCount\\\":10000}\\n■2: {\\\"title\\\":\\\"Medium Article\\\",\\\"content\\\":\\\"This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. It has a moderate amount of text.\\\",\\\"wordCount\\\":200}\\n■3: {\\\"title\\\":\\\"Tiny Article\\\",\\\"content\\\":\\\"Tiny.\\\",\\\"wordCount\\\":1}\\n\\nCreate 1-3 sorting criteria with ordered label arrays (3-10 labels each).\\n\\n**CRITICAL RULES**:\\n1. Labels are single words, lowercase, no spaces, use underscores\\n2. Labels are ordered from FIRST to LAST in sorted result\\n3. If instruction says \\\"from X to Y\\\": first label represents X, last label represents Y\\n4. If instruction says \\\"prioritize\\\" or \\\"highest/lowest priority\\\":\\n - First label = HIGHEST priority (top of todo list)\\n - Last label = LOWEST priority (bottom of todo list)\\n\\nExamples:\\n\\n\\\"from slowest to fastest\\\" → first=slowest, last=fastest\\n■speed■\\nvery_slow;slow;medium;fast;very_fast\\n■END■\\n\\n\\\"from most dangerous to least dangerous\\\" → first=most dangerous, last=least dangerous\\n■danger■\\nextremely_dangerous;very_dangerous;dangerous;moderate;slightly_dangerous;harmless\\n■END■\\n\\n\\\"from least urgent (spam) to most urgent (bills)\\\" → first=spam, last=bills\\n■urgency■\\nspam;promotional;normal;important;urgent;critical\\n■END■\\n\\n\\\"prioritize: highest priority=open old tickets; lowest priority=closed\\\" → first=high priority, last=low priority\\n■status■\\nopen_old;open_recent;closed\\n■age■\\noldest;old;recent;new\\n■END■\\n\\nOutput format:\\n■criterion_name■\\nlabel1;label2;label3;label4\\n■END■\\n\\nUse 3-10 labels per criterion. Labels should be intuitive and match the domain.\\nKeep criterion names short (1-2 words, lowercase, underscores).\\n\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating sorting criteria with ordered label arrays.\\n\\nCRITICAL: Output ordered labels from FIRST to LAST position in sorted result.\\n- Labels are single words, lowercase, underscores only\\n- 3-10 labels per criterion\\n- Order matters: first label = appears first, last label = appears last\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwkNqjYhOMQKCNWFHZ8AB6LHFTq","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■word_count■ \ntiny;short;medium;long;very_long;extremely_long \n■END■ \n\n■article_length■ \nminimal;brief;moderate;extensive;massive \n■END■ \n\n■size■ \nsmallest;small;medium;large;largest \n■END■ ","index":0,"stopReason":"stop"}],"usage":{"inputTokens":7906,"inputCost":0.019765,"outputTokens":65,"outputCost":0.0006500000000000001},"botpress":{"cost":0.020415000000000003}},"meta":{"cached":true}}}
892
+ {"test":"zai.sort > token truncation > should sort articles by length with token truncation-2","key":"45f043c6","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■3):\\n■0: {\\\"title\\\":\\\"Short Article\\\",\\\"content\\\":\\\"This is a short article with minimal content.\\\",\\\"wordCount\\\":8}■\\n■1: {\\\"title\\\":\\\"Very Long Article\\\",\\\"content\\\":\\\"This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This is a very long article. This■\\n■2: {\\\"title\\\":\\\"Medium Article\\\",\\\"content\\\":\\\"This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is a medium-sized article. This is■\\n■3: {\\\"title\\\":\\\"Tiny Article\\\",\\\"content\\\":\\\"Tiny.\\\",\\\"wordCount\\\":1}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**word_count**: tiny;short;medium;long;very_long;extremely_long\\n**article_length**: minimal;brief;moderate;extensive;massive\\n**size**: smallest;small;medium;large;largest\\n\\nInstructions: \\\"from shortest to longest based on word count\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■3)\\n- Use exact criterion names: word_count, article_length, size\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwnQFzMv9c71irqBInqSJlI3iIW","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:word_count=short;article_length=brief■ \n■1:word_count=very_long;article_length=massive■ \n■2:word_count=medium;article_length=moderate■ \n■3:word_count=tiny;article_length=minimal■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":628,"inputCost":0.0015700000000000002,"outputTokens":62,"outputCost":0.00062},"botpress":{"cost":0.00219}},"meta":{"cached":true}}}
893
+ {"test":"zai.sort > token truncation > should handle mixed content sizes with consistent truncation-1","key":"b4af509d","input":"[{\"input\":{\"messages\":[{\"content\":\"Analyze this sorting instruction: \\\"from smallest to largest size\\\"\\n\\nSample items to be sorted:\\n■0: {\\\"id\\\":1,\\\"description\\\":\\\"Small\\\",\\\"size\\\":\\\"small\\\"}\\n■1: {\\\"id\\\":2,\\\"description\\\":\\\"XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX\\\",\\\"size\\\":\\\"huge\\\"}\\n■2: {\\\"id\\\":3,\\\"description\\\":\\\"Medium length description here\\\",\\\"size\\\":\\\"medium\\\"}\\n■3: {\\\"id\\\":4,\\\"description\\\":\\\"YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY\\\",\\\"size\\\":\\\"large\\\"}\\n■4: {\\\"id\\\":5,\\\"description\\\":\\\"Tiny\\\",\\\"size\\\":\\\"tiny\\\"}\\n\\nCreate 1-3 sorting criteria with ordered label arrays (3-10 labels each).\\n\\n**CRITICAL RULES**:\\n1. Labels are single words, lowercase, no spaces, use underscores\\n2. Labels are ordered from FIRST to LAST in sorted result\\n3. If instruction says \\\"from X to Y\\\": first label represents X, last label represents Y\\n4. If instruction says \\\"prioritize\\\" or \\\"highest/lowest priority\\\":\\n - First label = HIGHEST priority (top of todo list)\\n - Last label = LOWEST priority (bottom of todo list)\\n\\nExamples:\\n\\n\\\"from slowest to fastest\\\" → first=slowest, last=fastest\\n■speed■\\nvery_slow;slow;medium;fast;very_fast\\n■END■\\n\\n\\\"from most dangerous to least dangerous\\\" → first=most dangerous, last=least dangerous\\n■danger■\\nextremely_dangerous;very_dangerous;dangerous;moderate;slightly_dangerous;harmless\\n■END■\\n\\n\\\"from least urgent (spam) to most urgent (bills)\\\" → first=spam, last=bills\\n■urgency■\\nspam;promotional;normal;important;urgent;critical\\n■END■\\n\\n\\\"prioritize: highest priority=open old tickets; lowest priority=closed\\\" → first=high priority, last=low priority\\n■status■\\nopen_old;open_recent;closed\\n■age■\\noldest;old;recent;new\\n■END■\\n\\nOutput format:\\n■criterion_name■\\nlabel1;label2;label3;label4\\n■END■\\n\\nUse 3-10 labels per criterion. Labels should be intuitive and match the domain.\\nKeep criterion names short (1-2 words, lowercase, underscores).\\n\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating sorting criteria with ordered label arrays.\\n\\nCRITICAL: Output ordered labels from FIRST to LAST position in sorted result.\\n- Labels are single words, lowercase, underscores only\\n- 3-10 labels per criterion\\n- Order matters: first label = appears first, last label = appears last\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwo9j9LCO6eLZ4QRZilTCl4qazn","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■size■ \ntiny;small;medium;large;huge \n■END■ ","index":0,"stopReason":"stop"}],"usage":{"inputTokens":2415,"inputCost":0.0060375,"outputTokens":19,"outputCost":0.00019},"botpress":{"cost":0.0062275}},"meta":{"cached":true}}}
894
+ {"test":"zai.sort > token truncation > should handle mixed content sizes with consistent truncation-2","key":"e71ed211","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■4):\\n■0: {\\\"id\\\":1,\\\"description\\\":\\\"Small\\\",\\\"size\\\":\\\"small\\\"}■\\n■1: {\\\"id\\\":2,\\\"description\\\":\\\"XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX■\\n■2: {\\\"id\\\":3,\\\"description\\\":\\\"Medium length description here\\\",\\\"size\\\":\\\"medium\\\"}■\\n■3: {\\\"id\\\":4,\\\"description\\\":\\\"YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY■\\n■4: {\\\"id\\\":5,\\\"description\\\":\\\"Tiny\\\",\\\"size\\\":\\\"tiny\\\"}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**size**: tiny;small;medium;large;huge\\n\\nInstructions: \\\"from smallest to largest size\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■4)\\n- Use exact criterion names: size\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwolO5xWyzz7suhjjmgtfaUdBxY","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:size=small■ \n■1:size=huge■ \n■2:size=medium■ \n■3:size=large■ \n■4:size=tiny■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":371,"inputCost":0.0009275,"outputTokens":39,"outputCost":0.00039000000000000005},"botpress":{"cost":0.0013175}},"meta":{"cached":true}}}
895
+ {"test":"zai.sort > token truncation > should sort code snippets by complexity despite truncation-1","key":"b782724a","input":"[{\"input\":{\"messages\":[{\"content\":\"Analyze this sorting instruction: \\\"from least complex to most complex\\\"\\n\\nSample items to be sorted:\\n■0: {\\\"name\\\":\\\"Simple Function\\\",\\\"code\\\":\\\"function add(a, b) { return a + b; }\\\",\\\"complexity\\\":1}\\n■1: {\\\"name\\\":\\\"Complex Algorithm\\\",\\\"code\\\":\\\"\\\\n function complexSort(arr) {\\\\n // This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n\\\\n return arr.sort();\\\\n }\\\\n \\\",\\\"complexity\\\":10}\\n■2: {\\\"name\\\":\\\"Moderate Function\\\",\\\"code\\\":\\\"\\\\n function processData(data) {\\\\n // Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n\\\\n return data.map(x => x * 2);\\\\n }\\\\n \\\",\\\"complexity\\\":5}\\n\\nCreate 1-3 sorting criteria with ordered label arrays (3-10 labels each).\\n\\n**CRITICAL RULES**:\\n1. Labels are single words, lowercase, no spaces, use underscores\\n2. Labels are ordered from FIRST to LAST in sorted result\\n3. If instruction says \\\"from X to Y\\\": first label represents X, last label represents Y\\n4. If instruction says \\\"prioritize\\\" or \\\"highest/lowest priority\\\":\\n - First label = HIGHEST priority (top of todo list)\\n - Last label = LOWEST priority (bottom of todo list)\\n\\nExamples:\\n\\n\\\"from slowest to fastest\\\" → first=slowest, last=fastest\\n■speed■\\nvery_slow;slow;medium;fast;very_fast\\n■END■\\n\\n\\\"from most dangerous to least dangerous\\\" → first=most dangerous, last=least dangerous\\n■danger■\\nextremely_dangerous;very_dangerous;dangerous;moderate;slightly_dangerous;harmless\\n■END■\\n\\n\\\"from least urgent (spam) to most urgent (bills)\\\" → first=spam, last=bills\\n■urgency■\\nspam;promotional;normal;important;urgent;critical\\n■END■\\n\\n\\\"prioritize: highest priority=open old tickets; lowest priority=closed\\\" → first=high priority, last=low priority\\n■status■\\nopen_old;open_recent;closed\\n■age■\\noldest;old;recent;new\\n■END■\\n\\nOutput format:\\n■criterion_name■\\nlabel1;label2;label3;label4\\n■END■\\n\\nUse 3-10 labels per criterion. Labels should be intuitive and match the domain.\\nKeep criterion names short (1-2 words, lowercase, underscores).\\n\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating sorting criteria with ordered label arrays.\\n\\nCRITICAL: Output ordered labels from FIRST to LAST position in sorted result.\\n- Labels are single words, lowercase, underscores only\\n- 3-10 labels per criterion\\n- Order matters: first label = appears first, last label = appears last\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwpaxCpjnLu9LwcjJwYa2tXa4uF","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■complexity■ \nleast_complex;simple;moderate;complex;very_complex;extremely_complex;most_complex \n■END■","index":0,"stopReason":"stop"}],"usage":{"inputTokens":4325,"inputCost":0.010812500000000001,"outputTokens":27,"outputCost":0.00027},"botpress":{"cost":0.0110825}},"meta":{"cached":true}}}
896
+ {"test":"zai.sort > token truncation > should sort code snippets by complexity despite truncation-2","key":"afe97245","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■2):\\n■0: {\\\"name\\\":\\\"Simple Function\\\",\\\"code\\\":\\\"function add(a, b) { return a + b; }\\\",\\\"complexity\\\":1}■\\n■1: {\\\"name\\\":\\\"Complex Algorithm\\\",\\\"code\\\":\\\"\\\\n function complexSort(arr) {\\\\n // This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very complex code\\\\n// This is very■\\n■2: {\\\"name\\\":\\\"Moderate Function\\\",\\\"code\\\":\\\"\\\\n function processData(data) {\\\\n // Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n// Some processing logic\\\\n//■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**complexity**: least_complex;simple;moderate;complex;very_complex;extremely_complex;most_complex\\n\\nInstructions: \\\"from least complex to most complex\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■2)\\n- Use exact criterion names: complexity\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwqILTgHr0lV3aFPZE2WCrT3XuW","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:complexity=least_complex■ \n■1:complexity=most_complex■ \n■2:complexity=moderate■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":478,"inputCost":0.001195,"outputTokens":33,"outputCost":0.00033000000000000005},"botpress":{"cost":0.001525}},"meta":{"cached":true}}}
897
+ {"test":"zai.sort > detailed results > should return detailed scoring information with .result()-1","key":"3b9a74f6","input":"[{\"input\":{\"messages\":[{\"content\":\"Analyze this sorting instruction: \\\"from slowest to fastest\\\"\\n\\nSample items to be sorted:\\n■0: {\\\"name\\\":\\\"Cheetah\\\"}\\n■1: {\\\"name\\\":\\\"Sloth\\\"}\\n■2: {\\\"name\\\":\\\"Horse\\\"}\\n\\nCreate 1-3 sorting criteria with ordered label arrays (3-10 labels each).\\n\\n**CRITICAL RULES**:\\n1. Labels are single words, lowercase, no spaces, use underscores\\n2. Labels are ordered from FIRST to LAST in sorted result\\n3. If instruction says \\\"from X to Y\\\": first label represents X, last label represents Y\\n4. If instruction says \\\"prioritize\\\" or \\\"highest/lowest priority\\\":\\n - First label = HIGHEST priority (top of todo list)\\n - Last label = LOWEST priority (bottom of todo list)\\n\\nExamples:\\n\\n\\\"from slowest to fastest\\\" → first=slowest, last=fastest\\n■speed■\\nvery_slow;slow;medium;fast;very_fast\\n■END■\\n\\n\\\"from most dangerous to least dangerous\\\" → first=most dangerous, last=least dangerous\\n■danger■\\nextremely_dangerous;very_dangerous;dangerous;moderate;slightly_dangerous;harmless\\n■END■\\n\\n\\\"from least urgent (spam) to most urgent (bills)\\\" → first=spam, last=bills\\n■urgency■\\nspam;promotional;normal;important;urgent;critical\\n■END■\\n\\n\\\"prioritize: highest priority=open old tickets; lowest priority=closed\\\" → first=high priority, last=low priority\\n■status■\\nopen_old;open_recent;closed\\n■age■\\noldest;old;recent;new\\n■END■\\n\\nOutput format:\\n■criterion_name■\\nlabel1;label2;label3;label4\\n■END■\\n\\nUse 3-10 labels per criterion. Labels should be intuitive and match the domain.\\nKeep criterion names short (1-2 words, lowercase, underscores).\\n\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating sorting criteria with ordered label arrays.\\n\\nCRITICAL: Output ordered labels from FIRST to LAST position in sorted result.\\n- Labels are single words, lowercase, underscores only\\n- 3-10 labels per criterion\\n- Order matters: first label = appears first, last label = appears last\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwrAjRFtoWKCBI867dHYqoHOeuM","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■speed■ \nslowest;slower;average;faster;fastest \n■END■ \n\n■movement■ \nimmobile;very_slow;slow;moderate;fast;very_fast \n■END■ \n\n■pace■ \nsnail_pace;crawling;walking;jogging;running;sprinting \n■END■ ","index":0,"stopReason":"stop"}],"usage":{"inputTokens":485,"inputCost":0.0012125,"outputTokens":72,"outputCost":0.00072},"botpress":{"cost":0.0019325000000000002}},"meta":{"cached":true}}}
898
+ {"test":"zai.sort > detailed results > should return detailed scoring information with .result()-2","key":"3c527d6d","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■2):\\n■0: {\\\"name\\\":\\\"Cheetah\\\"}■\\n■1: {\\\"name\\\":\\\"Sloth\\\"}■\\n■2: {\\\"name\\\":\\\"Horse\\\"}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:default\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**speed**: slowest;slower;average;faster;fastest\\n**movement**: immobile;very_slow;slow;moderate;fast;very_fast\\n**pace**: snail_pace;crawling;walking;jogging;running;sprinting\\n\\nInstructions: \\\"from slowest to fastest\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■2)\\n- Use exact criterion names: speed, movement, pace\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXYwt23ZyYaqfvTwYrSfBwHdxdKFc","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:speed=fastest;movement=very_fast;pace=sprinting■ \n■1:speed=slowest;movement=very_slow;pace=snail_pace■ \n■2:speed=faster;movement=fast;pace=running■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":306,"inputCost":0.0007650000000000001,"outputTokens":59,"outputCost":0.00059},"botpress":{"cost":0.001355}},"meta":{"cached":true}}}
899
+ {"test":"zai.learn.sort > learns custom vocabulary that maps nonsense phrases to urgency levels-1","key":"531c413e","input":"[{\"input\":{\"messages\":[{\"content\":\"Analyze this sorting instruction: \\\"sort emails by priority\\\"\\n\\nSample items to be sorted:\\n■0: {\\\"subject\\\":\\\"Offer - frosting teeth\\\",\\\"from\\\":\\\"deals@shop.com\\\"}\\n■1: {\\\"subject\\\":\\\"Update - velvet rain\\\",\\\"from\\\":\\\"team@work.com\\\"}\\n■2: {\\\"subject\\\":\\\"CRITICAL - blazing sun\\\",\\\"from\\\":\\\"security@company.com\\\"}\\n\\nCreate 1-3 sorting criteria with ordered label arrays (3-10 labels each).\\n\\n**CRITICAL RULES**:\\n1. Labels are single words, lowercase, no spaces, use underscores\\n2. Labels are ordered from FIRST to LAST in sorted result\\n3. If instruction says \\\"from X to Y\\\": first label represents X, last label represents Y\\n4. If instruction says \\\"prioritize\\\" or \\\"highest/lowest priority\\\":\\n - First label = HIGHEST priority (top of todo list)\\n - Last label = LOWEST priority (bottom of todo list)\\n\\nExamples:\\n\\n\\\"from slowest to fastest\\\" → first=slowest, last=fastest\\n■speed■\\nvery_slow;slow;medium;fast;very_fast\\n■END■\\n\\n\\\"from most dangerous to least dangerous\\\" → first=most dangerous, last=least dangerous\\n■danger■\\nextremely_dangerous;very_dangerous;dangerous;moderate;slightly_dangerous;harmless\\n■END■\\n\\n\\\"from least urgent (spam) to most urgent (bills)\\\" → first=spam, last=bills\\n■urgency■\\nspam;promotional;normal;important;urgent;critical\\n■END■\\n\\n\\\"prioritize: highest priority=open old tickets; lowest priority=closed\\\" → first=high priority, last=low priority\\n■status■\\nopen_old;open_recent;closed\\n■age■\\noldest;old;recent;new\\n■END■\\n\\nOutput format:\\n■criterion_name■\\nlabel1;label2;label3;label4\\n■END■\\n\\nUse 3-10 labels per criterion. Labels should be intuitive and match the domain.\\nKeep criterion names short (1-2 words, lowercase, underscores).\\n\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:zai/sort-counterintuitive-test\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating sorting criteria with ordered label arrays.\\n\\nCRITICAL: Output ordered labels from FIRST to LAST position in sorted result.\\n- Labels are single words, lowercase, underscores only\\n- 3-10 labels per criterion\\n- Order matters: first label = appears first, last label = appears last\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXZ3zcYsM7zTpcQjEpRwDTEnWwdkZ","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■priority■ \ncritical;important;normal;promotional;spam \n■END■ \n\n■urgency■ \ncritical;urgent;important;moderate;low;spam \n■END■ \n\n■email_type■ \nalerts;work;personal;updates;offers;ads \n■END■ ","index":0,"stopReason":"stop"}],"usage":{"inputTokens":513,"inputCost":0.0012825,"outputTokens":62,"outputCost":0.00062},"botpress":{"cost":0.0019025000000000001}},"meta":{"cached":true}}}
900
+ {"test":"zai.learn.sort > learns custom vocabulary that maps nonsense phrases to urgency levels-2","key":"9af9e94e","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■2):\\n■0: {\\\"subject\\\":\\\"Offer - frosting teeth\\\",\\\"from\\\":\\\"deals@shop.com\\\"}■\\n■1: {\\\"subject\\\":\\\"Update - velvet rain\\\",\\\"from\\\":\\\"team@work.com\\\"}■\\n■2: {\\\"subject\\\":\\\"CRITICAL - blazing sun\\\",\\\"from\\\":\\\"security@company.com\\\"}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:zai/sort-counterintuitive-test\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**priority**: critical;important;normal;promotional;spam\\n**urgency**: critical;urgent;important;moderate;low;spam\\n**email_type**: alerts;work;personal;updates;offers;ads\\n\\nInstructions: \\\"sort emails by priority\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■2)\\n- Use exact criterion names: priority, urgency, email_type\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXZ40mx0woevYFFGcGfN3TcbFxNSV","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:priority=promotional;urgency=low;email_type=offers■ \n■1:priority=normal;urgency=moderate;email_type=updates■ \n■2:priority=critical;urgency=critical;email_type=alerts■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":326,"inputCost":0.0008150000000000001,"outputTokens":59,"outputCost":0.00059},"botpress":{"cost":0.001405}},"meta":{"cached":true}}}
901
+ {"test":"zai.learn.sort > learns custom domain rules and sender patterns for email prioritization-1","key":"a9308019","input":"[{\"input\":{\"messages\":[{\"content\":\"Analyze this sorting instruction: \\\"sort by sender importance\\\"\\n\\nSample items to be sorted:\\n■0: {\\\"subject\\\":\\\"Growth metrics\\\",\\\"from\\\":\\\"analyst@goldmansachs.com\\\"}\\n■1: {\\\"subject\\\":\\\"Partnership\\\",\\\"from\\\":\\\"team@google.com\\\"}\\n■2: {\\\"subject\\\":\\\"Portfolio update\\\",\\\"from\\\":\\\"team@sequoia.vc\\\"}\\n■3: {\\\"subject\\\":\\\"Funding round\\\",\\\"from\\\":\\\"ben@a16z.com\\\"}\\n■4: {\\\"subject\\\":\\\"Collab opportunity\\\",\\\"from\\\":\\\"cto@new-startup.io\\\"}\\n\\nCreate 1-3 sorting criteria with ordered label arrays (3-10 labels each).\\n\\n**CRITICAL RULES**:\\n1. Labels are single words, lowercase, no spaces, use underscores\\n2. Labels are ordered from FIRST to LAST in sorted result\\n3. If instruction says \\\"from X to Y\\\": first label represents X, last label represents Y\\n4. If instruction says \\\"prioritize\\\" or \\\"highest/lowest priority\\\":\\n - First label = HIGHEST priority (top of todo list)\\n - Last label = LOWEST priority (bottom of todo list)\\n\\nExamples:\\n\\n\\\"from slowest to fastest\\\" → first=slowest, last=fastest\\n■speed■\\nvery_slow;slow;medium;fast;very_fast\\n■END■\\n\\n\\\"from most dangerous to least dangerous\\\" → first=most dangerous, last=least dangerous\\n■danger■\\nextremely_dangerous;very_dangerous;dangerous;moderate;slightly_dangerous;harmless\\n■END■\\n\\n\\\"from least urgent (spam) to most urgent (bills)\\\" → first=spam, last=bills\\n■urgency■\\nspam;promotional;normal;important;urgent;critical\\n■END■\\n\\n\\\"prioritize: highest priority=open old tickets; lowest priority=closed\\\" → first=high priority, last=low priority\\n■status■\\nopen_old;open_recent;closed\\n■age■\\noldest;old;recent;new\\n■END■\\n\\nOutput format:\\n■criterion_name■\\nlabel1;label2;label3;label4\\n■END■\\n\\nUse 3-10 labels per criterion. Labels should be intuitive and match the domain.\\nKeep criterion names short (1-2 words, lowercase, underscores).\\n\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:zai/sort-counterintuitive-test\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating sorting criteria with ordered label arrays.\\n\\nCRITICAL: Output ordered labels from FIRST to LAST position in sorted result.\\n- Labels are single words, lowercase, underscores only\\n- 3-10 labels per criterion\\n- Order matters: first label = appears first, last label = appears last\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXZ6arrLf04RNSlF7K1NnmLyBk6RM","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■sender_importance■ \ntop_tier;high_profile;industry_vc;known_startup;other \n■END■ ","index":0,"stopReason":"stop"}],"usage":{"inputTokens":548,"inputCost":0.0013700000000000001,"outputTokens":27,"outputCost":0.00027},"botpress":{"cost":0.0016400000000000002}},"meta":{"cached":true}}}
902
+ {"test":"zai.learn.sort > learns custom domain rules and sender patterns for email prioritization-2","key":"63957950","input":"[{\"input\":{\"messages\":[{\"content\":\"Items to rank (■0 to ■4):\\n■0: {\\\"subject\\\":\\\"Growth metrics\\\",\\\"from\\\":\\\"analyst@goldmansachs.com\\\"}■\\n■1: {\\\"subject\\\":\\\"Partnership\\\",\\\"from\\\":\\\"team@google.com\\\"}■\\n■2: {\\\"subject\\\":\\\"Portfolio update\\\",\\\"from\\\":\\\"team@sequoia.vc\\\"}■\\n■3: {\\\"subject\\\":\\\"Funding round\\\",\\\"from\\\":\\\"ben@a16z.com\\\"}■\\n■4: {\\\"subject\\\":\\\"Collab opportunity\\\",\\\"from\\\":\\\"cto@new-startup.io\\\"}■\\n\\nRank each item using the labeled scales.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:zai/sort-counterintuitive-test\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are ranking items for sorting using ordered label arrays.\\n\\n**sender_importance**: top_tier;high_profile;industry_vc;known_startup;other\\n\\nInstructions: \\\"sort by sender importance\\\"\\n\\nSCORING RULES:\\n- For each item and each criterion, assign ONE label from the ordered list\\n- Labels are ordered: first label = appears FIRST in sorted result, last label = appears LAST\\n- Choose the label that best describes each item\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label■\\n■1:criterion1=label;criterion2=label■\\n■END■\\n\\nIMPORTANT:\\n- Rank every item (■0 to ■4)\\n- Use exact criterion names: sender_importance\\n- Use exact labels from the lists above (lowercase, underscores)\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXZ6bEBMAvPufdrf5gd3kQ513lNuC","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:sender_importance=other■ \n■1:sender_importance=high_profile■ \n■2:sender_importance=industry_vc■ \n■3:sender_importance=industry_vc■ \n■4:sender_importance=known_startup■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":335,"inputCost":0.0008375,"outputTokens":60,"outputCost":0.0006000000000000001},"botpress":{"cost":0.0014375}},"meta":{"cached":true}}}
903
+ {"test":"zai.learn.sort > learns custom domain rules and sender patterns for email prioritization-3","key":"cbc7aab1","input":"[{\"input\":{\"messages\":[{\"content\":\"Items with identical scores (need tie-breaking):\\n■0: {\\\"subject\\\":\\\"Portfolio update\\\",\\\"from\\\":\\\"team@sequoia.vc\\\"}■\\n■1: {\\\"subject\\\":\\\"Funding round\\\",\\\"from\\\":\\\"ben@a16z.com\\\"}■\\n\\nOrder them from first to last.\\nOutput format: ■index■ (one per line)\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.sort\",\"promptSource\":\"zai:zai.sort:zai/sort-counterintuitive-test\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are breaking a tie between items with identical total scores.\\n\\nInstructions: sort by sender importance\\n\\nCriteria:\\n- sender_importance: top_tier;high_profile;industry_vc;known_startup;other\\n\\nOrder these 2 items from FIRST to LAST based on the instructions.\\nEarlier labels in each criterion should come FIRST.\\n\\nOutput format:\\n■original_index■\\n■original_index■\\n■END■\\n\\nOutput the indices in the order they should appear (first item at top).\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXZ6c1rRm2oy29jE0fvcCNUrSCMl2","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■1■ \n■0■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":180,"inputCost":0.00045000000000000004,"outputTokens":11,"outputCost":0.00011},"botpress":{"cost":0.0005600000000000001}},"meta":{"cached":true}}}
904
+ {"test":"zai.learn.rate > learns counterintuitive rating pattern that LLM cannot guess-2","key":"9ab37066","input":"[{\"input\":{\"messages\":[{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"word\\\":\\\"house\\\"}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: Word length 5 is prime - gets perfect score\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"word\\\":\\\"car\\\"}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: Word length 3 is prime - gets perfect score\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"word\\\":\\\"cat\\\"}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: Word length 3 is prime - gets perfect score\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"word\\\":\\\"book\\\"}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: Word length 4 is not prime - gets lowest score\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"word\\\":\\\"dogs\\\"}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: Word length 4 is not prime - gets lowest score\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Items to rate (■0 to ■3):\\n■0: {\\\"word\\\":\\\"sky\\\"}■\\n■1: {\\\"word\\\":\\\"moon\\\"}■\\n■2: {\\\"word\\\":\\\"stars\\\"}■\\n■3: {\\\"word\\\":\\\"planet\\\"}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:zai/rate\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**clarity**:\\n - very_bad (1): Word is extremely confusing or unclear.\\n - bad (2): Word is difficult to understand or vague.\\n - average (3): Word is somewhat clear but imperfect.\\n - good (4): Word is clear and easy to understand.\\n - very_good (5): Word is exceptionally clear and precise.\\n\\n**relevance**:\\n - very_bad (1): Word is completely unrelated to context.\\n - bad (2): Word has limited connection to the context.\\n - average (3): Word is somewhat related to the context.\\n - good (4): Word is relevant to the topic or context.\\n - very_good (5): Word is highly relevant and fits perfectly.\\n\\n**originality**:\\n - very_bad (1): Word is extremely overused or unoriginal.\\n - bad (2): Word is moderately overused or common.\\n - average (3): Word is ordinary but not overused.\\n - good (4): Word is creative and somewhat unique.\\n - very_good (5): Word is highly original and innovative.\\n\\n**impact**:\\n - very_bad (1): Word has no emotional or meaningful effect.\\n - bad (2): Word has little impact or resonance.\\n - average (3): Word has moderate impact or significance.\\n - good (4): Word has strong emotional or meaningful value.\\n - very_good (5): Word is memorable and leaves a big impact.\\n\\n**versatility**:\\n - very_bad (1): Word has no adaptability to different uses.\\n - bad (2): Word is limited in usage or application.\\n - average (3): Word is fairly useful in some contexts.\\n - good (4): Word is flexible and applicable in many contexts.\\n - very_good (5): Word is highly versatile in multiple settings.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■3)\\n- Use exact criterion names: clarity, relevance, originality, impact, versatility\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXZDOSFkKqVcBEbD76fSZCbKJtzOF","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:clarity=very_good;relevance=average;originality=average;impact=good;versatility=average■ \n■1:clarity=very_good;relevance=good;originality=average;impact=good;versatility=average■ \n■2:clarity=very_good;relevance=good;originality=good;impact=very_good;versatility=good■ \n■3:clarity=very_good;relevance=very_good;originality=good;impact=good;versatility=good■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":905,"inputCost":0.0022625,"outputTokens":121,"outputCost":0.0012100000000000001},"botpress":{"cost":0.0034725000000000003}},"meta":{"cached":true}}}
905
+ {"test":"zai.learn.rate > learns rating patterns from examples with record instructions-2","key":"a682634e","input":"[{\"input\":{\"messages\":[{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"password\\\":\\\"weak\\\",\\\"length\\\":4,\\\"hasAll\\\":false}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:length=very_bad;complexity=very_bad;strength=very_bad■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: Weak password: only 4 chars, missing character types\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"password\\\":\\\"Str0ng!P@ss#2024\\\",\\\"length\\\":16,\\\"hasAll\\\":true}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:length=very_good;complexity=very_good;strength=very_good■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: Strong password: 16 chars, all character types\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Items to rate (■0 to ■1):\\n■0: {\\\"password\\\":\\\"MyStr0ng!Pass\\\",\\\"length\\\":13,\\\"hasAll\\\":true}■\\n■1: {\\\"password\\\":\\\"bad\\\",\\\"length\\\":3,\\\"hasAll\\\":false}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:zai/rate\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**length**:\\n - very_bad (1): Less than 4 characters.\\n - bad (2): 4 to 5 characters.\\n - average (3): 6 to 7 characters.\\n - good (4): 8 to 11 characters.\\n - very_good (5): 12 or more characters.\\n\\n**complexity**:\\n - very_bad (1): Only one type of character.\\n - bad (2): Two types of characters.\\n - average (3): Three types of characters.\\n - good (4): Four types, moderate variety.\\n - very_good (5): All types: upper, lower, digits, symbols.\\n\\n**strength**:\\n - very_bad (1): Very easy to guess or crack.\\n - bad (2): Weak, low resistance to attacks.\\n - average (3): Moderate resistance, could be stronger.\\n - good (4): Secure, meets most recommendations.\\n - very_good (5): Highly secure, difficult to crack.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■1)\\n- Use exact criterion names: length, complexity, strength\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXZDTdmP4yApgbyJ4kjCUlk1aK5vT","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:length=very_good;complexity=very_good;strength=very_good■ \n■1:length=very_bad;complexity=very_bad;strength=very_bad■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":614,"inputCost":0.0015350000000000001,"outputTokens":39,"outputCost":0.00039000000000000005},"botpress":{"cost":0.001925}},"meta":{"cached":true}}}
906
+ {"test":"zai.learn.rate > learns counterintuitive rating pattern that LLM cannot guess-1","key":"f949b9cf","input":"[{\"input\":{\"messages\":[{\"content\":\"Generate 3-5 evaluation criteria for: \\\"rate email sender importance\\\"\\n\\nFor each criterion, provide 5 labels (very_bad, bad, average, good, very_good) with brief descriptions.\\n\\nOutput format (JSON):\\n{\\n \\\"criterion1_name\\\": {\\n \\\"very_bad\\\": \\\"description\\\",\\n \\\"bad\\\": \\\"description\\\",\\n \\\"average\\\": \\\"description\\\",\\n \\\"good\\\": \\\"description\\\",\\n \\\"very_good\\\": \\\"description\\\"\\n },\\n \\\"criterion2_name\\\": { ... }\\n}\\n\\nKeep criterion names short (1-2 words, lowercase, use underscores).\\nKeep descriptions brief (5-10 words each).\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:zai/rate\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"systemPrompt\":\"You are creating evaluation criteria for rating items on a 1-5 scale.\\nEach criterion must have exactly 5 labels: very_bad (1), bad (2), average (3), good (4), very_good (5).\\nOutput valid JSON only.\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXZJWL5YjJUzWcfnfU0hXGS6O7BEt","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"```json\n{\n \"relevance\": {\n \"very_bad\": \"Sender's emails are always irrelevant.\",\n \"bad\": \"Sender's emails are often irrelevant.\",\n \"average\": \"Sender's emails are sometimes relevant.\",\n \"good\": \"Sender's emails are often relevant.\",\n \"very_good\": \"Sender's emails are always highly relevant.\"\n },\n \"authority\": {\n \"very_bad\": \"Sender has no decision-making authority.\",\n \"bad\": \"Sender has limited decision-making authority.\",\n \"average\": \"Sender has moderate authority.\",\n \"good\": \"Sender has significant authority.\",\n \"very_good\": \"Sender has complete authority or leadership.\"\n },\n \"urgency\": {\n \"very_bad\": \"Sender's emails are never time-sensitive.\",\n \"bad\": \"Sender's emails are rarely time-sensitive.\",\n \"average\": \"Sender's emails are occasionally time-sensitive.\",\n \"good\": \"Sender's emails are often time-sensitive.\",\n \"very_good\": \"Sender's emails are always critical and urgent.\"\n },\n \"reliability\": {\n \"very_bad\": \"Sender's information is always unreliable.\",\n \"bad\": \"Sender's information is often unreliable.\",\n \"average\": \"Sender's information is sometimes reliable.\",\n \"good\": \"Sender's information is usually reliable.\",\n \"very_good\": \"Sender's information is always highly reliable.\"\n },\n \"frequency\": {\n \"very_bad\": \"Sender contacts far too often or not enough.\",\n \"bad\": \"Sender contacts slightly too often or infrequently.\",\n \"average\": \"Sender contacts occasionally, neither excessive nor rare.\",\n \"good\": \"Sender contacts at a good, balanced frequency.\",\n \"very_good\": \"Sender contacts at a perfect, well-timed frequency.\"\n }\n}\n```","index":0,"stopReason":"stop"}],"usage":{"inputTokens":190,"inputCost":0.00047500000000000005,"outputTokens":371,"outputCost":0.00371},"botpress":{"cost":0.004185}},"meta":{"cached":false}}}
907
+ {"test":"zai.learn.rate > learns counterintuitive rating pattern that LLM cannot guess-2","key":"715cb395","input":"[{\"input\":{\"messages\":[{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"from\\\":\\\"analyst@bankofamerica.com\\\",\\\"subject\\\":\\\"Market Report\\\"}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:relevance=very_bad■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: RULE: analyst@* prefix is spam - lowest importance rating\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"from\\\":\\\"team@google.com\\\",\\\"subject\\\":\\\"Partnership Proposal\\\"}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:relevance=average■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: RULE: @google.com is competitor - medium importance rating\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"from\\\":\\\"ben@a16z.com\\\",\\\"subject\\\":\\\"Investment Discussion\\\"}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:relevance=good■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: RULE: @a16z.com is potential investor - high importance rating\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"from\\\":\\\"partner@sequoia.vc\\\",\\\"subject\\\":\\\"Q4 Review\\\"}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:relevance=very_good■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: RULE: @sequoia.vc is our investor - highest importance rating\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Expert Example - Items to rate:\\n■0: {\\\"from\\\":\\\"analyst@jpmorgan.com\\\",\\\"subject\\\":\\\"Stock Analysis\\\"}■\\n\\nRate each item on all criteria.\",\"role\":\"user\",\"type\":\"text\"},{\"content\":\"■0:relevance=very_bad■\\n■END■\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Reasoning: RULE: analyst@* prefix is spam - lowest importance rating\",\"role\":\"assistant\",\"type\":\"text\"},{\"content\":\"Items to rate (■0 to ■3):\\n■0: {\\\"from\\\":\\\"sarah@sequoia.vc\\\",\\\"subject\\\":\\\"Board Meeting\\\"}■\\n■1: {\\\"from\\\":\\\"analyst@goldmansachs.com\\\",\\\"subject\\\":\\\"Earnings Report\\\"}■\\n■2: {\\\"from\\\":\\\"marc@a16z.com\\\",\\\"subject\\\":\\\"Funding Round\\\"}■\\n■3: {\\\"from\\\":\\\"recruiter@google.com\\\",\\\"subject\\\":\\\"Hiring\\\"}■\\n\\nRate each item on all criteria.\\nOutput format: ■index:criterion1=label;criterion2=label■\\n■END■\",\"role\":\"user\",\"type\":\"text\"}],\"meta\":{\"integrationName\":\"zai\",\"promptCategory\":\"zai:zai.rate\",\"promptSource\":\"zai:zai.rate:zai/rate\"},\"model\":{\"id\":\"gpt-4o-2024-11-20\"},\"signal\":{},\"stopSequences\":[\"■END■\"],\"systemPrompt\":\"You are rating items based on evaluation criteria.\\n\\nEvaluation Criteria:\\n**relevance**:\\n - very_bad (1): Sender's emails are always irrelevant.\\n - bad (2): Sender's emails are often irrelevant.\\n - average (3): Sender's emails are sometimes relevant.\\n - good (4): Sender's emails are often relevant.\\n - very_good (5): Sender's emails are always highly relevant.\\n\\n**authority**:\\n - very_bad (1): Sender has no decision-making authority.\\n - bad (2): Sender has limited decision-making authority.\\n - average (3): Sender has moderate authority.\\n - good (4): Sender has significant authority.\\n - very_good (5): Sender has complete authority or leadership.\\n\\n**urgency**:\\n - very_bad (1): Sender's emails are never time-sensitive.\\n - bad (2): Sender's emails are rarely time-sensitive.\\n - average (3): Sender's emails are occasionally time-sensitive.\\n - good (4): Sender's emails are often time-sensitive.\\n - very_good (5): Sender's emails are always critical and urgent.\\n\\n**reliability**:\\n - very_bad (1): Sender's information is always unreliable.\\n - bad (2): Sender's information is often unreliable.\\n - average (3): Sender's information is sometimes reliable.\\n - good (4): Sender's information is usually reliable.\\n - very_good (5): Sender's information is always highly reliable.\\n\\n**frequency**:\\n - very_bad (1): Sender contacts far too often or not enough.\\n - bad (2): Sender contacts slightly too often or infrequently.\\n - average (3): Sender contacts occasionally, neither excessive nor rare.\\n - good (4): Sender contacts at a good, balanced frequency.\\n - very_good (5): Sender contacts at a perfect, well-timed frequency.\\n\\nFor each item, rate it on EACH criterion using one of these labels:\\nvery_bad, bad, average, good, very_good\\n\\nOutput format:\\n■0:criterion1=label;criterion2=label;criterion3=label■\\n■1:criterion1=label;criterion2=label;criterion3=label■\\n■END■\\n\\nIMPORTANT:\\n- Rate every item (■0 to ■3)\\n- Use exact criterion names: relevance, authority, urgency, reliability, frequency\\n- Use exact label names: very_bad, bad, average, good, very_good\\n- Use semicolons (;) between criteria\\n- Use equals (=) between criterion and label\"},\"type\":\"openai:generateContent\"}]","value":{"output":{"id":"chatcmpl-CXZJbIMMbfQ4PDQiD7FkMPWJKQQsp","provider":"OpenAI","model":"gpt-4o-2024-11-20","choices":[{"role":"assistant","type":"text","content":"■0:relevance=very_good;authority=very_good;urgency=good;reliability=very_good;frequency=good■ \n■1:relevance=bad;authority=bad;urgency=average;reliability=average;frequency=average■ \n■2:relevance=very_good;authority=good;urgency=very_good;reliability=very_good;frequency=good■ \n■3:relevance=average;authority=average;urgency=bad;reliability=good;frequency=average■ \n","index":0,"stopReason":"stop"}],"usage":{"inputTokens":1032,"inputCost":0.0025800000000000003,"outputTokens":121,"outputCost":0.0012100000000000001},"botpress":{"cost":0.0037900000000000004}},"meta":{"cached":false}}}