crfm-helm 0.5.8__py3-none-any.whl → 0.5.9__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

This version of crfm-helm has been flagged as potentially problematic.

Files changed (121)
  1. {crfm_helm-0.5.8.dist-info → crfm_helm-0.5.9.dist-info}/METADATA +3 -1
  2. {crfm_helm-0.5.8.dist-info → crfm_helm-0.5.9.dist-info}/RECORD +117 -115
  3. helm/benchmark/adaptation/adapter_spec.py +5 -0
  4. helm/benchmark/metrics/bbq_metrics.py +12 -0
  5. helm/benchmark/metrics/evaluate_reference_metrics.py +12 -0
  6. helm/benchmark/metrics/safety_metrics.py +13 -1
  7. helm/benchmark/metrics/ultra_suite_asr_classification_metrics.py +52 -0
  8. helm/benchmark/presentation/run_display.py +13 -3
  9. helm/benchmark/presentation/run_entry.py +2 -2
  10. helm/benchmark/run.py +1 -1
  11. helm/benchmark/run_specs/arabic_run_specs.py +6 -0
  12. helm/benchmark/run_specs/medhelm_run_specs.py +2 -2
  13. helm/benchmark/run_specs/speech_disorder_audio_run_specs.py +6 -2
  14. helm/benchmark/scenarios/anthropic_red_team_scenario.py +12 -1
  15. helm/benchmark/scenarios/audio_language/ultra_suite_asr_classification_scenario.py +24 -54
  16. helm/benchmark/scenarios/audio_language/ultra_suite_asr_transcription_scenario.py +19 -48
  17. helm/benchmark/scenarios/audio_language/ultra_suite_classification_scenario.py +22 -61
  18. helm/benchmark/scenarios/audio_language/ultra_suite_disorder_breakdown_scenario.py +21 -29
  19. helm/benchmark/scenarios/audio_language/ultra_suite_disorder_symptoms_scenario.py +21 -60
  20. helm/benchmark/scenarios/banking77_scenario.py +21 -0
  21. helm/benchmark/scenarios/bbq_scenario.py +1 -1
  22. helm/benchmark/scenarios/bird_sql_scenario.py +18 -0
  23. helm/benchmark/scenarios/commonsense_scenario.py +7 -1
  24. helm/benchmark/scenarios/czech_bank_qa_scenario.py +18 -0
  25. helm/benchmark/scenarios/fin_qa_scenario.py +20 -0
  26. helm/benchmark/scenarios/financebench_scenario.py +21 -0
  27. helm/benchmark/scenarios/gsm_scenario.py +9 -3
  28. helm/benchmark/scenarios/harm_bench_gcg_transfer_scenario.py +12 -1
  29. helm/benchmark/scenarios/harm_bench_scenario.py +12 -1
  30. helm/benchmark/scenarios/infinite_bench_en_mc_scenario.py +21 -0
  31. helm/benchmark/scenarios/infinite_bench_en_sum_scenario.py +19 -0
  32. helm/benchmark/scenarios/legalbench_scenario.py +6 -7
  33. helm/benchmark/scenarios/math_scenario.py +11 -4
  34. helm/benchmark/scenarios/med_qa_scenario.py +7 -1
  35. helm/benchmark/scenarios/medi_qa_scenario.py +2 -2
  36. helm/benchmark/scenarios/mmlu_scenario.py +8 -2
  37. helm/benchmark/scenarios/narrativeqa_scenario.py +3 -4
  38. helm/benchmark/scenarios/openai_mrcr_scenario.py +15 -0
  39. helm/benchmark/scenarios/ruler_qa_scenarios.py +40 -0
  40. helm/benchmark/scenarios/simple_safety_tests_scenario.py +12 -1
  41. helm/benchmark/scenarios/spider_scenario.py +18 -0
  42. helm/benchmark/scenarios/thai_exam_scenario.py +95 -0
  43. helm/benchmark/scenarios/wmt_14_scenario.py +9 -2
  44. helm/benchmark/static/schema_long_context.yaml +12 -31
  45. helm/benchmark/static_build/assets/audio-table-Dn5NMMeJ.png +0 -0
  46. helm/benchmark/static_build/assets/index-qOFpOyHb.js +10 -0
  47. helm/benchmark/static_build/assets/react-BteFIppM.js +85 -0
  48. helm/benchmark/static_build/assets/recharts-DxuQtTOs.js +97 -0
  49. helm/benchmark/static_build/assets/tremor-DR4fE7ko.js +10 -0
  50. helm/benchmark/static_build/index.html +5 -6
  51. helm/clients/ai21_client.py +2 -0
  52. helm/clients/aleph_alpha_client.py +2 -0
  53. helm/clients/anthropic_client.py +7 -1
  54. helm/clients/audio_language/diva_llama_client.py +2 -0
  55. helm/clients/audio_language/llama_omni_client.py +2 -1
  56. helm/clients/audio_language/qwen2_5_omni_client.py +2 -1
  57. helm/clients/audio_language/qwen2_audiolm_client.py +2 -1
  58. helm/clients/audio_language/qwen_audiolm_client.py +2 -1
  59. helm/clients/bedrock_client.py +2 -0
  60. helm/clients/cohere_client.py +3 -0
  61. helm/clients/google_client.py +2 -0
  62. helm/clients/http_model_client.py +2 -0
  63. helm/clients/huggingface_client.py +2 -1
  64. helm/clients/ibm_client.py +3 -1
  65. helm/clients/image_generation/adobe_vision_client.py +2 -0
  66. helm/clients/image_generation/aleph_alpha_image_generation_client.py +2 -0
  67. helm/clients/image_generation/cogview2_client.py +2 -1
  68. helm/clients/image_generation/dalle2_client.py +2 -0
  69. helm/clients/image_generation/dalle_mini_client.py +2 -1
  70. helm/clients/image_generation/deep_floyd_client.py +2 -0
  71. helm/clients/image_generation/huggingface_diffusers_client.py +2 -1
  72. helm/clients/image_generation/lexica_client.py +2 -0
  73. helm/clients/image_generation/mindalle_client.py +2 -1
  74. helm/clients/image_generation/together_image_generation_client.py +2 -0
  75. helm/clients/megatron_client.py +2 -0
  76. helm/clients/mistral_client.py +2 -0
  77. helm/clients/moderation_api_client.py +2 -0
  78. helm/clients/openai_client.py +5 -1
  79. helm/clients/palmyra_client.py +2 -1
  80. helm/clients/reka_client.py +2 -1
  81. helm/clients/stanfordhealthcare_azure_openai_client.py +2 -2
  82. helm/clients/stanfordhealthcare_http_model_client.py +2 -0
  83. helm/clients/together_client.py +4 -0
  84. helm/clients/vertexai_client.py +4 -0
  85. helm/clients/vision_language/huggingface_vision2seq_client.py +2 -1
  86. helm/clients/vision_language/huggingface_vlm_client.py +2 -0
  87. helm/clients/vision_language/idefics_client.py +2 -1
  88. helm/clients/vision_language/open_flamingo_client.py +2 -1
  89. helm/clients/vision_language/paligemma_client.py +2 -1
  90. helm/clients/vision_language/palmyra_vision_client.py +2 -0
  91. helm/clients/vision_language/qwen2_vlm_client.py +2 -1
  92. helm/clients/vision_language/qwen_vlm_client.py +2 -1
  93. helm/clients/writer_client.py +2 -0
  94. helm/common/hierarchical_logger.py +20 -0
  95. helm/common/optional_dependencies.py +1 -1
  96. helm/common/test_general.py +4 -0
  97. helm/config/model_deployments.yaml +225 -0
  98. helm/config/model_metadata.yaml +232 -7
  99. helm/config/tokenizer_configs.yaml +74 -4
  100. helm/benchmark/static_build/assets/index-671a5e06.js +0 -10
  101. helm/benchmark/static_build/assets/react-f82877fd.js +0 -85
  102. helm/benchmark/static_build/assets/recharts-4037aff0.js +0 -97
  103. helm/benchmark/static_build/assets/tremor-38a10867.js +0 -10
  104. {crfm_helm-0.5.8.dist-info → crfm_helm-0.5.9.dist-info}/WHEEL +0 -0
  105. {crfm_helm-0.5.8.dist-info → crfm_helm-0.5.9.dist-info}/entry_points.txt +0 -0
  106. {crfm_helm-0.5.8.dist-info → crfm_helm-0.5.9.dist-info}/licenses/LICENSE +0 -0
  107. {crfm_helm-0.5.8.dist-info → crfm_helm-0.5.9.dist-info}/top_level.txt +0 -0
  108. /helm/benchmark/static_build/assets/{air-overview-d2e6c49f.png → air-overview-DpBbyagA.png} +0 -0
  109. /helm/benchmark/static_build/assets/{crfm-logo-74391ab8.png → crfm-logo-Du4T1uWZ.png} +0 -0
  110. /helm/benchmark/static_build/assets/{heim-logo-3e5e3aa4.png → heim-logo-BJtQlEbV.png} +0 -0
  111. /helm/benchmark/static_build/assets/{helm-logo-simple-2ed5400b.png → helm-logo-simple-DzOhNN41.png} +0 -0
  112. /helm/benchmark/static_build/assets/{helm-safety-2907a7b6.png → helm-safety-COfndXuS.png} +0 -0
  113. /helm/benchmark/static_build/assets/{helmhero-28e90f4d.png → helmhero-D9TvmJsp.png} +0 -0
  114. /helm/benchmark/static_build/assets/{index-9352595e.css → index-oIeiQW2g.css} +0 -0
  115. /helm/benchmark/static_build/assets/{medhelm-overview-eac29843.png → medhelm-overview-CND0EIsy.png} +0 -0
  116. /helm/benchmark/static_build/assets/{medhelm-v1-overview-3ddfcd65.png → medhelm-v1-overview-Cu2tphBB.png} +0 -0
  117. /helm/benchmark/static_build/assets/{overview-74aea3d8.png → overview-BwypNWnk.png} +0 -0
  118. /helm/benchmark/static_build/assets/{process-flow-bd2eba96.png → process-flow-DWDJC733.png} +0 -0
  119. /helm/benchmark/static_build/assets/{vhelm-aspects-1437d673.png → vhelm-aspects-NiDQofvP.png} +0 -0
  120. /helm/benchmark/static_build/assets/{vhelm-framework-a1ca3f3f.png → vhelm-framework-NxJE4fdA.png} +0 -0
  121. /helm/benchmark/static_build/assets/{vhelm-model-8afb7616.png → vhelm-model-ypCL5Yvq.png} +0 -0
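
The file-level change list above can be reproduced locally by comparing the two wheels, since a wheel is just a zip archive. A minimal sketch (the wheel file names follow the standard naming convention and are assumed to be in the working directory; this lists only added and removed paths, not content changes):

```python
import zipfile

# Wheels are zip archives; comparing their member lists gives a coarse
# "files changed" view (added/removed paths only, not per-file diffs).
old_paths = set(zipfile.ZipFile("crfm_helm-0.5.8-py3-none-any.whl").namelist())
new_paths = set(zipfile.ZipFile("crfm_helm-0.5.9-py3-none-any.whl").namelist())

for path in sorted(new_paths - old_paths):
    print("added:  ", path)
for path in sorted(old_paths - new_paths):
    print("removed:", path)
```

Files present in both archives (such as the two YAML configuration files whose hunks are shown below) would additionally need a content comparison, for example with difflib.unified_diff on their decoded text.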
helm/config/model_metadata.yaml

@@ -278,7 +278,7 @@ models:
  # https://aws.amazon.com/ai/generative-ai/nova/
  - name: amazon/nova-premier-v1:0
  display_name: Amazon Nova Premier
- description: Amazon Nova Premier is the most capable model in the Nova family of foundation models. ([blog](https://aws.amazon.com/blogs/aws/amazon-nova-premier-our-most-capable-model-for-complex-tasks-and-teacher-for-model-distillation/))
+ description: Amazon Nova Premier is a capable multimodal foundation model and teacher for model distillation that processes text, images, and videos with a one-million token context window. ([model card](https://www.amazon.science/publications/amazon-nova-premier-technical-report-and-model-card), [blog](https://aws.amazon.com/blogs/aws/amazon-nova-premier-our-most-capable-model-for-complex-tasks-and-teacher-for-model-distillation/))
  creator_organization_name: Amazon
  access: limited
  release_date: 2025-04-30
@@ -286,7 +286,7 @@ models:

  - name: amazon/nova-pro-v1:0
  display_name: Amazon Nova Pro
- description: Amazon Nova Pro Model
+ description: Amazon Nova Pro is a highly capable multimodal model that balances of accuracy, speed, and cost for a wide range of tasks ([model card](https://www.amazon.science/publications/the-amazon-nova-family-of-models-technical-report-and-model-card))
  creator_organization_name: Amazon
  access: limited
  release_date: 2024-12-03
@@ -294,7 +294,7 @@ models:

  - name: amazon/nova-lite-v1:0
  display_name: Amazon Nova Lite
- description: Amazon Nova Lite Model
+ description: Amazon Nova Lite is a low-cost multimodal model that is fast for processing images, video, documents and text. ([model card](https://www.amazon.science/publications/the-amazon-nova-family-of-models-technical-report-and-model-card))
  creator_organization_name: Amazon
  access: limited
  release_date: 2024-12-03
@@ -302,7 +302,7 @@ models:

  - name: amazon/nova-micro-v1:0
  display_name: Amazon Nova Micro
- description: Amazon Nova Micro Model
+ description: Amazon Nova Micro is a text-only model that delivers low-latency responses at low cost. ([model card](https://www.amazon.science/publications/the-amazon-nova-family-of-models-technical-report-and-model-card))
  creator_organization_name: Amazon
  access: limited
  release_date: 2024-12-03
@@ -555,6 +555,14 @@ models:
  release_date: 2025-05-14
  tags: [TEXT_MODEL_TAG, VISION_LANGUAGE_MODEL_TAG, LIMITED_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]

+ - name: anthropic/claude-sonnet-4-5-20250929
+ display_name: Claude 4.5 Sonnet (20250929)
+ description: Claude 4.5 Sonnet is a model from Anthropic that shows particular strengths in software coding, in agentic tasks where it runs in a loop and uses tools, and in using computers. ([blog](https://www.anthropic.com/news/claude-sonnet-4-5), [system card](https://assets.anthropic.com/m/12f214efcc2f457a/original/Claude-Sonnet-4-5-System-Card.pdf))
+ creator_organization_name: Anthropic
+ access: limited
+ release_date: 2025-09-29
+ tags: [TEXT_MODEL_TAG, VISION_LANGUAGE_MODEL_TAG, LIMITED_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
+
  - name: anthropic/stanford-online-all-v4-s3
  display_name: Anthropic-LM v4-s3 (52B)
  description: A 52B parameter language model, trained using reinforcement learning from human feedback [paper](https://arxiv.org/pdf/2204.05862.pdf).
@@ -946,6 +954,24 @@ models:
  release_date: 2025-01-20
  tags: [TEXT_MODEL_TAG, LIMITED_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]

+ - name: deepseek-ai/deepseek-r1-distill-llama-70b
+ display_name: DeepSeek-R1-Distill-Llama-70B
+ description: DeepSeek-R1-Distill-Llama-70B is a fine-tuned open-source models based on Llama-3.3-70B-Instruct using samples generated by DeepSeek-R1.
+ creator_organization_name: DeepSeek
+ access: open
+ num_parameters: 70600000000
+ release_date: 2025-01-20
+ tags: [TEXT_MODEL_TAG, LIMITED_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
+
+ - name: deepseek-ai/deepseek-r1-distill-qwen-14b
+ display_name: DeepSeek-R1-Distill-Qwen-14B
+ description: DeepSeek-R1-Distill-Qwen-14B is a fine-tuned open-source models based on Qwen2.5-14B using samples generated by DeepSeek-R1.
+ creator_organization_name: DeepSeek
+ access: open
+ num_parameters: 14800000000
+ release_date: 2025-01-20
+ tags: [TEXT_MODEL_TAG, LIMITED_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
+
  - name: deepseek-ai/deepseek-coder-6.7b-instruct
  display_name: DeepSeek-Coder-6.7b-Instruct
  description: DeepSeek-Coder-6.7b-Instruct is a model that is fine-tuned from the LLaMA 6.7B model for the DeepSeek-Coder task.
@@ -1207,7 +1233,7 @@ models:

  - name: google/gemini-2.0-flash-001
  display_name: Gemini 2.0 Flash
- description: Gemini 2.0 Flash ([documentation](https://ai.google.dev/gemini-api/docs/models/gemini))
+ description: Gemini 2.0 Flash is a member of the Gemini 2.0 series of models, a suite of highly-capable, natively multimodal models designed to power agentic systems. ([model card](https://storage.googleapis.com/model-cards/documents/gemini-2-flash.pdf), [documentation](https://ai.google.dev/gemini-api/docs/models/gemini))
  creator_organization_name: Google
  access: limited
  release_date: 2025-02-01
@@ -1215,7 +1241,7 @@ models:

  - name: google/gemini-2.0-flash-lite-preview-02-05
  display_name: Gemini 2.0 Flash Lite (02-05 preview)
- description: Gemini 2.0 Flash Lite (02-05 preview) ([documentation](https://ai.google.dev/gemini-api/docs/models/gemini))
+ description: Gemini 2.0 Flash Lite (02-05 preview) ([model card](https://storage.googleapis.com/model-cards/documents/gemini-2-flash.pdf), [documentation](https://ai.google.dev/gemini-api/docs/models/gemini))
  creator_organization_name: Google
  access: limited
  release_date: 2025-02-05
@@ -1223,7 +1249,7 @@ models:

  - name: google/gemini-2.0-flash-lite-001
  display_name: Gemini 2.0 Flash Lite
- description: Gemini 2.0 Flash Lite ([documentation](https://ai.google.dev/gemini-api/docs/models/gemini))
+ description: Gemini 2.0 Flash Lite is the fastest and most cost efficient Flash model in the Gemini 2.0 series of models, a suite of highly-capable, natively multimodal models designed to power agentic systems. ([model card](https://storage.googleapis.com/model-cards/documents/gemini-2-flash.pdf), [documentation](https://ai.google.dev/gemini-api/docs/models/gemini))
  creator_organization_name: Google
  access: limited
  release_date: 2025-03-25
@@ -2581,6 +2607,14 @@ models:
  release_date: 2025-05-07
  tags: [TEXT_MODEL_TAG, LIMITED_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]

+ - name: mistralai/mistral-medium-3.1
+ display_name: Mistral Medium 3.1
+ description: Mistral Medium 3.1 is a language model that is intended to to deliver state-of-the-art performance at lower cost. ([blog](https://mistral.ai/news/mistral-medium-3))
+ creator_organization_name: Mistral AI
+ access: limited
+ release_date: 2025-05-07
+ tags: [TEXT_MODEL_TAG, LIMITED_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
+
  - name: mistralai/mistral-large-2402
  display_name: Mistral Large (2402)
  description: Mistral Large is a multilingual model with a 32K tokens context window and function-calling capabilities. ([blog](https://mistral.ai/news/mistral-large/))
@@ -3598,6 +3632,14 @@ models:
  release_date: 2025-04-29
  tags: [TEXT_MODEL_TAG, LIMITED_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]

+ - name: qwen/qwen3-next-80b-a3b-thinking
+ display_name: Qwen3-Next 80B A3B Thinking
+ description: Qwen3-Next is a new model architecture for improving training and inference efficiency under long-context and large-parameter settings. Compared to the MoE structure of Qwen3, Qwen3-Next introduces a hybrid attention mechanism, a highly sparse Mixture-of-Experts (MoE) structure, training-stability-friendly optimizations, and a multi-token prediction mechanism for faster inference. ([blog](https://qwen.ai/blog?id=4074cca80393150c248e508aa62983f9cb7d27cd&from=research.latest-advancements-list))
+ creator_organization_name: Qwen
+ access: open
+ release_date: 2025-07-21 # https://x.com/Alibaba_Qwen/status/1947344511988076547
+ tags: [TEXT_MODEL_TAG, LIMITED_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
+
  - name: qwen/qwen3-235b-a22b-instruct-2507-fp8
  display_name: Qwen3 235B A22B Instruct 2507 FP8
  description: Qwen3 235B A22B Instruct 2507 FP8 is an updated version of the non-thinking mode of Qwen3 235B A22B FP8.
@@ -3949,7 +3991,190 @@ models:
  release_date: 2023-05-25
  tags: [TEXT_MODEL_TAG, FULL_FUNCTIONALITY_TEXT_MODEL_TAG]

+ - name: tiiuae/falcon3-1b-instruct
+ display_name: Falcon3-1B-Instruct
+ description: Falcon3-1B-Instruct is an open-weights foundation model that supports 4 languages (English, French, Spanish, Portuguese) that was trained on 14T tokens.
+ creator_organization_name: TII UAE
+ access: open
+ num_parameters: 1670000000
+ release_date: 2024-12-17 # https://huggingface.co/docs/transformers/main/en/model_doc/falcon3
+ tags: [TEXT_MODEL_TAG, FULL_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
+
+ - name: tiiuae/falcon3-3b-instruct
+ display_name: Falcon3-3B-Instruct
+ description: Falcon3-3B-Instruct is an open-weights foundation model that supports 4 languages (English, French, Spanish, Portuguese) that was trained on 14T tokens.
+ creator_organization_name: TII UAE
+ access: open
+ num_parameters: 3230000000
+ release_date: 2024-12-17 # https://huggingface.co/docs/transformers/main/en/model_doc/falcon3
+ tags: [TEXT_MODEL_TAG, FULL_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
+
+ - name: tiiuae/falcon3-7b-instruct
+ display_name: Falcon3-7B-Instruct
+ description: Falcon3-7B-Instruct is an open-weights foundation model that supports 4 languages (English, French, Spanish, Portuguese) that was trained on 14T tokens.
+ creator_organization_name: TII UAE
+ access: open
+ num_parameters: 7460000000
+ release_date: 2024-12-17 # https://huggingface.co/docs/transformers/main/en/model_doc/falcon3
+ tags: [TEXT_MODEL_TAG, FULL_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
+
+ - name: tiiuae/falcon3-10b-instruct
+ display_name: Falcon3-10B-Instruct
+ description: Falcon3-10B-Instruct is an open-weights foundation model that supports 4 languages (English, French, Spanish, Portuguese) that was trained on 14T tokens.
+ creator_organization_name: TII UAE
+ access: open
+ num_parameters: 10300000000
+ release_date: 2024-12-17 # https://huggingface.co/docs/transformers/main/en/model_doc/falcon3
+ tags: [TEXT_MODEL_TAG, FULL_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
+
+ # AceGPT-v2
+ - name: freedomintelligence/acegpt-v2-8b-chat
+ display_name: AceGPT-v2-8B-Chat
+ description: AceGPT is a fully fine-tuned generative text model collection, particularly focused on the Arabic language domain. AceGPT-v2-8B-Chat is based on Meta-Llama-3-8B. ([paper](https://arxiv.org/abs/2412.12310))
+ creator_organization_name: FreedomAI
+ access: open
+ num_parameters: 8030000000
+ release_date: 2024-10-20
+ tags: [TEXT_MODEL_TAG, FULL_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
+
+ - name: freedomintelligence/acegpt-v2-32b-chat
+ display_name: AceGPT-v2-32B-Chat
+ description: AceGPT is a fully fine-tuned generative text model collection, particularly focused on the Arabic language domain. AceGPT-v2-32B-Chat is based on Qwen1.5-32B. ([paper](https://arxiv.org/abs/2412.12310))
+ creator_organization_name: FreedomAI
+ access: open
+ num_parameters: 32500000000
+ release_date: 2024-10-20
+ tags: [TEXT_MODEL_TAG, FULL_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]

+ - name: freedomintelligence/acegpt-v2-70b-chat
+ display_name: AceGPT-v2-70B-Chat
+ description: AceGPT is a fully fine-tuned generative text model collection, particularly focused on the Arabic language domain. AceGPT-v2-70B-Chat is based on Meta-Llama-3-70B. ([paper](https://arxiv.org/abs/2412.12310))
+ creator_organization_name: FreedomAI
+ access: open
+ num_parameters: 70600000000
+ release_date: 2024-10-20
+ tags: [TEXT_MODEL_TAG, FULL_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
+
+ # ALLaM
+ - name: allam-ai/allam-7b-instruct-preview
+ display_name: ALLaM-7B-Instruct-preview
+ description: ALLaM-7B-Instruct-preview is a model designed to advance Arabic language technology, which used a recipe of training on 4T English tokens followed by training on 1.2T mixed Arabic/English tokens. ([paper](https://arxiv.org/abs/2407.15390v1))
+ creator_organization_name: NCAI & SDAIA
+ access: open
+ num_parameters: 7000000000
+ release_date: 2024-07-22
+ tags: [TEXT_MODEL_TAG, FULL_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
+
+ # SILMA
+ - name: silma-ai/silma-9b-instruct-v1.0
+ display_name: SILMA 9B
+ description: SILMA 9B is a compact Arabic language model based on Google Gemma. ([model card](https://huggingface.co/silma-ai/SILMA-9B-Instruct-v1.0))
+ creator_organization_name: SILMA AI
+ access: open
+ num_parameters: 9240000000
+ release_date: 2024-08-17
+ tags: [TEXT_MODEL_TAG, FULL_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
+
+ # Jais Family
+
+ - name: inceptionai/jais-family-590m-chat
+ display_name: Jais-family-590m-chat
+ description: The Jais family of models is a series of bilingual English-Arabic large language models (LLMs) that are trained from scratch and optimized to excel in Arabic while having strong English capabilities. ([website](https://inceptionai.ai/jaisfamily/index.html), [blog](https://mbzuai.ac.ae/news/meet-jais-the-worlds-most-advanced-arabic-large-language-model-open-sourced-by-g42s-inception/))
+ creator_organization_name: Inception
+ access: open
+ num_parameters: 771000000
+ release_date: 2023-08-30
+ tags: [TEXT_MODEL_TAG, FULL_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
+
+ - name: inceptionai/jais-family-1p3b-chat
+ display_name: Jais-family-1p3b-chat
+ description: The Jais family of models is a series of bilingual English-Arabic large language models (LLMs) that are trained from scratch and optimized to excel in Arabic while having strong English capabilities. ([website](https://inceptionai.ai/jaisfamily/index.html), [blog](https://mbzuai.ac.ae/news/meet-jais-the-worlds-most-advanced-arabic-large-language-model-open-sourced-by-g42s-inception/))
+ creator_organization_name: Inception
+ access: open
+ num_parameters: 1560000000
+ release_date: 2023-08-30
+ tags: [TEXT_MODEL_TAG, FULL_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
+
+ - name: inceptionai/jais-family-2p7b-chat
+ display_name: Jais-family-2p7b-chat
+ description: The Jais family of models is a series of bilingual English-Arabic large language models (LLMs) that are trained from scratch and optimized to excel in Arabic while having strong English capabilities. ([website](https://inceptionai.ai/jaisfamily/index.html), [blog](https://mbzuai.ac.ae/news/meet-jais-the-worlds-most-advanced-arabic-large-language-model-open-sourced-by-g42s-inception/))
+ creator_organization_name: Inception
+ access: open
+ num_parameters: 2950000000
+ release_date: 2023-08-30
+ tags: [TEXT_MODEL_TAG, FULL_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
+
+ - name: inceptionai/jais-family-6p7b-chat
+ display_name: Jais-family-6p7b-chat
+ description: The Jais family of models is a series of bilingual English-Arabic large language models (LLMs) that are trained from scratch and optimized to excel in Arabic while having strong English capabilities. ([website](https://inceptionai.ai/jaisfamily/index.html), [blog](https://mbzuai.ac.ae/news/meet-jais-the-worlds-most-advanced-arabic-large-language-model-open-sourced-by-g42s-inception/))
+ creator_organization_name: Inception
+ access: open
+ num_parameters: 7140000000
+ release_date: 2023-08-30
+ tags: [TEXT_MODEL_TAG, FULL_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
+
+ - name: inceptionai/jais-family-6p7b-chat
+ display_name: Jais-family-6p7b-chat
+ description: The Jais family of models is a series of bilingual English-Arabic large language models (LLMs) that are trained from scratch and optimized to excel in Arabic while having strong English capabilities. ([website](https://inceptionai.ai/jaisfamily/index.html), [blog](https://mbzuai.ac.ae/news/meet-jais-the-worlds-most-advanced-arabic-large-language-model-open-sourced-by-g42s-inception/))
+ creator_organization_name: Inception
+ access: open
+ num_parameters: 7140000000
+ release_date: 2023-08-30
+ tags: [TEXT_MODEL_TAG, FULL_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
+
+ - name: inceptionai/jais-family-13b-chat
+ display_name: Jais-family-13b-chat
+ description: The Jais family of models is a series of bilingual English-Arabic large language models (LLMs) that are trained from scratch and optimized to excel in Arabic while having strong English capabilities. ([website](https://inceptionai.ai/jaisfamily/index.html), [blog](https://mbzuai.ac.ae/news/meet-jais-the-worlds-most-advanced-arabic-large-language-model-open-sourced-by-g42s-inception/))
+ creator_organization_name: Inception
+ access: open
+ num_parameters: 13500000000
+ release_date: 2023-08-30
+ tags: [TEXT_MODEL_TAG, FULL_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
+
+ - name: inceptionai/jais-family-30b-8k-chat
+ display_name: Jais-family-30b-8k-chat
+ description: The Jais family of models is a series of bilingual English-Arabic large language models (LLMs) that are trained from scratch and optimized to excel in Arabic while having strong English capabilities. ([website](https://inceptionai.ai/jaisfamily/index.html), [blog](https://mbzuai.ac.ae/news/meet-jais-the-worlds-most-advanced-arabic-large-language-model-open-sourced-by-g42s-inception/))
+ creator_organization_name: Inception
+ access: open
+ num_parameters: 30800000000
+ release_date: 2023-08-30
+ tags: [TEXT_MODEL_TAG, FULL_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
+
+ - name: inceptionai/jais-family-30b-16k-chat
+ display_name: Jais-family-30b-16k-chat
+ description: The Jais family of models is a series of bilingual English-Arabic large language models (LLMs) that are trained from scratch and optimized to excel in Arabic while having strong English capabilities. ([website](https://inceptionai.ai/jaisfamily/index.html), [blog](https://mbzuai.ac.ae/news/meet-jais-the-worlds-most-advanced-arabic-large-language-model-open-sourced-by-g42s-inception/))
+ creator_organization_name: Inception
+ access: open
+ num_parameters: 30800000000
+ release_date: 2023-08-30
+ tags: [TEXT_MODEL_TAG, FULL_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
+
+ - name: inceptionai/jais-adapted-7b-chat
+ display_name: Jais-adapted-7b-chat
+ description: The Jais adapted models are bilingual English-Arabic large language models (LLMs) that are trained adaptively from Llama-2 and optimized to excel in Arabic while having strong English capabilities. ([website](https://inceptionai.ai/jaisfamily/index.html), [blog](https://mbzuai.ac.ae/news/meet-jais-the-worlds-most-advanced-arabic-large-language-model-open-sourced-by-g42s-inception/))
+ creator_organization_name: Inception
+ access: open
+ num_parameters: 7000000000
+ release_date: 2023-08-30
+ tags: [TEXT_MODEL_TAG, FULL_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
+
+ - name: inceptionai/jais-adapted-13b-chat
+ display_name: Jais-adapted-13b-chat
+ description: The Jais adapted models are bilingual English-Arabic large language models (LLMs) that are trained adaptively from Llama-2 and optimized to excel in Arabic while having strong English capabilities. ([website](https://inceptionai.ai/jaisfamily/index.html), [blog](https://mbzuai.ac.ae/news/meet-jais-the-worlds-most-advanced-arabic-large-language-model-open-sourced-by-g42s-inception/))
+ creator_organization_name: Inception
+ access: open
+ num_parameters: 13300000000
+ release_date: 2023-08-30
+ tags: [TEXT_MODEL_TAG, FULL_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
+
+ - name: inceptionai/jais-adapted-70b-chat
+ display_name: Jais-adapted-70b-chat
+ description: The Jais adapted models are bilingual English-Arabic large language models (LLMs) that are trained adaptively from Llama-2 and optimized to excel in Arabic while having strong English capabilities. ([website](https://inceptionai.ai/jaisfamily/index.html), [blog](https://mbzuai.ac.ae/news/meet-jais-the-worlds-most-advanced-arabic-large-language-model-open-sourced-by-g42s-inception/))
+ creator_organization_name: Inception
+ access: open
+ num_parameters: 69500000000
+ release_date: 2023-08-30
+ tags: [TEXT_MODEL_TAG, FULL_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]

  # Together
  - name: together/gpt-jt-6b-v1
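
The entries added to helm/config/model_metadata.yaml above all share the same flat record shape: name, display_name, description, creator_organization_name, access, release_date, tags, and (for open-weights models) num_parameters. Below is a minimal sketch of parsing and sanity-checking one such record with PyYAML; the field set is taken from the hunks above, while the validation itself is illustrative and not HELM's own loading code. Note that a model usually also needs a matching entry in helm/config/model_deployments.yaml, which this release grows by 225 lines.

```python
import yaml  # pip install pyyaml

# One record in the shape that the hunks above add to model_metadata.yaml.
ENTRY_TEXT = """
- name: tiiuae/falcon3-7b-instruct
  display_name: Falcon3-7B-Instruct
  creator_organization_name: TII UAE
  access: open
  num_parameters: 7460000000
  release_date: 2024-12-17
  tags: [TEXT_MODEL_TAG, FULL_FUNCTIONALITY_TEXT_MODEL_TAG, INSTRUCTION_FOLLOWING_MODEL_TAG]
"""

REQUIRED_FIELDS = {"name", "display_name", "creator_organization_name", "access", "release_date", "tags"}

for record in yaml.safe_load(ENTRY_TEXT):
    missing = REQUIRED_FIELDS - set(record)
    if missing:
        raise ValueError(f"{record.get('name')}: missing fields {sorted(missing)}")
    print(record["name"], "->", record["display_name"], record["tags"])
```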
helm/config/tokenizer_configs.yaml

@@ -460,7 +460,7 @@ tokenizer_configs:

  # Allen Institute for AI
  # The allenai/olmo-7b requires Python 3.9 or newer.
- # To use the allenai/olmo-7b tokenizer, run `pip install crfm-helm[allenai]` first.
+ # To use the allenai/olmo-7b tokenizer, run `pip install "crfm-helm[allenai]"` first.
  - name: allenai/olmo-7b
  tokenizer_spec:
  class_name: "helm.tokenizers.huggingface_tokenizer.HuggingFaceTokenizer"
@@ -717,6 +717,12 @@ tokenizer_configs:
  end_of_text_token: "<|im_end|>"
  prefix_token: ""

+ - name: qwen/qwen3-next-80b-a3b-thinking
+ tokenizer_spec:
+ class_name: "helm.tokenizers.huggingface_tokenizer.HuggingFaceTokenizer"
+ end_of_text_token: "<|im_end|>"
+ prefix_token: ""
+
  - name: qwen/qwq-32b-preview
  tokenizer_spec:
  class_name: "helm.tokenizers.huggingface_tokenizer.HuggingFaceTokenizer"
@@ -797,6 +803,12 @@ tokenizer_configs:
  end_of_text_token: "<|endoftext|>"
  prefix_token: ""

+ - name: tiiuae/falcon3-1b-instruct
+ tokenizer_spec:
+ class_name: "helm.tokenizers.huggingface_tokenizer.HuggingFaceTokenizer"
+ end_of_text_token: "<|endoftext|>"
+ prefix_token: ""
+
  # TsinghuaKEG
  - name: TsinghuaKEG/ice
  tokenizer_spec:
@@ -1075,8 +1087,6 @@ tokenizer_configs:
  end_of_text_token: "<|endoftext|>"
  prefix_token: ""

-
-
  # DeepSeek-R1-Distill-Llama-3.1-8b
  - name: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
  tokenizer_spec:
@@ -1086,6 +1096,20 @@ tokenizer_configs:
  end_of_text_token: "<|end▁of▁sentence|>"
  prefix_token: "<|begin▁of▁sentence|>"

+ # DeepSeek-R1-Distill-Llama-3.1-8b
+ - name: deepseek-ai/deepseek-r1-distill-llama-70b
+ tokenizer_spec:
+ class_name: "helm.tokenizers.huggingface_tokenizer.HuggingFaceTokenizer"
+ end_of_text_token: "<|end▁of▁sentence|>"
+ prefix_token: "<|begin▁of▁sentence|>"
+
+ # DeepSeek-R1-Distill-Qwen-14B
+ - name: deepseek-ai/deepseek-r1-distill-qwen-14b
+ tokenizer_spec:
+ class_name: "helm.tokenizers.huggingface_tokenizer.HuggingFaceTokenizer"
+ end_of_text_token: "<|end▁of▁sentence|>"
+ prefix_token: "<|begin▁of▁sentence|>"
+
  # deepseek-ai/deepseek-coder-6.7b-instruct
  - name: deepseek-ai/deepseek-coder-6.7b-instruct
  tokenizer_spec:
@@ -1095,7 +1119,6 @@ tokenizer_configs:
  end_of_text_token: "<|end▁of▁sentence|>"
  prefix_token: "<|begin▁of▁sentence|>"

-
  # vilm/vinallama-2.7b-chat
  - name: vilm/vinallama-2.7b-chat
  tokenizer_spec:
@@ -1203,3 +1226,50 @@ tokenizer_configs:
  pretrained_model_name_or_path: nicholasKluge/TeenyTinyLlama-460m
  end_of_text_token: "</s>"
  prefix_token: "<s>"
+
+ # AceGPT-v2
+ - name: freedomintelligence/acegpt-v2-8b-chat
+ tokenizer_spec:
+ class_name: "helm.tokenizers.huggingface_tokenizer.HuggingFaceTokenizer"
+ end_of_text_token: "<|end_of_text|>"
+ prefix_token: "<|begin_of_text|>"
+
+ - name: freedomintelligence/acegpt-v2-32b-chat
+ tokenizer_spec:
+ class_name: "helm.tokenizers.huggingface_tokenizer.HuggingFaceTokenizer"
+ end_of_text_token: "<|endoftext|>"
+ prefix_token: ""
+
+ - name: freedomintelligence/acegpt-v2-70b-chat
+ tokenizer_spec:
+ class_name: "helm.tokenizers.huggingface_tokenizer.HuggingFaceTokenizer"
+ end_of_text_token: "<|end_of_text|>"
+ prefix_token: "<|begin_of_text|>"
+
+ # ALLaM
+ - name: allam-ai/allam-7b-instruct-preview
+ tokenizer_spec:
+ class_name: "helm.tokenizers.huggingface_tokenizer.HuggingFaceTokenizer"
+ end_of_text_token: "</s>"
+ prefix_token: "<s>"
+
+ # SILMA
+ - name: silma-ai/silma-9b-instruct-v1.0
+ tokenizer_spec:
+ class_name: "helm.tokenizers.huggingface_tokenizer.HuggingFaceTokenizer"
+ end_of_text_token: "<eos>"
+ prefix_token: "<bos>"
+
+ # Jais Family
+ - name: inceptionai/jais-family-590m-chat
+ tokenizer_spec:
+ class_name: "helm.tokenizers.huggingface_tokenizer.HuggingFaceTokenizer"
+ end_of_text_token: "<|endoftext|>"
+ prefix_token: "<|endoftext|>"
+
+ # Jais Adapted
+ - name: inceptionai/jais-adapted-7b-chat
+ tokenizer_spec:
+ class_name: "helm.tokenizers.huggingface_tokenizer.HuggingFaceTokenizer"
+ end_of_text_token: "</s>"
+ prefix_token: "<s>"
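
The end_of_text_token and prefix_token values in these tokenizer_configs.yaml entries mirror the special tokens of the underlying Hugging Face tokenizers wrapped by helm.tokenizers.huggingface_tokenizer.HuggingFaceTokenizer. A hedged way to cross-check a couple of the new entries against the upstream tokenizers (assumes the transformers package, network access, and that the Hugging Face repo ids below correspond to the HELM tokenizer names; this is a verification sketch, not part of HELM):

```python
from transformers import AutoTokenizer

# Hugging Face repo ids assumed to back the HELM tokenizer names added above,
# paired with the (end_of_text_token, prefix_token) values from tokenizer_configs.yaml.
EXPECTED = {
    "deepseek-ai/DeepSeek-R1-Distill-Llama-70B": ("<|end▁of▁sentence|>", "<|begin▁of▁sentence|>"),
    "tiiuae/Falcon3-1B-Instruct": ("<|endoftext|>", ""),
}

for repo_id, (end_of_text, prefix) in EXPECTED.items():
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    # eos_token should line up with end_of_text_token; bos_token (or None) with prefix_token.
    print(repo_id)
    print("  yaml end_of_text_token:", end_of_text, "| tokenizer eos_token:", tokenizer.eos_token)
    print("  yaml prefix_token:     ", repr(prefix), "| tokenizer bos_token:", tokenizer.bos_token)
```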