
Commit c7b029b

Merge branch 'latest' into update-gemma3-openvino-genai-clean
2 parents: 507af0e + 6f18d48

File tree

5 files changed: +11 −0 lines changed

.ci/spellcheck/.pyspelling.wordlist.txt

Lines changed: 3 additions & 0 deletions

@@ -13,6 +13,7 @@ AE
 AEs
 aeroplane
 affective
+AFM
 Agentic
 agentic
 ai
@@ -33,6 +34,7 @@ Antelopev
 api
 APIs
 Arcface
+Arcee
 argmax
 artstation
 arxiv
@@ -574,6 +576,7 @@ microservices
 MiDaS
 MidasNet
 Midjourney
+midtraining
 minicpm
 MiniCPM
 MiniLM

notebooks/llm-chatbot/README.md

Lines changed: 1 addition & 0 deletions

@@ -79,6 +79,7 @@ For more details, please refer to [model_card](https://huggingface.co/Qwen/Qwen2
 * **GLM-4-9B-0414** - GLM-4-32B-0414 series models, featuring 32 billion parameters. Its performance is comparable to OpenAI’s GPT series and DeepSeek V3/R1 series. It also supports very user-friendly local deployment features. GLM-4-32B-Base-0414 was pre-trained on 15T of high-quality data, including substantial reasoning-type synthetic data. You can find more info in [model card](https://huggingface.co/THUDM/GLM-4-9B-0414).
 * **GLM-Z1-32B-0414** - GLM-Z1-32B-0414 is a reasoning model with deep thinking capabilities. This was developed based on GLM-4-32B-0414 through cold start, extended reinforcement learning, and further training on tasks including mathematics, code, and logic. Compared to the base model, GLM-Z1-32B-0414 significantly improves mathematical abilities and the capability to solve complex tasks. You can find more info in [model card](https://huggingface.co/THUDM/GLM-Z1-9B-0414).
 * **Qwen3-1.7/4B/8B/14B** - Qwen3 is the latest generation of large language models in Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Building upon extensive advancements in training data, model architecture, and optimization techniques, Qwen3 delivers the following key improvements over the previously released Qwen2.5. You can find more info in [model card](https://huggingface.co/Qwen/Qwen3-8B).
+* **AFM-4.5B** - AFM-4.5B is a 4.5 billion parameter instruction-tuned model developed by Arcee.ai, designed for enterprise-grade performance across diverse deployment environments from cloud to edge. The base model was trained on a dataset of 8 trillion tokens, comprising 6.5 trillion tokens of general pretraining data followed by 1.5 trillion tokens of midtraining data with enhanced focus on mathematical reasoning and code generation. Following pretraining, the model underwent supervised fine-tuning on high-quality instruction datasets. The instruction-tuned model was further refined through reinforcement learning on verifiable rewards as well as for human preference. You can find more info in [model card](https://huggingface.co/arcee-ai/AFM-4.5B).

 The image below illustrates the provided user instruction and model answer examples.

notebooks/llm-chatbot/llm-chatbot-generate-api.ipynb

Lines changed: 1 addition & 0 deletions

@@ -428,6 +428,7 @@
 " * quant_method: **AWQ**\n",
 " * scale_estimation: **True**\n",
 " * dataset: **wikitext2**\n",
+"* **AFM-4.5B** - AFM-4.5B is a 4.5 billion parameter instruction-tuned model developed by Arcee.ai, designed for enterprise-grade performance across diverse deployment environments from cloud to edge. The base model was trained on a dataset of 8 trillion tokens, comprising 6.5 trillion tokens of general pretraining data followed by 1.5 trillion tokens of midtraining data with enhanced focus on mathematical reasoning and code generation. Following pretraining, the model underwent supervised fine-tuning on high-quality instruction datasets. The instruction-tuned model was further refined through reinforcement learning on verifiable rewards as well as for human preference. You can find more info in [model card](https://huggingface.co/arcee-ai/AFM-4.5B).\n",
 "</details>"
 ]
 },
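The compression settings listed in this notebook cell (AWQ quantization, scale estimation, the wikitext2 calibration dataset) correspond to weight-compression options exposed by optimum-intel. Below is a minimal sketch of how such a configuration could be applied when exporting AFM-4.5B to OpenVINO; it assumes a recent optimum-intel release that exposes these parameters, and the output directory name is illustrative.

# Sketch only: 4-bit weight compression with AWQ and scale estimation,
# calibrated on wikitext2, mirroring the parameters listed in the cell above.
from optimum.intel import OVModelForCausalLM, OVWeightQuantizationConfig

quant_config = OVWeightQuantizationConfig(
    bits=4,                  # INT4 weight compression
    quant_method="awq",      # activation-aware weight quantization
    scale_estimation=True,   # refine quantization scales on calibration data
    dataset="wikitext2",     # calibration dataset name
)

# export=True converts the Hugging Face checkpoint to OpenVINO IR on the fly.
ov_model = OVModelForCausalLM.from_pretrained(
    "arcee-ai/AFM-4.5B",
    export=True,
    quantization_config=quant_config,
)
ov_model.save_pretrained("AFM-4.5B-int4-ov")  # illustrative output directory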

notebooks/llm-chatbot/llm-chatbot.ipynb

Lines changed: 1 addition & 0 deletions

@@ -324,6 +324,7 @@
 " * quant_method: **AWQ**\n",
 " * scale_estimation: **True**\n",
 " * dataset: **wikitext2**\n",
+"* **AFM-4.5B** - AFM-4.5B is a 4.5 billion parameter instruction-tuned model developed by Arcee.ai, designed for enterprise-grade performance across diverse deployment environments from cloud to edge. The base model was trained on a dataset of 8 trillion tokens, comprising 6.5 trillion tokens of general pretraining data followed by 1.5 trillion tokens of midtraining data with enhanced focus on mathematical reasoning and code generation. Following pretraining, the model underwent supervised fine-tuning on high-quality instruction datasets. The instruction-tuned model was further refined through reinforcement learning on verifiable rewards as well as for human preference. You can find more info in [model card](https://huggingface.co/arcee-ai/AFM-4.5B).\n",
 " </detals>\n"
 ]
 },

utils/llm_config.py

Lines changed: 5 additions & 0 deletions

@@ -494,6 +494,11 @@ def qwen_completion_to_prompt(completion):
             "stop_tokens": ["<|im_end|>", "<|endoftext|>"],
             "completion_to_prompt": qwen_completion_to_prompt,
         },
+        "afm-4.5b": {
+            "model_id": "arcee-ai/AFM-4.5B",
+            "remote_code": False,
+            "start_message": DEFAULT_SYSTEM_PROMPT,
+        },
     },
     "Chinese": {
         "minicpm4-8b": {"model_id": "openbmb/MiniCPM4-8B", "remote_code": True, "start_message": DEFAULT_SYSTEM_PROMPT_CHINESE},
