Qwen3 sft collab #2355
Conversation
Signed-off-by: Vladimir Suvorov <[email protected]>
Force-pushed from 68582a2 to e1cb7e4
"print(f\"MaxText Home directory (from Python): {MAXTEXT_REPO_ROOT}\")\n",
"\n",
"DEBUG = False # set to True to run in debug mode, for more print statements\n",
"#set this to the path of the checkpoint you want to load, gs:// supported \n",
Lines 118-121 are confusing. Let's simplify like this:
# Case 1: Set `MODEL_CHECKPOINT_PATH` to GCS path that already has `Qwen3-0.6B` model checkpoint
# Case 2: If you do not have the checkpoint, then do not update `MODEL_CHECKPOINT_PATH`
# and this colab will download the checkpoint from HF and store at `"{MAXTEXT_REPO_ROOT}/qwen_checkpoint\"`
"MODEL_CHECKPOINT_PATH = f\"{MAXTEXT_REPO_ROOT}/qwen_checkpoint\""
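The two cases above could be captured in a small helper. This is only a sketch of the suggested logic; `resolve_checkpoint`, the default path, and the example paths are assumptions, not notebook code:

```python
def resolve_checkpoint(model_checkpoint_path: str, default_path: str):
    """Return (path, download_from_hf) following the two cases above.

    Case 1: MODEL_CHECKPOINT_PATH points at a GCS path that already holds
            the Qwen3-0.6B checkpoint -> use it directly, no download.
    Case 2: MODEL_CHECKPOINT_PATH was left at the default -> download the
            checkpoint from HF into that directory first.
    """
    if model_checkpoint_path != default_path:
        return model_checkpoint_path, False
    return default_path, True

# Illustrative usage (paths are hypothetical):
path, download = resolve_checkpoint(
    "gs://my-bucket/qwen3-0.6b", "/content/maxtext/qwen_checkpoint"
)
```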
Also, have you tried setting MODEL_CHECKPOINT_PATH to a GCS location? Do you see any permission issues connecting to GCS?
"source": [
"# This is the command to convert the HF model to the MaxText format \n",
"# You may omit it if you already have a checkpoint\n",
"!python3 -m MaxText.utils.ckpt_conversion.to_maxtext \\\n",
This command should only run when `MODEL_CHECKPOINT_PATH = f"{MAXTEXT_REPO_ROOT}/qwen_checkpoint"`. Put it behind a flag.
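One way to gate the conversion cell behind such a flag, as a sketch: `RUN_CONVERSION` is an assumed name, the repo root is a placeholder, and in the notebook the guarded command would be the `!python3 -m MaxText.utils.ckpt_conversion.to_maxtext ...` line itself:

```python
MAXTEXT_REPO_ROOT = "/content/maxtext"  # assumed; the notebook sets this earlier
MODEL_CHECKPOINT_PATH = f"{MAXTEXT_REPO_ROOT}/qwen_checkpoint"

# Only convert when the user kept the local default path,
# i.e. no pre-converted checkpoint was supplied.
RUN_CONVERSION = MODEL_CHECKPOINT_PATH == f"{MAXTEXT_REPO_ROOT}/qwen_checkpoint"

if RUN_CONVERSION:
    # In the notebook, the conversion command goes here, e.g.:
    # !python3 -m MaxText.utils.ckpt_conversion.to_maxtext ...
    print("Converting HF checkpoint to MaxText format...")
else:
    print(f"Skipping conversion; using checkpoint at {MODEL_CHECKPOINT_PATH}")
```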
" \"dtype=bfloat16\",\n",
" \"hf_path=HuggingFaceH4/ultrachat_200k\", # HuggingFace dataset/model if needed\n",
" f\"hf_access_token={HF_TOKEN}\",\n",
" \"base_output_directory=/tmp/maxtext_qwen06\",\n",
Can we set base_output_directory to a GCS path, so that users can access the fine-tuned checkpoint? Also, add a print statement after train() telling users that they can find the fine-tuned checkpoint at this path.
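A minimal sketch of such a message, assuming `base_output_directory` and `run_name` as in typical MaxText configs; the helper name `report_checkpoint_location` is hypothetical:

```python
def report_checkpoint_location(base_output_directory: str, run_name: str = "") -> str:
    """Build the message pointing users at their fine-tuned checkpoint."""
    if run_name:
        path = f"{base_output_directory.rstrip('/')}/{run_name}"
    else:
        path = base_output_directory
    return f"Fine-tuned checkpoint available at: {path}"

# After train() completes in the notebook (bucket name is illustrative):
print(report_checkpoint_location("gs://my-bucket/maxtext_qwen06", "sft_run"))
```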
Description
Qwen3 SFT Colab - an SFT Colab notebook with Qwen3-0.6B that can run on a public Colab 5e-1
Tests
Run the Colab. Try it yourself =)
Checklist
Before submitting this PR, please make sure (put an X in the square brackets):