Drop the deprecated binary format. #11307

trivialfis · 2025-03-04T19:36:53Z

Close #7547.

Drop support for the deprecated binary format.
Add compatibility tests for categorical features.
Add compatibility tests for AFT survival training.
Use the same set of models for Python and R tests.

todos:

Test error handling.
Warning messages consistency.
New test models.
Update R tests for using the same set of models.

Models
xgboost_model_compatibility_tests-3.0.2.zip

Remove. Basic model test. Cleanup. cli. adaptive test.

trivialfis · 2025-07-11T06:00:47Z

@hcho3 Could you please help re-upload the test models to s3, with binary models removed? Also, the RMM build is failing ;-(

hcho3 · 2025-07-11T17:37:23Z

@trivialfis Should we also remove RDS model files from xgboost_r_model_compatibility_test.zip ? It's probably using the legacy binary format.

trivialfis · 2025-07-11T17:44:23Z

Please do, @hcho3.

hcho3 · 2025-07-11T20:29:27Z

Done

trivialfis · 2025-07-12T06:38:55Z

Thank you!

trivialfis · 2025-07-12T10:56:02Z

@hcho3 Could you please share the latest download link?

trivialfis · 2025-07-12T13:44:17Z

Let me generate some new models using 3.0 instead.

trivialfis · 2025-07-13T07:25:38Z

Models generated by the new script using 3.0.2 and 2.1.4:
xgboost_model_compatibility_test.zip

hcho3 · 2025-07-13T09:55:09Z

Are we dropping compatibility for JSON models from XGBoost 1.x?

As for the download link, I uploaded the zip file to the same s3 bucket as before. Were you not able to access it?

trivialfis · 2025-07-13T11:06:08Z

Let me work on tests for some older models.

trivialfis · 2025-07-13T15:19:54Z

@hcho3 I'm running into 403 forbidden using the old URL: https://productionresultssa16.blob.core.windows.net/actions-results/9dfffb2c-1a7f-4d1b-9f58-01d24823ae27/workflow-job-run-22352302-b8b0-53f6-8324-ec1e5ebbad73/logs/job/job-logs.txt?rsct=text%2Fplain&se=2025-07-13T15%3A29%3A15Z&sig=IFsfl5ivJrYF2mMcSfGKK2wHuCwCVhepqX1JnzM2b2w%3D&ske=2025-07-14T03%3A03%3A48Z&skoid=ca7593d4-ee42-46cd-af88-8b886a2f84eb&sks=b&skt=2025-07-13T15%3A03%3A48Z&sktid=398a6654-997b-47e9-b12b-9515b896b4de&skv=2025-05-05&sp=r&spr=https&sr=b&st=2025-07-13T15%3A19%3A10Z&sv=2025-05-05

trivialfis · 2025-07-14T06:09:00Z

Tests on CRAN are also failing: https://www.r-project.org/nosvn/R.check/r-devel-linux-x86_64-debian-gcc/xgboost-00check.html

hcho3 · 2025-07-14T08:20:28Z

My bad, let me change the access setting.

trivialfis · 2025-07-15T07:50:10Z

I uploaded new models. After this PR, Python and R will test with the same set of models. The old models are still in S3, just in case we need them. Also, we can't remove the R test models until the CRAN package is updated.

Copilot

Pull Request Overview

This PR removes support for the deprecated binary model format and fully transitions to JSON and UBJSON serialization. It also adds compatibility tests for categorical features and AFT survival, and standardizes the set of test models used in both Python and R.

Remove legacy Load/Save implementations for the old binary format across C++ code.
Update CLI and C API to recognize .ubj and .json extensions exclusively.
Refactor Python and R tests to cover categorical features, AFT survival, and use a unified model set.
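As a rough illustration of the extension-based dispatch described in the bullets above, here is a minimal Python sketch. The helper name `model_format_from_path`, the `SUPPORTED_FORMATS` table, and the error message are hypothetical; this is an analogy for accepting only `.json` and `.ubj`, not the code in `src/cli_main.cc`.

```python
from pathlib import Path

# Hypothetical mapping for illustration: only the two extensions named in the
# review summary are accepted.
SUPPORTED_FORMATS = {".json": "json", ".ubj": "ubj"}


def model_format_from_path(path: str) -> str:
    """Return the serialization format implied by the file extension."""
    ext = Path(path).suffix
    try:
        return SUPPORTED_FORMATS[ext]
    except KeyError:
        # Anything else (including the removed binary format) is rejected.
        raise ValueError(
            f"Unsupported model file extension {ext!r}; expected .json or .ubj"
        ) from None
```

Raising on unknown extensions, rather than falling back to a default format, keeps a stale binary model from being silently misread.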

Reviewed Changes

Copilot reviewed 29 out of 29 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/learner.cc | Dropped old binary serialization code and updated deprecation warnings. |
| src/cli_main.cc | Updated CLI to save/load `.ubj` and `.json` formats only. |
| tests/python/test_model_compatibility.py | Refactored compatibility test harness and download logic. |
| tests/python/generate_models.py | Extended model generator with categorical and AFT survival cases. |

Comments suppressed due to low confidence (2)

tests/python/test_model_compatibility.py:13

The test uses xgboost.Booster and other xgboost APIs but does not import the xgboost module. Add import xgboost (or alias) at the top of the file to avoid NameError during test execution.

from xgboost import testing as tm

src/cli_main.cc:347

[nitpick] This line is misaligned in the else if (ext == "ubj") block. Adjust the indentation to match the surrounding 6-space indent for consistency with project style.

      learner->LoadModel(in);

tests/python/test_model_compatibility.py

trivialfis · 2025-07-15T19:09:18Z

@hcho3 Could you please help take a look when you are available?

trivialfis · 2025-07-16T06:53:47Z

I need to let the R check skip the compatibility check when the link is down to make CRAN tests more resilient.
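One way to make a download-dependent test resilient is to treat a failed download as "skip" rather than "fail". The Python sketch below shows the idea; the function name `fetch_models_or_none` and its `opener` parameter are assumptions for illustration, and the actual R check may be implemented differently.

```python
import urllib.error
import urllib.request
from typing import Callable, Optional


def fetch_models_or_none(
    url: str,
    opener: Callable = urllib.request.urlopen,
    timeout: float = 30.0,
) -> Optional[bytes]:
    """Return the downloaded bytes, or None if the link is unreachable."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.read()
    except OSError:
        # URLError, HTTPError (e.g. 403), and socket timeouts are all
        # OSError subclasses, so a dead or forbidden link lands here.
        return None
```

A test could call this helper and skip itself (for example with `pytest.skip`) when the result is `None`, so an unavailable link does not show up as a test failure.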

hcho3

Left some questions, but otherwise looks good

hcho3 · 2025-07-17T00:39:51Z

tests/python/generate_models.py

-def write_versions():
-    versions = {'numpy': np.__version__,
-                'xgboost': version}
-    with open(os.path.join(target_dir, 'version'), 'w') as fd:
-        fd.write(str(versions))


Why are we dropping write_versions()? Is it because the file name for each model artifact already contains the version number?

Yes. Also, the models we test come from multiple versions (1-3).
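If the version lives in the artifact's file name, a test can recover it without a separate `version` file. The sketch below assumes a hypothetical naming scheme of the form `<prefix>-<major>.<minor>.<patch>.<rest>`; the real file names produced by `generate_models.py` may differ, and `version_from_filename` is an illustrative helper only.

```python
import re
from typing import Tuple

# Hypothetical naming scheme: "<prefix>-<major>.<minor>.<patch>.<rest>"
_VERSION_RE = re.compile(r"-(\d+)\.(\d+)\.(\d+)\.")


def version_from_filename(name: str) -> Tuple[int, int, int]:
    """Extract (major, minor, patch) from a model artifact file name."""
    match = _VERSION_RE.search(name)
    if match is None:
        raise ValueError(f"No version found in file name: {name!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch
```

Returning a tuple of integers makes it straightforward to branch on the major version when models from several releases are tested together.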

hcho3 · 2025-07-17T00:52:11Z

src/cli_main.cc

+    auto ext = common::FileExtension(path);
+    auto read_file = [&]() {
+      auto str = common::LoadSequentialFile(path);
+      CHECK_GE(str.size(), 3);  // "{}\0"


Does common::LoadSequentialFile return a string with \0? I thought the terminating \0 wouldn't be part of the string?

It returns a vector of char as it consumes binary inputs. I removed the requirement for \0 and added a small test for the JSON parser.
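To illustrate the point in Python terms: a buffer read from disk carries its own length, so a JSON parser does not need a trailing `\0`. The sketch below is an analogy only; `parse_model_bytes` is a hypothetical helper, not the C++ code path discussed here. It requires only the two bytes of `{}`.

```python
import json
from typing import Any, Dict


def parse_model_bytes(raw: bytes) -> Dict[str, Any]:
    """Parse a JSON object from a length-delimited byte buffer."""
    # The smallest valid JSON object is "{}", i.e. two bytes; no terminator needed.
    if len(raw) < 2:
        raise ValueError("Input is too short to be a JSON object")
    doc = json.loads(raw.decode("utf-8"))
    if not isinstance(doc, dict):
        raise ValueError("Expected a JSON object at the top level")
    return doc
```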

trivialfis mentioned this pull request Mar 27, 2025

base_score is redundant for multinomial logistic #11374

Open

trivialfis force-pushed the drop-binary branch from 46cbb4a to 5b3da6c Compare April 8, 2025 11:24

trivialfis force-pushed the drop-binary branch from 3127c66 to 192edc5 Compare May 29, 2025 08:53

[WIP] Drop the deprecated binary format.

36f7efb

Remove. Basic model test. Cleanup. cli. adaptive test.

trivialfis force-pushed the drop-binary branch from 192edc5 to 36f7efb Compare July 10, 2025 10:06

trivialfis added 4 commits July 10, 2025 19:47

Remove java tests.

b2e4a49

cli.

8553a60

python.

9eba6e5

test_io.R.

6712d4c

trivialfis mentioned this pull request Jul 10, 2025

Proposal for new R interface (discussion thread) #9734

Open

trivialfis added 3 commits July 11, 2025 13:01

Fixes.

683cb2d

Demo tests.

269b97d

lint.

2129a59

trivialfis added 3 commits July 11, 2025 15:10

Update document.

c2616d0

Remove old war.

d5ee377

cleanup.

f09b9c4

trivialfis added 2 commits July 12, 2025 18:47

Cleanup.

7e6ed30

Test invalid format.

127fd1a

Test warnings.

64b8bc2

trivialfis requested a review from Copilot July 12, 2025 11:03

This comment was marked as outdated.


trivialfis added 2 commits July 13, 2025 14:33

Update the model generation script.

7e96d6b

More checks.

b5ab7e3

trivialfis added 2 commits July 13, 2025 15:31

error messages.

c1d5c5b

Lint.

07e95b8

trivialfis added 2 commits July 13, 2025 23:05

Check aft parameters.

b417282

kforest.

7709684

trivialfis added 2 commits July 13, 2025 23:21

No need to write version.

9509d3e

small cleanup.

87067fa

trivialfis added 5 commits July 15, 2025 14:36

Replace old models.

89ac611

Replace R models.

4d85f37

fix tests.

3f03adf

Update.

60fc426

Update message.

7eb21f2

trivialfis changed the title ~~[WIP] Drop the deprecated binary format.~~ Drop the deprecated binary format. Jul 15, 2025

trivialfis requested a review from Copilot July 15, 2025 08:38

trivialfis marked this pull request as ready for review July 15, 2025 08:38

trivialfis requested a review from hcho3 July 15, 2025 08:38

Copilot AI reviewed Jul 15, 2025


Catch download error.

5109137

hcho3 reviewed Jul 17, 2025


Remove the null termination requirement.

0484708

trivialfis merged commit 29ada72 into dmlc:master Jul 17, 2025
78 of 82 checks passed

trivialfis deleted the drop-binary branch July 17, 2025 16:52
