apply_model/model_export_as_cpp_code_tutorial.md
Lines changed: 3 additions & 15 deletions
@@ -5,25 +5,22 @@ Catboost model could be saved as standalone C++ code. This can ease an integrati
5
5
6
6
The exported model code contains complete data for the current trained model and *apply_catboost_model()* function which applies the model to a given dataset. The only current dependency for the code is [CityHash library](https://github.com/google/cityhash/tree/00b9287e8c1255b5922ef90e304d5287361b2c2a) (NOTE: The exact revision under the link is required).
7
7
8
-
9
-
### Exporting from Catboost application via command line interface:
8
+
### Exporting from Catboost application via command line interface
10
9
11
10
```bash
12
11
catboost fit --model-format CPP <other_fit_parameters>
13
12
```
14
13
15
14
By default the model is saved into the *model.cpp* file. You can change the output name using the *-m* key. If more than one model-format is specified, the *.cpp* extension is added to the name provided after the *-m* key.
16
15
17
-
18
-
### Exporting from Catboost python library interface:
16
+
### Exporting from Catboost python library interface
If the model was trained using only numerical features (no cat features), then the application function in generated code will have the following interface:
@@ -32,14 +29,12 @@ If the model was trained using only numerical features (no cat features), then t
C++11 support of non-static data member initializers and extended initializer lists
60
55
61
-
62
56
## Models trained with Categorical features
63
57
64
58
If the model was trained with categorical features present, then the application function in output code will be generated with the following interface:
@@ -67,7 +61,6 @@ If the model was trained with categorical features present, then the application
NOTE: You need to pass float and categorical features separately in the same order they appeared in the train dataset. For example if you had features f1,f2,f3,f4, where f2 and f4 were considered categorical, you need to pass here floatFeatures = {f1, f3}, catFeatures = {f2, f4}.
79
72
80
-
81
73
### Return value
82
74
83
75
Prediction of the model for the document with given features.
C++14 compiler with aggregate member initialization support. Tested compilers: g++ 5(5.4.1 20160904), clang++ 3.8.
99
90
100
-
101
91
## Current limitations
102
92
103
-
- MultiClassification models are not supported.
104
93
- The applyCatboostModel() function is a reference implementation and may be slower than the native CatBoost applicator, especially on large models and on multiple documents.
105
-
94
+
- [Text](https://catboost.ai/en/docs/features/text-features) and [Embeddings](https://catboost.ai/en/docs/features/embeddings-features) features are not supported.
106
95
107
96
## Troubleshooting
108
97
109
98
Q: Generated model results differ from the native model when categorical features are present
110
99
A: Please check that CityHash version 1 is used. The exact required revision is linked here: [C++ Google CityHash library](https://github.com/Amper/cityhash/tree/4f02fe0ba78d4a6d1735950a9c25809b11786a56). There is also a proper CityHash implementation in the [Catboost repository](https://github.com/catboost/catboost/blob/master/util/digest/city.h). This matters because other versions of CityHash may produce a different hash code for the same string.
apply_model/model_export_as_python_code_tutorial.md
Lines changed: 17 additions & 14 deletions
@@ -5,25 +5,22 @@ Catboost model could be saved as standalone Python code. This can ease an integr
5
5
6
6
The exported model code contains complete data for the current trained model and *apply_catboost_model()* function which applies the model to a given dataset. The only current dependency for the code is [CityHash library](https://github.com/Amper/cityhash/tree/4f02fe0ba78d4a6d1735950a9c25809b11786a56).
7
7
8
-
9
-
### Exporting from Catboost application via command line interface:
8
+
### Exporting from Catboost application via command line interface
10
9
11
10
```bash
12
11
catboost fit --model-format Python <other_fit_parameters>
13
12
```
14
13
15
14
By default the model is saved into the *model.py* file. You can change the output name using the *-m* key. If more than one model-format is specified, the *.py* extension is added to the name provided after the *-m* key.
16
15
17
-
18
-
### Exporting from Catboost python library interface:
16
+
### Exporting from Catboost python library interface
If the model was trained using only numerical features (no cat features), then the application function in generated code will have the following interface:
@@ -32,19 +29,16 @@ If the model was trained using only numerical features (no cat features), then t
| float_features | list of int or float values| features of a single document to make prediction |
41
37
42
-
43
38
### Return value
44
39
45
40
Prediction of the model for the document with given features, equivalent to CatBoost().predict(prediction_type='RawFormulaVal').
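
Because the exported function returns a raw formula value, a binary classifier's output still has to be mapped to a probability by the caller. The sketch below assumes the model was trained with the Logloss objective, in which case the probability is the logistic sigmoid of the raw value. The helper name `raw_to_probability` is hypothetical and is not part of the exported code.

```python
import math


def raw_to_probability(raw_value):
    """Map a raw formula value to a probability with the logistic sigmoid.

    Hypothetical helper; assumes a binary classifier trained with Logloss.
    """
    # Branch on the sign so that exp() is only called on non-positive
    # arguments, which avoids OverflowError for large negative inputs.
    if raw_value >= 0:
        return 1.0 / (1.0 + math.exp(-raw_value))
    exp_value = math.exp(raw_value)
    return exp_value / (1.0 + exp_value)
```

For regression models no conversion is needed, since the raw value is already the prediction.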
46
41
47
-
48
42
## Models trained with Categorical features
49
43
50
44
If the model was trained with categorical features present, then the application function in output code will be generated with the following interface:
@@ -53,7 +47,6 @@ If the model was trained with categorical features present, then the application
NOTE: You need to pass float and categorical features separately in the same order they appeared in the train dataset. For example if you had features f1,f2,f3,f4, where f2 and f4 were considered categorical, you need to pass here float_features=[f1,f3], cat_features=[f2,f4].
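
The ordering rule in the note can be sketched as a small helper that splits one mixed feature row into the two lists. The name `split_features` and the idea of passing categorical column indices are assumptions made for illustration; the exported code itself only expects the two lists.

```python
def split_features(row, cat_feature_indices):
    """Split one document's features into (float_features, cat_features).

    Hypothetical helper: both output lists keep the column order of the
    training dataset, as the exported function requires.
    """
    cat_positions = set(cat_feature_indices)
    float_features = [v for i, v in enumerate(row) if i not in cat_positions]
    cat_features = [v for i, v in enumerate(row) if i in cat_positions]
    return float_features, cat_features


# Features f1, f2, f3, f4 where f2 and f4 (indices 1 and 3) are categorical.
float_features, cat_features = split_features([0.5, "red", 3, "large"], [1, 3])
# The two lists would then be passed to the exported function, e.g.
# apply_catboost_model(float_features, cat_features), assuming it has been
# imported from the generated model file.
```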
65
58
66
-
67
59
### Return value
68
60
69
61
Prediction of the model for the document with given features, equivalent to CatBoost().predict(prediction_type='RawFormulaVal').
70
62
71
-
72
63
## Current limitations
73
-
- MultiClassification models are not supported.
74
-
- The apply_catboost_model() function is a reference implementation and may be slower than the native CatBoost applicator, especially on large models and on multiple documents.
75
64
65
+
- The apply_catboost_model() function is a reference implementation and may be slower than the native CatBoost applicator, especially on large models and on multiple documents.
66
+
- [Text](https://catboost.ai/en/docs/features/text-features) and [Embeddings](https://catboost.ai/en/docs/features/embeddings-features) features are not supported.
76
67
77
68
## Troubleshooting
78
69
79
70
Q: Generated model results differ from the native model when categorical features are present
80
-
A: Please check that CityHash version 1 is used. The exact required revision is linked here: [Python CityHash library](https://github.com/Amper/cityhash/tree/4f02fe0ba78d4a6d1735950a9c25809b11786a56). There is also a proper CityHash implementation in the [Catboost repository](https://github.com/catboost/catboost/tree/master/library/python/cityhash). This matters because other versions of CityHash may produce a different hash code for the same string.
71
+
A: Please check that CityHash version 1 is used. The exact required revision is linked here: [Python CityHash library](https://github.com/Amper/cityhash/tree/4f02fe0ba78d4a6d1735950a9c25809b11786a56). There is also a proper CityHash implementation in the [Catboost repository](https://github.com/catboost/catboost/tree/master/library/python/cityhash). This matters because other versions of CityHash may produce a different hash code for the same string. One option is to use the library [clickhouse-cityhash](https://pypi.org/project/clickhouse-cityhash/):
72
+
73
+
```python
74
+
from clickhouse_cityhash.cityhash import CityHash64