Feature model statistics #925

seredenkodenis · 2025-05-26T13:19:13Z

No description provided.

…k with the different domains more easily. Fixed averageDomainOverlap calculations to work correctly across bool, int and int set domains. Removed parts currently omitted by the training (floats are currently omitted).

The user can now constrain the dimensions of the constraint graph inside the feature vector: the graph is cropped if it grows beyond the allowed size and padded if it stays below it. The default is still autoResize. An option to ignore floats was added but is not yet implemented; the default is still to ignore all floats.
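The crop-or-pad behaviour described above can be sketched roughly as follows. This is only an illustration, not the code from the PR: `Matrix`, `GraphLimits` and `fit_to_limits` are hypothetical names, and it assumes the constraint graph is stored as a dense constraints-by-variables matrix that is padded with zeros.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Hypothetical options: std::nullopt means "auto-resize" for that dimension,
// i.e. keep whatever size the graph already has.
struct GraphLimits {
  std::optional<std::size_t> constraints;  // number of rows
  std::optional<std::size_t> variables;    // number of columns
};

// Returns a copy of `graph` with exactly the requested dimensions.
// Rows/columns beyond a limit are cropped; missing ones are zero-padded.
Matrix fit_to_limits(const Matrix& graph, const GraphLimits& limits) {
  const std::size_t in_rows = graph.size();
  const std::size_t in_cols = graph.empty() ? 0 : graph[0].size();
  for (const auto& row : graph) {
    assert(row.size() == in_cols);  // the input must be rectangular
  }
  const std::size_t rows = limits.constraints.value_or(in_rows);
  const std::size_t cols = limits.variables.value_or(in_cols);

  Matrix out(rows, std::vector<int>(cols, 0));  // zero padding by default
  for (std::size_t r = 0; r < rows && r < in_rows; ++r) {
    for (std::size_t c = 0; c < cols && c < in_cols; ++c) {
      out[r][c] = graph[r][c];
    }
  }
  return out;
}
```

Keeping the two limits as separate optional values lets a caller fix one dimension while leaving the other on auto-resize, which gives a trained model a fixed input width without truncating the other axis.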

guidotack · 2025-09-22T06:48:37Z

This looks interesting. We would probably have to make some cosmetic changes in order to merge it back into the main repository (the main one being copyright headers). Can you provide a bit more information on this project?

Zylence · 2025-09-22T17:49:39Z

> This looks interesting. We would probably have to make some cosmetic changes in order to merge it back into the main repository (the main one being copyright headers). Can you provide a bit more information on this project?

Hi @guidotack, sorry for the ambush; the PR should not have been opened just yet, my colleague @seredenkodenis was a little overzealous here ;). It definitely requires some touch-ups, maybe unit tests. If you want, we can move the discussion for this into an issue/feature request.

Of course, we'll provide some information on what we did here:

We are working on a paper that aims to show whether a pretrained model that finds static variable orderings can outperform common search heuristics such as first_fail, dom_w_deg, etc. For this, we needed to extract a dataset of structural features from the MiniZinc files. It felt natural to integrate feature extraction into the compiler, as it gives us direct access to the AST, so we do not need some weird regex mess on top of the MiniZinc files (hence we forked it).

In addition, this approach would allow adding the "ai_heursic" back into the project easily, if the maintainers want that, once a good pretrained model has been created. It would also allow others to extract features to train models or for any other kind of workload.

The integration is minimally invasive and, hopefully, not too hackish.

So much for the background information.

The paper is not yet done, but current results are promising.

Zylence and others added 24 commits July 6, 2024 14:52

experiments

18604da

added some mean and standard deviation math helpers to utils.hh

8625df4
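For readers unfamiliar with these helpers, a minimal sketch of what mean and standard deviation functions could look like is given below. The names and signatures are assumptions for illustration and are not taken from utils.hh; the sketch computes the population standard deviation (dividing by n) and accumulates in `double`, so the result is not truncated to an integer.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: arithmetic mean of integer values, returned as double
// so that no fractional part is lost.
double mean(const std::vector<long long>& values) {
  assert(!values.empty());
  double sum = 0.0;
  for (long long v : values) {
    sum += static_cast<double>(v);
  }
  return sum / static_cast<double>(values.size());
}

// Hypothetical helper: population standard deviation (divides by n, not n - 1).
double std_dev(const std::vector<long long>& values) {
  const double m = mean(values);
  double squared_diffs = 0.0;
  for (long long v : values) {
    const double d = static_cast<double>(v) - m;
    squared_diffs += d * d;
  }
  return std::sqrt(squared_diffs / static_cast<double>(values.size()));
}
```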

added new fields to FlatModelStatistics

9d35499

specifically:
- std_dev_domain_size
- avg_domain_size
- median_domain_size
- avg_domain_overlap
- n_disjoint_domain_pairs
- n_total_ct

some of which are pointers, because I wanted to use a value outside of their valid value range as default.
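As an aside on the "value outside the valid range" default: one alternative to raw pointers is `std::optional` (C++17), which can represent "not computed" without a sentinel. The sketch below is only an illustration of that idea, not what the PR does; the struct `DomainStats` and the function `render` are hypothetical, and the field types are assumptions.

```cpp
#include <optional>
#include <string>

// Hypothetical struct: the field names follow the commit message above,
// the types are assumptions.
struct DomainStats {
  std::optional<double> avg_domain_size;
  std::optional<double> avg_domain_overlap;
  std::optional<long long> n_disjoint_domain_pairs;
};

// Renders a statistic as text, or "null" if it was never computed.
std::string render(const std::optional<double>& stat) {
  return stat.has_value() ? std::to_string(*stat) : "null";
}
```

A default-constructed `DomainStats` has every field empty, so no magic number has to be reserved as the "unset" marker.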

Added calculations for the new fields of FlatModelStatistics and adde…

972bdab

…d them to the output

Added new cmdline option for feature-vector extraction: --feature-vector

240eca9

Introduced feature vector struct which will take all the new stats th…

7b4765b

…at were previously put in flat model statistics. Moved the calculations and output of the stats to the feature vector.

Removed potentially truncating cast from mean helper function.

30a9246

added a constraint graph as easy representation for the relations bet…

90322ae

…ween decision variables and constraints.
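One simple way to represent such a constraint graph is a 0/1 incidence matrix with one row per constraint and one column per decision variable. The sketch below assumes that representation; the type `Constraint` and both functions are hypothetical names, not the PR's API. It also shows how an "average number of decision variables per constraint" metric could be derived from the matrix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical input: each constraint lists the indices of the decision
// variables it mentions.
using Constraint = std::vector<std::size_t>;

// Builds a constraints x variables incidence matrix (1 = variable is used).
std::vector<std::vector<int>> build_constraint_graph(
    const std::vector<Constraint>& constraints, std::size_t n_vars) {
  std::vector<std::vector<int>> graph(constraints.size(),
                                      std::vector<int>(n_vars, 0));
  for (std::size_t i = 0; i < constraints.size(); ++i) {
    for (std::size_t var : constraints[i]) {
      assert(var < n_vars);  // variable indices must be in range
      graph[i][var] = 1;
    }
  }
  return graph;
}

// Average number of distinct decision variables per constraint.
// Returns 0.0 for a graph without constraints.
double avg_vars_per_constraint(const std::vector<std::vector<int>>& graph) {
  if (graph.empty()) {
    return 0.0;
  }
  double used = 0.0;
  for (const auto& row : graph) {
    for (int cell : row) {
      used += cell;
    }
  }
  return used / static_cast<double>(graph.size());
}
```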

added constraint histogram and annotation histogram

47e4a45

added average decision variables used in constraints as metric to fea…

9b654a6

…ture vector.

Added meta constraints counter to feature vector. meta constraints ar…

2f1eeb8

…e constraints that use at least two other constraints. We measure this indirectly in FlatZinc by checking if a call uses more than two defined_vars that were not defined by itself.

Added methods to StatisticsStream to print an array or map as json an…

756d4b4

…d to stdout and made it a little more flexible by allowing custom prefixes and output end markers.
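A rough sketch of JSON printing with a configurable prefix and end marker is shown below. It does not reproduce the StatisticsStream interface: the three functions are hypothetical, and the map version assumes keys that need no JSON escaping (for example constraint names).

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Serializes numeric values as a JSON array, e.g. [1.5, 2].
std::string array_to_json(const std::vector<double>& values) {
  std::ostringstream out;
  out << '[';
  for (std::size_t i = 0; i < values.size(); ++i) {
    if (i > 0) out << ", ";
    out << values[i];
  }
  out << ']';
  return out.str();
}

// Serializes a histogram as a JSON object.
// Assumption: keys contain no characters that require escaping.
std::string map_to_json(const std::map<std::string, int>& histogram) {
  std::ostringstream out;
  out << '{';
  bool first = true;
  for (const auto& entry : histogram) {
    if (!first) out << ", ";
    first = false;
    out << '"' << entry.first << "\": " << entry.second;
  }
  out << '}';
  return out.str();
}

// Wraps a serialized value in a caller-chosen prefix and end marker.
std::string stat_line(const std::string& prefix, const std::string& name,
                      const std::string& json, const std::string& end_marker) {
  assert(!name.empty());
  return prefix + name + "=" + json + end_marker;
}
```

Passing the prefix and end marker as parameters means the same serializer can emit lines in the usual statistics format or in whatever format a downstream training pipeline expects.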

Customized usage of StatisticsStream for FeatureVector. All calculate…

2157b72

…d members of the FeatureVector are now output. Added some more "diagnostic" members to the FeatureVector.

fixed an index out of bounds in is_var_defined_by_call. Further now f…

bd56a12

…ully ignoring floats for feature vector.

Now outputting the individual domain widths as array. array indices m…

4ea78b0

…atch keys in idToVarNameMap.

Fixed nullpointer in feature extraction.

8e66bf6

better json serialization of numeric values for maps and arrays

2fbfd32

fixed nullpointer in feature vector extraction that occurred when used…

1fcf165

… with two-pass compiler option

feature vector extraction now allows for constraints and variable lim…

8d5036e

…its of constraint graph to be configured separately.

Found a rare case where domain retrieved from int is nullptr. happens…

b1c067a

… in warehouse model, but only with no optimization flag. A workaround for feature extraction was added, but whether this is a bug should be discussed.

fixed an include in feature_extraction.cpp

d60cc6d

* fixing problems during build for macos

1ab88b8
