seredenkodenis

No description provided.

Zylence and others added 24 commits July 6, 2024 14:52
specifically:
- std_dev_domain_size
- avg_domain_size
- median_domain_size
- avg_domain_overlap
- n_disjoint_domain_pairs
- n_total_ct

Some of these are pointers, because I wanted to use a value outside of their valid value range as the default.
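As a rough illustration of what the domain statistics listed above could measure, here is a minimal Python sketch. It is not the code from this PR: it assumes every domain is available as a finite set of integers (a bool domain being `{0, 1}`), it assumes overlap means the Jaccard index of two domains, and it uses `None` as the "no value yet" default in place of a null pointer. The function name `domain_stats` is hypothetical.

```python
from itertools import combinations
from statistics import mean, median, pstdev
from typing import Optional


def jaccard(a: set[int], b: set[int]) -> float:
    """Overlap of two domains as |a & b| / |a | b| (assumed definition)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def domain_stats(domains: list[set[int]]) -> dict[str, Optional[float]]:
    # None marks "not computed", a value outside every statistic's valid range.
    stats: dict[str, Optional[float]] = {
        "avg_domain_size": None,
        "std_dev_domain_size": None,
        "median_domain_size": None,
        "avg_domain_overlap": None,
        "n_disjoint_domain_pairs": None,
    }
    if not domains:
        return stats

    sizes = [len(d) for d in domains]
    stats["avg_domain_size"] = mean(sizes)
    stats["std_dev_domain_size"] = pstdev(sizes)  # population std. deviation
    stats["median_domain_size"] = median(sizes)

    pairs = list(combinations(domains, 2))
    if pairs:
        stats["avg_domain_overlap"] = mean(jaccard(a, b) for a, b in pairs)
        stats["n_disjoint_domain_pairs"] = sum(1 for a, b in pairs if not a & b)
    return stats
```

Calling `domain_stats([{1, 2, 3}, {2, 3, 4}, {7, 8}])` yields two disjoint pairs, and only the first pair contributes a non-zero overlap.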
…at were previously put in flat model statistics. Moved the calculations and output of the stats to the feature vector.
…k with the different domains more easily. Fixed averageDomainOverlap calculations to work correctly across bool, int and int set domains. Removed parts currently omitted by the training (floats are currently omitted).
…e constraints that use at least two other constraints. We measure this indirectly in FlatZinc by checking whether a call uses more than two defined_vars that were not defined by the call itself.
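The indirect check described in this commit message can be sketched as follows. This is an illustration under assumptions, not the PR's implementation: the `Call` class and `count_meta_constraints` are hypothetical names, a call's `defines` field stands in for a FlatZinc `defines_var` annotation, and the cut-off is a parameter whose default follows the "more than two" wording.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Call:
    """Hypothetical stand-in for a FlatZinc constraint call."""
    name: str
    args: list[str]                 # names of the variables the call uses
    defines: Optional[str] = None   # variable this call defines, if any


def count_meta_constraints(calls: list[Call], threshold: int = 2) -> int:
    # Every variable that is functionally defined by some call.
    defined = {c.defines for c in calls if c.defines is not None}
    count = 0
    for c in calls:
        # Defined variables used by this call but defined elsewhere.
        foreign = {a for a in c.args if a in defined and a != c.defines}
        if len(foreign) > threshold:
            count += 1
    return count
```

A call that combines three reified results `b1`, `b2`, `b3`, each defined by a different call, is counted; a call that only uses two of them is not, unless `threshold` is lowered.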
…d to stdout and made it a little more flexible by allowing users to supply their own prefixes and output end markers.
…d members of the FeatureVector are now output. Added some more "diagnostic" members to the FeatureVector.
The user can now constrain the constraint graph dimensions inside the feature vector: the graph is cropped if it grows bigger than allowed and padded if it does not grow large enough. The default is still autoResize. An option to ignore floats was added but is not yet implemented; the default is still to ignore all floats.
…its of constraint graph to be configured separately.
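To make the crop-or-pad behaviour concrete, here is a small Python sketch. It assumes the constraint graph is stored as a list of rows (for example an adjacency matrix) and that each dimension can be limited independently; `fit_matrix` and its parameters are hypothetical names, and passing `None` for a dimension plays the role of the auto-resize default.

```python
from typing import Optional


def fit_matrix(
    matrix: list[list[int]],
    rows: Optional[int] = None,
    cols: Optional[int] = None,
    pad: int = 0,
) -> list[list[int]]:
    """Crop or pad a matrix to rows x cols; None keeps the current size."""
    n_rows = len(matrix) if rows is None else rows
    widest = max((len(r) for r in matrix), default=0)
    n_cols = widest if cols is None else cols

    out = []
    for i in range(n_rows):
        src = matrix[i] if i < len(matrix) else []   # missing rows are padded
        row = list(src[:n_cols])                     # crop columns
        row += [pad] * (n_cols - len(row))           # pad columns
        out.append(row)
    return out
```

For a 2x3 input, `fit_matrix(m, rows=3, cols=2)` drops the last column and appends one all-zero row, so feature vectors from models of different sizes end up with the same shape.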
… in warehouse model, but only with no optimization flag. A workaround for feature extraction was added, but it should be discussed whether this is a bug.
@guidotack
Member

This looks interesting. We would probably have to make some cosmetic changes in order to merge it back into the main repository (the main one being copyright headers). Can you provide a bit more information on this project?

@Zylence
Zylence commented Sep 22, 2025

> This looks interesting. We would probably have to make some cosmetic changes in order to merge it back into the main repository (the main one being copyright headers). Can you provide a bit more information on this project?

Hi @guidotack, sorry for the ambush; the PR should not have been opened just yet. My colleague @seredenkodenis was a little overzealous here ;). It definitely needs some touch-ups and maybe unit tests. If you want, we can move the discussion into an issue/feature request.

Of course, here is some information on what we did:

We are working on a paper that aims to show whether a pretrained model that finds static variable orderings can outperform common search heuristics such as first_fail and dom_w_deg. For this, we needed to extract a dataset of structural features from MiniZinc files. It felt natural to integrate feature extraction into the compiler, since that gives us direct access to the AST and avoids a regex mess on top of the MiniZinc files, hence the fork.

In addition, this approach would make it easy to add the "ai_heursic" back into the project once a good pretrained model exists, if the maintainers want that. It would also let others extract features to train models or for any other kind of workload.

The integration is minimally invasive and hopefully not too hacky.

So much for the background information. 

The paper is not yet done, but current results are promising.  


3 participants