Feature/krenn1/smart sample 2 #17
Open
crkrenn wants to merge 21 commits into master from feature/krenn1/smart_sample_2
- Update `RANDOM_SCHEMA` and `BEST_CANDIDATE_SCHEMA` to match
- Disable the "previous_samples" check
- Add an error check to the schema
- Add a test for min and max values in the parameters
- Add a function to calculate the Manhattan distance between two points
- Change the format of the output from `get_samples` to a dictionary of dictionaries
- Convert samples
[scisample/schema.py]
- Remove `previous_samples` from `RANDOM_SCHEMA`
- Add `cost_variable`, `downselect_ratio`, `voxel_overlap` to `BEST_CANDIDATE_SCHEMA`
- Update `BEST_CANDIDATE_SCHEMA` to match `RANDOM_SCHEMA`
[scisample/random_sampler.py]
- Disable the "previous_samples" check
- Add an error check to the schema
- Add a test for min and max values in the parameters
[scisample/utils.py]
- Add a function to calculate the Manhattan distance between two points
[tests/test_utils.py]
- Add manhattan_distance function
- Change the tolerance of parse_parameters to accept a list of floats
[scisample/base_sampler.py]
- Change the format of the output from `get_samples` to a dictionary of dictionaries
- Convert samples to parameter dictionary in a format convenient for maestrowf
- Add a new OpenAI API for completions
- Lower the numeric tolerance for test files
- Add two tests for the inclusive string split function

- Refactor `downselect` function in `base_sampler.py` to allow for optional argument `previous_samples`
- Change `previous_samples` path in `test_samplers.py`
- Change `X1` constants to range and `X2` and `X3` ranges in `test_samplers.py`
- Add a check to ensure `previous_samples` is a DataFrame in
[tests/test_samplers.py]
- Change `previous_samples` path
- Change `X1` constants to range
- Change `X2` and `X3` ranges
[scisample/base_sampler.py]
- Allow for optional argument `previous_samples` in `downselect` function
- Refactor `downselect` function to handle `previous_samples` argument
- Add a check to ensure `previous_samples` is a DataFrame

- Added `return_indices` parameter to `downselect` function in `base_sampler.py`
- Changed `columns` variable to use `df.columns.tolist()` instead of `self.parameters`
- Added optional return of indices in `downselect` function
[scisample/base_sampler.py]
- Added `return_indices` parameter to `downselect` function
- Changed `columns` variable to use `df.columns.tolist()` instead of `self.parameters`
- Added optional return of indices in `downselect` function

- Update `__version__` and `VERSION` variables to `1.0.3`
- Add `encoding='utf-8'` to `open` calls
- Change argument name of `manhattan_distance` from `x` and `y` to `point_1` and `point_2`
- Check for duplicates in variables
- Add a `pointless-statement` disable comment
- Add constants to the samples
- Add
[scisample/__init__.py]
- Add a docstring to the `__init__.py` file
- Update the `__version__` and `VERSION` variables to `1.0.3`
[scisample/utils.py]
- Add `encoding='utf-8'` to `open` calls
- Change the argument name of `manhattan_distance` from `x` and `y` to `point_1` and `point_2`
[scisample/column_list_sampler.py]
- Check for duplicates in variables
- Add a `pointless-statement` disable comment
- Add constants to the samples
- Add parameter samples to the samples
[scisample/random_sampler.py]
- Replace `i` with `_` in loop for generating random samples
- Move `octokit` initialization to separate file
- Add `with suppress` blocks to catch `KeyError`
- Update `new_sample` with `constants` and `random_list`
[scisample/custom_sampler.py]
- Move the sample function initialization to a separate line
@crkrenn it might make sense to include a threshold for your cost function rather than require me to decide what fraction of the points I want to use.
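To make the suggestion concrete, here is a minimal sketch contrasting the two selection rules. It is not the scisample code: the function names and the `cost_threshold` argument are hypothetical, and rows are assumed to be plain dictionaries with a numeric cost column.

```python
def downselect_by_ratio(rows, cost_variable, downselect_ratio):
    """Keep the lowest-cost fraction of rows (ratio-based selection)."""
    n_keep = max(1, int(downselect_ratio * len(rows)))
    return sorted(rows, key=lambda row: row[cost_variable])[:n_keep]


def downselect_by_threshold(rows, cost_variable, cost_threshold):
    """Keep every row whose cost is at or below a cutoff.

    `cost_threshold` is a hypothetical option, not an existing setting.
    """
    return [row for row in rows if row[cost_variable] <= cost_threshold]


rows = [
    {"X1": 0.0, "cost": 5.0},
    {"X1": 1.0, "cost": 0.1},
    {"X1": 2.0, "cost": 0.3},
    {"X1": 3.0, "cost": 9.0},
]
by_ratio = downselect_by_ratio(rows, "cost", 0.25)
by_threshold = downselect_by_threshold(rows, "cost", 1.0)
```

With a ratio the number of retained points is fixed in advance; with a threshold it depends on how many previous samples were already good, which may be zero.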
…mple into feature/krenn1/smart_sample_2
@antimatterhorn & @daub1,
The main change here is a working "smart sampler" that uses an algorithm Cody and I developed. The best_candidate sampler works as it did before, but if you add `previous_samples`, `cost_variable`, `downselect_ratio`, and `voxel_overlap`, it will read a csv file containing parameters and a cost column. The code will select the best `downselect_ratio * len(previous_samples)` points as determined by the cost function. It will then generate random samples in voxels defined by each selected previous point and its nearest neighbor. There is a working example in tests. As expected, the smart sampler zooms in on minimal values of the Rosenbrock function.
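The selection and voxel sampling described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `smart_sample` is a hypothetical name, previous samples are plain dictionaries, the nearest neighbor is searched among all previous points, and `voxel_overlap` is assumed to scale the half-width of a cube centered on each selected point.

```python
import random


def manhattan_distance(point_1, point_2):
    """Sum of absolute coordinate differences between two points."""
    return sum(abs(a - b) for a, b in zip(point_1, point_2))


def smart_sample(previous_samples, parameters, cost_variable,
                 downselect_ratio, voxel_overlap, num_samples, seed=None):
    """Hypothetical sketch: draw new samples near the lowest-cost points."""
    rng = random.Random(seed)
    # Keep the best downselect_ratio * len(previous_samples) points.
    n_keep = max(1, int(downselect_ratio * len(previous_samples)))
    best = sorted(previous_samples, key=lambda row: row[cost_variable])[:n_keep]

    new_samples = []
    for i in range(num_samples):
        center = best[i % len(best)]  # cycle through the selected points
        center_xy = [center[p] for p in parameters]
        # ASSUMPTION: nearest neighbor is taken over all previous points.
        nearest = min(
            manhattan_distance(center_xy, [row[p] for p in parameters])
            for row in previous_samples if row is not center
        )
        # ASSUMPTION: voxel is a cube with half-width voxel_overlap * nearest.
        half_width = voxel_overlap * nearest
        new_samples.append({
            p: rng.uniform(center[p] - half_width, center[p] + half_width)
            for p in parameters
        })
    return new_samples


def rosenbrock(x, y):
    """Rosenbrock function, with its minimum of 0 at (1, 1)."""
    return (1 - x) ** 2 + 100 * (y - x ** 2) ** 2


# Previous samples on a coarse grid, each with its cost.
previous = [{"x": x, "y": y, "cost": rosenbrock(x, y)}
            for x in range(-2, 3) for y in range(-2, 3)]
samples = smart_sample(previous, ["x", "y"], "cost",
                       downselect_ratio=0.2, voxel_overlap=1.0,
                       num_samples=10, seed=0)
```

Repeating the cycle (evaluate the cost of the new samples, append them to the previous samples, and sample again) is what would make the points concentrate near the minimum.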
If variables are listed in constants or parameters and are not in the previous samples file, they will be sampled normally. This will let you add another parameter to a set of "good" points.
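As a rough sketch of that behavior, assuming samples are dictionaries and parameters are given as `(min, max)` ranges (the function name `extend_samples` and this data layout are hypothetical, not the scisample API):

```python
import random


def extend_samples(samples, constants, parameters, seed=None):
    """Hypothetical sketch: fill in variables missing from each sample."""
    rng = random.Random(seed)
    extended = []
    for sample in samples:
        row = dict(sample)  # values from the previous samples are kept
        for name, value in constants.items():
            row.setdefault(name, value)
        for name, (low, high) in parameters.items():
            if name not in row:
                # Variable absent from the previous samples: sample it uniformly.
                row[name] = rng.uniform(low, high)
        extended.append(row)
    return extended


good_points = [{"X1": 0.9, "X2": 1.1}, {"X1": 1.0, "X2": 0.95}]
extended = extend_samples(
    good_points,
    constants={"X4": 20},
    parameters={"X1": (-2, 2), "X3": (5, 10)},
    seed=1,
)
```

Here `X1` and `X2` keep their values from the "good" points, while `X3` is drawn from its range and `X4` is added as a constant.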
The only other significant change was some refactoring and commenting to remove flake8 and pylint errors.
-Chris