Feature/krenn1/smart sample 2 #17

crkrenn · 2023-03-07T19:16:18Z

The main change here is a working "smart sampler" that uses an algorithm Cody and I developed. The best_candidate sampler works as it did before, but if you add "previous_samples", "cost_variable", "downselect_ratio", and "voxel_overlap", it will read a CSV file containing parameters and a cost column. The code selects the best downselect_ratio * len(previous_samples) points, as determined by the cost function, and then generates random samples in voxels defined by each selected previous point and its nearest neighbor.
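To make the two steps concrete, here is a minimal sketch of the idea, not the code in this PR. It assumes lower cost is better, that the nearest neighbor is searched among all previous points using the Manhattan distance, and that the voxel half-width along each axis is `voxel_overlap` times the gap to that neighbor. The function name `smart_sample_sketch` and the list-of-dicts input are hypothetical.

```python
import random


def smart_sample_sketch(previous, cost_variable, downselect_ratio,
                        voxel_overlap, num_samples, rng=None):
    """Hypothetical sketch: downselect by cost, then sample near the best points.

    `previous` is a list of dicts, one per row of the previous-samples file.
    It must contain at least two rows.
    """
    rng = rng or random.Random()
    names = [key for key in previous[0] if key != cost_variable]

    # Step 1: keep the lowest-cost fraction of the previous samples.
    n_keep = max(1, int(downselect_ratio * len(previous)))
    best = sorted(previous, key=lambda row: row[cost_variable])[:n_keep]

    def distance(row_1, row_2):
        # Manhattan distance over the parameter columns only.
        return sum(abs(row_1[n] - row_2[n]) for n in names)

    new_samples = []
    for i in range(num_samples):
        # Cycle through the selected points so each gets a share of samples.
        center = best[i % n_keep]
        others = [row for row in previous if row is not center]
        neighbor = min(others, key=lambda row: distance(center, row))
        sample = {}
        for n in names:
            # Assumption: voxel half-width is the scaled per-axis gap.
            half = voxel_overlap * abs(center[n] - neighbor[n])
            sample[n] = rng.uniform(center[n] - half, center[n] + half)
        new_samples.append(sample)
    return new_samples


# Usage: a 6 x 6 grid of previous points with a simple quadratic cost.
previous = [
    {"X2": x, "X3": y, "cost": (x - 7) ** 2 + (y - 8) ** 2}
    for x in range(5, 11)
    for y in range(5, 11)
]
samples = smart_sample_sketch(previous, "cost", 0.3, 1.0, 30, random.Random(0))
```

On a regular grid like this one, the nearest neighbor differs along a single axis, so each voxel collapses to a line segment; scattered previous samples give voxels with extent in every dimension.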

There is a working example in tests. As expected, the smart sampler zooms in on the minimal values of the Rosenbrock function.

If variables are listed in constants or parameters but are not in the previous samples file, they will be sampled in the usual way. This lets you add another parameter to a set of "good" points.
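As a rough illustration of that behavior, the sketch below fills in variables that are missing from the previous samples. It is not the scisample implementation: the helper name `fill_missing_variables` is hypothetical, and it assumes a missing parameter is drawn uniformly between its `min` and `max`.

```python
import random


def fill_missing_variables(rows, constants, parameters, rng=None):
    """Hypothetical helper: add constants and missing parameters to each row."""
    rng = rng or random.Random()
    filled = []
    for row in rows:
        new_row = dict(row)  # leave the caller's rows untouched
        for name, value in constants.items():
            new_row.setdefault(name, value)
        for name, bounds in parameters.items():
            if name not in new_row:
                # Assumption: uniform draw between the configured bounds.
                new_row[name] = rng.uniform(bounds["min"], bounds["max"])
        filled.append(new_row)
    return filled


# Usage: the previous samples only know about X2; X1 and X3 are new.
good_points = [{"X2": 6.0}, {"X2": 7.5}]
extended = fill_missing_variables(
    good_points,
    constants={"X1": 20},
    parameters={"X2": {"min": 5, "max": 10}, "X3": {"min": 5, "max": 10}},
    rng=random.Random(0),
)
```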

The only other significant change was some refactoring and commenting to remove flake8 and pylint errors.

-Chris

        sampler:
            type: best_candidate
            num_samples: 30
            previous_samples: samples.csv # optional
            cost_variable: cost   # required if previous_samples is provided
            downselect_ratio: 0.3 # required if previous_samples is provided
            voxel_overlap: 1.0    # required if previous_samples is provided
            constants:
                X1: 20
            parameters:
                X2:
                    min: 5
                    max: 10
                X3:
                    min: 5
                    max: 10

- Update `RANDOM_SCHEMA` and `BEST_CANDIDATE_SCHEMA` to match
- Disable the "previous_samples" check
- Add an error check to the schema
- Add a test for min and max values in the parameters
- Add a function to calculate the Manhattan distance between two points
- Change the format of the output from `get_samples` to a dictionary of dictionaries
- Convert samples

[scisample/schema.py]
- Remove `previous_samples` from `RANDOM_SCHEMA`
- Add `cost_variable`, `downselect_ratio`, `voxel_overlap` to `BEST_CANDIDATE_SCHEMA`
- Update `BEST_CANDIDATE_SCHEMA` to match `RANDOM_SCHEMA`

[scisample/random_sampler.py]
- Disable the "previous_samples" check
- Add an error check to the schema
- Add a test for min and max values in the parameters

[scisample/utils.py]
- Add a function to calculate the Manhattan distance between two points

[tests/test_utils.py]
- Add manhattan_distance function
- Change the tolerance of parse_parameters to accept a list of floats

[scisample/base_sampler.py]
- Change the format of the output from `get_samples` to a dictionary of dictionaries
- Convert samples to parameter dictionary in a format convenient for maestrowf
- Add a new OpenAI API for completions
- Lower the numeric tolerance for test files
- Add two tests for the inclusive string split function

- Refactor `downselect` function in `base_sampler.py` to allow for optional argument `previous_samples`
- Change `previous_samples` path in `test_samplers.py`
- Change `X1` constants to range and `X2` and `X3` ranges in `test_samplers.py`
- Add a check to ensure `previous_samples` is a DataFrame in

[tests/test_samplers.py]
- Change `previous_samples` path
- Change `X1` constants to range
- Change `X2` and `X3` ranges

[scisample/base_sampler.py]
- Allow for optional argument `previous_samples` in `downselect` function
- Refactor `downselect` function to handle `previous_samples` argument
- Add a check to ensure `previous_samples` is a DataFrame

- Added `return_indices` parameter to `downselect` function in `base_sampler.py`
- Changed `columns` variable to use `df.columns.tolist()` instead of `self.parameters`
- Added optional return of indices in `downselect` function

[scisample/base_sampler.py]
- Added `return_indices` parameter to `downselect` function
- Changed `columns` variable to use `df.columns.tolist()` instead of `self.parameters`
- Added optional return of indices in `downselect` function

- Update `__version__` and `VERSION` variables to `1.0.3`
- Add `encoding='utf-8'` to `open` calls
- Change argument name of `manhattan_distance` from `x` and `y` to `point_1` and `point_2`
- Check for duplicates in variables
- Add a `pointless-statement` disable comment
- Add constants to the samples
- Add

[scisample/__init__.py]
- Add a docstring to the `__init__.py` file
- Update the `__version__` and `VERSION` variables to `1.0.3`

[scisample/utils.py]
- Add `encoding='utf-8'` to `open` calls
- Change the argument name of `manhattan_distance` from `x` and `y` to `point_1` and `point_2`

[scisample/column_list_sampler.py]
- Check for duplicates in variables
- Add a `pointless-statement` disable comment
- Add constants to the samples
- Add parameter samples to the samples

[scisample/random_sampler.py]
- Replace `i` with `_` in loop for generating random samples
- Move `octokit` initialization to separate file
- Add `with suppress` blocks to catch `KeyError`
- Update `new_sample` with `constants` and `random_list`

[scisample/custom_sampler.py]
- Move the sample function initialization to a separate line

daub1 · 2023-03-07T23:19:21Z

@crkrenn it might make sense to include a threshold for your cost function rather than require me to decide what fraction of the points I want to use.
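A threshold-based downselect along these lines could look like the sketch below. The function name `downselect_by_threshold` and the `cost_threshold` argument are hypothetical and are not options in this PR; the sketch also assumes that lower cost is better.

```python
def downselect_by_threshold(rows, cost_variable, cost_threshold):
    """Hypothetical alternative: keep rows whose cost is at or below a threshold."""
    return [row for row in rows if row[cost_variable] <= cost_threshold]


# Usage: keep every previous point with cost <= 1.0, however many there are.
rows = [
    {"X2": 5.0, "cost": 4.0},
    {"X2": 6.0, "cost": 1.0},
    {"X2": 7.0, "cost": 0.0},
]
kept = downselect_by_threshold(rows, "cost", 1.0)
```

Unlike a fixed `downselect_ratio`, the number of points kept then depends on the data, so it can be zero if no point meets the threshold.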

crkrenn added 4 commits March 5, 2023 08:54

crkrenn and others added 17 commits May 19, 2023 10:37

removed error check on 'previous_samples' which is no longer relevant

93057e9

removed some flake errors

4aac3ed

removed failing uqpipeline tests

af678c5

linting

8486530

Merge branch 'feature/krenn1/smart_sample_2' of github.com:LLNL/scisa…

7e30d3c

…mple into feature/krenn1/smart_sample_2

removed redundant file (test_import.py)

649c1b2

finished linting for now

80d44b8

working on rosenbrock example

fe83d9a

rosenbrock notebook is working...

5ac6caa

progress on smart sampler

9729c7c

done for the day

abf6eea

working

ed4357b

trying to make samples more continuous

58e9d8d

tests passed

3f1665e

too slow; too many points

9f6af52

smart sample work in progress

3871bcb

added smart_sampler.py

aec717a
