diff --git a/.github/workflows/README.yml.disabled b/.github/workflows/README.yml similarity index 100% rename from .github/workflows/README.yml.disabled rename to .github/workflows/README.yml diff --git a/.github/workflows/cli-tests.yml.disabled b/.github/workflows/cli-tests.yml similarity index 100% rename from .github/workflows/cli-tests.yml.disabled rename to .github/workflows/cli-tests.yml diff --git a/.github/workflows/docs.yml.disabled b/.github/workflows/docs.yml similarity index 100% rename from .github/workflows/docs.yml.disabled rename to .github/workflows/docs.yml diff --git a/.github/workflows/examples.yml.disabled b/.github/workflows/examples.yml similarity index 100% rename from .github/workflows/examples.yml.disabled rename to .github/workflows/examples.yml diff --git a/.github/workflows/ssh-localhost.yml.disabled b/.github/workflows/ssh-localhost.yml similarity index 100% rename from .github/workflows/ssh-localhost.yml.disabled rename to .github/workflows/ssh-localhost.yml diff --git a/.gitignore b/.gitignore index 0d30d51..4a0c49b 100644 --- a/.gitignore +++ b/.gitignore @@ -1,6 +1,6 @@ .claude claude -*fz.egg-info +*.egg-info venv build .vscode @@ -21,4 +21,4 @@ build *.sh .coverage results*/ -output/ \ No newline at end of file +output/ diff --git a/CLAUDE.md b/CLAUDE.md index 243f3a7..dbb1f25 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -12,11 +12,12 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co - Reading and parsing output results - Smart caching and retry mechanisms -The four core functions are: +The five core functions are: 1. **`fzi`** - Parse input files to identify variables 2. **`fzc`** - Compile input files by substituting variables 3. **`fzo`** - Parse output files from calculations 4. **`fzr`** - Run complete parametric calculations end-to-end +5. 
**`fzd`** - Run iterative design of experiments with adaptive algorithms ## Development Setup @@ -63,13 +64,14 @@ python -m pytest tests/test_cli_commands.py::test_fzi_parse_variables -v ## Architecture -The codebase is organized into functional modules (~5700 lines total): +The codebase is organized into functional modules (~7300 lines total): ### Core Modules -- **`fz/core.py`** (913 lines) - Public API functions (`fzi`, `fzc`, `fzo`, `fzr`) +- **`fz/core.py`** (1277 lines) - Public API functions (`fzi`, `fzc`, `fzo`, `fzr`, `fzd`) - Entry points for all parametric computing operations - Orchestrates input compilation, calculation execution, and result parsing + - Implements iterative design of experiments (`fzd`) with algorithm integration - Handles signal interruption and graceful shutdown - **`fz/interpreter.py`** (387 lines) - Variable parsing and formula evaluation @@ -108,15 +110,34 @@ The codebase is organized into functional modules (~5700 lines total): - Structured logging with levels (DEBUG, INFO, WARNING, ERROR) - UTF-8 encoding handling for Windows -- **`fz/cli.py`** (395 lines) - Command-line interface - - Entry points: `fz`, `fzi`, `fzc`, `fzo`, `fzr` +- **`fz/cli.py`** (509 lines) - Command-line interface + - Entry points: `fz`, `fzi`, `fzc`, `fzo`, `fzr`, `fzd` - Argument parsing for all commands - Output formatting (JSON, table, CSV, markdown, HTML) +- **`fz/algorithms.py`** (513 lines) - Algorithm framework for design of experiments + - Base interface for iterative algorithms used by `fzd` + - Algorithm loading from Python files with dynamic import + - Support for initial design, adaptive sampling, and result analysis + - Automatic dependency checking (e.g., numpy, scipy) + - Content detection for analysis results (HTML, JSON, Markdown, key-value) + +- **`fz/shell.py`** (505 lines) - Shell utilities and binary path resolution + - Cross-platform shell command execution with Windows bash detection + - Binary path resolution with `FZ_SHELL_PATH` 
support + - Caching of binary locations for performance + - Windows .exe extension handling + - Short path conversion for Windows paths with spaces + ### Supporting Modules - **`fz/spinner.py`** (225 lines) - Progress indication for long-running operations -- **`fz/installer.py`** (354 lines) - Model installation from GitHub/URL/zip +- **`fz/installer.py`** (598 lines) - Model and algorithm installation from GitHub/URL/zip + - Install models: `fz install model ` or `fz.install_model(model)` + - Install algorithms: `fz install algorithm ` or `fz.install_algorithm(algorithm)` + - Supports GitHub repositories (`fz-` convention), full URLs, and local zip files + - Project-level (`.fz/models/`, `.fz/algorithms/`) and global (`~/.fz/models/`, `~/.fz/algorithms/`) installation + - Priority system: project-level overrides global ## Key Design Patterns @@ -159,6 +180,18 @@ The codebase is organized into functional modules (~5700 lines total): - Prevents redundant computation when resuming interrupted runs - Glob patterns supported: `cache://archive/*/results` +### 6. 
Algorithm-Based Design of Experiments (fzd) +- **Iterative adaptive sampling**: Algorithms decide what points to evaluate next based on previous results +- **Algorithm interface**: Each algorithm class implements: + - `get_initial_design()`: Returns initial design points + - `get_next_design()`: Returns next points to evaluate (empty list when done) + - `get_analysis()`: Returns final analysis results + - `get_analysis_tmp()`: [Optional] Returns intermediate progress at each iteration +- **Flexible analysis output**: Algorithms can return text, HTML, JSON, Markdown, or key-value pairs +- **Content detection**: Automatically processes analysis results based on content type +- **Examples**: Monte Carlo sampling, BFGS optimization, Brent's method, random sampling +- **Requires pandas**: fzd returns results as pandas DataFrames + ## Windows-Specific Considerations ### Bash Availability @@ -178,7 +211,7 @@ The codebase is organized into functional modules (~5700 lines total): - All tests in `tests/` directory following pytest conventions - Test files prefixed with `test_` (e.g., `test_cli_commands.py`) - Use pytest fixtures in `conftest.py` for common setup -- Examples: `test_parallel.py`, `test_interrupt_handling.py`, `test_examples_*.py` +- Examples: `test_parallel.py`, `test_interrupt_handling.py`, `test_fzd.py`, `test_examples_*.py` ### Test Patterns 1. 
Create temporary directory with `tempfile.TemporaryDirectory()` @@ -221,6 +254,16 @@ Each case creates a directory with: - `err.txt` - Standard error - `.fz_hash` - Input file MD5 hashes (for cache matching) +### Example Algorithms +- Location: `examples/algorithms/` directory +- Available algorithms: + - **`montecarlo_uniform.py`** - Uniform random sampling for Monte Carlo integration + - **`randomsampling.py`** - Simple random sampling with configurable iterations + - **`bfgs.py`** - BFGS optimization algorithm (requires scipy) + - **`brent.py`** - Brent's method for 1D optimization (requires scipy) +- Each algorithm demonstrates the standard interface and can serve as a template +- Algorithms can be referenced by file path: `algorithm="examples/algorithms/montecarlo_uniform.py"` + ## Environment Variables ```bash @@ -329,8 +372,41 @@ All public functions and methods must have docstrings with: - Host key validation with interactive fingerprint checking - Timeout and keepalive configurable via environment +### Algorithm Loading and Execution (fzd) +- **Dynamic import**: Algorithms loaded from Python files using `importlib.machinery` +- **Dependency checking**: `__require__` list checked at load time; warns if missing +- **Fixed vs variable inputs**: Separates fixed values from ranges for optimization + - Fixed: `{"x": "5.0"}` → always x=5.0 + - Variable: `{"y": "[0;10]"}` → y varies between 0 and 10 + - Algorithm only controls variable inputs; fixed values merged automatically +- **Analysis content processing**: Detects and processes multiple content types: + - HTML: Saved to `analysis.html` and `iteration_N.html` + - JSON: Parsed and made available as structured data + - Markdown: Saved to `analysis.md` files + - Key-value pairs: Parsed into dictionaries +- **Progress tracking**: Progress bar shows iteration count, evaluations, and ETA +- **Result structure**: Returns dict with: + - `XY`: pandas DataFrame with all input and output values + - `analysis`: Processed 
analysis results (HTML, plots, metrics, etc.) - excludes internal `_raw` data + - `algorithm`: Algorithm path + - `iterations`: Number of iterations completed + - `total_evaluations`: Total number of function evaluations + - `summary`: Human-readable summary text + ## Common Development Tasks +### Adding a New Algorithm for fzd +1. Create a new Python file in `examples/algorithms/` or any directory +2. Implement a class with required methods: + - `__init__(self, **options)` - Accept algorithm-specific options + - `get_initial_design(self, input_vars, output_vars)` - Return initial design points + - `get_next_design(self, previous_input_vars, previous_output_values)` - Return next points (or empty list when done) + - `get_analysis(self, input_vars, output_values)` - Return final analysis results + - `get_analysis_tmp(self, input_vars, output_values)` [Optional] - Return intermediate results +3. Add optional `__require__` list for dependencies (e.g., `["numpy", "scipy"]`) +4. Test with `fzd()` function +5. See `examples/algorithms/` for reference implementations + ### Adding a New Calculator Type 1. Add runner function to `runners.py` following `_run_*_calculator()` pattern 2. 
Register in calculator resolution logic diff --git a/README.md b/README.md index 1c1d6c0..5f1b060 100644 --- a/README.md +++ b/README.md @@ -20,6 +20,7 @@ A powerful Python package for parametric simulations and computational experimen - [Calculator Types](#calculator-types) - [Advanced Features](#advanced-features) - [Complete Examples](#complete-examples) +- [Writing Custom Algorithms for fzd](#writing-custom-algorithms-for-fzd) - [Configuration](#configuration) - [Interrupt Handling](#interrupt-handling) - [Development](#development) @@ -28,22 +29,24 @@ A powerful Python package for parametric simulations and computational experimen ### Core Capabilities -- **🔄 Parametric Studies**: Automatically generate and run all combinations of parameter values (Cartesian product) +- **🔄 Parametric Studies**: Factorial designs (dict with Cartesian product) or non-factorial designs (DataFrame with specific cases) - **⚡ Parallel Execution**: Run multiple cases concurrently across multiple calculators with automatic load balancing - **💾 Smart Caching**: Reuse previous calculation results based on input file hashes to avoid redundant computations - **🔁 Retry Mechanism**: Automatically retry failed calculations with alternative calculators - **🌐 Remote Execution**: Execute calculations on remote servers via SSH with automatic file transfer -- **📊 DataFrame Output**: Results returned as pandas DataFrames with automatic type casting and variable extraction +- **📊 DataFrame I/O**: Input and output using pandas DataFrames with automatic type casting and variable extraction - **🛑 Interrupt Handling**: Gracefully stop long-running calculations with Ctrl+C while preserving partial results - **🔍 Formula Evaluation**: Support for calculated parameters using Python or R expressions - **📁 Directory Management**: Automatic organization of inputs, outputs, and logs for each case +- **🎯 Adaptive Algorithms**: Iterative design of experiments with intelligent sampling strategies (fzd) -### Four 
Core Functions +### Five Core Functions 1. **`fzi`** - Parse **I**nput files to identify variables 2. **`fzc`** - **C**ompile input files by substituting variable values 3. **`fzo`** - Parse **O**utput files from calculations 4. **`fzr`** - **R**un complete parametric calculations end-to-end +5. **`fzd`** - Run iterative **D**esign of experiments with adaptive algorithms ## Installation @@ -83,7 +86,10 @@ pip install -e git+https://github.com/Funz/fz.git # for SSH support pip install paramiko -# for DataFrame support +# for DataFrame support (recommended) +pip install pandas + +# for fzd (design of experiments) - REQUIRED pip install pandas # for R interpreter support @@ -91,6 +97,9 @@ pip install funz-fz[r] # OR pip install rpy2 # Note: Requires R installed with system libraries - see examples/r_interpreter_example.md + +# for optimization algorithms (scipy-based algorithms in examples/) +pip install scipy numpy ``` ## Quick Start @@ -214,6 +223,7 @@ Available commands: - `fzc` - Compile input files - `fzo` - Read output files - `fzr` - Run parametric calculations +- `fzd` - Run design of experiments with adaptive algorithms ### fzi - Parse Input Variables @@ -599,6 +609,47 @@ fzr input.txt \ # Only runs the remaining cases ``` +### fzd - Run Design of Experiments + +Run iterative design of experiments with adaptive algorithms: + +```bash +# Basic usage with Monte Carlo algorithm +fzd input.txt \ + --model perfectgas \ + --variables '{"T_celsius": "[10;50]", "V_L": "[1;10]", "n_mol": 1}' \ + --calculator "sh://bash PerfectGazPressure.sh" \ + --output-expression "pressure" \ + --algorithm examples/algorithms/montecarlo_uniform.py \ + --algorithm-options '{"batch_sample_size": 20, "max_iterations": 10}' \ + --analysis-dir fzd_results/ + +# With optimization algorithm (BFGS) +fzd input.txt \ + --model perfectgas \ + --variables '{"T_celsius": "[10;50]", "V_L": "[1;10]", "n_mol": 1}' \ + --calculator "sh://bash calc.sh" \ + --output-expression "pressure" \ + 
--algorithm examples/algorithms/bfgs.py \ + --algorithm-options '{"minimize": true, "max_iterations": 50}' \ + --analysis-dir optimization_results/ + +# Fixed and variable inputs +fzd input.txt \ + --model perfectgas \ + --variables '{"T_celsius": "[10;50]", "V_L": "5.0", "n_mol": 1}' \ + --calculator "sh://bash calc.sh" \ + --output-expression "pressure" \ + --algorithm examples/algorithms/brent.py \ + --analysis-dir brent_results/ +``` + +**Key Differences from fzr**: +- Variables use `"[min;max]"` for ranges (algorithm decides values) or `"value"` for fixed +- Requires `--algorithm` parameter with path to algorithm file +- Optionally accepts `--algorithm-options` as JSON dict +- Returns DataFrame with all sampled points and analysis results + ### Environment Variables for CLI ```bash @@ -618,6 +669,10 @@ fzr input.txt --model perfectgas ... export FZ_SSH_AUTO_ACCEPT_HOSTKEYS=1 # Use with caution export FZ_SSH_KEEPALIVE=300 fzr input.txt --calculator "ssh://user@host/bash calc.sh" ... + +# Shell path for binary resolution (Windows) +export FZ_SHELL_PATH="C:\msys64\usr\bin;C:\msys64\mingw64\bin" +fzr input.txt --model perfectgas ... 
``` ## Core Functions @@ -767,13 +822,150 @@ print(results) **Parameters**: - `input_path`: Input file or directory path -- `input_variables`: Variable values (creates Cartesian product of lists) +- `input_variables`: Variable values - dict (factorial) or DataFrame (non-factorial) - `model`: Model definition (dict or alias) - `calculators`: Calculator URI(s) - string or list - `results_dir`: Results directory path **Returns**: pandas DataFrame with all results +### fzd - Run Design of Experiments + +Execute iterative design of experiments with adaptive algorithms: + +```python +import fz + +model = { + "varprefix": "$", + "output": { + "result": "grep 'Result:' output.txt | awk '{print $2}'" + } +} + +# Run Monte Carlo sampling +results = fz.fzd( + input_path="input.txt", + input_variables={ + "x": "[0;10]", # Range: algorithm decides values + "y": "[-5;5]", # Range: algorithm decides values + "z": "2.5" # Fixed value + }, + model=model, + output_expression="result", + algorithm="examples/algorithms/montecarlo_uniform.py", + calculators=["sh://bash calculate.sh"], + algorithm_options={"batch_sample_size": 10, "max_iterations": 20}, + analysis_dir="results_fzd" +) + +# Results include: +# - results['XY']: DataFrame with all input/output values +# - results['analysis']: Processed analysis (HTML, plots, metrics, etc.) 
+# - results['iterations']: Number of iterations completed +# - results['total_evaluations']: Total function evaluations +# - results['summary']: Summary text +print(results['XY']) # All sampled points and outputs +print(results['summary']) # Algorithm completion summary +``` + +**Algorithm Examples**: +- `examples/algorithms/montecarlo_uniform.py` - Uniform random sampling +- `examples/algorithms/randomsampling.py` - Simple random sampling +- `examples/algorithms/bfgs.py` - BFGS optimization (requires scipy) +- `examples/algorithms/brent.py` - Brent's 1D optimization (requires scipy) + +**Parameters**: +- `input_path`: Input file or directory path +- `input_variables`: Dict with `"[min;max]"` for ranges or `"value"` for fixed +- `model`: Model definition (dict or alias) +- `output_expression`: Expression to evaluate from outputs (e.g., `"pressure"` or `"out1 + out2 * 2"`) +- `algorithm`: Path to algorithm Python file, or name of an installed algorithm plugin +- `calculators`: Calculator URI(s) - string or list +- `algorithm_options`: Dict of algorithm-specific options +- `analysis_dir`: Analysis results directory + +**Returns**: Dict with: +- `XY`: pandas DataFrame with all input and output values +- `analysis`: Processed analysis results (HTML files, plots, metrics) +- `algorithm`: Algorithm path +- `iterations`: Number of iterations completed +- `total_evaluations`: Total number of function evaluations +- `summary`: Human-readable summary text + +### Input Variables: Factorial vs Non-Factorial Designs + +FZ supports two types of parametric study designs through different `input_variables` formats: + +#### Factorial Design (Dict) + +Use a **dict** to create a full factorial design (Cartesian product of all variable values): + +```python +# Dict with lists creates ALL combinations (factorial) +input_variables = { + "temp": [100, 200, 300], # 3 values + "pressure": [1.0, 2.0] # 2 values +} +# Creates 6 cases: 3 × 2 = 6 +# (100,1.0), (100,2.0), (200,1.0), (200,2.0), (300,1.0), (300,2.0) + +results = 
fz.fzr(input_file, input_variables, model, calculators) +``` + +**Use factorial design when:** +- You want to explore all possible combinations +- Variables are independent +- You need a complete design space exploration + +#### Non-Factorial Design (DataFrame) + +Use a **pandas DataFrame** to specify exactly which cases to run (non-factorial): + +```python +import pandas as pd + +# DataFrame: each row is ONE case (non-factorial) +input_variables = pd.DataFrame({ + "temp": [100, 200, 100, 300], + "pressure": [1.0, 1.0, 2.0, 1.5] +}) +# Creates 4 cases ONLY: +# (100,1.0), (200,1.0), (100,2.0), (300,1.5) +# Note: (100,2.0) is included but (200,2.0) is not + +results = fz.fzr(input_file, input_variables, model, calculators) +``` + +**Use non-factorial design when:** +- You have specific combinations to test +- Variables are coupled or have constraints +- You want to import a design from another tool +- You need an irregular or optimized sampling pattern + +**Examples of non-factorial patterns:** +```python +# Latin Hypercube Sampling +import pandas as pd +from scipy.stats import qmc + +sampler = qmc.LatinHypercube(d=2) +sample = sampler.random(n=10) +input_variables = pd.DataFrame({ + "x": sample[:, 0] * 100, # Scale to [0, 100] + "y": sample[:, 1] * 10 # Scale to [0, 10] +}) + +# Constraint-based design (only valid combinations) +input_variables = pd.DataFrame({ + "rpm": [1000, 1500, 2000, 2500], + "load": [10, 20, 40, 50] # load increases with rpm +}) + +# Imported from design of experiments tool +input_variables = pd.read_csv("doe_design.csv") +``` + ## Model Definition A model defines how to parse inputs and extract outputs: @@ -1389,6 +1581,351 @@ results = fz.fzr( print(results[['param', 'calculator', 'status']].head(10)) ``` +### Example 4: Design of Experiments with Adaptive Sampling + +```python +import fz +import matplotlib.pyplot as plt + +# Input template with perfect gas law +# (same as Example 1, but using fzd for adaptive design) + +model = { + 
"varprefix": "$", + "formulaprefix": "@", + "delim": "{}", + "commentline": "#", + "output": { + "pressure": "grep 'pressure = ' output.txt | awk '{print $3}'" + } +} + +# Run Monte Carlo sampling to explore pressure distribution +results = fz.fzd( + input_path="input.txt", + input_variables={ + "T_celsius": "[10;50]", # Range: 10 to 50°C + "V_L": "[1;10]", # Range: 1 to 10 L + "n_mol": "1.0" # Fixed: 1 mole + }, + model=model, + output_expression="pressure", + algorithm="examples/algorithms/montecarlo_uniform.py", + calculators=["sh://bash PerfectGazPressure.sh"], + algorithm_options={ + "batch_sample_size": 20, # 20 samples per iteration + "max_iterations": 10 # 10 iterations + }, + analysis_dir="monte_carlo_results" +) + +# Results DataFrame has all sampled points +print(f"Total evaluations: {results['total_evaluations']}") +print(f"Iterations: {results['iterations']}") +print(results['summary']) + +# Access the data +df = results['XY'] +print(df.head()) + +# Plot the sampled points +fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5)) + +# Scatter plot: Temperature vs Volume colored by Pressure +scatter = ax1.scatter(df['T_celsius'], df['V_L'], c=df['pressure'], + cmap='viridis', s=50, alpha=0.6) +ax1.set_xlabel('Temperature (°C)') +ax1.set_ylabel('Volume (L)') +ax1.set_title('Sampled Design Space') +plt.colorbar(scatter, ax=ax1, label='Pressure (Pa)') + +# Histogram of pressure values +ax2.hist(df['pressure'], bins=20, edgecolor='black') +ax2.set_xlabel('Pressure (Pa)') +ax2.set_ylabel('Frequency') +ax2.set_title('Pressure Distribution') + +plt.tight_layout() +plt.savefig('monte_carlo_analysis.png') +print("Analysis plot saved to monte_carlo_analysis.png") +``` + +### Example 5: Optimization with BFGS + +```python +import fz + +# Find temperature and volume that minimize pressure + +model = { + "varprefix": "$", + "formulaprefix": "@", + "delim": "{}", + "commentline": "#", + "output": { + "pressure": "grep 'pressure = ' output.txt | awk '{print $3}'" + } +} 
+ +results = fz.fzd( + input_path="input.txt", + input_variables={ + "T_celsius": "[10;50]", # Search range + "V_L": "[1;10]", # Search range + "n_mol": "1.0" # Fixed + }, + model=model, + output_expression="pressure", + algorithm="examples/algorithms/bfgs.py", + calculators=["sh://bash PerfectGazPressure.sh"], + algorithm_options={ + "minimize": True, # Minimize pressure + "max_iterations": 50 + }, + analysis_dir="optimization_results" +) + +# Get optimal point +df = results['XY'] +optimal_idx = df['pressure'].idxmin() +optimal = df.loc[optimal_idx] + +print(f"Optimal temperature: {optimal['T_celsius']:.2f}°C") +print(f"Optimal volume: {optimal['V_L']:.2f} L") +print(f"Minimum pressure: {optimal['pressure']:.2f} Pa") +print(f"Total evaluations: {results['total_evaluations']}") + +# Plot optimization path +import matplotlib.pyplot as plt +plt.figure(figsize=(10, 6)) +plt.scatter(df['T_celsius'], df['V_L'], c=df['pressure'], + cmap='coolwarm', s=100, edgecolor='black') +plt.plot(df['T_celsius'], df['V_L'], 'k--', alpha=0.3, label='Optimization path') +plt.scatter(optimal['T_celsius'], optimal['V_L'], + color='red', s=300, marker='*', + edgecolor='black', label='Optimum') +plt.xlabel('Temperature (°C)') +plt.ylabel('Volume (L)') +plt.title('BFGS Optimization Path') +plt.colorbar(label='Pressure (Pa)') +plt.legend() +plt.savefig('optimization_path.png') +print("Optimization path saved to optimization_path.png") +``` + +## Writing Custom Algorithms for fzd + +FZ provides an extensible framework for implementing adaptive algorithms. Each algorithm is a Python class with specific methods. + +### Algorithm Interface + +Create a Python file with a class implementing these methods: + +```python +class MyAlgorithm: + """Custom algorithm for design of experiments""" + + def __init__(self, **options): + """ + Initialize algorithm with options passed from algorithm_options. 
+ + Args: + **options: Algorithm-specific parameters (e.g., batch_size, max_iter) + """ + self.batch_size = options.get('batch_size', 10) + self.max_iterations = options.get('max_iterations', 100) + self.iteration = 0 + + def get_initial_design(self, input_vars, output_vars): + """ + Return initial design points to evaluate. + + Args: + input_vars: Dict[str, tuple] - {var_name: (min, max)} + e.g., {"x": (0.0, 10.0), "y": (-5.0, 5.0)} + output_vars: List[str] - Output variable names + + Returns: + List[Dict[str, float]] - Initial points to evaluate + e.g., [{"x": 0.5, "y": 0.0}, {"x": 7.5, "y": 2.3}] + """ + # Generate initial sample points + import random + points = [] + for _ in range(self.batch_size): + point = { + var: random.uniform(bounds[0], bounds[1]) + for var, bounds in input_vars.items() + } + points.append(point) + return points + + def get_next_design(self, previous_input_vars, previous_output_values): + """ + Return next design points based on previous results. + + Args: + previous_input_vars: List[Dict[str, float]] - All previous input combinations + previous_output_values: List[float] - Corresponding outputs (may contain None) + + Returns: + List[Dict[str, float]] - Next points to evaluate + Empty list [] signals algorithm is finished + """ + self.iteration += 1 + + # Stop if max iterations reached + if self.iteration >= self.max_iterations: + return [] # Empty list = finished + + # Generate next batch based on results + # ... your adaptive logic here ... + + return next_points + + def get_analysis(self, input_vars, output_values): + """ + Return final analysis results. + + Args: + input_vars: List[Dict[str, float]] - All evaluated inputs + output_values: List[float] - All outputs (may contain None) + + Returns: + Dict with analysis information (can include 'text', 'data', etc.) 
+ """ + # Filter out failed evaluations (None values) + valid_results = [(x, y) for x, y in zip(input_vars, output_values) if y is not None] + + # Guard against division by zero when every evaluation failed + mean = sum(y for _, y in valid_results) / len(valid_results) if valid_results else None + + return { + 'text': f"Algorithm completed: {len(valid_results)} successful evaluations", + 'data': {'mean': mean} + } + + def get_analysis_tmp(self, input_vars, output_values): + """ + [OPTIONAL] Return intermediate results at each iteration. + + Args: + input_vars: List[Dict[str, float]] - All inputs so far + output_values: List[float] - All outputs so far + + Returns: + Dict with intermediate analysis information + """ + valid_count = sum(1 for y in output_values if y is not None) + return { + 'text': f"Iteration {self.iteration}: {valid_count} valid samples" + } +``` + +### Algorithm Examples + +#### 1. Monte Carlo Sampling + +See `examples/algorithms/montecarlo_uniform.py`: + +```python +import fz + +results = fz.fzd( + input_path="input.txt", + input_variables={"x": "[0;10]", "y": "[0;5]"}, + model="mymodel", + output_expression="result", + algorithm="examples/algorithms/montecarlo_uniform.py", + calculators=["sh://bash calc.sh"], + algorithm_options={"batch_sample_size": 20, "max_iterations": 10} +) +``` + +#### 2. BFGS Optimization + +See `examples/algorithms/bfgs.py` (requires scipy): + +```python +results = fz.fzd( + input_path="input.txt", + input_variables={"x": "[0;10]", "y": "[0;5]"}, + model="mymodel", + output_expression="energy", + algorithm="examples/algorithms/bfgs.py", + calculators=["sh://bash calc.sh"], + algorithm_options={"minimize": True, "max_iterations": 50} +) +``` + +#### 3. 
Brent's Method (1D Optimization) + +See `examples/algorithms/brent.py` (requires scipy): + +```python +results = fz.fzd( + input_path="input.txt", + input_variables={"temperature": "[0;100]"}, # Single variable + model="mymodel", + output_expression="efficiency", + algorithm="examples/algorithms/brent.py", + calculators=["sh://bash calc.sh"], + algorithm_options={"minimize": False} # Maximize efficiency +) +``` + +### Algorithm Features + +#### Content Format Detection + +Algorithms can return analysis results in multiple formats: + +```python +def get_analysis(self, input_vars, output_values): + # Return HTML + return { + 'text': '

<h1>Results</h1><p>Mean: 42.5</p>

', + 'data': {'mean': 42.5} + } + # Saved to: analysis_.html + + # Return JSON + return { + 'text': '{"mean": 42.5, "std": 3.2}', + 'data': {} + } + # Saved to: analysis_.json + + # Return Markdown + return { + 'text': '# Results\n\n**Mean**: 42.5\n**Std**: 3.2', + 'data': {} + } + # Saved to: analysis_.md + + # Return key-value format + return { + 'text': 'mean=42.5\nstd=3.2\nsamples=100', + 'data': {} + } + # Saved to: analysis_.txt +``` + +See `docs/FZD_CONTENT_FORMATS.md` for detailed format documentation. + +#### Dependency Management + +Specify required packages using `__require__`: + +```python +__require__ = ["numpy", "scipy", "matplotlib"] + +class MyAlgorithm: + def __init__(self, **options): + import numpy as np + import scipy.optimize + # ... +``` + +FZ will check dependencies at load time and warn if packages are missing. + ## Configuration ### Environment Variables @@ -1411,6 +1948,11 @@ export FZ_SSH_AUTO_ACCEPT_HOSTKEYS=0 # Default formula interpreter (python or R) export FZ_INTERPRETER=python + +# Custom shell binary search path (for Windows, overrides system PATH) +# Use semicolon separator on Windows, colon on Unix/Linux +export FZ_SHELL_PATH="C:\msys64\usr\bin;C:\msys64\mingw64\bin" # Windows +export FZ_SHELL_PATH="/opt/custom/bin:/usr/local/bin" # Unix/Linux ``` ### Python Configuration @@ -1442,6 +1984,9 @@ your_project/ │ │ └── mymodel.json │ ├── calculators/ # Calculator aliases │ │ └── mycluster.json +│ ├── algorithms/ # Algorithm plugins +│ │ ├── myalgo.py +│ │ └── myalgo.R │ └── tmp/ # Temporary files (auto-created) │ └── fz_temp_*/ # Per-run temp directories └── results/ # Results directory @@ -1456,6 +2001,209 @@ your_project/ └── ... ``` +## Installing Plugins + +FZ supports installing models and algorithms as plugins from GitHub repositories, local zip files, or URLs. + +### Installing Algorithm Plugins + +Algorithm plugins enable design of experiments and optimization workflows. 
Install algorithms from GitHub repositories in the `fz-` format: + +#### From GitHub Repository Name + +```bash +# Install from Funz organization (convention: fz-) +fz install algorithm montecarlo + +# This installs from: https://github.com/Funz/fz-montecarlo +``` + +```python +# Python API +import fz + +# Install locally (.fz/algorithms/) +fz.install_algorithm("montecarlo") + +# Install globally (~/.fz/algorithms/) +fz.install_algorithm("montecarlo", global_install=True) +``` + +#### From GitHub URL + +```bash +# Install from full URL +fz install algorithm https://github.com/YourOrg/fz-custom-algo +``` + +```python +fz.install_algorithm("https://github.com/YourOrg/fz-custom-algo") +``` + +#### From Local Zip File + +```bash +# Install from downloaded zip +fz install algorithm ./fz-myalgo.zip +``` + +```python +fz.install_algorithm("./fz-myalgo.zip") +``` + +#### Using Installed Algorithms + +Once installed, algorithms can be referenced by name: + +```python +import fz + +# Use installed algorithm plugin +results = fz.fzd( + input_path="input.txt", + input_variables={"x": "[0;10]", "y": "[-5;5]"}, + model="mymodel", + output_expression="result", + algorithm="montecarlo", # Plugin name (no path or extension) + calculators=["sh://bash calc.sh"], + algorithm_options={"batch_sample_size": 20} +) +``` + +### Installing Model Plugins + +Model plugins define input parsing and output extraction patterns. 
Install models from GitHub: + +#### From GitHub Repository Name + +```bash +# Install from Funz organization (convention: fz-) +fz install model moret + +# This installs from: https://github.com/Funz/fz-moret +``` + +```python +# Python API +import fz + +# Install locally (.fz/models/) +fz.install("moret") + +# Install globally (~/.fz/models/) +fz.install("moret", global_install=True) +``` + +#### From GitHub URL or Local Zip + +```bash +fz install model https://github.com/Funz/fz-moret +fz install model ./fz-moret.zip +``` + +### Listing Installed Plugins + +```bash +# List installed algorithms +fz list algorithms + +# List only global algorithms +fz list algorithms --global + +# List installed models +fz list models + +# List only global models +fz list models --global +``` + +```python +# Python API +import fz + +# List algorithms +algorithms = fz.list_algorithms() +for name, info in algorithms.items(): + print(f"{name} ({info['type']}) - {info['file']}") + +# List models +models = fz.list_models() +for name, model in models.items(): + print(f"{name}: {model.get('id', 'N/A')}") +``` + +### Uninstalling Plugins + +```bash +# Uninstall algorithm +fz uninstall algorithm montecarlo + +# Uninstall from global location +fz uninstall algorithm montecarlo --global + +# Uninstall model +fz uninstall model moret +``` + +```python +# Python API +import fz + +# Uninstall algorithm +fz.uninstall_algorithm("montecarlo") + +# Uninstall model +fz.uninstall("moret") +``` + +### Plugin Priority + +When the same plugin exists in multiple locations, FZ uses the following priority: + +1. **Project-level** (`.fz/algorithms/` or `.fz/models/`) - Highest priority +2. **Global** (`~/.fz/algorithms/` or `~/.fz/models/`) - Fallback + +This allows project-specific customization while maintaining a personal library of reusable plugins. + +### Creating Algorithm Plugins + +To create your own algorithm plugin repository (for sharing or distribution): + +1. 
**Create repository** named `fz-` (e.g., `fz-montecarlo`) + +2. **Add algorithm file** as `.py` or `.R` in repository root or `.fz/algorithms/`: + +```python +# montecarlo.py +class MonteCarlo: + def __init__(self, **options): + self.n_samples = options.get("n_samples", 100) + + def get_initial_design(self, input_vars, output_vars): + import random + samples = [] + for _ in range(self.n_samples): + sample = {} + for var, (min_val, max_val) in input_vars.items(): + sample[var] = random.uniform(min_val, max_val) + samples.append(sample) + return samples + + def get_next_design(self, X, Y): + return [] # One-shot sampling + + def get_analysis(self, X, Y): + valid_Y = [y for y in Y if y is not None] + mean = sum(valid_Y) / len(valid_Y) if valid_Y else 0 + return {"text": f"Mean: {mean:.2f}", "data": {"mean": mean}} +``` + +3. **Push to GitHub** and share repository URL + +4. **Install** using `fz install algorithm ` or `fz install algorithm ` + +See `examples/algorithms/PLUGIN_SYSTEM.md` for complete documentation on the algorithm plugin system. 
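Before pushing a plugin, it can help to exercise its three methods outside of `fz`. The sketch below repeats the `MonteCarlo` class from the steps above and drives it with a small loop. `run_plugin_locally` and its `evaluate` and `max_rounds` parameters are hypothetical helpers written for this illustration; they are not part of the `fz` API. The loop assumes that `get_next_design` receives all points evaluated so far and that an empty design means the algorithm has finished.

```python
import random


class MonteCarlo:
    """Same three-method interface as the plugin shown in the steps above."""

    def __init__(self, **options):
        self.n_samples = options.get("n_samples", 100)

    def get_initial_design(self, input_vars, output_vars):
        # One uniform draw per variable, repeated n_samples times
        return [
            {var: random.uniform(lo, hi) for var, (lo, hi) in input_vars.items()}
            for _ in range(self.n_samples)
        ]

    def get_next_design(self, X, Y):
        return []  # One-shot sampling: an empty design means "finished"

    def get_analysis(self, X, Y):
        valid_Y = [y for y in Y if y is not None]
        mean = sum(valid_Y) / len(valid_Y) if valid_Y else 0
        return {"text": f"Mean: {mean:.2f}", "data": {"mean": mean}}


def run_plugin_locally(algo, input_vars, evaluate, max_rounds=50):
    """Hypothetical stand-in for the fzd loop (assumption, not the fz API)."""
    X, Y = [], []
    design = algo.get_initial_design(input_vars, ["result"])
    rounds = 0
    while design and rounds < max_rounds:
        X.extend(design)
        Y.extend(evaluate(point) for point in design)
        # Assumption: the algorithm is given every point evaluated so far
        design = algo.get_next_design(X, Y)
        rounds += 1
    return X, Y, algo.get_analysis(X, Y)


if __name__ == "__main__":
    random.seed(0)
    X, Y, analysis = run_plugin_locally(
        MonteCarlo(n_samples=20),
        {"x": (0.0, 10.0), "y": (-5.0, 5.0)},
        evaluate=lambda p: p["x"] ** 2 + p["y"] ** 2,
    )
    print(len(X), analysis["text"])
```

Replacing the toy `evaluate` function with a call to a real model is left to `fzd`; this loop only checks that the plugin returns well-formed designs and a well-formed analysis dict.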
+ ## Interrupt Handling FZ supports graceful interrupt handling for long-running calculations: @@ -1609,18 +2357,30 @@ python example_interrupt.py # Interactive interrupt demo fz/ ├── fz/ # Main package │ ├── __init__.py # Public API exports -│ ├── core.py # Core functions (fzi, fzc, fzo, fzr) -│ ├── interpreter.py # Variable parsing, formula evaluation -│ ├── runners.py # Calculation execution (sh, ssh) +│ ├── core.py # Core functions (fzi, fzc, fzo, fzr, fzd) +│ ├── interpreter.py # Variable parsing, formula evaluation +│ ├── runners.py # Calculation execution (sh, ssh, cache) │ ├── helpers.py # Parallel execution, retry logic │ ├── io.py # File I/O, caching, hashing +│ ├── algorithms.py # Algorithm framework for fzd +│ ├── shell.py # Shell utilities, binary path resolution │ ├── logging.py # Logging configuration +│ ├── cli.py # Command-line interface │ └── config.py # Configuration management +├── examples/ # Example files +│ └── algorithms/ # Example algorithms for fzd +│ ├── montecarlo_uniform.py # Monte Carlo sampling +│ ├── randomsampling.py # Simple random sampling +│ ├── bfgs.py # BFGS optimization +│ └── brent.py # Brent's 1D optimization ├── tests/ # Test suite │ ├── test_parallel.py # Parallel execution tests │ ├── test_interrupt_handling.py # Interrupt handling tests +│ ├── test_fzd.py # Design of experiments tests │ ├── test_examples_*.py # Example-based tests │ └── ... +├── docs/ # Documentation +│ └── FZD_CONTENT_FORMATS.md # fzd content format documentation ├── README.md # This file └── setup.py # Package configuration ``` diff --git a/SHELL_PATH_IMPLEMENTATION.md b/SHELL_PATH_IMPLEMENTATION.md deleted file mode 100644 index 35d568c..0000000 --- a/SHELL_PATH_IMPLEMENTATION.md +++ /dev/null @@ -1,224 +0,0 @@ -# FZ_SHELL_PATH Implementation Summary - -## Overview - -Implemented a comprehensive shell path configuration system for FZ that allows users to override the system `PATH` environment variable for binary resolution. 
This is particularly useful on Windows where Unix-like tools (grep, awk, sed, etc.) need to be found in custom locations like MSYS2, Git Bash, or user-defined paths. - -## Key Features - -### 1. Configuration Management -- **New environment variable**: `FZ_SHELL_PATH` - allows users to specify custom binary search paths -- **Path format**: Semicolon-separated on Windows (`;`), colon-separated on Unix/Linux (`:`) -- **Fallback**: If not set, uses system `PATH` environment variable -- **Integration**: Fully integrated into `fz/config.py` with proper configuration loading - -### 2. Binary Resolution System -- **New module**: `fz/shell_path.py` - Implements `ShellPathResolver` class -- **Features**: - - Resolves command names to absolute paths - - Caches resolved paths for performance - - Windows .exe extension handling (automatically tries both `cmd` and `cmd.exe`) - - Lists available binaries in configured paths - - Replaces command names in shell command strings with absolute paths - -### 3. Integration Points - -#### Model Output Expressions (fzo function) -- **File**: `fz/core.py` -- **Integration**: Output commands in model definitions are automatically resolved -- **Example**: - ```python - model = {"output": {"value": "grep 'pattern' output.txt | awk '{print $2}'"}} - # With FZ_SHELL_PATH=/msys64/usr/bin, executes: - # /msys64/usr/bin/grep.exe 'pattern' output.txt | /msys64/usr/bin/awk.exe '{print $2}' - ``` - -#### Shell Calculator Commands (sh://) -- **File**: `fz/runners.py` -- **Integration**: Commands in `sh://` calculator URIs are resolved -- **Example**: - ```python - calculators=["sh://grep 'result' data.txt | awk '{print $2}'"] - # Commands are resolved using FZ_SHELL_PATH - ``` - -### 4. 
Testing -- **Test file**: `tests/test_shell_path.py` -- **Coverage**: - - ShellPathResolver initialization and caching - - Path resolution on Windows and Unix - - Windows .exe extension handling - - Command string replacement - - Configuration integration - - Global resolver instance management -- **Test count**: 20 passed, 1 skipped -- **All core tests pass**: Verified with `test_cli_commands.py` and `test_examples_advanced.py` - -### 5. Documentation -- **CLAUDE.md**: Updated with comprehensive FZ_SHELL_PATH documentation - - Usage examples for MSYS2, Git Bash, and Unix/Linux - - How the system works - - Implementation details - - Performance considerations -- **Examples**: New `examples/shell_path_example.md` with practical use cases - -## Files Modified - -1. **fz/config.py** - - Added `shell_path` attribute to Config class - - Load `FZ_SHELL_PATH` from environment - - Include in config summary display - -2. **fz/shell_path.py** (NEW) - - `ShellPathResolver` class with full functionality - - Global resolver instance management - - Binary discovery and caching - -3. **fz/core.py** - - Import shell path resolution module - - Apply command resolution in fzo() function (both with and without subdirectories) - - Two locations: subdirectory case (line 556) and single directory case (line 592) - -4. **fz/runners.py** - - Import shell path resolution module - - Apply command resolution in run_local_calculation() function (line 665) - -5. **CLAUDE.md** - - New "Shell Path Configuration" section (lines 234-285) - - Usage examples and implementation details - - Windows-specific guidance - -6. **tests/test_shell_path.py** (NEW) - - Comprehensive test suite for shell path functionality - - Tests for Windows and Unix platforms - - Configuration integration tests - -7. 
**examples/shell_path_example.md** (NEW) - - Practical usage examples - - Troubleshooting guide - - Common use cases - -## Implementation Details - -### ShellPathResolver Class - -```python -class ShellPathResolver: - def __init__(self, custom_shell_path: Optional[str]): - # Initialize with custom path or None - - def get_search_paths(self) -> List[str]: - # Returns list of directories to search - - def resolve_command(self, command: str) -> Optional[str]: - # Resolves command name to absolute path with caching - - def list_available_binaries(self) -> List[str]: - # Lists all available binaries in search paths - - def replace_commands_in_string(self, command_string: str) -> str: - # Replaces command names with absolute paths in shell commands -``` - -### Global Functions - -- `get_resolver()` - Get global resolver instance -- `resolve_command(command)` - Resolve single command -- `replace_commands_in_string(command_string)` - Replace commands in string -- `reinitialize_resolver()` - Reset resolver after config reload - -### Platform Support - -- **Windows**: - - Semicolon-separated paths - - Automatic .exe extension handling - - Short path (8.3) format support for paths with spaces - -- **Unix/Linux**: - - Colon-separated paths - - Standard executable permission checking - -## Usage - -### Setting FZ_SHELL_PATH - -**Windows Command Prompt:** -```cmd -SET FZ_SHELL_PATH=C:\msys64\usr\bin;C:\msys64\mingw64\bin -fz input.txt -m mymodel -``` - -**Windows PowerShell:** -```powershell -$env:FZ_SHELL_PATH = "C:\msys64\usr\bin;C:\msys64\mingw64\bin" -fz input.txt -m mymodel -``` - -**Unix/Linux Bash:** -```bash -export FZ_SHELL_PATH=/opt/tools/bin:/usr/local/bin -fz input.txt -m mymodel -``` - -### Programmatic Usage - -```python -from fz import fzr -from fz.config import Config -from fz.shell_path import reinitialize_resolver - -# Set custom shell path -import os -os.environ['FZ_SHELL_PATH'] = '/opt/custom/bin' - -# Reinitialize resolver to pick up new path 
-reinitialize_resolver() - -# Now use fz functions with custom paths -results = fzr("input.txt", variables, model, calculators) -``` - -## Benefits - -1. **Consistency**: Ensure all team members use the same tool versions -2. **Portability**: Don't rely on system PATH which varies across machines -3. **Windows Compatibility**: Seamlessly handle multiple bash environments on Windows -4. **Performance**: Binary paths are cached after first lookup -5. **Flexibility**: Can prioritize custom tool installations over system tools -6. **Backward Compatible**: Works alongside existing code without breaking changes - -## Testing Results - -``` -tests/test_shell_path.py::TestShellPathResolver - 12 tests PASSED -tests/test_shell_path.py::TestGlobalResolver - 3 tests PASSED -tests/test_shell_path.py::TestConfigIntegration - 3 tests PASSED -tests/test_shell_path.py::TestWindowsPathResolution - 2 tests PASSED -tests/test_cli_commands.py - 46 tests PASSED, 3 SKIPPED -tests/test_examples_advanced.py - All tests PASSED -``` - -## Future Enhancements (Optional) - -1. Add Windows registry scanning for tool installations -2. Support for tool version detection and selection -3. Per-calculator shell path configuration -4. Binary aliasing (e.g., gawk → awk) -5. 
Shell path validation utility command - -## Backward Compatibility - -✅ **Fully backward compatible** -- Existing code works unchanged -- `FZ_SHELL_PATH` is optional -- Falls back to system `PATH` if not set -- No changes to public APIs beyond new functions in `shell_path` module - -## Documentation Status - -✅ Complete -- Code comments and docstrings throughout -- CLAUDE.md documentation with examples -- Example file with use cases -- Inline comments for complex logic -- Type hints on all functions diff --git a/docs/FZD_CONTENT_FORMATS.md b/docs/FZD_CONTENT_FORMATS.md new file mode 100644 index 0000000..82d61ca --- /dev/null +++ b/docs/FZD_CONTENT_FORMATS.md @@ -0,0 +1,300 @@ +# FZD Content Format Handling + +## Overview + +`fzd` (Design of Experiments) intelligently detects and processes different content formats returned by algorithm's `get_analysis()` and `get_analysis_tmp()` methods. Content is automatically saved to appropriate files and parsed into structured Python objects. + +## Supported Formats + +### 1. HTML Content +**Detection**: Presence of HTML tags (``, `
`, `

`, `

`, etc.) + +**Processing**: +- Saved to: `analysis_.html` +- Return structure: `{'html_file': 'analysis_.html'}` + +**Algorithm Example**: +```python +def get_analysis(self, X, Y): + return { + 'text': '

Results

Mean: 42.5

', + 'data': {'mean': 42.5} + } +``` + +**Result**: +- File created: `results_fzd/analysis_1.html` +- Python return: `result['analysis']['html_file'] == 'analysis_1.html'` +- Raw content NOT included in return (replaced with file reference) + +### 2. JSON Content +**Detection**: Text starts with `{` or `[` and is valid JSON + +**Processing**: +- Parsed to Python object +- Saved to: `analysis_.json` +- Return structure: + - `{'json_data': {...}}` - Parsed Python object + - `{'json_file': 'analysis_.json'}` - File reference + +**Algorithm Example**: +```python +def get_analysis(self, X, Y): + return { + 'text': '{"mean": 42.5, "std": 3.2, "samples": 100}', + 'data': {} + } +``` + +**Result**: +- File created: `results_fzd/analysis_1.json` +- Python return: + ```python + result['analysis']['json_data'] == {'mean': 42.5, 'std': 3.2, 'samples': 100} + result['analysis']['json_file'] == 'analysis_1.json' + ``` + +### 3. Key=Value Format +**Detection**: Multiple lines with `=` signs (at least 2) + +**Processing**: +- Parsed to Python dict +- Saved to: `analysis_.txt` +- Return structure: + - `{'keyvalue_data': {...}}` - Parsed dict + - `{'txt_file': 'analysis_.txt'}` - File reference + +**Algorithm Example**: +```python +def get_analysis(self, X, Y): + return { + 'text': '''mean=42.5 +std=3.2 +samples=100 +confidence_interval=[40.1, 44.9]''', + 'data': {} + } +``` + +**Result**: +- File created: `results_fzd/analysis_1.txt` +- Python return: + ```python + result['analysis']['keyvalue_data'] == { + 'mean': '42.5', + 'std': '3.2', + 'samples': '100', + 'confidence_interval': '[40.1, 44.9]' + } + result['analysis']['txt_file'] == 'analysis_1.txt' + ``` + +### 4. Markdown Content +**Detection**: Presence of markdown syntax (`#`, `##`, `*`, `-`, ` ``` `, etc.) 
+ +**Processing**: +- Saved to: `analysis_.md` +- Return structure: `{'md_file': 'analysis_.md'}` + +**Algorithm Example**: +```python +def get_analysis(self, X, Y): + return { + 'text': '''# Analysis Results + +## Statistics +- Mean: 42.5 +- Standard Deviation: 3.2 + +```python +# Algorithm configuration +samples = 100 +``` +''', + 'data': {'mean': 42.5, 'std': 3.2} + } +``` + +**Result**: +- File created: `results_fzd/analysis_1.md` +- Python return: `result['analysis']['md_file'] == 'analysis_1.md'` +- Raw markdown NOT included in return (replaced with file reference) + +### 5. Plain Text +**Detection**: None of the above formats detected + +**Processing**: +- Kept as-is in the return dict +- Return structure: `{'text': 'plain text content...'}` + +**Algorithm Example**: +```python +def get_analysis(self, X, Y): + return { + 'text': 'Mean: 42.5, Std: 3.2, Samples: 100', + 'data': {'mean': 42.5, 'std': 3.2} + } +``` + +**Result**: +- No file created +- Python return: `result['analysis']['text'] == 'Mean: 42.5, Std: 3.2, Samples: 100'` + +## Multiple Content Types + +Algorithms can return both 'text' and 'html' fields separately: + +```python +def get_analysis(self, X, Y): + return { + 'text': 'Summary: Mean is 42.5 with 100 samples', + 'html': '
', + 'data': {'mean': 42.5, 'samples': 100} + } +``` + +**Result**: +- File created: `results_fzd/analysis_1.html` (from 'html' field) +- Python return: + ```python + result['analysis']['text'] == 'Summary: Mean is 42.5 with 100 samples' + result['analysis']['html_file'] == 'analysis_1.html' + result['analysis']['data'] == {'mean': 42.5, 'samples': 100} + ``` + +## FZD Return Structure + +The complete structure returned by `fzd()`: + +```python +result = { + 'XY': pd.DataFrame, # All input variables and output values + + 'analysis': { # Processed analysis from get_analysis() + 'data': {...}, # Numeric/structured data from algorithm + + # Content-specific fields (depending on format detected): + 'html_file': 'analysis_N.html', # If HTML detected + 'json_data': {...}, # If JSON detected (parsed) + 'json_file': 'analysis_N.json', # JSON file reference + 'keyvalue_data': {...}, # If key=value detected (parsed) + 'txt_file': 'analysis_N.txt', # Key=value file reference + 'md_file': 'analysis_N.md', # If markdown detected + 'text': '...', # Plain text (no format detected) + }, + + 'algorithm': 'path/to/algorithm.py', + 'iterations': 5, + 'total_evaluations': 100, + 'summary': 'algorithm completed: 5 iterations, 100 evaluations (95 valid)' +} +``` + +## Accessing Results + +### Access parsed data: +```python +# For JSON format +mean = result['analysis']['json_data']['mean'] + +# For key=value format +mean = float(result['analysis']['keyvalue_data']['mean']) + +# For data dict (always available) +mean = result['analysis']['data']['mean'] +``` + +### Access file paths: +```python +import json +from pathlib import Path + +# HTML file +html_file = Path('results_fzd') / result['analysis']['html_file'] +with open(html_file) as f: + html_content = f.read() + +# JSON file +json_file = Path('results_fzd') / result['analysis']['json_file'] +with open(json_file) as f: + data = json.load(f) +``` + + +## Iteration Files + +For each iteration, `fzd` creates: + +1.
**Input data**: `X_.csv` - All input variable values +2. **Output data**: `Y_.csv` - All output values +3. **HTML summary**: `results_.html` - Iteration overview with embedded analysis +4. **Analysis files**: `analysis_.[html|json|txt|md]` - Processed algorithm output + +## Implementation Details + +### Content Detection (fz/io.py) +```python +def detect_content_type(text: str) -> str: + """Returns: 'html', 'json', 'keyvalue', 'markdown', or 'plain'""" +``` + +### Content Processing (fz/io.py) +```python +def process_analysis_content( + analysis_dict: Dict[str, Any], + iteration: int, + results_dir: Path +) -> Dict[str, Any]: + """Process get_analysis() output, detect formats, and save files""" +``` + +### Integration (fz/core.py) +- `_get_and_process_analysis()` - Calls process_analysis_content for each iteration +- Called for both `get_analysis()` (final) and `get_analysis_tmp()` (intermediate) + +## Testing + +Run content detection tests: +```bash +python -m pytest tests/test_fzd.py::TestContentDetection -v +``` + +Run demo: +```bash +python demo_fzd_content_formats.py +``` + +## Best Practices for Algorithm Developers + +1. **Use the 'data' field for structured numeric data** + ```python + return {'data': {'mean': 42.5, 'std': 3.2}, 'text': 'Summary...'} + ``` + +2. **Return JSON for complex structured data** + ```python + import json + return {'text': json.dumps({'results': [...], 'stats': {...}})} + ``` + +3. **Use markdown for formatted text with structure** + ```python + return {'text': '# Results\n\n## Statistics\n- Mean: 42.5\n- Std: 3.2'} + ``` + +4. **Use HTML for rich visualizations** + ```python + return {'html': '
', 'text': 'See plot'} + ``` + +5. **Use key=value for simple parameter lists** + ```python + return {'text': 'mean=42.5\nstd=3.2\nsamples=100'} + ``` + +## Notes + +- Raw HTML, markdown, and large content are saved to files and replaced with file references +- Parsed data (JSON, key=value) is available as Python objects in the analysis dict +- Plain text content remains in `analysis['text']` if no format is detected +- Algorithm text output is logged to console before being processed +- All file references are relative to the analysis_dir diff --git a/examples/algorithms/bfgs.py b/examples/algorithms/bfgs.py new file mode 100644 index 0000000..d6b8fae --- /dev/null +++ b/examples/algorithms/bfgs.py @@ -0,0 +1,87 @@ +#title: BFGS Optimization Algorithm +#author: Test +#type: optimization +#options: max_iter=100;tol=0.000001 + +class Bfgs: + """Simplified BFGS for multi-dimensional optimization""" + + def __init__(self, **options): + self.max_iter = int(options.get('max_iter', 100)) + self.tol = float(options.get('tol', 1e-6)) + self._iteration = 0 + self._var_names = [] + self._finished = False + + def get_initial_design(self, input_vars, output_vars): + self._var_names = list(input_vars.keys()) + # Start at center of search space + center = {var: (bounds[0] + bounds[1]) / 2 + for var, bounds in input_vars.items()} + return [center] + + def get_next_design(self, previous_input_vars, previous_output_values): + if self._finished: + return [] + + self._iteration += 1 + if self._iteration >= self.max_iter: + self._finished = True + return [] + + # Simple: sample around best point + valid_results = [(inp, out) for inp, out in + zip(previous_input_vars, previous_output_values) + if out is not None] + + if not valid_results: + self._finished = True + return [] + + best_input, best_output = min(valid_results, key=lambda x: x[1]) + + # Check if we're done (very simple convergence) + if len(valid_results) > 5: + recent_outputs = [out for _, out in valid_results[-5:]] + if 
max(recent_outputs) - min(recent_outputs) < self.tol: + self._finished = True + return [] + + # Generate point near best (simple random walk) + import random + next_point = {} + for var in self._var_names: + next_point[var] = best_input[var] + random.uniform(-0.1, 0.1) + + return [next_point] + + def get_analysis(self, input_vars, output_values): + valid_results = [(inp, out) for inp, out in zip(input_vars, output_values) + if out is not None] + + if not valid_results: + return { + 'text': 'No valid results', + 'data': {'iterations': self._iteration, 'evaluations': len(input_vars)} + } + + best_input, best_output = min(valid_results, key=lambda x: x[1]) + + result_text = f"""BFGS Optimization Results: + Iterations: {self._iteration} + Function evaluations: {len(valid_results)} + Optimal output: {best_output:.6g} + Optimal input: {best_input} + Convergence: {'Yes' if self._finished else 'No (max iterations)'} +""" + + return { + 'text': result_text, + 'data': { + 'iterations': self._iteration, + 'evaluations': len(valid_results), + 'optimal_output': best_output, + 'optimal_input': best_input, + 'converged': self._finished, + } + } diff --git a/examples/algorithms/brent.py b/examples/algorithms/brent.py new file mode 100644 index 0000000..50dcd84 --- /dev/null +++ b/examples/algorithms/brent.py @@ -0,0 +1,120 @@ +#title: Brent's Method for 1D Optimization +#author: Test +#type: optimization +#options: max_iter=50;tol=0.00001;initial_points=3 + +import math + +class Brent: + """Brent's method for 1D optimization""" + + def __init__(self, **options): + self.max_iter = int(options.get('max_iter', 50)) + self.tol = float(options.get('tol', 1e-5)) + self.initial_points = int(options.get('initial_points', 3)) + self.golden_ratio = (3.0 - math.sqrt(5.0)) / 2.0 + self._iteration = 0 + self._input_var_name = None + self._var_bounds = None + self._evaluated_points = [] + self._finished = False + + def get_initial_design(self, input_vars, output_vars): + if len(input_vars) != 
1: + raise ValueError( + f"Brent's method only works for 1D optimization. " + f"Got {len(input_vars)} variables: {list(input_vars.keys())}" + ) + + self._input_var_name = list(input_vars.keys())[0] + self._var_bounds = input_vars[self._input_var_name] + min_val, max_val = self._var_bounds + + points = [] + for i in range(self.initial_points): + x = min_val + (max_val - min_val) * i / (self.initial_points - 1) + points.append({self._input_var_name: x}) + return points + + def get_next_design(self, previous_input_vars, previous_output_values): + if self._finished: + return [] + + # Add new results + for inp, out in zip(previous_input_vars, previous_output_values): + if out is not None: + x = inp[self._input_var_name] + self._evaluated_points.append((x, out)) + + if len(self._evaluated_points) < self.initial_points: + return [] + + self._evaluated_points.sort(key=lambda p: p[0]) + + self._iteration += 1 + if self._iteration >= self.max_iter: + self._finished = True + return [] + + # Find best three consecutive points + best_idx = min(range(len(self._evaluated_points)), + key=lambda i: self._evaluated_points[i][1]) + + # Simple convergence check + if best_idx > 0 and best_idx < len(self._evaluated_points) - 1: + a_x = self._evaluated_points[best_idx - 1][0] + c_x = self._evaluated_points[best_idx + 1][0] + if abs(c_x - a_x) < self.tol: + self._finished = True + return [] + + # Golden section search + min_val, max_val = self._var_bounds + x_vals = [x for x, f in self._evaluated_points] + + # Find largest gap + all_x = sorted([min_val] + x_vals + [max_val]) + max_gap = 0 + max_gap_mid = None + for i in range(len(all_x) - 1): + gap = all_x[i + 1] - all_x[i] + if gap > max_gap: + max_gap = gap + max_gap_mid = (all_x[i] + all_x[i + 1]) / 2.0 + + if max_gap < self.tol or max_gap_mid is None: + self._finished = True + return [] + + return [{self._input_var_name: max_gap_mid}] + + def get_analysis(self, input_vars, output_values): + valid_results = [(inp, out) for inp, out in 
zip(input_vars, output_values) + if out is not None] + + if not valid_results: + return { + 'text': 'No valid results', + 'data': {'iterations': self._iteration, 'evaluations': len(input_vars)} + } + + best_input, best_output = min(valid_results, key=lambda x: x[1]) + + result_text = f"""Brent Optimization Results: + Iterations: {self._iteration} + Function evaluations: {len(valid_results)} + Optimal output: {best_output:.6g} + Optimal input: {best_input} + Convergence: {'Yes' if self._finished else 'No (max iterations)'} +""" + + return { + 'text': result_text, + 'data': { + 'iterations': self._iteration, + 'evaluations': len(valid_results), + 'optimal_output': best_output, + 'optimal_input': best_input, + 'converged': self._finished, + } + } diff --git a/examples/algorithms/demo_plugin_system.py b/examples/algorithms/demo_plugin_system.py new file mode 100644 index 0000000..581369d --- /dev/null +++ b/examples/algorithms/demo_plugin_system.py @@ -0,0 +1,168 @@ +#!/usr/bin/env python3 +""" +Demonstration of the algorithm plugin system + +This script demonstrates: +1. Creating an algorithm plugin in .fz/algorithms/ +2. Loading the algorithm by name (not path) +3. 
Using the plugin with fzd +""" + +import sys +from pathlib import Path +import tempfile +import shutil + +# Add parent directory to path for imports +sys.path.insert(0, str(Path(__file__).parent.parent.parent)) + +def demo_plugin_system(): + """Demonstrate the algorithm plugin system""" + + print("=" * 70) + print("Algorithm Plugin System Demo") + print("=" * 70) + + # Create temporary directory for demo + with tempfile.TemporaryDirectory() as tmpdir: + tmpdir = Path(tmpdir) + print(f"\nWorking in: {tmpdir}\n") + + # Step 1: Create .fz/algorithms/ directory + print("Step 1: Creating .fz/algorithms/ directory") + algo_dir = tmpdir / ".fz" / "algorithms" + algo_dir.mkdir(parents=True) + print(f" ✓ Created: {algo_dir}\n") + + # Step 2: Create a simple algorithm plugin + print("Step 2: Creating algorithm plugin 'quicksampler.py'") + plugin_file = algo_dir / "quicksampler.py" + plugin_file.write_text(""" +class QuickSampler: + '''Simple random sampler with fixed number of samples''' + + def __init__(self, **options): + self.n_samples = options.get("n_samples", 5) + self.iteration = 0 + + def get_initial_design(self, input_vars, output_vars): + import random + random.seed(42) + + samples = [] + for _ in range(self.n_samples): + sample = {} + for var, (min_val, max_val) in input_vars.items(): + sample[var] = random.uniform(min_val, max_val) + samples.append(sample) + + return samples + + def get_next_design(self, X, Y): + # One-shot sampling - return empty list (finished) + return [] + + def get_analysis(self, X, Y): + valid_Y = [y for y in Y if y is not None] + if not valid_Y: + return {"text": "No valid results", "data": {}} + + mean_val = sum(valid_Y) / len(valid_Y) + min_val = min(valid_Y) + max_val = max(valid_Y) + + return { + "text": f"Sampled {len(valid_Y)} points\\nMean: {mean_val:.2f}\\nRange: [{min_val:.2f}, {max_val:.2f}]", + "data": { + "mean": mean_val, + "min": min_val, + "max": max_val, + "n_samples": len(valid_Y) + } + } +""") + print(f" ✓ Created: 
{plugin_file.name}\n") + + # Step 3: Load algorithm by name (plugin mode) + print("Step 3: Loading algorithm by name 'quicksampler'") + print(" Note: No .py extension, no path - just the name!") + + import os + os.chdir(tmpdir) # Change to tmpdir so .fz/algorithms/ is found + + from fz.algorithms import load_algorithm + + algo = load_algorithm("quicksampler", n_samples=3) + print(f" ✓ Loaded algorithm: {type(algo).__name__}\n") + + # Step 4: Test the algorithm + print("Step 4: Testing the algorithm") + input_vars = {"x": (0.0, 10.0), "y": (-5.0, 5.0)} + output_vars = ["result"] + + design = algo.get_initial_design(input_vars, output_vars) + print(f" ✓ Generated {len(design)} samples:") + for i, point in enumerate(design): + print(f" Sample {i+1}: x={point['x']:.2f}, y={point['y']:.2f}") + + # Simulate outputs + outputs = [point['x']**2 + point['y']**2 for point in design] + print(f"\n ✓ Simulated outputs (x² + y²):") + for i, val in enumerate(outputs): + print(f" Output {i+1}: {val:.2f}") + + # Get analysis + analysis = algo.get_analysis(design, outputs) + print(f"\n ✓ Analysis:") + for line in analysis['text'].split('\n'): + print(f" {line}") + + print("\n" + "=" * 70) + print("✓ Plugin System Demo Complete!") + print("=" * 70) + + print("\nKey Takeaways:") + print(" • Algorithms stored in .fz/algorithms/") + print(" • Load by name: load_algorithm('quicksampler')") + print(" • Project-level: .fz/algorithms/ (current directory)") + print(" • Global: ~/.fz/algorithms/ (user home)") + print(" • Priority: Project-level overrides global") + print(" • Works with both .py and .R files") + print() + + +def demo_comparison(): + """Show side-by-side comparison of plugin vs direct path""" + + print("\n" + "=" * 70) + print("Plugin vs Direct Path Comparison") + print("=" * 70) + + print("\n📁 Plugin Mode (Recommended):") + print(" • Place file: .fz/algorithms/myalgo.py") + print(" • Load: load_algorithm('myalgo')") + print(" • Benefits: Organized, shareable, clean code") + 
print() + + print("📄 Direct Path Mode (Still works):") + print(" • Place file: anywhere/myalgo.py") + print(" • Load: load_algorithm('anywhere/myalgo.py')") + print(" • Benefits: Backward compatible, explicit") + print() + + print("🎯 Use plugin mode for:") + print(" • Team projects (commit .fz/algorithms/ to git)") + print(" • Personal library (~/.fz/algorithms/)") + print(" • Clean, maintainable code") + print() + + print("🎯 Use direct path for:") + print(" • Quick experiments") + print(" • External algorithms") + print(" • Legacy code") + print() + + +if __name__ == "__main__": + demo_plugin_system() + demo_comparison() diff --git a/examples/algorithms/montecarlo_uniform.R b/examples/algorithms/montecarlo_uniform.R new file mode 100644 index 0000000..289310b --- /dev/null +++ b/examples/algorithms/montecarlo_uniform.R @@ -0,0 +1,280 @@ + +#title: Estimate mean with given confidence interval range using Monte Carlo +#author: Yann Richet +#type: sampling +#options: batch_sample_size=10;max_iterations=100;confidence=0.9;target_confidence_range=1.0;seed=42 +#require: base64enc + +# Constructor for MonteCarlo_Uniform S3 class +MonteCarlo_Uniform <- function(...) { + # Get options from ... arguments + opts <- list(...) 
+ + # Create object with initial state + # Use an environment for mutable state (idiomatic S3 pattern) + state <- new.env(parent = emptyenv()) + state$n_samples <- 0 + state$variables <- list() + + obj <- list( + options = list( + batch_sample_size = as.integer( + ifelse(is.null(opts$batch_sample_size), 10, opts$batch_sample_size) + ), + max_iterations = as.integer( + ifelse(is.null(opts$max_iterations), 100, opts$max_iterations) + ), + confidence = as.numeric( + ifelse(is.null(opts$confidence), 0.9, opts$confidence) + ), + target_confidence_range = as.numeric( + ifelse(is.null(opts$target_confidence_range), 1.0, opts$target_confidence_range) + ) + ), + state = state # Environment for mutable state + ) + + # Set random seed + seed <- ifelse(is.null(opts$seed), 42, opts$seed) + set.seed(as.integer(seed)) + + # Set S3 class + class(obj) <- "MonteCarlo_Uniform" + + return(obj) +} + +# Generic function definitions (if not already defined) +if (!exists("get_initial_design")) { + get_initial_design <- function(obj, ...) UseMethod("get_initial_design") +} + +if (!exists("get_next_design")) { + get_next_design <- function(obj, ...) UseMethod("get_next_design") +} + +if (!exists("get_analysis")) { + get_analysis <- function(obj, ...) UseMethod("get_analysis") +} + +if (!exists("get_analysis_tmp")) { + get_analysis_tmp <- function(obj, ...) 
UseMethod("get_analysis_tmp") +} + +# Method: get_initial_design +get_initial_design.MonteCarlo_Uniform <- function(obj, input_variables, output_variables) { + # Store variable bounds in mutable state + # input_variables is a named list: list(var1 = c(min, max), var2 = c(min, max)) + for (v in names(input_variables)) { + bounds <- input_variables[[v]] + if (!is.numeric(bounds) || length(bounds) != 2) { + stop(paste("Input variable", v, "must have c(min, max) bounds for MonteCarlo_Uniform sampling")) + } + obj$state$variables[[v]] <- bounds + } + + return(generate_samples(obj, obj$options$batch_sample_size)) +} + +# Method: get_next_design +get_next_design.MonteCarlo_Uniform <- function(obj, X, Y) { + # Check max iterations + if (obj$state$n_samples >= obj$options$max_iterations * obj$options$batch_sample_size) { + return(list()) # Empty list signals finished + } + + # Filter out NULL/NA values + Y_valid <- Y[!sapply(Y, is.null) & !is.na(Y)] + Y_valid <- unlist(Y_valid) + + if (length(Y_valid) < 2) { + return(generate_samples(obj, obj$options$batch_sample_size)) + } + + # Calculate confidence interval + mean_y <- mean(Y_valid) + n <- length(Y_valid) + se <- sd(Y_valid) / sqrt(n) + + # t-distribution confidence interval + alpha <- 1 - obj$options$confidence + t_critical <- qt(1 - alpha/2, df = n - 1) + conf_int_lower <- mean_y - t_critical * se + conf_int_upper <- mean_y + t_critical * se + conf_range <- conf_int_upper - conf_int_lower + + # Stop if confidence interval is narrow enough + if (conf_range <= obj$options$target_confidence_range) { + return(list()) # Finished + } + + # Generate more samples + return(generate_samples(obj, obj$options$batch_sample_size)) +} + +# Method: get_analysis +get_analysis.MonteCarlo_Uniform <- function(obj, X, Y) { + analysis_dict <- list(text = "", data = list()) + + # Filter out NULL/NA values + Y_valid <- Y[!sapply(Y, is.null) & !is.na(Y)] + Y_valid <- unlist(Y_valid) + + if (length(Y_valid) < 2) { + analysis_dict$text <- "Not 
enough valid results to analyze statistics" + analysis_dict$data <- list(valid_samples = length(Y_valid)) + return(analysis_dict) + } + + # Calculate statistics + mean_y <- mean(Y_valid) + std_y <- sd(Y_valid) + n <- length(Y_valid) + se <- std_y / sqrt(n) + + # t-distribution confidence interval + alpha <- 1 - obj$options$confidence + t_critical <- qt(1 - alpha/2, df = n - 1) + conf_int_lower <- mean_y - t_critical * se + conf_int_upper <- mean_y + t_critical * se + + # Store data + analysis_dict$data <- list( + mean = mean_y, + std = std_y, + confidence_interval = c(conf_int_lower, conf_int_upper), + n_samples = length(Y_valid), + min = min(Y_valid), + max = max(Y_valid) + ) + + # Create text summary + analysis_dict$text <- sprintf( +"Monte Carlo Sampling Results: + Valid samples: %d + Mean: %.6f + Std: %.6f + %.0f%% confidence interval: [%.6f, %.6f] + Range: [%.6f, %.6f] +", + length(Y_valid), + mean_y, + std_y, + obj$options$confidence * 100, + conf_int_lower, + conf_int_upper, + min(Y_valid), + max(Y_valid) + ) + + # Try to create HTML with histogram + tryCatch({ + # Create histogram plot + png_file <- tempfile(fileext = ".png") + png(png_file, width = 800, height = 600) + + hist(Y_valid, breaks = 20, freq = FALSE, + col = rgb(0, 1, 0, 0.6), + border = "black", + main = "Output Distribution", + xlab = "Output Value", + ylab = "Density") + grid(col = rgb(0, 0, 0, 0.3)) + + # Add mean line + abline(v = mean_y, col = "red", lwd = 2, lty = 2) + legend("topright", + legend = sprintf("Mean: %.3f", mean_y), + col = "red", lty = 2, lwd = 2) + + dev.off() + + # Convert to base64 + if (requireNamespace("base64enc", quietly = TRUE)) { + img_base64 <- base64enc::base64encode(png_file) + + html_output <- sprintf( +'
<div>
+ <p>Estimated mean: %.6f</p>
+ <p>%.0f%% confidence interval: [%.6f, %.6f]</p>
+ <img src="data:image/png;base64,%s" alt="Histogram"/>
+</div>
', + mean_y, + obj$options$confidence * 100, + conf_int_lower, + conf_int_upper, + img_base64 + ) + analysis_dict$html <- html_output + } + + # Clean up temp file + unlink(png_file) + }, error = function(e) { + # If plotting fails, just skip it + }) + + return(analysis_dict) +} + +# Method: get_analysis_tmp +get_analysis_tmp.MonteCarlo_Uniform <- function(obj, X, Y) { + # Filter out NULL/NA values + Y_valid <- Y[!sapply(Y, is.null) & !is.na(Y)] + Y_valid <- unlist(Y_valid) + + if (length(Y_valid) < 2) { + return(list( + text = sprintf(" Progress: %d valid sample(s) collected", length(Y_valid)), + data = list(valid_samples = length(Y_valid)) + )) + } + + # Calculate statistics + mean_y <- mean(Y_valid) + std_y <- sd(Y_valid) + n <- length(Y_valid) + se <- std_y / sqrt(n) + + # t-distribution confidence interval + alpha <- 1 - obj$options$confidence + t_critical <- qt(1 - alpha/2, df = n - 1) + conf_int_lower <- mean_y - t_critical * se + conf_int_upper <- mean_y + t_critical * se + conf_range <- conf_int_upper - conf_int_lower + + return(list( + text = sprintf( + " Progress: %d samples, mean=%.6f, %.0f%% CI range=%.6f", + length(Y_valid), + mean_y, + obj$options$confidence * 100, + conf_range + ), + data = list( + n_samples = length(Y_valid), + mean = mean_y, + std = std_y, + confidence_range = conf_range + ) + )) +} + +# Helper function: generate_samples (not a method, internal use only) +generate_samples <- function(obj, n) { + samples <- list() + + for (i in 1:n) { + sample <- list() + for (v in names(obj$state$variables)) { + bounds <- obj$state$variables[[v]] + sample[[v]] <- runif(1, min = bounds[1], max = bounds[2]) + } + samples[[i]] <- sample + } + + # Update n_samples in state environment (mutable) + obj$state$n_samples <- obj$state$n_samples + n + + return(samples) +} diff --git a/examples/algorithms/montecarlo_uniform.py b/examples/algorithms/montecarlo_uniform.py new file mode 100644 index 0000000..8ff8a90 --- /dev/null +++ 
b/examples/algorithms/montecarlo_uniform.py @@ -0,0 +1,229 @@ + +#title: Estimate mean with given confidence interval range using Monte Carlo +#author: Yann Richet +#type: sampling +#options: batch_sample_size=10;max_iterations=100;confidence=0.9;target_confidence_range=1.0;seed=42 +#require: numpy;scipy;matplotlib + +class MonteCarlo_Uniform: + """Monte Carlo sampling algorithm with adaptive stopping based on confidence interval""" + + def __init__(self, **options): + """Initialize with algorithm options""" + self.options = {} + self.options["batch_sample_size"] = int(options.get("batch_sample_size", 10)) + self.options["max_iterations"] = int(options.get("max_iterations", 100)) + self.options["confidence"] = float(options.get("confidence", 0.9)) + self.options["target_confidence_range"] = float(options.get("target_confidence_range", 1.0)) + + self.n_samples = 0 + self.variables = {} + + import numpy as np + np.random.seed(int(options.get("seed", 42))) + + def get_initial_design(self, input_variables, output_variables): + """ + Generate initial design + + Args: + input_variables: Dict[str, Tuple[float, float]] - {var: (min, max)} + output_variables: List[str] - output variable names + """ + for v, bounds in input_variables.items(): + # Bounds are already parsed as tuples (min, max) + if isinstance(bounds, tuple) and len(bounds) == 2: + self.variables[v] = bounds + else: + raise ValueError( + f"Input variable {v} must have (min, max) tuple bounds for MonteCarlo_Uniform sampling" + ) + return self._generate_samples(self.options["batch_sample_size"]) + + def get_next_design(self, X, Y): + """ + Generate next design based on convergence criteria + + Args: + X: List[Dict[str, float]] - previous inputs + Y: List[float] - previous outputs (may contain None) + + Returns: + List[Dict[str, float]] - next points, or [] if finished + """ + # Check max iterations + if self.n_samples >= self.options["max_iterations"] * self.options["batch_sample_size"]: + return [] + + # Filter 
out None values + import numpy as np + from scipy import stats + Y_valid = [y for y in Y if y is not None] + + if len(Y_valid) < 2: + return self._generate_samples(self.options["batch_sample_size"]) + + Y_array = np.array(Y_valid) + mean = np.mean(Y_array) + conf_int = stats.t.interval( + self.options["confidence"], + len(Y_array) - 1, + loc=mean, + scale=stats.sem(Y_array) + ) + conf_range = conf_int[1] - conf_int[0] + + # Stop if confidence interval is narrow enough + if conf_range <= self.options["target_confidence_range"]: + return [] + + # Generate more samples + return self._generate_samples(self.options["batch_sample_size"]) + + def _generate_samples(self, n): + import numpy as np + samples = [] + for _ in range(n): + sample = {} + for v,(min_val,max_val) in self.variables.items(): + sample[v] = np.random.uniform(min_val, max_val) + samples.append(sample) + self.n_samples += n + return samples + + def get_analysis(self, X, Y): + """ + Display results with statistics and histogram + + Args: + X: List[Dict[str, float]] - all evaluated inputs + Y: List[float] - all outputs (may contain None) + + Returns: + Dict with 'text', 'data', and optionally 'html' keys + """ + import numpy as np + from scipy import stats + + analysis_dict = {"text": "", "data": {}} + + # Filter out None values + Y_valid = [y for y in Y if y is not None] + + if len(Y_valid) < 2: + analysis_dict["text"] = "Not enough valid results to analyze statistics" + analysis_dict["data"] = {"valid_samples": len(Y_valid)} + return analysis_dict + + Y_array = np.array(Y_valid) + mean = np.mean(Y_array) + std = np.std(Y_array) + conf_int = stats.t.interval( + self.options["confidence"], + len(Y_array) - 1, + loc=mean, + scale=stats.sem(Y_array) + ) + + # Store data + analysis_dict["data"] = { + "mean": float(mean), + "std": float(std), + "confidence_interval": [float(conf_int[0]), float(conf_int[1])], + "n_samples": len(Y_valid), + "min": float(np.min(Y_array)), + "max": float(np.max(Y_array)) + } + + # 
Create text summary + analysis_dict["text"] = f"""Monte Carlo Sampling Results: + Valid samples: {len(Y_valid)} + Mean: {mean:.6f} + Std: {std:.6f} + {self.options['confidence']*100:.0f}% confidence interval: [{conf_int[0]:.6f}, {conf_int[1]:.6f}] + Range: [{np.min(Y_array):.6f}, {np.max(Y_array):.6f}] +""" + + # Try to create HTML with histogram + try: + import matplotlib + matplotlib.use('Agg') # Non-interactive backend + import matplotlib.pyplot as plt + import base64 + from io import BytesIO + + plt.figure(figsize=(8, 6)) + plt.hist(Y_array, bins=20, density=True, alpha=0.6, color='g', edgecolor='black') + plt.title("Output Distribution") + plt.xlabel("Output Value") + plt.ylabel("Density") + plt.grid(alpha=0.3) + + # Add mean line + plt.axvline(mean, color='r', linestyle='--', linewidth=2, label=f'Mean: {mean:.3f}') + plt.legend() + + # Convert to base64 + buffered = BytesIO() + plt.savefig(buffered, format="png", dpi=100, bbox_inches='tight') + plt.close() + img_str = base64.b64encode(buffered.getvalue()).decode() + + html_output = f"""
<div>
+ <p>Estimated mean: {mean:.6f}</p>
+ <p>{self.options['confidence']*100:.0f}% confidence interval: [{conf_int[0]:.6f}, {conf_int[1]:.6f}]</p>
+ <img src="data:image/png;base64,{img_str}" alt="Histogram"/>
+</div>
""" + analysis_dict["html"] = html_output + except Exception as e: + # If plotting fails, just skip it + pass + + return analysis_dict + + def get_analysis_tmp(self, X, Y): + """ + Display intermediate results at each iteration + + Args: + X: List[Dict[str, float]] - all evaluated inputs so far + Y: List[float] - all outputs so far (may contain None) + + Returns: + Dict with 'text' and 'data' keys + """ + import numpy as np + from scipy import stats + + # Filter out None values + Y_valid = [y for y in Y if y is not None] + + if len(Y_valid) < 2: + return { + 'text': f" Progress: {len(Y_valid)} valid sample(s) collected", + 'data': {'valid_samples': len(Y_valid)} + } + + Y_array = np.array(Y_valid) + mean = np.mean(Y_array) + std = np.std(Y_array) + conf_int = stats.t.interval( + self.options["confidence"], + len(Y_array) - 1, + loc=mean, + scale=stats.sem(Y_array) + ) + conf_range = conf_int[1] - conf_int[0] + + return { + 'text': f" Progress: {len(Y_valid)} samples, " + f"mean={mean:.6f}, " + f"{self.options['confidence']*100:.0f}% CI range={conf_range:.6f}", + 'data': { + 'n_samples': len(Y_valid), + 'mean': float(mean), + 'std': float(std), + 'confidence_range': float(conf_range) + } + } + diff --git a/examples/algorithms/randomsampling.py b/examples/algorithms/randomsampling.py new file mode 100644 index 0000000..0271b7a --- /dev/null +++ b/examples/algorithms/randomsampling.py @@ -0,0 +1,60 @@ +#title: Random Sampling Algorithm +#author: Test +#type: sampling +#options: nvalues=10;seed=42 + +import random + +class Randomsampling: + """Random sampling algorithm for design of experiments""" + + def __init__(self, **options): + self.nvalues = int(options.get('nvalues', 10)) + seed = options.get('seed', None) + if seed is not None: + random.seed(int(seed)) + + def get_initial_design(self, input_vars, output_vars): + samples = [] + for i in range(self.nvalues): + sample = {} + for var_name, (min_val, max_val) in input_vars.items(): + sample[var_name] = 
random.uniform(min_val, max_val) + samples.append(sample) + return samples + + def get_next_design(self, previous_input_vars, previous_output_values): + return [] # One-shot algorithm + + def get_analysis(self, input_vars, output_values): + valid_results = [(inp, out) for inp, out in zip(input_vars, output_values) + if out is not None] + + if not valid_results: + return {'text': 'No valid results', 'data': {'samples': len(input_vars), 'valid_samples': 0}} + + best_input, best_output = min(valid_results, key=lambda x: x[1]) + worst_input, worst_output = max(valid_results, key=lambda x: x[1]) + valid_outputs = [out for out in output_values if out is not None] + mean_output = sum(valid_outputs) / len(valid_outputs) + + result_text = f"""Random Sampling Results: + Total samples: {len(input_vars)} + Valid samples: {len(valid_results)} + Best output: {best_output:.6g} + Best input: {best_input} + Worst output: {worst_output:.6g} + Mean output: {mean_output:.6g} +""" + + return { + 'text': result_text, + 'data': { + 'samples': len(input_vars), + 'valid_samples': len(valid_results), + 'best_output': best_output, + 'best_input': best_input, + 'worst_output': worst_output, + 'mean_output': mean_output, + } + } diff --git a/examples/dataframe_input.md b/examples/dataframe_input.md new file mode 100644 index 0000000..d4af658 --- /dev/null +++ b/examples/dataframe_input.md @@ -0,0 +1,402 @@ +# DataFrame Input for Non-Factorial Designs + +This document explains how to use pandas DataFrames as input to FZ for non-factorial parametric studies. + +## Overview + +FZ supports two types of parametric study designs: + +1. **Factorial Design (Dict)**: Creates all possible combinations (Cartesian product) +2. 
**Non-Factorial Design (DataFrame)**: Runs only specified combinations + +## When to Use DataFrames + +Use DataFrame input when: +- Variables have constraints or dependencies +- You need specific combinations, not all combinations +- You're importing a design from another tool (DOE software, optimization) +- You want specialized sampling (Latin Hypercube, Sobol sequences, etc.) +- You have an irregular or optimized design space + +## Basic Example + +### Factorial (Dict) - ALL Combinations + +```python +from fz import fzr + +# Dict creates Cartesian product (factorial design) +input_variables = { + "temp": [100, 200], + "pressure": [1.0, 2.0] +} +# Creates 4 cases: 2 × 2 = 4 +# (100, 1.0), (100, 2.0), (200, 1.0), (200, 2.0) + +results = fzr(input_file, input_variables, model, calculators) +``` + +### Non-Factorial (DataFrame) - SPECIFIC Combinations + +```python +import pandas as pd +from fz import fzr + +# DataFrame: each row is one case +input_variables = pd.DataFrame({ + "temp": [100, 200, 100], + "pressure": [1.0, 1.0, 2.0] +}) +# Creates 3 cases ONLY: +# (100, 1.0), (200, 1.0), (100, 2.0) +# Note: (200, 2.0) is NOT included + +results = fzr(input_file, input_variables, model, calculators) +``` + +## Practical Examples + +### 1. Constraint-Based Design + +When variables have physical or logical constraints: + +```python +import pandas as pd +from fz import fzr + +# Engine RPM and Load have constraints: +# - Low RPM → Low Load (avoid stalling) +# - High RPM → Higher Load possible +input_variables = pd.DataFrame({ + "rpm": [1000, 1500, 2000, 2500, 3000], + "load": [10, 20, 30, 40, 50] # Load increases with RPM +}) + +# This pattern CANNOT be created with a dict +# Dict would create all 25 combinations (5×5), including invalid ones like: +# (1000 RPM, 50 Load) - would stall the engine + +results = fzr("engine_input.txt", input_variables, model, calculators) +``` + +### 2. 
Latin Hypercube Sampling (LHS) + +For efficient design space exploration with fewer samples: + +```python +import pandas as pd +from scipy.stats import qmc +from fz import fzr + +# Create Latin Hypercube sample in 3 dimensions +sampler = qmc.LatinHypercube(d=3, seed=42) +sample = sampler.random(n=20) # 20 samples instead of full factorial + +# Scale to actual variable ranges +input_variables = pd.DataFrame({ + "temperature": 100 + sample[:, 0] * 200, # [100, 300] + "pressure": 1.0 + sample[:, 1] * 4.0, # [1.0, 5.0] + "flow_rate": 10 + sample[:, 2] * 40 # [10, 50] +}) + +# Compare: Full factorial with [100,150,200,250,300] × [1,2,3,4,5] × [10,20,30,40,50] +# would be 5×5×5 = 125 cases +# LHS: Only 20 cases, but covers the design space well + +results = fzr("simulation.txt", input_variables, model, calculators) +``` + +### 3. Sobol Sequence Sampling + +For low-discrepancy quasi-random sampling: + +```python +import pandas as pd +from scipy.stats import qmc +from fz import fzr + +# Generate Sobol sequence +sampler = qmc.Sobol(d=2, scramble=True, seed=42) +sample = sampler.random(n=32) # Power of 2 recommended for Sobol + +input_variables = pd.DataFrame({ + "x": sample[:, 0] * 100, # [0, 100] + "y": sample[:, 1] * 50 # [0, 50] +}) + +results = fzr("input.txt", input_variables, model, calculators) +``` + +### 4. Imported Design from DOE Software + +Import designs from external tools: + +```python +import pandas as pd +from fz import fzr + +# Design created in R (DoE.base), MODDE, JMP, etc. +input_variables = pd.read_csv("central_composite_design.csv") + +# Or Excel file +input_variables = pd.read_excel("doe_design.xlsx", sheet_name="Design") + +# Or from a previous FZ run +previous_results = pd.read_csv("results.csv") +# Re-run with different settings +input_variables = previous_results[["temp", "pressure", "flow"]] + +results = fzr("input.txt", input_variables, model, calculators) +``` + +### 5. 
Sensitivity Analysis (One-at-a-Time) + +Test effect of each variable independently: + +```python +import pandas as pd +from fz import fzr + +# Baseline case +baseline = {"temp": 150, "pressure": 2.5, "flow": 30} + +# One-at-a-time variations +oat_cases = [] + +# Vary temperature +for temp in [100, 125, 150, 175, 200]: + oat_cases.append({"temp": temp, "pressure": baseline["pressure"], "flow": baseline["flow"]}) + +# Vary pressure +for pressure in [1.0, 1.5, 2.0, 2.5, 3.0]: + oat_cases.append({"temp": baseline["temp"], "pressure": pressure, "flow": baseline["flow"]}) + +# Vary flow +for flow in [10, 20, 30, 40, 50]: + oat_cases.append({"temp": baseline["temp"], "pressure": baseline["pressure"], "flow": flow}) + +input_variables = pd.DataFrame(oat_cases) +# Creates 13 cases instead of full factorial (5×5×5 = 125) + +results = fzr("input.txt", input_variables, model, calculators) +``` + +### 6. Custom Optimization Samples + +Run calculations at specific points from an optimization algorithm: + +```python +import pandas as pd +import numpy as np +from fz import fzr + +# Points suggested by optimization algorithm (e.g., Bayesian Optimization) +optimization_points = np.array([ + [120, 1.5], + [180, 2.3], + [150, 1.8], + [200, 2.7], + [110, 1.2] +]) + +input_variables = pd.DataFrame( + optimization_points, + columns=["temp", "pressure"] +) + +results = fzr("input.txt", input_variables, model, calculators) + +# Use results to inform next iteration of optimization +best_case = results.loc[results["efficiency"].idxmax()] +``` + +### 7. 
Time Series / Sequential Cases + +When cases represent sequential states: + +```python +import pandas as pd +import numpy as np +from fz import fzr + +# Simulate a ramping process +time = np.linspace(0, 100, 50) +input_variables = pd.DataFrame({ + "time": time, + "temperature": 100 + 2 * time, # Linear ramp + "pressure": 1.0 + 0.5 * np.sin(time/10) # Oscillating pressure +}) + +results = fzr("input.txt", input_variables, model, calculators) +``` + +## DataFrame vs Dict Comparison + +| Aspect | Dict (Factorial) | DataFrame (Non-Factorial) | +|--------|------------------|---------------------------| +| **Number of cases** | All combinations (product) | Exactly as many rows in DataFrame | +| **Design type** | Full factorial | Custom / irregular | +| **Use case** | Complete exploration | Specific combinations | +| **Example** | `{"x": [1,2], "y": [3,4]}` → 4 cases | `pd.DataFrame({"x":[1,2], "y":[3,4]})` → 2 cases | +| **Constraints** | Cannot handle constraints | Can handle constraints | +| **Sampling** | Grid-based | Any sampling method | + +## Tips and Best Practices + +### 1. Verify Your Design + +Always check your DataFrame before running: + +```python +# Check number of cases +print(f"Number of cases: {len(input_variables)}") + +# Check for duplicates +duplicates = input_variables.duplicated() +if duplicates.any(): + print(f"Warning: {duplicates.sum()} duplicate cases found") + input_variables = input_variables.drop_duplicates() + +# Preview cases +print(input_variables.head()) +``` + +### 2. 
Combine with Results + +DataFrames make it easy to analyze results: + +```python +import pandas as pd +from fz import fzr + +input_variables = pd.DataFrame({ + "x": [1, 2, 3, 4, 5], + "y": [10, 20, 15, 25, 30] +}) + +results = fzr("input.txt", input_variables, model, calculators) + +# Results include all input variables +print(results[["x", "y", "output"]]) + +# Easy plotting +import matplotlib.pyplot as plt +plt.scatter(results["x"], results["output"], c=results["y"]) +plt.xlabel("X") +plt.ylabel("Output") +plt.colorbar(label="Y") +plt.show() +``` + +### 3. Save and Load Designs + +```python +import pandas as pd + +# Save design for later +input_variables.to_csv("my_design.csv", index=False) + +# Load and reuse +input_variables = pd.read_csv("my_design.csv") +results = fzr("input.txt", input_variables, model, calculators) +``` + +### 4. Append or Filter Cases + +```python +import pandas as pd + +# Start with a base design +base_design = pd.DataFrame({ + "temp": [100, 200, 300], + "pressure": [1.0, 2.0, 3.0] +}) + +# Add edge cases +edge_cases = pd.DataFrame({ + "temp": [50, 350], + "pressure": [0.5, 4.0] +}) + +input_variables = pd.concat([base_design, edge_cases], ignore_index=True) + +# Or filter to specific range +input_variables = input_variables[ + (input_variables["temp"] >= 100) & + (input_variables["temp"] <= 300) +] +``` + +## Common Patterns + +### Design of Experiments (DOE) + +```python +import pandas as pd +from itertools import product + +# 2^k factorial design (k=3 factors, 2 levels) +factors = { + "temp": [100, 200], + "pressure": [1.0, 2.0], + "flow": [10, 20] +} + +# Create all combinations (this is what dict does automatically) +combinations = list(product(*factors.values())) +full_factorial = pd.DataFrame(combinations, columns=factors.keys()) + +# Add center points +center_point = pd.DataFrame({ + "temp": [150], + "pressure": [1.5], + "flow": [15] +}) + +# Central Composite Design = factorial + center + star points +star_points = pd.DataFrame({ 
+ "temp": [50, 250, 150, 150, 150, 150], + "pressure": [1.5, 1.5, 0.5, 2.5, 1.5, 1.5], + "flow": [15, 15, 15, 15, 5, 25] +}) + +ccd_design = pd.concat([full_factorial, center_point, star_points], ignore_index=True) +``` + +### Sparse Grid / Adaptive Sampling + +```python +import pandas as pd +import numpy as np +from fz import fzr + +# Start with coarse grid +coarse_grid = pd.DataFrame({ + "x": [0, 50, 100], + "y": [0, 50, 100] +}) + +results_coarse = fzr("input.txt", coarse_grid, model, calculators) + +# Identify region of interest (e.g., high output) +threshold = results_coarse["output"].quantile(0.75) +interesting_cases = results_coarse[results_coarse["output"] > threshold] + +# Refine around interesting region +refined_grid = pd.DataFrame({ + "x": np.linspace(40, 60, 10), + "y": np.linspace(40, 60, 10) +}) + +results_refined = fzr("input.txt", refined_grid, model, calculators) +``` + +## Summary + +DataFrames provide maximum flexibility for parametric studies: +- ✅ Support non-factorial designs +- ✅ Handle variable constraints +- ✅ Enable advanced sampling methods +- ✅ Easy integration with DOE tools +- ✅ Seamless result analysis + +Use dicts for simple factorial designs, use DataFrames for everything else! diff --git a/examples/examples.md b/examples/examples.md index 6ff9579..50cb41e 100644 --- a/examples/examples.md +++ b/examples/examples.md @@ -559,3 +559,182 @@ fz.fzi("input_r.txt", }) ``` +# DataFrame input variable example + +```python +import pandas as pd +df=pd.DataFrame({ + "T_celsius": [20,25,30], + "V_L": [1,1.5,2], + "n_mol": [1,1,1] +}) +fz.fzr("input.txt", +df,{ + "varprefix": "$", + "formulaprefix": "@", + "delim": "{}", + "commentline": "#", + "output": {"pressure": "grep 'pressure = ' output.txt | awk '{print $3}'"} +}, calculators=["sh://bash ./PerfectGazPressure.sh"]*3, results_dir="results") +``` + +# fzd example + +Create a basic design-of-experiments algorithm that estimates a mean, sampling until its confidence interval (at a given confidence level) is narrower than a target range. 
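The stopping rule behind such an algorithm can be sketched in isolation. The snippet below is only an illustration under stated assumptions: `ci_range` and `sample_until_converged` are hypothetical helper names (not part of fz), and a normal approximation replaces the Student-t interval used in the algorithm file so that the sketch needs only the standard library.

```python
import math
import random
import statistics


def ci_range(values, confidence=0.9):
    """Width of a normal-approximation confidence interval for the mean."""
    n = len(values)
    if n < 2:
        return math.inf  # not enough data to estimate a spread
    sem = statistics.stdev(values) / math.sqrt(n)  # standard error of the mean
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided quantile
    return 2 * z * sem


def sample_until_converged(f, bounds, batch=10, max_iterations=100,
                           confidence=0.9, target_range=1.0, seed=42):
    """Draw uniform batches until the CI on mean(f) is narrow enough.

    Hypothetical helper: f maps a dict of inputs to a float, and
    bounds maps each variable name to a (min, max) tuple.
    """
    rng = random.Random(seed)
    ys = []
    for _ in range(max_iterations):
        for _ in range(batch):
            x = {name: rng.uniform(lo, hi) for name, (lo, hi) in bounds.items()}
            ys.append(f(x))
        if ci_range(ys, confidence) <= target_range:
            break  # interval is narrow enough, stop sampling
    return statistics.fmean(ys), ci_range(ys, confidence), len(ys)
```

For instance, `sample_until_converged(lambda x: x["a"] + x["b"], {"a": (0, 1), "b": (0, 2)}, target_range=0.1)` returns the estimated mean, the final interval width, and the number of samples drawn. For small samples the t-distribution gives a slightly wider interval than this approximation.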
+ +montecarlo_uniform.py: +```bash +echo ' +#title: Estimate mean with given confidence interval range using Monte Carlo +#author: Yann Richet +#type: sampling +#options: batch_sample_size=10;max_iterations=100;confidence=0.9;target_confidence_range=1.0;seed=42 +#require: numpy;scipy;matplotlib;base64 +class MonteCarlo_Uniform: + + options = {} + samples = [] + n_samples = 0 + variables = {} + + def __init__(self, options): + # parse (numeric) options + self.options["batch_sample_size"] = int(options.get("batch_sample_size",10)) + self.options["max_iterations"] = int(options.get("max_iterations",100)) + self.options["confidence"] = float(options.get("confidence",0.9)) + self.options["target_confidence_range"] = float(options.get("target_confidence_range",1.0)) + + import numpy as np + from scipy import stats + np.random.seed(int(options.get("seed",42))) + + def get_initial_design(self, input_variables, output_variables): + for v,bounds in input_variables.items(): + # parse bounds string : [min;max] + bounds = bounds.strip("[]").split(";") + if len(bounds)!=2: + raise Exception(f"Input variable {v} must be defined with min and max values for MonteCarlo_Uniform sampling") + min_val=float(bounds[0]) + max_val=float(bounds[1]) + self.variables[v] = (min_val, max_val) + return self._generate_samples(self.options["batch_sample_size"]) + + def get_next_design(self, X, Y): + # check max iterations + if self.n_samples >= self.options["max_iterations"] * self.options["batch_sample_size"]: + return None + # check confidence interval: compute empirical confidence interval (using kernel density) on Y, compare with target_confidence_range + import numpy as np + from scipy import stats + Y_array = np.array(Y) + kde = stats.gaussian_kde(Y_array) + mean = np.mean(Y_array) + conf_int = stats.t.interval(self.options["confidence"], len(Y_array)-1, loc=mean, scale=stats.sem(Y_array)) + conf_range = conf_int[1] - conf_int[0] + if conf_range <= self.options["target_confidence_range"]: + 
return None + # else generate new samples + return self._generate_samples(self.options["batch_sample_size"]) + + def _generate_samples(self, n): + import numpy as np + samples = [] + for _ in range(n): + sample = {} + for v,(min_val,max_val) in self.variables.items(): + sample[v] = np.random.uniform(min_val, max_val) + samples.append(sample) + self.n_samples += n + return samples + + def get_analysis(self, X, Y): + html_output = "" + import numpy as np + from scipy import stats + Y_array = np.array(Y) + mean = np.mean(Y_array) + conf_int = stats.t.interval(self.options["confidence"], len(Y_array)-1, loc=mean, scale=stats.sem(Y_array)) + html_output += f"
<p>Estimated mean: {mean}</p>" + html_output += f"<p>{self.options['confidence']*100}% confidence interval: [{conf_int[0]}, {conf_int[1]}]</p>" + # plot histogram + import matplotlib.pyplot as plt + plt.hist(Y_array, bins=20, density=True, alpha=0.6, color='g') + plt.title("Output Y histogram") + plt.xlabel("Y") + plt.ylabel("Density") + plt.grid() + # base64 in html + import base64 + from io import BytesIO + buffered = BytesIO() + plt.savefig(buffered, format="png") + img_str = base64.b64encode(buffered.getvalue()).decode() + html_output += f"<img src=\"data:image/png;base64,{img_str}\" alt=\"Histogram\"/>" + return html_output +' > ./examples/algorithms/montecarlo_uniform.py +``` + +```python +analysis = fz.fzd( + input_path='input.txt', + input_variables={ + "n_mol": "[0;10]", + "T_celsius": "[0;100]", + "V_L": "[1;5]" + }, + model={ + "varprefix": "$", + "formulaprefix": "@", + "delim": "{}", + "commentline": "#", + "output": {"pressure": "grep 'pressure = ' output.txt | awk '{print $3}'"} + }, + calculators=["sh://bash ./PerfectGazPressure.sh"]*10, + output_expression="pressure+1", + algorithm="./examples/algorithms/montecarlo_uniform.py", + algorithm_options={ + "batch_sample_size": 20, + "max_iterations": 50, + "confidence": 0.90, + "target_confidence_range": 1000000, + "seed": 123 + }, + analysis_dir="fzd_analysis" +) + +from IPython.display import display, HTML +display(HTML(analysis)) +``` + +With the R algorithm: +```python +analysis = fz.fzd( + input_path='input.txt', + input_variables={ + "n_mol": "[0;10]", + "T_celsius": "[0;100]", + "V_L": "[1;5]" + }, + model={ + "varprefix": "$", + "formulaprefix": "@", + "delim": "{}", + "commentline": "#", + "output": {"pressure": "grep 'pressure = ' output.txt | awk '{print $3}'"} + }, + calculators=["sh://bash ./PerfectGazPressure.sh"]*10, + output_expression="pressure+1", + algorithm="./examples/algorithms/montecarlo_uniform.R", + algorithm_options={ + "batch_sample_size": 20, + "max_iterations": 50, + "confidence": 0.90, + "target_confidence_range": 1000000, + "seed": 123 + }, + analysis_dir="fzd_analysis" +) + +from IPython.display import display, HTML +display(HTML(analysis)) +``` diff 
--git a/examples/fz_modelica_projectile.ipynb b/examples/fz_modelica_projectile.ipynb new file mode 100644 index 0000000..ec6d247 --- /dev/null +++ b/examples/fz_modelica_projectile.ipynb @@ -0,0 +1,1697 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# FZ Framework with Modelica - Projectile Motion Demo\n", + "\n", + "This notebook demonstrates the **FZ** parametric scientific computing framework integrated with **OpenModelica** for rigorous differential equation solving.\n", + "\n", + "## What We'll Cover\n", + "\n", + "1. **Installation** - Set up FZ and verify OpenModelica\n", + "2. **Modelica Model** - Use differential equations for projectile motion\n", + "3. **Basic Calculations** - Run parametric simulations with OpenModelica\n", + "4. **Design of Experiments** - Systematic parameter space exploration\n", + "5. **Optimization** - Find optimal launch parameters using Gradient Descent\n", + "6. **Root Finding** - Find angle for specific target range using Brent's method\n", + "\n", + "## About This Notebook\n", + "\n", + "This notebook demonstrates FZ's integration with **OpenModelica**, using true differential equations for physics simulation:\n", + "\n", + "- **Physics**: Differential equations solved by OpenModelica\n", + "- **Solver**: Professional ODE solvers (DASSL, Radau, etc.)\n", + "- **Setup**: Requires OpenModelica installation\n", + "- **Speed**: ~1-2s per simulation (includes Modelica compilation)\n", + "- **Best for**: Rigorous modeling, complex multi-physics systems\n", + "\n", + "---" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## 1. 
Installation and Setup\n", + "\n", + "This cell will set up everything you need:\n", + "- Python dependencies (FZ framework, pandas, numpy, scipy, matplotlib, plotly)\n", + "- fz-modelica plugin (model configuration and calculator)\n", + "- ProjectileMotion.mo template (downloaded from fz-modelica repository and enhanced)\n", + "\n", + "### Requirements\n", + "\n", + "**System requirements:**\n", + "- **OpenModelica** - Must be installed on your system\n", + " - Ubuntu/Debian: `sudo apt-get install openmodelica`\n", + " - macOS: `brew install openmodelica`\n", + " - Windows: Download from https://openmodelica.org/download/\n", + "- **Python 3.7+**\n", + "\n", + "**This cell will install automatically:**\n", + "- FZ framework from GitHub\n", + "- fz-modelica plugin (model config and calculator) from https://github.com/Funz/fz-modelica\n", + "- Python packages: pandas, numpy, scipy, matplotlib, plotly\n", + "\n", + "**The next cells will:**\n", + "- Download base ProjectileMotion.mo model from fz-modelica repository\n", + "- Enhance it with air resistance for realistic physics\n", + "- Compute physics outputs (max_height, range, etc.) from trajectory data\n", + "\n", + "Let's set up everything..." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Install OpenModelica (Google Colab)\n", + "\n", + "**Run this cell if you're on Google Colab** to install OpenModelica system package.\n", + "\n", + "On other platforms, install OpenModelica manually:\n", + "- Ubuntu/Debian: `sudo apt-get install openmodelica`\n", + "- macOS: `brew install openmodelica`\n", + "- Windows: Download from https://openmodelica.org/download/" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import sys\n", + "import subprocess\n", + "import shutil\n", + "\n", + "# Detect if running on Google Colab\n", + "try:\n", + " import google.colab\n", + " IN_COLAB = True\n", + "except ImportError:\n", + " IN_COLAB = False\n", + "\n", + "if IN_COLAB:\n", + " print(\"=\" * 60)\n", + " print(\"INSTALLING OPENMODELICA ON GOOGLE COLAB\")\n", + " print(\"=\" * 60)\n", + " print(\"\\nThis will install OpenModelica system package...\")\n", + " print(\"(This may take 2-5 minutes)\\n\")\n", + " \n", + " # Update package list\n", + " print(\"Step 1/3: Updating package list...\")\n", + " subprocess.run([\"apt-get\", \"update\", \"-qq\"], check=True)\n", + " print(\"✓ Package list updated\\n\")\n", + " \n", + " # Install OpenModelica\n", + " print(\"Step 2/3: Installing OpenModelica...\")\n", + " subprocess.run([\"apt-get\", \"install\", \"-y\", \"-qq\", \"openmodelica\"], \n", + " check=True, stdout=subprocess.DEVNULL)\n", + " print(\"✓ OpenModelica installed\\n\")\n", + " \n", + " # Verify installation\n", + " print(\"Step 3/3: Verifying installation...\")\n", + " omc_path = shutil.which(\"omc\")\n", + " if omc_path:\n", + " result = subprocess.run([\"omc\", \"--version\"], \n", + " capture_output=True, text=True, timeout=5)\n", + " version = result.stdout.strip()\n", + " print(f\"✓ OpenModelica found: {omc_path}\")\n", + " print(f\"✓ Version: {version}\")\n", + " else:\n", + " print(\"⚠ OpenModelica installation may 
have failed\")\n", + " \n", + " print(\"\\n\" + \"=\" * 60)\n", + " print(\"OPENMODELICA INSTALLATION COMPLETE\")\n", + " print(\"=\" * 60)\n", + "else:\n", + " print(\"Not running on Google Colab - skipping automatic installation\")\n", + " print(\"Please install OpenModelica manually if not already installed:\")\n", + " print(\" - Ubuntu/Debian: sudo apt-get install openmodelica\")\n", + " print(\" - macOS: brew install openmodelica\")\n", + " print(\" - Windows: https://openmodelica.org/download/\")" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "============================================================\n", + "FZ MODELICA SETUP\n", + "============================================================\n", + "✓ Created directory structure in: /home/richet/Sync/Open/Funz/github/fz/examples/tmp/tmp\n", + "\n", + "Changed working directory to: /home/richet/Sync/Open/Funz/github/fz/examples/tmp/tmp\n", + "\n", + "Step 1/2: Installing FZ framework and dependencies...\n", + " (This may take 1-2 minutes)\n", + "\n", + "✓ FZ and Python dependencies installed\n", + "\n", + "Step 2/2: Installing fz-modelica plugin from GitHub...\n", + " ✓ Installed model: Modelica\n", + " ✓ Model path: /home/richet/Sync/Open/Funz/github/fz/examples/tmp/tmp/.fz/models/Modelica.json\n", + " ✓ Installed 1 calculator(s)\n", + " - localhost.json\n", + "\n", + "Step 3/3: Checking OpenModelica installation...\n", + " ✓ OpenModelica found: /usr/bin/omc\n", + " ✓ Version: OpenModelica 1.25.4\n", + "\n", + "============================================================\n", + "SETUP COMPLETE\n", + "============================================================\n", + "Working directory: /home/richet/Sync/Open/Funz/github/fz/examples/tmp/tmp\n", + "fz-modelica model: installed in .fz/models/\n", + "fz-modelica calculator: installed in .fz/calculators/\n", + "OpenModelica: ✓ Available\n", + 
"============================================================\n", + "\n", + "✓ All resources installed and ready to use!\n" + ] + } + ], + "source": [ + "#!/usr/bin/env python3\n", + "\"\"\"\n", + "Complete setup: Creates tmp/ directory and installs fz plugins\n", + "\"\"\"\n", + "import os\n", + "import sys\n", + "import subprocess\n", + "import shutil\n", + "\n", + "# Create tmp directory structure\n", + "tmp_dir = os.path.abspath('tmp')\n", + "\n", + "print(\"=\" * 60)\n", + "print(\"FZ MODELICA SETUP\")\n", + "print(\"=\" * 60)\n", + "\n", + "# Clean and create tmp directory\n", + "if os.path.exists(tmp_dir):\n", + " print(f\"Cleaning existing tmp directory: {tmp_dir}\")\n", + " shutil.rmtree(tmp_dir)\n", + "\n", + "os.makedirs(tmp_dir, exist_ok=True)\n", + "print(f\"✓ Created directory structure in: {tmp_dir}\\n\")\n", + "\n", + "os.chdir(tmp_dir)\n", + "print(f\"Changed working directory to: {tmp_dir}\\n\")\n", + "\n", + "# Step 1: Install FZ and dependencies\n", + "print(\"Step 1/2: Installing FZ framework and dependencies...\")\n", + "print(\" (This may take 1-2 minutes)\\n\")\n", + "\n", + "# Install FZ from GitHub (quietly)\n", + "subprocess.run([sys.executable, '-m', 'pip', 'install', '-q',\n", + " 'git+https://github.com/Funz/fz.git'],\n", + " check=True)\n", + "\n", + "# Install dependencies (quietly)\n", + "subprocess.run([sys.executable, '-m', 'pip', 'install', '-q',\n", + " 'pandas', 'numpy', 'scipy', 'matplotlib', 'plotly'],\n", + " check=True)\n", + "\n", + "print(\"✓ FZ and Python dependencies installed\\n\")\n", + "\n", + "# Step 2: Install fz-modelica plugin\n", + "print(\"Step 2/2: Installing fz-modelica plugin from GitHub...\")\n", + "\n", + "import fz\n", + "\n", + "try:\n", + " # Install fz-modelica model and calculator\n", + " result = fz.install_model('modelica', global_install=False)\n", + " print(f\" ✓ Installed model: {result['model_name']}\")\n", + " print(f\" ✓ Model path: {result['install_path']}\")\n", + " if 
result.get('calculators'):\n", + " print(f\" ✓ Installed {len(result['calculators'])} calculator(s)\")\n", + " for calc in result['calculators']:\n", + " print(f\" - {os.path.basename(calc)}\")\n", + "except Exception as e:\n", + " print(f\" ⚠ Error installing fz-modelica: {e}\")\n", + " print(f\" Will try to use model name 'modelica' directly\")\n", + "\n", + "print()\n", + "\n", + "# Step 3: Verify OpenModelica installation\n", + "print(\"Step 3/3: Checking OpenModelica installation...\")\n", + "omc_path = shutil.which(\"omc\")\n", + "\n", + "if omc_path:\n", + " try:\n", + " result = subprocess.run([\"omc\", \"--version\"], \n", + " capture_output=True, text=True, timeout=5)\n", + " version = result.stdout.strip()\n", + " print(f\" ✓ OpenModelica found: {omc_path}\")\n", + " print(f\" ✓ Version: {version}\")\n", + " openmodelica_available = True\n", + " except Exception as e:\n", + " print(f\" ⚠ OpenModelica found but error checking version: {e}\")\n", + " openmodelica_available = False\n", + "else:\n", + " print(\" ❌ OpenModelica (omc) not found in PATH\")\n", + " print(\"\\n This notebook requires OpenModelica. 
Install it:\")\n", + " print(\" Ubuntu/Debian: sudo apt-get install openmodelica\")\n", + " print(\" macOS: brew install openmodelica\")\n", + " print(\" Windows: https://openmodelica.org/download/\")\n", + " openmodelica_available = False\n", + "\n", + "print()\n", + "print(\"=\" * 60)\n", + "print(\"SETUP COMPLETE\")\n", + "print(\"=\" * 60)\n", + "print(f\"Working directory: {tmp_dir}\")\n", + "print(f\"fz-modelica model: installed in .fz/models/\")\n", + "print(f\"fz-modelica calculator: installed in .fz/calculators/\")\n", + "print(f\"OpenModelica: {'✓ Available' if openmodelica_available else '✗ Not available'}\")\n", + "print(\"=\" * 60)\n", + "\n", + "print(\"\\n✓ All resources installed and ready to use!\")" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "FZ version: 0.9.0\n", + "Working directory: /home/richet/Sync/Open/Funz/github/fz/examples/tmp/tmp\n", + "OpenModelica: ✓ Available\n", + "\n", + "✓ Using fz-modelica plugin (model name: 'Modelica')\n", + "✓ All resources ready\n" + ] + } + ], + "source": [ + "# Import FZ and other libraries\n", + "import fz\n", + "import pandas as pd\n", + "import numpy as np\n", + "import matplotlib.pyplot as plt\n", + "from IPython.display import HTML, display\n", + "\n", + "print(f\"FZ version: {fz.__version__}\")\n", + "print(f\"Working directory: {tmp_dir}\")\n", + "print(f\"OpenModelica: {'✓ Available' if openmodelica_available else '✗ Not available'}\")\n", + "\n", + "# Model is installed and will be used by name\n", + "print(f\"\\n✓ Using fz-modelica plugin (model name: 'Modelica')\")\n", + "print(f\"✓ All resources ready\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "\n", + "## 2. 
Modelica Projectile Motion Model\n", + "\n", + "We'll use a **Modelica model** that solves differential equations for projectile motion with air resistance.\n", + "\n", + "### Creating the Enhanced FZ Template\n", + "\n", + "The next cell will:\n", + "1. Download the base `ProjectileMotion.mo` from the **fz-modelica** GitHub repository\n", + "2. Enhance it by adding air resistance parameters (`k` and `m`)\n", + "3. Add **dynamic termination** (stops when projectile lands)\n", + "4. Replace parameter values with `${var}` placeholders for FZ\n", + "5. Save the enhanced template to `tmp/ProjectileMotion.mo`\n", + "\n", + "This creates a template that FZ can compile with different parameter values.\n", + "\n", + "### Model Description\n", + "\n", + "The enhanced Modelica model implements:\n", + "\n", + "**State Variables:**\n", + "- `x, y` - Position coordinates [m]\n", + "- `vx, vy` - Velocity components [m/s]\n", + "\n", + "**Differential Equations:**\n", + "```modelica\n", + "der(x) = vx\n", + "der(y) = vy\n", + "m * der(vx) = -k * vx * v\n", + "m * der(vy) = -k * vy * v - m * g\n", + "```\n", + "\n", + "Where drag force is: `F_drag = -k * v * |v|` (quadratic air resistance)\n", + "\n", + "**Dynamic Termination:**\n", + "```modelica\n", + "when y <= 0.0 and pre(launched) then\n", + " terminate(\"Projectile has landed\");\n", + "end when;\n", + "```\n", + "\n", + "The simulation automatically stops when the projectile lands, avoiding wasted computation!\n", + "\n", + "**Parameters (FZ variables):**\n", + "- `v0` - Initial velocity [m/s] → `${v0}`\n", + "- `angle` - Launch angle [degrees] → `${angle}`\n", + "- `k` - Air resistance coefficient [1/m] → `${k}` (added)\n", + "- `m` - Projectile mass [kg] → `${m}` (added)\n", + "\n", + "**Outputs** (computed by post-processing):\n", + "- Maximum height, range, flight time\n", + "- Final velocity, impact angle\n", + "- Energy loss to air resistance\n", + "\n", + "**Note:** The base model from fz-modelica has no air 
resistance and uses fixed stopTime. We enhance it with realistic physics and dynamic termination." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "✓ Created enhanced FZ template at: ProjectileMotion.mo\n", + "✓ Enhanced base model with:\n", + " - Air resistance (k parameter)\n", + " - Mass parameter (m)\n", + " - Quadratic drag forces\n", + " - Dynamic termination (stops when projectile lands)\n", + "\n", + "Template preview (first 700 chars):\n", + "============================================================\n", + "model ProjectileMotion \"Projectile motion with air resistance\"\n", + "\n", + " // Parameters (can be set from FZ)\n", + " parameter Real v0 = ${v0} \"Initial velocity [m/s]\";\n", + " parameter Real angle_deg = ${angle} \"Launch angle [degrees]\";\n", + " parameter Real k = ${k} \"Air resistance coefficient [1/m]\";\n", + " parameter Real m = ${m} \"Projectile mass [kg]\";\n", + " parameter Real g = 9.81 \"Gravitational acceleration [m/s^2]\";\n", + "\n", + " // Convert angle to radians\n", + " parameter Real angle = angle_deg * 3.14159265359 / 180.0 \"Launch angle [rad]\";\n", + "\n", + " // Initial velocity components\n", + " parameter Real vx0 = v0 * cos(angle) \"Initial horizontal velocity [m/s]\";\n", + " parameter Real vy0 = v0 * sin(angle) \"Initial vertical velocity [m/s]\";\n", + "\n", + " // Sta\n", + "...\n", + "\n", + "✓ Parameters with FZ variables:\n", + " - v0 = ${v0}\n", + " - angle_deg = ${angle}\n", + " - k = ${k}\n", + " - m = ${m}\n", + "\n", + "✓ Dynamic termination condition:\n", + " - Stops automatically when y <= 0 after launch\n", + " - No more wasted computation with fixed stopTime!\n" + ] + } + ], + "source": [ + "# Enhance the model with air resistance and dynamic termination\n", + "# The base model has no air resistance - we'll add it\n", + "# Also add dynamic termination when projectile lands (y <= 0 after 
launch)\n", + "enhanced_model = '''model ProjectileMotion \"Projectile motion with air resistance\"\n", + "\n", + " // Parameters (can be set from FZ)\n", + " parameter Real v0 = ${v0} \"Initial velocity [m/s]\";\n", + " parameter Real angle_deg = ${angle} \"Launch angle [degrees]\";\n", + " parameter Real k = ${k} \"Air resistance coefficient [1/m]\";\n", + " parameter Real m = ${m} \"Projectile mass [kg]\";\n", + " parameter Real g = 9.81 \"Gravitational acceleration [m/s^2]\";\n", + "\n", + " // Convert angle to radians\n", + " parameter Real angle = angle_deg * 3.14159265359 / 180.0 \"Launch angle [rad]\";\n", + "\n", + " // Initial velocity components\n", + " parameter Real vx0 = v0 * cos(angle) \"Initial horizontal velocity [m/s]\";\n", + " parameter Real vy0 = v0 * sin(angle) \"Initial vertical velocity [m/s]\";\n", + "\n", + " // State variables\n", + " Real x(start=0, fixed=true) \"Horizontal position [m]\";\n", + " Real y(start=0, fixed=true) \"Vertical position [m]\";\n", + " Real vx(start=vx0, fixed=true) \"Horizontal velocity [m/s]\";\n", + " Real vy(start=vy0, fixed=true) \"Vertical velocity [m/s]\";\n", + "\n", + " // Auxiliary variables\n", + " Real v \"Total velocity magnitude [m/s]\";\n", + " Real drag_x \"Horizontal drag force [N]\";\n", + " Real drag_y \"Vertical drag force [N]\";\n", + " \n", + " // Flag to detect when projectile has launched (y > 0.1m)\n", + " Boolean launched(start=false, fixed=true);\n", + "\n", + "equation\n", + " // Velocity magnitude\n", + " v = sqrt(vx^2 + vy^2);\n", + "\n", + " // Drag forces (proportional to velocity squared)\n", + " drag_x = -k * vx * v;\n", + " drag_y = -k * vy * v;\n", + "\n", + " // Differential equations (Newton's second law: F = ma)\n", + " der(x) = vx;\n", + " der(y) = vy;\n", + " m * der(vx) = drag_x;\n", + " m * der(vy) = drag_y - m * g;\n", + " \n", + " // Track if projectile has launched\n", + " launched = y > 0.1;\n", + "\n", + "algorithm\n", + " // Terminate when projectile lands (y <= 0) 
after launch\n", + " when y <= 0.0 and pre(launched) then\n", + " terminate(\"Projectile has landed\");\n", + " end when;\n", + "\n", + " annotation(\n", + " experiment(StartTime=0, StopTime=100, Tolerance=1e-6, Interval=0.01),\n", + " Documentation(info=\"\n", + "

Projectile Motion with Air Resistance\n", + "Enhanced from base fz-modelica model with air resistance and dynamic termination.\n", + "The drag force is proportional to velocity squared: F_drag = -k * v * |v|\n", + "The simulation automatically terminates when the projectile lands (y <= 0 after launch).\n", + "Parameters:\n", + " - v0: Initial velocity [m/s]\n", + " - angle_deg: Launch angle [degrees]\n", + " - k: Air resistance coefficient [1/m]\n", + " - m: Projectile mass [kg]\n", + "State Variables:\n", + " - x, y: Position coordinates [m]\n", + " - vx, vy: Velocity components [m/s]\n", + "Output variables (max_height, range, flight_time, final_velocity, impact_angle, energy_loss) are computed by post-processing the trajectory data.
\n", + "\")\n", + " );\n", + "\n", + "end ProjectileMotion;\n", + "'''\n", + "\n", + "# Save enhanced template to tmp directory\n", + "with open('ProjectileMotion.mo', 'w') as f:\n", + " f.write(enhanced_model)\n", + "\n", + "print(f\"✓ Created enhanced FZ template at: ProjectileMotion.mo\")\n", + "print(\"✓ Enhanced base model with:\")\n", + "print(\" - Air resistance (k parameter)\")\n", + "print(\" - Mass parameter (m)\")\n", + "print(\" - Quadratic drag forces\")\n", + "print(\" - Dynamic termination (stops when projectile lands)\")\n", + "print(\"\\nTemplate preview (first 700 chars):\")\n", + "print(\"=\" * 60)\n", + "print(enhanced_model[:700] + \"\\n...\")\n", + "\n", + "print(\"\\n✓ Parameters with FZ variables:\")\n", + "print(\" - v0 = ${v0}\")\n", + "print(\" - angle_deg = ${angle}\")\n", + "print(\" - k = ${k}\")\n", + "print(\" - m = ${m}\")\n", + "print(\"\\n✓ Dynamic termination condition:\")\n", + "print(\" - Stops automatically when y <= 0 after launch\")\n", + "print(\" - No more wasted computation with fixed stopTime!\")\n" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Detected input variables (from Modelica template):\n", + "{\n", + " \"angle\": null,\n", + " \"k\": null,\n", + " \"m\": null,\n", + " \"v0\": null\n", + "}\n" + ] + } + ], + "source": [ + "# Parse the Modelica template to identify variables\n", + "import fz\n", + "variables = fz.fzi(\n", + " input_path='ProjectileMotion.mo',\n", + " model='Modelica'\n", + ")\n", + "\n", + "print(\"Detected input variables (from Modelica template):\")\n", + "import json\n", + "print(json.dumps(variables, indent=2))" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "✓ Using fz-modelica calculator (installed in .fz/calculators/)\n", + "\n", + "The fz-modelica calculator will:\n", + 
" 1. Check for OpenModelica installation\n", + " 2. Run omc to simulate the Modelica model\n", + " 3. Extract trajectory data from CSV file\n", + " 4. Return trajectory arrays directly in the DataFrame\n", + "\n", + "✓ Defined compute_projectile_outputs() function\n", + " This function now uses trajectory data directly from fzr results\n", + " No CSV file parsing needed - more efficient!\n" + ] + } + ], + "source": [ + "# The fz-modelica plugin provides the calculator automatically\n", + "print(f\"✓ Using fz-modelica calculator (installed in .fz/calculators/)\")\n", + "print(\"\\nThe fz-modelica calculator will:\")\n", + "print(\" 1. Check for OpenModelica installation\")\n", + "print(\" 2. Run omc to simulate the Modelica model\")\n", + "print(\" 3. Extract trajectory data from CSV file\")\n", + "print(\" 4. Return trajectory arrays directly in the DataFrame\")\n", + "\n", + "# Define function to compute physics outputs from fzr results\n", + "def compute_projectile_outputs(row):\n", + " \"\"\"\n", + " Compute physics outputs from Modelica trajectory data in fzr results.\n", + " \n", + " The fz-modelica calculator returns trajectory data as arrays in columns like:\n", + " - res_ProjectileMotion_x: horizontal position [m]\n", + " - res_ProjectileMotion_y: vertical position [m]\n", + " - res_ProjectileMotion_vx: horizontal velocity [m/s]\n", + " - res_ProjectileMotion_vy: vertical velocity [m/s]\n", + " - res_ProjectileMotion_time: simulation time [s]\n", + " \n", + " Args:\n", + " row: DataFrame row with trajectory columns from fzr results\n", + " \n", + " Returns:\n", + " dict: Computed physics outputs\n", + " \"\"\"\n", + " import numpy as np\n", + " \n", + " # Check if trajectory data is available\n", + " if 'res_ProjectileMotion_y' not in row or row['res_ProjectileMotion_y'] is None:\n", + " return {\n", + " 'max_height': None,\n", + " 'range': None,\n", + " 'flight_time': None,\n", + " 'final_velocity': None,\n", + " 'impact_angle': None,\n", + " 
'energy_loss_percent': None,\n", + " 'neg_range': None,\n", + " 'target_error': None\n", + " }\n", + " \n", + " # Extract trajectory arrays from row\n", + " x = np.array(row['res_ProjectileMotion_x'])\n", + " y = np.array(row['res_ProjectileMotion_y'])\n", + " vx = np.array(row['res_ProjectileMotion_vx'])\n", + " vy = np.array(row['res_ProjectileMotion_vy'])\n", + " time = np.array(row['res_ProjectileMotion_time'])\n", + " \n", + " # Compute outputs\n", + " outputs = {}\n", + " \n", + " # Maximum height\n", + " outputs['max_height'] = y.max()\n", + " \n", + " # Find landing point: the second sample with y <= 0 (the first is the launch point at y = 0)\n", + " below_ground_indices = np.where(y <= 0)[0]\n", + " if len(below_ground_indices) > 1:\n", + " landing_idx = below_ground_indices[1]\n", + " else:\n", + " landing_idx = len(y) - 1\n", + " \n", + " # Range (horizontal distance at landing)\n", + " outputs['range'] = x[landing_idx]\n", + " \n", + " # Flight time\n", + " outputs['flight_time'] = time[landing_idx]\n", + " \n", + " # Final velocity magnitude\n", + " vx_final = vx[landing_idx]\n", + " vy_final = vy[landing_idx]\n", + " outputs['final_velocity'] = np.sqrt(vx_final**2 + vy_final**2)\n", + " \n", + " # Impact angle (degrees)\n", + " outputs['impact_angle'] = abs(np.degrees(np.arctan2(vy_final, vx_final)))\n", + " \n", + " # Kinetic energy loss percentage (launch vs. landing)\n", + " v0_squared = vx[0]**2 + vy[0]**2\n", + " vf_squared = vx_final**2 + vy_final**2\n", + " outputs['energy_loss_percent'] = 100 * (1 - vf_squared / v0_squared)\n", + " \n", + " # Negative range (for optimization: minimizing -range maximizes range)\n", + " outputs['neg_range'] = -outputs['range']\n", + " \n", + " # Target error (for root finding: distance from the 150 m target)\n", + " outputs['target_error'] = abs(outputs['range'] - 150.0)\n", + " \n", + " return outputs\n", + "\n", + "print(\"\\n✓ Defined compute_projectile_outputs() function\")\n", + "print(\" This function now uses trajectory data directly from fzr results\")\n", + "print(\" No
CSV file parsing needed - more efficient!\")" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Running Modelica simulation (may take 10-30 seconds for first run)...\n", + "\n", + "[■]\n", + "Computing physics outputs from trajectory data...\n", + "\n", + "Modelica Simulation Results:\n", + "============================================================\n", + "max_height : 34.7367\n", + "range : 97.4732\n", + "flight_time : 5.2800\n", + "final_velocity : 23.7393\n", + "impact_angle : 64.6746\n", + "energy_loss_percent : 77.4579\n", + "neg_range : -97.4732\n", + "target_error : 52.5268\n", + "\n", + "✓ OpenModelica simulation completed!\n" + ] + } + ], + "source": [ + "# Define a single set of parameters\n", + "params_single = {\n", + " 'v0': 50.0,\n", + " 'angle': 45.0,\n", + " 'k': '0.01',\n", + " 'm': '1.0'\n", + "}\n", + "\n", + "# Run the calculation using OpenModelica via fz-modelica plugin\n", + "print(\"Running Modelica simulation (may take 10-30 seconds for first run)...\\n\")\n", + "\n", + "result_single = fz.fzr(\n", + " input_path='ProjectileMotion.mo',\n", + " input_variables=params_single,\n", + " model='Modelica',\n", + " calculators='localhost'\n", + ")\n", + "\n", + "# Convert to DataFrame if needed\n", + "if not hasattr(result_single, 'iloc'):\n", + " result_single = pd.DataFrame(result_single)\n", + "\n", + "# Enrich with computed physics outputs\n", + "print(\"Computing physics outputs from trajectory data...\")\n", + "physics_outputs = result_single.apply(compute_projectile_outputs, axis=1, result_type='expand')\n", + "result_single = pd.concat([result_single, physics_outputs], axis=1)\n", + "\n", + "print(\"\\nModelica Simulation Results:\")\n", + "print(\"=\" * 60)\n", + "\n", + "# Display computed outputs\n", + "output_cols = ['max_height', 'range', 'flight_time', 'final_velocity', \n", + " 'impact_angle', 'energy_loss_percent', 
'neg_range', 'target_error']\n", + "for col in output_cols:\n", + " if col in result_single.columns and result_single[col].iloc[0] is not None:\n", + " value = float(result_single[col].iloc[0])\n", + " print(f\"{col:20s}: {value:10.4f}\")\n", + "\n", + "print(\"\\n✓ OpenModelica simulation completed!\")" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Running 42 Modelica simulations (this will take a few minutes)...\n", + "\n", + "[■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■] Total time: 19s\n", + "\n", + "Completed 42 Modelica simulations\n", + "\n", + "Computing physics outputs from trajectory data...\n", + " v0 angle max_height range flight_time\n", + "10.0 20.0 0.583086 6.278069 0.69\n", + "10.0 30.0 1.234990 8.366635 1.01\n", + "10.0 40.0 2.026531 9.376247 1.29\n", + "10.0 50.0 2.863559 9.292611 1.53\n", + "10.0 60.0 3.648112 8.167429 1.73\n", + "10.0 70.0 4.288291 6.084388 1.88\n", + "10.0 80.0 4.707434 3.233619 1.96\n", + "20.0 20.0 2.193538 22.406571 1.34\n", + "20.0 30.0 4.542080 28.619585 1.93\n", + "20.0 40.0 7.326491 31.333505 2.45\n", + "20.0 50.0 10.228201 30.614049 2.89\n", + "20.0 60.0 12.931085 26.712644 3.25\n", + "20.0 70.0 15.137825 19.908221 3.52\n", + "20.0 80.0 16.590243 10.685712 3.68\n", + "30.0 20.0 4.516344 43.056100 1.92\n", + "30.0 30.0 9.096362 52.695867 2.72\n", + "30.0 40.0 14.383297 56.220834 3.42\n", + "30.0 50.0 19.804458 54.123627 4.01\n", + "30.0 60.0 24.814932 46.965268 4.50\n", + "30.0 70.0 28.900927 34.978612 4.86\n", + "30.0 80.0 31.599223 18.873822 5.08\n", + "40.0 20.0 7.233539 64.229306 2.42\n", + "40.0 30.0 14.181963 75.904224 3.38\n", + "40.0 40.0 22.013506 79.237299 4.21\n", + "40.0 50.0 29.928350 75.420053 4.92\n", + "40.0 60.0 37.187510 64.978892 5.50\n", + "40.0 70.0 43.090598 48.317912 5.93\n", + "40.0 80.0 46.991547 26.144837 6.20\n", + "50.0 20.0 10.112590 84.168698 2.85\n", + "50.0 30.0 19.358857 
96.720975 3.93\n", + "50.0 40.0 29.576259 99.423345 4.87\n", + "50.0 50.0 39.778770 93.592481 5.66\n", + "50.0 60.0 49.070164 80.144923 6.31\n", + "50.0 70.0 56.599548 59.480279 6.80\n", + "50.0 80.0 61.570301 32.210134 7.11\n", + "60.0 20.0 13.009556 102.300404 3.22\n", + "60.0 30.0 24.401422 115.116131 4.40\n", + "60.0 40.0 36.789712 116.667541 5.41\n", + "60.0 50.0 49.037834 108.952003 6.27\n", + "60.0 60.0 60.123218 92.837216 6.98\n", + "60.0 70.0 69.074267 68.700905 7.51\n", + "60.0 80.0 74.973569 37.187433 7.85\n" + ] + } + ], + "source": [ + "# Define multiple parameter combinations\n", + "params_multi = {\n", + " 'v0': [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],\n", + " 'angle': [20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0],\n", + " 'k': '0.01', # Fixed\n", + " 'm': '1.0' # Fixed\n", + "}\n", + "\n", + "# Run all combinations (6 × 7 = 42 cases)\n", + "print(\"Running 42 Modelica simulations (this will take a few minutes)...\\n\")\n", + "\n", + "results_multi = fz.fzr(\n", + " input_path='ProjectileMotion.mo',\n", + " input_variables=params_multi,\n", + " model='Modelica',\n", + " calculators=['localhost']*6 # Use 6 parallel calculators\n", + ")\n", + "\n", + "print(f\"\\nCompleted {len(results_multi)} Modelica simulations\\n\")\n", + "\n", + "# Convert to DataFrame for analysis (if not already)\n", + "if hasattr(results_multi, 'iloc'):\n", + " df_multi = results_multi\n", + "else:\n", + " df_multi = pd.DataFrame(results_multi)\n", + "\n", + "# Enrich with computed physics outputs\n", + "print(\"Computing physics outputs from trajectory data...\")\n", + "physics_outputs = df_multi.apply(compute_projectile_outputs, axis=1, result_type='expand')\n", + "df_multi = pd.concat([df_multi, physics_outputs], axis=1)\n", + "\n", + "#df_multi = df_multi.astype(float)\n", + "print(df_multi[['v0', 'angle', 'max_height', 'range', 'flight_time']].to_string(index=False))" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "data": { +
"image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "✓ Basic Modelica calculations completed!\n" + ] + } + ], + "source": [ + "# Create visualization of Modelica simulation results\n", + "fig, axes = plt.subplots(2, 2, figsize=(12, 10))\n", + "\n", + "# Plot 1: Range vs Angle for different velocities\n", + "for v0 in df_multi['v0'].unique():\n", + " subset = df_multi[df_multi['v0'] == v0]\n", + " axes[0, 0].plot(subset['angle'], subset['range'], marker='o', label=f'v0={v0} m/s')\n", + "axes[0, 0].set_xlabel('Launch Angle (degrees)')\n", + "axes[0, 0].set_ylabel('Range (m)')\n", + "axes[0, 0].set_title('Range vs Launch Angle (Modelica)')\n", + "axes[0, 0].legend()\n", + "axes[0, 0].grid(True, alpha=0.3)\n", + "\n", + "# Plot 2: Max Height vs Angle for different velocities\n", + "for v0 in df_multi['v0'].unique():\n", + " subset = df_multi[df_multi['v0'] == v0]\n", + " axes[0, 1].plot(subset['angle'], subset['max_height'], marker='o', label=f'v0={v0} m/s')\n", + "axes[0, 1].set_xlabel('Launch Angle (degrees)')\n", + "axes[0, 1].set_ylabel('Max Height (m)')\n", + "axes[0, 1].set_title('Maximum Height vs Launch Angle (Modelica)')\n", + "axes[0, 1].legend()\n", + "axes[0, 1].grid(True, alpha=0.3)\n", + "\n", + "# Plot 3: Flight Time vs Angle\n", + "for v0 in df_multi['v0'].unique():\n", + " subset = df_multi[df_multi['v0'] == v0]\n", + " axes[1, 0].plot(subset['angle'], subset['flight_time'], marker='o', label=f'v0={v0} m/s')\n", + "axes[1, 0].set_xlabel('Launch Angle (degrees)')\n", + "axes[1, 0].set_ylabel('Flight Time (s)')\n", + "axes[1, 0].set_title('Flight Time vs Launch Angle (Modelica)')\n", + "axes[1, 0].legend()\n", + "axes[1, 0].grid(True, alpha=0.3)\n", + "\n", + "# Plot 4: Energy Loss vs Angle\n", + "for v0 in df_multi['v0'].unique():\n", + " subset = df_multi[df_multi['v0'] == v0]\n", + " axes[1, 1].plot(subset['angle'], subset['energy_loss_percent'], marker='o', 
label=f'v0={v0} m/s')\n", + "axes[1, 1].set_xlabel('Launch Angle (degrees)')\n", + "axes[1, 1].set_ylabel('Energy Loss (%)')\n", + "axes[1, 1].set_title('Energy Loss to Air Resistance (Modelica)')\n", + "axes[1, 1].legend()\n", + "axes[1, 1].grid(True, alpha=0.3)\n", + "\n", + "plt.tight_layout()\n", + "plt.show()\n", + "\n", + "print(\"\\n✓ Basic Modelica calculations completed!\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "\n", + "## 3. Design of Experiments with Grid Sampling\n", + "\n", + "Now let's use `fzr` to perform a systematic design of experiments with the Modelica model.\n", + "We'll use a grid sampling approach to explore the parameter space systematically." + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Running DoE with 25 Modelica simulations...\n", + "(This will take several minutes)\n", + "\n", + "[■■■■■■■■■■■■■■■■■■■■■■■■■] Total time: 13s\n", + "============================================================\n", + "0 [0.0, 0.0939223834016754, 0.1877513346985295, ...\n", + "1 [0.0, 0.0818743790221217, 0.1450124904470977, ...\n", + "2 [0.0, 0.0642467513038971, 0.0846238793700119, ...\n", + "3 [0.0, 0.0422407930064658, 0.0469127845394185, ...\n", + "4 [0.0, 0.017356178387599, 0.017722842510228, 0....\n", + "5 [0.0, 0.2111935713780305, 0.2770660730568386, ...\n", + "6 [0.0, 0.1432383175843804, 0.1432383175843804, ...\n", + "7 [0.0, 0.084049073857963, 0.084049073857963, 0....\n", + "8 [0.0, 0.0466859356199648, 0.0466859356199648, ...\n", + "9 [0.0, 0.0176503590020455, 0.0176503590020455, ...\n", + "10 [0.0, 0.2756963701104411, 0.2756963701104411, ...\n", + "11 [0.0, 0.1429892171132139, 0.1429892171132139, ...\n", + "12 [0.0, 0.0839673410088704, 0.0839673410088704, ...\n", + "13 [0.0, 0.0466535339650135, 0.0466535339650135, ...\n", + "14 [0.0, 0.0176399881556713, 0.0176399881556713, ...\n", + "15 [0.0, 
0.2752611604961372, 0.2752611604961372, ...\n", + "16 [0.0, 0.1429093920115702, 0.1429093920115702, ...\n", + "17 [0.0, 0.0839410962198039, 0.0839410962198039, ...\n", + "18 [0.0, 0.0466431219747637, 0.0466431219747637, ...\n", + "19 [0.0, 0.0176366546351121, 0.0176366546351121, ...\n", + "20 [0.0, 0.2750690706603109, 0.2750690706603109, ...\n", + "21 [0.0, 0.1428740552125181, 0.1428740552125181, ...\n", + "22 [0.0, 0.0839294699845172, 0.0839294699845172, ...\n", + "23 [0.0, 0.046638508353479, 0.046638508353479, 0....\n", + "24 [0.0, 0.0176351773842918, 0.0176351773842918, ...\n", + "Name: res_ProjectileMotion_x, dtype: object\n", + "============================================================\n", + "Computing physics outputs from trajectory data...\n", + "\n", + "Design of Experiments: 25 samples\n", + "Parameters explored: v0=[10.0, 60.0], angle=[20.0, 80.0]\n", + "\n", + "Output Statistics (from Modelica):\n", + " max_height range flight_time energy_loss_percent\n", + "count 25.000000 25.000000 25.000000 25.000000\n", + "mean 24.092862 49.738782 3.937600 53.681495\n", + "std 21.631808 35.489193 2.059426 26.141834\n", + "min 0.583086 3.233619 0.690000 9.397930\n", + "25% 5.842446 22.685058 2.180000 37.361769\n", + "50% 17.193133 37.187433 3.750000 62.677408\n", + "75% 37.364746 79.372881 5.490000 76.409765\n", + "max 74.973569 117.079774 7.850000 84.280073\n" + ] + } + ], + "source": [ + "# Define parameter combinations for design of experiments\n", + "# Using a grid approach to explore the parameter space\n", + "import numpy as np\n", + "\n", + "v0_values = np.linspace(10.0, 60.0, 5) # 5 velocities\n", + "angle_values = np.linspace(20.0, 80.0, 5) # 5 angles\n", + "\n", + "doe_params = {\n", + " 'v0': [v for v in v0_values],\n", + " 'angle': [a for a in angle_values],\n", + " 'k': '0.01', # Fixed air resistance\n", + " 'm': '1.0' # Fixed mass\n", + "}\n", + "\n", + "# Run design of experiments (5x5 = 25 simulations)\n", + "print(f\"Running DoE with 
{len(v0_values) * len(angle_values)} Modelica simulations...\")\n", + "print(\"(This will take several minutes)\\n\")\n", + "\n", + "results_doe = fz.fzr(\n", + " input_path='ProjectileMotion.mo',\n", + " input_variables=doe_params,\n", + " model='Modelica',\n", + " calculators=['localhost']*5 # Use 5 parallel calculators\n", + ")\n", + "\n", + "print(\"=\"*60)\n", + "print(results_doe['res_ProjectileMotion_x'])\n", + "print(\"=\"*60)\n", + "\n", + "# Convert to DataFrame\n", + "if hasattr(results_doe, 'iloc'):\n", + " df_doe = results_doe\n", + "else:\n", + " df_doe = pd.DataFrame(results_doe)\n", + "\n", + "# Enrich with computed physics outputs\n", + "print(\"Computing physics outputs from trajectory data...\")\n", + "physics_outputs = df_doe.apply(compute_projectile_outputs, axis=1, result_type='expand').astype(float)\n", + "df_doe = pd.concat([df_doe, physics_outputs], axis=1)\n", + "\n", + "print(f\"\\nDesign of Experiments: {len(df_doe)} samples\")\n", + "print(f\"Parameters explored: v0=[{v0_values[0]:.1f}, {v0_values[-1]:.1f}], angle=[{angle_values[0]:.1f}, {angle_values[-1]:.1f}]\")\n", + "\n", + "# Show statistics\n", + "print(\"\\nOutput Statistics (from Modelica):\")\n", + "print(df_doe[['max_height', 'range', 'flight_time', 'energy_loss_percent']].describe())" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "Interesting Results from Modelica:\n", + "============================================================\n", + "Maximum range: 117.08 m\n", + " at v0=60.00 m/s, angle=35.00°\n", + "\n", + "Maximum height: 74.97 m\n", + " at v0=60.00 m/s, angle=80.00°\n", + "\n", + "✓ Design of experiments with Modelica completed!\n" + ] + } + ], + "source": [ + "# Create 3D scatter plot of the Modelica DoE results\n", + "fig = plt.figure(figsize=(14, 6))\n", + "\n", + "# 3D plot: v0, angle, range\n", + "ax1 = fig.add_subplot(121, projection='3d')\n", + "scatter1 = ax1.scatter(df_doe['v0'], df_doe['angle'], df_doe['range'], \n", + " c=df_doe['range'], cmap='viridis', s=50, alpha=0.6)\n", + "ax1.set_xlabel('Initial Velocity (m/s)')\n", + "ax1.set_ylabel('Launch Angle (deg)')\n", + "ax1.set_zlabel('Range (m)')\n", + "ax1.set_title('Parameter Space Exploration: Range (Modelica)')\n", + "plt.colorbar(scatter1, ax=ax1, label='Range (m)')\n", + "\n", + "# 3D plot: v0, angle, max_height\n", + "ax2 = fig.add_subplot(122, projection='3d')\n", + "scatter2 = ax2.scatter(df_doe['v0'], df_doe['angle'], df_doe['max_height'], \n", + " c=df_doe['max_height'], cmap='plasma', s=50, alpha=0.6)\n", + "ax2.set_xlabel('Initial Velocity (m/s)')\n", + "ax2.set_ylabel('Launch Angle (deg)')\n", + "ax2.set_zlabel('Max Height (m)')\n", + "ax2.set_title('Parameter Space Exploration: Max Height (Modelica)')\n", + "plt.colorbar(scatter2, ax=ax2, label='Max Height (m)')\n", + "\n", + "plt.tight_layout()\n", + "plt.show()\n", + "\n", + "# Find interesting points\n", + "print(\"\\nInteresting Results from Modelica:\")\n", + "print(\"=\" * 60)\n", + "max_range_idx = df_doe['range'].idxmax()\n", + "print(f\"Maximum range: {df_doe.loc[max_range_idx, 'range']:.2f} m\")\n", + "print(f\" at v0={df_doe.loc[max_range_idx, 'v0']:.2f} m/s, angle={df_doe.loc[max_range_idx, 
'angle']:.2f}°\\n\")\n", + "\n", + "max_height_idx = df_doe['max_height'].idxmax()\n", + "print(f\"Maximum height: {df_doe.loc[max_height_idx, 'max_height']:.2f} m\")\n", + "print(f\" at v0={df_doe.loc[max_height_idx, 'v0']:.2f} m/s, angle={df_doe.loc[max_height_idx, 'angle']:.2f}°\")\n", + "\n", + "print(\"\\n✓ Design of experiments with Modelica completed!\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "\n", + "## 4. Optimization with Gradient Descent\n", + "\n", + "Now let's use the **fz-gradientdescent** algorithm to find the optimal parameters that maximize projectile range.\n", + "\n", + "**Goal**: Maximize the range by finding optimal velocity and launch angle using gradient descent optimization.\n", + "\n", + "First, we'll install the algorithm plugin from GitHub, then run the optimization." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Step 4.1: Install Gradient Descent Algorithm\n", + "\n", + "First, we'll install the fz-gradientdescent algorithm plugin from GitHub." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Installing fz-gradientdescent algorithm from GitHub...\n", + "\n", + "✓ Installed algorithm: gradientdescent\n", + "✓ Algorithm path: /home/richet/Sync/Open/Funz/github/fz/examples/tmp/tmp/.fz/algorithms/gradientdescent.R\n", + "\n", + "✓ Gradient descent algorithm ready!\n" + ] + } + ], + "source": [ + "# Install the gradient descent algorithm plugin\n", + "print(\"Installing fz-gradientdescent algorithm from GitHub...\\n\")\n", + "\n", + "try:\n", + " result = fz.install_algorithm('gradientdescent', global_install=False)\n", + " print(f\"✓ Installed algorithm: {result['algorithm_name']}\")\n", + " print(f\"✓ Algorithm path: {result['install_path']}\")\n", + "except Exception as e:\n", + " print(f\"⚠ Error installing algorithm: {e}\")\n", + " print(\" Will try to use algorithm name directly\")\n", + "\n", + "print(\"\\n✓ Gradient descent algorithm ready!\")" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Now we'll use the gradient descent algorithm to find the optimal launch parameters.\n", + "\n", + "Running Gradient Descent optimization with Modelica...\n", + "Objective: Minimize -range (maximize range)\n", + "(This will take several minutes due to Modelica simulations)\n", + "\n", + "[■■■] Total time: 2s\n", + "[■■■] Total time: 2s\n", + "[■■■] Total time: 1s\n", + "[■■■] Total time: 2s\n", + "[■■■] Total time: 2s\n", + "\n", + "============================================================\n", + "OPTIMIZATION RESULTS\n", + "============================================================\n", + "\n", + "Algorithm: gradientdescent\n", + "Iterations: 5\n", + "Total Evaluations: 15\n", + "\n", + "Summary:\n", + "gradientdescent completed: 5 iterations, 15 evaluations (15 valid)\n", + "\n", + 
"============================================================\n", + "OPTIMAL SOLUTION (from Modelica + Gradient Descent)\n", + "============================================================\n", + "Initial Velocity: 59.93 m/s\n", + "Launch Angle: 22.36 degrees\n", + "Maximum Range: 171.89 m\n", + "\n", + "✓ Gradient descent optimization with Modelica completed!\n" + ] + } + ], + "source": [ + "### Step 4.2: Run Optimization with Gradient Descent\n", + "\n", + "print(\"Now we'll use the gradient descent algorithm to find the optimal launch parameters.\\n\")\n", + "\n", + "# Define optimization problem\n", + "opt_params = {\n", + " 'v0': '[10.0; 60.0]', # Search in this range\n", + " 'angle': '[20.0; 80.0]', # Search in this range\n", + " 'k': '0.01', # Fixed\n", + " 'm': '1.0' # Fixed\n", + "}\n", + "\n", + "# Define output we want to minimize (negative range = maximize range)\n", + "print(\"Running Gradient Descent optimization with Modelica...\")\n", + "print(f\"Objective: Minimize -range (maximize range)\")\n", + "print(\"(This will take several minutes due to Modelica simulations)\\n\")\n", + "\n", + "fz.set_log_level('WARNING')\n", + "\n", + "# Run optimization using fzd with gradient descent\n", + "opt_result = fz.fzd(\n", + " input_path='ProjectileMotion.mo',\n", + " input_variables=opt_params,\n", + " model='Modelica',\n", + " calculators=['localhost']*5, # Use 5 parallel calculators\n", + " algorithm='gradientdescent', # Use the installed algorithm\n", + " output_expression=\"-res_ProjectileMotion_x[-1]\", # Negative range at landing\n", + " algorithm_options={\n", + " 'max_iterations': 20, # Limit iterations for faster demo\n", + " 'tolerance': 0.1, # Convergence tolerance\n", + " 'step_size': 1.0 # Initial step size\n", + " }\n", + ")\n", + "\n", + "print(\"\\n\" + \"=\" * 60)\n", + "print(\"OPTIMIZATION RESULTS\")\n", + "print(\"=\" * 60)\n", + "print(f\"\\nAlgorithm: {opt_result['algorithm']}\")\n", + "print(f\"Iterations: 
{opt_result['iterations']}\")\n", + "print(f\"Total Evaluations: {opt_result['total_evaluations']}\")\n", + "print(f\"\\nSummary:\\n{opt_result['summary']}\")\n", + "\n", + "# Get the best solution from results\n", + "df_opt = opt_result['XY']\n", + "best_idx = df_opt['-res_ProjectileMotion_x[-1]'].idxmin()\n", + "\n", + "print(\"\\n\" + \"=\" * 60)\n", + "print(\"OPTIMAL SOLUTION (from Modelica + Gradient Descent)\")\n", + "print(\"=\" * 60)\n", + "print(f\"Initial Velocity: {df_opt.loc[best_idx, 'v0']:.2f} m/s\")\n", + "print(f\"Launch Angle: {df_opt.loc[best_idx, 'angle']:.2f} degrees\")\n", + "print(f\"Maximum Range: {-df_opt.loc[best_idx, '-res_ProjectileMotion_x[-1]']:.2f} m\")\n", + "\n", + "print(\"\\n✓ Gradient descent optimization with Modelica completed!\")" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [ + { + "data": { + "image/png": "", + "text/plain": [ + "
" + ] + }, + "metadata": {}, + "output_type": "display_data" + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n", + "✓ Gradient descent optimization visualized!\n" + ] + } + ], + "source": [ + "### Step 4.3: Visualize Optimization Progress\n", + "\n", + "# Get optimization data\n", + "df_opt = opt_result['XY']\n", + "\n", + "# Display the HTML analysis if available\n", + "if 'analysis' in opt_result and 'html' in opt_result['analysis']:\n", + " display(HTML(opt_result['analysis']['html']))\n", + "\n", + "# Create additional plots\n", + "fig, axes = plt.subplots(2, 2, figsize=(12, 10))\n", + "\n", + "# Plot 1: Convergence - Range over iterations\n", + "axes[0, 0].plot(range(len(df_opt)), -df_opt['-res_ProjectileMotion_x[-1]'], 'b-o', linewidth=2, markersize=6)\n", + "axes[0, 0].set_xlabel('Iteration')\n", + "axes[0, 0].set_ylabel('Range (m)')\n", + "axes[0, 0].set_title('Optimization Progress: Range (Modelica + Gradient Descent)')\n", + "axes[0, 0].grid(True, alpha=0.3)\n", + "\n", + "# Plot 2: Parameter evolution - v0\n", + "axes[0, 1].plot(range(len(df_opt)), df_opt['v0'], 'r-o', linewidth=2, markersize=6)\n", + "axes[0, 1].set_xlabel('Iteration')\n", + "axes[0, 1].set_ylabel('Initial Velocity (m/s)')\n", + "axes[0, 1].set_title('Parameter Evolution: v0 (Gradient Descent)')\n", + "axes[0, 1].grid(True, alpha=0.3)\n", + "\n", + "# Plot 3: Parameter evolution - angle\n", + "axes[1, 0].plot(range(len(df_opt)), df_opt['angle'], 'g-o', linewidth=2, markersize=6)\n", + "axes[1, 0].set_xlabel('Iteration')\n", + "axes[1, 0].set_ylabel('Launch Angle (degrees)')\n", + "axes[1, 0].set_title('Parameter Evolution: Angle (Gradient Descent)')\n", + "axes[1, 0].grid(True, alpha=0.3)\n", + "\n", + "# Plot 4: Trajectory in parameter space\n", + "axes[1, 1].plot(df_opt['v0'], df_opt['angle'], 'purple', linewidth=2, alpha=0.6)\n", + "axes[1, 1].scatter(df_opt['v0'], df_opt['angle'], c=range(len(df_opt)), \n", + " cmap='viridis', s=100, zorder=5, 
edgecolors='black', linewidth=1)\n", + "axes[1, 1].scatter(df_opt['v0'].iloc[0], df_opt['angle'].iloc[0], \n", + " c='green', s=200, marker='s', zorder=10, edgecolors='black', \n", + " linewidth=2, label='Start')\n", + "axes[1, 1].scatter(df_opt['v0'].iloc[-1], df_opt['angle'].iloc[-1], \n", + " c='red', s=200, marker='*', zorder=10, edgecolors='black', \n", + " linewidth=2, label='End')\n", + "axes[1, 1].set_xlabel('Initial Velocity (m/s)')\n", + "axes[1, 1].set_ylabel('Launch Angle (degrees)')\n", + "axes[1, 1].set_title('Optimization Path in Parameter Space (Gradient Descent)')\n", + "axes[1, 1].legend()\n", + "axes[1, 1].grid(True, alpha=0.3)\n", + "\n", + "plt.tight_layout()\n", + "plt.show()\n", + "\n", + "print(\"\\n✓ Gradient descent optimization visualized!\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "\n", + "## 5. Root Finding with Brent's Method\n", + "\n", + "Finally, let's use the **fz-brent** algorithm to find the launch angle that achieves a specific target range.\n", + "\n", + "**Goal**: Find the launch angle that produces a range of exactly 100 m with v0=45 m/s using Brent's method.\n", + "\n", + "Brent's method is a robust and efficient root-finding algorithm that combines bisection, the secant method, and inverse quadratic interpolation." + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Step 5.1: Install Brent's Method Algorithm\n", + "\n", + "First, we'll install the fz-brent algorithm plugin from GitHub." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 19, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Installing fz-brent algorithm from GitHub...\n", + "\n", + "✓ Installed algorithm: brent\n", + "✓ Algorithm path: /home/richet/Sync/Open/Funz/github/fz/examples/tmp/tmp/.fz/algorithms/brent.R\n", + "\n", + "✓ Brent's method algorithm ready!\n" + ] + } + ], + "source": [ + "# Install the Brent algorithm plugin\n", + "print(\"Installing fz-brent algorithm from GitHub...\\n\")\n", + "\n", + "try:\n", + " result = fz.install_algorithm('brent', global_install=False)\n", + " print(f\"✓ Installed algorithm: {result['algorithm_name']}\")\n", + " print(f\"✓ Algorithm path: {result['install_path']}\")\n", + "except Exception as e:\n", + " print(f\"⚠ Error installing algorithm: {e}\")\n", + " print(\" Will try to use algorithm name directly\")\n", + "\n", + "print(\"\\n✓ Brent's method algorithm ready!\")" + ] + }, + { + "cell_type": "code", + "execution_count": 27, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "✓ Root finding problem configured:\n", + " Target range: 100.0m\n", + " Fixed velocity: 45.0 m/s\n", + " Search range: 20-70 degrees\n", + " Algorithm: Brent's method (1D root finding)\n" + ] + } + ], + "source": [ + "### Step 5.2: Configure Root Finding Problem\n", + "\n", + "# Target range for root finding\n", + "target_range = 100.0 # meters\n", + "\n", + "print(f\"✓ Root finding problem configured:\")\n", + "print(f\" Target range: {target_range}m\")\n", + "print(f\" Fixed velocity: 45.0 m/s\")\n", + "print(f\" Search range: 20-70 degrees\")\n", + "print(f\" Algorithm: Brent's method (1D root finding)\")" + ] + }, + { + "cell_type": "code", + "execution_count": 29, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Running root finding with Brent's method and Modelica...\n", + "Objective: Find angle where 
range = 100.0m\n", + "(This will take several minutes due to Modelica simulations)\n", + "\n", + "[◢◢◢] ETA: ... ⚠️ [Thread 124739337053888] angle=80.0,v0=45.0,k=0.01,m=1.0: Could not clean up temp directory /home/richet/Sync/Open/Funz/github/fz/examples/tmp/tmp/.fz/tmp/fz_temp_fa32d20bcd1e_1761942980/angle=80.0,v0=45.0,k=0.01,m=1.0: [Errno 2] Aucun fichier ou dossier de ce nom: 'ProjectileMotion_17inl.c'\n", + "[■■■] Total time: 2s\n", + "[ ] ETA: ...Cache match found in subdirectory: /home/richet/Sync/Open/Funz/github/fz/examples/tmp/tmp/analysis/iter001/angle=80.0,v0=45.0,k=0.01,m=1.0\n", + "Cache match found in subdirectory: /home/richet/Sync/Open/Funz/github/fz/examples/tmp/tmp/analysis/iter001/angle=20.0,v0=45.0,k=0.01,m=1.0\n", + "[■■■] Total time: 1s\n", + "[ ] ETA: ...Cache match found in subdirectory: /home/richet/Sync/Open/Funz/github/fz/examples/tmp/tmp/analysis/iter001/angle=80.0,v0=45.0,k=0.01,m=1.0\n", + "Cache match found in subdirectory: /home/richet/Sync/Open/Funz/github/fz/examples/tmp/tmp/analysis/iter002/angle=45.90151917729081,v0=45.0,k=0.01,m=1.0\n", + "[■■■] Total time: 1s\n", + "[ ] ETA: ...Cache match found in subdirectory: /home/richet/Sync/Open/Funz/github/fz/examples/tmp/tmp/analysis/iter003/angle=65.03639906943988,v0=45.0,k=0.01,m=1.0\n", + "Cache match found in subdirectory: /home/richet/Sync/Open/Funz/github/fz/examples/tmp/tmp/analysis/iter002/angle=45.90151917729081,v0=45.0,k=0.01,m=1.0\n", + "[■■■] Total time: 1s\n", + "[ ] ETA: ...Cache match found in subdirectory: /home/richet/Sync/Open/Funz/github/fz/examples/tmp/tmp/analysis/iter003/angle=65.03639906943988,v0=45.0,k=0.01,m=1.0\n", + "Cache match found in subdirectory: /home/richet/Sync/Open/Funz/github/fz/examples/tmp/tmp/analysis/iter004/angle=57.41022609074264,v0=45.0,k=0.01,m=1.0\n", + "[■■■] Total time: 1s\n", + "[ ] ETA: ...Cache match found in subdirectory: 
/home/richet/Sync/Open/Funz/github/fz/examples/tmp/tmp/analysis/iter005/angle=58.460728665139534,v0=45.0,k=0.01,m=1.0\n", + "Cache match found in subdirectory: /home/richet/Sync/Open/Funz/github/fz/examples/tmp/tmp/analysis/iter003/angle=65.03639906943988,v0=45.0,k=0.01,m=1.0\n", + "[■■■] Total time: 1s\n", + "[ ] ETA: ...Cache match found in subdirectory: /home/richet/Sync/Open/Funz/github/fz/examples/tmp/tmp/analysis/iter005/angle=58.460728665139534,v0=45.0,k=0.01,m=1.0\n", + "Cache match found in subdirectory: /home/richet/Sync/Open/Funz/github/fz/examples/tmp/tmp/analysis/iter006/angle=58.5525229919334,v0=45.0,k=0.01,m=1.0\n", + "[■■■] Total time: 1s\n", + "\n", + "============================================================\n", + "ROOT FINDING RESULTS\n", + "============================================================\n", + "\n", + "Algorithm: brent\n", + "Iterations: 7\n", + "Total Evaluations: 21\n", + "\n", + "Summary:\n", + "brent completed: 7 iterations, 21 evaluations (21 valid)\n", + "{'XY': angle v0 k m res_ProjectileMotion_x[-1] - 100.0\n", + "0 20.000000 45.0 0.01 1.0 47.302622\n", + "1 80.000000 45.0 0.01 1.0 -62.272314\n", + "2 80.000000 45.0 0.01 1.0 -62.272314\n", + "3 20.000000 45.0 0.01 1.0 47.302622\n", + "4 45.901519 45.0 0.01 1.0 23.913880\n", + "5 80.000000 45.0 0.01 1.0 -62.272314\n", + "6 45.901519 45.0 0.01 1.0 23.913880\n", + "7 65.036399 45.0 0.01 1.0 -15.846384\n", + "8 80.000000 45.0 0.01 1.0 -62.272314\n", + "9 65.036399 45.0 0.01 1.0 -15.846384\n", + "10 57.410226 45.0 0.01 1.0 2.531554\n", + "11 45.901519 45.0 0.01 1.0 23.913880\n", + "12 57.410226 45.0 0.01 1.0 2.531554\n", + "13 58.460729 45.0 0.01 1.0 0.205862\n", + "14 65.036399 45.0 0.01 1.0 -15.846384\n", + "15 58.460729 45.0 0.01 1.0 0.205862\n", + "16 58.552523 45.0 0.01 1.0 -0.000446\n", + "17 65.036399 45.0 0.01 1.0 -15.846384\n", + "18 58.552523 45.0 0.01 1.0 -0.000446\n", + "19 58.547523 45.0 0.01 1.0 0.010804\n", + "20 58.460729 45.0 0.01 1.0 0.205862, 'analysis': 
{'data': {'root': 58.5475229919334, 'value': 0.010804151258398065, 'iterations': 7.0, 'converged': [10]\n", + "R classes: ('logical',)\n", + "[ 1], 'exit_code': 0.0}, 'html_file': 'analysis_7.html', 'text': 'Brent Root Finding Results:\\n Iterations: 7\\n Root approximation: 58.547523\\n Corresponding value: 0.010804\\n Target value: 0.000000\\n Exit status: algorithm converged\\n'}, 'algorithm': 'brent', 'iterations': 7, 'total_evaluations': 21, 'summary': 'brent completed: 7 iterations, 21 evaluations (21 valid)'}\n", + "\n", + "============================================================\n", + "ROOT FINDING SOLUTION (from Modelica + Brent)\n", + "============================================================\n", + "Target Range: 100.00 m\n", + "Achieved Range: 100.00 m\n", + "\n", + "Required Angle: 58.553 degrees\n", + "\n", + "✓ Root finding with Brent's method and Modelica completed!\n" + ] + } + ], + "source": [ + "### Step 5.3: Run Root Finding with Brent's Method\n", + "\n", + "# Define root-finding problem\n", + "root_params = {\n", + " 'v0': '45.0', # Fixed velocity\n", + " 'angle': '[20.0; 80.0]', # Search for angle in this range (1D problem)\n", + " 'k': '0.01', # Fixed\n", + " 'm': '1.0' # Fixed\n", + "}\n", + "\n", + "# Define the output expression: we want range - target_range = 0\n", + "# Brent's method needs a signed function that crosses zero,\n", + "# so we use (range - target_range) rather than abs(range - target_range)\n", + "output_expression = f'res_ProjectileMotion_x[-1] - {target_range}'\n", + "\n", + "print(\"Running root finding with Brent's method and Modelica...\")\n", + "print(f\"Objective: Find angle where range = {target_range}m\")\n", + "print(\"(This will take several minutes due to Modelica simulations)\\n\")\n", + "\n", + "# Run root finding using fzd with Brent's method\n", + "root_result = fz.fzd(\n", + " input_path='ProjectileMotion.mo',\n", + " input_variables=root_params,\n", + " model='Modelica',\n", + " 
calculators=['localhost']*3, # Use 3 parallel calculators\n", + " algorithm='brent', # Use the installed Brent algorithm\n", + " output_expression=output_expression,\n", + " algorithm_options={\n", + " 'tolerance': 0.01, # Convergence tolerance (m)\n", + " 'max_iterations': 50 # Maximum iterations\n", + " }\n", + ")\n", + "\n", + "print(\"\\n\" + \"=\" * 60)\n", + "print(\"ROOT FINDING RESULTS\")\n", + "print(\"=\" * 60)\n", + "print(f\"\\nAlgorithm: {root_result['algorithm']}\")\n", + "print(f\"Iterations: {root_result['iterations']}\")\n", + "print(f\"Total Evaluations: {root_result['total_evaluations']}\")\n", + "print(f\"\\nSummary:\\n{root_result['summary']}\")\n", + "\n", + "print(root_result)\n", + "\n", + "# Get the solution\n", + "df_root = root_result['XY']\n", + "best_idx = np.abs(df_root[output_expression]).idxmin()\n", + "\n", + "print(\"\\n\" + \"=\" * 60)\n", + "print(\"ROOT FINDING SOLUTION (from Modelica + Brent)\")\n", + "print(\"=\" * 60)\n", + "print(f\"Target Range: {target_range:.2f} m\")\n", + "print(f\"Achieved Range: {target_range + df_root.loc[best_idx, output_expression]:.2f} m\")\n", + "print(f\"\\nRequired Angle: {df_root.loc[best_idx, 'angle']:.3f} degrees\")\n", + "\n", + "print(\"\\n✓ Root finding with Brent's method and Modelica completed!\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "---\n", + "\n", + "## Summary\n", + "\n", + "In this notebook, we've demonstrated the complete FZ framework workflow using **OpenModelica** for rigorous differential equation solving and **adaptive algorithms** for optimization and root finding:\n", + "\n", + "### What We Accomplished\n", + "\n", + "1. **✓ Installation** - Installed FZ, fz-modelica, fz-gradientdescent, and fz-brent from GitHub\n", + "2. **✓ Modelica Model** - Downloaded and enhanced Modelica differential equations with air resistance\n", + "3. **✓ Basic Calculations** - Parametric simulations using OpenModelica ODE solvers\n", + "4. 
**✓ Design of Experiments** - Explored parameter space systematically\n", + "5. **✓ Optimization** - Used **Gradient Descent** algorithm to maximize projectile range\n", + "6. **✓ Root Finding** - Used **Brent's method** to find angle for target range\n", + "\n", + "### Key Features Demonstrated\n", + "\n", + "- **Modelica Integration**: FZ seamlessly works with OpenModelica for rigorous ODE solving\n", + "- **Algorithm Plugins**: Installed optimization and root-finding algorithms from GitHub\n", + " - **fz-gradientdescent**: First-order local optimization (gradient-based)\n", + " - **fz-brent**: 1D root finding (hybrid bisection/secant/quadratic)\n", + "- **Adaptive Design**: Algorithms intelligently sample the parameter space\n", + "- **Same FZ API**: All FZ functions (`fzi`, `fzr`, `fzd`) work identically with any model\n", + "- **Online Resources**: Everything downloaded from GitHub (model, calculator, algorithms)\n", + "\n", + "### Algorithms Used\n", + "\n", + "#### Gradient Descent (Optimization)\n", + "- **Purpose**: Find parameters that minimize/maximize an objective\n", + "- **Method**: Follows the gradient (steepest descent direction)\n", + "- **Best for**: Smooth, differentiable objectives\n", + "- **Efficiency**: Fewer evaluations than grid search (~10-20 vs 100+)\n", + "\n", + "#### Brent's Method (Root Finding)\n", + "- **Purpose**: Find parameter where objective equals target value\n", + "- **Method**: Combines bisection, secant, and inverse quadratic interpolation\n", + "- **Best for**: 1D root-finding problems\n", + "- **Efficiency**: Superlinear convergence, very robust\n", + "\n", + "### Performance Comparison\n", + "\n", + "| Method | Evaluations | Time | Precision |\n", + "|--------|------------|------|-----------|\n", + "| **Grid Search (10×10)** | 100 | ~120s | Low (depends on grid) |\n", + "| **Gradient Descent** | 10-20 | ~30-60s | High (converges to optimum) |\n", + "| **Grid Search (20 angles)** | 20 | ~40s | Low (depends on samples) 
|\n", + "| **Brent's Method** | 5-10 | ~15-30s | Very High (< 0.01 m error) |\n", + "\n", + "**Key Insight**: Adaptive algorithms (Gradient Descent, Brent) are 2-5× more efficient than grid search and achieve better precision!\n", + "\n", + "### When to Use Each Approach\n", + "\n", + "✅ **Use Grid Search when:**\n", + "- Exploring an unknown parameter space\n", + "- You need a complete map of the response surface\n", + "- The objective function is non-smooth or has multiple optima\n", + "\n", + "✅ **Use Gradient Descent when:**\n", + "- Optimizing smooth, differentiable objectives\n", + "- You need to find local optima efficiently\n", + "- Computational budget is limited\n", + "- Working with multiple parameters (2D, 3D, etc.)\n", + "\n", + "✅ **Use Brent's Method when:**\n", + "- Finding roots in 1D problems\n", + "- You need high precision\n", + "- The function is smooth and monotonic (or has a known bracketing interval)\n", + "- Robustness is critical\n", + "\n", + "### Online Resources Used\n", + "\n", + "All components downloaded from GitHub:\n", + "- **FZ Framework**: `pip install git+https://github.com/Funz/fz.git`\n", + "- **Modelica Plugin**: `fz.install_model('modelica')` → https://github.com/Funz/fz-modelica\n", + "- **Gradient Descent**: `fz.install_algorithm('gradientdescent')` → https://github.com/Funz/algorithm-GradientDescent\n", + "- **Brent's Method**: `fz.install_algorithm('brent')` → https://github.com/Funz/algorithm-Brent\n", + "\n", + "### Next Steps\n", + "\n", + "- Try other algorithms: BFGS, Nelder-Mead, Genetic Algorithms\n", + "- Create your own algorithm plugins (Python or R)\n", + "- Use multi-dimensional optimization (gradient descent supports multiple parameters; Brent's method is 1D only)\n", + "- Combine algorithms: Grid search to explore → Gradient descent to refine\n", + "- Apply to your own Modelica models\n", + "\n", + "### Documentation\n", + "\n", + "- **FZ Documentation**: https://github.com/Funz/fz\n", + "- **Modelica Integration Guide**: 
`examples/models/MODELICA_README.md`\n", + "- **OpenModelica**: https://openmodelica.org/\n", + "- **Algorithm Development**: `examples/algorithms/` directory\n", + "\n", + "---\n", + "\n", + "**This notebook demonstrates FZ's power in combining professional ODE solvers (OpenModelica) with adaptive algorithms for efficient parameter studies!**" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "name": "python" + } + }, + "nbformat": 4, + "nbformat_minor": 4 +} diff --git a/examples/fzd_example.md b/examples/fzd_example.md new file mode 100644 index 0000000..904f470 --- /dev/null +++ b/examples/fzd_example.md @@ -0,0 +1,488 @@ +# FZD - Iterative Design of Experiments Examples + +This guide demonstrates how to use `fzd()` with different algorithms for iterative design of experiments and optimization. + +## Overview + +The `fzd()` function enables **adaptive sampling** where algorithms intelligently choose which points to evaluate next based on previous results. This is much more efficient than grid search for optimization and root-finding problems. 
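
To make the efficiency claim concrete, the following minimal sketch compares a fixed grid with a simple adaptive method on the 1D toy function `(x - 0.7)²` used in Example 2 of this guide. It is an illustration only: golden-section search stands in here for an adaptive algorithm and is not what `fzd()` or its algorithm plugins implement, and the helper names (`grid_search`, `golden_section`) are hypothetical.

```python
import math


def f(x):
    """Toy objective: (x - 0.7)^2, minimum at x = 0.7."""
    return (x - 0.7) ** 2


def grid_search(func, lo, hi, n):
    """Evaluate func on n evenly spaced points; return (best_x, evaluations)."""
    xs = [lo + i * (hi - lo) / (n - 1) for i in range(n)]
    best_x = min(xs, key=func)
    return best_x, n


def golden_section(func, lo, hi, tol=1e-3):
    """Adaptive 1D minimization (hypothetical stand-in, not an fz algorithm).

    Each iteration shrinks the bracket [a, b] using the previous results,
    so only one new evaluation is needed per step.
    """
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = func(c), func(d)
    evals = 2
    while (b - a) > tol:
        if fc < fd:
            # Minimum lies in [a, d]: reuse c as the new upper probe
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = func(c)
        else:
            # Minimum lies in [c, b]: reuse d as the new lower probe
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = func(d)
        evals += 1
    return (a + b) / 2, evals


grid_x, grid_evals = grid_search(f, 0.0, 2.0, 20)
gs_x, gs_evals = golden_section(f, 0.0, 2.0)
print(f"grid search:    x = {grid_x:.4f} after {grid_evals} evaluations")
print(f"golden section: x = {gs_x:.4f} after {gs_evals} evaluations")
```

With a comparable evaluation budget, the adaptive search narrows in on the minimum to within its tolerance, while the grid is limited by its spacing. The same reasoning applies when each evaluation is an expensive simulation rather than a cheap function call.
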
+ +## Prerequisites + +**Required:** +- Python 3.7+ +- FZ framework: `pip install git+https://github.com/Funz/fz.git` +- `bc` calculator: + - Debian/Ubuntu: `sudo apt install bc` + - macOS: `brew install bc` + +**Optional algorithms** (installed as needed): +- `randomsampling.py` - Simple random sampling +- `brent.py` - 1D optimization (Brent's method) +- `bfgs.py` - Multi-dimensional optimization (BFGS method) + +## Test Model + +All examples use a simple mathematical model to demonstrate the concepts: + +**Model:** Computes `x² + y²` (distance from origin) +- **Minimum:** at (0, 0) with value 0 +- **Variables:** x and y in range [-2, 2] + +### Model Definition + +```python +model = { + "varprefix": "$", + "delim": "()", + "run": "bash -c 'source input.txt && result=$(echo \"scale=6; $x * $x + $y * $y\" | bc) && echo \"result = $result\" > output.txt'", + "output": { + "result": "grep 'result = ' output.txt | cut -d '=' -f2 | tr -d ' '" + } +} +``` + +--- + +## Example 1: Random Sampling + +Explores the parameter space using random sampling. 
Good for: +- Initial exploration +- Understanding the response surface +- Baseline for comparison with optimization algorithms + +### Code + +```python +import fz + +# Run fzd with random sampling algorithm +result = fz.fzd( + input_path="input/", + input_variables={"x": "[-2;2]", "y": "[-2;2]"}, + model=model, + output_expression="result", + algorithm="examples/algorithms/randomsampling.py", + algorithm_options={"nvalues": 10, "seed": 42} +) + +print(f"Algorithm: {result['algorithm']}") +print(f"Total evaluations: {result['total_evaluations']}") +print(f"Summary: {result['summary']}") + +# Find best result +df = result['XY'] +best_idx = df['result'].idxmin() +print(f"\nBest result found:") +print(f" x = {df.loc[best_idx, 'x']:.6f}") +print(f" y = {df.loc[best_idx, 'y']:.6f}") +print(f" result = {df.loc[best_idx, 'result']:.6f}") +``` + +### Key Parameters + +- **`nvalues`**: Number of random samples to evaluate +- **`seed`**: Random seed for reproducibility + +### Expected Output + +``` +Algorithm: examples/algorithms/randomsampling.py +Total evaluations: 10 + +Best result found: + x = -0.123456 + y = 0.234567 + result = 0.070123 +``` + +--- + +## Example 2: Brent's Method (1D Optimization) + +Uses Brent's method to find the minimum of a 1D function. This is a **root-finding algorithm** adapted for optimization. 
+ +**Problem:** Find minimum of `(x - 0.7)²` +- **Expected minimum:** x = 0.7, output ≈ 0 + +### Code + +```python +import fz + +# Modified model for 1D problem: (x - 0.7)^2 +model_1d = { + "varprefix": "$", + "delim": "()", + "run": "bash -c 'source input.txt && result=$(echo \"scale=6; ($x - 0.7) * ($x - 0.7)\" | bc) && echo \"result = $result\" > output.txt'", + "output": { + "result": "grep 'result = ' output.txt | cut -d '=' -f2 | tr -d ' '" + } +} + +# Run fzd with Brent's method +result = fz.fzd( + input_path="input/", + input_variables={"x": "[0;2]"}, # Only x variable (1D) + model=model_1d, + output_expression="result", + algorithm="examples/algorithms/brent.py", + algorithm_options={"max_iter": 20, "tol": 1e-3} +) + +print(f"Iterations: {result['iterations']}") +print(f"Total evaluations: {result['total_evaluations']}") + +# Get optimal result +df = result['XY'] +best_idx = df['result'].idxmin() +print(f"\nOptimal result:") +print(f" x = {df.loc[best_idx, 'x']:.6f} (expected: 0.7)") +print(f" result = {df.loc[best_idx, 'result']:.6f} (expected: ~0.0)") +``` + +### Key Parameters + +- **`max_iter`**: Maximum number of iterations +- **`tol`**: Convergence tolerance + +### Algorithm Characteristics + +- **Type:** 1D root finding/optimization +- **Method:** Combines bisection, secant, and inverse quadratic interpolation +- **Convergence:** Superlinear +- **Robustness:** Very robust, guaranteed to converge +- **Evaluations:** Typically 5-15 for good precision + +### Expected Output + +``` +Iterations: 12 +Total evaluations: 12 + +Optimal result: + x = 0.700023 (expected: 0.7) + result = 0.000001 (expected: ~0.0) +``` + +--- + +## Example 3: BFGS (Multi-dimensional Optimization) + +Uses BFGS (Broyden-Fletcher-Goldfarb-Shanno) for multi-dimensional optimization. This is a **quasi-Newton method** that approximates the Hessian matrix. 
+ +**Problem:** Find minimum of `x² + y²` +- **Expected minimum:** (0, 0) with output = 0 + +### Code + +```python +import fz + +# Run fzd with BFGS algorithm +result = fz.fzd( + input_path="input/", + input_variables={"x": "[-2;2]", "y": "[-2;2]"}, + model=model, + output_expression="result", + algorithm="examples/algorithms/bfgs.py", + algorithm_options={"max_iter": 20, "tol": 1e-4} +) + +print(f"Algorithm: {result['algorithm']}") +print(f"Iterations: {result['iterations']}") +print(f"Total evaluations: {result['total_evaluations']}") + +# Get optimal result +df = result['XY'] +best_idx = df['result'].idxmin() +print(f"\nOptimal result:") +print(f" x = {df.loc[best_idx, 'x']:.6f} (expected: 0.0)") +print(f" y = {df.loc[best_idx, 'y']:.6f} (expected: 0.0)") +print(f" result = {df.loc[best_idx, 'result']:.6f} (expected: ~0.0)") +``` + +### Key Parameters + +- **`max_iter`**: Maximum number of iterations +- **`tol`**: Convergence tolerance for gradient norm + +### Algorithm Characteristics + +- **Type:** Multi-dimensional gradient-based optimization +- **Method:** Quasi-Newton (approximates second derivatives) +- **Convergence:** Superlinear +- **Best for:** Smooth, differentiable functions +- **Evaluations:** Typically 10-30 for good precision +- **Scales:** Works well for 2-50 dimensions + +### Expected Output + +``` +Iterations: 8 +Total evaluations: 18 + +Optimal result: + x = 0.000012 (expected: 0.0) + y = -0.000008 (expected: 0.0) + result = 0.000000 (expected: ~0.0) +``` + +--- + +## Example 4: Custom Output Expression + +Demonstrates using mathematical expressions to combine multiple model outputs. 
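
Before the full example, here is a small sketch of how an expression string can be evaluated against a dictionary of model outputs. It mirrors the approach of `evaluate_output_expression` in `fz/algorithms.py` from this change set (a restricted `eval` with a whitelist of math names), but the helper name `evaluate_expression` and the reduced whitelist are assumptions made for illustration. How `fzd` invokes the real helper is not shown here.

```python
import math

# Whitelisted names available inside expressions (a reduced, illustrative set)
SAFE_NAMES = {
    "abs": abs, "min": min, "max": max,
    "sqrt": math.sqrt, "exp": math.exp, "log": math.log,
    "pi": math.pi, "e": math.e,
}


def evaluate_expression(expression, outputs):
    """Evaluate an expression string using model outputs as variables."""
    namespace = dict(SAFE_NAMES)
    namespace.update(outputs)  # model outputs become variables
    # Emptying __builtins__ limits accidental name access; it is not a
    # security sandbox and should not be used on untrusted input.
    return float(eval(expression, {"__builtins__": {}}, namespace))


x, y = 1.0, 2.0
outputs = {"r1": x * x, "r2": y * y}  # two separate model outputs
value = evaluate_expression("r1 + r2 * 2", outputs)
print(value)  # 1.0 + 4.0 * 2 = 9.0
```

Because the expression is ordinary Python syntax, operator precedence applies as usual: `r1 + r2 * 2` multiplies before adding.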
+ +**Model outputs:** +- `r1 = x²` +- `r2 = y²` + +**Output expression:** `r1 + r2 * 2` (minimize x² + 2y²) + +### Code + +```python +import fz + +# Model with two separate outputs +model_multi = { + "varprefix": "$", + "delim": "()", + "run": "bash -c 'source input.txt && r1=$(echo \"scale=6; $x * $x\" | bc) && r2=$(echo \"scale=6; $y * $y\" | bc) && echo \"r1 = $r1\" > output.txt && echo \"r2 = $r2\" >> output.txt'", + "output": { + "r1": "grep 'r1 = ' output.txt | cut -d '=' -f2 | tr -d ' '", + "r2": "grep 'r2 = ' output.txt | cut -d '=' -f2 | tr -d ' '" + } +} + +# Run fzd with custom expression combining r1 and r2 +result = fz.fzd( + input_path="input/", + input_variables={"x": "[-2;2]", "y": "[-2;2]"}, + model=model_multi, + output_expression="r1 + r2 * 2", # Custom expression + algorithm="examples/algorithms/randomsampling.py", + algorithm_options={"nvalues": 10, "seed": 42} +) + +print(f"Output expression: r1 + r2 * 2") +print(f"Total evaluations: {result['total_evaluations']}") + +# Get best result +df = result['XY'] +best_idx = df['r1 + r2 * 2'].idxmin() +print(f"\nBest result:") +print(f" x = {df.loc[best_idx, 'x']:.6f}") +print(f" y = {df.loc[best_idx, 'y']:.6f}") +print(f" r1 = {df.loc[best_idx, 'r1']:.6f}") +print(f" r2 = {df.loc[best_idx, 'r2']:.6f}") +print(f" r1 + r2 * 2 = {df.loc[best_idx, 'r1 + r2 * 2']:.6f}") +``` + +### Available Expression Operators + +The output expression supports: +- **Arithmetic:** `+`, `-`, `*`, `/`, `**` (power) +- **Functions:** `abs()`, `min()`, `max()`, `sqrt()`, `exp()`, `log()`, etc. 
+- **Math constants:** `pi`, `e` +- **Model outputs:** Any variable from `model["output"]` + +### Example Expressions + +```python +# Simple sum +output_expression = "output1 + output2" + +# Weighted sum +output_expression = "0.7 * pressure + 0.3 * temperature" + +# Constraint violation penalty +output_expression = "cost + 1000 * max(0, temperature - 100)" + +# Root mean square +output_expression = "sqrt((x1**2 + x2**2 + x3**2) / 3)" + +# Array indexing (for trajectory data) +output_expression = "res_x[-1]" # Last element +``` + +--- + +## Complete Running Example + +Here's a complete, self-contained example you can run: + +```python +#!/usr/bin/env python3 +"""Complete fzd example""" + +import fz +import tempfile +import shutil +from pathlib import Path + +# Create temporary directory +tmpdir = Path(tempfile.mkdtemp()) + +try: + # Create input directory + input_dir = tmpdir / "input" + input_dir.mkdir() + + # Create input file + (input_dir / "input.txt").write_text("x = $x\ny = $y\n") + + # Define model + model = { + "varprefix": "$", + "delim": "()", + "run": "bash -c 'source input.txt && result=$(echo \"scale=6; $x * $x + $y * $y\" | bc) && echo \"result = $result\" > output.txt'", + "output": { + "result": "grep 'result = ' output.txt | cut -d '=' -f2 | tr -d ' '" + } + } + + # Run optimization + result = fz.fzd( + input_path=str(input_dir), + input_variables={"x": "[-2;2]", "y": "[-2;2]"}, + model=model, + output_expression="result", + algorithm="examples/algorithms/randomsampling.py", + algorithm_options={"nvalues": 10, "seed": 42} + ) + + # Display results + print(f"Total evaluations: {result['total_evaluations']}") + df = result['XY'] + best_idx = df['result'].idxmin() + print(f"Best: x={df.loc[best_idx, 'x']:.4f}, y={df.loc[best_idx, 'y']:.4f}, result={df.loc[best_idx, 'result']:.6f}") + +finally: + shutil.rmtree(tmpdir) +``` + +--- + +## Algorithm Comparison + +| Algorithm | Type | Dimensions | Evaluations | Best For | 
+|-----------|------|------------|-------------|----------| +| **Random Sampling** | Exploration | Any | High (10-100+) | Exploration, baselines | +| **Brent** | Optimization | 1D only | Low (5-15) | 1D root finding, precise 1D optim | +| **BFGS** | Optimization | 2-50 | Medium (10-30) | Smooth multi-D optimization | +| **Gradient Descent** | Optimization | Any | Medium (10-50) | Large-scale, simple implementation | +| **Monte Carlo** | Integration | Any | High (100-10000) | Uncertainty quantification | + +--- + +## Tips and Best Practices + +### 1. Choose the Right Algorithm + +- **1D problems:** Use Brent's method +- **2-10D smooth problems:** Use BFGS +- **10-50D smooth problems:** Use gradient descent +- **Non-smooth or exploratory:** Use random sampling +- **Global optimization:** Combine random sampling → local optimization + +### 2. Set Appropriate Tolerances + +```python +# High precision (more evaluations) +algorithm_options={"tol": 1e-6, "max_iter": 100} + +# Fast exploration (fewer evaluations) +algorithm_options={"tol": 1e-2, "max_iter": 20} +``` + +### 3. Use Fixed Variables + +```python +# Optimize angle, keep velocity fixed +input_variables={ + "angle": "[20;80]", # Variable (search range) + "velocity": "45.0" # Fixed value +} +``` + +### 4. Monitor Progress + +All algorithms support `get_analysis_tmp()` for intermediate results: + +```python +# Results saved after each iteration +# Check: analysis_dir/results_1.html, results_2.html, etc. +``` + +### 5. Combine Approaches + +```python +# 1. Explore with random sampling +explore = fz.fzd(..., algorithm="randomsampling", nvalues=20) + +# 2. 
Refine with BFGS starting near best point +# Use best result from exploration as initial guess +refine = fz.fzd(..., algorithm="bfgs", x0=best_from_explore) +``` + +--- + +## Troubleshooting + +### Issue: Algorithm not found + +```python +# ✗ Wrong +algorithm="bfgs" # Looks for installed plugin + +# ✓ Correct +algorithm="examples/algorithms/bfgs.py" # Full path +``` + +### Issue: Output expression fails + +Check available variables and use proper syntax: + +```python +# See what's available +print(result['XY'].columns) + +# Use correct variable names +output_expression="pressure" # Must match model output key +``` + +### Issue: Slow convergence + +Try adjusting algorithm parameters: + +```python +# Increase step size (gradient descent) +algorithm_options={"step_size": 2.0} + +# Relax tolerance +algorithm_options={"tol": 1e-3} +``` + +--- + +## See Also + +- **FZ Documentation:** https://github.com/Funz/fz +- **Algorithm Development:** See `examples/algorithms/` for templates +- **Modelica Integration:** See `examples/fz_modelica_projectile.ipynb` +- **API Reference:** Run `python -c "import fz; help(fz.fzd)"` + +--- + +## Summary + +The `fzd()` function provides a unified interface for iterative design of experiments: + +1. **Define your problem:** Input variables, model, output expression +2. **Choose an algorithm:** Based on problem type and dimensionality +3. **Set options:** Tolerance, iterations, algorithm-specific parameters +4. **Run optimization:** `fz.fzd()` handles the iterative sampling +5. **Analyze results:** DataFrame with all evaluations + algorithm analysis + +**Key advantage:** Adaptive algorithms are 2-10× more efficient than grid search while achieving better precision! 
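
The workflow summarized above can be pictured as a loop around the algorithm interface documented in `fz/algorithms.py` (`get_initial_design`, `get_next_design`, `get_analysis`). The sketch below is a simplified illustration under that assumption: `TinyRandomSampling` and `run_design` are hypothetical names, the model is replaced by a plain Python function, and none of this is the actual `fzd` implementation.

```python
import random


class TinyRandomSampling:
    """Hypothetical one-shot algorithm following the documented interface."""

    def __init__(self, **options):
        self.nvalues = int(options.get("nvalues", 10))
        self.rng = random.Random(options.get("seed"))

    def get_initial_design(self, input_vars, output_vars):
        # input_vars maps each variable name to a (min, max) tuple
        return [
            {name: self.rng.uniform(lo, hi) for name, (lo, hi) in input_vars.items()}
            for _ in range(self.nvalues)
        ]

    def get_next_design(self, previous_input_vars, previous_output_values):
        return []  # an empty list signals that the algorithm is finished

    def get_analysis(self, input_vars, output_values):
        # Skip failed evaluations, which are reported as None
        valid = [(y, x) for x, y in zip(input_vars, output_values) if y is not None]
        best_y, best_x = min(valid, key=lambda pair: pair[0])
        return {
            "text": f"best output {best_y:.6f}",
            "data": {"best_input": best_x, "best_output": best_y},
        }


def run_design(algorithm, input_vars, evaluate):
    """Hypothetical driver: evaluate designs until the algorithm stops."""
    X, Y = [], []
    design = algorithm.get_initial_design(input_vars, ["result"])
    while design:
        for point in design:
            X.append(point)
            Y.append(evaluate(point))
        design = algorithm.get_next_design(X, Y)
    return X, Y, algorithm.get_analysis(X, Y)


X, Y, analysis = run_design(
    TinyRandomSampling(nvalues=10, seed=42),
    {"x": (-2.0, 2.0), "y": (-2.0, 2.0)},
    lambda p: p["x"] ** 2 + p["y"] ** 2,
)
print(len(X), analysis["text"])
```

An adaptive algorithm differs only in `get_next_design`, which would return new points chosen from the results so far instead of an empty list.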
diff --git a/examples/variable_substitution.md b/examples/variable_substitution.md deleted file mode 100644 index c567ea9..0000000 --- a/examples/variable_substitution.md +++ /dev/null @@ -1,359 +0,0 @@ -# Variable Substitution in FZ - -This document explains how variable substitution works in the FZ package, including the new default value syntax. - -## Basic Syntax - -Variables in template files are replaced using a prefix (default `$`) and optional delimiters: - -### Simple Variables - -``` -Name: $name -Version: $version -``` - -### Delimited Variables - -Use delimiters `{}` when the variable name is followed by alphanumeric characters: - -``` -File: ${name}_config.txt -Path: /home/${user}/documents -``` - -## Default Values (v0.9.1+) - -You can specify default values that will be used when a variable is not provided. - -### Syntax - -``` -${variable~default_value} -``` - -- Variable name: `variable` -- Separator: `~` (tilde) -- Default value: `default_value` - -### Examples - -```python -from fz.interpreter import replace_variables_in_content - -content = """ -Application Configuration: - name: ${app_name~MyApplication} - host: ${host~localhost} - port: ${port~8080} - debug: ${debug~false} - max_connections: ${max_conn~100} -""" - -# Only provide some variables -input_variables = { - "app_name": "ProductionApp", - "host": "example.com" -} - -result = replace_variables_in_content(content, input_variables) -``` - -**Output:** -``` -Application Configuration: - name: ProductionApp - host: example.com - port: 8080 - debug: false - max_connections: 100 -``` - -**Warnings printed:** -``` -Warning: Variable 'port' not found in input_variables, using default value: '8080' -Warning: Variable 'debug' not found in input_variables, using default value: 'false' -Warning: Variable 'max_conn' not found in input_variables, using default value: '100' -``` - -## Behavior Rules - -### 1. 
Variable Provided - -When a variable is provided in `input_variables`, its value is used regardless of any default: - -```python -content = "Port: ${port~8080}" -input_variables = {"port": 3000} -# Result: "Port: 3000" -# No warning -``` - -### 2. Variable Not Provided, Has Default - -When a variable is not provided but has a default value, the default is used and a warning is printed: - -```python -content = "Port: ${port~8080}" -input_variables = {} -# Result: "Port: 8080" -# Warning: Variable 'port' not found in input_variables, using default value: '8080' -``` - -### 3. Variable Not Provided, No Default - -When a variable is not provided and has no default, it remains unchanged: - -```python -content = "Port: ${port}" -input_variables = {} -# Result: "Port: ${port}" -# No warning -``` - -## Use Cases - -### Configuration Templates - -Create configuration files with sensible defaults: - -```yaml -# config.yaml.template -server: - host: ${SERVER_HOST~0.0.0.0} - port: ${SERVER_PORT~8080} - workers: ${WORKERS~4} - -database: - url: ${DATABASE_URL~sqlite:///./app.db} - pool_size: ${DB_POOL~5} - -logging: - level: ${LOG_LEVEL~INFO} - file: ${LOG_FILE~/var/log/app.log} -``` - -### Environment-Specific Deployments - -Different defaults for different environments: - -```python -# Development -dev_vars = {"host": "localhost", "debug": "true"} - -# Production -prod_vars = {"host": "production.example.com", "workers": "8"} - -# Both use same template with defaults -template = """ -Host: ${host~0.0.0.0} -Port: ${port~8080} -Debug: ${debug~false} -Workers: ${workers~4} -""" -``` - -### Parametric Studies with Optional Parameters - -```python -from fz import fzi - -# Some variables have defaults in the template -results = fzi( - input_path="simulation.template", - input_variables={ - "temperature": [100, 200, 300], # Required - # pressure uses default from template - # time_step uses default from template - }, - output_expression="max_temp" -) -``` - -## Default Value Types - 
-### Numeric Values - -``` -Threads: ${threads~4} -Timeout: ${timeout~30.5} -``` - -### String Values - -``` -Name: ${name~MyApp} -Message: ${msg~Hello World} -Path: ${path~/usr/local/bin} -``` - -### Boolean-like Values - -``` -Debug: ${debug~false} -Enabled: ${enabled~true} -``` - -### URLs and Paths - -``` -API: ${api_url~http://localhost:8080/api} -Config: ${config_path~./config/default.json} -``` - -### Empty Strings - -Use empty default to make variable optional: - -``` -Suffix: ${suffix~} -Optional: ${opt~} -``` - -## Advanced Usage - -### With Formulas - -Default values work alongside formula evaluation: - -``` -# Template -Port: ${port~8080} -URL: http://localhost:@{$port}/api - -# Python code -content = replace_variables_in_content(template, {"port": 3000}) -result = evaluate_formulas(content, model, {"port": 3000}) -# Result: "Port: 3000" and "URL: http://localhost:3000/api" -``` - -### Parsing Variables - -The `parse_variables_from_content` function extracts variable names, ignoring defaults: - -```python -from fz.interpreter import parse_variables_from_content - -content = "${var1~default1}, ${var2}, ${var3~default3}" -variables = parse_variables_from_content(content) -# Returns: {"var1", "var2", "var3"} -``` - -## Limitations - -### Tilde in Default Values - -If your default value contains a tilde `~`, only the part before the first `~` in the variable definition is treated as the variable name: - -``` -# This may not work as expected: -${path~~~home/user} # Variable: path, Default: ~home/user -``` - -### Braces in Default Values - -If your default value contains the closing delimiter `}`, it will prematurely close the variable pattern: - -``` -# Problematic: -${json~{key: value}} # Will only capture up to first } -``` - -## Migration from Earlier Versions - -If you're upgrading from an earlier version of FZ: - -### Before (v0.9.0 and earlier) - -All variables had to be provided, or they would remain as placeholders: - -```python -content = "Port: 
${port}" -input_variables = {} -# Result: "Port: ${port}" -``` - -### After (v0.9.1+) - -You can now specify defaults: - -```python -content = "Port: ${port~8080}" -input_variables = {} -# Result: "Port: 8080" -# Warning: Variable 'port' not found in input_variables, using default value: '8080' -``` - -### Backward Compatibility - -The old syntax still works exactly as before. Default values are opt-in: - -- `$var` - Simple variable (unchanged) -- `${var}` - Delimited variable (unchanged) -- `${var~default}` - New: variable with default - -## Best Practices - -1. **Use descriptive default values** that make sense in a development/testing context -2. **Document your template variables** and their defaults -3. **Keep defaults simple** - avoid complex expressions or special characters -4. **Use defaults for optional configuration** but require critical parameters -5. **Review warnings** - they indicate which variables are using defaults - -## Examples - -### Docker Configuration - -```dockerfile -# Dockerfile.template -FROM ${BASE_IMAGE~python:3.11-slim} - -ENV APP_HOME=${APP_HOME~/app} -ENV PORT=${PORT~8080} -ENV WORKERS=${WORKERS~4} - -WORKDIR $APP_HOME -COPY . . - -CMD gunicorn -w $WORKERS -b 0.0.0.0:$PORT app:app -``` - -### Kubernetes ConfigMap - -```yaml -# configmap.yaml.template -apiVersion: v1 -kind: ConfigMap -metadata: - name: ${APP_NAME~myapp}-config -data: - database.url: ${DATABASE_URL~postgresql://localhost:5432/mydb} - cache.ttl: "${CACHE_TTL~3600}" - log.level: ${LOG_LEVEL~INFO} - feature.beta: "${FEATURE_BETA~false}" -``` - -### Scientific Simulation - -``` -# simulation.template -Simulation Parameters: - particles: $n_particles - time_steps: ${time_steps~1000} - dt: ${dt~0.001} - temperature: ${temp~300.0} - pressure: ${pressure~1.0} - output_freq: ${output_freq~100} -``` - -## Summary - -Default values provide a powerful way to create flexible, reusable templates with sensible fallback values. 
They're especially useful for: - -- Configuration management -- Environment-specific deployments -- Optional parameters in parametric studies -- Template files with development defaults -- Reducing the number of required variables - -The syntax is simple: `${variable~default}`, and the behavior is predictable: use the provided value if available, otherwise use the default and warn. diff --git a/fz/__init__.py b/fz/__init__.py index 891c875..ee7d36e 100644 --- a/fz/__init__.py +++ b/fz/__init__.py @@ -10,7 +10,7 @@ - Smart caching and retry mechanisms """ -from .core import fzi, fzc, fzo, fzr, check_bash_availability_on_windows +from .core import fzi, fzc, fzo, fzr, fzd, check_bash_availability_on_windows # Check bash availability on Windows at import time # This ensures users get immediate feedback if bash is not available @@ -30,68 +30,16 @@ install_model, uninstall_model, list_installed_models, + install_algorithm, + uninstall_algorithm, + list_installed_algorithms, ) - -def install(model, global_install=False): - """ - Install a model from a source (GitHub name, URL, or local zip file) - - Args: - model: Model source to install (GitHub name, URL, or local zip file) - global_install: If True, install to ~/.fz/models/, else ./.fz/models/ - - Returns: - Installation result dict with 'model_name' and 'install_path' keys - - Examples: - >>> install(model='moret') - >>> install(model='https://github.com/Funz/fz-moret') - >>> install(model='fz-moret.zip') - >>> install(model='moret', global_install=True) - """ - return install_model(model, global_install=global_install) - - -def uninstall(model, global_uninstall=False): - """ - Uninstall a model - - Args: - model: Name of the model to uninstall - global_uninstall: If True, uninstall from ~/.fz/models/, else from ./.fz/models/ - - Returns: - True if successful, False otherwise - - Examples: - >>> uninstall(model='moret') - >>> uninstall(model='moret', global_uninstall=True) - """ - return uninstall_model(model, 
global_uninstall=global_uninstall) - - -def list_models(global_list=False): - """ - List installed models - - Args: - global_list: If True, list from ~/.fz/models/, else from ./.fz/models/ - - Returns: - Dict mapping model names to their definitions - - Examples: - >>> list_models() - >>> list_models(global_list=True) - """ - return list_installed_models(global_list=global_list) - - __version__ = "0.9.0" __all__ = [ - "fzi", "fzc", "fzo", "fzr", - "install", "uninstall", "list_models", + "fzi", "fzc", "fzo", "fzr", "fzd", + "install_model", "uninstall_model", "list_installed_models", + "install_algorithm", "uninstall_algorithm", "list_installed_algorithms", "set_log_level", "get_log_level", "get_config", "reload_config", "print_config", "set_interpreter", "get_interpreter", diff --git a/fz/algorithms.py b/fz/algorithms.py new file mode 100644 index 0000000..80a94a1 --- /dev/null +++ b/fz/algorithms.py @@ -0,0 +1,989 @@ +""" +Algorithm framework for iterative design of experiments (fzd) + +This module provides the base interface and utilities for algorithms used with fzd. + +Algorithm Interface: +------------------- +Each algorithm must be a class with the following methods: + +1. __init__(self, **options): + Constructor that accepts algorithm-specific options + +2. get_initial_design(self, input_vars, output_vars): + Returns initial design of experiments + Args: + input_vars: Dict[str, tuple] - {var_name: (min, max)} + e.g., {"x": (0.0, 1.0), "y": (-5.0, 5.0)} + output_vars: List[str] - List of output variable names + Returns: + List[Dict[str, float]] - List of input variable combinations to evaluate + e.g., [{"x": 0.5, "y": 0.0}, {"x": 0.7, "y": 2.3}] + +3. 
get_next_design(self, previous_input_vars, previous_output_values): + Returns next design of experiments based on previous results + Args: + previous_input_vars: List[Dict[str, float]] - Previous input combinations + e.g., [{"x": 0.5, "y": 0.0}, {"x": 0.7, "y": 2.3}] + previous_output_values: List[float] - Corresponding output values (may contain None) + e.g., [1.5, None, 2.3, 0.8] + Returns: + List[Dict[str, float]] - Next input variable combinations to evaluate + Returns empty list [] when algorithm is finished + + Note: While the interface uses Python lists, algorithms can convert to numpy arrays + internally for numerical computation. Output values may contain None for failed + evaluations, so filter these out before numerical operations. + +4. get_analysis(self, input_vars, output_values): + Returns the final analysis of the results + Args: + input_vars: List[Dict[str, float]] - All evaluated input combinations + output_values: List[float] - All corresponding output values (may contain None) + Returns: + Dict with analysis information (can include 'text', 'data', 'plot', etc.) + + Note: Should handle None values in output_values (failed evaluations) + +5. get_analysis_tmp(self, input_vars, output_values): [OPTIONAL] + Display intermediate results at each iteration + Args: + input_vars: List[Dict[str, float]] - All evaluated inputs so far + output_values: List[float] - All outputs so far (may contain None) + Returns: + Dict with analysis information (typically 'text' and 'data' keys) + + Note: This method is optional. If present, it will be called after each iteration + to show progress. If not present, no intermediate results are displayed. 
+""" + +import re +import importlib +import importlib.util +import sys +import subprocess +import inspect +from pathlib import Path +from typing import Dict, List, Tuple, Any, Optional +import logging + + +def parse_input_vars(input_vars: Dict[str, str]) -> Dict[str, Tuple[float, float]]: + """ + Parse input variable ranges from string descriptions + + Supports two formats: + - Range (variable): "[min;max]" or "[min,max]" - will be varied by algorithm + - Fixed (unique): single numeric value - will NOT be varied by algorithm + + Args: + input_vars: Dict of {var_name: "[min;max]"} or {var_name: "value"} + + Returns: + Dict of {var_name: (min, max)} for range variables only + + Examples: + >>> parse_input_vars({"x": "[0;1]", "y": "[-5.5;5.5]"}) + {'x': (0.0, 1.0), 'y': (-5.5, 5.5)} + + >>> parse_input_vars({"x": "[0;1]", "z": "0.5"}) # z is fixed + {'x': (0.0, 1.0)} + """ + parsed = {} + + for var_name, range_str in input_vars.items(): + # Check if it's a range format [min;max] or [min,max] + match = re.match(r'\[([^;,]+)[;,]([^;,]+)\]', range_str.strip()) + + if match: + # It's a range - parse it + try: + min_val = float(match.group(1).strip()) + max_val = float(match.group(2).strip()) + except ValueError as e: + raise ValueError( + f"Invalid numeric values in range for variable '{var_name}': '{range_str}'" + ) from e + + if min_val >= max_val: + raise ValueError( + f"Invalid range for variable '{var_name}': min ({min_val}) must be < max ({max_val})" + ) + + parsed[var_name] = (min_val, max_val) + else: + # It's a fixed value - skip it (will be handled by parse_fixed_vars) + # Try to validate it's a number + try: + float(range_str.strip()) + except ValueError: + raise ValueError( + f"Invalid format for variable '{var_name}': '{range_str}'. 
" + f"Expected '[min;max]' for range or numeric value for fixed variable" + ) + + return parsed + + +def parse_fixed_vars(input_vars: Dict[str, str]) -> Dict[str, float]: + """ + Parse fixed (unique) input variables from string descriptions + + Fixed variables have single numeric values and will NOT be varied by the algorithm. + + Args: + input_vars: Dict of {var_name: "value"} + + Returns: + Dict of {var_name: value} for fixed variables only + + Examples: + >>> parse_fixed_vars({"x": "[0;1]", "z": "0.5"}) + {'z': 0.5} + """ + fixed = {} + + for var_name, value_str in input_vars.items(): + # Check if it's NOT a range format + if not re.match(r'\[([^;,]+)[;,]([^;,]+)\]', value_str.strip()): + try: + fixed[var_name] = float(value_str.strip()) + except ValueError as e: + raise ValueError( + f"Invalid numeric value for fixed variable '{var_name}': '{value_str}'" + ) from e + + return fixed + + +def evaluate_output_expression( + expression: str, + output_data: Dict[str, Any] +) -> float: + """ + Evaluate mathematical expression using output variables + + Args: + expression: Mathematical expression like "output1 + output2 * 2" + output_data: Dict of output variable values + + Returns: + Evaluated numeric result + + Examples: + >>> evaluate_output_expression("x + y * 2", {"x": 1.0, "y": 3.0}) + 7.0 + """ + try: + # Create a safe evaluation environment with only the output variables + # and math functions + import math + safe_dict = { + # Math functions + 'abs': abs, + 'min': min, + 'max': max, + 'pow': pow, + 'sqrt': math.sqrt, + 'exp': math.exp, + 'log': math.log, + 'log10': math.log10, + 'sin': math.sin, + 'cos': math.cos, + 'tan': math.tan, + 'asin': math.asin, + 'acos': math.acos, + 'atan': math.atan, + 'atan2': math.atan2, + 'pi': math.pi, + 'e': math.e, + } + + # Add output variables + safe_dict.update(output_data) + + # Evaluate the expression + result = eval(expression, {"__builtins__": {}}, safe_dict) + + return float(result) + + except Exception as e: + raise 
ValueError( + f"Failed to evaluate output expression '{expression}' with data {output_data}: {e}" + ) from e + + +def _is_algorithm_class(obj) -> bool: + """ + Check if an object is a valid algorithm class + + Args: + obj: Object to check + + Returns: + True if obj is a class with required algorithm methods + """ + if not inspect.isclass(obj): + return False + + # Check for required methods + required_methods = ['get_initial_design', 'get_next_design', 'get_analysis'] + for method_name in required_methods: + if not hasattr(obj, method_name): + return False + + return True + + +def _parse_algorithm_metadata(file_path: Path) -> Dict[str, Any]: + """ + Parse metadata from algorithm file comments + + Looks for comments like: + #title: Algorithm title + #author: Author name + #type: algorithm type + #options: key1=value1;key2=value2 + #require: package1;package2;package3 + + Args: + file_path: Path to algorithm file + + Returns: + Dict with parsed metadata + """ + metadata = {} + + try: + with open(file_path, 'r') as f: + for line in f: + line = line.strip() + + # Stop at first non-comment line + if line and not line.startswith('#'): + break + + # Parse metadata lines + if line.startswith('#'): + # Remove leading # and split on first : + content = line[1:].strip() + if ':' in content: + key, value = content.split(':', 1) + key = key.strip() + value = value.strip() + + # Store metadata + if key == 'options': + # Parse options as key=value pairs separated by semicolons + options_dict = {} + for opt in value.split(';'): + if '=' in opt: + opt_key, opt_val = opt.split('=', 1) + options_dict[opt_key.strip()] = opt_val.strip() + metadata[key] = options_dict + elif key == 'require': + # Parse requirements as semicolon-separated list + metadata[key] = [pkg.strip() for pkg in value.split(';')] + else: + metadata[key] = value + except Exception as e: + logging.warning(f"Failed to parse metadata from {file_path}: {e}") + + return metadata + + +def 
_load_algorithm_from_file(file_path: Path, **options): + """ + Load algorithm class from a Python file + + Args: + file_path: Path to Python file containing algorithm class + **options: Options to pass to algorithm constructor + + Returns: + Algorithm instance + + Raises: + ValueError: If no valid algorithm class is found in the file + """ + # Parse metadata from file + metadata = _parse_algorithm_metadata(file_path) + + # Check and install required packages if specified + if 'require' in metadata: + for package in metadata['require']: + try: + importlib.import_module(package) + logging.info(f"✓ Package '{package}' is available") + except ImportError: + logging.info(f"⚠️ Package '{package}' not found - attempting to install...") + try: + # Install the package using pip + subprocess.check_call( + [sys.executable, "-m", "pip", "install", package], + stdout=subprocess.DEVNULL, + stderr=subprocess.PIPE + ) + logging.info(f"✓ Successfully installed '{package}'") + + # Verify the installation + try: + importlib.import_module(package) + except ImportError: + logging.warning( + f"Package '{package}' was installed but could not be imported. " + f"You may need to restart your Python session." + ) + except subprocess.CalledProcessError as e: + error_msg = e.stderr.decode('utf-8') if e.stderr else '' + raise RuntimeError( + f"Failed to install required package '{package}'. 
" + f"Please install it manually with: pip install {package}\n" + f"Error: {error_msg}" + ) from e + except Exception as e: + raise RuntimeError( + f"Unexpected error while installing package '{package}': {e}\n" + f"Please install it manually with: pip install {package}" + ) from e + + # Merge metadata options with passed options (passed options take precedence) + if 'options' in metadata: + merged_options = metadata['options'].copy() + merged_options.update(options) + options = merged_options + + # Load the Python module from file + module_name = file_path.stem + spec = importlib.util.spec_from_file_location(module_name, file_path) + + if spec is None or spec.loader is None: + raise ValueError(f"Failed to load module from {file_path}") + + module = importlib.util.module_from_spec(spec) + + # Add to sys.modules to allow imports within the module + sys.modules[module_name] = module + + try: + spec.loader.exec_module(module) + except Exception as e: + # Clean up sys.modules on failure + if module_name in sys.modules: + del sys.modules[module_name] + raise ValueError(f"Failed to execute module from {file_path}: {e}") from e + + # Find algorithm classes in the module + algorithm_classes = [] + for name, obj in inspect.getmembers(module): + if _is_algorithm_class(obj): + algorithm_classes.append((name, obj)) + + if not algorithm_classes: + raise ValueError( + f"No valid algorithm class found in {file_path}. 
" + f"Algorithm class must have methods: get_initial_design, get_next_design, get_analysis" + ) + + # If multiple classes found, prefer one that's not BaseAlgorithm + if len(algorithm_classes) > 1: + algorithm_classes = [(n, c) for n, c in algorithm_classes if n != 'BaseAlgorithm'] + + if not algorithm_classes: + raise ValueError(f"No valid algorithm class found in {file_path} (only BaseAlgorithm found)") + + # Use the first (or only) algorithm class + algorithm_name, algorithm_class = algorithm_classes[0] + + logging.info(f"Loaded algorithm class '{algorithm_name}' from {file_path}") + if metadata: + logging.info(f"Algorithm metadata: {metadata}") + + # Create and return instance + return algorithm_class(**options) + + +class RAlgorithmWrapper: + """ + Python wrapper for R algorithms loaded via rpy2 + + This class wraps an R algorithm instance and exposes its methods as Python methods, + handling data conversion between Python and R. + """ + + def __init__(self, r_instance, r_globals): + """ + Initialize wrapper with R instance + + Args: + r_instance: R algorithm instance from rpy2 + r_globals: R global environment containing generic functions + """ + self.r_instance = r_instance + self.r_globals = r_globals + + def get_initial_design(self, input_vars: Dict[str, Tuple[float, float]], output_vars: List[str]) -> List[Dict[str, float]]: + """ + Call R's get_initial_design method and convert result to Python + + Args: + input_vars: Dict of {var_name: (min, max)} + output_vars: List of output variable names + + Returns: + List of input variable combinations (Python dicts) + """ + try: + from rpy2 import robjects + from rpy2.robjects import vectors + except ImportError: + raise ImportError("rpy2 is required to use R algorithms. 
Install with: pip install rpy2") + + # Convert input_vars to R format: named list with c(min, max) vectors + r_input_vars = robjects.ListVector({ + var: vectors.FloatVector([bounds[0], bounds[1]]) + for var, bounds in input_vars.items() + }) + + # Convert output_vars to R character vector + r_output_vars = vectors.StrVector(output_vars) + + # Call R method + r_result = self.r_globals['get_initial_design'](self.r_instance, r_input_vars, r_output_vars) + + # Convert R list of lists to Python list of dicts + return self._r_design_to_python(r_result) + + def get_next_design(self, X: List[Dict[str, float]], Y: List[float]) -> List[Dict[str, float]]: + """ + Call R's get_next_design method and convert result to Python + + Args: + X: Previous input combinations (list of dicts) + Y: Previous output values (list of floats, may contain None) + + Returns: + Next input variable combinations (Python dicts), or empty list if finished + """ + try: + from rpy2 import robjects + except ImportError: + raise ImportError("rpy2 is required to use R algorithms. Install with: pip install rpy2") + + # Convert X and Y to R format + r_X = self._python_design_to_r(X) + r_Y = self._python_outputs_to_r(Y) + + # Call R method + r_result = self.r_globals['get_next_design'](self.r_instance, r_X, r_Y) + + # Convert result to Python + return self._r_design_to_python(r_result) + + def get_analysis(self, X: List[Dict[str, float]], Y: List[float]) -> Dict[str, Any]: + """ + Call R's get_analysis method and convert result to Python + + Args: + X: All evaluated input combinations + Y: All output values (may contain None) + + Returns: + Dict with analysis information ('text', 'data', etc.) + """ + try: + from rpy2 import robjects + except ImportError: + raise ImportError("rpy2 is required to use R algorithms. 
Install with: pip install rpy2") + + # Convert X and Y to R format + r_X = self._python_design_to_r(X) + r_Y = self._python_outputs_to_r(Y) + + # Call R method + r_result = self.r_globals['get_analysis'](self.r_instance, r_X, r_Y) + + # Convert R list to Python dict + return self._r_dict_to_python(r_result) + + def get_analysis_tmp(self, X: List[Dict[str, float]], Y: List[float]) -> Dict[str, Any]: + """ + Call R's get_analysis_tmp method if it exists + + Args: + X: Evaluated input combinations so far + Y: Output values so far (may contain None) + + Returns: + Dict with intermediate analysis information, or None if method doesn't exist + """ + try: + from rpy2 import robjects + except ImportError: + raise ImportError("rpy2 is required to use R algorithms. Install with: pip install rpy2") + + # Check if method exists in R + try: + # Convert X and Y to R format + r_X = self._python_design_to_r(X) + r_Y = self._python_outputs_to_r(Y) + + # Call R method + r_result = self.r_globals['get_analysis_tmp'](self.r_instance, r_X, r_Y) + + # Convert R list to Python dict + return self._r_dict_to_python(r_result) + except Exception: + # Method doesn't exist or failed - return None + return None + + def _python_design_to_r(self, design: List[Dict[str, float]]): + """Convert Python design (list of dicts) to R list of lists""" + from rpy2 import robjects + from rpy2.robjects import vectors + + if not design: + # Empty list + return robjects.r('list()') + + # Convert to R list of lists + r_list = robjects.r('list()') + for point in design: + r_point = robjects.ListVector(point) + r_list = robjects.r.c(r_list, robjects.r.list(r_point)) + + return r_list + + def _python_outputs_to_r(self, outputs: List[float]): + """Convert Python outputs (list of floats/None) to R list""" + from rpy2 import robjects + + # Convert Python list to R list, preserving None as NULL + # Use R's list() function directly + r_list = robjects.r('list()') + + for val in outputs: + if val is None: + # Append R 
NULL + r_list = robjects.r.c(r_list, robjects.r('list(NULL)')) + else: + # Append numeric value + r_list = robjects.r.c(r_list, robjects.r.list(val)) + + return r_list + + def _r_design_to_python(self, r_design) -> List[Dict[str, float]]: + """Convert R design (list of lists) to Python list of dicts""" + if r_design is None or len(r_design) == 0: + return [] + + result = [] + for r_point in r_design: + # Convert R list to Python dict + point = {} + for name in r_point.names: + point[name] = float(r_point.rx2(name)[0]) + result.append(point) + + return result + + def _r_dict_to_python(self, r_list) -> Dict[str, Any]: + """Convert R list to Python dict, handling nested structures""" + from rpy2 import robjects + from rpy2.robjects import vectors + + result = {} + + if r_list is None: + return result + + for name in r_list.names: + r_value = r_list.rx2(name) + + # Convert based on R type + if isinstance(r_value, vectors.StrVector): + # String or character vector + if len(r_value) == 1: + result[name] = str(r_value[0]) + else: + result[name] = [str(v) for v in r_value] + elif isinstance(r_value, (vectors.FloatVector, vectors.IntVector)): + # Numeric vector + if len(r_value) == 1: + result[name] = float(r_value[0]) + else: + result[name] = [float(v) for v in r_value] + elif isinstance(r_value, vectors.ListVector): + # Nested list - recursively convert + result[name] = self._r_dict_to_python(r_value) + else: + # Try to convert to Python directly + try: + result[name] = r_value + except Exception: + # If conversion fails, store as string + result[name] = str(r_value) + + return result + + +def _load_r_algorithm_from_file(file_path: Path, **options): + """ + Load algorithm from an R file using rpy2 + + Args: + file_path: Path to R file containing algorithm S3 class + **options: Options to pass to algorithm constructor + + Returns: + RAlgorithmWrapper instance wrapping the R algorithm + + Raises: + ImportError: If rpy2 is not installed + ValueError: If R algorithm cannot be 
loaded + """ + try: + from rpy2 import robjects + from rpy2.robjects import vectors + except ImportError: + raise ImportError( + "rpy2 is required to use R algorithms.\n" + "Install with: pip install rpy2\n" + "Note: R must also be installed on your system." + ) + + # Parse metadata from R file + metadata = _parse_algorithm_metadata(file_path) + + # Check and install required R packages if specified + if 'require' in metadata: + for package in metadata['require']: + # Check if R package is available + r_check = robjects.r(f''' + if (!requireNamespace("{package}", quietly = TRUE)) {{ + FALSE + }} else {{ + TRUE + }} + ''') + + if not r_check[0]: + logging.warning( + f"⚠️ R package '{package}' not found.\n" + f" Install in R with: install.packages('{package}')" + ) + + # Merge metadata options with passed options + if 'options' in metadata: + merged_options = metadata['options'].copy() + merged_options.update(options) + options = merged_options + + # Source the R file + logging.info(f"Loading R algorithm from {file_path}") + try: + robjects.r.source(str(file_path)) + except Exception as e: + raise ValueError(f"Failed to source R file {file_path}: {e}") from e + + # Get R global environment + r_globals = robjects.globalenv + + # Find the algorithm constructor function + # Search for functions in R global environment that could be constructors + # Priority: try matching the file stem first, then search all functions + + # Try different naming conventions for the file stem + possible_names = [ + file_path.stem, # montecarlo_uniform + file_path.stem.replace('_', '').title(), # Montecarlouniform + ''.join(word.capitalize() for word in file_path.stem.split('_')), # MontecarloUniform + '_'.join(word.capitalize() for word in file_path.stem.split('_')), # Montecarlo_Uniform + file_path.stem.title(), # Montecarlo_Uniform + ] + + constructor = None + constructor_name = None + + # First, try the expected naming conventions + for name in possible_names: + if name in r_globals: + 
constructor = r_globals[name] + constructor_name = name + logging.info(f"Found constructor by name matching: {name}") + break + + # If not found, search all objects for likely constructors + # Look for functions that match pattern: PascalCase or Mixed_Case + if constructor is None: + logging.info(f"Constructor not found in expected names: {possible_names}") + logging.info("Searching all R objects for potential constructors...") + + for name in sorted(list(r_globals.keys()), reverse=True): # Reverse sort to prefer longer/specific names + # Skip generic functions (they have dots for S3 dispatch) + if '.' in name: + continue + + # Skip all-lowercase names (likely helper functions, not constructors) + if name.islower(): + continue + + # Skip names starting with lowercase (not constructors) + if name[0].islower(): + continue + + # Check if it's a function + try: + obj = r_globals[name] + # Check if it's callable (function) + if robjects.r['is.function'](obj)[0]: + constructor = obj + constructor_name = name + logging.info(f"Found potential constructor: {name}") + break + except Exception: + continue + + if constructor is None: + available_objects = [name for name in r_globals.keys() if not name.startswith('.')] + raise ValueError( + f"No algorithm constructor found in {file_path}.\n" + f"Tried names: {possible_names}\n" + f"Available objects in R file: {available_objects}\n" + f"The R file should define a constructor function matching the filename." + ) + + logging.info(f"Found R algorithm constructor: {constructor_name}") + + # Call constructor with options + # R constructors use ... 
syntax, so we need to pass as named arguments + # rpy2 requires explicit conversion for some types + if options: + # Convert Python options to R-compatible types and pass as kwargs + r_kwargs = {} + for k, v in options.items(): + if isinstance(v, bool): + r_kwargs[k] = robjects.vectors.BoolVector([v]) + elif isinstance(v, int): + r_kwargs[k] = robjects.vectors.IntVector([v]) + elif isinstance(v, float): + r_kwargs[k] = robjects.vectors.FloatVector([v]) + elif isinstance(v, str): + r_kwargs[k] = robjects.vectors.StrVector([v]) + else: + r_kwargs[k] = v + + # Call with **kwargs - rpy2 will handle the ... properly + r_instance = constructor(**r_kwargs) + else: + r_instance = constructor() + + logging.info(f"Created R algorithm instance of class: {robjects.r['class'](r_instance)[0]}") + + # Create and return wrapper + return RAlgorithmWrapper(r_instance, r_globals) + + +def resolve_algorithm_path(algorithm: str) -> Optional[Path]: + """ + Resolve algorithm name to file path, searching in plugin directories + + This function implements the algorithm plugin system, similar to model aliases. + When given a simple name (e.g., "myalgorithm"), it searches for the algorithm + file in: + 1. .fz/algorithms/ (project-level, priority) + 2. ~/.fz/algorithms/ (global) + + Supports both .py and .R extensions. 
+ + Args: + algorithm: Algorithm name (e.g., "myalgorithm") or path (e.g., "path/to/algo.py") + + Returns: + Path to algorithm file if found, None otherwise + + Examples: + >>> resolve_algorithm_path("myalgorithm") # Looks for .fz/algorithms/myalgorithm.py + Path('/path/to/project/.fz/algorithms/myalgorithm.py') + + >>> resolve_algorithm_path("path/to/algo.py") # Not resolved here (it's already a path) + None # Caller should handle as direct path + """ + import os + + # If algorithm looks like a path (contains / or \ or has extension), don't resolve + if '/' in algorithm or '\\' in algorithm or algorithm.endswith(('.py', '.R')): + return None + + # Search in plugin directories + # Respect environment variables for home directory (for test mocking) + # Try HOME first (Unix), then USERPROFILE (Windows), then fall back to expanduser + home_dir = os.environ.get('HOME') or os.environ.get('USERPROFILE') or os.path.expanduser('~') + + search_dirs = [ + Path.cwd() / ".fz" / "algorithms", # Project-level (priority) + Path(home_dir) / ".fz" / "algorithms" # Global + ] + + # Try both .py and .R extensions + extensions = ['.py', '.R'] + + for base_dir in search_dirs: + if not base_dir.exists(): + continue + + for ext in extensions: + algo_path = base_dir / f"{algorithm}{ext}" + if algo_path.exists() and algo_path.is_file(): + logging.info(f"Found algorithm plugin: {algo_path}") + return algo_path + + return None + + +def load_algorithm(algorithm: str, **options): + """ + Load an algorithm from a Python or R file and create an instance with options + + This function supports two modes: + 1. Plugin mode: Simple name (e.g., "myalgorithm") - searches in .fz/algorithms/ + 2. 
Direct path: File path (e.g., "path/to/algo.py" or "algorithms/monte_carlo.R") + + Plugin directories (searched in order): + - .fz/algorithms/ (project-level, priority) + - ~/.fz/algorithms/ (global) + + Args: + algorithm: Algorithm name (plugin) or path to Python (.py) or R (.R) file + **options: Algorithm-specific options passed to the algorithm's __init__ method + + Returns: + Algorithm instance (Python object or R wrapper) + + Raises: + ValueError: If the file doesn't exist, contains no valid algorithm class, or cannot be loaded + ImportError: If rpy2 is not installed for .R files + + Examples: + # Plugin mode - searches in .fz/algorithms/ + >>> algo = load_algorithm("myalgorithm", batch_size=10) + + # Direct path mode + >>> algo = load_algorithm("algorithms/monte_carlo.py", batch_size=10, max_iter=100) + >>> algo = load_algorithm("algorithms/monte_carlo.R", batch_size=10) # requires rpy2 + """ + # Try to resolve as plugin first + resolved_path = resolve_algorithm_path(algorithm) + + if resolved_path is not None: + # Found as plugin + algorithm_path = str(resolved_path) + logging.info(f"Loading algorithm from plugin: {algorithm_path}") + else: + # Treat as direct path + algorithm_path = algorithm + + # Convert to Path object + algo_path = Path(algorithm_path) + + # Resolve to absolute path if relative + if not algo_path.is_absolute(): + algo_path = Path.cwd() / algo_path + + # Validate path + if not algo_path.exists(): + # Provide helpful error message + error_msg = f"Algorithm file not found: {algo_path}\n" + + # If it looks like a plugin name, suggest where to place it + if '/' not in algorithm and '\\' not in algorithm and not algorithm.endswith(('.py', '.R')): + error_msg += ( + f"Plugin '{algorithm}' not found. 
To use as plugin, place algorithm file at:\n" + f" - .fz/algorithms/{algorithm}.py (project-level), or\n" + f" - ~/.fz/algorithms/{algorithm}.py (global)\n" + f"Supported extensions: .py, .R" + ) + else: + error_msg += "Please provide a valid path to a Python (.py) or R (.R) file containing an algorithm class." + + raise ValueError(error_msg) + + if not algo_path.is_file(): + raise ValueError(f"Algorithm path is not a file: {algo_path}") + + # Check file extension and load appropriately + if str(algo_path).endswith('.py'): + # Load Python algorithm + return _load_algorithm_from_file(algo_path, **options) + elif str(algo_path).endswith('.R'): + # Load R algorithm + return _load_r_algorithm_from_file(algo_path, **options) + else: + raise ValueError( + f"Algorithm file must be a Python (.py) or R (.R) file: {algo_path}\n" + f"Got: {algo_path.suffix}" + ) + + +class BaseAlgorithm: + """ + Base class for algorithms (optional, for reference) + + Algorithms don't need to inherit from this, but it documents the interface + """ + + def __init__(self, **options): + """Initialize algorithm with options""" + self.options = options + + def get_initial_design( + self, + input_vars: Dict[str, Tuple[float, float]], + output_vars: List[str] + ) -> List[Dict[str, float]]: + """ + Generate initial design of experiments + + Args: + input_vars: Dict of {var_name: (min, max)} + output_vars: List of output variable names + + Returns: + List of input variable combinations to evaluate + """ + raise NotImplementedError() + + def get_next_design( + self, + previous_input_vars: List[Dict[str, float]], + previous_output_values: List[float] + ) -> List[Dict[str, float]]: + """ + Generate next design based on previous results + + Args: + previous_input_vars: Previous input combinations + previous_output_values: Corresponding output values + + Returns: + Next input variable combinations to evaluate + Returns empty list when finished + """ + raise NotImplementedError() + + def get_analysis( + 
self, + input_vars: List[Dict[str, float]], + output_values: List[float] + ) -> Dict[str, Any]: + """ + Format results for analysis + + Args: + input_vars: All evaluated input combinations + output_values: All corresponding output values + + Returns: + Dict with analysis information + """ + raise NotImplementedError() diff --git a/fz/cli.py b/fz/cli.py index b0ff0ba..db7ef50 100644 --- a/fz/cli.py +++ b/fz/cli.py @@ -12,7 +12,7 @@ except ImportError: from importlib_metadata import version -from . import fzi as fzi_func, fzc as fzc_func, fzo as fzo_func, fzr as fzr_func +from . import fzi as fzi_func, fzc as fzc_func, fzo as fzo_func, fzr as fzr_func, fzd as fzd_func # Get package version @@ -20,13 +20,26 @@ def get_version(): """Get the package version""" try: # Try the new package name first - return version("funz-fz") + v = version("funz-fz") + if v is not None: + return v except Exception: - try: - # Fallback to old package name for backward compatibility - return version("fz") - except Exception: - return "unknown" + pass + + try: + # Fallback to old package name for backward compatibility + v = version("fz") + if v is not None: + return v + except Exception: + pass + + # Fallback to __version__ from __init__.py + try: + from fz import __version__ + return __version__ + except Exception: + return "unknown" # Helper functions used by all CLI commands @@ -128,6 +141,15 @@ def parse_calculators(calc_str): return result +def parse_algorithm(algo_str): + """Parse algorithm from JSON string, JSON file, or alias""" + return parse_argument(algo_str, alias_type='algorithms') + + +def parse_algorithm_options(opts_str): + """Parse algorithm options from JSON string or JSON file""" + return parse_argument(opts_str, alias_type=None) + def format_output(data, format_type='markdown'): """ Format output data in various formats @@ -368,6 +390,65 @@ def fzr_main(): return 1 +def fzd_main(): + """Entry point for fzd command""" + parser = argparse.ArgumentParser(description="fzd - 
Iterative 
design of experiments with algorithms") + parser.add_argument("--version", action="version", version=f"fzd {get_version()}") + parser.add_argument("--input_dir", "-i", required=True, help="Input directory path") + parser.add_argument("--input_vars", "-v", required=True, help="Input variable ranges (JSON file or inline JSON)") + parser.add_argument("--model", "-m", required=True, help="Model definition (JSON file, inline JSON, or alias)") + parser.add_argument("--output_expression", "-e", required=True, help="Output expression to minimize (e.g., 'out1 + out2 * 2')") + parser.add_argument("--algorithm", "-a", required=True, help="Algorithm name (randomsampling, brent, bfgs, ...)") + parser.add_argument("--results_dir", "-r", default="results_fzd", help="Results directory (default: results_fzd)") + parser.add_argument("--calculators", "-c", help="Calculator specifications (JSON file or inline JSON)") + parser.add_argument("--options", "-o", help="Algorithm options (JSON file or inline JSON)") + + args = parser.parse_args() + + try: + model = parse_model(args.model) + variables = parse_variables(args.input_vars) + + calculators = parse_calculators(args.calculators) if args.calculators else None + algo_options = parse_algorithm_options(args.options) if args.options else {} + + result = fzd_func( + args.input_dir, + variables, + model, + args.output_expression, + args.algorithm, + results_dir=args.results_dir, + calculators=calculators, + **(algo_options if isinstance(algo_options, dict) else {}) + ) + + # Print summary + print("\n" + "="*60) + print(result['summary']) + print("="*60) + + if 'analysis' in result and 'text' in result['analysis']: + print(result['analysis']['text']) + + return 0 + except TypeError as e: + # TypeError messages already printed by decorator + # Just show help and exit + print(file=sys.stderr) + parser.print_help(sys.stderr) + return 1 + except (ValueError, FileNotFoundError) as e: + # These error messages already printed by decorator + # Just 
exit with error code + return 1 + except Exception as e: + print(f"Error: {e}", file=sys.stderr) + import traceback + traceback.print_exc() + return 1 + + def main(): """Entry point for 'fz' command with subcommands""" parser = argparse.ArgumentParser(description="fz - Parametric scientific computing") @@ -408,22 +489,62 @@ def main(): choices=["json", "csv", "html", "markdown", "table"], help="Output format (default: markdown)") - # install command - parser_install = subparsers.add_parser("install", help="Install a model from GitHub or local zip file") - parser_install.add_argument("source", help="Model source (GitHub name, URL, or local zip file)") - parser_install.add_argument("--global", dest="global_install", action="store_true", - help="Install to ~/.fz/models/ (default: ./.fz/models/)") - - # list command - parser_list = subparsers.add_parser("list", help="List installed models") - parser_list.add_argument("--global", dest="global_list", action="store_true", - help="List models from ~/.fz/models/ (default: ./.fz/models/)") - - # uninstall command - parser_uninstall = subparsers.add_parser("uninstall", help="Uninstall a model") - parser_uninstall.add_argument("model", help="Model name to uninstall") - parser_uninstall.add_argument("--global", dest="global_uninstall", action="store_true", - help="Uninstall from ~/.fz/models/ (default: ./.fz/models/)") + # design command (fzd) + parser_design = subparsers.add_parser("design", help="Iterative design of experiments with algorithms") + parser_design.add_argument("--input_dir", "-i", required=True, help="Input directory path") + parser_design.add_argument("--input_vars", "-v", required=True, help="Input variable ranges (JSON file or inline JSON)") + parser_design.add_argument("--model", "-m", required=True, help="Model definition (JSON file, inline JSON, or alias)") + parser_design.add_argument("--output_expression", "-e", required=True, help="Output expression to minimize (e.g., 'out1 + out2 * 2')") + 
parser_design.add_argument("--algorithm", "-a", required=True, help="Algorithm name (randomsampling, brent, bfgs, ...)") + parser_design.add_argument("--results_dir", "-r", default="results_fzd", help="Results directory (default: results_fzd)") + parser_design.add_argument("--calculators", "-c", help="Calculator specifications (JSON file or inline JSON)") + parser_design.add_argument("--options", "-o", help="Algorithm options (JSON file or inline JSON)") + + # install command (supports both models and algorithms) + parser_install = subparsers.add_parser("install", help="Install a model or algorithm from GitHub or local zip file") + install_subparsers = parser_install.add_subparsers(dest="install_type", help="Type of resource to install") + + # install model subcommand + parser_install_model = install_subparsers.add_parser("model", help="Install a model") + parser_install_model.add_argument("source", help="Model source (GitHub name, URL, or local zip file)") + parser_install_model.add_argument("--global", dest="global_install", action="store_true", + help="Install to ~/.fz/models/ (default: ./.fz/models/)") + + # install algorithm subcommand + parser_install_algorithm = install_subparsers.add_parser("algorithm", help="Install an algorithm") + parser_install_algorithm.add_argument("source", help="Algorithm source (GitHub name, URL, or local zip file)") + parser_install_algorithm.add_argument("--global", dest="global_install", action="store_true", + help="Install to ~/.fz/algorithms/ (default: ./.fz/algorithms/)") + + # list command (supports both models and algorithms) + parser_list = subparsers.add_parser("list", help="List installed models or algorithms") + list_subparsers = parser_list.add_subparsers(dest="list_type", help="Type of resource to list") + + # list models subcommand + parser_list_models = list_subparsers.add_parser("models", help="List installed models") + parser_list_models.add_argument("--global", dest="global_list", action="store_true", + 
help="List models from ~/.fz/models/ (default: ./.fz/models/)") + + # list algorithms subcommand + parser_list_algorithms = list_subparsers.add_parser("algorithms", help="List installed algorithms") + parser_list_algorithms.add_argument("--global", dest="global_list", action="store_true", + help="List algorithms from ~/.fz/algorithms/ (default: ./.fz/algorithms/)") + + # uninstall command (supports both models and algorithms) + parser_uninstall = subparsers.add_parser("uninstall", help="Uninstall a model or algorithm") + uninstall_subparsers = parser_uninstall.add_subparsers(dest="uninstall_type", help="Type of resource to uninstall") + + # uninstall model subcommand + parser_uninstall_model = uninstall_subparsers.add_parser("model", help="Uninstall a model") + parser_uninstall_model.add_argument("name", help="Model name to uninstall") + parser_uninstall_model.add_argument("--global", dest="global_uninstall", action="store_true", + help="Uninstall from ~/.fz/models/ (default: ./.fz/models/)") + + # uninstall algorithm subcommand + parser_uninstall_algorithm = uninstall_subparsers.add_parser("algorithm", help="Uninstall an algorithm") + parser_uninstall_algorithm.add_argument("name", help="Algorithm name to uninstall") + parser_uninstall_algorithm.add_argument("--global", dest="global_uninstall", action="store_true", + help="Uninstall from ~/.fz/algorithms/ (default: ./.fz/algorithms/)") args = parser.parse_args() @@ -458,36 +579,118 @@ def main(): calculators=calculators) print(format_output(result, args.format)) + elif args.command == "design": + model = parse_model(args.model) + variables = parse_variables(args.input_vars) + + calculators = None + calculators = parse_calculators(args.calculators) if args.calculators else None + + # Parse algorithm options + algo_options = {} + if args.options: + if args.options.endswith('.json'): + with open(args.options) as f: + algo_options = json.load(f) + else: + algo_options = json.loads(args.options) + + result = fzd_func( 
+ args.input_dir, + variables, + model, + args.output_expression, + args.algorithm, + results_dir=args.results_dir, + calculators=calculators, + **algo_options + ) + + # Print summary + print("\n" + "="*60) + print(result['summary']) + print("="*60) + + if 'analysis' in result and 'text' in result['analysis']: + print(result['analysis']['text']) + elif args.command == "install": - from .installer import install_model - result = install_model(args.source, global_install=args.global_install) - print(f"Successfully installed model '{result['model_name']}'") - if result.get('installed_files'): - print(f" Installed {len(result['installed_files'])} additional files from .fz subdirectories") + if args.install_type == "model": + from .installer import install_model + result = install_model(args.source, global_install=args.global_install) + print(f"Successfully installed model '{result['model_name']}'") + if result.get('installed_files'): + print(f" Installed {len(result['installed_files'])} additional files from .fz subdirectories") + elif args.install_type == "algorithm": + from .installer import install_algorithm + result = install_algorithm(args.source, global_install=args.global_install) + print(f"Successfully installed algorithm '{result['algorithm_name']}'") + if len(result.get('all_files', [])) > 1: + print(f" Installed {len(result['all_files'])} files") + else: + print("Error: Please specify 'model' or 'algorithm' to install") + print("Usage: fz install model ") + print(" fz install algorithm ") + return 1 elif args.command == "list": - from .installer import list_installed_models - models = list_installed_models(global_list=args.global_list) - if not models: - location = "~/.fz/models/" if args.global_list else "./.fz/models/" - print(f"No models installed in {location}") + if args.list_type == "models": + from .installer import list_installed_models + models = list_installed_models(global_list=args.global_list) + if not models: + location = "~/.fz/models/" if 
args.global_list else "./.fz/models/" + print(f"No models installed in {location}") + else: + print(f"Installed models:") + for model_name, model_info in models.items(): + model_id = model_info.get('id', 'N/A') + is_global = model_info.get('global', False) + location = "[global]" if is_global else "[local]" + print(f" - {model_name} (id: {model_id}) {location}") + elif args.list_type == "algorithms": + from .installer import list_installed_algorithms + algorithms = list_installed_algorithms(global_list=args.global_list) + if not algorithms: + location = "~/.fz/algorithms/" if args.global_list else "./.fz/algorithms/" + print(f"No algorithms installed in {location}") + else: + print(f"Installed algorithms:") + for algo_name, algo_info in algorithms.items(): + algo_type = algo_info.get('type', 'N/A') + is_global = algo_info.get('global', False) + location = "[global]" if is_global else "[local]" + print(f" - {algo_name} ({algo_type}) {location}") else: - print(f"Installed models:") - for model_name, model_info in models.items(): - model_id = model_info.get('id', 'N/A') - is_global = model_info.get('global', False) - location = "[global]" if is_global else "[local]" - print(f" - {model_name} (id: {model_id}) {location}") + print("Error: Please specify 'models' or 'algorithms' to list") + print("Usage: fz list models") + print(" fz list algorithms") + return 1 elif args.command == "uninstall": - from .installer import uninstall_model - success = uninstall_model(args.model, global_uninstall=args.global_uninstall) - if success: - location = "~/.fz/models/" if args.global_uninstall else "./.fz/models/" - print(f"Successfully uninstalled model '{args.model}' from {location}") + if args.uninstall_type == "model": + from .installer import uninstall_model + success = uninstall_model(args.name, global_uninstall=args.global_uninstall) + if success: + location = "~/.fz/models/" if args.global_uninstall else "./.fz/models/" + print(f"Successfully uninstalled model '{args.name}' 
from {location}") + else: + location = "~/.fz/models/" if args.global_uninstall else "./.fz/models/" + print(f"Model '{args.name}' not found in {location}") + return 1 + elif args.uninstall_type == "algorithm": + from .installer import uninstall_algorithm + success = uninstall_algorithm(args.name, global_uninstall=args.global_uninstall) + if success: + location = "~/.fz/algorithms/" if args.global_uninstall else "./.fz/algorithms/" + print(f"Successfully uninstalled algorithm '{args.name}' from {location}") + else: + location = "~/.fz/algorithms/" if args.global_uninstall else "./.fz/algorithms/" + print(f"Algorithm '{args.name}' not found in {location}") + return 1 else: - location = "~/.fz/models/" if args.global_uninstall else "./.fz/models/" - print(f"Model '{args.model}' not found in {location}") + print("Error: Please specify 'model' or 'algorithm' to uninstall") + print("Usage: fz uninstall model ") + print(" fz uninstall algorithm ") return 1 except Exception as e: @@ -498,4 +701,4 @@ def main(): if __name__ == "__main__": - sys.exit(main()) \ No newline at end of file + sys.exit(main()) diff --git a/fz/config.py b/fz/config.py index baac186..befb91e 100644 --- a/fz/config.py +++ b/fz/config.py @@ -5,7 +5,7 @@ """ import os -from typing import Optional, Union +from typing import Optional from enum import Enum diff --git a/fz/core.py b/fz/core.py index 91f3c54..de2ff0a 100644 --- a/fz/core.py +++ b/fz/core.py @@ -3,22 +3,16 @@ """ import os -import re -import subprocess -import tempfile -import json -import ast import logging import time import uuid +import threading +from collections import defaultdict import signal import sys -import io import platform from pathlib import Path -from typing import Dict, List, Union, Any, Optional, Tuple, TYPE_CHECKING -from concurrent.futures import ThreadPoolExecutor, as_completed -from contextlib import contextmanager +from typing import Dict, List, Union, Any, Optional, TYPE_CHECKING # Configure UTF-8 encoding for 
Windows to handle emoji output if platform.system() == "Windows": @@ -61,50 +55,117 @@ def utf8_open( if TYPE_CHECKING: import pandas -try: - import pandas as pd - - PANDAS_AVAILABLE = True -except ImportError: - PANDAS_AVAILABLE = False - pd = None - logging.warning("pandas not available, fzo() and fzr() will return dicts instead of DataFrames") +import pandas as pd -import threading -from collections import defaultdict import shutil -from .logging import log_error, log_warning, log_info, log_debug, log_progress -from .config import get_config +from .logging import log_error, log_warning, log_info, log_debug from .helpers import ( fz_temporary_directory, - _get_result_directory, - _get_case_directories, _cleanup_fzr_resources, _resolve_model, - get_calculator_manager, - try_calculators_with_retry, - run_single_case, run_cases_parallel, compile_to_result_directories, prepare_temp_directories, - prepare_case_directories, ) from .shell import run_command, replace_commands_in_string from .io import ( + flatten_dict_columns, + get_analysis, + get_and_process_analysis, ensure_unique_directory, - create_hash_file, resolve_cache_paths, find_cache_match, load_aliases, + process_analysis_content, ) from .interpreter import ( parse_variables_from_path, cast_output, ) from .runners import resolve_calculators, run_calculation +from .algorithms import ( + parse_input_vars, + parse_fixed_vars, + evaluate_output_expression, + load_algorithm, +) +import json +def _parse_argument(arg, alias_type=None): + """ + Parse an argument that can be: JSON string, JSON file path, or alias. + + Tries in order: + 1. JSON string (e.g., '{"key": "value"}') + 2. JSON file path (e.g., 'path/to/file.json') + 3. Alias (e.g., 'myalias' -> looks for .fz//myalias.json) + + Args: + arg: The argument to parse (str, dict, list, or other) + alias_type: Type of alias ('models', 'calculators', 'algorithms', etc.) 
+ + Returns: + Parsed data or the original argument if it's not a string + """ + # If not a string, return as-is + if not isinstance(arg, str): + return arg + + if not arg: + return None + + # Try 1: Parse as JSON string (preferred) + if arg.strip().startswith(('{', '[')): + try: + return json.loads(arg) + except json.JSONDecodeError: + pass # Fall through to next option + + # Try 2: Load as JSON file path + if arg.endswith('.json'): + try: + path = Path(arg) + if path.exists(): + with open(path) as f: + return json.load(f) + except (IOError, json.JSONDecodeError): + pass # Fall through to next option + + # Try 3: Load as alias + if alias_type: + from .io import load_aliases + alias_data = load_aliases(arg, alias_type) + if alias_data is not None: + return alias_data + + # If alias_type not provided or alias not found, return as-is + return arg + + +def _resolve_calculators_arg(calculators): + """ + Parse and resolve calculator argument. + + Handles: + - None (defaults to ["sh://"]) + - JSON string, JSON file, or alias string + - Single calculator dict (wraps in list) + - List of calculator specs + """ + if calculators is None: + return ["sh://"] + + # Parse the argument (could be JSON string, file, or alias) + calculators = _parse_argument(calculators, alias_type='calculators') + + # Wrap dict in list if it's a single calculator definition + if isinstance(calculators, dict): + calculators = [calculators] + + return calculators + def _print_function_help(func_name: str, func_doc: str): """Print function signature and docstring to help users""" print(f"\n{'='*60}", file=sys.stderr) @@ -641,8 +702,9 @@ def fzc( if not isinstance(input_path, (str, Path)): raise TypeError(f"input_path must be a string or Path, got {type(input_path).__name__}") - if not isinstance(input_variables, dict): - raise TypeError(f"input_variables must be a dictionary, got {type(input_variables).__name__}") + # Allow dict or pandas DataFrame for input_variables + if not 
isinstance(input_variables, (dict, pd.DataFrame)): + raise TypeError(f"input_variables must be a dictionary or DataFrame, got {type(input_variables).__name__}") if not isinstance(output_dir, (str, Path)): raise TypeError(f"output_dir must be a string or Path, got {type(output_dir).__name__}") @@ -779,7 +841,7 @@ def fzo( rows.append(row) # Return DataFrame if pandas is available, otherwise return first row as dict for backward compatibility - if PANDAS_AVAILABLE: + if True: # pandas is always available df = pd.DataFrame(rows) # Check if all 'path' values follow the "key1=val1,key2=val2,..." pattern @@ -841,86 +903,20 @@ def fzo( cast_values.append(v) df[key] = cast_values + # Flatten any dict-valued columns into separate columns + df = flatten_dict_columns(df) + # Always restore the original working directory os.chdir(working_dir) return df - else: - # Return dict with lists for backward compatibility when no pandas - if not rows: - return {} - - # Convert list of dicts to dict of lists - result_dict = {} - for row in rows: - for key, value in row.items(): - if key not in result_dict: - result_dict[key] = [] - result_dict[key].append(value) - - # Also parse variable values from path if applicable - if len(rows) > 0 and "path" in result_dict: - parsed_vars = {} - all_parseable = True - - for path_val in result_dict["path"]: - # Extract just the last component (subdirectory name) for parsing - path_obj = Path(path_val) - last_component = path_obj.name - - # If last component doesn't contain '=', it's not a key=value pattern - if '=' not in last_component: - all_parseable = False - break - - try: - parts = last_component.split(",") - row_vars = {} - for part in parts: - if "=" in part: - key, val = part.split("=", 1) - row_vars[key.strip()] = val.strip() - else: - all_parseable = False - break - - if not all_parseable: - break - - for key in row_vars: - if key not in parsed_vars: - parsed_vars[key] = [] - parsed_vars[key].append(row_vars[key]) - - except Exception: - 
all_parseable = False - break - - # If all paths were parseable, add the extracted columns - if all_parseable and parsed_vars: - for key, values in parsed_vars.items(): - # Try to cast values to appropriate types - cast_values = [] - for v in values: - try: - if "." not in v: - cast_values.append(int(v)) - else: - cast_values.append(float(v)) - except ValueError: - cast_values.append(v) - result_dict[key] = cast_values - - # Always restore the original working directory - os.chdir(working_dir) - return result_dict @with_helpful_errors def fzr( input_path: str, - input_variables: Dict, + input_variables: Union[Dict, "pandas.DataFrame"], model: Union[str, Dict], results_dir: str = "results", calculators: Union[str, List[str]] = None, @@ -931,7 +927,8 @@ def fzr( Args: input_path: Path to input file or directory - input_variables: Dict of variable values or lists of values for grid + input_variables: Dict of variable values or lists of values for factorial grid, + or pandas DataFrame for non-factorial designs (each row is one case) model: Model definition dict or alias string results_dir: Results directory calculators: Calculator specifications @@ -955,8 +952,9 @@ def fzr( if not isinstance(input_path, (str, Path)): raise TypeError(f"input_path must be a string or Path, got {type(input_path).__name__}") - if not isinstance(input_variables, dict): - raise TypeError(f"input_variables must be a dictionary, got {type(input_variables).__name__}") + # Allow dict or pandas DataFrame for input_variables + if not isinstance(input_variables, (dict, pd.DataFrame)): + raise TypeError(f"input_variables must be a dictionary or DataFrame, got {type(input_variables).__name__}") if not isinstance(results_dir, (str, Path)): raise TypeError(f"results_dir must be a string or Path, got {type(results_dir).__name__}") @@ -1050,6 +1048,9 @@ def fzr( # Prepare results structure results = {var: [] for var in var_names} + # Get output keys from model (for reference), but don't pre-initialize 
arrays + # Output arrays will be created dynamically based on actual case results + # (especially important for dict outputs that get flattened) output_keys = list(model.get("output", {}).keys()) for key in output_keys: results[key] = [] @@ -1064,7 +1065,11 @@ def fzr( temp_path = Path(temp_dir) # Determine if input_variables is non-empty for directory structure decisions - has_input_variables = bool(input_variables) + # Handle both dict and DataFrame input types + if isinstance(input_variables, pd.DataFrame): + has_input_variables = not input_variables.empty + else: + has_input_variables = bool(input_variables) # Compile all combinations directly to result directories, then prepare temp directories compile_to_result_directories( @@ -1091,15 +1096,35 @@ def fzr( ) # Collect results in the correct order, filtering out None (interrupted/incomplete cases) + # First pass: collect all output columns from all cases to support dict flattening + all_output_cols = set() + valid_case_results = [] + for case_result in case_results: # Skip None results (incomplete cases from interrupts) if case_result is None: continue + valid_case_results.append(case_result) + + # Collect all output columns (including flattened dict columns) + metadata_keys = {"var_combo", "path", "calculator", "status", "error", "command"} + for key in case_result.keys(): + if key not in var_names and key not in metadata_keys: + all_output_cols.add(key) + + # Initialize all output columns in results dict + for key in all_output_cols: + if key not in results: + results[key] = [] + + # Second pass: populate results + for case_result in valid_case_results: for var in var_names: results[var].append(case_result["var_combo"][var]) - for key in output_keys: + # Append values for all output columns + for key in all_output_cols: results[key].append(case_result.get(key)) results["path"].append(case_result.get("path", ".")) @@ -1133,11 +1158,15 @@ def fzr( # Always restore the original working directory 
os.chdir(working_dir) - # Prepare final results - if PANDAS_AVAILABLE: - final_results = pd.DataFrame(results) - else: - final_results = results + # Return DataFrame + # Remove any columns that are empty (e.g., original dict columns that were flattened) + # This happens when dict flattening creates new columns (min, max, diff) and the + # original column (stats) is no longer populated + non_empty_results = {k: v for k, v in results.items() if len(v) > 0} + + df = pd.DataFrame(non_empty_results) + # Flatten any dict-valued columns into separate columns + final_results = flatten_dict_columns(df) # Call on_complete callback if callbacks and 'on_complete' in callbacks: @@ -1149,3 +1178,448 @@ def fzr( # Return final results return final_results + + +def _get_and_process_analysis( + algo_instance, + all_input_vars: List[Dict[str, float]], + all_output_values: List[float], + iteration: int, + results_dir: Path, + method_name: str = 'get_analysis' +) -> Optional[Dict[str, Any]]: + """ + Helper to call algorithm's analysis method and process the results. 
+ + Args: + algo_instance: Algorithm instance + all_input_vars: All evaluated input combinations + all_output_values: All corresponding output values + iteration: Current iteration number + results_dir: Directory to save processed results + method_name: Name of the analysis method ('get_analysis' or 'get_analysis_tmp') + + Returns: + Processed analysis dict or None if method doesn't exist or fails + """ + if not hasattr(algo_instance, method_name): + return None + + try: + analysis_method = getattr(algo_instance, method_name) + analysis_dict = analysis_method(all_input_vars, all_output_values) + + if analysis_dict: + # Log text content before processing (for console output) + if 'text' in analysis_dict: + log_info(analysis_dict['text']) + + # Process and save content intelligently + processed = process_analysis_content(analysis_dict, iteration, results_dir) + return processed + return None + + except Exception as e: + log_warning(f"⚠️ {method_name} failed: {e}") + return None + + +def _get_analysis( + algo_instance, + all_input_vars: List[Dict[str, float]], + all_output_values: List[float], + output_expression: str, + algorithm: str, + iteration: int, + results_dir: Path +) -> Dict[str, Any]: + """ + Create final analysis results with analysis information and DataFrame. 
+ + Args: + algo_instance: Algorithm instance + all_input_vars: All evaluated input combinations + all_output_values: All corresponding output values + output_expression: Expression for output column name + algorithm: Algorithm path/name + iteration: Final iteration number + results_dir: Directory for saving results + + Returns: + Dict with analysis results including XY DataFrame and analysis info + """ + # Display final results + log_info("\n" + "="*60) + log_info("📈 Final Results") + log_info("="*60) + + # Get and process final analysis results (logging is done inside) + processed_final_analysis = _get_and_process_analysis( + algo_instance, all_input_vars, all_output_values, + iteration, results_dir, 'get_analysis' + ) + + # If processed_final_analysis is None, create empty dict for backward compatibility + if processed_final_analysis is None: + processed_final_analysis = {} + + # Create DataFrame with all input and output values + df_data = [] + for inp_dict, out_val in zip(all_input_vars, all_output_values): + row = inp_dict.copy() + row[output_expression] = out_val # Use output_expression as column name + df_data.append(row) + + data_df = pd.DataFrame(df_data) + + # Prepare return value + result = { + 'XY': data_df, # DataFrame with all X and Y values + 'analysis': processed_final_analysis, # Use processed analysis instead of raw + 'algorithm': algorithm, + 'iterations': iteration, + 'total_evaluations': len(all_input_vars), + } + + # Add summary + valid_count = sum(1 for v in all_output_values if v is not None) + summary = f"{algorithm} completed: {iteration} iterations, {len(all_input_vars)} evaluations ({valid_count} valid)" + result['summary'] = summary + + return result + + +def fzd( + input_path: str, + input_variables: Dict[str, str], + model: Union[str, Dict], + output_expression: str, + algorithm: str, + calculators: Union[str, List[str]] = None, + algorithm_options: Dict[str, Any] = None, + analysis_dir: str = "analysis" +) -> Dict[str, Any]: + """ + Run 
iterative design of experiments with algorithms + + Requires pandas to be installed. + + Args: + input_path: Path to input file or directory + input_variables: Input variables to vary, as dict of strings {"var1": "[min;max]", ...} + model: Model definition dict or alias string + output_expression: Expression to extract from output files, e.g. "output1 + output2 * 2" + algorithm: Path to algorithm Python file (e.g., "algorithms/montecarlo.py") + calculators: Calculator specifications (default: ["sh://"]) + algorithm_options: Dict of algorithm-specific options (e.g., {"batch_size": 10, "max_iter": 100}) + analysis_dir: Analysis results directory (default: "analysis") + + Returns: + Dict with algorithm results including: + - 'XY': DataFrame of all evaluated input combinations and output values + - 'analysis': Processed analysis from algorithm.get_analysis() + - 'algorithm', 'iterations', 'total_evaluations': Run metadata + - 'summary': Summary text + + Raises: + ImportError: If pandas is not installed + + Example: + >>> analysis = fz.fzd( + ... input_path='input.txt', + ... input_variables={"x1": "[0;10]", "x2": "[0;5]"}, + ... model="mymodel", + ... output_expression="pressure", + ... algorithm="algorithms/montecarlo_uniform.py", + ... calculators=["sh://bash ./calculator.sh"], + ... algorithm_options={"batch_sample_size": 20, "max_iterations": 50}, + ... analysis_dir="fzd_analysis" + ... 
) + """ + # This represents the directory from which the function was launched + working_dir = os.getcwd() + + # Install signal handler for graceful interrupt handling + global _interrupt_requested + _interrupt_requested = False + _install_signal_handler() + + + try: + model = _resolve_model(model) + + # Parse calculator argument (handles JSON string, file, or alias) + calculators = _resolve_calculators_arg(calculators) + + # Get model ID for calculator resolution + model_id = model.get("id") if isinstance(model, dict) else None + calculators = resolve_calculators(calculators, model_id) + + # Convert to absolute paths + input_dir = Path(input_path).resolve() + results_dir = Path(analysis_dir).resolve() + + # Ensure analysis directory is unique (rename existing with timestamp) + results_dir, renamed_results_dir = ensure_unique_directory(results_dir) + + # Parse input variable ranges and fixed values + parsed_input_vars = parse_input_vars(input_variables) # Only variables with ranges + fixed_input_vars = parse_fixed_vars(input_variables) # Fixed (unique) values + + # Log what we're doing + if fixed_input_vars: + log_info(f"🔒 Fixed variables: {', '.join(f'{k}={v}' for k, v in fixed_input_vars.items())}") + if parsed_input_vars: + log_info(f"🔄 Variable ranges: {', '.join(f'{k}={v}' for k, v in parsed_input_vars.items())}") + + # Extract output variable names from the model + output_spec = model.get("output", {}) + output_var_names = list(output_spec.keys()) + + if not output_var_names: + raise ValueError("Model must specify output variables in 'output' field") + + # Load algorithm with options + if algorithm_options is None: + algorithm_options = {} + algo_instance = load_algorithm(algorithm, **algorithm_options) + + # Get initial design from algorithm (only for variable inputs) + log_info(f"🎯 Starting {algorithm} algorithm...") + initial_design_vars = algo_instance.get_initial_design(parsed_input_vars, output_expression) + + # Merge fixed values with 
algorithm-generated design + initial_design = [] + for design_point in initial_design_vars: + # Combine variable values (from algorithm) with fixed values + full_point = {**design_point, **fixed_input_vars} + initial_design.append(full_point) + + # Track all evaluations + all_input_vars = [] + all_output_values = [] + + # Iterative loop + iteration = 0 + current_design = initial_design + + while current_design and not _interrupt_requested: + iteration += 1 + log_info(f"\n📊 Iteration {iteration}: Evaluating {len(current_design)} point(s)...") + + # Create results subdirectory for this iteration + iteration_result_dir = results_dir / f"iter{iteration:03d}" + iteration_result_dir.mkdir(parents=True, exist_ok=True) + + # Run fzr for all points in parallel using calculators + try: + log_info(f" Running {len(current_design)} cases in parallel...") + # Create DataFrame with all variables (both variable and fixed) + all_var_names = list(parsed_input_vars.keys()) + list(fixed_input_vars.keys()) + # Build cache paths: include current iterations and renamed directory if it exists + cache_paths = [f"cache://{results_dir / f'iter{j:03d}'}" for j in range(1, iteration)] + if renamed_results_dir is not None: + # Also check renamed directory for cached results from previous runs + cache_paths.extend([f"cache://{renamed_results_dir / f'iter{j:03d}'}" for j in range(1, 100)]) # Check up to 99 iterations + + result_df = fzr( + str(input_dir), + pd.DataFrame(current_design, columns=all_var_names),# All points in batch + model, + results_dir=str(iteration_result_dir), + calculators=[*cache_paths, *calculators] # Cache paths first, then actual calculators + ) + + # Extract output values for each point + iteration_inputs = [] + iteration_outputs = [] + + # result_df is a DataFrame (pandas is required for fzd) + for i, point in enumerate(current_design): + iteration_inputs.append(point) + + if i < len(result_df): + row = result_df.iloc[i] + output_data = row #{key: row.get(key, None) for 
key in output_var_names} + + # Evaluate output expression + try: + output_value = evaluate_output_expression( + output_expression, + output_data + ) + log_info(f" Point {i+1}: {point} → {output_value:.6g}") + iteration_outputs.append(output_value) + except Exception as e: + available_vars = ', '.join(f"'{k}'" for k in output_data.keys()) + log_warning( + f" Point {i+1}: Failed to evaluate expression '{output_expression}': {e}\n" + f" Available output variables: {available_vars}" + ) + iteration_outputs.append(None) + else: + log_warning(f" Point {i+1}: No results") + iteration_outputs.append(None) + + except Exception as e: + log_error(f" ❌ Error evaluating batch: {e}") + # Add all points with None outputs + iteration_inputs = current_design + iteration_outputs = [None] * len(current_design) + + # Add iteration results to overall tracking + all_input_vars.extend(iteration_inputs) + all_output_values.extend(iteration_outputs) + + # Display intermediate results if the method exists + tmp_display_processed = get_and_process_analysis( + algo_instance, all_input_vars, all_output_values, + iteration, results_dir, 'get_analysis_tmp' + ) + if tmp_display_processed: + log_info(f"\n📊 Iteration {iteration} intermediate results:") + # Text logging is done inside get_and_process_analysis + + # Save iteration results to files + try: + # Save X (input variables) to CSV + x_file = results_dir / f"X_{iteration}.csv" + with open(x_file, 'w') as f: + if all_input_vars: + # Get all variable names from the first entry + var_names = list(all_input_vars[0].keys()) + f.write(','.join(var_names) + '\n') + for inp in all_input_vars: + f.write(','.join(str(inp[var]) for var in var_names) + '\n') + + # Save Y (output values) to CSV + y_file = results_dir / f"Y_{iteration}.csv" + with open(y_file, 'w') as f: + f.write('output\n') + for val in all_output_values: + f.write(f"{val if val is not None else 'NA'}\n") + + # Save HTML results + html_file = results_dir / f"results_{iteration}.html" 
+ html_content = f""" + + + + Iteration {iteration} Results + + + +

Algorithm Results - Iteration {iteration}

+
+

Summary

+

Total samples: {len(all_input_vars)}

+

Valid samples: {sum(1 for v in all_output_values if v is not None)}

+

Iteration: {iteration}

+
+""" + # Add intermediate results from get_analysis_tmp + if tmp_display_processed: + html_content += """ +
+

Intermediate Progress

+""" + # Link to analysis files if they were created + if 'html_file' in tmp_display_processed: + html_content += f'

📄 View HTML Analysis

\n' + if 'md_file' in tmp_display_processed: + html_content += f'

📄 View Markdown Analysis

\n' + if 'json_file' in tmp_display_processed: + html_content += f'

📄 View JSON Data

\n' + if 'txt_file' in tmp_display_processed: + html_content += f'

📄 View Text Data

\n' + if 'text' in tmp_display_processed: + html_content += f"
{tmp_display_processed['text']}
\n" + if 'data' in tmp_display_processed and tmp_display_processed['data']: + html_content += "

Data:

\n
\n"
+                        for key, value in tmp_display_processed['data'].items():
+                            html_content += f"{key}: {value}\n"
+                        html_content += "
\n" + html_content += "
\n" + + # Always call get_analysis for this iteration and process content + iter_display_processed = get_and_process_analysis( + algo_instance, all_input_vars, all_output_values, + iteration, results_dir, 'get_analysis' + ) + if iter_display_processed: + html_content += """ +
+

Current Results

+""" + # Link to analysis files if they were created + if 'html_file' in iter_display_processed: + html_content += f'

📄 View HTML Analysis

\n' + if 'md_file' in iter_display_processed: + html_content += f'

📄 View Markdown Analysis

\n' + if 'json_file' in iter_display_processed: + html_content += f'

📄 View JSON Data

\n' + if 'txt_file' in iter_display_processed: + html_content += f'

📄 View Text Data

\n' + if 'text' in iter_display_processed: + html_content += f"
{iter_display_processed['text']}
\n" + if 'data' in iter_display_processed and iter_display_processed['data']: + html_content += "

Data:

\n
\n"
+                        for key, value in iter_display_processed['data'].items():
+                            html_content += f"{key}: {value}\n"
+                        html_content += "
\n" + html_content += "
\n" + + html_content += """ + + +""" + with open(html_file, 'w') as f: + f.write(html_content) + + log_info(f" 💾 Saved iteration results: {x_file.name}, {y_file.name}, {html_file.name}") + + except Exception as e: + log_warning(f"⚠️ Failed to save iteration files: {e}") + + if _interrupt_requested: + break + + # Get next design from algorithm (only for variable inputs) + next_design_vars = algo_instance.get_next_design( + all_input_vars, + all_output_values + ) + + # Merge fixed values with algorithm-generated design + current_design = [] + for design_point in next_design_vars: + # Combine variable values (from algorithm) with fixed values + full_point = {**design_point, **fixed_input_vars} + current_design.append(full_point) + + # Get final analysis results + result = get_analysis( + algo_instance, all_input_vars, all_output_values, + output_expression, algorithm, iteration, results_dir + ) + + return result + + finally: + # Restore signal handler + _restore_signal_handler() + + # Always restore the original working directory + os.chdir(working_dir) + + if _interrupt_requested: + log_warning("⚠️ Execution was interrupted. Partial results may be available.") diff --git a/fz/helpers.py b/fz/helpers.py index 3e4f5b1..0a16a54 100644 --- a/fz/helpers.py +++ b/fz/helpers.py @@ -12,11 +12,120 @@ from contextlib import contextmanager from concurrent.futures import ThreadPoolExecutor, as_completed -from .logging import log_debug, log_info, log_warning, log_error, log_progress, get_log_level, LogLevel +# Optional pandas import for DataFrame support +try: + import pandas as pd + HAS_PANDAS = True +except ImportError: + HAS_PANDAS = False + +from .logging import log_debug, log_info, log_warning, log_error, log_progress from .config import get_config from .spinner import CaseSpinner, CaseStatus +def _get_windows_short_path(path: str) -> str: + r""" + Convert a Windows path with spaces to its short (8.3) name format. 
+ + This is necessary because Python's subprocess module on Windows doesn't + properly handle spaces in the executable parameter when using shell=True. + + Args: + path: Windows file path + + Returns: + Short format path (e.g., C:\PROGRA~1\...) or original path if conversion fails + """ + if not path or ' ' not in path: + return path + + try: + import ctypes + from ctypes import wintypes + + GetShortPathName = ctypes.windll.kernel32.GetShortPathNameW + GetShortPathName.argtypes = [wintypes.LPCWSTR, wintypes.LPWSTR, wintypes.DWORD] + GetShortPathName.restype = wintypes.DWORD + + buffer = ctypes.create_unicode_buffer(260) + GetShortPathName(path, buffer, 260) + short_path = buffer.value + + if short_path: + log_debug(f"Converted path with spaces: {path} -> {short_path}") + return short_path + except Exception as e: + log_debug(f"Failed to get short path for {path}: {e}") + + return path + + +def get_windows_bash_executable() -> Optional[str]: + """ + Get the bash executable path on Windows. + + This function determines the appropriate bash executable to use on Windows + by checking both the system PATH and common installation locations. + + Priority order: + 1. Bash in system/user PATH (from MSYS2, Git Bash, WSL, Cygwin, etc.) + 2. MSYS2 bash at C:\\msys64\\usr\\bin\\bash.exe (preferred) + 3. Git for Windows bash + 4. Cygwin bash + 5. WSL bash + 6. win-bash + + Returns: + Optional[str]: Path to bash executable if found on Windows, None otherwise. + Returns None if not on Windows or if bash is not found. 
+ """ + if platform.system() != "Windows": + return None + + # Try system/user PATH first + bash_in_path = shutil.which("bash") + if bash_in_path: + log_debug(f"Using bash from PATH: {bash_in_path}") + # Convert to short name if path contains spaces + return _get_windows_short_path(bash_in_path) + + # Check common bash installation paths, prioritizing MSYS2 + # Include both short names (8.3) and long names to handle various Git installations + bash_paths = [ + # MSYS2 bash (preferred - provides complete Unix environment) + r"C:\msys64\usr\bin\bash.exe", + # Git for Windows with short names (always works) + r"C:\Progra~1\Git\bin\bash.exe", + r"C:\Progra~2\Git\bin\bash.exe", + # Git for Windows with long names (may have spaces issue, will be converted) + r"C:\Program Files\Git\bin\bash.exe", + r"C:\Program Files (x86)\Git\bin\bash.exe", + # Also check usr/bin for newer Git for Windows + r"C:\Program Files\Git\usr\bin\bash.exe", + r"C:\Program Files (x86)\Git\usr\bin\bash.exe", + # Cygwin bash (alternative Unix environment) + r"C:\cygwin64\bin\bash.exe", + r"C:\cygwin\bin\bash.exe", + # WSL bash (almost always available on modern Windows) + r"C:\Windows\System32\bash.exe", + # win-bash + r"C:\win-bash\bin\bash.exe", + ] + + for bash_path in bash_paths: + if os.path.exists(bash_path): + log_debug(f"Using bash at: {bash_path}") + # Convert to short name if path contains spaces + return _get_windows_short_path(bash_path) + + # No bash found + log_warning( + "Bash not found on Windows. Commands may fail if they use bash-specific syntax." 
+ ) + return None + + @contextmanager def fz_temporary_directory(session_cwd=None): """ @@ -40,7 +149,6 @@ def fz_temporary_directory(session_cwd=None): fz_tmp_base.mkdir(parents=True, exist_ok=True) # Create unique temp directory name - pid = os.getpid() unique_id = uuid.uuid4().hex[:12] timestamp = int(time.time()) temp_name = f"fz_temp_{unique_id}_{timestamp}" @@ -119,30 +227,57 @@ def _get_case_directories(var_combo: Dict, case_index: int, temp_path: Path, res return tmp_dir, result_dir, case_name -def generate_variable_combinations(input_variables: Dict) -> List[Dict]: +def generate_variable_combinations(input_variables: Union[Dict, Any]) -> List[Dict]: """ Generate variable combinations from input variables - - Converts input variables dict into a list of variable combinations. - If any value is a list, generates the cartesian product of all variables. - Single values are treated as single-element lists. - + + Supports two input formats: + 1. Dict: Creates Cartesian product (full factorial design) + - If any value is a list, generates the cartesian product of all variables + - Single values are treated as single-element lists + + 2. 
DataFrame: Non-factorial design + - Each row represents one case + - Column names become variable names + - Allows arbitrary combinations of values + Args: - input_variables: Dict of variable values or lists of values - + input_variables: Dict of variable values/lists OR pandas DataFrame + Returns: List of variable combination dicts - - Example: - >>> generate_variable_combinations({"x": [1, 2], "y": 3}) - [{"x": 1, "y": 3}, {"x": 2, "y": 3}] - + + Examples: + Dict (factorial design): + >>> generate_variable_combinations({"x": [1, 2], "y": [3, 4]}) + [{"x": 1, "y": 3}, {"x": 1, "y": 4}, {"x": 2, "y": 3}, {"x": 2, "y": 4}] + >>> generate_variable_combinations({"x": 1, "y": 2}) [{"x": 1, "y": 2}] + + DataFrame (non-factorial design): + >>> df = pd.DataFrame({"x": [1, 2, 3], "y": [10, 10, 20]}) + >>> generate_variable_combinations(df) + [{"x": 1, "y": 10}, {"x": 2, "y": 10}, {"x": 3, "y": 20}] """ + # Check if input is a pandas DataFrame (pd is only defined when pandas imported successfully) + if HAS_PANDAS and isinstance(input_variables, pd.DataFrame): + # Each row is one case (non-factorial design) + var_combinations = [] + for _, row in input_variables.iterrows(): + var_combinations.append(row.to_dict()) + + log_info(f"📊 DataFrame input detected: {len(var_combinations)} cases (non-factorial design)") + return var_combinations + + # Original dict behavior (factorial design) + if not isinstance(input_variables, dict): + # If not dict and not DataFrame, raise error + raise TypeError(f"input_variables must be a dict or pandas DataFrame, got {type(input_variables)}") + var_names = list(input_variables.keys()) has_lists = any(isinstance(v, list) for v in input_variables.values()) - + if has_lists: list_values = [] for var in var_names: @@ -151,13 +286,13 @@ def generate_variable_combinations(input_variables: Dict) -> List[Dict]: list_values.append(val) else: list_values.append([val]) - + var_combinations = [ dict(zip(var_names, combo)) for combo in itertools.product(*list_values) ] else: var_combinations = [input_variables] - + return 
var_combinations @@ -328,14 +463,14 @@ def _resolve_model(model: Union[str, Dict]) -> Dict: def get_calculator_manager(): """ Get or create the global calculator manager instance - + Returns: CalculatorManager instance """ global _calculator_manager if _calculator_manager is None: - from .core import CalculatorManager - _calculator_manager = CalculatorManager() + from .runners import _calculator_manager as calc_mgr + _calculator_manager = calc_mgr return _calculator_manager @@ -498,7 +633,6 @@ def try_calculators_with_retry(non_cache_calculator_ids: List[str], case_index: "calculator_uri": "multiple_failed" } - used_calculator_uri = final_error.get("calculator_uri", "unknown") log_error(f"❌ [Thread {thread_id}] Case {case_index}: All {len(attempted_calculator_ids)} calculator attempts failed") # Convert calculator IDs back to URIs for logging attempted_uris = [calc_mgr.get_original_uri(calc_id) for calc_id in attempted_calculator_ids] @@ -520,7 +654,6 @@ def run_single_case(case_info: Dict) -> Dict[str, Any]: Returns: Dict with case results """ - from .runners import select_calculator_for_case, run_single_case_calculation from .io import resolve_cache_paths, find_cache_match from .core import fzo @@ -613,12 +746,19 @@ def run_single_case(case_info: Dict) -> Dict[str, Any]: # Validate that cached outputs don't contain None values try: cached_output = fzo(result_dir, model) - output_keys = list(model.get("output", {}).keys()) + + # Get all output columns (including flattened dict columns) + # We use all keys from cached_output to capture flattened dict columns + all_output_keys = list(cached_output.keys()) if hasattr(cached_output, 'keys') else cached_output.columns.tolist() + + # Filter out metadata columns + metadata_cols = ['path'] + output_columns = [k for k in all_output_keys if k not in metadata_cols] # Check if any expected output is None # Extract scalar values properly from DataFrame/dict returned by fzo none_keys = [] - for key in output_keys: + for key in 
output_columns: value = cached_output.get(key) # Extract scalar from pandas Series or list if hasattr(value, 'iloc'): @@ -822,7 +962,17 @@ def run_single_case(case_info: Dict) -> Dict[str, Any]: result_output = fzo(result_dir, model) log_debug(f"🔄 [Thread {thread_id}] {case_name}: Parsed output: {list(result_output.keys())}") - for key in output_keys: + + # Extract all columns from fzo result (includes flattened dict columns) + # We use all keys from result_output instead of just output_keys to capture + # flattened dict columns (e.g., if "stats" was a dict, we now have "min", "max", etc.) + all_output_keys = list(result_output.keys()) if hasattr(result_output, 'keys') else result_output.columns.tolist() + + # Filter out metadata columns (path, etc.) to only get output values + metadata_cols = ['path'] + output_columns = [k for k in all_output_keys if k not in metadata_cols] + + for key in output_columns: value = result_output.get(key) # Extract scalar from pandas Series if applicable if hasattr(value, 'iloc'): @@ -1087,7 +1237,6 @@ def run_cases_parallel(var_combinations: List[Dict], temp_path: Path, resultsdir # Progress tracking for multiple cases (only if spinner is disabled) if len(var_combinations) > 1 and not spinner.enabled: completed_count = i + 1 - case_elapsed = time.time() - case_start_time total_elapsed = time.time() - start_time # Estimate remaining time based on average time per case @@ -1277,7 +1426,11 @@ def compile_to_result_directories(input_path: str, model: Dict, input_variables: input_path = Path(input_path) # Determine if input_variables is non-empty - has_input_variables = bool(input_variables) + # Handle both dict and DataFrame input types + if isinstance(input_variables, pd.DataFrame): + has_input_variables = not input_variables.empty + else: + has_input_variables = bool(input_variables) # Ensure main results directory exists resultsdir.mkdir(parents=True, exist_ok=True) diff --git a/fz/installer.py b/fz/installer.py index 6425fc5..dec7d27 
100644 --- a/fz/installer.py +++ b/fz/installer.py @@ -137,7 +137,6 @@ def extract_model_files(zip_path: Path, extract_dir: Path) -> Dict[str, Path]: # Second try: look for .fz/models/*.json if not model_json_paths: - fz_models_dir = extract_dir / '*' / '.fz' / 'models' model_json_paths = list(extract_dir.glob('*/.fz/models/*.json')) log_debug(f"Looking in .fz/models/: found {len(model_json_paths)} files") @@ -362,3 +361,247 @@ def list_installed_models(global_list: bool = False) -> Dict[str, Dict]: log_warning(f"Failed to load model {model_file}: {e}") return models + + +# ============================================================================ +# Algorithm Installation Functions +# ============================================================================ + + +def extract_algorithm_files(zip_path: Path, extract_dir: Path) -> Dict[str, Path]: + """ + Extract algorithm files from a zip archive + + The expected structure of an algorithm zip is: + - fz-algorithm-name-main/ + - algorithm.py or algorithm.R (algorithm implementation) + - README.md (optional) + - examples/ (optional) + + Args: + zip_path: Path to the zip file + extract_dir: Directory to extract to + + Returns: + Dict with 'algorithm_files' key pointing to list of algorithm files (.py or .R), + and 'algorithm_name' key with the algorithm name + + Raises: + Exception: If extraction fails or no algorithm files found + """ + log_info(f"Extracting: {zip_path}") + + try: + with zipfile.ZipFile(zip_path, 'r') as zip_ref: + zip_ref.extractall(extract_dir) + log_debug(f"Extracted files: {zip_ref.namelist()[:10]}") # Show first 10 files + except Exception as e: + raise Exception(f"Failed to extract {zip_path}: {e}") + + # Find algorithm files (.py or .R) + # Look in two places: + # 1. Algorithm files in the root (simple case) + # 2. 
.fz/algorithms/*.py or *.R (fz repository structure) + + log_debug(f"Searching for algorithm files in: {extract_dir}") + + # First try: look for .py/.R files anywhere under the extraction directory (rglob is recursive) + py_files = list(extract_dir.rglob('*.py')) + r_files = list(extract_dir.rglob('*.R')) + + # Filter out test files, setup files, etc. + def is_algorithm_file(path: Path) -> bool: + """Check if file is likely an algorithm implementation""" + name_lower = path.name.lower() + # Exclude common non-algorithm files + exclude_patterns = ['setup.py', 'test_', '_test.', 'conftest.py', '__init__.py'] + return not any(pattern in name_lower for pattern in exclude_patterns) + + py_files = [f for f in py_files if is_algorithm_file(f)] + r_files = [f for f in r_files if is_algorithm_file(f)] + + # Second try: specifically look for .fz/algorithms/*.py or *.R + fz_algo_py = list(extract_dir.glob('*/.fz/algorithms/*.py')) + fz_algo_r = list(extract_dir.glob('*/.fz/algorithms/*.R')) + + # Combine and prioritize .fz/algorithms/ files if they exist + if fz_algo_py or fz_algo_r: + algorithm_files = fz_algo_py + fz_algo_r + log_debug(f"Found {len(algorithm_files)} algorithm files in .fz/algorithms/") + else: + algorithm_files = py_files + r_files + log_debug(f"Found {len(algorithm_files)} algorithm files in root") + + if not algorithm_files: + # List what we did find to help debugging + all_files = list(extract_dir.rglob('*')) + log_debug(f"Files found in extraction: {[str(f.relative_to(extract_dir)) for f in all_files[:20]]}") + raise Exception(f"No algorithm files (.py or .R) found in extracted archive.
Extracted to: {extract_dir}") + + # Extract algorithm name from first file + # For fz-algorithm repositories, the file is typically named after the algorithm + algorithm_file = algorithm_files[0] + algorithm_name = algorithm_file.stem + log_info(f"Found algorithm file: {algorithm_file}") + + return { + 'algorithm_files': algorithm_files, + 'algorithm_name': algorithm_name, + 'extract_dir': algorithm_file.parent + } + + +def install_algorithm(source: str, global_install: bool = False) -> Dict[str, str]: + """ + Install an algorithm from a source (GitHub name, URL, or local zip file) + + Args: + source: Algorithm source to install from + - GitHub name: "montecarlo" → "https://github.com/Funz/fz-montecarlo" + - Full URL: "https://github.com/user/fz-myalgo" + - Local zip: "fz-myalgo.zip" + global_install: If True, install to ~/.fz/algorithms/, else to ./.fz/algorithms/ + + Returns: + Dict with 'algorithm_name' and 'install_path' keys + + Raises: + Exception: If installation fails + """ + # Determine installation directory + if global_install: + install_base = Path.home() / '.fz' / 'algorithms' + else: + install_base = Path.cwd() / '.fz' / 'algorithms' + + install_base.mkdir(parents=True, exist_ok=True) + + # Create a temporary directory for download and extraction + with tempfile.TemporaryDirectory() as temp_dir: + temp_path = Path(temp_dir) + + try: + # Download the algorithm (reuse download_model function) + zip_path = download_model(source, temp_path) + + # Extract the algorithm files + extract_path = temp_path / 'extract' + extract_path.mkdir(exist_ok=True) + algo_info = extract_algorithm_files(zip_path, extract_path) + + algorithm_name = algo_info['algorithm_name'] + algorithm_files = algo_info['algorithm_files'] + + # Install the algorithm file(s) + installed_files = [] + for algo_file in algorithm_files: + # Use the original filename for installation + dest_file = install_base / algo_file.name + shutil.copy2(algo_file, dest_file) + 
installed_files.append(str(dest_file)) + log_info(f"Installed algorithm '{algo_file.name}' to: {dest_file}") + + return { + 'algorithm_name': algorithm_name, + 'install_path': str(installed_files[0]), + 'all_files': installed_files + } + + except Exception as e: + log_error(f"Algorithm installation failed: {e}") + raise + + +def uninstall_algorithm(algorithm_name: str, global_uninstall: bool = False) -> bool: + """ + Uninstall an algorithm + + Args: + algorithm_name: Name of the algorithm to uninstall (without extension) + global_uninstall: If True, uninstall from ~/.fz/algorithms/, else from ./.fz/algorithms/ + + Returns: + True if successful, False otherwise + """ + if global_uninstall: + install_base = Path.home() / '.fz' / 'algorithms' + else: + install_base = Path.cwd() / '.fz' / 'algorithms' + + # Try both .py and .R extensions + removed_any = False + for ext in ['.py', '.R']: + algo_path = install_base / f"{algorithm_name}{ext}" + if algo_path.exists(): + try: + algo_path.unlink() + log_info(f"Uninstalled algorithm '{algorithm_name}{ext}'") + removed_any = True + except Exception as e: + log_error(f"Failed to uninstall algorithm '{algorithm_name}{ext}': {e}") + return False + + if not removed_any: + log_warning(f"Algorithm '{algorithm_name}' not found at: {install_base}") + return False + + return True + + +def list_installed_algorithms(global_list: bool = False) -> Dict[str, Dict]: + """ + List installed algorithms + + Args: + global_list: If True, list only algorithms in ~/.fz/algorithms/ + If False, list from both ./.fz/algorithms/ and ~/.fz/algorithms/ (local takes precedence), marking each with 'global' property + + Returns: + Dict mapping algorithm names to their info (with 'global' property added) + """ + algorithms = {} + + if global_list: + # Only list global algorithms + install_base = Path.home() / '.fz' / 'algorithms' + if install_base.exists(): + for algo_file in install_base.glob('*'): + if algo_file.suffix in ['.py', '.R'] and algo_file.is_file(): + algo_name = algo_file.stem +
algorithms[algo_name] = { + 'name': algo_name, + 'file': str(algo_file), + 'type': 'Python' if algo_file.suffix == '.py' else 'R', + 'global': True + } + else: + # List from both local and global, marking each + # First, check local algorithms + local_base = Path.cwd() / '.fz' / 'algorithms' + if local_base.exists(): + for algo_file in local_base.glob('*'): + if algo_file.suffix in ['.py', '.R'] and algo_file.is_file(): + algo_name = algo_file.stem + algorithms[algo_name] = { + 'name': algo_name, + 'file': str(algo_file), + 'type': 'Python' if algo_file.suffix == '.py' else 'R', + 'global': False + } + + # Then check global algorithms (but don't override local ones) + global_base = Path.home() / '.fz' / 'algorithms' + if global_base.exists(): + for algo_file in global_base.glob('*'): + if algo_file.suffix in ['.py', '.R'] and algo_file.is_file(): + algo_name = algo_file.stem + # Only add if not already present (local takes precedence) + if algo_name not in algorithms: + algorithms[algo_name] = { + 'name': algo_name, + 'file': str(algo_file), + 'type': 'Python' if algo_file.suffix == '.py' else 'R', + 'global': True + } + + return algorithms diff --git a/fz/interpreter.py b/fz/interpreter.py index 210f9a3..2a29c45 100644 --- a/fz/interpreter.py +++ b/fz/interpreter.py @@ -190,6 +190,9 @@ def evaluate_formulas(content: str, model: Dict, input_variables: Dict, interpre # Extract the code part and preserve any indentation from original code_part = stripped[len(commentline + formulaprefix):] context_lines.append(code_part) + # If delimiters are empty, skip formula evaluation (no formulas possible) + if len(delim) == 0: + return content # If delimiters are empty, skip formula evaluation (no formulas possible) if len(delim) == 0: @@ -393,4 +396,4 @@ def cast_output(value: str) -> Any: pass # Return as string - return value \ No newline at end of file + return value diff --git a/fz/io.py b/fz/io.py index 8ca31cb..035dcff 100644 --- a/fz/io.py +++ b/fz/io.py @@ -7,11 +7,16 
@@ import json import hashlib from pathlib import Path -from typing import Dict, List, Optional +from typing import Dict, List, Optional, Any, TYPE_CHECKING -from .logging import log_info +from .logging import log_info, log_warning from datetime import datetime +if TYPE_CHECKING: + import pandas + +import pandas as pd + def ensure_unique_directory(directory_path: Path) -> tuple[Path, Optional[Path]]: """ @@ -220,7 +225,7 @@ def find_cache_match(cache_base_path: Path, current_hash_file: Path) -> Optional print(f"Could not read cache hash file {cache_hash_file}: {e}") continue - print(f"No cache match found in {cache_base_path} or its subdirectories") + log_info(f"No cache match found in {cache_base_path} or its subdirectories") return None @@ -236,4 +241,361 @@ def load_aliases(name: str, alias_type: str = "models") -> Optional[Dict]: return json.load(f) except (json.JSONDecodeError, IOError): continue - return None \ No newline at end of file + return None + + +def detect_content_type(text: str) -> str: + """ + Detect the type of content in a text string. + + Returns: 'html', 'json', 'keyvalue', 'markdown', or 'plain' + """ + if not text or not isinstance(text, str): + return 'plain' + + text_stripped = text.strip() + + # Check for HTML tags + if re.search(r'<(html|div|p|h1|h2|h3|img|table|body|head)', text_stripped, re.IGNORECASE): + return 'html' + + # Check for JSON (starts with { or [) + if text_stripped.startswith(('{', '[')): + try: + json.loads(text_stripped) + return 'json' + except (json.JSONDecodeError, ValueError): + pass + + # Check for markdown (has markdown syntax like #, ##, *, -, ```, etc.) + markdown_patterns = [ + r'^#{1,6}\s+.+$', # Headers + r'^\*\*.+\*\*$', # Bold + r'^_.+_$', # Italic + r'^\[.+\]\(.+\)$', # Links + r'^```', # Code blocks + r'^\* .+$', # Unordered lists + r'^\d+\. 
.+$', # Ordered lists + ] + for pattern in markdown_patterns: + if re.search(pattern, text_stripped, re.MULTILINE): + return 'markdown' + + # Check for key=value format (at least 2 lines with = signs) + lines = text_stripped.split('\n') + kv_lines = [l for l in lines if '=' in l and not l.strip().startswith('#')] + if len(kv_lines) >= 2: + # Verify they look like key=value pairs + if all(len(l.split('=', 1)) == 2 for l in kv_lines[:3]): + return 'keyvalue' + + return 'plain' + + +def parse_keyvalue_text(text: str) -> Dict[str, str]: + """Parse key=value text into a dictionary.""" + result = {} + for line in text.strip().split('\n'): + line = line.strip() + if not line or line.startswith('#'): + continue + if '=' in line: + key, value = line.split('=', 1) + result[key.strip()] = value.strip() + return result + + +def process_analysis_content( + analysis_dict: Dict[str, Any], + iteration: int, + results_dir: Path +) -> Dict[str, Any]: + """ + Process get_analysis() output, detecting content types and saving to files. 
+ + Args: + analysis_dict: The dict returned by get_analysis() + iteration: Current iteration number + results_dir: Directory to save files + + Returns: + Processed dict with file references instead of raw content + """ + processed = {'data': analysis_dict.get('data', {})} + + # Process 'html' field if present + if 'html' in analysis_dict: + html_content = analysis_dict['html'] + html_file = results_dir / f"analysis_{iteration}.html" + with open(html_file, 'w') as f: + f.write(html_content) + processed['html_file'] = str(html_file.name) + log_info(f" 💾 Saved HTML to {html_file.name}") + + # Process 'text' field if present + if 'text' in analysis_dict: + text_content = analysis_dict['text'] + content_type = detect_content_type(text_content) + + if content_type == 'html': + # Save as HTML file + html_file = results_dir / f"analysis_{iteration}.html" + with open(html_file, 'w') as f: + f.write(text_content) + processed['html_file'] = str(html_file.name) + log_info(f" 💾 Detected HTML in text, saved to {html_file.name}") + + elif content_type == 'json': + # Parse JSON and save to file + json_file = results_dir / f"analysis_{iteration}.json" + try: + parsed_json = json.loads(text_content) + with open(json_file, 'w') as f: + json.dump(parsed_json, f, indent=2) + processed['json_data'] = parsed_json + processed['json_file'] = str(json_file.name) + log_info(f" 💾 Detected JSON, parsed and saved to {json_file.name}") + except Exception as e: + log_warning(f"⚠️ Failed to parse JSON: {e}") + processed['text'] = text_content + + elif content_type == 'keyvalue': + # Parse key=value format and save to file + txt_file = results_dir / f"analysis_{iteration}.txt" + with open(txt_file, 'w') as f: + f.write(text_content) + try: + parsed_kv = parse_keyvalue_text(text_content) + processed['keyvalue_data'] = parsed_kv + processed['txt_file'] = str(txt_file.name) + log_info(f" 💾 Detected key=value format, parsed and saved to {txt_file.name}") + except Exception as e: + log_warning(f"⚠️ 
Failed to parse key=value: {e}") + processed['text'] = text_content + + elif content_type == 'markdown': + # Save as markdown file + md_file = results_dir / f"analysis_{iteration}.md" + with open(md_file, 'w') as f: + f.write(text_content) + processed['md_file'] = str(md_file.name) + log_info(f" 💾 Detected markdown, saved to {md_file.name}") + + else: + # Keep as plain text + processed['text'] = text_content + + return processed + + +def flatten_dict_recursive(d: dict, parent_key: str = '', sep: str = '_') -> dict: + """ + Recursively flatten a nested dictionary. + + Args: + d: Dictionary to flatten + parent_key: Parent key prefix for nested keys + sep: Separator to use between nested keys + + Returns: + Flattened dictionary with keys joined by separator + """ + items = [] + for k, v in d.items(): + new_key = f"{parent_key}{sep}{k}" if parent_key else k + if isinstance(v, dict): + # Recursively flatten nested dict + items.extend(flatten_dict_recursive(v, new_key, sep=sep).items()) + else: + items.append((new_key, v)) + return dict(items) + + +def flatten_dict_columns(df: "pandas.DataFrame") -> "pandas.DataFrame": + """ + Recursively flatten dictionary-valued columns into separate columns. + + For each column containing dict values, creates new columns with the dict keys. + Nested dicts are flattened recursively with keys joined by '_'. + For example, {"stats": {"basic": {"min": 1, "max": 4}}} becomes: + - stats_basic_min: 1 + - stats_basic_max: 4 + + The original dict column is removed. 
+ + Args: + df: DataFrame potentially containing dict-valued columns + + Returns: + DataFrame with dict columns recursively flattened + """ + + if df.empty: + return df + + # Keep flattening until no more dict columns remain + max_iterations = 10 # Prevent infinite loops + iteration = 0 + + while iteration < max_iterations: + iteration += 1 + + # Track which columns contain dicts and need to be flattened + dict_columns = [] + + for col in df.columns: + # Check if this column contains dict values + # Sample first non-None value to check type + sample_value = None + for val in df[col]: + if val is not None: + sample_value = val + break + + if isinstance(sample_value, dict): + dict_columns.append(col) + + if not dict_columns: + break # No more dict columns to flatten + + # Flatten each dict column + new_columns = {} + + for col in dict_columns: + # Process each row in this column + for row_idx, val in enumerate(df[col]): + if isinstance(val, dict): + # Recursively flatten this dict + flattened = flatten_dict_recursive(val, parent_key=col, sep='_') + + # Add flattened keys to new_columns + for flat_key, flat_val in flattened.items(): + if flat_key not in new_columns: + # Initialize column with None for all rows + new_columns[flat_key] = [None] * len(df) + new_columns[flat_key][row_idx] = flat_val + + # Create new DataFrame with original columns plus flattened dict columns + df = df.copy() + + # Add new columns + for col_name, values in new_columns.items(): + df[col_name] = values + + # Drop original dict columns + df = df.drop(columns=dict_columns) + + return df + + +def get_and_process_analysis( + algo_instance, + all_input_vars: List[Dict[str, float]], + all_output_values: List[float], + iteration: int, + results_dir: Path, + method_name: str = 'get_analysis' +) -> Optional[Dict[str, Any]]: + """ + Helper to call algorithm's analysis method and process the results. 
+ + Args: + algo_instance: Algorithm instance + all_input_vars: All evaluated input combinations + all_output_values: All corresponding output values + iteration: Current iteration number + results_dir: Directory to save processed results + method_name: Name of the display method ('get_analysis' or 'get_analysis_tmp') + + Returns: + Processed analysis dict or None if method doesn't exist or fails + """ + if not hasattr(algo_instance, method_name): + return None + + try: + analysis_method = getattr(algo_instance, method_name) + analysis_dict = analysis_method(all_input_vars, all_output_values) + + if analysis_dict: + # Process and save content intelligently + processed = process_analysis_content(analysis_dict, iteration, results_dir) + # Also keep the original text/html for backward compatibility + processed['_raw'] = analysis_dict + return processed + return None + + except Exception as e: + log_warning(f"⚠️ {method_name} failed: {e}") + return None + + +def get_analysis( + algo_instance, + all_input_vars: List[Dict[str, float]], + all_output_values: List[float], + output_expression: str, + algorithm: str, + iteration: int, + results_dir: Path +) -> Dict[str, Any]: + """ + Create final analysis results with analysis information and DataFrame. 
+ + Args: + algo_instance: Algorithm instance + all_input_vars: All evaluated input combinations + all_output_values: All corresponding output values + output_expression: Expression for output column name + algorithm: Algorithm path/name + iteration: Final iteration number + results_dir: Directory for saving results + + Returns: + Dict with analysis results including XY DataFrame and analysis info + """ + # Display final results + log_info("\n" + "="*60) + log_info("📈 Final Results") + log_info("="*60) + + # Get and process final analysis results + processed_final_analysis = get_and_process_analysis( + algo_instance, all_input_vars, all_output_values, + iteration, results_dir, 'get_analysis' + ) + + if processed_final_analysis and '_raw' in processed_final_analysis: + if 'text' in processed_final_analysis['_raw']: + log_info(processed_final_analysis['_raw']['text']) + # Remove _raw from returned dict - it's only for internal use + del processed_final_analysis['_raw'] + + # If processed_final_analysis is None, create empty dict for backward compatibility + if processed_final_analysis is None: + processed_final_analysis = {} + + # Create DataFrame with all input and output values + df_data = [] + for inp_dict, out_val in zip(all_input_vars, all_output_values): + row = inp_dict.copy() + row[output_expression] = out_val # Use output_expression as column name + df_data.append(row) + + data_df = pd.DataFrame(df_data) + + # Prepare return value + result = { + 'XY': data_df, # DataFrame with all X and Y values + 'analysis': processed_final_analysis, # Use processed analysis instead of raw + 'algorithm': algorithm, + 'iterations': iteration, + 'total_evaluations': len(all_input_vars), + } + + # Add summary + valid_count = sum(1 for v in all_output_values if v is not None) + summary = f"{algorithm} completed: {iteration} iterations, {len(all_input_vars)} evaluations ({valid_count} valid)" + result['summary'] = summary + + return result \ No newline at end of file diff --git 
a/fz/runners.py b/fz/runners.py index ccb9069..7a733aa 100644 --- a/fz/runners.py +++ b/fz/runners.py @@ -5,17 +5,13 @@ import os import subprocess import time -import re -import tarfile -import tempfile import hashlib import base64 -import threading -import queue import socket import platform -import shutil import uuid +import threading +from collections import defaultdict from .logging import log_error, log_warning, log_info, log_debug from .config import get_config @@ -178,6 +174,173 @@ def get_host_key_policy(password_provided: bool = False, auto_accept: bool = Fal return paramiko.AutoAddPolicy() +class CalculatorManager: + """Thread-safe calculator management for parallel execution""" + + def __init__(self): + self._lock = threading.Lock() + self._calculator_locks = defaultdict(threading.Lock) + self._calculator_owners = {} # calculator_id -> thread_id mapping + self._calculator_registry = {} # calculator_id -> original_uri mapping + + def register_calculator_instances(self, calculator_uris: List[str]) -> List[str]: + """ + Register calculator instances with unique IDs for each occurrence + + Args: + calculator_uris: List of calculator URIs (may contain duplicates) + + Returns: + List of unique calculator IDs + """ + calculator_ids = [] + with self._lock: + for uri in calculator_uris: + # Generate unique alphanumeric ID for tmux compatibility + unique_id = uuid.uuid4().hex[:8] + calc_id = f"{uri}#{unique_id}" + self._calculator_registry[calc_id] = uri + calculator_ids.append(calc_id) + return calculator_ids + + def get_original_uri(self, calculator_id: str) -> str: + """Get the original URI for a calculator ID""" + return self._calculator_registry.get(calculator_id, calculator_id) + + def acquire_calculator(self, calculator_id: str, thread_id: int) -> bool: + """ + Try to acquire exclusive access to a calculator + + Args: + calculator_id: Calculator ID to acquire + thread_id: Thread ID requesting the calculator + + Returns: + True if calculator was acquired, 
False if already in use + """ + calc_lock = self._calculator_locks[calculator_id] + + # Try to acquire the calculator lock (non-blocking) + acquired = calc_lock.acquire(blocking=False) + + if acquired: + with self._lock: + self._calculator_owners[calculator_id] = thread_id + original_uri = self.get_original_uri(calculator_id) + log_debug( + f"🔒 [Thread {thread_id}] Acquired calculator: {original_uri} (ID: {calculator_id})" + ) + return True + else: + current_owner = self._calculator_owners.get(calculator_id, "unknown") + original_uri = self.get_original_uri(calculator_id) + log_debug( + f"⏳ [Thread {thread_id}] Calculator {original_uri} (ID: {calculator_id}) is busy (owned by thread {current_owner})" + ) + return False + + def release_calculator(self, calculator_id: str, thread_id: int): + """ + Release exclusive access to a calculator + + Args: + calculator_id: Calculator ID to release + thread_id: Thread ID releasing the calculator + """ + try: + with self._lock: + if calculator_id in self._calculator_owners: + del self._calculator_owners[calculator_id] + + calc_lock = self._calculator_locks[calculator_id] + calc_lock.release() + original_uri = self.get_original_uri(calculator_id) + log_debug( + f"🔓 [Thread {thread_id}] Released calculator: {original_uri} (ID: {calculator_id})" + ) + except Exception as e: + original_uri = self.get_original_uri(calculator_id) + log_warning( + f"⚠️ [Thread {thread_id}] Error releasing calculator {original_uri} (ID: {calculator_id}): {e}" + ) + + def get_available_calculator( + self, calculator_ids: List[str], thread_id: int, case_index: int + ) -> Optional[str]: + """ + Get an available calculator from the list, preferring round-robin distribution + + Args: + calculator_ids: List of calculator IDs to choose from + thread_id: Thread ID requesting a calculator + case_index: Case index for round-robin distribution + + Returns: + Available calculator ID or None if all are busy + """ + if not calculator_ids: + return None + + # Try 
round-robin selection first + preferred_index = case_index % len(calculator_ids) + preferred_calc = calculator_ids[preferred_index] + + if self.acquire_calculator(preferred_calc, thread_id): + return preferred_calc + + # If preferred calculator is busy, try others + for calc in calculator_ids: + if calc != preferred_calc and self.acquire_calculator(calc, thread_id): + return calc + + # All calculators are busy + return None + + def cleanup_all_calculators(self): + """ + Release all calculator locks and clear internal state + + This should be called when fzr execution is complete to ensure + proper cleanup of resources. + """ + with self._lock: + # Force release all calculator locks + for calc_id, calc_lock in self._calculator_locks.items(): + try: + # Try to release the lock (may fail if not held) + if calc_id in self._calculator_owners: + thread_id = self._calculator_owners[calc_id] + log_debug( + f"🧹 Cleanup: Force-releasing calculator {calc_id} from thread {thread_id}" + ) + calc_lock.release() + except Exception as e: + # Lock might not be held, which is fine + pass + + # Clear all state + self._calculator_locks.clear() + self._calculator_owners.clear() + self._calculator_registry.clear() + self._next_id = 1 + + log_debug("🧹 CalculatorManager cleanup completed") + + def get_active_calculators(self) -> Dict[str, int]: + """ + Get currently active calculators and their owners + + Returns: + Dict mapping calculator ID to thread ID for active calculators + """ + with self._lock: + return dict(self._calculator_owners) + + +# Global instance +_calculator_manager = CalculatorManager() + + def validate_ssh_connection_security( host: str, username: str, password: Optional[str] ) -> Dict[str, Any]: @@ -867,7 +1030,7 @@ def run_local_calculation( # Construct command - resolve ALL file paths to absolute for reliable parallel execution if command: resolved_command, was_changed = resolve_all_paths_in_command( - command, original_cwd + command.replace("\\","/"), original_cwd 
) # Apply shell path resolution to command if FZ_SHELL_PATH is set diff --git a/fz/shell.py b/fz/shell.py index 0a8851c..ca08832 100644 --- a/fz/shell.py +++ b/fz/shell.py @@ -407,7 +407,6 @@ def replace_commands_in_string(self, command_string: str) -> str: pattern = r'\b' + re.escape(cmd) + r'\b' # Use a lambda function to properly handle backslashes in the replacement modified = re.sub(pattern, lambda m: resolved_path, modified) - #log_debug(f"Replaced '{cmd}' with '{resolved_path}' in command string") return modified diff --git a/fz/spinner.py b/fz/spinner.py index 26dbdea..9db3823 100644 --- a/fz/spinner.py +++ b/fz/spinner.py @@ -76,6 +76,13 @@ def stop(self, clear: bool = False): if self.thread: self.thread.join(timeout=1.0) + # Render final status line to show "Total time:" + if not clear: + final_status = self._build_status_line() + sys.stdout.write('\r' + final_status) + sys.stdout.flush() + self.last_output = final_status + if clear and self.last_output: # Clear the line sys.stdout.write('\r' + ' ' * len(self.last_output) + '\r') @@ -153,7 +160,7 @@ def _build_status_line(self) -> str: completed = sum(1 for s in self.statuses if s in (CaseStatus.DONE, CaseStatus.FAILED)) remaining = self.num_cases - completed - # Calculate ETA + # Calculate ETA or Total time if remaining > 0 and self.case_durations: # Use average duration of completed cases avg_duration = sum(self.case_durations) / len(self.case_durations) @@ -163,8 +170,12 @@ def _build_status_line(self) -> str: # No completed cases yet, show calculating eta_text = "ETA: ..." 
else: - # All cases completed - eta_text = "Done" + # All cases completed - show total time + if self.start_time is not None: + total_time = time.time() - self.start_time + eta_text = f"Total time: {self._format_eta(total_time)}" + else: + eta_text = "Done" # Build final line status_line = f"[{''.join(chars)}] {eta_text}" @@ -220,6 +231,7 @@ def __enter__(self): def __exit__(self, exc_type, exc_val, exc_tb): """Context manager exit""" if self.enabled: - self.stop(clear=True) + # Stop but don't clear - keep the final status visible + self.stop(clear=False) # Print final newline to move to next line print() diff --git a/pyproject.toml b/pyproject.toml index a42a89d..561c949 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -29,6 +29,7 @@ classifiers = [ requires-python = ">=3.8" dependencies = [ "paramiko>=2.7.0", + "pandas>=1.0.0", ] [project.optional-dependencies] diff --git a/tests/pytest.ini b/pytest.ini similarity index 88% rename from tests/pytest.ini rename to pytest.ini index ef0781f..fe7b1a2 100644 --- a/tests/pytest.ini +++ b/pytest.ini @@ -1,4 +1,4 @@ -[tool:pytest] +[pytest] testpaths = tests python_files = test_*.py python_classes = Test* @@ -8,6 +8,7 @@ python_functions = test_* markers = slow: marks tests as slow (may require external tools) integration: marks tests as integration tests + manual: marks tests that require manual interaction (skipped by default) requires_docker: marks tests that require Docker requires_omc: marks tests that require OpenModelica requires_ssh: marks tests that require SSH server on localhost diff --git a/setup.py b/setup.py index e735b80..a97c59d 100644 --- a/setup.py +++ b/setup.py @@ -16,6 +16,7 @@ python_requires=">=3.8", install_requires=[ "paramiko>=2.7.0", + "pandas>=1.0.0", ], extras_require={ "dev": [ @@ -24,9 +25,6 @@ "black", "flake8", ], - "pandas": [ - "pandas>=1.0.0", - ], "r": [ "rpy2>=3.4.0", ], @@ -38,6 +36,7 @@ "fzc=fz.cli:fzc_main", "fzo=fz.cli:fzo_main", "fzr=fz.cli:fzr_main", + "fzd=fz.cli:fzd_main", 
], }, classifiers=[ diff --git a/tests/test_algorithm_installation.py b/tests/test_algorithm_installation.py new file mode 100644 index 0000000..b4c28d9 --- /dev/null +++ b/tests/test_algorithm_installation.py @@ -0,0 +1,593 @@ +""" +Test algorithm installation functionality + +Tests installation, uninstallation, and listing of algorithms +from GitHub repositories, URLs, and local zip files. +""" +import json +import os +import platform +import shutil +import tempfile +import zipfile +from pathlib import Path + +import pytest + + +# Determine if we're running on Windows +IS_WINDOWS = platform.system() == "Windows" + + +@pytest.fixture +def temp_workspace(): + """Create a temporary workspace for tests""" + tmpdir = tempfile.mkdtemp() + yield Path(tmpdir) + shutil.rmtree(tmpdir, ignore_errors=True) + + +@pytest.fixture +def test_algorithm_zip_py(temp_workspace): + """Create a test Python algorithm zip file""" + # Create algorithm structure + algo_dir = temp_workspace / "fz-testalgo-main" + algo_dir.mkdir(exist_ok=True) + + # Create algorithm implementation + algo_code = ''' +class TestAlgo: + """Simple test algorithm""" + def __init__(self, **options): + self.n_samples = options.get("n_samples", 10) + + def get_initial_design(self, input_vars, output_vars): + import random + random.seed(42) + samples = [] + for _ in range(self.n_samples): + sample = {} + for var, (min_val, max_val) in input_vars.items(): + sample[var] = random.uniform(min_val, max_val) + samples.append(sample) + return samples + + def get_next_design(self, X, Y): + return [] # One-shot sampling + + def get_analysis(self, X, Y): + valid_Y = [y for y in Y if y is not None] + mean = sum(valid_Y) / len(valid_Y) if valid_Y else 0 + return {"text": f"Mean: {mean:.2f}", "data": {"mean": mean}} +''' + + algo_file = algo_dir / "testalgo.py" + algo_file.write_text(algo_code) + + # Create zip file + zip_path = temp_workspace / "fz-testalgo.zip" + with zipfile.ZipFile(zip_path, 'w') as zf: + zf.write(algo_file, 
arcname="fz-testalgo-main/testalgo.py") + + return zip_path + + +@pytest.fixture +def test_algorithm_zip_r(temp_workspace): + """Create a test R algorithm zip file""" + # Create algorithm structure + algo_dir = temp_workspace / "fz-testalgor-main" + algo_dir.mkdir(exist_ok=True) + + # Create R algorithm implementation + algo_code = ''' +TestAlgoR <- function(...) { + opts <- list(...) + state <- new.env(parent = emptyenv()) + state$n_samples <- 0 + + obj <- list( + options = list( + n_samples = as.integer(ifelse(is.null(opts$n_samples), 10, opts$n_samples)) + ), + state = state + ) + + class(obj) <- "TestAlgoR" + return(obj) +} + +get_initial_design.TestAlgoR <- function(obj, input_variables, output_variables) { + set.seed(42) + samples <- list() + for (i in 1:obj$options$n_samples) { + sample <- list() + for (var in names(input_variables)) { + bounds <- input_variables[[var]] + sample[[var]] <- runif(1, bounds[1], bounds[2]) + } + samples[[i]] <- sample + } + return(samples) +} + +get_next_design.TestAlgoR <- function(obj, X, Y) { + return(list()) +} + +get_analysis.TestAlgoR <- function(obj, X, Y) { + valid_Y <- Y[!sapply(Y, is.null)] + mean_val <- mean(unlist(valid_Y)) + return(list(text = paste0("Mean: ", round(mean_val, 2)), data = list(mean = mean_val))) +} +''' + + algo_file = algo_dir / "testalgor.R" + algo_file.write_text(algo_code) + + # Create zip file + zip_path = temp_workspace / "fz-testalgor.zip" + with zipfile.ZipFile(zip_path, 'w') as zf: + zf.write(algo_file, arcname="fz-testalgor-main/testalgor.R") + + return zip_path + + +@pytest.fixture +def test_algorithm_zip_fz_structure(temp_workspace): + """Create algorithm zip with .fz/algorithms/ structure""" + # Create algorithm structure matching fz repository structure + algo_dir = temp_workspace / "fz-advanced-main" + fz_dir = algo_dir / ".fz" / "algorithms" + fz_dir.mkdir(parents=True, exist_ok=True) + + # Create algorithm implementation + algo_code = ''' +class AdvancedAlgo: + """Advanced test 
algorithm""" + def __init__(self, **options): + self.batch_size = options.get("batch_size", 5) + + def get_initial_design(self, input_vars, output_vars): + return [{"x": 0.5}] * self.batch_size + + def get_next_design(self, X, Y): + return [] + + def get_analysis(self, X, Y): + return {"text": "Analysis complete", "data": {"count": len(X)}} +''' + + algo_file = fz_dir / "advanced.py" + algo_file.write_text(algo_code) + + # Create zip file + zip_path = temp_workspace / "fz-advanced.zip" + with zipfile.ZipFile(zip_path, 'w') as zf: + zf.write(algo_file, arcname="fz-advanced-main/.fz/algorithms/advanced.py") + + return zip_path + + +@pytest.fixture +def install_workspace(temp_workspace): + """Create a workspace with .fz directories for installation""" + fz_dir = temp_workspace / ".fz" + algo_dir = fz_dir / "algorithms" + algo_dir.mkdir(parents=True, exist_ok=True) + + return temp_workspace + + +class TestAlgorithmInstallation: + """Test algorithm installation functions""" + + def test_install_algorithm_from_local_zip_py(self, test_algorithm_zip_py, install_workspace): + """Test installing Python algorithm from local zip file""" + from fz.installer import install_algorithm + + original_cwd = os.getcwd() + try: + os.chdir(install_workspace) + result = install_algorithm(str(test_algorithm_zip_py), global_install=False) + + assert result['algorithm_name'] == 'testalgo' + assert 'install_path' in result + + # Verify algorithm file was created + algo_file = install_workspace / ".fz" / "algorithms" / "testalgo.py" + assert algo_file.exists() + + # Verify can load the algorithm + from fz.algorithms import load_algorithm + algo = load_algorithm("testalgo", n_samples=5) + assert algo is not None + finally: + os.chdir(original_cwd) + + def test_install_algorithm_from_local_zip_r(self, test_algorithm_zip_r, install_workspace): + """Test installing R algorithm from local zip file""" + from fz.installer import install_algorithm + + original_cwd = os.getcwd() + try: + 
os.chdir(install_workspace) + result = install_algorithm(str(test_algorithm_zip_r), global_install=False) + + assert result['algorithm_name'] == 'testalgor' + assert 'install_path' in result + + # Verify algorithm file was created + algo_file = install_workspace / ".fz" / "algorithms" / "testalgor.R" + assert algo_file.exists() + finally: + os.chdir(original_cwd) + + def test_install_algorithm_fz_structure(self, test_algorithm_zip_fz_structure, install_workspace): + """Test installing algorithm from .fz/algorithms/ structure""" + from fz.installer import install_algorithm + + original_cwd = os.getcwd() + try: + os.chdir(install_workspace) + result = install_algorithm(str(test_algorithm_zip_fz_structure), global_install=False) + + assert result['algorithm_name'] == 'advanced' + + # Verify algorithm file was created + algo_file = install_workspace / ".fz" / "algorithms" / "advanced.py" + assert algo_file.exists() + finally: + os.chdir(original_cwd) + + def test_install_global_algorithm(self, test_algorithm_zip_py): + """Test installing algorithm globally to ~/.fz/algorithms/""" + from fz.installer import install_algorithm, uninstall_algorithm + + try: + result = install_algorithm(str(test_algorithm_zip_py), global_install=True) + + assert result['algorithm_name'] == 'testalgo' + + # Verify in global location + global_algo = Path.home() / ".fz" / "algorithms" / "testalgo.py" + assert global_algo.exists() + + finally: + # Cleanup: remove global installation + uninstall_algorithm('testalgo', global_uninstall=True) + + def test_install_overwrites_existing(self, test_algorithm_zip_py, install_workspace): + """Test that installing same algorithm twice overwrites""" + from fz.installer import install_algorithm + + original_cwd = os.getcwd() + try: + os.chdir(install_workspace) + + # Install first time + result1 = install_algorithm(str(test_algorithm_zip_py), global_install=False) + assert result1['algorithm_name'] == 'testalgo' + + # Install again (should overwrite) + 
result2 = install_algorithm(str(test_algorithm_zip_py), global_install=False) + assert result2['algorithm_name'] == 'testalgo' + + # Should still exist + algo_file = install_workspace / ".fz" / "algorithms" / "testalgo.py" + assert algo_file.exists() + finally: + os.chdir(original_cwd) + + def test_install_invalid_zip(self, temp_workspace, install_workspace): + """Test error handling for invalid zip file""" + from fz.installer import install_algorithm + + # Create an empty zip file + invalid_zip = temp_workspace / "invalid.zip" + with zipfile.ZipFile(invalid_zip, 'w') as zf: + pass # Empty zip + + original_cwd = os.getcwd() + try: + os.chdir(install_workspace) + with pytest.raises(Exception, match="No algorithm files"): + install_algorithm(str(invalid_zip), global_install=False) + finally: + os.chdir(original_cwd) + + +class TestAlgorithmUninstallation: + """Test algorithm uninstallation functions""" + + def test_uninstall_algorithm(self, test_algorithm_zip_py, install_workspace): + """Test uninstalling an algorithm""" + from fz.installer import install_algorithm, uninstall_algorithm + + original_cwd = os.getcwd() + try: + os.chdir(install_workspace) + + # Install algorithm first + install_algorithm(str(test_algorithm_zip_py), global_install=False) + + # Verify it's installed + algo_file = install_workspace / ".fz" / "algorithms" / "testalgo.py" + assert algo_file.exists() + + # Uninstall + success = uninstall_algorithm('testalgo', global_uninstall=False) + assert success is True + + # Verify it's removed + assert not algo_file.exists() + finally: + os.chdir(original_cwd) + + def test_uninstall_nonexistent_algorithm(self, install_workspace): + """Test uninstalling non-existent algorithm returns False""" + from fz.installer import uninstall_algorithm + + original_cwd = os.getcwd() + try: + os.chdir(install_workspace) + success = uninstall_algorithm('nonexistent', global_uninstall=False) + assert success is False + finally: + os.chdir(original_cwd) + + def 
test_uninstall_removes_both_py_and_r(self, test_algorithm_zip_py, test_algorithm_zip_r, install_workspace): + """Test that uninstall removes both .py and .R files if present""" + from fz.installer import install_algorithm, uninstall_algorithm + + original_cwd = os.getcwd() + try: + os.chdir(install_workspace) + + # Install Python algorithm + install_algorithm(str(test_algorithm_zip_py), global_install=False) + + # Manually create an R version with same name + r_algo = install_workspace / ".fz" / "algorithms" / "testalgo.R" + r_algo.write_text("# R version") + + # Both should exist + py_algo = install_workspace / ".fz" / "algorithms" / "testalgo.py" + assert py_algo.exists() + assert r_algo.exists() + + # Uninstall by name (should remove both) + success = uninstall_algorithm('testalgo', global_uninstall=False) + assert success is True + + # Both should be removed + assert not py_algo.exists() + assert not r_algo.exists() + finally: + os.chdir(original_cwd) + + +class TestAlgorithmListing: + """Test algorithm listing functions""" + + def test_list_installed_algorithms_empty(self, install_workspace): + """Test listing when no algorithms are installed""" + from fz.installer import list_installed_algorithms + + original_cwd = os.getcwd() + try: + os.chdir(install_workspace) + algorithms = list_installed_algorithms(global_list=False) + # Should return empty dict or only global algorithms + assert isinstance(algorithms, dict) + finally: + os.chdir(original_cwd) + + def test_list_installed_algorithms(self, test_algorithm_zip_py, test_algorithm_zip_r, install_workspace): + """Test listing installed algorithms""" + from fz.installer import install_algorithm, list_installed_algorithms + + original_cwd = os.getcwd() + try: + os.chdir(install_workspace) + + # Install both Python and R algorithms + install_algorithm(str(test_algorithm_zip_py), global_install=False) + install_algorithm(str(test_algorithm_zip_r), global_install=False) + + # List algorithms + algorithms = 
list_installed_algorithms(global_list=False) + + assert 'testalgo' in algorithms + assert 'testalgor' in algorithms + + # Check algorithm info + testalgo_info = algorithms['testalgo'] + assert testalgo_info['type'] == 'Python' + assert testalgo_info['global'] is False + + testalgor_info = algorithms['testalgor'] + assert testalgor_info['type'] == 'R' + assert testalgor_info['global'] is False + finally: + os.chdir(original_cwd) + + def test_list_shows_global_flag(self, test_algorithm_zip_py, install_workspace): + """Test that list shows correct global flag""" + from fz.installer import install_algorithm, list_installed_algorithms, uninstall_algorithm + + original_cwd = os.getcwd() + try: + os.chdir(install_workspace) + + # Install locally + install_algorithm(str(test_algorithm_zip_py), global_install=False) + + # Install globally with different name (to avoid conflict) + # We'll use the same zip but it will show up with same name + # Just install globally and check the flag + install_algorithm(str(test_algorithm_zip_py), global_install=True) + + # List all + algorithms = list_installed_algorithms(global_list=False) + + # Local should have priority, so testalgo should be marked as local + if 'testalgo' in algorithms: + # The local one takes priority in listing + assert algorithms['testalgo']['global'] is False + + finally: + os.chdir(original_cwd) + # Cleanup global installation + uninstall_algorithm('testalgo', global_uninstall=True) + + def test_list_global_only(self, test_algorithm_zip_py, install_workspace): + """Test listing only global algorithms""" + from fz.installer import install_algorithm, list_installed_algorithms, uninstall_algorithm + + original_cwd = os.getcwd() + try: + os.chdir(install_workspace) + + # Install locally + install_algorithm(str(test_algorithm_zip_py), global_install=False) + + # List only global (should not show local algorithm) + algorithms = list_installed_algorithms(global_list=True) + + # testalgo should not appear in global list + 
assert 'testalgo' not in algorithms + + finally: + os.chdir(original_cwd) + + +class TestAlgorithmCLIIntegration: + """Test CLI integration for algorithm installation""" + + def test_cli_install_algorithm_help(self): + """Test fz install algorithm --help""" + from fz.cli import main + import sys + from io import StringIO + + # Save original + original_argv = sys.argv + original_stdout = sys.stdout + original_stderr = sys.stderr + + # Redirect output + sys.stdout = StringIO() + sys.stderr = StringIO() + sys.argv = ['fz', 'install', 'algorithm', '--help'] + + returncode = 0 + try: + returncode = main() + except SystemExit as e: + returncode = e.code if e.code is not None else 0 + finally: + output = sys.stdout.getvalue() + sys.stderr.getvalue() + sys.stdout = original_stdout + sys.stderr = original_stderr + sys.argv = original_argv + + assert returncode == 0 + assert 'algorithm' in output.lower() + + def test_cli_list_algorithms_help(self): + """Test fz list algorithms --help""" + from fz.cli import main + import sys + from io import StringIO + + # Save original + original_argv = sys.argv + original_stdout = sys.stdout + original_stderr = sys.stderr + + # Redirect output + sys.stdout = StringIO() + sys.stderr = StringIO() + sys.argv = ['fz', 'list', 'algorithms', '--help'] + + returncode = 0 + try: + returncode = main() + except SystemExit as e: + returncode = e.code if e.code is not None else 0 + finally: + output = sys.stdout.getvalue() + sys.stderr.getvalue() + sys.stdout = original_stdout + sys.stderr = original_stderr + sys.argv = original_argv + + assert returncode == 0 + assert 'algorithm' in output.lower() + + +class TestAlgorithmPythonAPI: + """Test Python API for algorithm installation""" + + def test_install_algo_function(self, test_algorithm_zip_py, install_workspace): + """Test fz.install_algorithm() function""" + import fz + + original_cwd = os.getcwd() + try: + os.chdir(install_workspace) + + result = fz.install_algorithm(str(test_algorithm_zip_py), 
global_install=False) + + assert result['algorithm_name'] == 'testalgo' + + # Verify file exists + algo_file = install_workspace / ".fz" / "algorithms" / "testalgo.py" + assert algo_file.exists() + finally: + os.chdir(original_cwd) + + def test_uninstall_algo_function(self, test_algorithm_zip_py, install_workspace): + """Test fz.uninstall_algo() function""" + import fz + + original_cwd = os.getcwd() + try: + os.chdir(install_workspace) + + # Install first + fz.install_algorithm(str(test_algorithm_zip_py), global_install=False) + + # Uninstall + success = fz.uninstall_algorithm('testalgo', global_uninstall=False) + + assert success is True + + # Verify removed + algo_file = install_workspace / ".fz" / "algorithms" / "testalgo.py" + assert not algo_file.exists() + finally: + os.chdir(original_cwd) + + def test_list_algorithms_function(self, test_algorithm_zip_py, install_workspace): + """Test fz.list_algorithms() function""" + import fz + + original_cwd = os.getcwd() + try: + os.chdir(install_workspace) + + # Install an algorithm + fz.install_algorithm(str(test_algorithm_zip_py), global_install=False) + + # List algorithms + algorithms = fz.list_installed_algorithms(global_list=False) + + assert 'testalgo' in algorithms + assert algorithms['testalgo']['type'] == 'Python' + finally: + os.chdir(original_cwd) + + +if __name__ == "__main__": + pytest.main([__file__, "-v"]) diff --git a/tests/test_algorithm_montecarlo.py b/tests/test_algorithm_montecarlo.py new file mode 100644 index 0000000..9f35311 --- /dev/null +++ b/tests/test_algorithm_montecarlo.py @@ -0,0 +1,120 @@ +#title: Estimate mean with given confidence interval range using Monte Carlo +#author: Yann Richet +#type: sampling +#options: batch_sample_size=10;max_iterations=100;confidence=0.9;target_confidence_range=1.0;seed=42 +#require: numpy;scipy;matplotlib;base64 + +class MonteCarlo_Uniform: + + options = {} + samples = [] + n_samples = 0 + variables = {} + + def __init__(self, **options): + # parse 
(numeric) options + self.options["batch_sample_size"] = int(options.get("batch_sample_size", 10)) + self.options["max_iterations"] = int(options.get("max_iterations", 100)) + self.options["confidence"] = float(options.get("confidence", 0.9)) + self.options["target_confidence_range"] = float(options.get("target_confidence_range", 1.0)) + + import numpy as np + from scipy import stats + np.random.seed(int(options.get("seed", 42))) + + def get_initial_design(self, input_variables, output_variables): + for v, bounds in input_variables.items(): + # bounds is already a tuple (min, max) from parse_input_vars + if isinstance(bounds, tuple) and len(bounds) == 2: + min_val, max_val = bounds + else: + # Fallback: parse bounds string if needed : [min;max] + bounds_str = str(bounds).strip("[]").split(";") + if len(bounds_str) != 2: + raise Exception(f"Input variable {v} must be defined with min and max values for MonteCarlo_Uniform sampling") + min_val = float(bounds_str[0]) + max_val = float(bounds_str[1]) + self.variables[v] = (min_val, max_val) + return self._generate_samples(self.options["batch_sample_size"]) + + def get_next_design(self, X, Y): + # check max iterations + if self.n_samples >= self.options["max_iterations"] * self.options["batch_sample_size"]: + return [] + # check confidence interval: compute empirical confidence interval (using kernel density) on Y, compare with target_confidence_range + import numpy as np + from scipy import stats + Y_array = np.array([y for y in Y if y is not None]) + if len(Y_array) < 2: + return self._generate_samples(self.options["batch_sample_size"]) + kde = stats.gaussian_kde(Y_array) + mean = np.mean(Y_array) + conf_int = stats.t.interval(self.options["confidence"], len(Y_array)-1, loc=mean, scale=stats.sem(Y_array)) + conf_range = conf_int[1] - conf_int[0] + if conf_range <= self.options["target_confidence_range"]: + return [] + # else generate new samples + return self._generate_samples(self.options["batch_sample_size"]) + + def 
_generate_samples(self, n): + import numpy as np + samples = [] + for _ in range(n): + sample = {} + for v, (min_val, max_val) in self.variables.items(): + sample[v] = np.random.uniform(min_val, max_val) + samples.append(sample) + self.n_samples += n + return samples + + def get_analysis(self, X, Y): + analysis_dict = {"text": "", "data": {}} + html_output = "" + import numpy as np + from scipy import stats + Y_array = np.array([y for y in Y if y is not None]) + if len(Y_array) < 2: + analysis_dict["text"] = "Not enough valid results to analysis statistics" + return analysis_dict + mean = np.mean(Y_array) + conf_int = stats.t.interval(self.options["confidence"], len(Y_array)-1, loc=mean, scale=stats.sem(Y_array)) + html_output += f"
Estimated mean: {mean}" + html_output += f"{self.options['confidence']*100}% confidence interval: [{conf_int[0]}, {conf_int[1]}]
" + + # Store data + analysis_dict["data"]["mean"] = mean + analysis_dict["data"]["confidence_interval"] = conf_int + analysis_dict["data"]["n_samples"] = len(Y_array) + + # Text output + analysis_dict["text"] = ( + f"Estimated mean: {mean:.6f}\n" + f"{self.options['confidence']*100}% confidence interval: [{conf_int[0]:.6f}, {conf_int[1]:.6f}]\n" + f"Number of valid samples: {len(Y_array)}" + ) + + # Try to plot histogram if matplotlib is available + try: + import matplotlib.pyplot as plt + import base64 + from io import BytesIO + + plt.figure() + plt.hist(Y_array, bins=20, density=True, alpha=0.6, color='g') + plt.title("Output Y histogram") + plt.xlabel("Y") + plt.ylabel("Density") + plt.grid() + + # base64 in html + buffered = BytesIO() + plt.savefig(buffered, format="png") + plt.close() + img_str = base64.b64encode(buffered.getvalue()).decode() + html_output += f'Histogram' + analysis_dict["html"] = html_output + except Exception as e: + # If plotting fails, just skip it + pass + + return analysis_dict diff --git a/tests/test_algorithm_plugins.py b/tests/test_algorithm_plugins.py new file mode 100644 index 0000000..b024d56 --- /dev/null +++ b/tests/test_algorithm_plugins.py @@ -0,0 +1,452 @@ +#!/usr/bin/env python3 +""" +Test algorithm plugin system + +This test suite verifies that: +1. Algorithms can be loaded by name from .fz/algorithms/ directory +2. Project-level plugins take priority over global plugins +3. Both .py and .R algorithms work as plugins +4. Direct paths still work as before +5. 
Helpful error messages when plugins not found +""" + +import pytest +from pathlib import Path +import shutil + +from fz.algorithms import load_algorithm, resolve_algorithm_path + + +class TestAlgorithmPluginResolution: + """Test algorithm plugin path resolution""" + + def test_resolve_direct_path(self): + """Test that direct paths are not resolved as plugins""" + # Paths with / should not be resolved + assert resolve_algorithm_path("path/to/algo.py") is None + assert resolve_algorithm_path("../algo.py") is None + + # Paths with extension should not be resolved + assert resolve_algorithm_path("algo.py") is None + assert resolve_algorithm_path("algo.R") is None + + def test_resolve_plugin_name(self, temp_test_dir): + """Test resolving plugin name to path""" + # Create .fz/algorithms/ directory + algo_dir = Path(temp_test_dir) / ".fz" / "algorithms" + algo_dir.mkdir(parents=True) + + # Create a test algorithm + algo_file = algo_dir / "testalgo.py" + algo_file.write_text(""" +class TestAlgo: + def __init__(self, **options): + pass + + def get_initial_design(self, input_vars, output_vars): + return [{"x": 0.5}] + + def get_next_design(self, X, Y): + return [] + + def get_analysis(self, X, Y): + return {"text": "test"} +""") + + # Resolve plugin name + resolved = resolve_algorithm_path("testalgo") + + # Verify it found the plugin + assert resolved is not None + assert resolved.exists() + assert resolved.name == "testalgo.py" + + def test_resolve_plugin_with_r_extension(self, temp_test_dir): + """Test resolving R algorithm plugin""" + # Create .fz/algorithms/ directory + algo_dir = Path(temp_test_dir) / ".fz" / "algorithms" + algo_dir.mkdir(parents=True) + + # Create a test R algorithm + algo_file = algo_dir / "testalgo.R" + algo_file.write_text(""" +TestAlgo <- function(...) 
{ + obj <- list() + class(obj) <- "TestAlgo" + return(obj) +} +""") + + # Resolve plugin name + resolved = resolve_algorithm_path("testalgo") + + # Verify it found the R plugin + assert resolved is not None + assert resolved.exists() + assert resolved.name == "testalgo.R" + + def test_resolve_plugin_python_priority_over_r(self, temp_test_dir): + """Test that .py plugin has priority over .R when both exist""" + # Create .fz/algorithms/ directory + algo_dir = Path(temp_test_dir) / ".fz" / "algorithms" + algo_dir.mkdir(parents=True) + + # Create both .py and .R algorithms with same name + py_file = algo_dir / "testalgo.py" + py_file.write_text("class TestAlgo: pass") + + r_file = algo_dir / "testalgo.R" + r_file.write_text("TestAlgo <- function() {}") + + # Resolve plugin name - should get .py first + resolved = resolve_algorithm_path("testalgo") + + # Verify it found the Python plugin (priority) + assert resolved is not None + assert resolved.name == "testalgo.py" + + def test_resolve_plugin_not_found(self, temp_test_dir): + """Test resolving non-existent plugin returns None""" + # Try to resolve non-existent plugin + resolved = resolve_algorithm_path("nonexistent") + + # Should return None + assert resolved is None + + +class TestAlgorithmPluginLoading: + """Test loading algorithms via plugin system""" + + def test_load_plugin_by_name(self, temp_test_dir): + """Test loading algorithm by plugin name""" + # Create .fz/algorithms/ directory + algo_dir = Path(temp_test_dir) / ".fz" / "algorithms" + algo_dir.mkdir(parents=True) + + # Create a test algorithm + algo_file = algo_dir / "myalgorithm.py" + algo_file.write_text(""" +class MyAlgorithm: + def __init__(self, **options): + self.batch_size = options.get("batch_size", 5) + + def get_initial_design(self, input_vars, output_vars): + return [{"x": float(i)} for i in range(self.batch_size)] + + def get_next_design(self, X, Y): + return [] + + def get_analysis(self, X, Y): + return {"text": "Analysis complete", "data": 
{"count": len(X)}} +""") + + # Load by plugin name + algo = load_algorithm("myalgorithm", batch_size=3) + + # Test the algorithm works + design = algo.get_initial_design({"x": (0, 10)}, ["result"]) + assert len(design) == 3 + assert design[0]["x"] == 0.0 + + def test_load_plugin_from_global_directory(self, temp_test_dir, monkeypatch): + """Test loading algorithm from global ~/.fz/algorithms/""" + # Mock home directory + fake_home = Path(temp_test_dir) / "home" + fake_home.mkdir() + monkeypatch.setenv("HOME", str(fake_home)) + + # Create global .fz/algorithms/ directory + algo_dir = fake_home / ".fz" / "algorithms" + algo_dir.mkdir(parents=True) + + # Create a test algorithm in global directory + algo_file = algo_dir / "globalalgo.py" + algo_file.write_text(""" +class GlobalAlgo: + def __init__(self, **options): + pass + + def get_initial_design(self, input_vars, output_vars): + return [{"x": 1.0}] + + def get_next_design(self, X, Y): + return [] + + def get_analysis(self, X, Y): + return {"text": "global"} +""") + + # Load by plugin name + algo = load_algorithm("globalalgo") + + # Test the algorithm works + design = algo.get_initial_design({"x": (0, 10)}, ["result"]) + assert len(design) == 1 + assert design[0]["x"] == 1.0 + + def test_load_plugin_project_priority_over_global(self, temp_test_dir, monkeypatch): + """Test that project-level plugin takes priority over global""" + # Mock home directory + fake_home = Path(temp_test_dir) / "home" + fake_home.mkdir() + monkeypatch.setenv("HOME", str(fake_home)) + + # Create global plugin + global_algo_dir = fake_home / ".fz" / "algorithms" + global_algo_dir.mkdir(parents=True) + global_file = global_algo_dir / "samename.py" + global_file.write_text(""" +class SameName: + def __init__(self, **options): + self.source = "global" + + def get_initial_design(self, input_vars, output_vars): + return [{"x": 999.0}] + + def get_next_design(self, X, Y): + return [] + + def get_analysis(self, X, Y): + return {"text": self.source} 
+""") + + # Create project-level plugin (should have priority) + project_algo_dir = Path(temp_test_dir) / ".fz" / "algorithms" + project_algo_dir.mkdir(parents=True) + project_file = project_algo_dir / "samename.py" + project_file.write_text(""" +class SameName: + def __init__(self, **options): + self.source = "project" + + def get_initial_design(self, input_vars, output_vars): + return [{"x": 1.0}] + + def get_next_design(self, X, Y): + return [] + + def get_analysis(self, X, Y): + return {"text": self.source} +""") + + # Load by plugin name - should get project-level + algo = load_algorithm("samename") + + # Verify it loaded the project-level one + design = algo.get_initial_design({"x": (0, 10)}, ["result"]) + assert design[0]["x"] == 1.0 # Not 999.0 from global + + analysis = algo.get_analysis([], []) + assert analysis["text"] == "project" + + def test_load_direct_path_still_works(self, temp_test_dir): + """Test that direct paths still work (backward compatibility)""" + # Create algorithm file in arbitrary location + algo_file = Path(temp_test_dir) / "direct_algo.py" + algo_file.write_text(""" +class DirectAlgo: + def __init__(self, **options): + pass + + def get_initial_design(self, input_vars, output_vars): + return [{"x": 42.0}] + + def get_next_design(self, X, Y): + return [] + + def get_analysis(self, X, Y): + return {"text": "direct"} +""") + + # Load by direct path + algo = load_algorithm(str(algo_file)) + + # Verify it works + design = algo.get_initial_design({"x": (0, 10)}, ["result"]) + assert design[0]["x"] == 42.0 + + def test_load_plugin_not_found_error(self, temp_test_dir): + """Test helpful error message when plugin not found""" + # Try to load non-existent plugin + with pytest.raises(ValueError) as exc_info: + load_algorithm("nonexistent") + + # Verify error message is helpful + error_msg = str(exc_info.value) + assert "Plugin 'nonexistent' not found" in error_msg + assert ".fz/algorithms/nonexistent.py" in error_msg + assert 
"~/.fz/algorithms/nonexistent.py" in error_msg + + def test_load_plugin_with_options(self, temp_test_dir): + """Test passing options to plugin algorithm""" + # Create .fz/algorithms/ directory + algo_dir = Path(temp_test_dir) / ".fz" / "algorithms" + algo_dir.mkdir(parents=True) + + # Create algorithm that uses options + algo_file = algo_dir / "optionalgo.py" + algo_file.write_text(""" +class OptAlgo: + def __init__(self, **options): + self.value = options.get("value", 10) + self.name = options.get("name", "default") + + def get_initial_design(self, input_vars, output_vars): + return [{"x": float(self.value)}] + + def get_next_design(self, X, Y): + return [] + + def get_analysis(self, X, Y): + return {"text": self.name, "data": {"value": self.value}} +""") + + # Load with options + algo = load_algorithm("optionalgo", value=42, name="test") + + # Verify options were passed + design = algo.get_initial_design({"x": (0, 10)}, ["result"]) + assert design[0]["x"] == 42.0 + + analysis = algo.get_analysis([], []) + assert analysis["text"] == "test" + assert analysis["data"]["value"] == 42 + + +class TestAlgorithmPluginWithRAlgorithms: + """Test plugin system with R algorithms""" + + @pytest.mark.skipif( + True, # Skip for now unless rpy2 is available + reason="R plugin tests require rpy2" + ) + def test_load_r_plugin_by_name(self, temp_test_dir): + """Test loading R algorithm by plugin name""" + try: + import rpy2 + except ImportError: + pytest.skip("rpy2 not available") + + # Create .fz/algorithms/ directory + algo_dir = Path(temp_test_dir) / ".fz" / "algorithms" + algo_dir.mkdir(parents=True) + + # Create R algorithm + algo_file = algo_dir / "ralgo.R" + algo_file.write_text(""" +RAlgo <- function(...) 
{ + obj <- list( + options = list(), + state = new.env(parent = emptyenv()) + ) + class(obj) <- "RAlgo" + return(obj) +} + +get_initial_design.RAlgo <- function(obj, input_variables, output_variables) { + return(list(list(x = 1.0))) +} + +get_next_design.RAlgo <- function(obj, X, Y) { + return(list()) +} + +get_analysis.RAlgo <- function(obj, X, Y) { + return(list(text = "R plugin")) +} +""") + + # Load by plugin name + algo = load_algorithm("ralgo") + + # Test it works + design = algo.get_initial_design({"x": (0, 10)}, ["result"]) + assert len(design) == 1 + + +class TestAlgorithmPluginIntegration: + """Test plugin system integration with fzd""" + + def test_fzd_with_plugin_algorithm(self, temp_test_dir): + """Test using plugin algorithm with fzd""" + try: + import pandas as pd + except ImportError: + pytest.skip("pandas required for fzd") + + import fz + + # Create .fz/algorithms/ directory + algo_dir = Path(temp_test_dir) / ".fz" / "algorithms" + algo_dir.mkdir(parents=True) + + # Create simple sampling algorithm + algo_file = algo_dir / "simplesampler.py" + algo_file.write_text(""" +class SimpleSampler: + def __init__(self, **options): + self.n_samples = options.get("n_samples", 5) + self.iteration = 0 + + def get_initial_design(self, input_vars, output_vars): + import random + random.seed(42) + samples = [] + for _ in range(self.n_samples): + sample = {} + for var, (min_val, max_val) in input_vars.items(): + sample[var] = random.uniform(min_val, max_val) + samples.append(sample) + return samples + + def get_next_design(self, X, Y): + self.iteration += 1 + if self.iteration >= 1: + return [] + return self.get_initial_design({"x": (0, 10)}, []) + + def get_analysis(self, X, Y): + valid_Y = [y for y in Y if y is not None] + mean_val = sum(valid_Y) / len(valid_Y) if valid_Y else 0 + return { + "text": f"Mean: {mean_val:.2f}", + "data": {"mean": mean_val, "n_samples": len(valid_Y)} + } +""") + + # Create input template + input_file = Path(temp_test_dir) / 
"input.txt" + input_file.write_text("x=$x") + + # Create calculation script + calc_script = Path(temp_test_dir) / "calc.sh" + calc_script.write_text("""#!/bin/bash +source $1 +echo "result=$x" > output.txt +""") + calc_script.chmod(0o755) + + # Define model + model = { + "varprefix": "$", + "output": {"result": "grep result output.txt | cut -d= -f2"} + } + + # Run fzd with plugin algorithm + results = fz.fzd( + input_path=str(input_file), + input_variables={"x": "[0;10]"}, + model=model, + output_expression="result", + algorithm="simplesampler", # Plugin name! + calculators=[f"sh://bash {calc_script}"], + algorithm_options={"n_samples": 3}, + analysis_dir=str(Path(temp_test_dir) / "fzd_results") + ) + + # Verify results + assert "XY" in results + assert isinstance(results["XY"], pd.DataFrame) + assert len(results["XY"]) >= 3 + assert "x" in results["XY"].columns + assert "result" in results["XY"].columns diff --git a/tests/test_cli_commands.py b/tests/test_cli_commands.py index 6e320ba..6d18628 100644 --- a/tests/test_cli_commands.py +++ b/tests/test_cli_commands.py @@ -131,7 +131,10 @@ def temp_workspace(): def sample_input_file(temp_workspace): """Create a sample input file with variables""" input_file = temp_workspace / "input.txt" - input_file.write_text("x = ${var1}\ny = ${var2}\nz = ${var3}") + with input_file.open('w',newline='\n') as f: + f.write("x = ${var1}\n") + f.write("y = ${var2}\n") + f.write("z = ${var3}\n") return input_file @@ -270,7 +273,9 @@ def test_fzo_json_format(self, temp_workspace, sample_model): """Test fzo with JSON format""" # Create a simple output file output_file = temp_workspace / "output.txt" - output_file.write_text("x = 1.0\ny = 2.0") + with output_file.open('w',newline='\n') as f: + f.write("x = 1.0\n") + f.write("y = 2.0\n") # Use a simple model (same as input model for consistency) result = run_fz_cli_function('fzo_main', [ @@ -579,7 +584,7 @@ def test_install_from_local_zip(self, test_model_zip, install_workspace): try: 
os.chdir(install_workspace) result = run_fz_cli_function('main', [ - 'install', + 'install', 'model', str(test_model_zip) ]) @@ -604,7 +609,7 @@ def test_install_with_calculators(self, test_model_with_calculators, install_wor try: os.chdir(install_workspace) result = run_fz_cli_function('main', [ - 'install', + 'install', 'model', str(test_model_with_calculators) ]) @@ -637,13 +642,13 @@ def test_list_installed_models(self, test_model_zip, install_workspace): os.chdir(install_workspace) # Install a model first install_result = run_fz_cli_function('main', [ - 'install', + 'install', 'model', str(test_model_zip) ]) assert install_result.returncode == 0 # List models - result = run_fz_cli_function('main', ['list']) + result = run_fz_cli_function('main', ['list', 'models']) assert result.returncode == 0 assert "testmodel" in result.stdout @@ -661,7 +666,7 @@ def test_list_empty_models(self, temp_workspace): original_cwd = os.getcwd() try: os.chdir(empty_workspace) - result = run_fz_cli_function('main', ['list']) + result = run_fz_cli_function('main', ['list', 'models']) assert result.returncode == 0 # May show global models, just verify it runs without error @@ -679,7 +684,7 @@ def test_list_empty_models(self, temp_workspace): def test_install_invalid_source(self, temp_workspace): """Test error handling for invalid source""" result = run_fz_cli_function('main', [ - 'install', + 'install', 'model', str(temp_workspace / "nonexistent.zip") ]) @@ -690,7 +695,7 @@ def test_install_invalid_source(self, temp_workspace): def test_install_from_github_name(self, install_workspace): """Test installing from GitHub shortname (requires network)""" result = run_fz_cli_function('main', [ - 'install', + 'install', 'model', 'moret' ]) @@ -704,7 +709,7 @@ def test_install_from_github_name(self, install_workspace): def test_install_from_github_url(self, install_workspace): """Test installing from full GitHub URL (requires network)""" result = run_fz_cli_function('main', [ - 'install', + 
'install', 'model', 'https://github.com/Funz/fz-moret' ]) @@ -720,14 +725,14 @@ def test_install_overwrites_existing(self, test_model_zip, install_workspace): os.chdir(install_workspace) # Install first time result1 = run_fz_cli_function('main', [ - 'install', + 'install', 'model', str(test_model_zip) ]) assert result1.returncode == 0 # Install again (should overwrite) result2 = run_fz_cli_function('main', [ - 'install', + 'install', 'model', str(test_model_zip) ]) assert result2.returncode == 0 @@ -749,7 +754,7 @@ def test_uninstall_model(self, test_model_zip, install_workspace): os.chdir(install_workspace) # Install model first install_result = run_fz_cli_function('main', [ - 'install', + 'install', 'model', str(test_model_zip) ]) assert install_result.returncode == 0 @@ -760,7 +765,7 @@ def test_uninstall_model(self, test_model_zip, install_workspace): # Uninstall result = run_fz_cli_function('main', [ - 'uninstall', + 'uninstall', 'model', 'testmodel' ]) @@ -778,7 +783,7 @@ def test_uninstall_nonexistent_model(self, install_workspace): try: os.chdir(install_workspace) result = run_fz_cli_function('main', [ - 'uninstall', + 'uninstall', 'model', 'nonexistent_model' ]) @@ -794,13 +799,13 @@ def test_list_shows_global_flag(self, test_model_zip, install_workspace): os.chdir(install_workspace) # Install a local model install_result = run_fz_cli_function('main', [ - 'install', + 'install', 'model', str(test_model_zip) ]) assert install_result.returncode == 0 # List models - result = run_fz_cli_function('main', ['list']) + result = run_fz_cli_function('main', ['list', 'models']) assert result.returncode == 0 # Should show local flag for installed model @@ -812,7 +817,7 @@ def test_global_install_and_uninstall(self, test_model_zip): """Test installing and uninstalling globally""" # Install globally result1 = run_fz_cli_function('main', [ - 'install', + 'install', 'model', str(test_model_zip), '--global' ]) @@ -825,7 +830,7 @@ def test_global_install_and_uninstall(self, 
test_model_zip): # Uninstall globally result2 = run_fz_cli_function('main', [ - 'uninstall', + 'uninstall', 'model', 'testmodel', '--global' ]) @@ -842,12 +847,12 @@ def test_list_global_only(self, test_model_zip, install_workspace): os.chdir(install_workspace) # Install locally run_fz_cli_function('main', [ - 'install', + 'install', 'model', str(test_model_zip) ]) # List only global (should not show local model) - result = run_fz_cli_function('main', ['list', '--global']) + result = run_fz_cli_function('main', ['list', 'models', '--global']) assert result.returncode == 0 # Local testmodel should not appear diff --git a/tests/test_current_dir_fix.py b/tests/test_current_dir_fix.py index a77d323..866bd4a 100644 --- a/tests/test_current_dir_fix.py +++ b/tests/test_current_dir_fix.py @@ -73,4 +73,4 @@ def test_current_dir_fix(): f"Path resolution incorrect: expected '{expected_path_normalized}' in command '{resolved_cmd_normalized}'" if __name__ == "__main__": - test_current_dir_fix() \ No newline at end of file + test_current_dir_fix() diff --git a/tests/test_dataframe_input.py b/tests/test_dataframe_input.py new file mode 100644 index 0000000..8a94677 --- /dev/null +++ b/tests/test_dataframe_input.py @@ -0,0 +1,345 @@ +""" +Test DataFrame input support for non-factorial designs + +Tests the ability to use pandas DataFrames as input_variables, +where each row represents one case (non-factorial design). 
+""" +import pytest +import tempfile +import shutil +from pathlib import Path +import pandas as pd + +from fz.helpers import generate_variable_combinations +import fz + + +class TestDataFrameInput: + """Test DataFrame input for non-factorial designs""" + + def test_dataframe_basic(self): + """Test basic DataFrame input with 3 cases""" + df = pd.DataFrame({ + "x": [1, 2, 3], + "y": [10, 20, 30] + }) + + var_combinations = generate_variable_combinations(df) + + assert len(var_combinations) == 3 + assert var_combinations[0] == {"x": 1, "y": 10} + assert var_combinations[1] == {"x": 2, "y": 20} + assert var_combinations[2] == {"x": 3, "y": 30} + + def test_dataframe_non_factorial(self): + """Test that DataFrame allows non-factorial combinations""" + # Non-factorial: only specific combinations + df = pd.DataFrame({ + "temp": [100, 200, 100, 300], + "pressure": [1.0, 1.0, 2.0, 1.5] + }) + + var_combinations = generate_variable_combinations(df) + + assert len(var_combinations) == 4 + assert var_combinations[0] == {"temp": 100, "pressure": 1.0} + assert var_combinations[1] == {"temp": 200, "pressure": 1.0} + assert var_combinations[2] == {"temp": 100, "pressure": 2.0} + assert var_combinations[3] == {"temp": 300, "pressure": 1.5} + + def test_dataframe_vs_dict_factorial(self): + """Test that dict creates factorial design while DataFrame doesn't""" + # Dict with lists creates Cartesian product (factorial) + dict_input = {"x": [1, 2], "y": [10, 20]} + dict_combinations = generate_variable_combinations(dict_input) + assert len(dict_combinations) == 4 # 2 x 2 = 4 cases + + # DataFrame with same values creates only specified combinations + df = pd.DataFrame({ + "x": [1, 2], + "y": [10, 20] + }) + df_combinations = generate_variable_combinations(df) + assert len(df_combinations) == 2 # Only 2 cases (rows) + + assert df_combinations[0] == {"x": 1, "y": 10} + assert df_combinations[1] == {"x": 2, "y": 20} + + def test_dataframe_single_row(self): + """Test DataFrame with single 
row""" + df = pd.DataFrame({ + "a": [42], + "b": [99] + }) + + var_combinations = generate_variable_combinations(df) + + assert len(var_combinations) == 1 + assert var_combinations[0] == {"a": 42, "b": 99} + + def test_dataframe_many_columns(self): + """Test DataFrame with many variables""" + df = pd.DataFrame({ + "var1": [1, 2], + "var2": [10, 20], + "var3": [100, 200], + "var4": [1000, 2000], + "var5": [10000, 20000] + }) + + var_combinations = generate_variable_combinations(df) + + assert len(var_combinations) == 2 + assert var_combinations[0] == {"var1": 1, "var2": 10, "var3": 100, "var4": 1000, "var5": 10000} + assert var_combinations[1] == {"var1": 2, "var2": 20, "var3": 200, "var4": 2000, "var5": 20000} + + def test_dataframe_mixed_types(self): + """Test DataFrame with mixed data types""" + df = pd.DataFrame({ + "int_var": [1, 2, 3], + "float_var": [1.5, 2.5, 3.5], + "str_var": ["a", "b", "c"] + }) + + var_combinations = generate_variable_combinations(df) + + assert len(var_combinations) == 3 + assert var_combinations[0] == {"int_var": 1, "float_var": 1.5, "str_var": "a"} + assert var_combinations[1] == {"int_var": 2, "float_var": 2.5, "str_var": "b"} + assert var_combinations[2] == {"int_var": 3, "float_var": 3.5, "str_var": "c"} + + def test_dataframe_with_repeated_values(self): + """Test DataFrame where same value appears multiple times""" + df = pd.DataFrame({ + "x": [1, 1, 2, 2, 2], + "y": [10, 20, 10, 20, 30] + }) + + var_combinations = generate_variable_combinations(df) + + assert len(var_combinations) == 5 + assert var_combinations[0] == {"x": 1, "y": 10} + assert var_combinations[1] == {"x": 1, "y": 20} + assert var_combinations[2] == {"x": 2, "y": 10} + assert var_combinations[3] == {"x": 2, "y": 20} + assert var_combinations[4] == {"x": 2, "y": 30} + + +class TestDataFrameWithFzr: + """Integration tests using DataFrame with fzr()""" + + def setup_method(self): + """Create temporary directory and test files for each test""" + self.test_dir = 
tempfile.mkdtemp() + self.test_path = Path(self.test_dir) + + # Create input template + # The sum formula will be evaluated by fz, so the result is already in input.txt + self.input_file = self.test_path / "input.txt" + with open(self.input_file, "w", newline='\n') as f: + f.write("x=$x\n") + f.write("y=$y\n") + f.write("sum=@{$x + $y}\n") + + # Create simple calculator script that reads from input.txt and computes result + # Use source and bash arithmetic like test_fzo_fzr_coherence.py does + self.calc_script = self.test_path / "calc.sh" + with open(self.calc_script, "w", newline='\n') as f: + f.write('#!/bin/bash\n') + f.write('source input.txt\n') + f.write('result=$((x + y))\n') + f.write('echo "result = $result" > output.txt\n') + self.calc_script.chmod(0o755) + + def teardown_method(self): + """Clean up temporary directory after each test""" + if self.test_path.exists(): + shutil.rmtree(self.test_path) + + def test_fzr_with_dataframe_basic(self): + """Test fzr() with DataFrame input""" + df = pd.DataFrame({ + "x": [1, 2, 3], + "y": [10, 20, 30] + }) + + model = { + "formulaprefix": "@", + "delim": "{}", + "commentline": "#", + "output": { + "result": "grep 'result = ' output.txt | cut -d '=' -f2" + } + } + + results = fz.fzr( + str(self.input_file), + df, + model, + calculators=f"sh://bash {self.calc_script}", + results_dir=str(self.test_path / "results") + ) + + # Should have 3 cases from DataFrame + assert len(results) == 3 + + # Check that we got the expected x, y combinations (not factorial) + results_sorted = results.sort_values("x").reset_index(drop=True) + assert results_sorted["x"].tolist() == [1, 2, 3] + assert results_sorted["y"].tolist() == [10, 20, 30] + + # Results should be x + y + assert results_sorted["result"].tolist() == [11, 22, 33] + + def test_fzr_dataframe_vs_dict(self): + """Compare DataFrame (non-factorial) vs dict (factorial) behavior""" + model = { + "formulaprefix": "@", + "delim": "{}", + "commentline": "#", + "output": { + 
"result": "grep 'result = ' output.txt | cut -d '=' -f2" + } + } + + # DataFrame: only 2 specific combinations + df = pd.DataFrame({ + "x": [1, 2], + "y": [10, 20] + }) + + results_df = fz.fzr( + str(self.input_file), + df, + model, + calculators=f"sh://bash {self.calc_script}", + results_dir=str(self.test_path / "results_df") + ) + + # Dict: 2x2 = 4 combinations (factorial) + dict_input = {"x": [1, 2], "y": [10, 20]} + + results_dict = fz.fzr( + str(self.input_file), + dict_input, + model, + calculators=f"sh://bash {self.calc_script}", + results_dir=str(self.test_path / "results_dict") + ) + + # DataFrame gives 2 cases + assert len(results_df) == 2 + assert results_df["x"].tolist() == [1, 2] + assert results_df["y"].tolist() == [10, 20] + assert results_df["result"].tolist() == [11, 22] + + # Dict gives 4 cases (factorial) + assert len(results_dict) == 4 + results_dict_sorted = results_dict.sort_values(["x", "y"]).reset_index(drop=True) + assert results_dict_sorted["x"].tolist() == [1, 1, 2, 2] + assert results_dict_sorted["y"].tolist() == [10, 20, 10, 20] + assert results_dict_sorted["result"].tolist() == [11, 21, 12, 22] + + def test_fzr_dataframe_non_factorial_pattern(self): + """Test DataFrame with non-factorial pattern (can't be created with dict)""" + # This pattern can't be created with a dict: + # x=1,y=10 and x=1,y=20 and x=2,y=20 (but NOT x=2,y=10) + df = pd.DataFrame({ + "x": [1, 1, 2], + "y": [10, 20, 20] + }) + + model = { + "formulaprefix": "@", + "delim": "{}", + "commentline": "#", + "output": { + "result": "grep 'result = ' output.txt | cut -d '=' -f2" + } + } + + results = fz.fzr( + str(self.input_file), + df, + model, + calculators=f"sh://bash {self.calc_script}", + results_dir=str(self.test_path / "results") + ) + + assert len(results) == 3 + results_sorted = results.sort_values(["x", "y"]).reset_index(drop=True) + assert results_sorted["x"].tolist() == [1, 1, 2] + assert results_sorted["y"].tolist() == [10, 20, 20] + assert 
results_sorted["result"].tolist() == [11, 21, 22] + + +class TestInputValidation: + """Test input validation for input_variables""" + + def test_dict_input_still_works(self): + """Test that dict input still works as before""" + dict_input = {"x": [1, 2], "y": 3} + var_combinations = generate_variable_combinations(dict_input) + + assert len(var_combinations) == 2 + assert var_combinations[0] == {"x": 1, "y": 3} + assert var_combinations[1] == {"x": 2, "y": 3} + + def test_invalid_input_type(self): + """Test that invalid input type raises error""" + with pytest.raises(TypeError, match="input_variables must be a dict or pandas DataFrame"): + generate_variable_combinations([1, 2, 3]) + + with pytest.raises(TypeError, match="input_variables must be a dict or pandas DataFrame"): + generate_variable_combinations("invalid") + + with pytest.raises(TypeError, match="input_variables must be a dict or pandas DataFrame"): + generate_variable_combinations(42) + + +if __name__ == "__main__": + # Run tests + print("=" * 70) + print("Testing DataFrame Input Support") + print("=" * 70) + + test_df = TestDataFrameInput() + + print("\n1. Testing basic DataFrame input...") + test_df.test_dataframe_basic() + print("✓ Passed") + + print("\n2. Testing non-factorial combinations...") + test_df.test_dataframe_non_factorial() + print("✓ Passed") + + print("\n3. Testing DataFrame vs dict factorial...") + test_df.test_dataframe_vs_dict_factorial() + print("✓ Passed") + + print("\n4. Testing single row DataFrame...") + test_df.test_dataframe_single_row() + print("✓ Passed") + + print("\n5. Testing mixed data types...") + test_df.test_dataframe_mixed_types() + print("✓ Passed") + + print("\n6. Testing repeated values...") + test_df.test_dataframe_with_repeated_values() + print("✓ Passed") + + # Test input validation (doesn't require pandas) + test_validation = TestInputValidation() + + print("\n7. 
Testing dict input still works...") + test_validation.test_dict_input_still_works() + print("✓ Passed") + + print("\n8. Testing invalid input type...") + test_validation.test_invalid_input_type() + print("✓ Passed") + + print("\n" + "=" * 70) + print("ALL TESTS PASSED!") + print("=" * 70) diff --git a/tests/test_demos.py b/tests/test_demos.py new file mode 100644 index 0000000..ddd95d6 --- /dev/null +++ b/tests/test_demos.py @@ -0,0 +1,367 @@ +""" +Demo tests for fzd features + +These tests verify various fzd features work correctly by running +demonstrations that were previously standalone scripts. +""" + +import pytest +import fz +import tempfile +from pathlib import Path +import shutil +import logging + + +class TestAlgorithmAutoInstall: + """Test automatic package installation for algorithm requirements""" + + def test_load_algorithm_with_auto_install(self): + """Test that packages are automatically installed when loading an algorithm""" + # Create a test algorithm that requires a package + # We'll use 'six' as it's small and commonly used + test_algo_content = """ +#title: Test Algorithm with Package Requirement +#author: Test +#type: test +#require: six + +class TestAlgorithm: + def __init__(self, **options): + # Import the required package to verify it's installed + import six + self.options = options + + def get_initial_design(self, input_vars, output_vars): + return [{"x": 0.5}] + + def get_next_design(self, X, Y): + return [] + + def get_analysis(self, X, Y): + return {"text": "Test", "data": {}} +""" + + # Create temporary file + with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f: + f.write(test_algo_content) + algo_file = f.name + + try: + # Load the algorithm - this should auto-install 'six' if not present + from fz.algorithms import load_algorithm + algo = load_algorithm(algo_file) + + # Verify the algorithm loaded successfully + assert algo is not None + + # Verify we can now import six + import six + assert six is not None + 
+ finally: + # Clean up temporary file + Path(algo_file).unlink() + + +class TestDisplayResultsTmp: + """Test intermediate progress display using get_analysis_tmp""" + + def test_get_analysis_tmp_is_called(self): + """Test that get_analysis_tmp is called during fzd iterations""" + # Create temporary directory + tmpdir = Path(tempfile.mkdtemp()) + + try: + # Create input directory + input_dir = tmpdir / "input" + input_dir.mkdir() + + # Create input file + (input_dir / "input.txt").write_text("x = $x\ny = $y\n") + + # Define simple model + model = { + "varprefix": "$", + "delim": "()", + "run": "echo 'result = 1.0' > output.txt", + "output": { + "result": "grep 'result = ' output.txt | cut -d '=' -f2 | tr -d ' '" + } + } + + # Path to montecarlo_uniform algorithm (has get_analysis_tmp) + import os + repo_root = Path(__file__).parent.parent + algo_path = str(repo_root / "examples" / "algorithms" / "montecarlo_uniform.py") + + # Run fzd with small batch to see multiple iterations + result = fz.fzd( + input_path=str(input_dir), + input_variables={"x": "[0;1]", "y": "[0;1]"}, + model=model, + output_expression="result", + algorithm=algo_path, + algorithm_options={ + "batch_sample_size": 3, # Small batches to see more iterations + "max_iterations": 5, + "target_confidence_range": 0.01, + "seed": 42 + } + ) + + # Verify result structure + assert 'XY' in result + assert 'analysis' in result + assert 'iterations' in result + assert result['iterations'] > 0 + + finally: + # Cleanup + shutil.rmtree(tmpdir) + + +class TestContentDetection: + """Test intelligent content detection and file saving""" + + def test_content_detection_with_different_formats(self): + """Test that fzd detects and processes different content types""" + # Create temporary directory + tmpdir = Path(tempfile.mkdtemp()) + + try: + # Create input directory + input_dir = tmpdir / "input" + input_dir.mkdir() + + # Create input file + (input_dir / "input.txt").write_text("x = $x\n") + + # Define simple model + 
model = { + "varprefix": "$", + "delim": "()", + "run": "echo 'result = 1.5' > output.txt", + "output": { + "result": "grep 'result = ' output.txt | cut -d '=' -f2 | tr -d ' '" + } + } + + # Create algorithm with different content types in get_analysis + algo_file = tmpdir / "test_algo.py" + algo_file.write_text(""" +class TestContentAlgorithm: + def __init__(self, **options): + self.iteration = 0 + + def get_initial_design(self, input_vars, output_vars): + return [{"x": 0.5}] + + def get_next_design(self, X, Y): + self.iteration += 1 + if self.iteration < 2: + return [{"x": 0.7}] + return [] + + def get_analysis(self, X, Y): + # Return different content types based on iteration + if self.iteration == 0: + # First iteration: JSON content + return { + "text": '{"mean": 1.234, "std": 0.567, "samples": 1}', + "data": {"iteration": 0} + } + elif self.iteration == 1: + # Second iteration: Key=Value content + return { + "text": "mean = 1.345\\nstd = 0.432\\nsamples = 2", + "data": {"iteration": 1} + } + else: + # Final iteration: Markdown content + return { + "text": '# Final Results\\n\\nMean: 1.456', + "data": {"iteration": 2} + } + + def get_analysis_tmp(self, X, Y): + return {"text": f"Progress: {len(X)} samples", "data": {}} +""") + + # Run fzd + result = fz.fzd( + input_path=str(input_dir), + input_variables={"x": "[0;1]"}, + model=model, + output_expression="result", + algorithm=str(algo_file) + ) + + # Verify result structure + assert 'XY' in result + assert 'analysis' in result + + finally: + # Cleanup + shutil.rmtree(tmpdir) + + +class TestDataFrame: + """Test XY DataFrame returned by fzd""" + + def test_xy_dataframe_structure(self): + """Test that fzd returns XY DataFrame with correct structure""" + # Create temporary directory + tmpdir = Path(tempfile.mkdtemp()) + + try: + # Create input directory + input_dir = tmpdir / "input" + input_dir.mkdir() + + # Create input file + (input_dir / "input.txt").write_text("x = $x\ny = $y\n") + + # Define simple model + 
model = { + "varprefix": "$", + "delim": "()", + "run": "echo 'result = 1.0' > output.txt", + "output": { + "result": "grep 'result = ' output.txt | cut -d '=' -f2 | tr -d ' '" + } + } + + # Use randomsampling algorithm + repo_root = Path(__file__).parent.parent + algo_path = str(repo_root / "examples" / "algorithms" / "randomsampling.py") + + # Run fzd + result = fz.fzd( + input_path=str(input_dir), + input_variables={"x": "[0;1]", "y": "[0;1]"}, + model=model, + output_expression="result", + algorithm=algo_path, + algorithm_options={"nvalues": 5, "seed": 42} + ) + + # Access the XY DataFrame + df = result['XY'] + + # Verify structure + assert df is not None + assert 'x' in df.columns + assert 'y' in df.columns + assert 'result' in df.columns # Output column named with output_expression + assert len(df) == 5 + + # Verify old keys are not present + assert 'input_vars' not in result + assert 'output_values' not in result + + # Verify new key is present + assert 'XY' in result + + finally: + # Cleanup + shutil.rmtree(tmpdir) + + +class TestProgressBar: + """Test progress bar with total time display""" + + def test_progress_bar_shows_total_time(self): + """Test that progress bar shows total time after completion""" + # Create temporary directory + tmpdir = Path(tempfile.mkdtemp()) + + try: + # Create input directory + input_dir = tmpdir / "input" + input_dir.mkdir() + + # Create input file + (input_dir / "input.txt").write_text("x = $x\ny = $y\n") + + # Define simple model + model = { + "varprefix": "$", + "delim": "()", + "run": "echo 'result = 1.0' > output.txt", + "output": { + "result": "grep 'result = ' output.txt | cut -d '=' -f2 | tr -d ' '" + } + } + + # Use randomsampling algorithm + repo_root = Path(__file__).parent.parent + algo_path = str(repo_root / "examples" / "algorithms" / "randomsampling.py") + + # Run fzd + result = fz.fzd( + input_path=str(input_dir), + input_variables={"x": "[0;1]", "y": "[0;1]"}, + model=model, + output_expression="result", + 
algorithm=algo_path, + algorithm_options={"nvalues": 5, "seed": 42} + ) + + # Verify result was returned (progress bar didn't block) + assert result is not None + assert 'XY' in result + + finally: + # Cleanup + shutil.rmtree(tmpdir) + + +class TestParallelExecution: + """Test parallel calculator execution in fzd""" + + def test_parallel_execution_structure(self): + """Test that fzd executes cases in batches enabling parallelization""" + # Create temporary directory + tmpdir = Path(tempfile.mkdtemp()) + + try: + # Create input directory + input_dir = tmpdir / "input" + input_dir.mkdir() + + # Create input file + (input_dir / "input.txt").write_text("x = $x\ny = $y\n") + + # Define simple model + model = { + "varprefix": "$", + "delim": "()", + "run": "echo 'result = 1.0' > output.txt", + "output": { + "result": "grep 'result = ' output.txt | cut -d '=' -f2 | tr -d ' '" + } + } + + # Path to randomsampling algorithm + repo_root = Path(__file__).parent.parent + algo_path = str(repo_root / "examples" / "algorithms" / "randomsampling.py") + + # Run fzd + result = fz.fzd( + input_path=str(input_dir), + input_variables={"x": "[0;1]", "y": "[0;1]"}, + model=model, + output_expression="result", + algorithm=algo_path, + algorithm_options={"nvalues": 5, "seed": 42} + ) + + # Verify result structure + assert 'total_evaluations' in result + assert result['total_evaluations'] > 0 + assert 'XY' in result + + finally: + # Cleanup + shutil.rmtree(tmpdir) diff --git a/tests/test_dict_flattening.py b/tests/test_dict_flattening.py new file mode 100644 index 0000000..9bad7bf --- /dev/null +++ b/tests/test_dict_flattening.py @@ -0,0 +1,536 @@ +""" +Test dict flattening functionality in fzo and fzr + +Tests the automatic recursive flattening of dictionary-valued outputs +into separate columns with keys joined by underscores. 
+""" +import json +import os +import platform +import shutil +import tempfile +from pathlib import Path + +import pytest +import pandas as pd + +import fz +from fz.io import flatten_dict_recursive, flatten_dict_columns + + +class TestFlattenDictRecursive: + """Test the flatten_dict_recursive helper function""" + + def test_simple_dict(self): + """Test flattening a simple flat dict""" + d = {'a': 1, 'b': 2, 'c': 3} + result = flatten_dict_recursive(d) + assert result == {'a': 1, 'b': 2, 'c': 3} + + def test_nested_dict_one_level(self): + """Test flattening a dict with one level of nesting""" + d = {'stats': {'min': 1, 'max': 4}} + result = flatten_dict_recursive(d, parent_key='data', sep='_') + assert result == {'data_stats_min': 1, 'data_stats_max': 4} + + def test_nested_dict_two_levels(self): + """Test flattening a dict with two levels of nesting""" + d = {'level1': {'level2': {'a': 1, 'b': 2}}} + result = flatten_dict_recursive(d, sep='_') + assert result == {'level1_level2_a': 1, 'level1_level2_b': 2} + + def test_nested_dict_three_levels(self): + """Test flattening a deeply nested dict (3 levels)""" + d = {'l1': {'l2': {'l3': {'value': 42}}}} + result = flatten_dict_recursive(d, sep='_') + assert result == {'l1_l2_l3_value': 42} + + def test_mixed_nesting(self): + """Test flattening a dict with mixed nested and flat values""" + d = { + 'flat': 100, + 'nested': {'a': 1, 'b': 2}, + 'deep': {'level2': {'value': 3}} + } + result = flatten_dict_recursive(d, sep='_') + assert result == { + 'flat': 100, + 'nested_a': 1, + 'nested_b': 2, + 'deep_level2_value': 3 + } + + def test_custom_separator(self): + """Test flattening with a custom separator""" + d = {'a': {'b': 1}} + result = flatten_dict_recursive(d, sep='.') + assert result == {'a.b': 1} + + def test_empty_dict(self): + """Test flattening an empty dict""" + d = {} + result = flatten_dict_recursive(d) + assert result == {} + + +class TestFlattenDictColumns: + """Test the flatten_dict_columns function on 
DataFrames""" + + def test_no_dict_columns(self): + """Test DataFrame with no dict columns remains unchanged""" + df = pd.DataFrame({'x': [1, 2, 3], 'y': [4, 5, 6]}) + result = flatten_dict_columns(df) + assert list(result.columns) == ['x', 'y'] + assert result.equals(df) + + def test_simple_dict_column(self): + """Test flattening a simple dict column""" + df = pd.DataFrame({ + 'x': [1, 2, 3], + 'stats': [ + {'min': 1, 'max': 4}, + {'min': 2, 'max': 5}, + {'min': 3, 'max': 6} + ] + }) + result = flatten_dict_columns(df) + print(result) + + # Original dict column should be removed + assert 'stats' not in result.columns + + # Flattened columns should exist + assert 'stats_min' in result.columns + assert 'stats_max' in result.columns + + # Values should be correct + assert list(result['stats_min']) == [1, 2, 3] + assert list(result['stats_max']) == [4, 5, 6] + + # Original column should remain + assert list(result['x']) == [1, 2, 3] + + def test_nested_dict_column(self): + """Test flattening a nested dict column""" + df = pd.DataFrame({ + 'x': [1, 2], + 'data': [ + {'level1': {'level2': {'value': 10}}}, + {'level1': {'level2': {'value': 20}}} + ] + }) + result = flatten_dict_columns(df) + + assert 'data' not in result.columns + assert 'data_level1_level2_value' in result.columns + assert list(result['data_level1_level2_value']) == [10, 20] + + def test_deeply_nested_dict_column(self): + """Test flattening a deeply nested dict column (3 levels)""" + df = pd.DataFrame({ + 'x': [1, 2], + 'deep': [ + {'a': {'b': {'c': {'d': 100}}}}, + {'a': {'b': {'c': {'d': 200}}}} + ] + }) + result = flatten_dict_columns(df) + + assert 'deep_a_b_c_d' in result.columns + assert list(result['deep_a_b_c_d']) == [100, 200] + + def test_multiple_dict_columns(self): + """Test flattening multiple dict columns""" + df = pd.DataFrame({ + 'x': [1, 2], + 'stats': [ + {'min': 1, 'max': 4}, + {'min': 2, 'max': 5} + ], + 'info': [ + {'name': 'a', 'id': 100}, + {'name': 'b', 'id': 200} + ] + }) + 
result = flatten_dict_columns(df) + + # Both dict columns should be flattened + assert 'stats' not in result.columns + assert 'info' not in result.columns + assert 'stats_min' in result.columns + assert 'stats_max' in result.columns + assert 'info_name' in result.columns + assert 'info_id' in result.columns + + def test_dict_with_none_values(self): + """Test flattening dict column with None values""" + df = pd.DataFrame({ + 'x': [1, 2, 3], + 'stats': [ + {'min': 1, 'max': 4}, + None, + {'min': 3, 'max': 6} + ] + }) + result = flatten_dict_columns(df) + + assert 'stats_min' in result.columns + assert result['stats_min'].iloc[0] == 1.0 + assert pd.isna(result['stats_min'].iloc[1]) + assert result['stats_min'].iloc[2] == 3.0 + + def test_mixed_nested_and_flat_values(self): + """Test flattening dict with both nested and flat values""" + df = pd.DataFrame({ + 'x': [1, 2], + 'data': [ + {'nested': {'a': 1, 'b': 2}, 'flat': 99}, + {'nested': {'a': 3, 'b': 4}, 'flat': 88} + ] + }) + result = flatten_dict_columns(df) + + assert 'data_nested_a' in result.columns + assert 'data_nested_b' in result.columns + assert 'data_flat' in result.columns + assert list(result['data_flat']) == [99, 88] + + def test_empty_dataframe(self): + """Test flattening an empty DataFrame""" + df = pd.DataFrame() + result = flatten_dict_columns(df) + assert result.empty + + +class TestFzoWithDictFlattening: + """Test fzo with dict-valued outputs""" + + def test_fzo_with_dict_output(self): + """Test fzo automatically flattens dict outputs""" + with tempfile.TemporaryDirectory() as tmpdir: + # Save original directory to avoid Windows file deletion issues + original_cwd = os.getcwd() + try: + # Create result directory with dict output + result_dir = Path(tmpdir) / "results" / "x=5,y=10" + result_dir.mkdir(parents=True) + + # Write output file with JSON dict + with open(result_dir / "output.txt", "w") as f: + f.write("sum=15\n") + f.write('stats={"min": 5, "max": 10, "diff": 5}\n') + + # Define model + 
model = { + "varprefix": "$", + "delim": "{}", + "output": { + "sum": "grep 'sum=' output.txt | cut -d'=' -f2", + "stats": "grep 'stats=' output.txt | cut -d'=' -f2" + } + } + + # Run fzo + os.chdir(tmpdir) + results = fz.fzo("results/*", model) + + # Check flattening occurred + assert 'stats' not in results.columns + assert 'stats_min' in results.columns + assert 'stats_max' in results.columns + assert 'stats_diff' in results.columns + + # Check values + assert results['sum'].iloc[0] == 15 + assert results['stats_min'].iloc[0] == 5 + assert results['stats_max'].iloc[0] == 10 + assert results['stats_diff'].iloc[0] == 5 + finally: + # Restore original directory to allow cleanup on Windows + os.chdir(original_cwd) + + def test_fzo_with_nested_dict_output(self): + """Test fzo with nested dict outputs""" + with tempfile.TemporaryDirectory() as tmpdir: + # Save original directory to avoid Windows file deletion issues + original_cwd = os.getcwd() + try: + result_dir = Path(tmpdir) / "results" / "case1" + result_dir.mkdir(parents=True) + + # Write output with nested dict + nested_dict = { + 'basic': {'min': 1, 'max': 10}, + 'advanced': {'mean': 5.5, 'std': 2.5} + } + with open(result_dir / "output.txt", "w") as f: + f.write(f"data={json.dumps(nested_dict)}\n") + + model = { + "output": { + "data": "grep 'data=' output.txt | cut -d'=' -f2" + } + } + + os.chdir(tmpdir) + results = fz.fzo("results/*", model) + + # Check nested flattening + assert 'data_basic_min' in results.columns + assert 'data_basic_max' in results.columns + assert 'data_advanced_mean' in results.columns + assert 'data_advanced_std' in results.columns + finally: + # Restore original directory to allow cleanup on Windows + os.chdir(original_cwd) + + +class TestFzrWithDictFlattening: + """Test fzr with dict-valued outputs""" + + def test_fzr_with_dict_output(self): + """Test fzr automatically flattens dict outputs""" + with tempfile.TemporaryDirectory() as tmpdir: + # Save original directory to avoid 
Windows file deletion issues + original_cwd = os.getcwd() + try: + os.chdir(tmpdir) + + # Create input template + with open("input.txt", "w") as f: + f.write("x = ${x}\n") + + # Create calculator script that produces dict output + calc_script = Path(tmpdir) / "calc.py" + with open(calc_script, "w") as f: + f.write("""#!/usr/bin/env python3 +import json + +# Read input +with open('input.txt', 'r') as f: + content = f.read() + x = int([line for line in content.split('\\n') if 'x =' in line][0].split('=')[1].strip()) + +# Create dict output +stats = {'min': x - 1, 'max': x + 1, 'mean': x} + +# Write output +with open('output.txt', 'w') as f: + f.write(f"value={x}\\n") + f.write(f"stats={json.dumps(stats)}\\n") +""") + os.chmod(calc_script, 0o755) + + # Define model + model = { + "varprefix": "$", + "delim": "{}", + "output": { + "value": "grep 'value=' output.txt | cut -d'=' -f2", + "stats": "grep 'stats=' output.txt | cut -d'=' -f2" + } + } + + # Run fzr + results = fz.fzr( + input_path="input.txt", + input_variables={"x": [5, 10, 15]}, + model=model, + calculators=f"sh://python3 {calc_script}" + ) + + # Check flattening occurred + assert 'stats' not in results.columns + assert 'stats_min' in results.columns + assert 'stats_max' in results.columns + assert 'stats_mean' in results.columns + + # Check values for first row + assert results['x'].iloc[0] == 5 + assert results['value'].iloc[0] == 5 + assert results['stats_min'].iloc[0] == 4 + assert results['stats_max'].iloc[0] == 6 + assert results['stats_mean'].iloc[0] == 5 + + # Check all rows + assert len(results) == 3 + finally: + # Restore original directory to allow cleanup on Windows + os.chdir(original_cwd) + + def test_fzr_with_deeply_nested_dict(self): + """Test fzr with deeply nested dict outputs (3 levels)""" + with tempfile.TemporaryDirectory() as tmpdir: + # Save original directory to avoid Windows file deletion issues + original_cwd = os.getcwd() + try: + os.chdir(tmpdir) + + with open("input.txt", "w") as 
f: + f.write("x = ${x}\n") + + calc_script = Path(tmpdir) / "calc.py" + with open(calc_script, "w") as f: + f.write("""#!/usr/bin/env python3 +import json + +with open('input.txt', 'r') as f: + content = f.read() + x = int([line for line in content.split('\\n') if 'x =' in line][0].split('=')[1].strip()) + +# Create deeply nested output +result = { + 'level1': { + 'level2': { + 'level3': { + 'value': x * 2, + 'squared': x * x + } + } + } +} + +with open('output.txt', 'w') as f: + f.write(f"result={json.dumps(result)}\\n") +""") + os.chmod(calc_script, 0o755) + + model = { + "varprefix": "$", + "delim": "{}", + "output": { + "result": "grep 'result=' output.txt | cut -d'=' -f2" + } + } + + results = fz.fzr( + input_path="input.txt", + input_variables={"x": [3, 5]}, + model=model, + calculators=f"sh://python3 {calc_script}" + ) + + # Check deep nesting flattened correctly + assert 'result_level1_level2_level3_value' in results.columns + assert 'result_level1_level2_level3_squared' in results.columns + + # Check values + assert results['result_level1_level2_level3_value'].iloc[0] == 6 + assert results['result_level1_level2_level3_squared'].iloc[0] == 9 + assert results['result_level1_level2_level3_value'].iloc[1] == 10 + assert results['result_level1_level2_level3_squared'].iloc[1] == 25 + finally: + # Restore original directory to allow cleanup on Windows + os.chdir(original_cwd) + + def test_fzr_with_multiple_dict_outputs(self): + """Test fzr with multiple dict-valued outputs""" + with tempfile.TemporaryDirectory() as tmpdir: + # Save original directory to avoid Windows file deletion issues + original_cwd = os.getcwd() + try: + os.chdir(tmpdir) + + with open("input.txt", "w") as f: + f.write("x = ${x}\n") + + calc_script = Path(tmpdir) / "calc.py" + with open(calc_script, "w") as f: + f.write("""#!/usr/bin/env python3 +import json + +with open('input.txt', 'r') as f: + content = f.read() + x = int([line for line in content.split('\\n') if 'x =' in 
line][0].split('=')[1].strip()) + +stats = {'min': x - 1, 'max': x + 1} +meta = {'name': f'case{x}', 'id': x * 100} + +with open('output.txt', 'w') as f: + f.write(f"stats={json.dumps(stats)}\\n") + f.write(f"meta={json.dumps(meta)}\\n") +""") + os.chmod(calc_script, 0o755) + + model = { + "varprefix": "$", + "delim": "{}", + "output": { + "stats": "grep 'stats=' output.txt | cut -d'=' -f2", + "meta": "grep 'meta=' output.txt | cut -d'=' -f2" + } + } + + results = fz.fzr( + input_path="input.txt", + input_variables={"x": [5, 10]}, + model=model, + calculators=f"sh://python3 {calc_script}" + ) + + # Check both dicts flattened + assert 'stats_min' in results.columns + assert 'stats_max' in results.columns + assert 'meta_name' in results.columns + assert 'meta_id' in results.columns + + # Verify values + assert results['meta_name'].iloc[0] == 'case5' + assert results['meta_id'].iloc[0] == 500 + finally: + # Restore original directory to allow cleanup on Windows + os.chdir(original_cwd) + + +class TestEdgeCases: + """Test edge cases and error handling""" + + def test_dict_with_list_values(self): + """Test that dicts with list values are handled (lists not flattened further)""" + df = pd.DataFrame({ + 'x': [1], + 'data': [{'values': [1, 2, 3], 'count': 3}] + }) + result = flatten_dict_columns(df) + + assert 'data_values' in result.columns + assert 'data_count' in result.columns + # List should remain as list + assert result['data_values'].iloc[0] == [1, 2, 3] + + def test_inconsistent_dict_keys_across_rows(self): + """Test handling of dicts with different keys in different rows""" + df = pd.DataFrame({ + 'x': [1, 2, 3], + 'data': [ + {'a': 1, 'b': 2}, + {'a': 3, 'c': 4}, # Different key 'c' instead of 'b' + {'b': 5, 'c': 6} # Missing 'a' + ] + }) + result = flatten_dict_columns(df) + + # All keys should become columns + assert 'data_a' in result.columns + assert 'data_b' in result.columns + assert 'data_c' in result.columns + + # Missing values should be None/NaN + 
assert result['data_a'].iloc[0] == 1 + assert pd.isna(result['data_a'].iloc[2]) # Row 2 doesn't have 'a' + assert pd.isna(result['data_c'].iloc[0]) # Row 0 doesn't have 'c' + + def test_max_iterations_prevents_infinite_loop(self): + """Test that max iterations prevents infinite loops""" + # This is a safety check - normal dicts should never hit this limit + df = pd.DataFrame({ + 'x': [1], + 'data': [{'a': 1}] + }) + # Should complete without error even with iteration limit + result = flatten_dict_columns(df) + assert 'data_a' in result.columns + + +if __name__ == "__main__": + pytest.main([__file__, "-v"]) diff --git a/tests/test_fzd.py b/tests/test_fzd.py new file mode 100644 index 0000000..5d875bf --- /dev/null +++ b/tests/test_fzd.py @@ -0,0 +1,597 @@ +""" +Tests for fzd (iterative design of experiments with algorithms) +""" + +import os +import sys +import tempfile +import shutil +import pytest +from pathlib import Path + +# Add parent directory to path for importing fz +sys.path.insert(0, str(Path(__file__).parent.parent)) + +import fz +from fz.algorithms import parse_input_vars, evaluate_output_expression, load_algorithm + + +class TestParseInputVars: + """Test input variable range parsing""" + + def test_parse_simple_range(self): + """Test parsing simple ranges""" + result = parse_input_vars({"x": "[0;1]", "y": "[-5;5]"}) + assert result == {"x": (0.0, 1.0), "y": (-5.0, 5.0)} + + def test_parse_mixed_range_and_fixed(self): + """Test parsing mix of ranges and fixed values""" + from fz.algorithms import parse_fixed_vars + + input_vars = {"x": "[0;1]", "y": "0.5", "z": "[-2;2]"} + + # Variable ranges + ranges = parse_input_vars(input_vars) + assert ranges == {"x": (0.0, 1.0), "z": (-2.0, 2.0)} + + # Fixed values + fixed = parse_fixed_vars(input_vars) + assert fixed == {"y": 0.5} + + def test_parse_comma_delimiter(self): + """Test parsing with comma delimiter""" + result = parse_input_vars({"x": "[0,1]"}) + assert result == {"x": (0.0, 1.0)} + + def 
test_parse_float_range(self): + """Test parsing float ranges""" + result = parse_input_vars({"x": "[0.5;1.5]"}) + assert result == {"x": (0.5, 1.5)} + + def test_parse_invalid_format(self): + """Test parsing invalid format raises error""" + with pytest.raises(ValueError, match="Invalid format"): + parse_input_vars({"x": "0;1"}) # Missing brackets - not a valid range or fixed value + + def test_parse_invalid_order(self): + """Test parsing invalid order raises error""" + with pytest.raises(ValueError, match="min .* must be < max"): + parse_input_vars({"x": "[1;0]"}) # min > max + + +class TestEvaluateOutputExpression: + """Test output expression evaluation""" + + def test_simple_addition(self): + """Test simple addition""" + result = evaluate_output_expression("x + y", {"x": 1.0, "y": 2.0}) + assert result == 3.0 + + def test_multiplication(self): + """Test multiplication""" + result = evaluate_output_expression("x * 2", {"x": 3.0}) + assert result == 6.0 + + def test_complex_expression(self): + """Test complex expression""" + result = evaluate_output_expression("x + y * 2", {"x": 1.0, "y": 3.0}) + assert result == 7.0 + + def test_math_functions(self): + """Test math functions""" + result = evaluate_output_expression("sqrt(x)", {"x": 4.0}) + assert result == 2.0 + + def test_invalid_expression(self): + """Test invalid expression raises error""" + with pytest.raises(ValueError): + evaluate_output_expression("x + z", {"x": 1.0}) # z not defined + +class TestFzdIntegration: + """Integration tests for fzd function""" + + @pytest.fixture + def temp_dir(self): + """Create temporary directory for tests""" + tmpdir = tempfile.mkdtemp() + yield tmpdir + shutil.rmtree(tmpdir) + + @pytest.fixture + def simple_model(self, temp_dir): + """Create a simple test model""" + # Create input file + input_dir = Path(temp_dir) / "input" + input_dir.mkdir() + + input_file = input_dir / "input.txt" + input_file.write_text("x = $x\ny = $y\n") + + # Create model + model = { + "varprefix": 
"$", + "delim": "()", + "run": "bash -c 'source input.txt && result=$(echo \"scale=6; $x * $x + $y * $y\" | bc) && echo \"result = $result\" > output.txt'", + "output": { + "result": "grep 'result = ' output.txt | cut -d '=' -f2 | tr -d ' '" + } + } + + return input_dir, model + + def test_fzd_randomsampling(self, simple_model): + """Test fzd with randomsampling""" + input_dir, model = simple_model + + # Skip if bc is not available (used in model) + if shutil.which("bc") is None: + pytest.skip("bc command not available") + + # Path to randomsampling algorithm + algo_path = str(Path(__file__).parent.parent / "examples" / "algorithms" / "randomsampling.py") + + # Run fzd with randomsampling + result = fz.fzd( + input_path=str(input_dir), + input_variables={"x": "[0;1]", "y": "[0;1]"}, + model=model, + output_expression="result", + algorithm=algo_path, + algorithm_options={"nvalues": 3, "seed": 42} + ) + + assert result is not None + assert "XY" in result + assert len(result["XY"]) == 3 + assert "x" in result["XY"].columns + assert "y" in result["XY"].columns + assert "result" in result["XY"].columns # output_expression as column name + assert algo_path in result["algorithm"] # algorithm field contains the path + + # Removed test_fzd_requires_pandas - pandas is now a required dependency + + def test_fzd_returns_dataframe(self, simple_model): + """Test that fzd returns XY DataFrame with all X and Y values""" + input_dir, model = simple_model + + # Skip if bc is not available (used in model) + if shutil.which("bc") is None: + pytest.skip("bc command not available") + + # Path to randomsampling algorithm + algo_path = str(Path(__file__).parent.parent / "examples" / "algorithms" / "randomsampling.py") + + # Run fzd + result = fz.fzd( + input_path=str(input_dir), + input_variables={"x": "[0;1]", "y": "[0;1]"}, + model=model, + output_expression="result", # This becomes the column name + algorithm=algo_path, + algorithm_options={"nvalues": 3, "seed": 42} + ) + + # Check 
that XY DataFrame is included + assert 'XY' in result + assert result['XY'] is not None + + # Check DataFrame structure + df = result['XY'] + assert len(df) == 3 # 3 evaluations + assert 'x' in df.columns + assert 'y' in df.columns + assert 'result' in df.columns # Output column named with output_expression + + # Check that input_vars and output_values are not in result + assert 'input_vars' not in result + assert 'output_values' not in result + + # Verify data types and structure + assert df['x'].dtype == 'float64' + assert df['y'].dtype == 'float64' + # result column may have None values, so check object type + assert df['result'].dtype in ['float64', 'object'] + + def test_fzd_with_fixed_variables(self, simple_model): + """Test that fzd only varies non-fixed variables""" + input_dir, model = simple_model + + # Skip if bc is not available (used in model) + if shutil.which("bc") is None: + pytest.skip("bc command not available") + + # Path to randomsampling algorithm + algo_path = str(Path(__file__).parent.parent / "examples" / "algorithms" / "randomsampling.py") + + # Run fzd with one variable range and one fixed value + result = fz.fzd( + input_path=str(input_dir), + input_variables={ + "x": "[0;1]", # Variable - will be varied by algorithm + "y": "0.5" # Fixed - will NOT be varied + }, + model=model, + output_expression="result", + algorithm=algo_path, + algorithm_options={"nvalues": 3, "seed": 42} + ) + + # Check that XY DataFrame has both columns + assert 'XY' in result + df = result['XY'] + assert 'x' in df.columns + assert 'y' in df.columns + assert 'result' in df.columns + + # Check that y is fixed at 0.5 for all rows + assert len(df) == 3 + assert all(df['y'] == 0.5), "y should be fixed at 0.5 for all evaluations" + + # Check that x varies + assert len(df['x'].unique()) > 1, "x should vary across evaluations" + + def test_fzd_get_analysis_tmp(self, temp_dir): + """Test that get_analysis_tmp is called at each iteration if it exists""" + from unittest.mock 
import Mock, patch + + # Create a simple model + input_dir = Path(temp_dir) / "input" + input_dir.mkdir() + (input_dir / "input.txt").write_text("x = $x\n") + + model = { + "varprefix": "$", + "delim": "()", + "run": "echo 'result = 1.0' > output.txt", + "output": {"result": "grep 'result = ' output.txt | cut -d '=' -f2"} + } + + # Create algorithm with get_analysis_tmp + algo_file = Path(temp_dir) / "algo_with_tmp.py" + algo_file.write_text(""" +class TestAlgorithm: + def __init__(self, **options): + self.call_count = 0 + + def get_initial_design(self, input_vars, output_vars): + return [{"x": 0.5}] + + def get_next_design(self, X, Y): + # Run 2 iterations + self.call_count += 1 + if self.call_count < 2: + return [{"x": 0.7}] + return [] + + def get_analysis(self, X, Y): + return {"text": "Final", "data": {}} + + def get_analysis_tmp(self, X, Y): + return {"text": f"Iteration progress: {len(X)} samples", "data": {}} +""") + + # Mock logging to capture calls + with patch('fz.core.log_info') as mock_log: + result = fz.fzd( + input_path=str(input_dir), + input_variables={"x": "[0;1]"}, + model=model, + output_expression="result", + algorithm=str(algo_file) + ) + + # Verify get_analysis_tmp was called + # Should be called twice (once after each iteration) + tmp_calls = [call for call in mock_log.call_args_list + if 'intermediate results' in str(call)] + assert len(tmp_calls) >= 2, "get_analysis_tmp should be called at each iteration" + + +class TestLoadAlgorithmFromFile: + """Test loading algorithms from Python files""" + + @pytest.fixture + def temp_dir(self): + """Create temporary directory for tests""" + tmpdir = tempfile.mkdtemp() + yield tmpdir + shutil.rmtree(tmpdir) + + def test_load_algorithm_from_file(self, temp_dir): + """Test loading an algorithm from a Python file""" + # Create a simple algorithm file + algo_file = Path(temp_dir) / "simple_algo.py" + algo_file.write_text(""" +class SimpleAlgorithm: + def __init__(self, **options): + self.options = options 
+ self.nvalues = options.get("nvalues", 5) + + def get_initial_design(self, input_vars, output_vars): + # Return center point + return [{var: (bounds[0] + bounds[1]) / 2 for var, bounds in input_vars.items()}] + + def get_next_design(self, X, Y): + # No next design + return [] + + def get_analysis(self, X, Y): + return {"text": "Test results", "data": {}} +""") + + # Load algorithm from file + algo = load_algorithm(str(algo_file), nvalues=10) + assert algo is not None + assert algo.nvalues == 10 + + # Test initial design + input_vars = {"x": (0.0, 1.0), "y": (-5.0, 5.0)} + design = algo.get_initial_design(input_vars, ["output"]) + assert len(design) == 1 + assert design[0]["x"] == 0.5 + assert design[0]["y"] == 0.0 + + def test_load_montecarlo_algorithm(self): + """Test loading the MonteCarlo_Uniform algorithm from file""" + # Use the test_algorithm_montecarlo.py file we just created + algo_file = Path(__file__).parent / "test_algorithm_montecarlo.py" + + # Load algorithm from file + algo = load_algorithm(str(algo_file), batch_sample_size=5, seed=42) + assert algo is not None + + # Test initial design + input_vars = {"x": (0.0, 1.0), "y": (-5.0, 5.0)} + design = algo.get_initial_design(input_vars, ["output"]) + assert len(design) == 5 + + # Check that all points are within bounds + for point in design: + assert 0.0 <= point["x"] <= 1.0 + assert -5.0 <= point["y"] <= 5.0 + + def test_load_algorithm_with_metadata(self, temp_dir): + """Test loading algorithm with metadata comments""" + algo_file = Path(temp_dir) / "algo_with_metadata.py" + algo_file.write_text("""#title: Test Algorithm +#author: Test Author +#type: optimization +#options: param1=10;param2=0.5 +#require: numpy + +class TestAlgo: + def __init__(self, **options): + self.options = options + self.param1 = int(options.get("param1", 5)) + self.param2 = float(options.get("param2", 0.1)) + + def get_initial_design(self, input_vars, output_vars): + return [] + + def get_next_design(self, X, Y): + return [] + + 
def get_analysis(self, X, Y): + return {"text": "Test", "data": {}} +""") + + # Load algorithm (should use default options from metadata) + algo = load_algorithm(str(algo_file)) + assert algo.param1 == 10 + assert algo.param2 == 0.5 + + # Load with explicit options (should override metadata) + algo2 = load_algorithm(str(algo_file), param1=20) + assert algo2.param1 == 20 + assert algo2.param2 == 0.5 # Still from metadata + + def test_load_algorithm_invalid_file(self): + """Test loading from non-existent file""" + with pytest.raises(ValueError, match="Algorithm file not found"): + load_algorithm("nonexistent_algo.py") + + def test_load_algorithm_non_python_file(self, temp_dir): + """Test loading from non-.py/.R file""" + txt_file = Path(temp_dir) / "not_python.txt" + txt_file.write_text("Not a Python file") + + with pytest.raises(ValueError, match="must be a Python \\(\\.py\\) or R \\(\\.R\\) file"): + load_algorithm(str(txt_file)) + + def test_load_algorithm_no_class(self, temp_dir): + """Test loading from file with no algorithm class""" + algo_file = Path(temp_dir) / "no_class.py" + algo_file.write_text(""" +# This file has no algorithm class +def some_function(): + pass +""") + + with pytest.raises(ValueError, match="No valid algorithm class found"): + load_algorithm(str(algo_file)) + + def test_load_algorithm_with_require_installed(self, temp_dir): + """Test loading algorithm with #require: header for already installed packages""" + algo_file = Path(temp_dir) / "algo_with_require.py" + algo_file.write_text(""" +#title: Test Algorithm +#require: sys;os + +class TestAlgorithm: + def __init__(self, **options): + import sys + import os + self.options = options + + def get_initial_design(self, input_vars, output_vars): + return [{"x": 0.5}] + + def get_next_design(self, X, Y): + return [] + + def get_analysis(self, X, Y): + return {"text": "Test", "data": {}} +""") + + # Should load successfully without trying to install sys/os (they're built-in) + algo = 
load_algorithm(str(algo_file)) + assert algo is not None + + def test_load_algorithm_with_require_missing(self, temp_dir): + """Test that missing packages trigger installation attempt""" + from unittest.mock import patch, MagicMock + import fz.algorithms + + algo_file = Path(temp_dir) / "algo_missing_pkg.py" + algo_file.write_text(""" +#require: nonexistent_test_package_12345 + +class TestAlgorithm: + def __init__(self, **options): + self.options = options + + def get_initial_design(self, input_vars, output_vars): + return [{"x": 0.5}] + + def get_next_design(self, X, Y): + return [] + + def get_analysis(self, X, Y): + return {"text": "Test", "data": {}} +""") + + # Mock subprocess.check_call to fail (package doesn't exist) + with patch('fz.algorithms.subprocess.check_call') as mock_call: + mock_call.side_effect = fz.algorithms.subprocess.CalledProcessError(1, 'pip') + + # Should raise RuntimeError about failed installation + with pytest.raises(RuntimeError, match="Failed to install required package"): + load_algorithm(str(algo_file)) + + +class TestContentDetection: + """Test content type detection and processing""" + + @pytest.fixture + def temp_dir(self): + """Create temporary directory for tests""" + tmpdir = tempfile.mkdtemp() + yield tmpdir + shutil.rmtree(tmpdir) + + def test_detect_html_content(self): + """Test HTML content detection""" + from fz.io import detect_content_type + + html_text = "
<html><body>Hello</body></html>
" + assert detect_content_type(html_text) == 'html' + + html_text2 = "Test" + assert detect_content_type(html_text2) == 'html' + + def test_detect_json_content(self): + """Test JSON content detection""" + from fz.io import detect_content_type + + json_text = '{"key": "value", "number": 42}' + assert detect_content_type(json_text) == 'json' + + json_array = '[1, 2, 3, 4]' + assert detect_content_type(json_array) == 'json' + + def test_detect_keyvalue_content(self): + """Test key=value content detection""" + from fz.io import detect_content_type + + kv_text = """name = John +age = 30 +city = Paris""" + assert detect_content_type(kv_text) == 'keyvalue' + + def test_detect_markdown_content(self): + """Test markdown content detection""" + from fz.io import detect_content_type + + md_text = """# Header +## Subheader +* Item 1 +* Item 2""" + assert detect_content_type(md_text) == 'markdown' + + def test_parse_keyvalue(self): + """Test parsing key=value text""" + from fz.io import parse_keyvalue_text + + kv_text = """name = John Doe +age = 30 +city = Paris""" + result = parse_keyvalue_text(kv_text) + assert result == {'name': 'John Doe', 'age': '30', 'city': 'Paris'} + + def test_process_analysis_content_with_json(self, temp_dir): + """Test processing analysis content with JSON""" + from fz.io import process_analysis_content + + results_dir = Path(temp_dir) + analysis_dict = { + 'text': '{"mean": 1.5, "std": 0.3}', + 'data': {'samples': 10} + } + + processed = process_analysis_content(analysis_dict, 1, results_dir) + + assert 'json_data' in processed + assert processed['json_data']['mean'] == 1.5 + assert 'json_file' in processed + assert (results_dir / processed['json_file']).exists() + + def test_process_analysis_content_with_html(self, temp_dir): + """Test processing analysis content with HTML""" + from fz.io import process_analysis_content + + results_dir = Path(temp_dir) + analysis_dict = { + 'html': '
<div><h1>Results</h1><p>Test</p></div>
', + 'data': {} + } + + processed = process_analysis_content(analysis_dict, 1, results_dir) + + assert 'html_file' in processed + assert (results_dir / processed['html_file']).exists() + + def test_process_analysis_content_with_markdown(self, temp_dir): + """Test processing analysis content with markdown""" + from fz.io import process_analysis_content + + results_dir = Path(temp_dir) + analysis_dict = { + 'text': '# Results\n\n* Item 1\n* Item 2', + 'data': {} + } + + processed = process_analysis_content(analysis_dict, 1, results_dir) + + assert 'md_file' in processed + assert (results_dir / processed['md_file']).exists() + + def test_process_analysis_content_with_keyvalue(self, temp_dir): + """Test processing analysis content with key=value""" + from fz.io import process_analysis_content + + results_dir = Path(temp_dir) + analysis_dict = { + 'text': 'mean = 1.5\nstd = 0.3\nsamples = 100', + 'data': {} + } + + processed = process_analysis_content(analysis_dict, 1, results_dir) + + assert 'keyvalue_data' in processed + assert processed['keyvalue_data']['mean'] == '1.5' + assert 'txt_file' in processed + assert (results_dir / processed['txt_file']).exists() + + +if __name__ == "__main__": + pytest.main([__file__, "-v"]) diff --git a/tests/test_fzo_fzr_coherence.py b/tests/test_fzo_fzr_coherence.py index a029e77..0f877ec 100644 --- a/tests/test_fzo_fzr_coherence.py +++ b/tests/test_fzo_fzr_coherence.py @@ -13,17 +13,12 @@ import platform import fz - -try: - import pandas as pd - PANDAS_AVAILABLE = True -except ImportError: - PANDAS_AVAILABLE = False +import pandas as pd def _get_value(result, key, index): """Helper to get value from DataFrame or dict""" - if PANDAS_AVAILABLE and isinstance(result, pd.DataFrame): + if isinstance(result, pd.DataFrame): value = result[key].iloc[index] # Convert numpy types to native Python types if hasattr(value, 'item'): @@ -35,7 +30,7 @@ def _get_value(result, key, index): def _get_length(result, key): """Helper to get length from 
DataFrame or dict""" - if PANDAS_AVAILABLE and isinstance(result, pd.DataFrame): + if isinstance(result, pd.DataFrame): return len(result[key]) else: return len(result[key]) diff --git a/tests/test_interrupt_handling.py b/tests/test_interrupt_handling.py index cb11e08..baa5bcf 100644 --- a/tests/test_interrupt_handling.py +++ b/tests/test_interrupt_handling.py @@ -32,7 +32,10 @@ def test_interrupt_sequential_execution(tmp_path): script_file = tmp_path / "script.sh" # Each case takes 3 seconds - script_file.write_text("#!/bin/bash\nsleep 3\necho 'done' > output.txt\n") + with open(script_file, 'w', newline='\n') as f: + f.write("#!/bin/bash\n") + f.write("sleep 3\n") + f.write("echo 'done' > output.txt\n") script_file.chmod(0o755) # Create multiple cases @@ -100,7 +103,10 @@ def test_interrupt_parallel_execution(tmp_path): script_file = tmp_path / "script.sh" # Each case takes 3 seconds - script_file.write_text("#!/bin/bash\nsleep 3\necho 'done' > output.txt\n") + with open(script_file, 'w', newline='\n') as f: + f.write("#!/bin/bash\n") + f.write("sleep 3\n") + f.write("echo 'done' > output.txt\n") script_file.chmod(0o755) # Create multiple cases @@ -165,7 +171,10 @@ def test_graceful_cleanup_on_interrupt(tmp_path): script_file = tmp_path / "script.sh" # Each case takes 3 seconds - script_file.write_text("#!/bin/bash\nsleep 3\necho 'done' > output.txt\n") + with open(script_file, 'w', newline='\n') as f: + f.write("#!/bin/bash\n") + f.write("sleep 3\n") + f.write("echo 'done' > output.txt\n") script_file.chmod(0o755) input_variables = {"x": [1, 2, 3]} diff --git a/tests/test_no_algorithms.py b/tests/test_no_algorithms.py new file mode 100644 index 0000000..c1037bc --- /dev/null +++ b/tests/test_no_algorithms.py @@ -0,0 +1,476 @@ +""" +Negative tests for algorithms +Tests error handling for invalid algorithms, missing algorithm files, bad algorithm options, etc. 
+""" + +import os +import tempfile +from pathlib import Path +import pytest +import pandas as pd + +from fz import fzd + + +def test_algorithm_nonexistent_file(): + """Test fzd with a non-existent algorithm file""" + with tempfile.TemporaryDirectory() as tmpdir: + tmpdir = Path(tmpdir) + + # Create input file + input_file = tmpdir / "input.txt" + input_file.write_text("x = ${x}\n") + + # Create calculator script + calc_script = tmpdir / "calc.sh" + calc_script.write_text("#!/bin/bash\necho 'result = 42' > output.txt\n") + calc_script.chmod(0o755) + + model = { + "varprefix": "$", + "delim": "{}", + "output": { + "result": "grep 'result = ' output.txt | cut -d '=' -f2" + } + } + + analysis_dir = tmpdir / "analysis" + + # Use non-existent algorithm file + nonexistent_algo = tmpdir / "does_not_exist.py" + + # Should raise FileNotFoundError or ValueError + with pytest.raises((FileNotFoundError, ValueError, Exception)): + fzd( + input_path=str(input_file), + input_variables={"x": "[0;10]"}, + model=model, + output_expression="result", + algorithm=str(nonexistent_algo), + calculators=f"sh://bash {calc_script}", + analysis_dir=str(analysis_dir) + ) + + +def test_algorithm_invalid_python_syntax(): + """Test algorithm file with invalid Python syntax""" + with tempfile.TemporaryDirectory() as tmpdir: + tmpdir = Path(tmpdir) + + input_file = tmpdir / "input.txt" + input_file.write_text("x = ${x}\n") + + calc_script = tmpdir / "calc.sh" + calc_script.write_text("#!/bin/bash\necho 'result = 42' > output.txt\n") + calc_script.chmod(0o755) + + # Create algorithm file with syntax error + algo_file = tmpdir / "bad_syntax.py" + algo_file.write_text("class MyAlgo\n def invalid syntax here\n") + + model = { + "varprefix": "$", + "delim": "{}", + "output": {"result": "echo 42"} + } + + analysis_dir = tmpdir / "analysis" + + # Should raise SyntaxError or ImportError + with pytest.raises((SyntaxError, ImportError, Exception)): + fzd( + input_path=str(input_file), + input_variables={"x": 
"[0;10]"}, + model=model, + output_expression="result", + algorithm=str(algo_file), + calculators=f"sh://bash {calc_script}", + analysis_dir=str(analysis_dir) + ) + + +def test_algorithm_missing_required_methods(): + """Test algorithm missing required methods""" + with tempfile.TemporaryDirectory() as tmpdir: + tmpdir = Path(tmpdir) + + input_file = tmpdir / "input.txt" + input_file.write_text("x = ${x}\n") + + calc_script = tmpdir / "calc.sh" + calc_script.write_text("#!/bin/bash\necho 'result = 42' > output.txt\n") + calc_script.chmod(0o755) + + # Algorithm without required methods + algo_file = tmpdir / "incomplete.py" + algo_file.write_text(""" +class IncompleteAlgorithm: + def __init__(self, **options): + pass + # Missing get_initial_design, get_next_design, get_analysis +""") + + model = { + "varprefix": "$", + "delim": "{}", + "output": {"result": "echo 42"} + } + + analysis_dir = tmpdir / "analysis" + + # Should raise AttributeError when trying to call missing methods + with pytest.raises((AttributeError, TypeError, Exception)): + fzd( + input_path=str(input_file), + input_variables={"x": "[0;10]"}, + model=model, + output_expression="result", + algorithm=str(algo_file), + calculators=f"sh://bash {calc_script}", + analysis_dir=str(analysis_dir) + ) + + +def test_algorithm_empty_file(): + """Test algorithm with empty file""" + with tempfile.TemporaryDirectory() as tmpdir: + tmpdir = Path(tmpdir) + + input_file = tmpdir / "input.txt" + input_file.write_text("x = ${x}\n") + + calc_script = tmpdir / "calc.sh" + calc_script.write_text("#!/bin/bash\necho 'result = 42' > output.txt\n") + calc_script.chmod(0o755) + + # Empty algorithm file + algo_file = tmpdir / "empty.py" + algo_file.write_text("") + + model = { + "varprefix": "$", + "delim": "{}", + "output": {"result": "echo 42"} + } + + analysis_dir = tmpdir / "analysis" + + # Should raise error (no algorithm class found) + with pytest.raises((ValueError, AttributeError, ImportError, Exception)): + fzd( + 
input_path=str(input_file), + input_variables={"x": "[0;10]"}, + model=model, + output_expression="result", + algorithm=str(algo_file), + calculators=f"sh://bash {calc_script}", + analysis_dir=str(analysis_dir) + ) + + +def test_algorithm_with_none_value(): + """Test fzd when algorithm is None""" + with tempfile.TemporaryDirectory() as tmpdir: + tmpdir = Path(tmpdir) + + input_file = tmpdir / "input.txt" + input_file.write_text("x = ${x}\n") + + calc_script = tmpdir / "calc.sh" + calc_script.write_text("#!/bin/bash\necho 'result = 42' > output.txt\n") + calc_script.chmod(0o755) + + model = { + "varprefix": "$", + "delim": "{}", + "output": {"result": "echo 42"} + } + + analysis_dir = tmpdir / "analysis" + + # algorithm is None + with pytest.raises((TypeError, ValueError, AttributeError)): + fzd( + input_path=str(input_file), + input_variables={"x": "[0;10]"}, + model=model, + output_expression="result", + algorithm=None, + calculators=f"sh://bash {calc_script}", + analysis_dir=str(analysis_dir) + ) + + +def test_algorithm_invalid_type(): + """Test fzd with non-string algorithm parameter""" + with tempfile.TemporaryDirectory() as tmpdir: + tmpdir = Path(tmpdir) + + input_file = tmpdir / "input.txt" + input_file.write_text("x = ${x}\n") + + calc_script = tmpdir / "calc.sh" + calc_script.write_text("#!/bin/bash\necho 'result = 42' > output.txt\n") + calc_script.chmod(0o755) + + model = { + "varprefix": "$", + "delim": "{}", + "output": {"result": "echo 42"} + } + + analysis_dir = tmpdir / "analysis" + + # algorithm is a dict instead of string + with pytest.raises((TypeError, ValueError, AttributeError)): + fzd( + input_path=str(input_file), + input_variables={"x": "[0;10]"}, + model=model, + output_expression="result", + algorithm={"not": "a", "string": True}, + calculators=f"sh://bash {calc_script}", + analysis_dir=str(analysis_dir) + ) + + +def test_algorithm_options_invalid_type(): + """Test fzd with algorithm_options that's not a dict""" + with 
tempfile.TemporaryDirectory() as tmpdir: + tmpdir = Path(tmpdir) + + input_file = tmpdir / "input.txt" + input_file.write_text("x = ${x}\n") + + calc_script = tmpdir / "calc.sh" + calc_script.write_text("#!/bin/bash\necho 'result = 42' > output.txt\n") + calc_script.chmod(0o755) + + # Create a minimal valid algorithm + algo_file = tmpdir / "simple.py" + algo_file.write_text(""" +class SimpleAlgorithm: + def __init__(self, **options): + pass + def get_initial_design(self, input_vars, output_vars): + return [{"x": 5.0}] + def get_next_design(self, input_vars, output_values): + return [] + def get_analysis(self, input_vars, output_values): + return "Done" +""") + + model = { + "varprefix": "$", + "delim": "{}", + "output": {"result": "echo 42"} + } + + analysis_dir = tmpdir / "analysis" + + # algorithm_options is a list instead of dict + with pytest.raises((TypeError, ValueError)): + fzd( + input_path=str(input_file), + input_variables={"x": "[0;10]"}, + model=model, + output_expression="result", + algorithm=str(algo_file), + calculators=f"sh://bash {calc_script}", + algorithm_options=["not", "a", "dict"], + analysis_dir=str(analysis_dir) + ) + + +def test_algorithm_with_missing_dependencies(): + """Test algorithm that requires missing dependencies""" + with tempfile.TemporaryDirectory() as tmpdir: + tmpdir = Path(tmpdir) + + input_file = tmpdir / "input.txt" + input_file.write_text("x = ${x}\n") + + calc_script = tmpdir / "calc.sh" + calc_script.write_text("#!/bin/bash\necho 'result = 42' > output.txt\n") + calc_script.chmod(0o755) + + # Algorithm that requires non-existent package + algo_file = tmpdir / "requires_deps.py" + algo_file.write_text(""" +__require__ = ["nonexistent_package_xyz_12345"] + +class MyAlgorithm: + def __init__(self, **options): + pass + def get_initial_design(self, input_vars, output_vars): + return [{"x": 5.0}] + def get_next_design(self, input_vars, output_values): + return [] + def get_analysis(self, input_vars, output_values): + return 
"Done" +""") + + model = { + "varprefix": "$", + "delim": "{}", + "output": {"result": "echo 42"} + } + + analysis_dir = tmpdir / "analysis" + + # Should warn about missing dependencies but may still try to run + # (The actual behavior depends on implementation) + try: + result = fzd( + input_path=str(input_file), + input_variables={"x": "[0;10]"}, + model=model, + output_expression="result", + algorithm=str(algo_file), + calculators=f"sh://bash {calc_script}", + analysis_dir=str(analysis_dir) + ) + # May succeed with warnings + except (ImportError, ModuleNotFoundError): + # Or may fail if algorithm actually tries to import + pass + + +def test_algorithm_no_class_defined(): + """Test algorithm file with no class defined""" + with tempfile.TemporaryDirectory() as tmpdir: + tmpdir = Path(tmpdir) + + input_file = tmpdir / "input.txt" + input_file.write_text("x = ${x}\n") + + calc_script = tmpdir / "calc.sh" + calc_script.write_text("#!/bin/bash\necho 'result = 42' > output.txt\n") + calc_script.chmod(0o755) + + # Algorithm file with only functions, no class + algo_file = tmpdir / "no_class.py" + algo_file.write_text(""" +def some_function(): + return 42 + +def another_function(x): + return x * 2 +""") + + model = { + "varprefix": "$", + "delim": "{}", + "output": {"result": "echo 42"} + } + + analysis_dir = tmpdir / "analysis" + + # Should raise error (no algorithm class found) + with pytest.raises((ValueError, AttributeError, Exception)): + fzd( + input_path=str(input_file), + input_variables={"x": "[0;10]"}, + model=model, + output_expression="result", + algorithm=str(algo_file), + calculators=f"sh://bash {calc_script}", + analysis_dir=str(analysis_dir) + ) + + +def test_algorithm_with_runtime_error_in_init(): + """Test algorithm that raises error in __init__""" + with tempfile.TemporaryDirectory() as tmpdir: + tmpdir = Path(tmpdir) + + input_file = tmpdir / "input.txt" + input_file.write_text("x = ${x}\n") + + calc_script = tmpdir / "calc.sh" + 
calc_script.write_text("#!/bin/bash\necho 'result = 42' > output.txt\n") + calc_script.chmod(0o755) + + # Algorithm that raises error during initialization + algo_file = tmpdir / "error_init.py" + algo_file.write_text(""" +class ErrorAlgorithm: + def __init__(self, **options): + raise RuntimeError("Initialization failed!") + def get_initial_design(self, input_vars, output_vars): + return [{"x": 5.0}] + def get_next_design(self, input_vars, output_values): + return [] + def get_analysis(self, input_vars, output_values): + return "Done" +""") + + model = { + "varprefix": "$", + "delim": "{}", + "output": {"result": "echo 42"} + } + + analysis_dir = tmpdir / "analysis" + + # Should propagate the RuntimeError + with pytest.raises((RuntimeError, Exception)): + fzd( + input_path=str(input_file), + input_variables={"x": "[0;10]"}, + model=model, + output_expression="result", + algorithm=str(algo_file), + calculators=f"sh://bash {calc_script}", + analysis_dir=str(analysis_dir) + ) + + +def test_algorithm_returns_invalid_initial_design(): + """Test algorithm that returns invalid initial design""" + with tempfile.TemporaryDirectory() as tmpdir: + tmpdir = Path(tmpdir) + + input_file = tmpdir / "input.txt" + input_file.write_text("x = ${x}\n") + + calc_script = tmpdir / "calc.sh" + calc_script.write_text("#!/bin/bash\necho 'result = 42' > output.txt\n") + calc_script.chmod(0o755) + + # Algorithm that returns invalid design format + algo_file = tmpdir / "bad_design.py" + algo_file.write_text(""" +class BadDesignAlgorithm: + def __init__(self, **options): + pass + def get_initial_design(self, input_vars, output_vars): + return "not a list" # Should return a list of dicts, e.g. [{"x": 5.0}] + def get_next_design(self, input_vars, output_values): + return [] + def get_analysis(self, input_vars, output_values): + return "Done" +""") + + model = { + "varprefix": "$", + "delim": "{}", + "output": {"result": "echo 42"} + } + + analysis_dir = tmpdir / "analysis" + + # Should fail when trying to process invalid 
design + with pytest.raises((TypeError, ValueError, Exception)): + fzd( + input_path=str(input_file), + input_variables={"x": "[0;10]"}, + model=model, + output_expression="result", + algorithm=str(algo_file), + calculators=f"sh://bash {calc_script}", + analysis_dir=str(analysis_dir) + ) + + + diff --git a/tests/test_platform_specific.py b/tests/test_platform_specific.py new file mode 100644 index 0000000..9b3e52a --- /dev/null +++ b/tests/test_platform_specific.py @@ -0,0 +1,65 @@ +""" +Platform-specific tests for fz + +These tests verify platform-specific functionality like interrupt handling +on different operating systems. +""" + +import pytest +import sys +import time +import platform +import tempfile +from pathlib import Path + + +class TestInterruptHandling: + """Test interrupt handling (Ctrl+C) on different platforms""" + + @pytest.mark.skipif( + platform.system() != "Windows", + reason="Windows-specific interrupt test" + ) + def test_windows_interrupt_basic(self): + """Test basic interrupt handling on Windows + + Note: This test cannot actually trigger Ctrl+C automatically. + It verifies that the interrupt mechanism is set up correctly. + """ + # Test that KeyboardInterrupt can be caught + caught_interrupt = False + try: + raise KeyboardInterrupt() + except KeyboardInterrupt: + caught_interrupt = True + + assert caught_interrupt, "KeyboardInterrupt should be catchable" + + @pytest.mark.skipif( + platform.system() != "Windows", + reason="Windows-specific test" + ) + @pytest.mark.slow + @pytest.mark.manual + def test_windows_fz_interrupt(self): + """Manual test for FZ interrupt handling on Windows + + This test should be run manually with Ctrl+C to verify interrupt handling. + It is marked as 'manual' and skipped by default. 
+ """ + pytest.skip("Manual test - requires user interaction (Ctrl+C)") + + +class TestPandasRequirement: + """Test that fzd properly requires pandas""" + + def test_pandas_is_available(self): + """Test that pandas is installed and importable""" + try: + import pandas as pd + assert pd is not None + except ImportError: + pytest.fail("pandas should be installed for fzd to work") + + # Removed test_fzd_imports_pandas and test_fzd_requires_pandas_error_message + # pandas is now a required dependency, not optional diff --git a/tests/test_r_algorithms.py b/tests/test_r_algorithms.py new file mode 100644 index 0000000..927b1ff --- /dev/null +++ b/tests/test_r_algorithms.py @@ -0,0 +1,395 @@ +#!/usr/bin/env python3 +""" +Test R algorithm loading and integration with fzd + +This test suite verifies that: +1. R algorithms can be loaded via rpy2 +2. RAlgorithmWrapper correctly wraps R S3 class instances +3. All algorithm methods work correctly (get_initial_design, get_next_design, get_analysis, get_analysis_tmp) +4. Data type conversion between Python and R works correctly +5. R algorithms integrate seamlessly with fzd +""" + +import pytest +import sys +from pathlib import Path + +# Try to import rpy2 +try: + import rpy2 + import rpy2.robjects + HAS_RPY2 = True +except ImportError: + HAS_RPY2 = False + +# Skip all tests in this module if rpy2 is not available +pytestmark = pytest.mark.skipif( + not HAS_RPY2, + reason="rpy2 is required for R algorithm tests. 
Install with: pip install rpy2" +) + + +def test_r_algorithm_loading(): + """Test loading R algorithm with load_algorithm""" + from fz.algorithms import load_algorithm + + # Get path to R algorithm + repo_root = Path(__file__).parent.parent + r_algo_path = repo_root / "examples" / "algorithms" / "montecarlo_uniform.R" + + # Load R algorithm + algo = load_algorithm( + str(r_algo_path), + batch_sample_size=5, + max_iterations=3, + confidence=0.9, + target_confidence_range=0.5, + seed=42 + ) + + # Verify wrapper was created + from fz.algorithms import RAlgorithmWrapper + assert isinstance(algo, RAlgorithmWrapper) + assert algo.r_instance is not None + assert algo.r_globals is not None + + +def test_r_algorithm_get_initial_design(): + """Test get_initial_design method with R algorithm""" + from fz.algorithms import load_algorithm + + # Get path to R algorithm + repo_root = Path(__file__).parent.parent + r_algo_path = repo_root / "examples" / "algorithms" / "montecarlo_uniform.R" + + # Load R algorithm + algo = load_algorithm(str(r_algo_path), batch_sample_size=5, seed=42) + + # Call get_initial_design + input_vars = { + "x": (0.0, 10.0), + "y": (-5.0, 5.0) + } + output_vars = ["result"] + + initial_design = algo.get_initial_design(input_vars, output_vars) + + # Verify result + assert isinstance(initial_design, list) + assert len(initial_design) == 5 + assert all(isinstance(point, dict) for point in initial_design) + assert all("x" in point and "y" in point for point in initial_design) + assert all(0.0 <= point["x"] <= 10.0 for point in initial_design) + assert all(-5.0 <= point["y"] <= 5.0 for point in initial_design) + + +def test_r_algorithm_get_next_design(): + """Test get_next_design method with R algorithm""" + from fz.algorithms import load_algorithm + + # Get path to R algorithm + repo_root = Path(__file__).parent.parent + r_algo_path = repo_root / "examples" / "algorithms" / "montecarlo_uniform.R" + + # Load R algorithm + algo = load_algorithm(str(r_algo_path), 
batch_sample_size=5, seed=42) + + # Get initial design + input_vars = {"x": (0.0, 10.0), "y": (-5.0, 5.0)} + output_vars = ["result"] + X = algo.get_initial_design(input_vars, output_vars) + + # Simulate outputs + Y = [point["x"]**2 + point["y"]**2 for point in X] + + # Call get_next_design + next_design = algo.get_next_design(X, Y) + + # Verify result + assert isinstance(next_design, list) + assert len(next_design) == 5 + assert all(isinstance(point, dict) for point in next_design) + + +def test_r_algorithm_get_next_design_with_none(): + """Test get_next_design handles None values in outputs""" + from fz.algorithms import load_algorithm + + # Get path to R algorithm + repo_root = Path(__file__).parent.parent + r_algo_path = repo_root / "examples" / "algorithms" / "montecarlo_uniform.R" + + # Load R algorithm + algo = load_algorithm(str(r_algo_path), batch_sample_size=5, seed=42) + + # Get initial design + input_vars = {"x": (0.0, 10.0), "y": (-5.0, 5.0)} + output_vars = ["result"] + X = algo.get_initial_design(input_vars, output_vars) + + # Simulate outputs with some None values (failed evaluations) + Y = [] + for i, point in enumerate(X): + if i % 2 == 0: + Y.append(point["x"]**2 + point["y"]**2) + else: + Y.append(None) # Failed evaluation + + # Call get_next_design - should handle None values + next_design = algo.get_next_design(X, Y) + + # Verify result (should still generate next design) + assert isinstance(next_design, list) + + +def test_r_algorithm_get_analysis(): + """Test get_analysis method with R algorithm""" + from fz.algorithms import load_algorithm + + # Get path to R algorithm + repo_root = Path(__file__).parent.parent + r_algo_path = repo_root / "examples" / "algorithms" / "montecarlo_uniform.R" + + # Load R algorithm + algo = load_algorithm(str(r_algo_path), batch_sample_size=5, seed=42) + + # Get initial design + input_vars = {"x": (0.0, 10.0), "y": (-5.0, 5.0)} + output_vars = ["result"] + X = algo.get_initial_design(input_vars, output_vars) + 
+ # Simulate outputs + Y = [point["x"]**2 + point["y"]**2 for point in X] + + # Call get_analysis + analysis = algo.get_analysis(X, Y) + + # Verify result + assert isinstance(analysis, dict) + assert "text" in analysis + assert "data" in analysis + assert isinstance(analysis["text"], str) + assert isinstance(analysis["data"], dict) + assert "mean" in analysis["data"] + assert "std" in analysis["data"] + + +def test_r_algorithm_get_analysis_tmp(): + """Test get_analysis_tmp method with R algorithm""" + from fz.algorithms import load_algorithm + + # Get path to R algorithm + repo_root = Path(__file__).parent.parent + r_algo_path = repo_root / "examples" / "algorithms" / "montecarlo_uniform.R" + + # Load R algorithm + algo = load_algorithm(str(r_algo_path), batch_sample_size=5, seed=42) + + # Get initial design + input_vars = {"x": (0.0, 10.0), "y": (-5.0, 5.0)} + output_vars = ["result"] + X = algo.get_initial_design(input_vars, output_vars) + + # Simulate outputs + Y = [point["x"]**2 + point["y"]**2 for point in X] + + # Call get_analysis_tmp + tmp_analysis = algo.get_analysis_tmp(X, Y) + + # Verify result (method is optional, may return None if not implemented) + if tmp_analysis is not None: + assert isinstance(tmp_analysis, dict) + assert "text" in tmp_analysis or "data" in tmp_analysis + + +def test_r_algorithm_html_output(): + """Test that R algorithm can generate HTML output""" + from fz.algorithms import load_algorithm + + # Get path to R algorithm + repo_root = Path(__file__).parent.parent + r_algo_path = repo_root / "examples" / "algorithms" / "montecarlo_uniform.R" + + # Load R algorithm + algo = load_algorithm(str(r_algo_path), batch_sample_size=10, seed=42) + + # Get some data + input_vars = {"x": (0.0, 10.0), "y": (-5.0, 5.0)} + output_vars = ["result"] + X = algo.get_initial_design(input_vars, output_vars) + Y = [point["x"]**2 + point["y"]**2 for point in X] + + # Get more data + X2 = algo.get_next_design(X, Y) + for point in X2: + X.append(point) + 
Y.append(point["x"]**2 + point["y"]**2) + + # Call get_analysis + analysis = algo.get_analysis(X, Y) + + # Verify HTML output exists (if base64enc is available in R) + # HTML generation is optional and depends on base64enc package + if "html" in analysis: + assert isinstance(analysis["html"], str) + assert len(analysis["html"]) > 0 + + +def test_r_algorithm_empty_next_design(): + """Test that R algorithm returns empty list when finished""" + from fz.algorithms import load_algorithm + + # Get path to R algorithm + repo_root = Path(__file__).parent.parent + r_algo_path = repo_root / "examples" / "algorithms" / "montecarlo_uniform.R" + + # Load R algorithm with an iteration cap of 1 so it stops after the initial design + algo = load_algorithm( + str(r_algo_path), + batch_sample_size=100, + max_iterations=1, # Only 1 iteration allowed + seed=42 + ) + + # Get initial design + input_vars = {"x": (0.0, 10.0), "y": (-5.0, 5.0)} + output_vars = ["result"] + X = algo.get_initial_design(input_vars, output_vars) + + # Simulate outputs + Y = [point["x"]**2 + point["y"]**2 for point in X] + + # Call get_next_design - should return empty list (max iterations reached) + next_design = algo.get_next_design(X, Y) + + # Verify empty list is returned + assert isinstance(next_design, list) + assert len(next_design) == 0 + + +def test_r_algorithm_convergence(): + """Test that R algorithm converges based on confidence interval""" + from fz.algorithms import load_algorithm + + # Get path to R algorithm + repo_root = Path(__file__).parent.parent + r_algo_path = repo_root / "examples" / "algorithms" / "montecarlo_uniform.R" + + # Load R algorithm with very loose convergence criteria (easy to reach) + algo = load_algorithm( + str(r_algo_path), + batch_sample_size=50, + max_iterations=10, + confidence=0.9, + target_confidence_range=100.0, # Very large target - should converge quickly + seed=42 + ) + + # Get initial design + input_vars = {"x": (0.0, 10.0), "y": (-5.0, 5.0)} + output_vars = ["result"] + X = 
algo.get_initial_design(input_vars, output_vars) + + # Simulate outputs (constant function - will have tight confidence interval) + Y = [50.0 for _ in X] # All same value + + # Call get_next_design - should return empty list (converged) + next_design = algo.get_next_design(X, Y) + + # Verify empty list is returned (algorithm converged) + assert isinstance(next_design, list) + assert len(next_design) == 0 + + +def test_r_algorithm_data_type_conversion(): + """Test that data types are correctly converted between Python and R""" + from fz.algorithms import load_algorithm + + # Get path to R algorithm + repo_root = Path(__file__).parent.parent + r_algo_path = repo_root / "examples" / "algorithms" / "montecarlo_uniform.R" + + # Load R algorithm + algo = load_algorithm(str(r_algo_path), batch_sample_size=3, seed=42) + + # Get initial design + input_vars = {"x": (0.0, 10.0), "y": (-5.0, 5.0)} + output_vars = ["result"] + X = algo.get_initial_design(input_vars, output_vars) + + # Test with various output types + Y = [ + 10.5, # float + None, # None -> NULL in R + 25.3 # float + ] + + # Call get_next_design - should handle mixed types + next_design = algo.get_next_design(X, Y) + + # Verify it works + assert isinstance(next_design, list) + + # Call get_analysis + analysis = algo.get_analysis(X, Y) + + # Verify analysis data types + assert isinstance(analysis["data"]["mean"], float) + assert isinstance(analysis["data"]["std"], float) + assert isinstance(analysis["data"]["n_samples"], (int, float)) + + +def test_r_algorithm_multiple_variables(): + """Test R algorithm with multiple input variables""" + from fz.algorithms import load_algorithm + + # Get path to R algorithm + repo_root = Path(__file__).parent.parent + r_algo_path = repo_root / "examples" / "algorithms" / "montecarlo_uniform.R" + + # Load R algorithm + algo = load_algorithm(str(r_algo_path), batch_sample_size=5, seed=42) + + # Get initial design with 3 variables + input_vars = { + "x": (0.0, 10.0), + "y": (-5.0, 
5.0), + "z": (1.0, 3.0) + } + output_vars = ["result"] + X = algo.get_initial_design(input_vars, output_vars) + + # Verify all variables are present + assert all("x" in point and "y" in point and "z" in point for point in X) + assert all(0.0 <= point["x"] <= 10.0 for point in X) + assert all(-5.0 <= point["y"] <= 5.0 for point in X) + assert all(1.0 <= point["z"] <= 3.0 for point in X) + + +def test_r_algorithm_error_handling(): + """Test that loading non-existent R file raises appropriate error""" + from fz.algorithms import load_algorithm + + # Try to load non-existent R file + with pytest.raises(ValueError, match="Algorithm file not found"): + load_algorithm("nonexistent_algorithm.R") + + +def test_r_algorithm_invalid_extension(): + """Test that loading file with wrong extension raises error""" + from fz.algorithms import load_algorithm + import tempfile + + # Create a temp file with wrong extension + with tempfile.NamedTemporaryFile(suffix=".txt", delete=False) as f: + temp_path = f.name + f.write(b"dummy content") + + try: + # Try to load file with wrong extension + with pytest.raises(ValueError, match="must be a Python \\(\\.py\\) or R \\(\\.R\\) file"): + load_algorithm(temp_path) + finally: + # Clean up + import os + os.unlink(temp_path)
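Both `tests/test_no_algorithms.py` and `tests/test_r_algorithms.py` exercise the same three-method algorithm interface (`get_initial_design`, `get_next_design`, `get_analysis`). As a reading aid, here is a minimal pure-Python sketch of that interface. It assumes, as the R tests do, that `input_vars` maps each variable name to a `(min, max)` tuple, that a design is a list of dicts keyed by variable name, and that an empty next design means the algorithm has finished. The class name `UniformSampler` is hypothetical, and its options `batch_sample_size`, `max_iterations`, and `seed` only mirror the names passed to `montecarlo_uniform.R`; this is not the project's implementation.

```python
import random
import statistics


class UniformSampler:
    """Hypothetical algorithm: uniform random sampling in fixed-size batches."""

    def __init__(self, **options):
        # Option names mirror those used in the R tests; defaults are illustrative.
        self.batch_sample_size = int(options.get("batch_sample_size", 5))
        self.max_iterations = int(options.get("max_iterations", 3))
        self.rng = random.Random(options.get("seed"))
        self.input_vars = {}
        self.iteration = 0

    def _sample(self):
        # One batch of points, each a dict keyed by variable name.
        return [
            {name: self.rng.uniform(lo, hi) for name, (lo, hi) in self.input_vars.items()}
            for _ in range(self.batch_sample_size)
        ]

    def get_initial_design(self, input_vars, output_vars):
        # Remember the bounds so later batches can be drawn from them.
        self.input_vars = dict(input_vars)
        self.iteration = 1
        return self._sample()

    def get_next_design(self, input_vars, output_values):
        # Assumption: an empty list signals that the algorithm is finished.
        if self.iteration >= self.max_iterations:
            return []
        self.iteration += 1
        return self._sample()

    def get_analysis(self, input_vars, output_values):
        # Failed evaluations are assumed to arrive as None and are skipped.
        valid = [y for y in output_values if y is not None]
        mean = statistics.fmean(valid) if valid else float("nan")
        std = statistics.stdev(valid) if len(valid) > 1 else 0.0
        return {
            "text": f"mean = {mean}\nstd = {std}\nsamples = {len(valid)}",
            "data": {"mean": mean, "std": std, "n_samples": len(valid)},
        }
```

The `text` field uses the same `key = value` layout that `test_process_analysis_content_with_keyvalue` feeds to `process_analysis_content`, so a file like this could plausibly be passed to `fzd` through `algorithm=`, although that pairing is untested here.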