Skip to content

pulseengine/loom

Repository files navigation

LOOM - Formally Verified WebAssembly Optimizer

Status License Tests

A high-performance WebAssembly optimizer with formal verification support, featuring a comprehensive 12-phase optimization pipeline built with ISLE (Instruction Selection and Lowering Engine) and Z3 SMT solver integration.

✨ Features

πŸš€ Comprehensive Optimization Pipeline

  • 12 optimization phases including constant folding, strength reduction, CSE, inlining, LICM, and DCE
  • Declarative rules using ISLE DSL for maintainable optimizations
  • Stateful dataflow analysis tracking locals and memory state
  • Idempotent optimizations - safe to run multiple times

πŸ”¬ Formal Verification

  • Z3 SMT-based verification proves optimization correctness via translation validation
  • Property-based testing ensures idempotence and validity
  • Counterexample generation for debugging failed optimizations
  • Optional verification feature (build with --features verification)

⚑ Performance

  • Ultra-fast: 10-30 Β΅s optimization time for most modules
  • Excellent compression: 80-95% binary size reduction
  • Instruction optimization: 0-40% instruction count reduction (varies by code)
  • Lightweight: Minimal dependencies, pure Rust implementation

🎯 Advanced Features

  • Component Model support for modern WebAssembly
  • wasm32-wasip2 build target support
  • Comprehensive benchmarking with Criterion
  • Full WAT and WASM format support

πŸ›οΈ Architecture

Loom is built with a modular architecture featuring a shared foundation:

  • loom-shared: Core ISLE definitions and WebAssembly IR (stable API)
  • loom-core: Optimization pipeline implementation
  • loom-cli: Command-line interface and tooling
  • loom-testing: Differential testing framework

The loom-shared crate provides a stable, reusable foundation that other WebAssembly tools can build upon. It contains:

  • ISLE term definitions for all WebAssembly instructions
  • Module IR (Module, Function, Instruction types)
  • WASM parsing and encoding utilities
  • Z3 verification infrastructure (optional)

This architecture enables both rapid prototyping in Loom and potential use in safety-critical applications through derived tools.

πŸ“– See LOOM_SYNTH_ARCHITECTURE.md for detailed architecture documentation.

πŸ“¦ Quick Start

Installation

# Build from source
git clone https://github.com/pulseengine/loom
cd loom
cargo build --release

# Binary at target/release/loom
./target/release/loom --help

Basic Usage

# Optimize WebAssembly module
loom optimize input.wasm -o output.wasm

# Show detailed statistics
loom optimize input.wat -o output.wasm --stats

# Run verification checks
loom optimize input.wasm -o output.wasm --verify

# Output as WAT text format
loom optimize input.wasm -o output.wat --wat

Example Output

πŸ”§ LOOM Optimizer v0.1.0
Input: input.wasm
βœ“ Parsed in 234Β΅s
⚑ Optimizing...
βœ“ Optimized in 0 ms

πŸ“Š Optimization Statistics
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Instructions: 24 β†’ 20 (16.7% reduction)
Binary size:  797 β†’ 92 bytes (88.5% reduction)
Constant folds: 3
Optimization time: 0 ms
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

βœ… Optimization complete!

πŸ—οΈ 12-Phase Optimization Pipeline

Phase Optimization Example Impact
1 Precompute Global constant propagation Enables folding
2 ISLE Folding 10 + 20 β†’ 30 Compile-time evaluation
3 Strength Reduction x * 8 β†’ x << 3 2-3x speedup
4 CSE Cache duplicate computations Reduces redundancy
5 Function Inlining Inline small functions Call overhead removal
6 ISLE (Post-inline) Fold exposed constants Cross-function optimization
7 Code Folding Flatten blocks Control flow simplification
8 LICM Hoist loop invariants Loop speedup
9 Branch Simplify Simplify conditionals Fewer branches
10 DCE Remove unreachable code Code cleanup
11 Block Merge Merge consecutive blocks Reduce overhead
12 Vacuum & Locals Remove unused variables Final cleanup

πŸ“Š Benchmark Results

Performance (Criterion benchmarks)

Constant Folding:        8-11 Β΅s
Strength Reduction:      10 Β΅s
CSE:                     9-14 Β΅s
Function Inlining:       16-18 Β΅s
Full Pipeline:           19-28 Β΅s
Parser:                  6.8 Β΅s
Encoder:                 183 ns (!)

Real-World Results

Fixture Instructions Binary Size Constant Folds
bench_bitops 24 β†’ 20 (16.7%) 88.5% reduction 0
test_input 9 β†’ 7 (22.2%) 81.6% reduction 1
fibonacci 12 β†’ 12 (0%) 92.6% reduction 0
quicksort Complex 92.5% reduction 0
game_logic Complex 92.5% reduction 0

Key Insight: Binary size reductions are consistently excellent (80-93%), while instruction count improvements vary by code complexity.

πŸ”¬ Formal Verification

LOOM supports two verification modes:

1. Property-Based (Always Available)

loom optimize input.wasm -o output.wasm --verify
  • Fast idempotence checks
  • Constant folding validation
  • ~5ms overhead

2. Z3 SMT Formal Proof (Optional)

# Install Z3
brew install z3  # macOS
sudo apt install z3  # Linux

# Build with verification
cargo build --release --features verification

# Verify with formal proof
./target/release/loom optimize input.wasm -o output.wasm --verify

Output:

πŸ”¬ Running Z3 SMT verification...
  βœ… Z3 verification passed: optimizations are semantically equivalent

Z3 verification proves mathematically that optimizations preserve program semantics via translation validation.

πŸ’‘ Examples

Example 1: Strength Reduction

Input:

(module
  (func $optimize_me (param $x i32) (result i32)
    local.get $x
    i32.const 8
    i32.mul
  )
)

After Optimization:

(module
  (func $optimize_me (param $x i32) (result i32)
    local.get $x
    i32.const 3
    i32.shl  ;; 2-3x faster than multiply!
  )
)

Example 2: Constant Folding

Input:

(func $calculate (result i32)
  i32.const 10
  i32.const 20
  i32.add
  i32.const 5
  i32.mul
)

After Optimization:

(func $calculate (result i32)
  i32.const 150  ;; Computed at compile-time
)

Example 3: CSE

Input:

(func $duplicate (param $x i32) (result i32)
  local.get $x
  i32.const 4
  i32.mul
  local.get $x
  i32.const 4
  i32.mul  ;; Duplicate computation!
  i32.add
)

After Optimization:

(func $duplicate (param $x i32) (result i32)
  local.get $x
  i32.const 4
  i32.mul
  local.tee $temp
  local.get $temp  ;; Reuse cached result
  i32.add
)

πŸ›οΈ Architecture

                    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                    β”‚  WebAssembly     β”‚
                    β”‚  (WAT or WASM)   β”‚
                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                             β”‚
                             β–Ό
                    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                    β”‚   wasmparser     β”‚
                    β”‚  Parse to AST    β”‚
                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                             β”‚
                             β–Ό
                    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
                    β”‚  ISLE Terms      β”‚
                    β”‚  (IR)            β”‚
                    β””β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                             β”‚
                             β–Ό
    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
    β”‚     12-Phase Optimization Pipeline         β”‚
    β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
    β”‚  β”‚ 1. Precompute                        β”‚  β”‚
    β”‚  β”‚ 2. ISLE Constant Folding             β”‚  β”‚
    β”‚  β”‚ 3. Strength Reduction                β”‚  β”‚
    β”‚  β”‚ 4. Common Subexpression Elimination  β”‚  β”‚
    β”‚  β”‚ 5. Function Inlining                 β”‚  β”‚
    β”‚  β”‚ 6. ISLE (Post-inline)                β”‚  β”‚
    β”‚  β”‚ 7. Code Folding                      β”‚  β”‚
    β”‚  β”‚ 8. Loop-Invariant Code Motion        β”‚  β”‚
    β”‚  β”‚ 9. Branch Simplification             β”‚  β”‚
    β”‚  β”‚ 10. Dead Code Elimination            β”‚  β”‚
    β”‚  β”‚ 11. Block Merging                    β”‚  β”‚
    β”‚  β”‚ 12. Vacuum & Simplify Locals         β”‚  β”‚
    β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
    β”‚              β”‚                              β”‚
    β”‚              β–Ό                              β”‚
    β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”  β”‚
    β”‚  β”‚   Dataflow Analysis                  β”‚  β”‚
    β”‚  β”‚   (locals, memory state tracking)    β”‚  β”‚
    β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜  β”‚
    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                             β”‚
                             β–Ό
              β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
              β”‚   Optional: Z3 SMT       β”‚
              β”‚   Verification           β”‚
              β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                           β”‚
                           β–Ό
              β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
              β”‚   wasm-encoder           β”‚
              β”‚   Encode to binary       β”‚
              β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                           β”‚
                           β–Ό
              β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
              β”‚  Optimized WebAssembly   β”‚
              β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

πŸ“š Documentation

User Guides

Technical Documentation

For Contributors

πŸ§ͺ Testing

# Run all tests
cargo test

# Run optimization-specific tests
cargo test --test optimization_tests

# Run benchmarks
cargo bench

# Test on real fixtures
./target/release/loom optimize tests/fixtures/quicksort.wat -o /tmp/out.wasm --stats --verify

Test Status:

  • βœ… 20/20 optimization tests passing (100%)
  • βœ… 54/57 unit tests passing (95%)
  • βœ… All benchmarks complete successfully

πŸ“ Project Structure

loom/
β”œβ”€β”€ loom-core/                # Core optimizer implementation
β”‚   β”œβ”€β”€ src/lib.rs            # 12-phase pipeline, optimizations
β”‚   β”œβ”€β”€ src/verify.rs         # Z3 verification module
β”‚   β”œβ”€β”€ tests/                # Unit and integration tests
β”‚   └── benches/              # Criterion performance benchmarks
β”œβ”€β”€ loom-isle/                # ISLE term definitions and rules
β”‚   β”œβ”€β”€ isle/                 # ISLE DSL files
β”‚   └── src/lib.rs            # Rust integration
β”œβ”€β”€ loom-cli/                 # Command-line interface
β”‚   β”œβ”€β”€ src/main.rs           # CLI implementation
β”‚   └── BUILD.bazel           # Bazel build rules
β”œβ”€β”€ tests/fixtures/           # Real-world test cases
β”‚   β”œβ”€β”€ fibonacci.wat
β”‚   β”œβ”€β”€ quicksort.wat
β”‚   β”œβ”€β”€ matrix_multiply.wat
β”‚   └── ...
└── docs/                     # Comprehensive documentation
    β”œβ”€β”€ USAGE_GUIDE.md
    β”œβ”€β”€ QUICK_REFERENCE.md
    └── ...

πŸ”§ Building

Standard Build

cargo build --release

With Z3 Verification

cargo build --release --features verification

WASM Build (wasm32-wasip2)

# Using Cargo
cargo build --release --target wasm32-wasip2

# Using Bazel
bazel build //loom-cli:loom_wasm --platforms=@rules_rust//rust/platform:wasm

See WASM_BUILD.md for details.

πŸš€ Use Cases

1. Production Deployment

# Optimize and verify before deployment
loom optimize app.wasm -o app.optimized.wasm --verify --stats

2. CI/CD Integration

- name: Optimize WebAssembly
  run: |
    loom optimize dist/*.wasm -o dist/*.wasm --stats --verify

3. Development Workflow

# Optimize during build
cargo build --target wasm32-unknown-unknown
loom optimize target/wasm32-unknown-unknown/release/app.wasm -o dist/app.wasm

4. Performance Analysis

# Compare before/after
ls -lh original.wasm optimized.wasm
loom optimize original.wasm -o optimized.wasm --stats

🎯 Optimization Patterns

Strength Reduction

  • x * 2^n β†’ x << n (2-3x faster)
  • x / 2^n β†’ x >> n (2-3x faster)
  • x % 2^n β†’ x & (2^n - 1) (2-3x faster)

Algebraic Simplification

  • x | 0 β†’ x
  • x & -1 β†’ x
  • x ^ 0 β†’ x
  • x + 0 β†’ x
  • x * 1 β†’ x

Constant Folding

  • Compile-time evaluation of all constant expressions
  • Propagation through local variables
  • Cross-function constant propagation via inlining

🀝 Contributing

Contributions welcome! This project is under active development.

  1. Check existing issues
  2. Run tests: cargo test && cargo clippy
  3. Follow Rust conventions
  4. Add tests for new features
  5. Update documentation

πŸ“œ License

Apache License 2.0

πŸ™ Acknowledgments

πŸ”— Related Projects

  • Binaryen - Reference WebAssembly optimizer (C++)
  • wasm-opt - Industry-standard optimizer
  • Cranelift - High-performance code generator
  • WRT - WebAssembly Component Model runtime

πŸ“ˆ Roadmap

Current Status (v0.1.0):

  • βœ… 12-phase optimization pipeline
  • βœ… Z3 formal verification
  • βœ… Comprehensive benchmarking
  • βœ… Component Model support
  • βœ… wasm32-wasip2 build target

Coming Soon:

  • 🚧 More aggressive LICM (arithmetic operations, global reads)
  • 🚧 Profile-guided optimization
  • 🚧 SIMD-specific optimizations
  • 🚧 Inter-procedural analysis
  • 🚧 Custom optimization passes

Built with ❀️ by PulseEngine

About

WebAssembly optimizer using ISLE with formal verification - proof of concept

Resources

License

Contributing

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •