diff --git a/docs/netsuke-design.md b/docs/netsuke-design.md index a83983c4..fbe32a4d 100644 --- a/docs/netsuke-design.md +++ b/docs/netsuke-design.md @@ -1075,11 +1075,16 @@ This transformation involves several steps: The implemented algorithm performs a depth-first traversal of the target graph and maintains a recursion stack. Order-only dependencies are ignored - during this search. Encountering an already visiting node indicates a cycle. - The stack slice from the first occurrence of that node forms the cycle and - is returned in `IrGenError::CircularDependency` for improved debugging. The - cycle list is rotated so the lexicographically smallest node appears first, - ensuring deterministic error messages. + during this search. Self-edges are rejected immediately, and encountering an + already visiting node indicates a cycle. The stack slice from the first + occurrence of that node forms the cycle and is returned in + `IrGenError::CircularDependency` for improved debugging. The cycle list is + rotated so the lexicographically smallest node appears first, ensuring + deterministic error messages. + + Traversal state is managed by a small `CycleDetector` helper struct. This + type owns the recursion stack and visitation map, allowing the traversal + functions to remain focused and easily testable. ### 5.4 Ninja File Synthesis (`ninja_gen.rs`) @@ -1181,17 +1186,17 @@ The command construction will follow this pattern: 1. A new `Command` is created via `Command::new("ninja")`. Netsuke will assume `ninja` is available in the system's `PATH`. -1. Arguments passed to Netsuke's own CLI will be translated and forwarded to +2. Arguments passed to Netsuke's own CLI will be translated and forwarded to Ninja. For example, a `Netsuke build my_target` command would result in `Command::new("ninja").arg("my_target")`. Flags like `-j` for parallelism will also be passed through.[^8] -1. The working directory for the Ninja process will be set using +3. The working directory for the Ninja process will be set using `.current_dir()`. When the user supplies a `-C` flag, Netsuke canonicalises the path and applies it via `current_dir` rather than forwarding the flag to Ninja. -1. Standard I/O streams (`stdin`, `stdout`, `stderr`) will be configured using +4. Standard I/O streams (`stdin`, `stdout`, `stderr`) will be configured using `.stdout(Stdio::piped())` and `.stderr(Stdio::piped())`.[^24] This allows Netsuke to capture the real-time output from Ninja, which can then be streamed to the user's console, potentially with additional formatting or @@ -1279,11 +1284,11 @@ three fundamental questions: 1. **What** went wrong? A concise summary of the failure (e.g., "YAML parsing failed," "Build configuration is invalid"). -1. **Where** did it go wrong? Precise location information, including the file, +2. **Where** did it go wrong? Precise location information, including the file, line number, and column where applicable (e.g., "in `Netsukefile` at line 15, column 3"). -1. **Why** did it go wrong, and what can be done about it? The underlying cause +3. **Why** did it go wrong, and what can be done about it? The underlying cause of the error and a concrete suggestion for how to fix it (e.g., "Cause: Found a tab character, which is not allowed. Hint: Use spaces for indentation instead."). @@ -1347,22 +1352,22 @@ enrichment: 1. A specific, low-level error occurs within a module. For instance, the IR generator detects a missing rule and creates an `IrGenError::RuleNotFound`. -1. The function where the error occurred returns +2. The function where the error occurred returns `Err(IrGenError::RuleNotFound {... }.into())`. The `.into()` call converts the specific `thiserror` enum variant into a generic `anyhow::Error` object, preserving the original error as its source. -1. A higher-level function in the call stack, which called the failing function, +3. A higher-level function in the call stack, which called the failing function, receives this `Err` value. It uses the `.with_context()` method to wrap the error with more application-level context. For example: `ir::from_manifest(ast)` `.with_context(|| "Failed to build the internal build graph from the manifest")?` . -1. This process of propagation and contextualization repeats as the error +4. This process of propagation and contextualization repeats as the error bubbles up towards `main`. -1. Finally, the `main` function receives the `Err` result. It prints the entire +5. Finally, the `main` function receives the `Err` result. It prints the entire error chain provided by `anyhow`, which displays the highest-level context first, followed by a list of underlying "Caused by:" messages. This provides the user with a rich, layered explanation of the failure, from the general @@ -1530,15 +1535,15 @@ goal. 1. Implement the initial `clap` CLI structure for the `build` command. - 1. Implement the YAML parser using `serde_yml` and the AST data structures + 2. Implement the YAML parser using `serde_yml` and the AST data structures (`ast.rs`). - 1. Implement the AST-to-IR transformation logic, including basic validation + 3. Implement the AST-to-IR transformation logic, including basic validation like checking for rule existence. - 1. Implement the IR-to-Ninja file generator (`ninja_gen.rs`). + 4. Implement the IR-to-Ninja file generator (`ninja_gen.rs`). - 1. Implement the `std::process::Command` logic to invoke `ninja`. + 5. Implement the `std::process::Command` logic to invoke `ninja`. - **Success Criterion:** Netsuke can successfully take a `Netsukefile` file *without any Jinja syntax* and compile it to a `build.ninja` file, then @@ -1554,13 +1559,13 @@ goal. 1. Integrate the `minijinja` crate into the build pipeline. - 1. Implement the two-pass parsing mechanism: first render the manifest with + 2. Implement the two-pass parsing mechanism: first render the manifest with `minijinja`, then parse the result with `serde_yml`. - 1. Populate the initial Jinja context with the global `vars` from the + 3. Populate the initial Jinja context with the global `vars` from the manifest. - 1. Implement basic Jinja control flow (`{% if... %}`, `{% for... %}`) and + 4. Implement basic Jinja control flow (`{% if... %}`, `{% for... %}`) and variable substitution. - **Success Criterion:** Netsuke can successfully build a manifest that uses @@ -1577,15 +1582,15 @@ goal. 1. Implement the full suite of custom Jinja functions (`glob`, `env`, etc.) and filters (`shell_escape`). - 1. Mandate the use of `shell-quote` for all command variable substitutions. + 2. Mandate the use of `shell-quote` for all command variable substitutions. - 1. Refactor the error handling to fully adopt the `anyhow`/`thiserror` + 3. Refactor the error handling to fully adopt the `anyhow`/`thiserror` strategy, ensuring all user-facing errors are contextual and actionable as specified in Section 7. - 1. Implement the `clean` and `graph` subcommands. + 4. Implement the `clean` and `graph` subcommands. - 1. Refine the CLI output for clarity and readability. + 5. Refine the CLI output for clarity and readability. - **Success Criterion:** Netsuke is a feature-complete, secure, and user-friendly build tool that meets all the initial design goals. @@ -1652,7 +1657,7 @@ projects. ### **Works cited** [^1]: Ninja, a small build system with a focus on speed. Accessed on 12 July - 2025. + 2025\. [^2]: "Ninja (build system)." Wikipedia. Accessed on 12 July 2025. diff --git a/src/ir.rs b/src/ir.rs index 17ae6ebb..2179455e 100644 --- a/src/ir.rs +++ b/src/ir.rs @@ -478,85 +478,74 @@ enum VisitState { Visited, } -fn should_visit_node<'a>( - states: &'a mut HashMap, - node: &'a PathBuf, -) -> Result { - match states.get(node) { - Some(VisitState::Visited) => Ok(false), - Some(VisitState::Visiting) => Err(node), - None => { - states.insert(node.clone(), VisitState::Visiting); - Ok(true) - } - } +/// Detects cycles in a dependency graph by tracking traversal state. +struct CycleDetector<'a> { + targets: &'a HashMap, + stack: Vec, + states: HashMap, } -fn find_cycle(targets: &HashMap) -> Option> { - fn visit( - targets: &HashMap, - node: &PathBuf, - stack: &mut Vec, - states: &mut HashMap, - ) -> Option> { - match should_visit_node(states, node) { - Ok(false) => return None, - Err(path) => { - if let Some(idx) = stack.iter().position(|n| n == path) { - let mut cycle = stack.get(idx..).expect("slice").to_vec(); - cycle.push(path.clone()); - return Some(canonicalize_cycle(cycle)); - } - return Some(vec![path.clone(), path.clone()]); - } - Ok(true) => {} +impl<'a> CycleDetector<'a> { + fn new(targets: &'a HashMap) -> Self { + Self { + targets, + stack: Vec::new(), + states: HashMap::new(), } - - stack.push(node.clone()); - - if let Some(cycle) = targets - .get(node) - .and_then(|edge| visit_dependencies(targets, &edge.inputs, stack, states)) - { - return Some(cycle); - } - - stack.pop(); - states.insert(node.clone(), VisitState::Visited); - None } - fn visit_dependencies( - targets: &HashMap, - deps: &[PathBuf], - stack: &mut Vec, - states: &mut HashMap, - ) -> Option> { - for dep in deps { - if !targets.contains_key(dep) { + fn detect(&mut self) -> Option> { + for node in self.targets.keys() { + if self.states.contains_key(node.as_path()) { continue; } - - if let Some(cycle) = visit(targets, dep, stack, states) { + if let Some(cycle) = self.visit(node.clone()) { return Some(cycle); } } None } - let mut states = HashMap::new(); - let mut stack = Vec::new(); - - for node in targets.keys() { - // Skip nodes we've already processed to avoid redundant traversal. - if states.contains_key(node) { - continue; + fn visit(&mut self, node: PathBuf) -> Option> { + match self.states.get(node.as_path()) { + Some(VisitState::Visited) => return None, + Some(VisitState::Visiting) => { + let idx = self + .stack + .iter() + .position(|n| n == &node) + .expect("node should be present in the traversal stack"); + let mut cycle: Vec = self.stack.iter().skip(idx).cloned().collect(); + cycle.push(node.clone()); + return Some(canonicalize_cycle(cycle)); + } + None => { + self.states.insert(node.clone(), VisitState::Visiting); + } } - if let Some(cycle) = visit(targets, node, &mut stack, &mut states) { - return Some(cycle); + + self.stack.push(node.clone()); + + if let Some(edge) = self.targets.get(&node) { + for dep in &edge.inputs { + if !self.targets.contains_key(dep) { + continue; + } + if let Some(cycle) = self.visit(dep.clone()) { + return Some(cycle); + } + } } + + self.stack.pop(); + self.states.insert(node, VisitState::Visited); + None } - None +} + +fn find_cycle(targets: &HashMap) -> Option> { + let mut detector = CycleDetector::new(targets); + detector.detect() } fn canonicalize_cycle(mut cycle: Vec) -> Vec { @@ -570,9 +559,10 @@ fn canonicalize_cycle(mut cycle: Vec) -> Vec { .enumerate() .min_by(|(_, a), (_, b)| a.cmp(b)) .map_or(0, |(idx, _)| idx); + cycle.pop(); cycle.rotate_left(start); - if let (Some(first), Some(slot)) = (cycle.first().cloned(), cycle.get_mut(len)) { - slot.clone_from(&first); + if let Some(first) = cycle.first().cloned() { + cycle.push(first); } cycle } @@ -581,33 +571,99 @@ fn canonicalize_cycle(mut cycle: Vec) -> Vec { mod tests { use super::*; - #[test] - fn find_cycle_identifies_cycle() { - let mut targets = HashMap::new(); - let edge_a = BuildEdge { + fn edge_with_inputs(inputs: &[&str], output: &str) -> BuildEdge { + BuildEdge { action_id: "id".into(), - inputs: vec![PathBuf::from("b")], - explicit_outputs: vec![PathBuf::from("a")], + inputs: inputs.iter().map(PathBuf::from).collect(), + explicit_outputs: vec![PathBuf::from(output)], implicit_outputs: Vec::new(), order_only_deps: Vec::new(), phony: false, always: false, - }; - let edge_b = BuildEdge { - action_id: "id".into(), - inputs: vec![PathBuf::from("a")], - explicit_outputs: vec![PathBuf::from("b")], - implicit_outputs: Vec::new(), - order_only_deps: Vec::new(), - phony: false, - always: false, - }; - targets.insert(PathBuf::from("a"), edge_a); - targets.insert(PathBuf::from("b"), edge_b); + } + } + + fn cyclic_targets() -> HashMap { + let mut targets = HashMap::new(); + targets.insert(PathBuf::from("a"), edge_with_inputs(&["b"], "a")); + targets.insert(PathBuf::from("b"), edge_with_inputs(&["a"], "b")); + targets + } + #[test] + fn cycle_detector_detects_simple_cycle() { + let targets = cyclic_targets(); + let mut detector = CycleDetector::new(&targets); + let cycle = detector.detect().expect("cycle"); + assert_eq!( + cycle, + vec![PathBuf::from("a"), PathBuf::from("b"), PathBuf::from("a")] + ); + } + + #[test] + fn cycle_detector_detects_self_edge() { + let mut targets = HashMap::new(); + targets.insert(PathBuf::from("loop"), edge_with_inputs(&["loop"], "loop")); + + let mut detector = CycleDetector::new(&targets); + let cycle = detector.detect().expect("cycle"); + assert_eq!(cycle, vec![PathBuf::from("loop"), PathBuf::from("loop")]); + } + + #[test] + fn cycle_detector_returns_none_for_acyclic_graph() { + let mut targets = HashMap::new(); + targets.insert(PathBuf::from("a"), edge_with_inputs(&[], "a")); + targets.insert(PathBuf::from("b"), edge_with_inputs(&["a"], "b")); + + let mut detector = CycleDetector::new(&targets); + assert!(detector.detect().is_none()); + } + + #[test] + fn find_cycle_identifies_cycle() { + let targets = cyclic_targets(); let cycle = find_cycle(&targets).expect("cycle"); - let option_a = vec![PathBuf::from("a"), PathBuf::from("b"), PathBuf::from("a")]; - let option_b = vec![PathBuf::from("b"), PathBuf::from("a"), PathBuf::from("b")]; - assert!(cycle == option_a || cycle == option_b); + assert_eq!( + cycle, + vec![PathBuf::from("a"), PathBuf::from("b"), PathBuf::from("a")] + ); + } + + #[test] + fn canonicalize_cycle_rotates_smallest_node() { + let cycle = vec![ + PathBuf::from("c"), + PathBuf::from("a"), + PathBuf::from("b"), + PathBuf::from("c"), + ]; + let canonical = canonicalize_cycle(cycle); + let expected = vec![ + PathBuf::from("a"), + PathBuf::from("b"), + PathBuf::from("c"), + PathBuf::from("a"), + ]; + assert_eq!(canonical, expected); + } + + #[test] + fn canonicalize_cycle_handles_reverse_direction() { + let cycle = vec![ + PathBuf::from("c"), + PathBuf::from("b"), + PathBuf::from("a"), + PathBuf::from("c"), + ]; + let canonical = canonicalize_cycle(cycle); + let expected = vec![ + PathBuf::from("a"), + PathBuf::from("c"), + PathBuf::from("b"), + PathBuf::from("a"), + ]; + assert_eq!(canonical, expected); } }