
Conversation

@cooperlees (Owner)

  • Add timeout to forked btrfs ...


Copilot AI left a comment


Pull request overview

This PR migrates the btrfs exporter from synchronous subprocess handling with threading to asynchronous execution using Tokio. The main purpose is to modernize the codebase with async/await patterns and add timeout support for subprocess execution.

Key Changes:

  • Replaced synchronous subprocess execution with Tokio's async Command API and added a 30-second timeout
  • Migrated from thread-based parallelism to Tokio tasks for concurrent btrfs stat collection
  • Replaced signal-hook with Tokio's native signal handling for SIGINT

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

src/main.rs: Converted fork_btrfs and get_btrfs_stats to async functions, migrated from thread::spawn to tokio::spawn, replaced the subprocess crate with tokio::process::Command, updated signal handling to use tokio::signal, and restructured the main loop to collect stats before waiting for scrape requests.

Cargo.toml: Removed the signal-hook and subprocess dependencies, added the futures crate, and expanded tokio features to include process, signal, and sync.
Comments suppressed due to low confidence (1)

src/main.rs:117

  • The newly converted async functions fork_btrfs and get_btrfs_stats lack test coverage. While the synchronous parse_btrfs_stats function has a test, the async wrapper functions that handle command execution, timeouts, and error handling are not tested. Consider adding tests for these functions to verify timeout behavior, error handling, and parallel execution. (A hedged test sketch follows the quoted code below.)
async fn fork_btrfs(cmd: Vec<String>) -> Result<HashMap<String, f64>> {
    let command_timeout = Duration::from_secs(30);
    let output = match timeout(
        command_timeout,
        Command::new(&cmd[0]).args(&cmd[1..]).output(),
    )
    .await
    {
        Ok(result) => result?,
        Err(_) => {
            error!("{:?} timed out after {:?}", cmd, command_timeout);
            return Ok(HashMap::new());
        }
    };

    if output.status.success() {
        let stdout = String::from_utf8_lossy(&output.stdout).to_string();
        return Ok(parse_btrfs_stats(stdout));
    } else {
        let stderr = String::from_utf8_lossy(&output.stderr);
        error!("{:?} failed: {:?}", cmd, stderr);
    }
    Ok(HashMap::new())
}

#[tracing::instrument]
async fn get_btrfs_stats(mountpoints: String) -> Result<HashMap<String, f64>> {
    let btrfs_bin = "/usr/bin/btrfs".to_string();
    let sudo_bin = "/usr/bin/sudo".to_string();

    // Call btrfs CLI to get error counters
    let mut tasks = vec![];
    for mountpoint in mountpoints.split(',') {
        let cmd = Vec::from([
            sudo_bin.clone(),
            btrfs_bin.clone(),
            "device".to_string(),
            "stats".to_string(),
            mountpoint.to_string(),
        ]);
        debug!("--> Spawning async task to run {:?}", cmd);
        tasks.push(tokio::spawn(fork_btrfs(cmd)));
    }

    // Collect the stats from each task
    let mut stats: HashMap<String, f64> = HashMap::new();
    let results = join_all(tasks).await;
    for result in results {
        match result {
            Ok(Ok(stat_hash)) => stats.extend(stat_hash),
            Ok(Err(e)) => error!("Task failed: {:?}", e),
            Err(e) => error!("Join error: {:?}", e),
        }
    }

    Ok(stats)
}
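
A minimal sketch of the kind of test the comment above asks for, covering the failure path only. Assumptions not in the PR: the crate's unqualified Result is anyhow::Result, tokio's macros and rt features are available for tests, and /bin/false exists on the test host as a command that exits non-zero.

#[cfg(test)]
mod fork_btrfs_tests {
    use super::*;

    // Hypothetical test (not part of the PR): a command that exits non-zero
    // should be logged by fork_btrfs and yield an empty stats map, not an Err.
    #[tokio::test]
    async fn fork_btrfs_failed_command_returns_empty_map() -> Result<()> {
        let cmd = vec!["/bin/false".to_string()];
        let stats = fork_btrfs(cmd).await?;
        assert!(stats.is_empty());
        Ok(())
    }
}

Exercising the timeout branch the same way would take the full 30 seconds per run, so that hard-coded constant would likely need to become a parameter before a timeout test is practical.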


Comment on lines +225 to +226
if let Some(stat_guage_value) = stat_guage {
stat_guage_value

Copilot AI Jan 3, 2026


The variable name "stat_guage_value" contains a spelling error: it should be "stat_gauge_value" ("gauge", not "guage").

Suggested change
- if let Some(stat_guage_value) = stat_guage {
-     stat_guage_value
+ if let Some(stat_gauge_value) = stat_guage {
+     stat_gauge_value

src/main.rs Outdated
.set(*err_count);
}

// TODO: Move to function passing all guages etc.

Copilot AI Jan 3, 2026


The comment contains a spelling error. "guages" should be spelled "gauges".

Suggested change
- // TODO: Move to function passing all guages etc.
+ // TODO: Move to function passing all gauges etc.

Comment on lines +149 to 161
tokio::spawn(async move {
    match signal::ctrl_c().await {
        Ok(()) => {
            info!("Received SIGINT, shutting down");
            opentelemetry::global::shutdown_tracer_provider();
            process::exit(0);
        }
        Err(e) => {
            error!("Failed to listen for SIGINT: {:?}", e);
        }
    }
});

Copilot AI Jan 3, 2026


Using tokio::spawn here without storing the join handle means the signal handler task could be dropped/cancelled without warning. While this might work for the intended use case (since the process exits anyway), it's better practice to either store the handle or use a scoped spawn to ensure the task lifecycle is managed properly. Consider storing the handle or documenting that the task intentionally runs for the lifetime of the program.
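
A minimal sketch of the handle-storing variant the comment describes, using only code already present in the diff; the binding name _sigint_task is illustrative and not part of the PR:

// Hypothetical variant (not in the PR): bind the JoinHandle so it is explicit
// that the SIGINT listener is intended to run for the lifetime of the program.
let _sigint_task: tokio::task::JoinHandle<()> = tokio::spawn(async move {
    match signal::ctrl_c().await {
        Ok(()) => {
            info!("Received SIGINT, shutting down");
            opentelemetry::global::shutdown_tracer_provider();
            process::exit(0);
        }
        Err(e) => error!("Failed to listen for SIGINT: {:?}", e),
    }
});

Since the task calls process::exit, the handle never needs to be awaited; keeping the binding in scope of main only documents the intended "runs until the process exits" lifecycle.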

Comment on lines +61 to 83
async fn fork_btrfs(cmd: Vec<String>) -> Result<HashMap<String, f64>> {
    let command_timeout = Duration::from_secs(30);
    let output = match timeout(
        command_timeout,
        Command::new(&cmd[0]).args(&cmd[1..]).output(),
    )
    .await
    {
        Ok(result) => result?,
        Err(_) => {
            error!("{:?} timed out after {:?}", cmd, command_timeout);
            return Ok(HashMap::new());
        }
    };

    if output.status.success() {
        let stdout = String::from_utf8_lossy(&output.stdout).to_string();
        return Ok(parse_btrfs_stats(stdout));
    } else {
        let stderr = String::from_utf8_lossy(&output.stderr);
        error!("{:?} failed: {:?}", cmd, stderr);
    }
    Ok(HashMap::new())

Copilot AI Jan 3, 2026


When a command times out or fails, the function returns an empty HashMap. This silent failure could mask real issues. In the original code, threads would join and errors would be logged, but now callers cannot distinguish between a timeout/failure and a legitimately empty result. Consider using a different error handling strategy that propagates the error rather than returning an empty HashMap, or at least ensure the calling code can differentiate between these cases.
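
One possible direction, sketched as an assumption rather than a prescribed fix: have fork_btrfs return an Err for timeouts and non-zero exits, so the Ok(Err(e)) arm in get_btrfs_stats carries real diagnostics instead of only ever seeing spawn I/O errors. This assumes the crate's unqualified Result is anyhow::Result; everything else comes from the diff above.

// Hypothetical variant (not in the PR): propagate timeouts and command
// failures as errors instead of returning an empty HashMap.
async fn fork_btrfs(cmd: Vec<String>) -> Result<HashMap<String, f64>> {
    let command_timeout = Duration::from_secs(30);
    let output = timeout(
        command_timeout,
        Command::new(&cmd[0]).args(&cmd[1..]).output(),
    )
    .await
    .map_err(|_| anyhow::anyhow!("{:?} timed out after {:?}", cmd, command_timeout))??;

    if !output.status.success() {
        anyhow::bail!(
            "{:?} failed: {:?}",
            cmd,
            String::from_utf8_lossy(&output.stderr)
        );
    }
    Ok(parse_btrfs_stats(
        String::from_utf8_lossy(&output.stdout).to_string(),
    ))
}

A middle ground would keep the current "log and return empty" behavior but also increment a dedicated collection-error metric, so dashboards can still tell collection failures apart from genuinely clean devices.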

src/main.rs Outdated
Comment on lines 200 to 207
// Collect stats asynchronously first
let stats_hash = get_btrfs_stats(args.mountpoints.clone())
    .await
    .expect("Failed to get btrfs stats");
debug!("Stats collected: {:?}", stats_hash);

// Wait for scrape request (blocking) - guard is not held across await
let guard = exporter.wait_request();

Copilot AI Jan 3, 2026


The stats are now being collected before waiting for a scrape request, which means stats will be continuously collected even when no scrapes are requested. This changes the behavior from on-demand collection to eager collection, which could lead to unnecessary resource usage. The stats should be collected after the wait_request call to maintain the previous on-demand behavior.
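
If on-demand collection is preferred, one possible shape (an assumption, not part of the PR) is to take the guard first and run the async collection inside tokio::task::block_in_place, which keeps the guard out of any .await point. This requires the multi-threaded Tokio runtime, and it assumes, as the guard-based wait_request API suggests, that the scrape response is only sent once the guard is dropped.

// Hypothetical reordering (not in the PR): block until a scrape arrives,
// then collect stats on demand without holding the guard across an .await.
let guard = exporter.wait_request();
let stats_hash = tokio::task::block_in_place(|| {
    tokio::runtime::Handle::current().block_on(get_btrfs_stats(args.mountpoints.clone()))
})
.expect("Failed to get btrfs stats");
debug!("Stats collected: {:?}", stats_hash);
// ... update gauges from stats_hash; dropping `guard` then releases the scrape response

The trade-off is scrape latency: with this ordering each Prometheus scrape waits for the btrfs commands (up to the 30-second timeout) before it gets a response.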

@cooperlees merged commit b4da78e into main on Jan 3, 2026
2 checks passed
@cooperlees deleted the async_all branch on January 3, 2026 at 04:13