BugFarm

Artifact repository for the paper "Challenging Bug Prediction and Repair Models with Synthetic Bugs", accepted at SCAM 2025 (Auckland, New Zealand). The authors are Ali Reza Ibrahimzada, Yang Chen, Ryan Rong, and Reyhaneh Jabbarvand.

Overview

BugFarm is a framework that generates synthetic bugs by analyzing the least-attended tokens and statements in code. These synthetic bugs are used to challenge and evaluate bug prediction and repair models. The pipeline extracts methods from projects, analyzes attention weights, identifies the least-attended components, and uses LLMs to generate plausible bugs.

Data Archive

Please visit Zenodo to access the results of BugFarm. We will refer to certain files from this archive in the following sections.

Getting Started

Using Docker (Recommended)

The easiest way to set up BugFarm is using Docker:

# Build the Docker image
docker build -t bugfarm .

# Run the container
docker run -it bugfarm bash

Manual Setup

If you prefer a manual setup:

Install miniconda

Create and activate the environment:

conda env create -f environment.yaml
conda activate bugfarm

Set up the tokenizer tool

Install dependencies and download projects:

```
bash setup.sh
```

Project Modules

Attention Analyzer

This module extracts methods from projects and analyzes attention weights to determine the least-attended tokens (LAT) and least-attended statements (LAS).

Key steps:

Extract methods from projects
Extract attention weights
Analyze attention weights to determine LAT/LAS

For detailed instructions, see Attention Analyzer README.
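As a rough illustration of the last step, the sketch below ranks tokens by the attention they receive and then aggregates those scores per statement. It is not the repository's implementation: the function names, the column-sum scoring, the per-statement averaging, and the `ratio` parameter are all assumptions made for this example, and the attention matrix is assumed to be already averaged over layers and heads.

```python
from typing import Dict, List, Sequence


def received_attention(attention: Sequence[Sequence[float]]) -> List[float]:
    """Total attention each token receives (column sums).

    attention[i][j] is the weight token i places on token j.
    Hypothetical scoring rule, chosen only for illustration.
    """
    n = len(attention)
    return [sum(row[j] for row in attention) for j in range(n)]


def least_attended(scores: Sequence[float], ratio: float) -> List[int]:
    """Indices of the lowest-scoring items; keeps at least one item."""
    k = max(1, int(len(scores) * ratio))
    # Ascending by score, ties broken by position.
    return sorted(range(len(scores)), key=lambda i: (scores[i], i))[:k]


def statement_scores(token_scores: Sequence[float],
                     statement_ids: Sequence[int]) -> List[float]:
    """Average token score per statement.

    statement_ids[t] is the index of the statement containing token t;
    ids are assumed to be contiguous and to start at 0.
    """
    totals: Dict[int, float] = {}
    counts: Dict[int, int] = {}
    for score, sid in zip(token_scores, statement_ids):
        totals[sid] = totals.get(sid, 0.0) + score
        counts[sid] = counts.get(sid, 0) + 1
    return [totals[s] / counts[s] for s in sorted(totals)]


# Toy input: 4 tokens, the first two in statement 0, the last two in statement 1.
attention = [
    [0.5, 0.3, 0.1, 0.1],
    [0.4, 0.4, 0.1, 0.1],
    [0.3, 0.3, 0.2, 0.2],
    [0.4, 0.2, 0.1, 0.3],
]
token_scores = received_attention(attention)
lat = least_attended(token_scores, ratio=0.25)  # token 2 receives the least attention
las = least_attended(statement_scores(token_scores, [0, 0, 1, 1]), ratio=0.5)  # statement 1
```

The actual thresholds and aggregation used by BugFarm are documented in the Attention Analyzer README.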

Bug Generator

This module uses LLMs to generate synthetic bugs based on the attention analysis results.

Key steps:

Prompt LLM with LAT/LAS information
Parse LLM responses to extract buggy methods
Select the most suitable bugs

For detailed instructions, see Bug Generator README.

We provide synthetic bugs on Zenodo. Please download mutants.zip from the BugFarm Zenodo archive.
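The first two steps of the Bug Generator (prompting and response parsing) can be sketched as follows. This is a minimal illustration under stated assumptions, not the prompts or parser used in the repository: the prompt wording, the function names, the `java` fence label, and the convention that the LLM returns each buggy method in its own fenced code block are all hypothetical.

```python
import re
from typing import List

FENCE = "`" * 3  # Markdown code-fence delimiter

# Matches a fenced block with an optional language tag and captures its body.
_BLOCK = re.compile(
    re.escape(FENCE) + r"(?:\w+)?[ \t]*\n(.*?)" + re.escape(FENCE),
    re.DOTALL,
)


def build_prompt(method_source: str, least_attended: List[str], n_bugs: int = 3) -> str:
    """Build a prompt asking for buggy variants (hypothetical wording)."""
    targets = ", ".join(least_attended)
    return (
        f"Here is a method:\n{FENCE}java\n{method_source}\n{FENCE}\n"
        f"Produce {n_bugs} buggy versions by changing only these parts: {targets}.\n"
        "Return each version in its own fenced code block."
    )


def extract_code_blocks(response: str) -> List[str]:
    """Return the body of every fenced code block found in an LLM response."""
    return [body.strip() for body in _BLOCK.findall(response)]


# A made-up response standing in for real LLM output.
response = (
    "Bug 1:\n" + FENCE + "java\nint f() { return 1; }\n" + FENCE + "\n"
    "Bug 2:\n" + FENCE + "\nint f() { return 2; }\n" + FENCE + "\n"
)
buggy_methods = extract_code_blocks(response)
```

The third step, selecting the most suitable bugs, depends on criteria described in the Bug Generator README and is not sketched here.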

Create Defect Dataset

This module creates datasets for training and evaluating bug detection models using various sources:

BugSwarm
Mockito-Closure (from Defects4J)
RegMiner
LEAM
muBERT

For detailed instructions, see Create Defect Dataset README.

We provide defect datasets on Zenodo. Please download defect_datasets.zip from the BugFarm Zenodo archive.
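To give a sense of what a defect dataset for a bug detection model can look like, here is a minimal sketch that turns (correct, buggy) method pairs into labeled records and splits them reproducibly. The record fields, the function names, the 0/1 label convention, and the split ratio are assumptions for illustration only; the real format is defined by the files in defect_datasets.zip and the Create Defect Dataset README.

```python
import random
from typing import Dict, List, Tuple

Record = Dict[str, object]


def build_records(pairs: List[Tuple[str, str]]) -> List[Record]:
    """Turn (correct, buggy) method pairs into labeled examples.

    Label 0 marks the correct method and label 1 the buggy one
    (a hypothetical convention for this sketch).
    """
    records: List[Record] = []
    for correct, buggy in pairs:
        records.append({"code": correct, "label": 0})
        records.append({"code": buggy, "label": 1})
    return records


def split_records(records: List[Record], test_ratio: float = 0.2,
                  seed: int = 0) -> Tuple[List[Record], List[Record]]:
    """Shuffle with a fixed seed and return (train, test)."""
    shuffled = records[:]  # leave the caller's list untouched
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]


pairs = [
    ("int f() { return a + b; }", "int f() { return a - b; }"),
    ("boolean g() { return x > 0; }", "boolean g() { return x >= 0; }"),
]
train, test = split_records(build_records(pairs), test_ratio=0.25)
```

Note that a split like this one can place a correct method and its buggy counterpart on opposite sides of the train/test boundary; whether that matters depends on the evaluation setup.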

Bug Prediction

This module fine-tunes models for bug prediction on the created defect datasets.

For detailed instructions, see Finetuning README.

Bug Repair

We use the FitRepair artifacts to repair the generated mutants. Please refer to the original FitRepair repository for usage details. The patches generated by FitRepair are available on Zenodo: download apr.zip from the BugFarm Zenodo archive.

Human Study

Please refer to human_study.zip in the BugFarm Zenodo archive for the results of our human study on the generated bugs. The human labeler results are also available directly on UIUCPlus, where different branches hold the results for different human labelers and mutants.

LEAM

This module generates mutants using the LEAM framework.

For detailed instructions, see LEAM README.

muBERT

This module generates mutants using the muBERT framework.

For detailed instructions, see muBERT README.

Contact

For any questions or issues, please contact Ali Reza Ibrahimzada or open an issue on GitHub.
