Support of the Java Preprocessor library #11

Open
wants to merge 8 commits into
base: main
2 changes: 1 addition & 1 deletion README.md
@@ -1,4 +1,4 @@
# VEVOS: Ground Truth Extraction v2.0.0
# VEVOS: Ground Truth Extraction v2.1.0

VEVOS is a tool suite for simulating the evolution of clone-and-own projects. It consists of two main components:
the ground truth extraction, called VEVOS/Extraction, and the variant simulation, called VEVOS/Simulation.
6 changes: 4 additions & 2 deletions build-docker-image.sh
@@ -1,3 +1,5 @@
#! /bin/bash
docker build --build-arg USER_ID=$(id -u) --build-arg GROUP_ID=$(id -g) -t extraction .
#!/bin/bash
USER_ID=$(id -u "${SUDO_USER:-$(whoami)}")
GROUP_ID=$(id -g "${SUDO_USER:-$(whoami)}")

docker build --build-arg USER_ID="$USER_ID" --build-arg GROUP_ID="$GROUP_ID" -t extraction .
2 changes: 1 addition & 1 deletion docker-resources/custom.md
@@ -1,3 +1,3 @@
Project name | Domain | Source code available (**y**es/**n**o)? | Is it a git repository (**y**es/**n**o)? | Repository URL | Clone URL | Estimated number of commits
-------------------|-------------------------|-----------------------------------------|------------------------------------------|--------------------------------------------------------------|----------------------------------------------------|-----------------------------
linux | operating system | y | y | https://github.com/torvalds/linux | https://github.com/DiffDetective/linux.git | 1,072,142
argouml-spl | modelling tool | y | y | https://github.com/marcusvnac/argouml-spl.git | https://github.com/marcusvnac/argouml-spl.git | 23
9 changes: 5 additions & 4 deletions docker-resources/custom.properties
@@ -4,9 +4,8 @@
# Define the path to the dataset here
# You can also delete or add lines to the custom.md
diff-detective.dataset-file=custom.md
######
# Parameters that probably should not be changed
######
# Which files to process, specified as comma-separated list of file extensions
extraction.file-extensions=java
# Print the ground truth for each commit. Only activate this for the debugging of small datasets.
extraction.print-enabled=false
# Should changes to only the presence condition of source code be ignored? If set to true, VEVOS will only consider
@@ -23,4 +22,6 @@ extraction.gt-save-dir=ground-truth
# Number of threads to use
diff-detective.num-threads=1
# Number of commits to process in a single batch by one thread
diff-detective.batch-size=8
diff-detective.batch-size=8
# Use the parser for C preprocessor annotations (i.e., 'cpp') or for JavaPP annotations (i.e., 'jpp')
diff-detective.parser = jpp
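
Note on the `diff-detective.parser = jpp` entry: unlike the other entries, it has spaces around `=`. `java.util.Properties` skips whitespace around the separator, so the value still reads back as `jpp`. Trailing whitespace after the value is kept, however, and would make a later string comparison fail. A minimal sketch to check this, assuming the file is loaded through `java.util.Properties` (the class name `ParserPropertyDemo` is made up for illustration):

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

class ParserPropertyDemo {
    /** Loads properties from an in-memory string, the same way a file would be parsed. */
    static Properties load(String content) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(content));
        } catch (IOException e) {
            // A StringReader does not fail in practice; rethrow to keep the signature simple.
            throw new IllegalStateException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = load("diff-detective.batch-size=8\ndiff-detective.parser = jpp\n");
        // Whitespace around '=' is skipped by the parser.
        System.out.println(props.getProperty("diff-detective.parser")); // prints "jpp"
    }
}
```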
55 changes: 26 additions & 29 deletions docker-resources/extract.sh
@@ -6,41 +6,38 @@ echo "Files in ground-truth folder:"
ls -l /home/user
ls -l ground-truth

if [ "$1" == 'verification' ]
then
echo "Executing variability extraction defined in verification.properties."
PROPS=verification.properties
elif [ "$1" == 'custom' ]
then
echo "Executing variability extraction defined in verification.properties."
PROPS=custom.properties
elif [ "$1" == '--help' ]
then
echo "Examples:"
echo "# Run verification with small datasets"
echo "./start-extraction.sh verification fast"
echo "# Run extraction of all Diff Detective datasets apart from the linux kernel"
echo "./start-extraction.sh fast"
echo "# Run the extraction with a custom set of properties which have to be defined in docker-resources/custom.properties, BEFORE building the Docker image."
echo "./start-extraction.sh custom full"
exit 0
if [ "$1" == 'verification' ]; then
echo "Executing variability extraction defined in verification.properties."
PROPS=verification.properties
elif [ "$1" == 'custom' ]; then
echo "Executing variability extraction defined in custom.properties."
PROPS=custom.properties
elif [ "$1" == '--help' ]; then
echo "Examples:"
echo "# Run verification with small datasets"
echo "./start-extraction.sh verification fast"
echo "# Run extraction of all Diff Detective datasets apart from the linux kernel"
echo "./start-extraction.sh fast"
echo "# Run the extraction with a custom set of properties which have to be defined in docker-resources/custom.properties, BEFORE building the Docker image."
echo "./start-extraction.sh custom full"
exit 0
else
echo "Executing variability extraction defined in without_linux.properties."
PROPS=without_linux.properties
echo "Executing variability extraction defined in without_linux.properties."
PROPS=without_linux.properties
fi

JAR=Extraction-jar-with-dependencies.jar

if [ "$1" == 'fast' ] || [ "$2" == 'fast' ]
then
EX_TYPE=org.variantsync.vevos.extraction.FastGroundTruthExtraction
elif [ "$1" == 'full' ] || [ "$2" == 'full' ]
then
EX_TYPE=org.variantsync.vevos.extraction.FullGroundTruthExtraction
if [ "$1" == 'fast' ] || [ "$2" == 'fast' ]; then
EX_TYPE=org.variantsync.vevos.extraction.FastGroundTruthExtraction
elif [ "$1" == 'full' ] || [ "$2" == 'full' ]; then
EX_TYPE=org.variantsync.vevos.extraction.FullGroundTruthExtraction
else
echo "You either have to select the 'fast' or the 'full' extraction. See --help for more information"
exit 1
echo "You have to select either the 'fast' or the 'full' extraction. See --help for more information."
exit 1
fi

touch log.txt
java -Xmx128g -jar -Dtinylog.configuration=/home/user/tinylog.properties $JAR $PROPS $EX_TYPE
#java -jar -Dtinylog.configuration=/home/user/tinylog.properties $JAR $PROPS $EX_TYPE
#java -jar -Dtinylog.configuration=/home/user/tinylog.properties $JAR $PROPS $EX_TYPE

6 changes: 5 additions & 1 deletion docker-resources/verification.properties
@@ -3,6 +3,8 @@
#####################################
# Path to the dataset file
diff-detective.dataset-file=verification.md
# Which files to process, specified as comma-separated list of file extensions
extraction.file-extensions=c,cpp,h,hpp
# Print the ground truth for each commit. Only activate this for the debugging of small datasets.
extraction.print-enabled=false
# Should changes to only the presence condition of source code be ignored? If set to true, VEVOS will only consider
@@ -19,4 +21,6 @@ extraction.gt-save-dir=ground-truth
# Number of threads to use
diff-detective.num-threads=1
# Number of commits to process in a single batch by one thread
diff-detective.batch-size=8
diff-detective.batch-size=8
# Use the parser for C preprocessor annotations (i.e., 'cpp') or for JavaPP annotations (i.e., 'jpp')
diff-detective.parser = cpp
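
Note on `extraction.file-extensions`: the value is documented as a comma-separated list, so a reader of the property has to split on commas and should tolerate stray spaces such as `c, cpp`. The following is a sketch of one way to do that, not the code from this PR. The class name `FileExtensionList` is hypothetical, and falling back to the defaults for a blank value is an assumption; the diff itself only shows a fallback for a missing key.

```java
import java.util.Arrays;

class FileExtensionList {
    /** Fallback list, mirroring the extensions that were previously hard-coded. */
    static final String[] DEFAULT_EXTENSIONS = {"h", "hpp", "c", "cpp"};

    /** Splits a comma-separated property value into trimmed, non-empty extensions. */
    static String[] parse(String propertyValue) {
        // Assumption: a missing or blank value means "use the defaults".
        if (propertyValue == null || propertyValue.isBlank()) {
            return DEFAULT_EXTENSIONS.clone();
        }
        return Arrays.stream(propertyValue.split(","))
                .map(String::trim)            // drop whitespace around each entry
                .filter(s -> !s.isEmpty())    // ignore empty entries such as "c,,h"
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("c, cpp ,h,hpp"))); // prints "[c, cpp, h, hpp]"
    }
}
```

Splitting on the literal comma and trimming afterwards keeps the regular expression trivial, which makes it harder to consume parts of the extensions by accident.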
6 changes: 5 additions & 1 deletion docker-resources/without_linux.properties
@@ -3,6 +3,8 @@
#####################################
# Path to the dataset file
diff-detective.dataset-file=without_linux.md
# Which files to process, specified as comma-separated list of file extensions
extraction.file-extensions=c,cpp,h,hpp
# Print the ground truth for each commit. Only activate this for the debugging of small datasets.
extraction.print-enabled=false
# Should changes to only the presence condition of source code be ignored? If set to true, VEVOS will only consider
@@ -19,4 +21,6 @@ extraction.gt-save-dir=ground-truth
# Number of threads to use
diff-detective.num-threads=32
# Number of commits to process in a single batch by one thread
diff-detective.batch-size=8
diff-detective.batch-size=8
# Use the parser for C preprocessor annotations (i.e., 'cpp') or for JavaPP annotations (i.e., 'jpp')
diff-detective.parser = cpp
2 changes: 1 addition & 1 deletion local-maven-repo/deploy-diffdetective.sh
100644 → 100755
@@ -1,2 +1,2 @@
mvn deploy:deploy-file -DgroupId=org.variantsync -DartifactId=diffdetective -Dversion=2.0.0 -Durl=file:../local-maven-repo/ -DrepositoryId=local-maven-repo -DupdateReleaseInfo=true -Dfile=../src/main/resources/lib/diffdetective-2.0.0-jar-with-dependencies.jar
mvn deploy:deploy-file -DgroupId=org.variantsync -DartifactId=diffdetective -Dversion=2.2.0 -Durl=file:../local-maven-repo/ -DrepositoryId=local-maven-repo -DupdateReleaseInfo=true -Dfile=../src/main/resources/lib/diffdetective-2.2.0-jar-with-dependencies.jar
rm -rf ~/.m2/repository/org/variantsync/

This file was deleted.

This file was deleted.

This file was deleted.

This file was deleted.

Binary file not shown.
@@ -0,0 +1 @@
0b245d60c6129d43673bc34a5236b88c
@@ -0,0 +1 @@
36274fe1dba1cce46df2ace6991825681c36fb54
@@ -4,5 +4,5 @@
<modelVersion>4.0.0</modelVersion>
<groupId>org.variantsync</groupId>
<artifactId>diffdetective</artifactId>
<version>2.0.0</version>
<version>2.2.0</version>
</project>
@@ -0,0 +1 @@
cc9e2bde8437bcf67cbf4bd889a906ca
@@ -0,0 +1 @@
289b68a0459bc9f42fcefae1a13d5f9d3cbfbffa
@@ -3,11 +3,10 @@
<groupId>org.variantsync</groupId>
<artifactId>diffdetective</artifactId>
<versioning>
<release>2.0.0</release>
<release>2.2.0</release>
<versions>
<version>1.0.0</version>
<version>2.0.0</version>
<version>2.2.0</version>
</versions>
<lastUpdated>20231102083916</lastUpdated>
<lastUpdated>20240223170318</lastUpdated>
</versioning>
</metadata>
@@ -1 +1 @@
ee328247eea6708871ef37008de824e8
429d3bbaabca0f4b185822875d4970b5
@@ -1 +1 @@
668bf5175e6c376a160f74b91ee06fae0c99b6d9
70470517c24ba88d3ebe718b9eee0f5ca8389d0c
10 changes: 5 additions & 5 deletions pom.xml
@@ -8,7 +8,7 @@
<groupId>org.variantsync.vevos.extraction</groupId>
<!-- Adjust the project name -->
<artifactId>Extraction</artifactId>
<version>2.0.0</version>
<version>2.1.0</version>

<properties>
<!-- Adjust your java version here -->
@@ -81,7 +81,7 @@
<dependency>
<groupId>org.eclipse.jgit</groupId>
<artifactId>org.eclipse.jgit</artifactId>
<version>5.12.0.202106070339-r</version>
<version>6.8.0.202311291450-r</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
@@ -91,7 +91,7 @@
<dependency>
<groupId>net.lingala.zip4j</groupId>
<artifactId>zip4j</artifactId>
<version>2.11.3</version>
<version>2.11.5</version>
</dependency>
<dependency>
<groupId>org.tinylog</groupId>
@@ -106,7 +106,7 @@
<dependency>
<groupId>org.variantsync</groupId>
<artifactId>diffdetective</artifactId>
<version>2.0.0</version>
<version>2.2.0</version>
</dependency>
</dependencies>
</project>
</project>
@@ -1,22 +1,15 @@
package org.variantsync.vevos.extraction;

public class ConfigProperties {
public static final String PRINT_ENABLED
= "extraction.print-enabled";
public static final String GT_SAVE_DIR
= "extraction.gt-save-dir";
public static final String IGNORE_PC_CHANGES
= "extraction.ignore-pc-changes";
public static final String DATASET_FILE
= "diff-detective.dataset-file";
public static final String DD_OUTPUT_DIR
= "diff-detective.output-dir";
public static final String REPO_SAVE_DIR
= "diff-detective.repo-storage-dir";
public static final String NUM_THREADS
= "diff-detective.num-threads";
public static final String BATCH_SIZE
= "diff-detective.batch-size";
public static final String EXTRACT_CODE_MATCHING
= "extraction.extract-code-matching";
public static final String PRINT_ENABLED = "extraction.print-enabled";
public static final String GT_SAVE_DIR = "extraction.gt-save-dir";
public static final String IGNORE_PC_CHANGES = "extraction.ignore-pc-changes";
public static final String DATASET_FILE = "diff-detective.dataset-file";
public static final String DD_OUTPUT_DIR = "diff-detective.output-dir";
public static final String REPO_SAVE_DIR = "diff-detective.repo-storage-dir";
public static final String NUM_THREADS = "diff-detective.num-threads";
public static final String BATCH_SIZE = "diff-detective.batch-size";
public static final String PARSER = "diff-detective.parser";
public static final String EXTRACT_CODE_MATCHING = "extraction.extract-code-matching";
public static final String FILE_EXTENSIONS = "extraction.file-extensions";
}
@@ -5,6 +5,8 @@
import org.variantsync.diffdetective.datasets.PatchDiffParseOptions;
import org.variantsync.diffdetective.datasets.Repository;
import org.variantsync.diffdetective.diff.git.DiffFilter;
import org.variantsync.diffdetective.feature.AnnotationParser;
import org.variantsync.diffdetective.feature.PreprocessorAnnotationParser;
import org.variantsync.diffdetective.variation.diff.parse.VariationDiffParseOptions;
import org.variantsync.vevos.extraction.gt.GroundTruth;

@@ -123,6 +125,22 @@ public static Properties getProperties(File propertiesFile) {
* @return The options instance
*/
public static AnalysisRunner.Options diffdetectiveOptions(Properties properties) {
final String[] allowedFileExtensions;
String propertyValue = properties.getProperty(FILE_EXTENSIONS);
if (propertyValue == null) {
final String[] defaultExtensions = {"h", "hpp", "c", "cpp"};
allowedFileExtensions = defaultExtensions;
} else {
allowedFileExtensions = propertyValue.trim().split("\\s*,\\s*");
}

AnnotationParser parser;
switch (properties.getProperty(PARSER)) {
case "jpp" -> parser = PreprocessorAnnotationParser.JPPAnnotationParser;
case "cpp" -> parser = PreprocessorAnnotationParser.CPPAnnotationParser;
default -> throw new IllegalArgumentException("The parser " + properties.getProperty(PARSER) +
" is not supported. Choose between 'jpp' and 'cpp'");
}

return new AnalysisRunner.Options(Path.of(properties.getProperty(REPO_SAVE_DIR)),
Path.of(properties.getProperty(DD_OUTPUT_DIR)),
@@ -132,10 +150,9 @@ public static AnalysisRunner.Options diffdetectiveOptions(Properties properties)
PatchDiffParseOptions.DiffStoragePolicy.DO_NOT_REMEMBER,
new VariationDiffParseOptions(
repoDefault.variationDiffParseOptions().annotationParser(),
false, false));
false, false)).withAnnotationParser(parser);
}, repo -> new DiffFilter.Builder().allowMerge(true)
// TODO: make configurable
.allowedFileExtensions("h", "hpp", "c", "cpp").build(),
.allowedFileExtensions(allowedFileExtensions).build(),
true, false);
}
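
Review note on the parser lookup: a `switch` over a `String` throws a `NullPointerException` when the selector is `null`, and `getProperty` returns `null` for a missing key. A properties file without `diff-detective.parser` would therefore fail before the `default` branch can print its message. Below is a sketch of a lookup that reports a missing key explicitly. The `ParserKind` enum is a hypothetical stand-in for the DiffDetective parser constants so that the example compiles on its own, and trimming the value is an added assumption.

```java
import java.util.Properties;

class ParserSelection {
    /** Hypothetical stand-in for the two annotation parsers selected in the diff. */
    enum ParserKind { CPP, JPP }

    static final String PARSER = "diff-detective.parser";

    /** Maps the configured parser name to a parser kind, rejecting missing or unknown values. */
    static ParserKind select(Properties properties) {
        String value = properties.getProperty(PARSER);
        if (value == null) {
            // Handle the missing key before the switch, which would otherwise throw an NPE.
            throw new IllegalArgumentException(
                    "The property '" + PARSER + "' is not set. Choose between 'jpp' and 'cpp'");
        }
        return switch (value.trim()) {
            case "jpp" -> ParserKind.JPP;
            case "cpp" -> ParserKind.CPP;
            default -> throw new IllegalArgumentException(
                    "The parser " + value + " is not supported. Choose between 'jpp' and 'cpp'");
        };
    }

    public static void main(String[] args) {
        Properties properties = new Properties();
        properties.setProperty(PARSER, "jpp");
        System.out.println(select(properties)); // prints "JPP"
    }
}
```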

Binary file not shown.