This repository contains a set of scripts for managing and processing field images and specimen records. The workflow renames folders containing field images based on specimen catalog numbers provided in a CSV file. A second script then renames the files in each folder to match the name of the containing folder. The repository also includes sample data, plus scripts that create sample folders and files for testing the workflow.
The workflow consists of four main steps to test the scripts:
- Create folders from a list of record numbers
- Add sample image files to each folder
- Rename folders based on catalog number data from a CSV file
- Rename files within folders to match the folder name
- Python 3.6+
- Bash shell (Linux/macOS or WSL on Windows)
- create_folders.sh - Creates folders from a text file list
- create_sample_images.sh - Creates sample image files in subfolders
- rename_folders.py - Renames folders based on CSV data
- rename_files_by_folder.py - Renames files within folders
- data_sample_folders.txt - Example list of folder names (record numbers)
- data_sample_data.csv - Example CSV with recordNumber and catalogNumber mappings
- data_sample_data.xlsx - Excel version of the sample data
Creates a folder for each line in a text file.
chmod +x create_folders.sh
./create_folders.sh data_sample_folders.txt /path/to/base/folder
Input: A text file with one folder name per line (e.g., AV_1112, AV_1113, etc.)
Output: Creates folders with names matching each line
Example:
/path/to/base/folder/
├── AV_1112/
├── AV_1113/
├── AV_1114/
└── ...
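For readers who want to see the folder-creation step spelled out, here is a minimal Python sketch of the same idea. It is not the code in create_folders.sh; the function name `create_folders` and the choice to skip blank lines and existing folders are assumptions made for illustration.

```python
from pathlib import Path


def create_folders(list_file, base_dir):
    """Create one folder under base_dir for each non-blank line in list_file.

    Hypothetical helper: mirrors the described behaviour of create_folders.sh,
    but is not taken from that script.
    """
    base = Path(base_dir)
    created, skipped = [], []
    for line in Path(list_file).read_text(encoding="utf-8").splitlines():
        name = line.strip()
        if not name:
            continue  # ignore blank lines in the list
        target = base / name
        if target.exists():
            skipped.append(name)  # leave existing folders untouched
        else:
            target.mkdir(parents=True)
            created.append(name)
    return created, skipped
```

Calling `create_folders("data_sample_folders.txt", "/path/to/base/folder")` returns two lists, the names created and the names skipped, which is enough to print a summary count.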
Creates sample image files in each subfolder.
chmod +x create_sample_images.sh
./create_sample_images.sh /path/to/base/folder 2
Parameters:
- First parameter: Base folder path containing subfolders
- Second parameter (optional): Number of images to create per folder (default: 2)
Output: Creates image1.jpg, image2.jpg, etc. in each subfolder
Example:
/path/to/base/folder/
├── AV_1112/
│ ├── image1.jpg
│ └── image2.jpg
├── AV_1113/
│ ├── image1.jpg
│ └── image2.jpg
└── ...
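The sample-image step can be approximated in Python as well. This sketch assumes the sample files are empty placeholders named image1.jpg, image2.jpg, and so on; what create_sample_images.sh actually writes into each file is not stated here, and `create_sample_images` is a hypothetical name.

```python
from pathlib import Path


def create_sample_images(base_dir, count=2):
    """Create image1.jpg .. image<count>.jpg in every subfolder of base_dir.

    Assumption: the files are empty placeholders, not real JPEG data.
    """
    created = 0
    for folder in sorted(p for p in Path(base_dir).iterdir() if p.is_dir()):
        for number in range(1, count + 1):
            target = folder / f"image{number}.jpg"
            if not target.exists():  # never overwrite an existing file
                target.touch()
                created += 1
    return created
```

The return value is the number of files created, so a second run over the same folders returns 0.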
Renames folders by prepending the catalog number from a CSV file.
python rename_folders.py data_sample_data.csv /path/to/base/folder
CSV Requirements:
- Must contain two columns: recordNumber and catalogNumber
- Records with blank catalogNumber values will be skipped with INFO notices
Example CSV:
recordNumber,catalogNumber
AV_1112,
AV_1113,BRIT788514
AV_1114,BRIT788515
Renaming Logic:
- Matches folder names to recordNumber values
- Renames to catalogNumber_recordNumber format
- Example: AV_1113 → BRIT788514_AV_1113
Output:
/path/to/base/folder/
├── AV_1112/ (unchanged - blank catalogNumber)
├── BRIT788514_AV_1113/
├── BRIT788515_AV_1114/
└── ...
Summary Report:
- Successfully renamed folders
- Folders not found
- Errors
- Records with blank catalogNumber
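The renaming logic and summary counts described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the contents of rename_folders.py: the function name `rename_folders`, the message wording, and the decision to count an existing target or an empty recordNumber as an error are all hypothetical.

```python
import csv
from pathlib import Path


def rename_folders(csv_path, base_dir):
    """Rename base_dir/<recordNumber> to base_dir/<catalogNumber>_<recordNumber>."""
    base = Path(base_dir)
    counts = {"renamed": 0, "not_found": 0, "errors": 0, "blank": 0}

    # utf-8-sig also accepts files that start with a byte-order mark
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        for row in csv.DictReader(handle):
            record = (row.get("recordNumber") or "").strip()
            catalog = (row.get("catalogNumber") or "").strip()

            if not record:
                counts["errors"] += 1  # assumption: an empty recordNumber is an error
                continue
            if not catalog:
                print(f"INFO: blank catalogNumber for {record}, skipping")
                counts["blank"] += 1
                continue

            source = base / record
            target = base / f"{catalog}_{record}"
            if not source.is_dir():
                print(f"Not found: {record}")
                counts["not_found"] += 1
            elif target.exists():
                print(f"Conflict: {target.name} already exists, skipping")
                counts["errors"] += 1  # assumption: conflicts are reported as errors
            else:
                source.rename(target)
                print(f"Renamed: {record} -> {target.name}")
                counts["renamed"] += 1
    return counts
```

The returned dictionary maps onto the four lines of the summary report, so the caller can print it in whatever layout it prefers.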
Renames all files in each folder to match the folder name with alphabetic suffixes.
python rename_files_by_folder.py /path/to/base/folder
Renaming Logic:
- Files are renamed to foldername_A.ext, foldername_B.ext, etc.
- Alphabetic suffixes are added in order (A, B, C... Z, AA, AB...)
- File extensions are preserved
Example:
Before:
BRIT788514_AV_1113/
├── image1.jpg
└── image2.jpg
After:
BRIT788514_AV_1113/
├── BRIT788514_AV_1113_A.jpg
└── BRIT788514_AV_1113_B.jpg
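The suffix sequence (A to Z, then AA, AB, ...) is the same scheme spreadsheets use for column names. The sketch below shows one way to generate it and apply it to a folder; `alpha_suffix` and `rename_files_in_folder` are hypothetical names, and the actual implementation in rename_files_by_folder.py may differ.

```python
from pathlib import Path


def alpha_suffix(index):
    """Map 0 -> A, 25 -> Z, 26 -> AA, 27 -> AB, and so on."""
    letters = ""
    index += 1  # work 1-based so that 26 maps to Z and 27 to AA
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def rename_files_in_folder(folder):
    """Rename every file in folder to <foldername>_<suffix><original extension>."""
    folder = Path(folder)
    # Sort first so the suffix order is the same on every run
    files = sorted(p for p in folder.iterdir() if p.is_file())
    renamed = []
    for index, path in enumerate(files):
        target = folder / f"{folder.name}_{alpha_suffix(index)}{path.suffix}"
        if target.exists():
            print(f"Conflict: {target.name} already exists, skipping")
            continue
        path.rename(target)
        renamed.append(target.name)
    return renamed
```

Sorting before numbering keeps the mapping from original name to suffix stable, and `path.suffix` carries the original extension over unchanged.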
# Step 1: Create folders
./create_folders.sh data_sample_folders.txt /path/to/specimens
# Step 2: Add sample images (2 per folder)
./create_sample_images.sh /path/to/specimens 2
# Step 3: Rename folders based on catalog numbers
python rename_folders.py data_sample_data.csv /path/to/specimens
# Step 4: Rename files to match folder names
python rename_files_by_folder.py /path/to/specimens
All scripts include:
- Input validation
- Existence checks for files and folders
- Detailed error messages
- Summary reports with counts
- Duplicate detection: Scripts skip operations if target already exists
- Non-destructive: Warns about conflicts rather than overwriting
- Informative output: Shows each operation as it happens
- Summary reports: Clear accounting of all operations
- Blank catalog numbers: rename_folders.py skips records with empty catalogNumber fields and reports them
- File ordering: rename_files_by_folder.py sorts files alphabetically for consistent results
- Extended alphabets: Supports more than 26 files per folder (A-Z, then AA-AZ, BA-BZ, etc.)
Each script provides:
- Real-time progress updates
- Detailed operation logs
- Summary statistics
- Error and warning messages
Example summary:
==================================================
Summary:
Successfully renamed: 138
Folders not found: 5
Errors: 0
Records with blank catalogNumber: 12
==================================================
This workflow is designed for:
- Managing specimen image collections
- Organizing herbarium specimen folders
- Batch renaming based on catalog systems
- Standardizing file naming conventions
- Migrating folder structures based on database records
Folders not found:
- Verify the recordNumber values in your CSV match actual folder names
- Check for extra whitespace in folder names or CSV values
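When many folders are reported as not found, it can help to compare the CSV against the directory listing directly. The following is a small diagnostic sketch, not part of the repository; `find_mismatches` is a hypothetical helper that lists recordNumber values with no exact folder match and flags those that would match once surrounding whitespace is removed.

```python
import csv
from pathlib import Path


def find_mismatches(csv_path, base_dir):
    """Return (missing, whitespace_only) lists of recordNumber values."""
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        records = [row.get("recordNumber") or "" for row in csv.DictReader(handle)]
    folders = [p.name for p in Path(base_dir).iterdir() if p.is_dir()]

    exact = set(folders)
    stripped = {name.strip() for name in folders}
    # Records that have no folder with exactly the same name
    missing = [r for r in records if r and r not in exact]
    # Subset of those that differ from a folder name only by whitespace
    whitespace_only = [r for r in missing if r.strip() in stripped]
    return missing, whitespace_only
```

Printing the second list with `repr()` makes leading or trailing spaces visible.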
Permission errors:
- Ensure bash scripts are executable: chmod +x *.sh
- Verify write permissions in target directories
CSV format issues:
- Confirm CSV has headers: recordNumber and catalogNumber
- Check for proper CSV encoding (UTF-8 recommended)
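Both CSV checks can be run before starting the workflow. This is a minimal sketch under the assumption that the header names must match exactly; `check_csv_headers` and the message strings are hypothetical and not part of the scripts in this repository.

```python
import csv

REQUIRED_COLUMNS = ("recordNumber", "catalogNumber")


def check_csv_headers(csv_path):
    """Return a list of problems found in the CSV header row (empty if none)."""
    try:
        # utf-8-sig reads plain UTF-8 and also strips a leading byte-order mark
        with open(csv_path, newline="", encoding="utf-8-sig") as handle:
            header = next(csv.reader(handle), [])
    except UnicodeDecodeError:
        return ["header could not be decoded as UTF-8"]
    return [f"missing column: {name}" for name in REQUIRED_COLUMNS if name not in header]
```

An empty list means the header row decoded cleanly and contains both required columns; the rest of the file is not validated by this check.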
These scripts are provided as-is for specimen management workflows.
Feel free to submit issues or pull requests for improvements.