BRITorg/field_image_workflow

Field Image Folder and File Renaming Workflow

This repository contains scripts for managing field images and the specimen records they belong to. The first script renames folders containing field images based on specimen catalog numbers provided in a CSV file. A second script then renames the files in each folder to match the folder name. Sample data and helper scripts for creating test folders and files are included so the workflow can be tried end to end.

Overview

Testing the scripts end to end takes four steps:

  1. Create folders from a list of record numbers
  2. Add sample image files to each folder
  3. Rename folders based on catalog number data from a CSV file
  4. Rename files within folders to match the folder name

Prerequisites

  • Python 3.6+
  • Bash shell (Linux/macOS or WSL on Windows)

Files in this Repository

Scripts

  • create_folders.sh - Creates folders from a text file list
  • create_sample_images.sh - Creates sample image files in subfolders
  • rename_folders.py - Renames folders based on CSV data
  • rename_files_by_folder.py - Renames files within folders

Sample Data

  • data_sample_folders.txt - Example list of folder names (record numbers)
  • data_sample_data.csv - Example CSV with recordNumber and catalogNumber mappings
  • data_sample_data.xlsx - Excel version of the sample data

Workflow Steps

Step 1: Create Folders

Creates a folder for each line in a text file.

chmod +x create_folders.sh
./create_folders.sh data_sample_folders.txt /path/to/base/folder

Input: A text file with one folder name per line (e.g., AV_1112, AV_1113, etc.)

Output: Creates folders with names matching each line

Example:

/path/to/base/folder/
  ├── AV_1112/
  ├── AV_1113/
  ├── AV_1114/
  └── ...
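
For readers who would rather stay in Python, the behavior described above can be approximated with pathlib. This is a minimal sketch, not the contents of create_folders.sh: the function name create_folders_from_list is hypothetical, and ignoring blank lines and leaving existing folders untouched are assumptions.

```python
from pathlib import Path


def create_folders_from_list(list_file, base_dir):
    """Create one folder under base_dir per non-blank line in list_file.

    Hypothetical helper. Returns (created, skipped) lists of folder names.
    """
    base = Path(base_dir)
    created, skipped = [], []
    for line in Path(list_file).read_text(encoding="utf-8").splitlines():
        name = line.strip()
        if not name:
            continue  # assumption: blank lines are ignored
        target = base / name
        if target.exists():
            skipped.append(name)  # assumption: existing folders are left alone
        else:
            target.mkdir(parents=True)  # also creates base_dir if it is missing
            created.append(name)
    return created, skipped
```

Calling create_folders_from_list("data_sample_folders.txt", "/path/to/base/folder") would mirror the shell command shown above.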

Step 2: Add Sample Images

Creates sample image files in each subfolder.

chmod +x create_sample_images.sh
./create_sample_images.sh /path/to/base/folder 2

Parameters:

  • First parameter: Base folder path containing subfolders
  • Second parameter (optional): Number of images to create per folder (default: 2)

Output: Creates image1.jpg, image2.jpg, etc. in each subfolder

Example:

/path/to/base/folder/
  ├── AV_1112/
  │   ├── image1.jpg
  │   └── image2.jpg
  ├── AV_1113/
  │   ├── image1.jpg
  │   └── image2.jpg
  └── ...
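
A rough Python equivalent of this step is sketched below. It is an illustration only: create_sample_images is a hypothetical function name, and writing empty placeholder files is an assumption, since this README does not say whether create_sample_images.sh writes real JPEG data.

```python
from pathlib import Path


def create_sample_images(base_dir, count=2):
    """Add image1.jpg ... image<count>.jpg to every immediate subfolder.

    Hypothetical helper. Returns the number of files created.
    """
    made = 0
    subfolders = sorted(p for p in Path(base_dir).iterdir() if p.is_dir())
    for folder in subfolders:
        for i in range(1, count + 1):
            target = folder / f"image{i}.jpg"
            if not target.exists():
                target.touch()  # assumption: an empty file is enough for testing
                made += 1
    return made
```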

Step 3: Rename Folders Based on CSV Data

Renames folders by prepending the catalog number from a CSV file.

python rename_folders.py data_sample_data.csv /path/to/base/folder

CSV Requirements:

  • Must contain two columns: recordNumber and catalogNumber
  • Records with blank catalogNumber values will be skipped with INFO notices

Example CSV:

recordNumber,catalogNumber
AV_1112,
AV_1113,BRIT788514
AV_1114,BRIT788515

Renaming Logic:

  • Matches folder names to recordNumber values
  • Renames to catalogNumber_recordNumber format
  • Example: AV_1113 → BRIT788514_AV_1113

Output:

/path/to/base/folder/
  ├── AV_1112/                  (unchanged - blank catalogNumber)
  ├── BRIT788514_AV_1113/
  ├── BRIT788515_AV_1114/
  └── ...

Summary Report:

  • Successfully renamed folders
  • Folders not found
  • Errors
  • Records with blank catalogNumber
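
The matching and renaming logic above can be sketched as follows. This is not the source of rename_folders.py: the function name, the message wording, and the decision to count an already existing target as an error are all assumptions made for illustration.

```python
import csv
from pathlib import Path


def rename_folders(csv_path, base_dir):
    """Rename <recordNumber> folders to <catalogNumber>_<recordNumber>.

    Hypothetical sketch. Returns counts that mirror the summary categories.
    """
    base = Path(base_dir)
    counts = {"renamed": 0, "not_found": 0, "errors": 0, "blank": 0}
    # utf-8-sig tolerates the byte order mark some spreadsheet exports add
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        for row in csv.DictReader(handle):
            record = (row.get("recordNumber") or "").strip()
            catalog = (row.get("catalogNumber") or "").strip()
            if not catalog:
                print(f"INFO: blank catalogNumber for {record}, skipping")
                counts["blank"] += 1
                continue
            if not record:
                counts["errors"] += 1  # never treat base_dir itself as a match
                continue
            source = base / record
            target = base / f"{catalog}_{record}"
            if not source.is_dir():
                counts["not_found"] += 1
            elif target.exists():
                print(f"WARNING: {target.name} already exists, skipping")
                counts["errors"] += 1  # assumption: conflicts count as errors
            else:
                source.rename(target)
                counts["renamed"] += 1
    return counts
```

Run against the three-row example CSV and the folders from Step 1, this sketch renames two folders and reports one blank catalogNumber.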

Step 4: Rename Files Within Folders

Renames all files in each folder to match the folder name with alphabetic suffixes.

python rename_files_by_folder.py /path/to/base/folder

Renaming Logic:

  • Files are renamed to foldername_A.ext, foldername_B.ext, etc.
  • Alphabetic suffixes are added in order (A, B, C... Z, AA, AB...)
  • File extensions are preserved

Example:

Before:
  BRIT788514_AV_1113/
    ├── image1.jpg
    └── image2.jpg

After:
  BRIT788514_AV_1113/
    ├── BRIT788514_AV_1113_A.jpg
    └── BRIT788514_AV_1113_B.jpg
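
The suffix sequence (A through Z, then AA, AB, and so on) is the same scheme spreadsheets use for column letters. The sketch below shows one way to generate it and apply it to a folder. Both function names are hypothetical, and sorting by file name and skipping existing targets are assumptions rather than a description of rename_files_by_folder.py.

```python
from pathlib import Path


def alpha_suffix(index):
    """Map 0 -> A, 25 -> Z, 26 -> AA, 27 -> AB (spreadsheet-column style)."""
    letters = ""
    index += 1
    while index > 0:
        index, remainder = divmod(index - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def rename_files_in_folder(folder):
    """Rename files to <foldername>_<suffix><ext>; return the new names."""
    folder = Path(folder)
    files = sorted((p for p in folder.iterdir() if p.is_file()),
                   key=lambda p: p.name)
    renamed = []
    for i, path in enumerate(files):
        target = folder / f"{folder.name}_{alpha_suffix(i)}{path.suffix}"
        if target.exists():
            continue  # assumption: never overwrite an existing file
        path.rename(target)
        renamed.append(target.name)
    return renamed
```

Subtracting 1 inside the loop is what makes the sequence roll over from Z to AA rather than to BA, which a plain base-26 conversion would produce.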

Complete Workflow Example

# Step 1: Create folders
./create_folders.sh data_sample_folders.txt /path/to/specimens

# Step 2: Add sample images (2 per folder)
./create_sample_images.sh /path/to/specimens 2

# Step 3: Rename folders based on catalog numbers
python rename_folders.py data_sample_data.csv /path/to/specimens

# Step 4: Rename files to match folder names
python rename_files_by_folder.py /path/to/specimens

Script Features

Error Handling

All scripts include:

  • Input validation
  • Existence checks for files and folders
  • Detailed error messages
  • Summary reports with counts

Safety Features

  • Duplicate detection: Scripts skip operations if target already exists
  • Non-destructive: Warns about conflicts rather than overwriting
  • Informative output: Shows each operation as it happens
  • Summary reports: Clear accounting of all operations
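
The skip-instead-of-overwrite behavior can be illustrated with a small guard around the rename call. This is a sketch under assumptions: safe_rename is a hypothetical name and the message text is invented. The explicit check matters because, on Unix, renaming a file onto an existing file replaces it without warning.

```python
from pathlib import Path


def safe_rename(source, target):
    """Rename source to target unless target already exists.

    Hypothetical helper. Returns True if renamed, False if skipped.
    """
    source, target = Path(source), Path(target)
    if target.exists():
        print(f"WARNING: {target} already exists, skipping {source.name}")
        return False
    source.rename(target)
    print(f"Renamed {source.name} -> {target.name}")
    return True
```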

Special Handling

  • Blank catalog numbers: rename_folders.py skips records with empty catalogNumber fields and reports them
  • File ordering: rename_files_by_folder.py sorts files alphabetically for consistent results
  • Extended alphabets: Supports more than 26 files per folder (A-Z, then AA-AZ, BA-BZ, etc.)

Output and Logging

Each script provides:

  1. Real-time progress updates
  2. Detailed operation logs
  3. Summary statistics
  4. Error and warning messages

Example summary:

==================================================
Summary:
  Successfully renamed: 138
  Folders not found: 5
  Errors: 0
  Records with blank catalogNumber: 12
==================================================

Use Cases

This workflow is designed for:

  • Managing specimen image collections
  • Organizing herbarium specimen folders
  • Batch renaming based on catalog systems
  • Standardizing file naming conventions
  • Migrating folder structures based on database records

Troubleshooting

Folders not found:

  • Verify the recordNumber values in your CSV match actual folder names
  • Check for extra whitespace in folder names or CSV values
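
Stray whitespace is hard to spot by eye, so a quick check like the one below may help. It is a diagnostic sketch, not part of the repository: find_whitespace_mismatches is a hypothetical name, and it assumes the CSV uses the recordNumber header described above.

```python
import csv
from pathlib import Path


def find_whitespace_mismatches(csv_path, base_dir):
    """Return (csv_value, folder_name) pairs that match only after stripping."""
    folders = [p.name for p in Path(base_dir).iterdir() if p.is_dir()]
    by_stripped = {name.strip(): name for name in folders}
    mismatches = []
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        for row in csv.DictReader(handle):
            value = row.get("recordNumber") or ""
            match = by_stripped.get(value.strip())
            if match is not None and match != value:
                mismatches.append((value, match))
    return mismatches
```

Each returned pair shows the raw CSV value next to the folder it would have matched, so a trailing space shows up as a difference between the two strings.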

Permission errors:

  • Ensure bash scripts are executable: chmod +x *.sh
  • Verify write permissions in target directories

CSV format issues:

  • Confirm CSV has headers: recordNumber and catalogNumber
  • Check for proper CSV encoding (UTF-8 recommended)
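
A header check along these lines can confirm both points before running the rename. It is a sketch with a hypothetical function name (missing_columns). Reading with the utf-8-sig codec is a deliberate choice here: it strips the byte order mark that some spreadsheet exports prepend, which would otherwise become part of the first column name.

```python
import csv

REQUIRED_COLUMNS = {"recordNumber", "catalogNumber"}


def missing_columns(csv_path):
    """Return the required column names absent from the CSV header row."""
    with open(csv_path, newline="", encoding="utf-8-sig") as handle:
        header = next(csv.reader(handle), [])  # empty file -> empty header
    present = {name.strip() for name in header}
    return sorted(REQUIRED_COLUMNS - present)
```

An empty list means both headers were found. A file that is not valid UTF-8 raises UnicodeDecodeError instead, which points to an encoding problem.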

License

These scripts are provided as-is for specimen management workflows.

Contributing

Feel free to submit issues or pull requests for improvements.

About

Scripts for processing field images to be associated with herbarium specimen records
