add DocLayout-YOLO & D-Fine model backend examples #787

Open: wants to merge 1 commit into base `master`
72 changes: 72 additions & 0 deletions label_studio_ml/examples/d_fine/Dockerfile
@@ -0,0 +1,72 @@
# Use a PyTorch base image with CUDA support
FROM pytorch/pytorch:2.1.2-cuda12.1-cudnn8-runtime
ARG DEBIAN_FRONTEND=noninteractive
ARG TEST_ENV

WORKDIR /app

# Install essential packages and D-FINE dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
git \
wget \
curl \
# OpenCV runtime libraries: D-FINE or its dependencies may need these system libs (pip wheels don't bundle them)
libgl1-mesa-glx \
libglib2.0-0 \
&& rm -rf /var/lib/apt/lists/*

ENV PYTHONUNBUFFERED=1 \
PYTHONDONTWRITEBYTECODE=1 \
PIP_CACHE_DIR=/.cache \
PORT=9090 \
WORKERS=1 \
THREADS=4 \
CUDA_HOME=/usr/local/cuda \
DFINE_CODE_DIR=/app/d-fine-code

# Set Conda's CUDA_HOME if it's a conda based PyTorch image, otherwise system CUDA_HOME
# For official pytorch/pytorch images, system CUDA_HOME is usually fine.
# ENV CUDA_HOME=/opt/conda

ENV PATH="${CUDA_HOME}/bin:${PATH}"
ENV TORCH_CUDA_ARCH_LIST="6.0;6.1;7.0;7.5;8.0;8.6+PTX;8.9;9.0"

# Install base requirements for Label Studio ML Backend
COPY requirements-base.txt .
RUN --mount=type=cache,target=${PIP_CACHE_DIR},sharing=locked \
pip install --no-cache-dir -r requirements-base.txt

# --- D-FINE specific setup ---
# 1. Copy D-FINE's requirements.txt
COPY d_fine_requirements.txt .
# 2. Install D-FINE's Python dependencies
RUN --mount=type=cache,target=${PIP_CACHE_DIR},sharing=locked \
pip install --no-cache-dir -r d_fine_requirements.txt

# 3. Copy D-FINE's 'src' and 'configs' directories
COPY d-fine-code/src ${DFINE_CODE_DIR}/src
COPY d-fine-code/configs ${DFINE_CODE_DIR}/configs
# --- End D-FINE specific setup ---

# Install ML backend example specific requirements (if any, usually empty for this setup)
#COPY requirements.txt .
#RUN --mount=type=cache,target=${PIP_CACHE_DIR},sharing=locked \
# pip install --no-cache-dir -r requirements.txt

# install test requirements if needed
COPY requirements-test.txt .
# build only when TEST_ENV="true"
RUN --mount=type=cache,target=${PIP_CACHE_DIR},sharing=locked \
if [ "$TEST_ENV" = "true" ]; then \
pip3 install -r requirements-test.txt; \
fi

# Copy the rest of the ML backend example files
COPY . ./

# Set PYTHONPATH to include the D-FINE source code directory
# (the ${VAR:+...} form avoids a dangling ':' when PYTHONPATH is unset)
ENV PYTHONPATH=${DFINE_CODE_DIR}${PYTHONPATH:+:${PYTHONPATH}}

EXPOSE ${PORT}

CMD gunicorn --preload --bind :${PORT} --workers ${WORKERS} --threads ${THREADS} --timeout 0 _wsgi:app
101 changes: 101 additions & 0 deletions label_studio_ml/examples/d_fine/README.md
@@ -0,0 +1,101 @@
# D-FINE ML Backend for Label Studio

This ML backend integrates the [D-FINE](https://github.com/Peterande/D-FINE) object detection model with Label Studio. It allows you to get pre-annotations for object detection tasks using pre-trained D-FINE models.

## Features

- Loads pre-trained D-FINE models (e.g., COCO-trained models).
- Provides bounding box predictions for `RectangleLabels` in Label Studio.
- Configurable via environment variables for model paths, device, and thresholds.

## Prerequisites

1. **Docker and Docker Compose**: For building and running the ML backend.
2. **D-FINE Model Files**:
* **Source Code**: You need the `src` and `configs` directories from the [official D-FINE repository](https://github.com/Peterande/D-FINE).
* **Model Weights**: Download the desired `.pth` model weights (e.g., `dfine_l_coco.pth`).
3. **Label Studio**: A running instance of Label Studio.

## Setup

1. **Clone this repository** (if you haven't already) and navigate to this example directory:
```bash
# Assuming you are in the root of label-studio-ml-backend
cd label_studio_ml/examples/d_fine
```

2. **Prepare D-FINE code**:
* Create a directory named `d-fine-code` within the current `label_studio_ml/examples/d_fine` directory.
* Copy the `src` and `configs` directories from your clone of the [D-FINE repository](https://github.com/Peterande/D-FINE) into this newly created `d-fine-code` directory.
Your structure should look like:
```
label_studio_ml/examples/d_fine/
├── d-fine-code/
│ ├── src/
│ └── configs/
├── Dockerfile
├── docker-compose.yml
├── model.py
└── ... (other files in this example)
```

3. **Prepare D-FINE model weights**:
* Create a directory named `models` within the current `label_studio_ml/examples/d_fine` directory.
* Place your downloaded D-FINE `.pth` model weights file (e.g., `dfine_l_coco.pth`) into this `models` directory.
Your structure should look like:
```
label_studio_ml/examples/d_fine/
├── models/
│ └── dfine_l_coco.pth (or your chosen model weights)
└── ... (other files)
```

4. **Configure `docker-compose.yml`**:
* Adjust environment variables as needed, especially:
* `DFINE_CONFIG_FILE`: Name of the D-FINE `.yml` config file (e.g., `dfine_hgnetv2_l_coco.yml`). This file must exist in `d-fine-code/configs/dfine/`.
* `DFINE_MODEL_WEIGHTS`: Name of the D-FINE `.pth` weights file (e.g., `dfine_l_coco.pth`). This file must exist in the `models` directory you created.
* `DEVICE`: Set to `cuda` if you have a GPU and want to use it, otherwise `cpu`.
* `LABEL_STUDIO_URL` and `LABEL_STUDIO_API_KEY`: required when the backend must fetch image data through Label Studio (e.g., local file uploads, or cloud storage the backend cannot reach directly).
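
   The PR's `docker-compose.yml` is not shown in this excerpt; as a rough sketch, its `environment` section might look like the following. The service name, port mapping, and volume path are illustrative assumptions; the variable names come from the list above:

   ```yaml
   services:
     d_fine:
       build: .
       ports:
         - "9090:9090"
       environment:
         - DFINE_CONFIG_FILE=dfine_hgnetv2_l_coco.yml
         - DFINE_MODEL_WEIGHTS=dfine_l_coco.pth
         - DEVICE=cuda
         - LABEL_STUDIO_URL=http://host.docker.internal:8080
         - LABEL_STUDIO_API_KEY=<your-api-key>
       volumes:
         # mount the local weights dir at the backend's default MODEL_DIR
         - ./models:/data/models
   ```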

## Running the ML Backend

1. **Build and start the Docker container**:
```bash
docker-compose up --build
```
If you have a GPU and configured it in `docker-compose.yml`, it should be utilized.

2. **Verify the backend is running**:
Open your browser or use `curl` to check the health endpoint:
```bash
curl http://localhost:9090/health
```
You should see `{"status":"UP","model_class":"DFINEModel"}`.

## Connecting to Label Studio

1. Open your Label Studio project.
2. Go to **Settings > Machine Learning**.
3. Click **Add Model**.
4. Enter a **Title** for your ML backend (e.g., "D-FINE Detector").
5. Set the **URL** to `http://localhost:9090` (or the appropriate host/port if not running locally or on a different port).
6. Enable **Interactive preannotations** if desired.
7. Click **Validate and Save**.

## Labeling Configuration

This ML backend expects a labeling configuration with an `Image` object tag and a `RectangleLabels` control tag.

Example:
```xml
<View>
<Image name="image" value="$image"/>
<RectangleLabels name="label" toName="image" model_score_threshold="0.3">
<!-- Map Label Studio labels to D-FINE model's COCO class names -->
<!-- D-FINE outputs COCO class names like 'person', 'car', etc. -->
<Label value="Pedestrian" background="green" predicted_values="person"/>
<Label value="Vehicle" background="blue" predicted_values="car,truck,bus,motorcycle"/>
<Label value="Bicycle" background="orange" predicted_values="bicycle"/>
<!-- Add more labels as needed, mapping to COCO_CLASSES -->
</RectangleLabels>
</View>
```
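
Although `model.py` is not shown in this excerpt, a backend wired to this configuration typically returns predictions in Label Studio's percent-based `rectanglelabels` result format. A minimal sketch of the pixel-to-percent conversion (the function name and sample values are illustrative, not from the PR):

```python
# Sketch: convert one detection from pixel coordinates to the percent-based
# result dict that a RectangleLabels control expects.
def to_ls_result(x0, y0, x1, y1, label, score, img_w, img_h):
    """Map a pixel-space box (x0, y0, x1, y1) to a Label Studio result."""
    return {
        "from_name": "label",      # must match the RectangleLabels name
        "to_name": "image",        # must match the Image name
        "type": "rectanglelabels",
        "score": score,
        "value": {
            "x": 100.0 * x0 / img_w,            # left edge, % of width
            "y": 100.0 * y0 / img_h,            # top edge, % of height
            "width": 100.0 * (x1 - x0) / img_w,
            "height": 100.0 * (y1 - y0) / img_h,
            "rectanglelabels": [label],
        },
    }

result = to_ls_result(64, 48, 192, 144, "person", 0.91, 640, 480)
# result["value"] == {"x": 10.0, "y": 10.0, "width": 20.0, "height": 20.0,
#                     "rectanglelabels": ["person"]}
```

Note that `predicted_values` in the labeling config maps the model's COCO class name (here `person`) onto your project label (`Pedestrian`) on the Label Studio side.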
139 changes: 139 additions & 0 deletions label_studio_ml/examples/d_fine/_wsgi.py
@@ -0,0 +1,139 @@
import os
import argparse
import json
import logging
import logging.config

# Set a default log level if LOG_LEVEL is not defined
log_level = os.getenv("LOG_LEVEL", "INFO")

logging.config.dictConfig({
"version": 1,
"disable_existing_loggers": False,
"formatters": {
"standard": {
"format": "[%(asctime)s] [%(levelname)s] [%(name)s::%(funcName)s::%(lineno)d] %(message)s"
}
},
"handlers": {
"console": {
"class": "logging.StreamHandler",
"level": log_level,
"stream": "ext://sys.stdout",
"formatter": "standard"
}
},
"root": {
"level": log_level,
"handlers": [
"console"
],
"propagate": True
}
})

from label_studio_ml.api import init_app
from model import DFINEModel # Changed from NewModel to DFINEModel


_DEFAULT_CONFIG_PATH = os.path.join(os.path.dirname(__file__), 'config.json')


def get_kwargs_from_config(config_path=_DEFAULT_CONFIG_PATH):
if not os.path.exists(config_path):
return dict()
with open(config_path) as f:
config = json.load(f)
assert isinstance(config, dict)
return config


if __name__ == "__main__":
parser = argparse.ArgumentParser(description='Label Studio ML Backend for D-FINE')
parser.add_argument(
'-p', '--port', dest='port', type=int, default=9090,
help='Server port')
parser.add_argument(
'--host', dest='host', type=str, default='0.0.0.0',
help='Server host')
parser.add_argument(
'--kwargs', '--with', dest='kwargs', metavar='KEY=VAL', nargs='+', type=lambda kv: kv.split('='),
help='Additional LabelStudioMLBase model initialization kwargs')
parser.add_argument(
'-d', '--debug', dest='debug', action='store_true',
help='Switch debug mode')
parser.add_argument(
'--log-level', dest='log_level', choices=['DEBUG', 'INFO', 'WARNING', 'ERROR'], default=log_level,
help='Logging level')
parser.add_argument(
'--model-dir', dest='model_dir', default=os.getenv('MODEL_DIR', '/data/models'), # Default from Docker env
help='Directory where models (.pth weights) are stored')
parser.add_argument(
'--check', dest='check', action='store_true',
help='Validate model instance before launching server')
parser.add_argument('--basic-auth-user',
default=os.environ.get('ML_SERVER_BASIC_AUTH_USER', None),
help='Basic auth user')

parser.add_argument('--basic-auth-pass',
default=os.environ.get('ML_SERVER_BASIC_AUTH_PASS', None),
help='Basic auth pass')

args = parser.parse_args()

# setup logging level
if args.log_level:
logging.root.setLevel(args.log_level)

def isfloat(value):
try:
float(value)
return True
except ValueError:
return False

def parse_kwargs():
param = dict()
if args.kwargs:
for k, v in args.kwargs:
if v.isdigit():
param[k] = int(v)
elif v == 'True' or v == 'true':
param[k] = True
elif v == 'False' or v == 'false':
param[k] = False
elif isfloat(v):
param[k] = float(v)
else:
param[k] = v
return param

kwargs_parsed = get_kwargs_from_config()
kwargs_parsed.update(parse_kwargs())

# Pass MODEL_DIR to the model constructor if needed, or rely on env vars within the model
if args.model_dir:
kwargs_parsed['model_dir'] = args.model_dir
# Also update environment variable if model relies on it directly and it's not already set
if not os.getenv('MODEL_DIR'):
os.environ['MODEL_DIR'] = args.model_dir


if args.check:
print(f'Check "{DFINEModel.__name__}" instance creation...')
model = DFINEModel(**kwargs_parsed)

app = init_app(model_class=DFINEModel, **kwargs_parsed) # Pass parsed kwargs here

app.run(host=args.host, port=args.port, debug=args.debug)

else:
# for WSGI server use (gunicorn in the Dockerfile)
# Ensure MODEL_DIR is available for the model initialization
kwargs_for_init = get_kwargs_from_config()
if not os.getenv('MODEL_DIR') and 'model_dir' not in kwargs_for_init:
kwargs_for_init['model_dir'] = os.getenv('MODEL_DIR', '/data/models')
if not os.getenv('MODEL_DIR'):
os.environ['MODEL_DIR'] = kwargs_for_init['model_dir']

app = init_app(model_class=DFINEModel, **kwargs_for_init)
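
The `--kwargs KEY=VAL` coercion performed by `parse_kwargs()` above can be sketched standalone (reimplemented here for illustration; same int, bool, float, string precedence as in `_wsgi.py`):

```python
# Standalone sketch of the KEY=VAL value coercion used for --kwargs.
def coerce(value):
    if value.isdigit():                 # "2" -> 2 (checked before float)
        return int(value)
    if value in ("True", "true"):
        return True
    if value in ("False", "false"):
        return False
    try:
        return float(value)             # "0.3" -> 0.3
    except ValueError:
        return value                    # anything else stays a string

args = ["model_score_threshold=0.3", "device=cuda", "preload=true", "workers=2"]
parsed = {k: coerce(v) for k, v in (a.split("=") for a in args)}
# parsed == {"model_score_threshold": 0.3, "device": "cuda",
#            "preload": True, "workers": 2}
```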
@@ -0,0 +1,41 @@
task: detection

evaluator:
type: CocoEvaluator
iou_types: ['bbox', ]

num_classes: 80
remap_mscoco_category: True

train_dataloader:
type: DataLoader
dataset:
type: CocoDetection
img_folder: /data/COCO2017/train2017/
ann_file: /data/COCO2017/annotations/instances_train2017.json
return_masks: False
transforms:
type: Compose
ops: ~
shuffle: True
num_workers: 4
drop_last: True
collate_fn:
type: BatchImageCollateFunction


val_dataloader:
type: DataLoader
dataset:
type: CocoDetection
img_folder: /data/COCO2017/val2017/
ann_file: /data/COCO2017/annotations/instances_val2017.json
return_masks: False
transforms:
type: Compose
ops: ~
shuffle: False
num_workers: 4
drop_last: False
collate_fn:
type: BatchImageCollateFunction