An interactive web interface for the image editing system with scribble and add modes. In scribble mode, users draw directly on the canvas, and the drawing is passed alongside a prompt to guide generation. In add mode, users draw a bounding box around a region, which is passed with a prompt to generate the specified objects within that region.


This project consists of two parts:
- Frontend: React TypeScript application for the user interface
- Backend: Flask API server that runs the GPU-intensive diffusion models

Features:
- Dual Image Display: side-by-side boxes showing the current image and the edit preview
- Interactive Drawing: overlay canvas for scribble mode and bounding box selection
- Mode Switching: toggle between Add Mode and Scribble Mode
- Accept/Reject Workflow: preview edits before applying them
- Settings Panel: adjust generation parameters such as seed, steps, and brush size

Editing modes:
- Add Mode: draw a bounding box and specify what to add to that region
- Scribble Mode: draw rough sketches that are transformed into detailed additions (such as hair ribbons)
Backend setup:
1. Navigate to the main project directory: `cd /path/to/box_edit`
2. Install the Python dependencies: `pip install -r api_requirements.txt`
3. Start the Flask server: `python app.py`

The server starts on `http://localhost:5000` and automatically loads the diffusion models.
Frontend setup:
1. Navigate to the React app directory: `cd image-editor-ui`
2. Install the Node.js dependencies: `npm install`
3. Important: update `API_BASE` in `src/App.tsx` if your server is not on localhost:

        // Change this line to match your server address
        const API_BASE = 'http://YOUR_SERVER_IP:5000/api';

4. Start the React development server: `npm start`

The UI will open at `http://localhost:3000`.
To create a base image:
1. Enter a prompt (e.g., "anime girl with long hair") and click Generate
2. Wait for the base image to appear in the left box
3. Choose your editing mode and start editing!
Add Mode:
1. Click Add Mode to enable bounding box selection
2. Draw a rectangle on the image where you want to add something
3. Enter what you want to add (e.g., "butterfly", "flower")
4. Click Add and wait for the preview
5. Accept or Reject the edit
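Under the hood, the rectangle the user drags has to become the `[x0, y0, x1, y1]` box the API expects, no matter which corner the drag started from. A minimal sketch of that normalization; the function name and the clamping behaviour are illustrative, not the app's actual code:

```python
def normalize_box(start, end, width, height):
    """Turn two drag points into an ordered [x0, y0, x1, y1] box.

    Works regardless of which corner the user started dragging
    from, and clamps the result to the image bounds.
    """
    (sx, sy), (ex, ey) = start, end
    x0, x1 = sorted((sx, ex))
    y0, y1 = sorted((sy, ey))
    clamp = lambda v, hi: max(0, min(v, hi))
    return [clamp(x0, width), clamp(y0, height),
            clamp(x1, width), clamp(y1, height)]

# Dragging up-and-left still yields an ordered box:
print(normalize_box((300, 200), (120, 80), 512, 512))  # [120, 80, 300, 200]
```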
Scribble Mode:
1. Click Scribble Mode to enable drawing
2. Draw a rough sketch on the image
3. Enter what the scribble should become (e.g., "pink hair ribbon")
4. Adjust the brush size if needed using the slider in the top bar
5. Click Apply and wait for the preview
6. Accept or Reject the edit
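The scribble itself travels to the server as a base64-encoded image (the `scribble_image` field in the API reference). A stdlib-only sketch of the encoding and the matching server-side decode; whether the frontend sends a bare base64 string or a full canvas data URL is an assumption here, so the decoder handles both:

```python
import base64

def encode_scribble(png_bytes: bytes) -> str:
    """Base64-encode raw image bytes for the scribble_image field."""
    return base64.b64encode(png_bytes).decode("ascii")

def decode_scribble(data: str) -> bytes:
    """Decode on the server, stripping an optional data-URL prefix
    (canvas.toDataURL() produces 'data:image/png;base64,...')."""
    if data.startswith("data:"):
        data = data.split(",", 1)[1]
    return base64.b64decode(data)

# Round trip, with and without the data-URL prefix:
payload = encode_scribble(b"\x89PNG...fake bytes")
assert decode_scribble(payload) == b"\x89PNG...fake bytes"
assert decode_scribble("data:image/png;base64," + payload) == b"\x89PNG...fake bytes"
```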
Settings:
- Seed: controls randomness (the same seed gives the same result)
- Steps: more steps give higher quality but slower generation
- Brush Size: only appears in Scribble Mode
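The seed behaves this way because it initializes the sampler's random number generator. Python's built-in `random` module illustrates the principle; the actual diffusion pipeline seeds a `torch.Generator` in the same spirit:

```python
import random

def sample_noise(seed, n=4):
    """Draw n pseudo-random values from a dedicated, seeded generator."""
    rng = random.Random(seed)
    return [round(rng.random(), 3) for _ in range(n)]

# Identical seeds reproduce the exact same "noise"; different seeds diverge.
assert sample_noise(42) == sample_noise(42)
assert sample_noise(42) != sample_noise(43)
```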
The Flask backend provides the following endpoints; each takes a JSON request body.

Generate a new base image:

    {
      "prompt": "anime girl with long hair",
      "seed": 42,
      "num_inference_steps": 50
    }

Add elements using a bounding box:

    {
      "session_id": "uuid",
      "box": [x0, y0, x1, y1],
      "add_prompt": "butterfly",
      "num_inference_steps": 50
    }

Edit using a scribble drawing:

    {
      "session_id": "uuid",
      "scribble_image": "base64_image_data",
      "scribble_prompt": "pink hair ribbon",
      "num_inference_steps": 50
    }

Accept the current edit preview:

    {
      "session_id": "uuid"
    }

Reject the current edit preview:

    {
      "session_id": "uuid"
    }
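For scripting against the backend, the request bodies above can be built and posted with the standard library alone. The exact endpoint paths are not listed in this README, so the URL in the usage comment is hypothetical; only the payload shapes come from the API reference above:

```python
import json
import urllib.request

API_BASE = "http://localhost:5000/api"  # must match API_BASE in src/App.tsx

def post_json(url: str, payload: dict) -> dict:
    """POST a JSON body and decode the JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def generate_payload(prompt, seed=42, steps=50):
    return {"prompt": prompt, "seed": seed, "num_inference_steps": steps}

def add_payload(session_id, box, add_prompt, steps=50):
    return {"session_id": session_id, "box": list(box),
            "add_prompt": add_prompt, "num_inference_steps": steps}

def scribble_payload(session_id, scribble_b64, scribble_prompt, steps=50):
    return {"session_id": session_id, "scribble_image": scribble_b64,
            "scribble_prompt": scribble_prompt, "num_inference_steps": steps}

# Example (the /generate path is a hypothetical route name):
# result = post_json(f"{API_BASE}/generate",
#                    generate_payload("anime girl with long hair"))
```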
Frontend stack:
- React 19 with TypeScript
- Fabric.js for canvas drawing and interaction
- Axios for API communication
- CSS3 with responsive design

Backend stack:
- Flask with CORS support
- PyTorch and Diffusers for AI models
- ControlNet for scribble-based editing
- InstDiffEdit for automatic attention masking

How it works:
- Session Management: each browser session maintains its own image state
- Latent Caching: base images are stored as latents for faster editing
- Hybrid Masking: combines ControlNet constraints with attention-based blending
- Real-time Preview: edit results are shown before they are applied
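The session management and accept/reject points above can be sketched as a small server-side store: each `session_id` maps to cached base latents plus at most one pending preview, accepting promotes the preview to the new base, and rejecting discards it. Class and field names here are illustrative, not the actual `app.py` implementation:

```python
import uuid

class SessionStore:
    """Illustrative per-session state: cached latents plus one pending preview."""

    def __init__(self):
        self._sessions = {}

    def create(self, latents):
        """Register a new session around freshly generated base latents."""
        sid = str(uuid.uuid4())
        self._sessions[sid] = {"latents": latents, "preview": None}
        return sid

    def set_preview(self, sid, preview_latents):
        """Stage an edit result; it is not the base until accepted."""
        self._sessions[sid]["preview"] = preview_latents

    def accept(self, sid):
        """Promote the pending preview to the new base latents."""
        s = self._sessions[sid]
        if s["preview"] is not None:
            s["latents"], s["preview"] = s["preview"], None
        return s["latents"]

    def reject(self, sid):
        """Discard the pending preview and keep the old base."""
        self._sessions[sid]["preview"] = None
```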
Troubleshooting:

"Failed to generate image. Is the server running?"
- Check that the Flask server is running on the correct port
- Verify that the API_BASE URL in App.tsx matches your server address
- Check the server logs for CUDA/GPU issues

Canvas not responding in drawing modes:
- Ensure you have an image loaded first
- Try refreshing the page and generating a new image
- Check the browser console for JavaScript errors

Out-of-memory errors:
- Reduce the batch size or image resolution in the server code
- Monitor GPU memory usage on the server
- Consider using a CPU fallback for testing

Slow generation times:
- Reduce num_inference_steps (try 20-30 for faster results)
- Ensure CUDA is properly configured on the server
- Consider using xformers for memory optimization
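The memory and xformers suggestions can be bundled into one helper applied when the pipeline is loaded. `enable_attention_slicing`, `enable_model_cpu_offload`, and `enable_xformers_memory_efficient_attention` are standard diffusers pipeline methods, but whether `app.py` exposes a hook like this is an assumption:

```python
def apply_memory_optimizations(pipe, cpu_offload=False, try_xformers=True):
    """Reduce GPU memory use (and optionally speed up attention) on a
    diffusers pipeline passed in by the caller."""
    pipe.enable_attention_slicing()          # trades a little speed for memory
    if cpu_offload:
        pipe.enable_model_cpu_offload()      # requires accelerate
    if try_xformers:
        try:
            pipe.enable_xformers_memory_efficient_attention()
        except Exception:
            pass                             # xformers not installed: skip
    return pipe

# Usage after loading, e.g.:
# pipe = apply_memory_optimizations(pipe, cpu_offload=True)
```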
Tips:
- Use lower step counts (20-30) for faster previews
- Generate base images once and reuse them for multiple edits
- Keep scribbles simple and clear for better results
- Use consistent seeds for reproducible results
To modify the UI:
- Edit the React components in `src/`
- Styles are in `src/App.css`
- API integration is in `src/App.tsx`

To modify the backend:
- Edit `app.py` for API endpoints
- Model logic is imported from the existing project files
- Add new endpoints following the existing pattern
Built on top of the InstDiffEdit and ControlNet image editing system with hybrid attention masking for precise, localized edits.