A tool for managing SLURM job workflows across multiple HPC clusters. ssync bridges local development environments with remote SLURM systems, providing unified job submission, monitoring, and file synchronization.
ssync helps researchers and developers working with SLURM-based HPC systems by providing:
- Local Development - Maintain your development workflow on local machines
- Automated Deployment - Synchronize code to multiple clusters for quick prototyping
- Job Management - Submit and monitor jobs without manual SSH sessions to each cluster
- Monitoring - Track job status and outputs across all configured clusters from a single interface
- Web Interface - UI for job submission and monitoring
uv pip install git+https://github.com/Ramlaoui/ssync.git

Create a configuration file at ~/.config/ssync/config.yaml:
hosts:
  - name: cluster1
    hostname: login.cluster1.edu
    username: your_username
    work_dir: /scratch/your_username/projects
  - name: cluster2
    hostname: hpc.university.edu
    username: your_username
    work_dir: /home/your_username/work
    # Optional: Default SLURM parameters for a given cluster
    # These can be overridden in job scripts / cli submissions
    slurm_defaults:
      partition: gpu
      time: 60  # minutes
      cpus: 4
      mem: 16   # GB

# View all jobs across clusters
ssync status
# Filter by specific host
ssync status --host cluster1
# Show only running jobs
ssync status --state R
# Display recently completed jobs
ssync status --since 1d

# Sync local directory to remote cluster
ssync sync ./project-dir --host cluster1
# Exclude specific patterns
ssync sync ./project-dir --host cluster1 --exclude "*.log"

# Submit a job script
ssync submit job.sh --host cluster1
# Combined sync and submit operation
ssync launch job.sh ./project-dir --host cluster1

# View job output
ssync status --job-id 12345 --cat-output

Launch the complete web interface (serves both API and UI):
# Start in background with HTTPS (default)
ssync web
# Use HTTP instead of HTTPS
ssync web --no-https
# Stop the server
ssync web --stop
# Check if running
ssync web --status
# Run in foreground for debugging
ssync web --foreground

The ssync web command:
- Uses HTTPS by default with auto-generated self-signed certificates
- Runs in the background by default (doesn't block your terminal)
- Builds the frontend automatically if needed
- Serves both API and UI on the same port
- Opens your browser automatically
Access at https://localhost:8042
Note on HTTPS: The first time you access the site, your browser will warn about the self-signed certificate. This is normal for local development. Accept the certificate to proceed.
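Before accepting the browser warning, it can help to confirm that the certificate being presented is the one served by your local ssync instance. The following is a minimal sketch using only the Python standard library; the helper names `pem_fingerprint` and `server_fingerprint` are hypothetical, and the host and port are taken from the address above. It computes a SHA-256 fingerprint that can be compared with the one shown in the browser's certificate details.

```python
import hashlib
import ssl


def pem_fingerprint(pem_cert: str) -> str:
    """Return the SHA-256 fingerprint of a PEM certificate as colon-separated hex."""
    der = ssl.PEM_cert_to_DER_cert(pem_cert)
    digest = hashlib.sha256(der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def server_fingerprint(host: str = "localhost", port: int = 8042) -> str:
    """Fetch the certificate the server presents (without validating it) and fingerprint it."""
    return pem_fingerprint(ssl.get_server_certificate((host, port)))


# Example (requires the web server to be running):
# print(server_fingerprint())
```

This only identifies the certificate; it does not make a self-signed certificate trusted.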
For API-only mode (no UI):
ssync api

Web interface features include:
- Real-time job status dashboard
- Interactive script editor with SLURM directive validation
- Directory browser for source selection
- Job submission interface
- Live log streaming for running jobs
ssync supports a structured script format that separates login node setup from compute node execution. This is particularly useful for clusters where compute nodes lack internet access:
#!/bin/bash
#SBATCH --job-name=experiment
#SBATCH --time=2:00:00
#LOGIN_SETUP_BEGIN
# Commands executed on login node
pip install -r requirements.txt
module load cuda/11.4
#LOGIN_SETUP_END
# Compute node execution
python train.py --epochs 100

The synchronization process automatically respects .gitignore patterns, preventing unnecessary transfer of build artifacts, virtual environments, and other excluded files.
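To give a rough idea of what pattern-based exclusion means in practice, here is a simplified sketch. It is not ssync's implementation and does not cover full .gitignore semantics (negation with `!`, anchored patterns, and `**` are ignored); the function names `load_patterns` and `is_excluded` are hypothetical. The extra `*.ckpt` pattern stands in for something passed via `--exclude`.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath


def load_patterns(gitignore_text: str) -> list[str]:
    """Keep non-empty, non-comment lines; negation (!) is skipped in this sketch."""
    patterns = []
    for line in gitignore_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#") and not line.startswith("!"):
            patterns.append(line.rstrip("/"))
    return patterns


def is_excluded(path: str, patterns: list[str]) -> bool:
    """True if the whole path, or any single path component, matches a pattern."""
    parts = PurePosixPath(path).parts
    return any(
        fnmatch(path, pat) or any(fnmatch(part, pat) for part in parts)
        for pat in patterns
    )


# Patterns from a small .gitignore plus one additional exclude pattern.
patterns = load_patterns("# build output\n__pycache__/\n*.log\n.venv/\n") + ["*.ckpt"]
files = ["train.py", "run.log", ".venv/bin/python", "src/__pycache__/m.pyc", "out/model.ckpt"]
to_sync = [f for f in files if not is_excluded(f, patterns)]
print(to_sync)
```

Only `train.py` survives the filter here, which is the kind of reduction that keeps transfers small.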
Job scripts and metadata are cached locally, allowing retrieval of job information even after SLURM's job history expiration.
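Returning to the structured script format shown earlier: the split between login-node setup and compute-node execution can be pictured as a simple pass over the script lines. The sketch below is an illustration under that assumption, not ssync's actual parser; `split_script` is a hypothetical name, and only the `#LOGIN_SETUP_BEGIN` and `#LOGIN_SETUP_END` markers come from the format itself.

```python
def split_script(script: str) -> tuple[list[str], list[str]]:
    """Split a script into (login_setup_lines, remaining_lines)."""
    login, rest = [], []
    in_setup = False
    for line in script.splitlines():
        marker = line.strip()
        if marker == "#LOGIN_SETUP_BEGIN":
            in_setup = True
        elif marker == "#LOGIN_SETUP_END":
            in_setup = False
        elif in_setup:
            login.append(line)   # would run on the login node
        else:
            rest.append(line)    # stays in the script submitted to SLURM
    return login, rest


script = """#!/bin/bash
#SBATCH --job-name=experiment
#SBATCH --time=2:00:00
#LOGIN_SETUP_BEGIN
pip install -r requirements.txt
module load cuda/11.4
#LOGIN_SETUP_END
python train.py --epochs 100
"""
login_cmds, batch_lines = split_script(script)
print(login_cmds)
print(batch_lines)
```

The first list holds the commands that need internet access; the second is what remains for the compute node, with the `#SBATCH` directives intact.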
ssync provides a REST API for programmatic access:
import requests

# Query job status
# verify=False skips certificate checks for the self-signed cert
response = requests.get("https://localhost:8042/api/status", verify=False)
jobs = response.json()

# Submit a job
response = requests.post(
    "https://localhost:8042/api/jobs/launch",
    json={
        "host": "cluster1",
        "script_content": "#!/bin/bash\npython train.py",
        "source_dir": "/path/to/project",
    },
    verify=False,  # same self-signed certificate as above
)

For production deployments or multi-user environments, enable API authentication:
# Generate API key
ssync auth setup
# Enable authentication requirement
export SSYNC_REQUIRE_API_KEY=true
ssync api

Requirements:
- Python 3.11 or higher
- SSH access to target SLURM clusters
- rsync (typically pre-installed on Unix systems)
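The local side of these requirements can be checked with a short script. This is a minimal sketch with a hypothetical helper name, `check_requirements`; it only confirms the Python version and that the `ssh` and `rsync` executables are on `PATH`, and it does not test access to any cluster.

```python
import shutil
import sys


def check_requirements(min_python=(3, 11), tools=("ssh", "rsync")) -> list[str]:
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < min_python:
        found = f"{sys.version_info.major}.{sys.version_info.minor}"
        problems.append(f"Python {min_python[0]}.{min_python[1]}+ required, found {found}")
    for tool in tools:
        if shutil.which(tool) is None:
            problems.append(f"'{tool}' not found on PATH")
    return problems


for problem in check_requirements():
    print("missing:", problem)
```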
To modify the web interface:
cd web-frontend
npm install
npm run dev # Development server with hot reload
npm run build # Production build

- Verify SSH access:
  ssh <cluster-hostname>
- Configure SSH keys for passwordless access:
  ssh-copy-id <cluster-hostname>
- Review .gitignore patterns for large file exclusions
- Use --exclude flag for additional pattern-based filtering
- Ensure job completion before attempting output retrieval
- Verify work_dir configuration matches actual job execution directory
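The SSH checks in the list above can also be scripted, which is convenient when several clusters are configured. The sketch below is an assumption about how one might do it, with hypothetical helper names; it relies on the standard OpenSSH options `BatchMode=yes` (fail instead of prompting for a password) and `ConnectTimeout`.

```python
import subprocess


def ssh_check_command(hostname: str, timeout_s: int = 5) -> list[str]:
    """Build an ssh invocation that fails fast instead of prompting for a password."""
    return [
        "ssh",
        "-o", "BatchMode=yes",
        "-o", f"ConnectTimeout={timeout_s}",
        hostname,
        "true",  # trivial remote command; exit code 0 means login worked
    ]


def can_connect(hostname: str, timeout_s: int = 5) -> bool:
    """True if key-based login succeeds and the remote `true` command exits 0."""
    try:
        result = subprocess.run(
            ssh_check_command(hostname, timeout_s),
            capture_output=True,
            timeout=timeout_s + 5,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


# Example: can_connect("login.cluster1.edu")
```

A `False` result for a host that works interactively usually points to missing key-based authentication, which `ssh-copy-id` addresses.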
Contributions are welcome. Please submit issues and pull requests through the project repository.
Apache 2.0 - See LICENSE file for details.