BIOMERO Worker Container
The BIOMERO worker container handles distributed image analysis processing through Slurm integration and serves as the dedicated OMERO.grid Processor node.
Overview
Based on the openmicroscopy/omero-server image, this container is configured as a specialized OMERO worker node that exclusively handles script execution via the Processor-0 role in OMERO.grid.
Grid Role Assignment:
CONFIG_omero_server_nodedescriptors: >-
master:Blitz-0
omeroworker-1:Tables-0,Indexer-0,PixelData-0,DropBox,MonitorServer,FileServer,Storm
biomeroworker-external:Processor-0
This ensures all OMERO script execution (including BIOMERO scripts) is routed to this container, which has the specialized environment needed for HPC cluster integration.
Key Features
- HPC Integration
SSH Access: Direct SSH connectivity to Slurm clusters
Single Account Model: One SSH key, one Slurm account per deployment
Secure Mounting: SSH keys mounted securely via startup scripts, not prepackaged
- Workflow Processing
OMERO Script Execution: All script processing via Processor-0 role
Event Sourcing: Complete workflow tracking in PostgreSQL database
Data Export: ZARR format export for HPC workflows
- Analysis Pipeline
Format Conversion: OMERO → ZARR → TIFF workflow
Multi-format Support: Handles diverse input formats via bioformats2raw
Workflow Management: Configurable analysis pipelines
Container Customizations
SSH Integration
SSH Client Installation:
RUN yum install -y openssh-clients
COPY biomeroworker/10-mount-ssh.sh /startup/10-mount-ssh.sh
SSH Key Mounting (10-mount-ssh.sh):
Copies SSH keys from /tmp/.ssh to /opt/omero/server/.ssh
Sets proper permissions (700 for the directory, 600 for private keys)
Enables secure HPC cluster access without baking secrets into the container image (a sketch of this logic follows below)
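The exact contents ship with the repository; a minimal sketch of the copy-and-chmod logic described above (the omero-server user/group ownership is an assumption) could look like:
#!/bin/bash
# Sketch only: copy mounted keys into the server user's home and lock down permissions
mkdir -p /opt/omero/server/.ssh
cp -r /tmp/.ssh/. /opt/omero/server/.ssh/
chown -R omero-server:omero-server /opt/omero/server/.ssh   # assumed user/group of the base image
chmod 700 /opt/omero/server/.ssh
chmod 600 /opt/omero/server/.ssh/id_*                       # private keys (may also match .pub files)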
Usage:
# Mount SSH keys in docker-compose
volumes:
- "$HOME/.ssh:/tmp/.ssh:ro"
Database Integration
PostgreSQL Support:
RUN yum install -y python3-devel postgresql-devel gcc
Purpose:
Event Sourcing: Complete workflow execution tracking
Analytics: Detailed workflow performance data
Audit Trail: Full history of analysis jobs and statuses
SLURM Job Accounting: Tracks resource usage per job and per OMERO user
BIOMERO 2.0 Feature: Near real-time event logging provides a single source of truth for all workflow events.
Data Export Pipeline
bioformats2raw Installation:
RUN wget https://github.com/glencoesoftware/bioformats2raw/releases/download/v0.7.0/bioformats2raw-0.7.0.zip
ZARR Export Support:
RUN yum install -y blosc-devel
# ...
RUN $VIRTUAL_ENV/bin/python -m pip install omero-cli-zarr==0.5.5
Export Workflow:
OMERO Data → Export via omero-cli-zarr
ZARR Format → Universal intermediate format
TIFF Conversion → On HPC cluster for analysis tools
Results Import → Back to OMERO as new images/annotations
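As an illustration, the OMERO-side export step can be run manually with the omero-cli-zarr plugin (the image ID and output path below are placeholders):
# Export a single OMERO image to OME-ZARR (illustrative ID and path)
omero zarr export Image:123 --output /tmp/zarr-export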
BIOMERO Library Integration
Core BIOMERO Installation:
RUN $VIRTUAL_ENV/bin/python -m pip install biomero==${BIOMERO_VERSION}
Supporting Libraries:
These support BIOMERO scripts, which can require extra dependencies beyond the BIOMERO Python library itself.
RUN $VIRTUAL_ENV/bin/python -m pip install \
ezomero==1.1.1 \
tifffile==2020.9.3 \
omero-metadata==0.12.0
ZeroC Ice Pre-built Wheel:
RUN wget https://github.com/glencoesoftware/zeroc-ice-py-linux-x86_64/releases/download/20240202/zeroc_ice-3.6.5-cp39-cp39-manylinux_2_28_x86_64.whl
Custom Processor Implementation
Modified processor.py:
COPY biomeroworker/processor.py /opt/omero/server/venv3/lib/python3.9/site-packages/omero/
Warning
Maintenance Alert: This file overrides the base OMERO processor.py and may conflict with future OMERO updates.
Key Changes:
Environment variable forwarding to subprocesses (HTTP_PROXY, etc.)
Enhanced subprocess handling for BIOMERO workflows
Maintenance Required: Periodically merge important changes from upstream OMERO processor.py to maintain compatibility.
Original Source: ome/omero-py processor.py
Configuration Management
Slurm Configuration
Base Configuration (slurm-config.ini):
COPY biomeroworker/slurm-config.ini /etc/slurm-config.ini
This file contains:
SSH Settings:
[SSH]
host=localslurm # SSH alias for cluster connection
Slurm Paths:
[SLURM]
slurm_data_path=/data/my-scratch/data
slurm_images_path=/data/my-scratch/singularity_images/workflows
slurm_script_path=/data/my-scratch/slurm-scripts
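These paths can be sanity-checked over the same SSH alias, for example:
# Verify the configured Slurm paths exist on the cluster (alias and paths from the example above)
ssh localslurm 'ls -ld /data/my-scratch/data /data/my-scratch/singularity_images/workflows /data/my-scratch/slurm-scripts'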
Workflow Models:
Cellpose segmentation
StarDist segmentation
CellProfiler measurements
Custom analysis workflows
Configuration Override:
The web interface can mount a different configuration that overrides this base file:
# In docker-compose
volumes:
- "./slurm-config-override.ini:/etc/slurm-config.ini:ro"
Note
Configuration Hierarchy:
Base file (in container): Default workflows and settings
Override file (mounted): Admin customizations via web interface
Limitation: Override can modify/add but cannot delete base configurations
Analytics Configuration
BIOMERO 2.0 Analytics (from slurm-config.ini):
[ANALYTICS]
track_workflows=True
enable_job_accounting=True
enable_job_progress=True
enable_workflow_analytics=True
Database Connection:
# Uses environment variable SQLALCHEMY_URL or container's PostgreSQL connection
sqlalchemy_url=postgresql+psycopg2://user:password@localhost:5432/biomero
Worker Startup Process
Configuration Generation
The startup script dynamically generates OMERO configuration:
Internal Worker (99-run.sh):
# For workers in same Docker network
MASTER_ADDR=$(getent hosts $CONFIG_omero_master_host | cut -d\ -f1)
WORKER_ADDR=$(getent hosts $OMERO_WORKER_NAME | cut -d\ -f1)
Worker Configuration:
cat > OMERO.server/etc/$OMERO_WORKER_NAME.cfg << EOF
IceGrid.Node.Endpoints=tcp -h $WORKER_ADDR -p $WORKER_PORT
IceGrid.Node.Name=$OMERO_WORKER_NAME
IceGrid.Node.Data=var/$OMERO_WORKER_NAME
Ice.StdOut=var/log/$OMERO_WORKER_NAME.out
EOF
ICE Configuration:
sed -e "s/@omero.master.host@/$MASTER_ADDR/" \
OMERO.server/etc/templates/ice.config > \
OMERO.server/etc/ice.config
Development Guidelines
BIOMERO Script Development
Script Location: BIOMERO scripts are installed on the OMERO server container, not the worker:
Scripts live in /opt/omero/server/OMERO.server/lib/scripts/biomero/
The worker executes these scripts via the OMERO.grid Processor-0 role
Script changes require an OMERO server container rebuild for a release, but during development you can simply upload them through the web interface for on-the-fly testing (a command-line alternative is sketched below)!
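For example, a script can also be (re)uploaded from the command line instead of the web interface (the path is a placeholder; --official registers it as a trusted server-side script):
# Upload a BIOMERO script for on-the-fly testing (illustrative path)
omero script upload ./my_biomero_script.py --official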
Workflow Development:
Create workflow: In a separate repository (e.g., W_NucleiSegmentation-Cellpose)
Add to config: Update slurm-config.ini with the new workflow (see the illustrative entry below)
Test locally: Use the development environment
Deploy: Release NL-BIOMERO with the new workflow supported via the config, or set it via the admin override in a live environment
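As a rough illustration of step 2, a new workflow entry can be appended to the mounted override file; the key names below are hypothetical, so check the shipped slurm-config.ini for the authoritative naming pattern:
# Hypothetical workflow registration via the override config (verify key names against the base slurm-config.ini)
cat >> slurm-config-override.ini << 'EOF'
[MODELS]
myworkflow=myworkflow
myworkflow_repo=https://github.com/<org>/<workflow-repo>/tree/<version>
myworkflow_job=jobs/myworkflow.sh
EOF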
SSH Key Management
Development Setup:
# Generate SSH key for HPC access
ssh-keygen -t rsa -f ~/.ssh/hpc_key
# Add public key to HPC cluster
# Mount in development docker-compose
Production Deployment:
Single SSH Key: One key per deployment
Single Slurm Account: One account per deployment
Security: Keys should be rotated regularly
Access Control: Limit SSH key to specific HPC resources
Configuration Testing
Test Slurm Configuration:
# Access worker container
docker-compose exec biomeroworker bash
# Test SSH connection
ssh localslurm # SSH alias for the HPC cluster, as set in slurm-config.ini
# Test BIOMERO configuration
python -c "from biomero.slurm_client import SlurmClient; client = SlurmClient.from_config(); print(client.validate())"
Test Analytics Database:
Check database connection and initialize analytics (Option 1: direct configuration)
from biomero import SlurmClient
slurmClient = SlurmClient(track_workflows=True,
                          enable_job_accounting=False,
                          enable_job_progress=True,
                          enable_workflow_analytics=False)
slurmClient.initialize_analytics_system(True)
print('Analytics system initialized')
Option 2: From config file
from biomero import SlurmClient
slurmClient = SlurmClient.from_config()
slurmClient.workflowTracker.notification_log.section_size = 100
Inspect workflow notifications
from pprint import pprint
notifications = slurmClient.workflowTracker.notification_log.select(54, 10)
if notifications:
    print(f'Found {len(notifications)} workflow notifications')
    [pprint(i.__dict__) for i in notifications]
else:
    print('No workflow notifications found')
Eventsourcing
from biomero import WorkflowTracker
# Process events from the start (use any leader name if desired)
slurmClient.workflowTracker.pull_and_process(
    leader_name=WorkflowTracker.__name__,
    start=1
)
NotificationLog
# Read the first page of notifications
slurmClient.workflowTracker.notification_log.select(start=1, limit=10)
Aggregate view
# Load an aggregate by its UUID
slurmClient.workflowTracker.repository.get('747fc951-15ca-4b56-a19e-418e1db97d14')
Troubleshooting
Common Issues
SSH Connection Failures:
Check SSH key permissions (600 for private keys)
Verify SSH key is added to HPC cluster
Test SSH connection manually from container
Processor Role Issues:
Verify grid role assignment in docker-compose
Check OMERO.grid node status:
omero admin diagnostics
Ensure only one Processor-0 node is active
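For example, diagnostics can be run from inside the master container (the omeroserver service name is an assumption; adjust to your compose file):
# Check OMERO.grid node status from the master container (service name assumed)
docker-compose exec omeroserver /opt/omero/server/OMERO.server/bin/omero admin diagnostics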
BIOMERO Script Failures:
Check script installation on OMERO server container
Verify BIOMERO library version compatibility
Review workflow configuration in slurm-config.ini
Database Connection Issues:
Verify PostgreSQL connection settings
Check SQLALCHEMY_URL environment variable
Ensure database schema is initialized
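A quick connectivity check, assuming SQLALCHEMY_URL is set in the worker environment and SQLAlchemy is available in the venv (it is a BIOMERO dependency):
# Open a database connection using the configured URL (sketch; venv path from this container)
docker-compose exec biomeroworker /opt/omero/server/venv3/bin/python -c "import os, sqlalchemy; sqlalchemy.create_engine(os.environ['SQLALCHEMY_URL']).connect(); print('database reachable')"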
Performance Optimization
Resource Allocation:
CPU: Processor-intensive role benefits from multiple cores
Memory: OME-ZARR export requires sufficient memory for large datasets
Storage: Temporary data storage for ZARR exports and Slurm imports
Network Optimization:
External Workers: Consider network latency to master
HPC Access: Optimize SSH connection pooling
Data Transfer: Monitor ZARR export/import performance
Upgrade Considerations
BIOMERO Library Updates
Version Management:
ARG BIOMERO_VERSION
RUN pip install biomero==${BIOMERO_VERSION}
Upgrade Process:
Test new BIOMERO version in development
Update Dockerfile with new version
Rebuild container with updated dependencies
Validate workflows in staging environment
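For example, the rebuild in step 3 might use the BIOMERO_VERSION build argument from the Dockerfile (version number and service name are placeholders):
# Rebuild the worker with a pinned BIOMERO version (illustrative values)
docker-compose build --build-arg BIOMERO_VERSION=2.0.0 biomeroworker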
Processor.py Maintenance
Warning
Critical Maintenance Task: The custom processor.py requires periodic review and merging with upstream changes.
Maintenance Process:
Monitor OMERO processor.py updates
Review changes for compatibility and security fixes
Merge important updates while preserving custom environment variable handling
Test thoroughly before deploying to production
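One way to review upstream drift is to diff the bundled file against the ome/omero-py source (branch and local path may need adjusting):
# Compare the customized processor.py against upstream ome/omero-py (adjust branch/path as needed)
curl -sL https://raw.githubusercontent.com/ome/omero-py/master/src/omero/processor.py -o /tmp/upstream_processor.py
diff -u /tmp/upstream_processor.py biomeroworker/processor.py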
Current Custom Features:
HTTP_PROXY and HTTPS_PROXY forwarding to subprocesses
Enhanced environment variable support for BIOMERO workflows