BIOMERO Worker Container
========================

The BIOMERO worker container handles distributed image analysis via Slurm integration and serves as the dedicated OMERO.grid Processor node.

Overview
--------

Based on the ``openmicroscopy/omero-server`` image, this container is configured as a specialized OMERO worker node that exclusively handles script execution via the **Processor-0** role in OMERO.grid.

**Grid Role Assignment**:

.. code-block:: yaml

   CONFIG_omero_server_nodedescriptors: >-
     master:Blitz-0
     omeroworker-1:Tables-0,Indexer-0,PixelData-0,DropBox,MonitorServer,FileServer,Storm
     biomeroworker-external:Processor-0

This ensures that all OMERO script execution (**including BIOMERO scripts**) is routed to this container, which has the specialized environment needed for HPC cluster integration.

Key Features
------------

**HPC Integration**

* **SSH Access**: Direct SSH connectivity to Slurm clusters
* **Single Account Model**: One SSH key and one Slurm account per deployment
* **Secure Mounting**: SSH keys mounted securely via startup scripts, not prepackaged

**Workflow Processing**

* **OMERO Script Execution**: All script processing via the Processor-0 role
* **Event Sourcing**: Complete workflow tracking in a PostgreSQL database
* **Data Export**: ZARR format export for HPC workflows

**Analysis Pipeline**

* **Format Conversion**: OMERO → ZARR → TIFF workflow
* **Multi-format Support**: Handles diverse input formats via bioformats2raw
* **Workflow Management**: Configurable analysis pipelines

Container Customizations
------------------------

SSH Integration
~~~~~~~~~~~~~~~

**SSH Client Installation**:

.. code-block:: dockerfile

   RUN yum install -y openssh-clients
   COPY biomeroworker/10-mount-ssh.sh /startup/10-mount-ssh.sh

**SSH Key Mounting** (``10-mount-ssh.sh``):

* Copies SSH keys from ``/tmp/.ssh`` to ``/opt/omero/server/.ssh``
* Sets proper permissions (700 for the directory, 600 for private keys)
* Enables secure HPC cluster access without baking secrets into the container image

**Usage**:

.. code-block:: yaml

   # Mount SSH keys in docker-compose
   volumes:
     - "$HOME/.ssh:/tmp/.ssh:ro"

Database Integration
~~~~~~~~~~~~~~~~~~~~

**PostgreSQL Support**:

.. code-block:: dockerfile

   RUN yum install -y python3-devel postgresql-devel gcc

**Purpose**:

* **Event Sourcing**: Complete workflow execution tracking
* **Analytics**: Detailed workflow performance data
* **Audit Trail**: Full history of analysis jobs and statuses
* **Slurm Job Accounting**: Tracks resource usage per job and per OMERO user

**BIOMERO 2.0 Feature**: Near real-time event logging provides a single source of truth for all workflow events.

Data Export Pipeline
~~~~~~~~~~~~~~~~~~~~

**bioformats2raw Installation**:

.. code-block:: dockerfile

   RUN wget https://github.com/glencoesoftware/bioformats2raw/releases/download/v0.7.0/bioformats2raw-0.7.0.zip

**ZARR Export Support**:

.. code-block:: dockerfile

   RUN yum install -y blosc-devel
   # ...
   RUN $VIRTUAL_ENV/bin/python -m pip install omero-cli-zarr==0.5.5

**Export Workflow**:

1. **OMERO Data** → Export via ``omero-cli-zarr``
2. **ZARR Format** → Universal intermediate format
3. **TIFF Conversion** → On the HPC cluster, for analysis tools
4. **Results Import** → Back into OMERO as new images/annotations

BIOMERO Library Integration
~~~~~~~~~~~~~~~~~~~~~~~~~~~

**Core BIOMERO Installation**:

.. code-block:: dockerfile

   RUN $VIRTUAL_ENV/bin/python -m pip install biomero==${BIOMERO_VERSION}

**Supporting Libraries**: Required by BIOMERO scripts, which can have extra dependencies beyond the BIOMERO Python library itself.

.. code-block:: dockerfile

   RUN $VIRTUAL_ENV/bin/python -m pip install \
       ezomero==1.1.1 \
       tifffile==2020.9.3 \
       omero-metadata==0.12.0
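After a rebuild or a version bump it can be useful to confirm that these pins actually end up in the worker's virtual environment. A minimal sketch using only the standard library; run it with the venv's Python interpreter (e.g. ``/opt/omero/server/venv3/bin/python``), and note that the package list simply mirrors the Dockerfile pins above:

.. code-block:: python

   # Minimal sketch: report the installed version of each library pinned in the
   # Dockerfile above, or flag it as missing. Run with the worker venv's Python.
   from importlib.metadata import PackageNotFoundError, version

   for pkg in ("biomero", "ezomero", "tifffile", "omero-metadata"):
       try:
           print(f"{pkg}: {version(pkg)}")
       except PackageNotFoundError:
           print(f"{pkg}: NOT INSTALLED")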
**ZeroC Ice Pre-built Wheels**:

.. code-block:: dockerfile

   RUN wget https://github.com/glencoesoftware/zeroc-ice-py-linux-x86_64/releases/download/20240202/zeroc_ice-3.6.5-cp39-cp39-manylinux_2_28_x86_64.whl

Custom Processor Implementation
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**Modified processor.py**:

.. code-block:: dockerfile

   COPY biomeroworker/processor.py /opt/omero/server/venv3/lib/python3.9/site-packages/omero/

.. warning::

   **Maintenance Alert**: This file overrides the base OMERO ``processor.py`` and may conflict with future OMERO updates.

**Key Changes**:

* Environment variable forwarding to subprocesses (HTTP_PROXY, etc.)
* Enhanced subprocess handling for BIOMERO workflows

**Maintenance Required**: Periodically merge important changes from the upstream OMERO ``processor.py`` to maintain compatibility.

**Original Source**: `ome/omero-py processor.py`_

Configuration Management
------------------------

Slurm Configuration
~~~~~~~~~~~~~~~~~~~

**Base Configuration** (``slurm-config.ini``):

.. code-block:: dockerfile

   COPY biomeroworker/slurm-config.ini /etc/slurm-config.ini

This file contains:

**SSH Settings**:

.. code-block:: ini

   [SSH]
   # SSH alias for the cluster connection
   host=localslurm

**Slurm Paths**:

.. code-block:: ini

   [SLURM]
   slurm_data_path=/data/my-scratch/data
   slurm_images_path=/data/my-scratch/singularity_images/workflows
   slurm_script_path=/data/my-scratch/slurm-scripts

**Workflow Models**:

* Cellpose segmentation
* StarDist segmentation
* CellProfiler measurements
* Custom analysis workflows

**Configuration Override**: The web interface can mount a different configuration that overrides this base file:

.. code-block:: yaml

   # In docker-compose
   volumes:
     - "./slurm-config-override.ini:/etc/slurm-config.ini:ro"

.. note::

   **Configuration Hierarchy**:

   1. **Base file** (in container): Default workflows and settings
   2. **Override file** (mounted): Admin customizations via the web interface
   3. **Limitation**: The override can modify or add entries, but cannot delete base configurations

Analytics Configuration
~~~~~~~~~~~~~~~~~~~~~~~

**BIOMERO 2.0 Analytics** (from ``slurm-config.ini``):

.. code-block:: ini

   [ANALYTICS]
   track_workflows=True
   enable_job_accounting=True
   enable_job_progress=True
   enable_workflow_analytics=True

**Database Connection**:

.. code-block:: ini

   # Uses the SQLALCHEMY_URL environment variable or the container's PostgreSQL connection
   sqlalchemy_url=postgresql+psycopg2://user:password@localhost:5432/biomero
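Before relying on the analytics features, it helps to check that the worker can actually reach this database. A minimal sketch, assuming SQLAlchemy is importable in the worker's virtual environment; the fallback URL below is a placeholder, so replace it with your real connection string or set ``SQLALCHEMY_URL``:

.. code-block:: python

   # Minimal sketch: verify the analytics database is reachable with the same
   # SQLAlchemy URL the worker will use (the fallback value is a placeholder).
   import os
   from sqlalchemy import create_engine, text

   url = os.environ.get(
       "SQLALCHEMY_URL",
       "postgresql+psycopg2://user:password@localhost:5432/biomero",  # placeholder
   )
   engine = create_engine(url)
   with engine.connect() as conn:
       # SELECT 1 returning 1 means the connection and credentials work
       print(conn.execute(text("SELECT 1")).scalar())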
Worker Startup Process
----------------------

Configuration Generation
~~~~~~~~~~~~~~~~~~~~~~~~

The startup script dynamically generates the OMERO configuration:

**Internal Worker** (``99-run.sh``):

.. code-block:: bash

   # For workers in the same Docker network
   MASTER_ADDR=$(getent hosts $CONFIG_omero_master_host | cut -d\  -f1)
   WORKER_ADDR=$(getent hosts $OMERO_WORKER_NAME | cut -d\  -f1)

**Worker Configuration**:

.. code-block:: bash

   cat > OMERO.server/etc/$OMERO_WORKER_NAME.cfg << EOF
   IceGrid.Node.Endpoints=tcp -h $WORKER_ADDR -p $WORKER_PORT
   IceGrid.Node.Name=$OMERO_WORKER_NAME
   IceGrid.Node.Data=var/$OMERO_WORKER_NAME
   Ice.StdOut=var/log/$OMERO_WORKER_NAME.out
   EOF

**ICE Configuration**:

.. code-block:: bash

   sed -e "s/@omero.master.host@/$MASTER_ADDR/" \
       OMERO.server/etc/templates/ice.config > \
       OMERO.server/etc/ice.config

Development Guidelines
----------------------

BIOMERO Script Development
~~~~~~~~~~~~~~~~~~~~~~~~~~

**Script Location**: BIOMERO scripts are installed on the **OMERO server container**, not on the worker:

* Scripts live in: ``/opt/omero/server/OMERO.server/lib/scripts/biomero/``
* The worker executes scripts via the OMERO.grid Processor-0 role
* Script changes require an OMERO server container rebuild for a release, but during development you can simply upload them through the web interface for on-the-fly testing

**Workflow Development**:

1. **Create workflow**: In a separate repository (e.g., `W_NucleiSegmentation-Cellpose`_)
2. **Add to config**: Update ``slurm-config.ini`` with the new workflow
3. **Test locally**: Use the development environment
4. **Deploy**: Release NL-BIOMERO with the new workflow enabled in the config, or set it via the admin interface in a live environment

SSH Key Management
~~~~~~~~~~~~~~~~~~

**Development Setup**:

.. code-block:: bash

   # Generate an SSH key for HPC access
   ssh-keygen -t rsa -f ~/.ssh/hpc_key

   # Add the public key to the HPC cluster
   # Mount the key in the development docker-compose

**Production Deployment**:

* **Single SSH Key**: One key per deployment
* **Single Slurm Account**: One account per deployment
* **Security**: Keys should be rotated regularly
* **Access Control**: Limit the SSH key to specific HPC resources

Configuration Testing
~~~~~~~~~~~~~~~~~~~~~

**Test Slurm Configuration**:

.. code-block:: bash

   # Access the worker container
   docker-compose exec biomeroworker bash

   # Test the SSH connection (alias for the HPC cluster from slurm-config.ini)
   ssh localslurm

   # Test the BIOMERO configuration
   python -c "from biomero.slurm_client import SlurmClient; client = SlurmClient.from_config(); print(client.validate())"

**Test Analytics Database**: Check the database connection and initialize the analytics system.

Option 1: Direct configuration

.. code-block:: python

   from biomero import SlurmClient

   slurmClient = SlurmClient(track_workflows=True,
                             enable_job_accounting=False,
                             enable_job_progress=True,
                             enable_workflow_analytics=False)
   slurmClient.initialize_analytics_system(True)
   print('Analytics system initialized')

Option 2: From the config file

.. code-block:: python

   from biomero import SlurmClient

   slurmClient = SlurmClient.from_config()
   slurmClient.workflowTracker.notification_log.section_size = 100

Inspect workflow notifications:

.. code-block:: python

   from pprint import pprint

   # Arbitrary example offset: read up to 10 notifications starting at position 54
   notifications = slurmClient.workflowTracker.notification_log.select(54, 10)
   if notifications:
       print(f'Found {len(notifications)} workflow notifications')
       for n in notifications:
           pprint(n.__dict__)
   else:
       print('No workflow notifications found')

Eventsourcing
~~~~~~~~~~~~~

.. code-block:: python

   from biomero import WorkflowTracker

   # Process events from the start (use any leader name if desired)
   slurmClient.workflowTracker.pull_and_process(
       leader_name=WorkflowTracker.__name__, start=1
   )

**NotificationLog**:

.. code-block:: python

   # Read the first page of notifications
   slurmClient.workflowTracker.notification_log.select(start=1, limit=10)

**Aggregate view**:

.. code-block:: python

   # Load an aggregate by its UUID
   slurmClient.workflowTracker.repository.get('747fc951-15ca-4b56-a19e-418e1db97d14')
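Putting these pieces together, the sketch below pages through recent notifications and loads the aggregate behind each one. It is a minimal example that assumes a working ``SlurmClient`` configuration (including the SSH setup above) and at least one recorded workflow; the attribute names (``id``, ``topic``, ``originator_id``) come from the underlying ``eventsourcing`` notification objects.

.. code-block:: python

   # Minimal sketch (assumes a configured SlurmClient and at least one recorded
   # workflow): list recent notifications and load the aggregate behind each one.
   from biomero import SlurmClient

   slurmClient = SlurmClient.from_config()
   tracker = slurmClient.workflowTracker

   notifications = tracker.notification_log.select(start=1, limit=10)
   for n in notifications:
       print(n.id, n.topic)                         # event position and event type
       aggregate = tracker.repository.get(n.originator_id)
       print(type(aggregate).__name__, aggregate.id)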
Troubleshooting
---------------

Common Issues
~~~~~~~~~~~~~

**SSH Connection Failures**:

* Check SSH key permissions (600 for private keys)
* Verify the SSH key is added to the HPC cluster
* Test the SSH connection manually from the container

**Processor Role Issues**:

* Verify the grid role assignment in docker-compose
* Check the OMERO.grid node status: ``omero admin diagnostics``
* Ensure only one Processor-0 node is active

**BIOMERO Script Failures**:

* Check script installation on the OMERO server container
* Verify BIOMERO library version compatibility
* Review the workflow configuration in ``slurm-config.ini``

**Database Connection Issues**:

* Verify PostgreSQL connection settings
* Check the SQLALCHEMY_URL environment variable
* Ensure the database schema is initialized

Performance Optimization
~~~~~~~~~~~~~~~~~~~~~~~~

**Resource Allocation**:

* **CPU**: The processor-intensive role benefits from multiple cores
* **Memory**: OME-ZARR export requires sufficient memory for large datasets
* **Storage**: Temporary data storage for ZARR exports and Slurm imports

**Network Optimization**:

* **External Workers**: Consider network latency to the master
* **HPC Access**: Optimize SSH connection pooling
* **Data Transfer**: Monitor ZARR export/import performance

Upgrade Considerations
----------------------

BIOMERO Library Updates
~~~~~~~~~~~~~~~~~~~~~~~

**Version Management**:

.. code-block:: dockerfile

   ARG BIOMERO_VERSION
   RUN pip install biomero==${BIOMERO_VERSION}

**Upgrade Process**:

1. **Test the new BIOMERO version** in development
2. **Update the Dockerfile** with the new version
3. **Rebuild the container** with updated dependencies
4. **Validate workflows** in a staging environment

Processor.py Maintenance
~~~~~~~~~~~~~~~~~~~~~~~~

.. warning::

   **Critical Maintenance Task**: The custom ``processor.py`` requires periodic review and merging with upstream changes.

**Maintenance Process**:

1. **Monitor** `OMERO processor.py updates`_
2. **Review changes** for compatibility and security fixes
3. **Merge important updates** while preserving the custom environment variable handling
4. **Test thoroughly** before deploying to production

**Current Custom Features**:

* HTTP_PROXY and HTTPS_PROXY forwarding to subprocesses
* Enhanced environment variable support for BIOMERO workflows

Related Documentation
---------------------

* :doc:`omeroserver` - Server container and script installation
* :doc:`../architecture` - Overall system architecture
* :doc:`releases` - Container release process
* `BIOMERO Documentation`_
* `OMERO.grid Documentation`_
* `Slurm Documentation`_

External Resources
------------------

* `BIOMERO Scripts Repository`_
* `Example Workflow: Cellpose`_
* `bioformats2raw Documentation`_
* `OMERO CLI ZARR Plugin`_