
Server Architecture

The xorl training server is built as a layered system of independent processes that communicate over ZMQ. This page covers how those layers fit together, what each component does, and the full API surface.


The server is composed of three processes launched by a single Launcher:

[Diagram: xorl Server process architecture. The Launcher spawns three processes: the API Server (FastAPI, HTTP :5555; ZMQ ROUTER :6000 send, PULL :6001 receive), the Orchestrator (request queue + scheduler; ZMQ DEALER :6000 receive, PUSH :6001 send), and the Workers (torchrun, ranks 0..N on GPU; ZMQ PAIR to rank 0 on :5556, NCCL broadcast/scatter between ranks). Abstraction layers, top to bottom: HTTP API (client-facing, xorl-client SDK) → APIServer (auth, routing, async futures) → Orchestrator (scheduling, packing, dispatch) → RunnerDispatcher (rank 0 proxy, broadcast/scatter) → ModelRunner (fwd/bwd/optim; FSDP · EP · PP).]

src/xorl/server/launcher.py

The Launcher is the single entrypoint. It orchestrates startup and shutdown of all sub-processes.

Two modes:

Mode     What it does
auto     Spawns GPU workers via torchrun, then starts the Orchestrator and API Server as subprocesses
connect  Attaches to already-running workers (multi-node: the head node connects to workers started manually)

Startup sequence:

1. _launch_workers_with_torchrun() → torchrun subprocess (all GPU ranks)
2. _get_rank0_worker_address() → discover rank 0 ZMQ address (file-based for multi-node)
3. multiprocessing.Process(run_orchestrator) → Orchestrator
4. _save_initial_checkpoint() → checkpoint "000000" before any training
5. multiprocessing.Process(run_api_server) → API Server
6. wait() → poll processes until one dies → stop()
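The final `wait()` step can be sketched as a simple poll-until-death loop. This is an illustrative stand-in, not the real implementation; the function name and the use of threads in place of the three subprocesses are assumptions for the sake of a self-contained example.

```python
import threading
import time

def wait_until_any_exits(children, poll_interval: float = 0.05):
    """Poll child processes until one of them dies, mirroring step 6 of the
    startup sequence (wait() -> stop()). Accepts anything exposing
    is_alive(), so threads can stand in for the real subprocesses here."""
    while True:
        for child in children:
            if not child.is_alive():
                return child
        time.sleep(poll_interval)

# Threads standing in for the API Server, Orchestrator, and torchrun workers.
children = [threading.Thread(target=time.sleep, args=(t,)) for t in (0.05, 0.5, 0.5)]
for c in children:
    c.start()

first_dead = wait_until_any_exits(children)  # returns once the short-lived child exits
```

Once any child exits, the real Launcher tears down the remaining processes via `stop()`.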

Key config:

Launcher(
    mode="auto",
    config_path="server_config.yaml",
    api_host="0.0.0.0",
    api_port=5555,
    max_running_requests=2,    # concurrent in-flight requests
    max_pending_requests=100,  # request queue depth
    operation_timeout=1800.0,  # seconds per operation
)

src/xorl/server/api_server/server.py

APIServer is a FastAPI application composed via mixins:

APIServer
├── TrainingOpsMixin forward_backward, forward, optim_step
├── WeightsMixin save/load checkpoints, list, delete, weights_info
├── InferenceEndpointsMixin add/remove/list inference endpoints, sync weights
└── HealthMixin /health, /healthz, sleep, wake
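A minimal sketch of that mixin composition. The `_submit` helper and the echoed return value are hypothetical; in the real server the training-ops methods enqueue an OrchestratorRequest over ZMQ and return a future.

```python
# Each mixin contributes one slice of the API surface; APIServer composes them.
class TrainingOpsMixin:
    def forward_backward(self, batch):
        # Delegates to a shared submit helper provided by the composed class.
        return self._submit("forward_backward", batch)

class HealthMixin:
    def health(self):
        return {"status": "ok"}

class APIServer(TrainingOpsMixin, HealthMixin):
    def _submit(self, operation, payload):
        # Hypothetical stand-in: the real server sends the request to the
        # Orchestrator and returns a future; here we just echo it.
        return {"operation": operation, "payload": payload}

server = APIServer()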

Two-phase async pattern:

Every training operation uses a non-blocking two-phase protocol to avoid HTTP timeout issues on long operations:

Phase 1: POST /api/v1/forward_backward
  → APIServer sends OrchestratorRequest via ZMQ ROUTER
  → Returns UntypedAPIFuture { request_id: "uuid-..." } immediately

Phase 2: POST /api/v1/retrieve_future { request_id: "uuid-..." }
  → Returns TryAgainResponse if still running
  → Returns the result dict when complete
  → Returns an error if the operation failed

The xorl_client SDK handles polling automatically — callers just call .result() on the returned future object.
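The polling loop inside `.result()` can be sketched as follows. The transport is injected as a callable and `TryAgainResponse` is modeled as a plain status dict; both are assumptions for illustration, not the SDK's actual types.

```python
import time
from typing import Any, Callable, Dict

TRY_AGAIN = {"status": "try_again"}  # stand-in for the real TryAgainResponse

def poll_future(post: Callable[[str, Dict[str, Any]], Dict[str, Any]],
                request_id: str,
                interval: float = 0.01,
                timeout: float = 5.0) -> Dict[str, Any]:
    """Sketch of what .result() does under the hood: POST
    /api/v1/retrieve_future until the server stops answering try-again."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        resp = post("/api/v1/retrieve_future", {"request_id": request_id})
        if resp != TRY_AGAIN:
            return resp
        time.sleep(interval)
    raise TimeoutError(f"future {request_id} did not complete in {timeout}s")

# Fake transport: pretend the operation finishes on the third poll.
calls = {"n": 0}
def fake_post(path: str, body: Dict[str, Any]) -> Dict[str, Any]:
    calls["n"] += 1
    return TRY_AGAIN if calls["n"] < 3 else {"loss": 0.42}

result = poll_future(fake_post, "uuid-1234")
```

Decoupling the poller from the HTTP client keeps the retry policy testable without a running server.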

ZMQ topology:

API Server                     Orchestrator
ROUTER bind :6000  ──────►  DEALER connect :6000   (API → Engine)
PULL   bind :6001  ◄──────  PUSH   connect :6001   (Engine → API)

Messages are serialized with msgpack for efficiency.
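A minimal sketch of that wire framing, assuming the protocol dataclasses serialize to plain dicts (the field values here are illustrative, and `msgpack` must be installed):

```python
import msgpack  # third-party: pip install msgpack

# Hypothetical dict standing in for a real OrchestratorRequest.
request = {
    "request_id": "uuid-1234",
    "operation": "forward_backward",
    "payload": {"micro_batch_size": 8},
}

wire = msgpack.packb(request, use_bin_type=True)  # bytes sent on the ZMQ socket
decoded = msgpack.unpackb(wire, raw=False)        # dict recovered on receipt
```

msgpack produces a compact binary encoding, which matters when payloads carry large token batches.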


src/xorl/server/orchestrator/orchestrator.py

The Orchestrator runs in its own process as an event loop. It is the bridge between the HTTP API and the GPU workers.

Internal structure:

[Diagram: Orchestrator internal structure. input_thread receives ZMQ messages from the API Server (ROUTER/PULL) and calls Scheduler.add(). The Scheduler is a FIFO queue tracking pending → processing → completed/failed. The event_loop() thread dequeues from the Scheduler into the RequestProcessor (sample packing, batch dispatch), which calls the RemoteBackend (ZMQ PAIR to the workers). output_thread receives results from the worker and pushes them back to the API Server via ZMQ PUSH. Protocol messages: OrchestratorRequest / OrchestratorOutputs (msgpack).]

Key classes:

Class             File                  Role
Orchestrator      orchestrator.py       Main event loop; coordinates all sub-components
Scheduler         scheduler.py          FIFO queue with per-request state (pending → processing → completed/failed)
RequestProcessor  request_processor.py  Packs incoming datums into micro-batches; dispatches to the backend
RemoteBackend     backend/remote.py     ZMQ PAIR socket to the rank 0 worker; sends commands, receives outputs
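The Scheduler's request lifecycle can be sketched as a small FIFO with per-request state. Class and method names here are illustrative, not the real API in scheduler.py.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional

class State(Enum):
    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"

@dataclass
class TrackedRequest:
    request_id: str
    state: State = State.PENDING

class FifoScheduler:
    """Sketch of the pending → processing → completed/failed lifecycle."""

    def __init__(self) -> None:
        self._queue = deque()                        # FIFO of pending requests
        self._by_id: Dict[str, TrackedRequest] = {}  # lookup for state updates

    def add(self, request_id: str) -> None:
        req = TrackedRequest(request_id)
        self._queue.append(req)
        self._by_id[request_id] = req

    def next(self) -> Optional[TrackedRequest]:
        if not self._queue:
            return None
        req = self._queue.popleft()
        req.state = State.PROCESSING
        return req

    def finish(self, request_id: str, ok: bool) -> None:
        self._by_id[request_id].state = State.COMPLETED if ok else State.FAILED
```

Keeping a side index by request_id is what lets retrieve_future report status for requests that have already left the queue.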

src/xorl/server/runner/runner_dispatcher.py

The dispatcher runs on rank 0 inside the torchrun process group. It acts as the ZMQ boundary between the Orchestrator and the actual GPU computation.

Protocol:

Orchestrator (RemoteBackend)
    → ZMQ PAIR ──► RunnerDispatcher (rank 0)
                     │ NCCL broadcast → ranks 1..N (command type)
                     │ NCCL scatter   → ranks 1..N (batch data)
                     ▼ All ranks call ModelRunner.forward_backward()
                     │ NCCL gather    ← ranks 1..N (outputs)
    ← ZMQ PAIR ◄── RunnerDispatcher (rank 0, result)
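The rank-0 control loop above can be sketched with the transport and collectives injected as callables. This is a structural sketch only: the function name is hypothetical, and `broadcast`/`scatter` stand in for the torch.distributed NCCL collectives.

```python
from typing import Any, Callable, Tuple

def dispatcher_loop(recv_cmd: Callable[[], Tuple[str, Any]],
                    broadcast: Callable[[str], None],
                    scatter: Callable[[Any], Any],
                    run: Callable[[str, Any], Any],
                    send_result: Callable[[Any], None]) -> None:
    """Rank-0 loop: take a command off the ZMQ PAIR socket, fan it out to
    the other ranks, execute locally, and reply to the Orchestrator."""
    while True:
        cmd, batch = recv_cmd()   # ZMQ PAIR recv from RemoteBackend
        broadcast(cmd)            # all ranks learn the command type
        shard = scatter(batch)    # each rank gets its micro-batch shard
        result = run(cmd, shard)  # every rank runs ModelRunner
        send_result(result)       # ZMQ PAIR send back to the Orchestrator
        if cmd == "SHUTDOWN":
            break

# In-process stubs standing in for ZMQ and NCCL:
inbox = iter([("FORWARD_BACKWARD", [1, 2, 3, 4]), ("SHUTDOWN", None)])
broadcasts: list = []
results: list = []

dispatcher_loop(
    recv_cmd=lambda: next(inbox),
    broadcast=broadcasts.append,
    scatter=lambda batch: batch[:2] if batch else None,  # rank 0's shard
    run=lambda cmd, shard: {"cmd": cmd, "n": len(shard or [])},
    send_result=results.append,
)
```

Broadcasting the command type before scattering data is what lets ranks 1..N, which never touch ZMQ, know which collective calls to participate in next.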

Supported commands:

Command           Action
FORWARD_BACKWARD  Forward + backward pass, accumulate gradients
OPTIM_STEP        Apply gradients, step the optimizer, advance the LR scheduler
SAVE_STATE        Save a DCP checkpoint
LOAD_STATE        Load a DCP checkpoint
HEALTH_CHECK      Verify all ranks are alive
SHUTDOWN          Graceful exit
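A sketch of how those commands might map onto runner methods. The enum values, the `execute` function, and `RecordingRunner` are illustrative test doubles, not the real dispatch code.

```python
from enum import Enum

class Command(str, Enum):
    # Mirrors the command table above; the string values are assumptions.
    FORWARD_BACKWARD = "forward_backward"
    OPTIM_STEP = "optim_step"
    SAVE_STATE = "save_state"
    LOAD_STATE = "load_state"
    HEALTH_CHECK = "health_check"
    SHUTDOWN = "shutdown"

class RecordingRunner:
    """Test double for ModelRunner: records which method each command hits."""
    def __init__(self):
        self.calls = []
    def forward_backward(self, batch):
        self.calls.append(("forward_backward", batch))
        return {"loss": 0.0}
    def optim_step(self):
        self.calls.append(("optim_step", None))
        return {"step": 1}

def execute(cmd: Command, runner, batch=None):
    """Map a wire command onto the runner method that handles it."""
    if cmd is Command.FORWARD_BACKWARD:
        return runner.forward_backward(batch)
    if cmd is Command.OPTIM_STEP:
        return runner.optim_step()
    if cmd is Command.HEALTH_CHECK:
        return {"alive": True}
    raise NotImplementedError(f"unhandled command: {cmd}")

runner = RecordingRunner()
out = execute(Command.FORWARD_BACKWARD, runner, batch=[1, 2])
```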

src/xorl/server/runner/model_runner.py

ModelRunner handles the actual model forward/backward/optimizer step on each rank. It receives the already-distributed micro-batches and calls into the model using the same FSDP2/EP/PP stack as local training.


Messages between the API Server and Orchestrator use typed dataclasses serialized with msgpack:

src/xorl/server/protocol/api_orchestrator.py

@dataclass
class OrchestratorRequest:
    request_id: str             # UUID for tracking
    request_type: RequestType   # ADD | ABORT | UTILITY
    operation: str              # "forward_backward" | "optim_step" | ...
    payload: OperationPayload   # ModelPassData | OptimStepData | ...
    seq_id: Optional[int]       # Ordering within a session
    timestamp: Optional[float]

@dataclass
class OrchestratorOutputs:
    request_id: str
    output_type: OutputType     # forward | forward_backward | optim_step | error
    outputs: List[Dict[str, Any]]

Messages between the Orchestrator and workers use a separate protocol defined in protocol/orchestrator_runner.py.


File                                               Description
src/xorl/server/launcher.py                        Launcher — process orchestration, startup/shutdown lifecycle
src/xorl/server/api_server/server.py               APIServer — FastAPI app, OrchestratorClient, FutureStore
src/xorl/server/api_server/endpoints.py            All FastAPI endpoint handlers
src/xorl/server/orchestrator/orchestrator.py       Orchestrator — event loop, request queue management
src/xorl/server/orchestrator/scheduler.py          Scheduler — FIFO request ordering
src/xorl/server/orchestrator/request_processor.py  RequestProcessor — sample packing, backend dispatch
src/xorl/server/runner/runner_dispatcher.py        RunnerDispatcher — rank 0 ZMQ↔NCCL bridge
src/xorl/server/runner/model_runner.py             ModelRunner — actual forward/backward/optim on GPU ranks
src/xorl/server/protocol/                          Typed msgpack protocol messages between all layers