EVE Microservice Classification
EVE's pillar package implements a set of microservices that together provide
the complete EVE-OS experience. These microservices can be divided into two
categories based on their role: device management and workload management.
Note: this document covers only the pillar microservices (the processes that
run inside the pillar container). EVE also runs a set of linuxkit system
services outside of pillar — such as newlogd, wwan, vtpm, watchdog,
vector, edgeview, and others — whose classification is outside the scope of
this document.
Why this distinction matters
Remote manageability is the fundamental promise of EVE: a deployed edge device should never require physical access. Maintaining that property requires that the device can always reach its controller, receive configuration updates, roll out new EVE-OS versions, and maintain its cryptographic identity. A bug in a device management microservice that prevents controller communication cannot be fixed remotely, because the fix itself would have to be delivered over the broken channel; recovery requires a truck roll.
Workload management microservices handle running applications on the device — creating domains, network instances, and volumes for user workloads. If a workload management service misbehaves, the device remains remotely manageable: the controller can push a corrected EVE version or a revised workload configuration to recover the situation without physical access.
This distinction has direct implications for:
- Testing priority: Device management microservices warrant more thorough test coverage because their failure mode is catastrophic (loss of remote manageability). Workload management failures, while serious, are remotely recoverable.
- Code review rigor: Changes to device management microservices should be reviewed and tested more carefully than workload management changes.
- EVE image partitioning: EVE's dual IMGA/IMGB rootfs partitions exist precisely
to enable safe rollback of device management code. Since workload runtime failures
are recoverable without physical access, workload runtimes are candidates for
living outside the A/B partitions, in /persist, as explored in the EVE-K design.
Device Management Microservices
These microservices are essential for keeping the device remotely manageable. They handle controller communication, EVE-OS updates, device identity, and hardware security. A sustained failure in any of these may require physical intervention to recover the device.
| Microservice | Role |
|---|---|
| nim | Manages network interfaces; ensures the device can reach the controller |
| zedagent | Retrieves device configuration from the controller; distributes it to other services via pubsub |
| client | Device registration and initial onboarding with the controller |
| baseosmgr | Manages EVE base OS downloads and A/B partition update state machine |
| downloader | Downloads content from datastores (EVE updates and app content) |
| verifier | Verifies cryptographic integrity of downloaded content |
| volumemgr | Manages storage volumes (EVE-OS update volumes and application volumes) |
| nodeagent | Manages node state transitions, reboots, and hardware watchdog |
| tpmmgr | TPM provisioning, vault management, and device certificate lifecycle |
| loguploader | Collects and uploads device and application logs to the controller |
downloader, verifier, and volumemgr are dual-use: they serve both EVE-OS
update content and application content. Their correct operation is therefore
critical for both remote management and workload deployment.
Workload Management Microservices
These microservices create and operate application workloads — network instances, application volumes, and application domains. They are not required for the device to remain remotely manageable.
| Microservice | Role |
|---|---|
| domainmgr | Creates and manages application domains (VMs, containers, unikernels) |
| zedrouter | Creates and manages network instances for application connectivity |
| zedmanager | Orchestrates app instance lifecycle state machines |
| diag | Diagnostics: tests controller reachability and reports device health to console; observability only, not in the controller communication path |
The hypervisor layer (KVM, Xen), the container runtime (containerd for user
applications), and optional runtimes such as the Kubernetes distribution used by
EVE-K are also part of workload management. The EVE-K design takes this further by
placing the Kubernetes runtime and its associated storage provider (Longhorn) in
/persist rather than in the IMGA/IMGB partitions, decoupling their lifecycle from
EVE core updates entirely.
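
The two tables above can be captured as a simple lookup, which is one way tooling (for example a review checklist) might consume this classification. The following Go sketch is illustrative only: the `Category` type, the `Classify` and `NeedsStrictReview` helpers, and the `dualUse` set are hypothetical names introduced here and are not part of the EVE code base. Only the service names and their categories come from this document.

```go
package main

import "fmt"

// Category is the management role a pillar microservice plays.
// The type and its values are hypothetical, introduced for this sketch.
type Category int

const (
	Unknown Category = iota // not a pillar microservice covered by this document
	DeviceManagement
	WorkloadManagement
)

func (c Category) String() string {
	switch c {
	case DeviceManagement:
		return "device management"
	case WorkloadManagement:
		return "workload management"
	default:
		return "unknown"
	}
}

// classification mirrors the two tables in this document.
var classification = map[string]Category{
	"nim":         DeviceManagement,
	"zedagent":    DeviceManagement,
	"client":      DeviceManagement,
	"baseosmgr":   DeviceManagement,
	"downloader":  DeviceManagement,
	"verifier":    DeviceManagement,
	"volumemgr":   DeviceManagement,
	"nodeagent":   DeviceManagement,
	"tpmmgr":      DeviceManagement,
	"loguploader": DeviceManagement,
	"domainmgr":   WorkloadManagement,
	"zedrouter":   WorkloadManagement,
	"zedmanager":  WorkloadManagement,
	"diag":        WorkloadManagement,
}

// dualUse lists the device management services that also serve application content.
var dualUse = map[string]bool{"downloader": true, "verifier": true, "volumemgr": true}

// Classify returns the category of a service, or Unknown for names that are
// not in the tables (for example linuxkit services such as newlogd).
func Classify(name string) Category {
	return classification[name] // a missing key yields the zero value, Unknown
}

// NeedsStrictReview reports whether a change touches any device management service.
func NeedsStrictReview(changed []string) bool {
	for _, name := range changed {
		if Classify(name) == DeviceManagement {
			return true
		}
	}
	return false
}

func main() {
	for _, name := range []string{"nim", "zedrouter", "downloader", "newlogd"} {
		fmt.Printf("%-10s %s (dual-use: %t)\n", name, Classify(name), dualUse[name])
	}
	fmt.Println("strict review:", NeedsStrictReview([]string{"zedrouter", "tpmmgr"}))
}
```

Treating unlisted names as `Unknown` rather than defaulting them to either category keeps the sketch honest about the scope note at the top of this document: services outside pillar are deliberately left unclassified.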
Relationship to image partitioning and rollback
EVE's A/B partition scheme is designed around device management. If a new EVE version contains a bug in a device management microservice, the device falls back to the previous image, which is still intact in the other partition (IMGA or IMGB), restoring remote manageability. Because workload management failures do not prevent the controller from reaching the device, there is no comparable need to include workload runtimes in the A/B partitions.
This is the architectural justification for extracting workload runtimes out of the EVE rootfs: it enables a richer set of optional runtimes (different hypervisors, Kubernetes variants, proprietary GPU stacks) without inflating the device management footprint that must fit in the size-constrained IMGA/IMGB partitions and without compromising the rollback guarantee that protects remote manageability.
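
The rollback reasoning in this section can be illustrated with a deliberately simplified Go sketch. It is not EVE's actual update state machine, and the real fallback criteria may differ. The `Partition` type, the `TrialResult` struct and its fields, and the `NextBoot` function are all hypothetical names introduced here. The only rule it encodes is the one stated above: losing controller reachability justifies falling back to the other partition, while a workload failure does not.

```go
package main

import "fmt"

// Partition names one of the two rootfs slots described in this document.
type Partition string

const (
	IMGA Partition = "IMGA"
	IMGB Partition = "IMGB"
)

// Other returns the opposite slot.
func (p Partition) Other() Partition {
	if p == IMGA {
		return IMGB
	}
	return IMGA
}

// TrialResult is a hypothetical summary of how a newly installed image
// behaved while it was being tried out.
type TrialResult struct {
	ControllerReachable bool // device management worked end to end
	WorkloadsHealthy    bool // applications came up as configured
}

// NextBoot picks the slot to boot next after trying the image in `trial`.
// Only loss of remote manageability triggers a fallback; unhealthy workloads
// are left for the controller to repair remotely.
func NextBoot(trial Partition, r TrialResult) Partition {
	if !r.ControllerReachable {
		return trial.Other()
	}
	return trial
}

func main() {
	// New image in IMGB cannot reach the controller: fall back.
	fmt.Println(NextBoot(IMGB, TrialResult{ControllerReachable: false, WorkloadsHealthy: true}))
	// New image in IMGB reaches the controller but workloads fail: keep it.
	fmt.Println(NextBoot(IMGB, TrialResult{ControllerReachable: true, WorkloadsHealthy: false}))
}
```

Note that `WorkloadsHealthy` is intentionally ignored by `NextBoot`. That is the point of the sketch: workload health does not need to take part in the rollback decision, which is why workload runtimes do not need to live in the A/B partitions.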
Testing implications
See CODE-COVERAGE.md for how this classification informs test coverage priorities. The short summary: any coverage gap in device management code carries higher operational risk (physical intervention required to recover) than an equal gap in workload management code, so device management gaps should be addressed first.
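
As a rough illustration of that ordering rule, the Go sketch below sorts a list of coverage gaps so that device management gaps come first and, within each category, larger gaps come before smaller ones. The `Gap` struct, the `UncoveredLines` metric, and the `Prioritize` function are assumptions made for this example, and the numbers in `main` are invented; CODE-COVERAGE.md remains the reference for how coverage is actually measured and prioritized.

```go
package main

import (
	"fmt"
	"sort"
)

// Gap describes missing test coverage for one microservice.
// UncoveredLines is a hypothetical metric chosen for this sketch.
type Gap struct {
	Service          string
	DeviceManagement bool
	UncoveredLines   int
}

// Prioritize returns a sorted copy of gaps: device management first,
// then by descending size of the gap. The input slice is not modified.
func Prioritize(gaps []Gap) []Gap {
	out := append([]Gap(nil), gaps...)
	sort.SliceStable(out, func(i, j int) bool {
		if out[i].DeviceManagement != out[j].DeviceManagement {
			return out[i].DeviceManagement
		}
		return out[i].UncoveredLines > out[j].UncoveredLines
	})
	return out
}

func main() {
	// Illustrative numbers only, not real coverage data.
	gaps := []Gap{
		{Service: "domainmgr", DeviceManagement: false, UncoveredLines: 900},
		{Service: "nim", DeviceManagement: true, UncoveredLines: 100},
		{Service: "zedagent", DeviceManagement: true, UncoveredLines: 300},
	}
	for i, g := range Prioritize(gaps) {
		fmt.Printf("%d. %s (%d uncovered lines)\n", i+1, g.Service, g.UncoveredLines)
	}
}
```

With this ordering a small device management gap still outranks a much larger workload management gap, which matches the risk argument above: the cost of a device management failure is physical intervention, not just a remote fix.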