TPM Manager
Overview
tpmmgr is the EVE microservice responsible for managing the
Trusted Platform Module (TPM)
on EVE systems. The TPM is a
Trusted Computing Group standard
for a discrete hardware chip or platform firmware component that
creates crypto keys, seals secrets, measures system components, and
stores those measurements in a tamper-evident way. EVE targets the
TPM 2.0 revision of the standard; a TPM 1.0 device, or a TPM 2.0
that fails the runtime checks in etpm.IsTpmEnabled(), is treated
as "no usable TPM" and tpmmgr switches to the soft fallback paths.
tpmmgr is the userspace interface between EVE and that hardware (or
firmware).
tpmmgr runs in two modes:
- Single-shot CLI, invoked from
onboot.sh/device-steps.shduring boot to provision the device identity key/cert and the EK / SRK / AK / quote / ECDH keys, generate the matching X.509 certs, and save TPM vendor info. Each subcommand is a one-shot process. - Long-running service, started under
zedboxafter onboarding. It publishes the device's edge-node certificates tozedagent, services controller-initiated attestation quotes viaAttestNonce→AttestQuote, and runs an hourlytpmSanityCheckwhoseTpmSanityStatuspublication drivesnodeagent'sMaintenanceModeReasonTpm{Enc,Quote}Failurepaths. TheEdgeNodeCertpublication carries the EK'sTPM2B_PUBLICas metadata so the controller can verify the key chain back to the TPM manufacturer.
Without a usable TPM (no TPM at all, TPM 1.x, or a TPM 2.0 that fails
etpm.IsTpmEnabled()) the device identity key and ECDH / quote certs
are generated as soft keys on disk so the device can still onboard.
The EK cert is not published, attestation quotes are not produced,
and the sanity check is skipped.
The on-TPM data layout — well-known handles, NV indices for the
device cert and credentials, the sealed disk-key blob, PCR selections
— is owned by the pkg/pillar/evetpm package; see
pkg/pillar/evetpm/evetpm.md for the canonical
table. tpmmgr is the microservice that drives it; vaultmgr and
zedagent use the same package directly for vault key sealing/unsealing
and for cert hashing.
Where edge-node certificates live on disk
| Cert / key | Path | TPM-backed? | Survives EVE re-install? |
|---|---|---|---|
| Device cert PEM | /config/device.cert.pem |
Yes (private key in TPM) | Yes — the cert PEM is also mirrored into TPM NV index 0x1500000, so a fresh /config is restored from NV on first boot |
| Device key PEM | /config/device.key.pem |
No (soft path only) | Yes if present (only exists on no-TPM platforms) |
| ECDH cert / key | /persist/certs/ecdh.{cert,key}.pem |
Cert yes; soft key only when TPM path failed | No — regenerated on first boot after re-install |
| Quote ("attest") cert / key | /persist/certs/attest.{cert,key}.pem |
Cert yes; soft key only when TPM path failed | No — regenerated on first boot after re-install |
| EK cert | /persist/certs/ek.cert.pem |
Yes (no soft variant) | No — regenerated on first boot after re-install |
The asymmetry matters: re-installing EVE preserves the device
identity (because it lives in TPM NVRAM), but the EK / ECDH / quote
certs are regenerated. The controller therefore sees fresh
EdgeNodeCert publications after every re-install even though the
device's identity hasn't changed.
Key Input/Output
tpmmgr consumes (via pubsub unless noted):
- global configuration
ConfigItemValueMapfromzedagent— only used for log levels.- attestation requests
AttestNoncefromzedagent— published with a fresh nonce each time the controller wants a quote; carriesRequesterfor log traceability andNonce(12–32 bytes).- TPM device
/dev/tpmrm0(etpm.TpmDevicePath) — the resource-managed TPM device file. All TPM2 commands go through this.- Persistent handles (EK / SRK / AK / quote / ECDH at
0x8100000{1..5}, device key at0x817FFFFF) and NV indices (device-cert backup0x1500000, credentials0x1600000, sealed disk-key pair0x1800000/0x1900000) — see evetpm.md for the canonical table.tpmmgritself touches the device key (for cert signing viaetpm.TpmPrivateKey), the quote key (attestation), NV0x1500000(device-cert backup), and NV0x1600000(the credential used as the device-keyuserAuth); the sealed disk-key NV pair is owned byvaultmgr. - on-disk
/config/tpm_credential— TPM credentials cache, mirrored from NV index0x1600000. Created on first boot; used as the device-keyuserAuth— seeTpmPasswdHdlin evetpm.md./config/device.cert.pem,/config/device.key.pem— device identity cert and (for non-TPM platforms) its private key./persist/certs/ecdh.cert.pem,attest.cert.pem,ek.cert.pem— published edge-node certs./persist/certs/ecdh.key.pem,attest.key.pem— soft private keys, written when the TPM path is unavailable or the per-cert TPM path fails and the dispatcher falls back to soft.
tpmmgr publishes:
EdgeNodeCert(persistent) — one entry per cert type (CertTypeEcdhXchange,CertTypeRestrictSigning,CertTypeEk). Carries the cert PEM, a sha256-first-16 hash asCertID, anIsTpmflag indicating whether the corresponding key actually lives in the TPM, and (for the EK only) aMetaDataItemsslice containing the EK'sTPM2B_PUBLICblob. Consumed byzedagent, forwarded to the controller.AttestQuote— one entry per nonce, keyed on the nonce itself. Carries the signed quote (ECDSA-SHA256), the signature, and the raw PCR values (SHA256 bank) read separately for the controller's use. Only produced when a usable TPM is present. The read range and the in-quote range are separately defined — see Attestation quote. Consumed byzedagent's attestation FSM, then unpublished after the controller confirms receipt.TpmSanityStatus— keyed on the TPM device path.Statusis one ofMaintenanceModeReasonNone(TPM healthy),MaintenanceModeReasonTpmEncFailure(encrypt/decrypt round-trip failed), orMaintenanceModeReasonTpmQuoteFailure(Quotecmd failed). Consumed bynodeagentto drive Maintenance Mode.
Components
tpmmgr lives in a single source file (cmd/tpmmgr/tpmmgr.go) with
no internal sub-packages. The logical responsibilities are partitioned
across function groups as follows.
Lifecycle / pubsub wiring (long-running service)
Run() blocks on wait.WaitForOnboarded(), subscribes
ConfigItemValueMap and AttestNonce, creates the publications,
and immediately publishes the three edge-node certificates that
must reach the controller for attestation to work:
ecdh.cert.pemasCertTypeEcdhXchangeattest.cert.pemasCertTypeRestrictSigningek.cert.pemasCertTypeEkwith the EK'sTPM2B_PUBLICattached viagetEkCertMetaData()
IsTpm is set per-cert: true only if the TPM is enabled and no soft
fallback key file is present (etpm.EcdhKeyFile for ECDH,
quoteKeyFile for the quote cert). The EK is always TPM-backed and
is not published when there is no usable TPM.
After GCInitialized, Run() enters its event loop driving an
hourly tpmSanityCheckTicker.
Single-shot CLI (runCommand)
Subcommands and the boot script that invokes each:
| Subcommand | Caller | What it does |
|---|---|---|
createDeviceCert |
device-steps.sh (TPM present, no soft key file) |
TPM-rooted device-identity provisioning — see Key + cert creation. |
createSoftDeviceCert |
device-steps.sh (no TPM, or createDeviceCert failed) |
Soft device-identity provisioning. |
createCerts |
device-steps.sh (TPM present) |
TPM-rooted ECDH / quote / EK certs; per-cert soft fallback. |
createSoftCerts |
device-steps.sh (no TPM) |
Soft ECDH and quote certs (no EK). |
genCredentials / readCredentials |
startup | Generate/read the UUID at /config/tpm_credential ↔ NV 0x1600000. Used as the device-key userAuth (see Key Input/Output). |
saveTpmInfo <file> |
device-steps.sh |
Dump TPM vendor / firmware description (etpm.FetchTpmHwInfoDescription) for diagnostic bundles. |
printCapability / printPCRs |
manual (debugging) | Print vendor info; read all PCRs (SHA256) and a sample quote. |
testTpmEcdhSupport / testEcdhAES / testEncryptDecrypt |
manual / CI (debugging) | Exercise the ECDH, ECDH-AES, and EncryptDecryptUsingTpm round trips (the same primitives vaultmgr and the controller cipher path use). |
Key + cert creation (TPM-rooted vs soft)
For each non-EK cert (device, quote, ECDH), tpmmgr has paired
create*OnTpm / create*Soft implementations. The TPM-rooted variant
signs via etpm.TpmPrivateKey (which routes through the device key
at 0x817FFFFF); the soft variant generates an in-memory ECDSA P-256
key with ecdsa.GenerateKey and writes the matching .key.pem next
to the cert. The dispatchers createQuoteCert / createEcdhCert
prefer TPM and fall back to soft on failure. createDeviceCert does
not fall back internally — device-steps.sh catches the non-zero
exit and invokes createSoftDeviceCert.
createOtherKeys(override bool) bulk-creates the EK / SRK / AK /
quote / ECDH keys at their well-known handles. override=true
(createDeviceCert) wipes and recreates from a fresh owner;
override=false (createCerts) only fills empty slots. The EK has
no soft variant and is therefore not published when there is no
usable TPM.
Attestation quote (getQuote / handleAttestNonceImpl)
getQuote(nonce) validates nonce length (12–32 bytes), reads every
PCR 0..23 in the SHA256 bank, and signs PCRs 0..15
(etpm.PcrListForQuote) via tpm2.Quote(TpmQuoteKeyHdl, …). The raw
0..23 values ride along in AttestQuote.PCRs (unsigned), letting the
controller inspect the dynamic-PCR range without re-deriving the
quote; PCRs 16–23 are deliberately excluded from the signed range.
The signed selection itself is fixed — disk-key sealing in vaultmgr
is a separate path with its own controller-configurable selection
(see PCR selections in evetpm.md).
handleAttestNonceImpl calls getQuote and publishes
AttestQuote{Nonce, SigType:EcdsaSha256, Signature, Quote, PCRs}
keyed on the nonce; handleAttestNonceDelete unpublishes on delete.
TPM sanity check
tpmSanityCheck() runs every hour from the long-running service.
It exists because some TPM failures only surface when the device
needs an upgrade or the controller needs a quote — i.e. far too
late. The check verifies the two operations whose failure modes are
not caught by ordinary device traffic:
etpm.EncryptDecryptUsingTpmround-trip — this is whatvaultmgruses to wrap/unwrap vault keys; failure means a PCR-changing upgrade can no longer be rescued by the controller, which would maroon the device on the old image.tpm2.Quotewith a random nonce — this is what attestation uses; failure means the controller cannot verify integrity even iftpmmgris otherwise running.
The result is published as TpmSanityStatus keyed on the TPM device
path. nodeagent watches this publication and lifts the corresponding
maintenance reason when the status flips back to
MaintenanceModeReasonNone.
The other TPM-failure paths are caught implicitly: vault seal/unseal
failure sets VaultStatus=ERROR (which nodeagent handles), and
device-key signing is exercised on every onboarding handshake.
TSS 2.0 backend
TPM2 commands go through Google's
go-tpm (Apache 2.0). EVE pulls in
both the legacy/tpm2 API (used by tpmmgr and most of evetpm) and
the newer top-level tpm2 (used by pkg/pillar/evetpm/enc_seal.go for
encrypted disk-key seal/unseal, where parameter encryption is only
exposed on the v2 API). The full file-by-file split and the v1→v2
migration status are documented in
pkg/pillar/evetpm/evetpm.md.
Control-flow
There are four largely independent paths through tpmmgr.
1. First-boot provisioning (single-shot CLI)
device-steps.sh: device.cert.pem missing
if /dev/tpmrm0 present and /config/device.key.pem absent:
tpmmgr createDeviceCert
├─ initializeDirs()
├─ genCredentials() /config/tpm_credential + NV 0x1600000
├─ readDeviceCert() re-read previous cert from NV (idempotent)
└─ on miss:
├─ createDeviceKey() CreatePrimary at owner hierarchy,
│ EvictControl into TpmDeviceKeyHdl
├─ createDeviceCertOnTpm() x509 cert signed via TpmPrivateKey
├─ writeDeviceCert() /config/device.cert.pem + NV 0x1500000
└─ createOtherKeys(override=true) EK/SRK/AK/quote/ECDH
on failure:
tpmmgr createSoftDeviceCert soft ECDSA, /config/device.{key,cert}.pem
else:
tpmmgr createSoftDeviceCert
Then, once /config/server and /config/root-certificate.pem are
present:
device-steps.sh: provision additional certs
if /dev/tpmrm0 present and /config/device.key.pem absent:
tpmmgr createCerts
├─ createOtherKeys(override=false) create only what's missing
├─ createEcdhCert() TPM-rooted, fall back to soft
├─ createQuoteCert() TPM-rooted, fall back to soft
└─ createEkCert() TPM-rooted, no soft fallback
else:
tpmmgr createSoftCerts ECDH + quote, soft only
2. Boot-time long-running service (Run())
Run()
└─ wait.WaitForOnboarded()
└─ subscribe ConfigItemValueMap, AttestNonce
└─ create publications: AttestQuote, EdgeNodeCert (persistent), TpmSanityStatus
└─ publishEdgeNodeCertToController(ecdh.cert.pem, CertTypeEcdhXchange, IsTpm=…)
└─ publishEdgeNodeCertToController(attest.cert.pem, CertTypeRestrictSigning, IsTpm=…)
└─ publishEdgeNodeCertToController(ek.cert.pem, CertTypeEk, IsTpm=true,
metaData=getEkCertMetaData()) TPM2B_PUBLIC
└─ wait for GCInitialized
└─ if /config/tpm_credential missing: readCredentials() recover from NV
└─ start tpmSanityCheckTicker (1h)
└─ event loop:
AttestNonce arrives → handleAttestNonceImpl
→ getQuote() → publish AttestQuote
AttestNonce deleted → handleAttestNonceDelete → unpublish
tpmSanityCheckTicker fires → periodicTpmSanityCheck
→ publish TpmSanityStatus
3. Attestation quote
zedagent.AttestNonce{Nonce, Requester} arrives
→ handleAttestNonceImpl
→ getQuote(Nonce)
├─ validate len(Nonce) ∈ [12,32]
├─ for i in 0..23: tpm2.ReadPCR(i, AlgSHA256)
└─ tpm2.Quote(TpmQuoteKeyHdl, Nonce, PcrListForQuote=[0..15])
→ asn1.Marshal({R,S})
→ publish AttestQuote{Nonce, SigType:EcdsaSha256, Signature, Quote, PCRs}
zedagent picks up AttestQuote and forwards it to controller.
After controller acks: zedagent removes AttestNonce
→ handleAttestNonceDelete → unpublish AttestQuote
4. Periodic TPM sanity check
tpmSanityCheckTicker fires (every 1h)
→ periodicTpmSanityCheck
if !etpm.IsTpmEnabled(): return
→ tpmSanityCheck()
├─ EncryptDecryptUsingTpm(msg, true) encrypt
├─ EncryptDecryptUsingTpm(enc, false) decrypt
├─ bytes.Equal(msg, decrypted)?
└─ getQuote(random nonce)
on failure:
→ publish TpmSanityStatus{Status:TpmEncFailure | TpmQuoteFailure,
ErrorAndTime{...}}
on success:
→ publish TpmSanityStatus{Status:None}
nodeagent watches TpmSanityStatus and adds/removes the matching
maintenance reason. A device whose TPM has degraded — but is still
running — therefore goes into Maintenance Mode within an hour, well
before the next baseos upgrade attempt.
Debugging
PubSub
ls /persist/status/tpmmgr/EdgeNodeCert/ # one file per cert type
cat /run/tpmmgr/AttestQuote/*.json | jq # only present during attestation
cat /run/tpmmgr/TpmSanityStatus/*.json | jq
EdgeNodeCert is the canonical "what does this device claim to be?"
output. TpmSanityStatus.Status==MaintenanceModeReasonNone and a
recent ErrorAndTime cleared is the "TPM healthy" indicator.
Files of interest
/dev/tpmrm0— resource-managed TPM device/config/device.cert.pem,/config/device.key.pem— identity cert (key file only present on no-TPM platforms)/config/tpm_credential— TPM credentials cache, mirror of NV0x1600000/persist/certs/{ecdh,attest,ek}.cert.pem— published certs/persist/certs/{ecdh,attest}.key.pem— soft private keys, if any/persist/status/tpm_measurement_seal_success,/persist/status/tpm_measurement_unseal_fail— written bypkg/pillar/evetpm(not bytpmmgritself); used byFindMismatchingPCRsto diagnose why a vault unseal failed
Useful CLIs (from a pillar shell)
/opt/zededa/bin/tpmmgr printCapability # vendor / firmware info
/opt/zededa/bin/tpmmgr printPCRs # PCR 0..23 SHA256 + sample quote
/opt/zededa/bin/tpmmgr testTpmEcdhSupport # ECDHKeyGen + ECDHZGen round trip
/opt/zededa/bin/tpmmgr testEncryptDecrypt # what vaultmgr's wrap/unwrap relies on
/opt/zededa/bin/tpmmgr testEcdhAES # full ECDH-AES exchange against ecdh.cert.pem
/opt/zededa/bin/tpmmgr saveTpmInfo /tmp/tpm.txt
Logs
tpmmgr's log records ship through newlogd like every other
agent. On a running device, recent (not-yet-uploaded) batches land
under /persist/newlog/devUpload/ as gzipped JSON; filter on
source=tpmmgr:
zcat /persist/newlog/devUpload/*.gz | jq -c 'select(.source=="tpmmgr")'
Once uploaded, the same records are available in the controller's log store.
Useful grep patterns
These are literal substrings from log calls in
pkg/pillar/cmd/tpmmgr/tpmmgr.go; they have no printf directives
and can be fed directly to grep (or jq 'select(.msg | contains(...))').
"TPM Quote Nonce" – attestation quote requested by zedagent
"Received quote request from" – same line, includes the Requester string
"publishing quote for nonce" – AttestQuote being published
"Unpublishing quote for nonce" – AttestQuote being removed after controller ack
"TPM sanity check failed" – periodic 1h check failed (errored detail follows)
"failed to encrypt key using TPM" – TpmEncFailure path (wrapped into TpmSanityStatus.ErrorAndTime)
"failed to get quote using TPM" – TpmQuoteFailure path (wrapped into TpmSanityStatus.ErrorAndTime)
"readDeviceCert success" – first-boot path found cert in NV index 0x1500000
"readDeviceCert failed" – first-boot path actually creating the device key (subsequent log line: "generating new key and cert")
"CreatePrimary failed" – TPM owner-hierarchy refusal; usually means BIOS reset needed
"Error in creating Endorsement Key Certificate" – EK cert creation failed (no soft fallback; cert simply not published)
"publishEdgeNodeCertToController failed: no cert file" – cert PEM missing on disk
Forcing transitions for development
- Trigger the sanity-check failure path: replace
/dev/tpmrm0with aswtpminstance whose state has been damaged (e.g. force a key handle to be evicted). The nexttpmSanityCheckTickertick will publishTpmSanityStatus{Status:TpmEncFailure}, and within 10snodeagentwill addMaintenanceModeReasonTpmEncFailuretoNodeAgentStatus.LocalMaintenanceModeReasons. - Force soft fallback on an already-onboarded device: drop a
(dummy)
/config/device.key.pemfile.etpm.IsTpmEnabled()is defined asdevice.cert.pem exists AND device.key.pem does NOT exist, so the soft key file makes it returnfalse; subsequentEdgeNodeCertpublications carryIsTpm=false. This does not retro-actively unwind earlier TPM state — keys at the well-known handles remain, and an EK cert that was already published stays in/persist/certs/. To exercise the no-TPM provisioning paths from scratch, omit/dev/tpmrm0before the firstdevice-steps.shrun on a fresh install. - Re-issue device cert without re-onboarding: delete
/config/device.cert.pemand reboot.device-steps.shwill re-invoketpmmgr createDeviceCert, which finds the previous cert in NV0x1500000viareadDeviceCert()and restores it — no re-onboarding required.
Further reading
- EVE Security Architecture —
device identity, onboarding, attestation, and the broader EVE
security foundation that
tpmmgrplugs into. - Trusted Computing Group — the governing body for TPM specifications.
go-tpm— the Go librarytpmmgrandpkg/pillar/evetpmuse to drive the TPM (Apache 2.0).