Nomad
Key Management
Nomad servers maintain an encryption keyring used to encrypt Variables and sign task workload identities. The servers encrypt these data encryption keys (DEK) and store the wrapped keys in Raft.
The key encryption key (KEK) used to encrypt the DEK is controlled by the
keyring
provider. When using an external KMS or Vault transit encryption
providers, the KEK is securely stored outside of Nomad. For the default AEAD
provider, the KEK is stored in cleartext in Raft.
Under normal operations the keyring is entirely managed by Nomad, but this section provides administrators additional context around key replication and recovery.
Key Rotation
Only one key in the keyring is "active" at any given time, and all encryption and signing operations happen on the leader. Nomad automatically rotates the active encryption key every 30 days. When a key is rotated, the existing keys are marked as "inactive" but not deleted, so they can be used for decrypting previously encrypted variables and verifying workload identities for existing allocations.
If you believe key material has been compromised, you can execute nomad
operator root keyring rotate -full
. A new "active" key will be created and
"inactive" keys will be marked "rekeying". Nomad will asynchronously decrypt and
re-encrypt all variables with the new key. As each key's variables are encrypted
with the new key, the old key will marked as "deprecated".
Key Decryption
When a leader is elected, it creates the keyring if it does not already exist. When a key is added, the new wrapped key material is replicated via Raft. As each server replicates the new key, it will start a task to decrypt the key material. Until this task completes, the server will not be able to serve requests that require this key.
Key Redaction in Raft Snapshots
The default AEAD keyring
configuration stores the KEK in Raft. Raft snapshots
will contain the cleartext KEK. The nomad operator snapshot save
command has a
-redact
option that will remove the key material when creating a snapshot. The
nomad operator snapshot redact
command will remove key material from an
existing snapshot.
Redacting key material is not required when using an external KMS.
Legacy Keystore
Versions of Nomad prior to 1.9.0 stored only key metadata in raft, but the
encryption key material was stored in a separate file in the keystore
subdirectory of the Nomad data directory. These files have the extension
.nks.json
. The key material in each file is wrapped in a unique key encryption
key (KEK) that is not shared between servers.
Each server runs a key replication process that watches for changes to the state store and will fetch the key material from the leader asynchronously, falling back to retrieving from other servers in the case where a key is written immediately before a leader election. Nomad 1.9.0 and above can replicate keys from older servers.
However, this means that to restore an older cluster from snapshot you need to
also provide the keystore directory with the .nks.json
key files on at least
one server. The .nks.json
key files are unique per server, but only one
server's key files are needed to recover the cluster. Operators should continue
to include these files as part of your organization's backup and recovery
strategy for the cluster until the cluster is fully upgraded to Nomad 1.9.0 and
at least one root_key_gc_interval
has passed.
If you are recovering an older Raft snapshot onto a new cluster without running
workloads, you can skip restoring the keyring and run nomad operator root
keyring rotate
once the servers have joined.