Overview
The Vault 1.16.x upgrade guide contains information on deprecations, important or breaking changes, and remediation recommendations for anyone upgrading from Vault 1.15. Please read carefully.
Important changes
External plugin variables take precedence over system variables
Vault gives precedence to plugin environment variables over system environment variables when loading external plugins. The behavior for builtin plugins and plugins that do not specify additional environment variables is unaffected.
For example, if you register an external plugin with SOURCE=child in the env parameter but the main Vault process already has SOURCE=parent defined, the plugin process starts with SOURCE=child.
Refer to the plugin management page for more details on plugin environment variables.
Avoid conflicts with containerized plugins
Containerized plugins do not inherit system-defined environment variables. As a result, containerized plugins cannot have conflicts with Vault environment variables.
How to opt out
To opt out of the precedence change, set the VAULT_PLUGIN_USE_LEGACY_ENV_LAYERING environment variable to true for the main Vault process:
$ export VAULT_PLUGIN_USE_LEGACY_ENV_LAYERING=true
Setting VAULT_PLUGIN_USE_LEGACY_ENV_LAYERING to true tells Vault to:
- prioritize environment variables from the Vault server environment whenever the system detects a variable conflict.
- report on plugin variable conflicts during the unseal process by printing warnings for plugins with conflicting environment variables or logging an informational entry when there are no conflicts.
For example, assume you set VAULT_PLUGIN_USE_LEGACY_ENV_LAYERING to true and have an environment variable SOURCE=parent. If you register an external plugin called myplugin with SOURCE=child, the plugin process starts with SOURCE=parent and Vault reports a conflict for myplugin.
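The two layering modes can be modeled as a dictionary merge. The following is a minimal sketch for illustration only, not Vault's implementation: the function name layer_plugin_env and its parameters are hypothetical, and treating any variable defined on both sides as a conflict is an assumption.

```python
def layer_plugin_env(system_env, plugin_env, legacy=False):
    """Model how an external plugin's environment is layered.

    system_env: variables of the main Vault process.
    plugin_env: variables registered with the plugin (the env parameter).
    legacy:     models VAULT_PLUGIN_USE_LEGACY_ENV_LAYERING=true.

    Returns (effective_env, conflicts). A conflict is assumed to be any
    variable name defined on both sides.
    """
    conflicts = sorted(set(system_env) & set(plugin_env))
    if legacy:
        # Legacy layering: the Vault server environment wins on conflict.
        effective = {**plugin_env, **system_env}
    else:
        # 1.16 default: plugin-specific variables win on conflict.
        effective = {**system_env, **plugin_env}
    return effective, conflicts


env, conflicts = layer_plugin_env({"SOURCE": "parent"}, {"SOURCE": "child"})
print(env["SOURCE"])  # child

env, conflicts = layer_plugin_env(
    {"SOURCE": "parent"}, {"SOURCE": "child"}, legacy=True
)
print(env["SOURCE"], conflicts)  # parent ['SOURCE']
```

The sketch does not model containerized plugins, which do not inherit system-defined variables at all.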
LDAP auth entity alias names no longer include upndomain
The userattr field on the LDAP auth config is now used as the entity alias. Prior to 1.16, the LDAP auth method would detect if upndomain was configured on the mount and then use <cn>@<upndomain> as the entity alias value.
If you do not configure this correctly, users may not have the correct policies attached to their tokens when they log in.
How to opt out
To opt out of the entity alias change, update the userattr field on the config:
userattr="userprincipalname"
Refer to the LDAP auth method (API) page for more details on the configuration.
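To see why the alias name changes, the following sketch compares the two behaviors for a single user. It is an illustrative model only, not the LDAP auth method's code: the function ldap_entity_alias, the user_attrs dictionary, and the pre-1.16 fallback when upndomain is unset are all assumptions.

```python
def ldap_entity_alias(user_attrs, userattr="cn", upndomain="", pre_1_16=False):
    """Return the entity alias name for a user under each behavior.

    user_attrs: the user's LDAP attributes, keyed by lowercase name
                (a hypothetical, simplified representation).
    """
    if pre_1_16 and upndomain:
        # Old behavior: upndomain on the mount forces <cn>@<upndomain>.
        return f"{user_attrs['cn']}@{upndomain}"
    # 1.16 behavior: the alias is the value of the configured userattr.
    # (For pre-1.16 without upndomain, this fallback is an assumption.)
    return user_attrs[userattr.lower()]


alice = {"cn": "alice", "userprincipalname": "alice@example.com"}

print(ldap_entity_alias(alice, upndomain="example.com", pre_1_16=True))
# alice@example.com
print(ldap_entity_alias(alice, upndomain="example.com"))
# alice
print(ldap_entity_alias(alice, userattr="userprincipalname", upndomain="example.com"))
# alice@example.com
```

In this model, setting userattr to userprincipalname restores the alias name that existing entities were created with, which is why the opt-out preserves policy attachments.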
JWT auth login requires bound audiences on the role
JWT auth roles of type "jwt" require the bound_audiences parameter to match at least one of the JWT's aud claims. Prior to 1.16.3, the JWT auth method would ignore token aud claims that were not a list of strings.
If you do not configure this correctly, users may not be able to log in to Vault. To fix the issue, update the role's bound_audiences parameter to match the aud claim on the JWT.
Refer to the JWT auth method (API) page for more details on the configuration.
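The matching rule can be sketched as follows. This is a simplified model, not the JWT auth method's implementation: the function audiences_match is hypothetical, and treating a token with no aud claim as a non-match is an assumption made to keep the example small.

```python
def audiences_match(bound_audiences, aud_claim):
    """Return True if at least one bound audience matches the token's aud.

    aud_claim may be a single string or a list of strings, since a JWT
    can carry the aud claim in either form.
    """
    if isinstance(aud_claim, str):
        token_auds = [aud_claim]
    else:
        token_auds = list(aud_claim or [])
    # Assumption: no aud claim, or no bound audiences, means no match.
    return any(aud in bound_audiences for aud in token_auds)


print(audiences_match(["vault"], "vault"))             # True
print(audiences_match(["vault"], ["other", "vault"]))  # True
print(audiences_match(["vault"], "other"))             # False
print(audiences_match([], "vault"))                    # False
```

The first call shows the case that changed: a single-string aud claim is now compared against bound_audiences instead of being ignored.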
Known issues and workarounds
Error configuring the JWT auth method
Affected versions
- 1.16.1
Issue
An error will occur when configuring the built-in jwt auth method. This will affect new mounts and updates to existing mounts. Existing mounts should not encounter an error if no modifications are made.
See this issue for more details.
This issue is addressed in Vault 1.16.2 and later.
Workaround
Do not attempt to update an existing mount's config. New mounts can run the plugin as an external plugin to avoid the error.
Error logging in with LDAP auth method when anonymous group search is enabled
Affected versions
- 1.16.0
Issue
Depending on their LDAP configuration, some customers may encounter an error when attempting to log in with the LDAP auth method when anonymous group search is enabled. See this issue for more details.
This issue was resolved in 1.16.1.
Workaround
There is no workaround.
Error logging in with LDAP auth method
Affected versions
- 1.16.0
Issue
Depending on their LDAP configuration, some customers may encounter an error when attempting to log in with the LDAP auth method. Active Directory users are affected by this bug. See this issue for more details.
This issue was resolved in 1.16.1.
Workaround
There is no workaround.
Existing clusters do not show the current Vault version in UI by default
Affected versions
- 1.16+
Issue
Previous versions of the Vault UI fetched version information from unauthenticated endpoints like sys/health and sys/seal-status. Since introducing the redact_version TCP listener parameter, version information may no longer be available under some configurations. As a result, the Vault UI now uses a new, authenticated endpoint (sys/internal/ui/version) to fetch version information.
By default, all new Vault servers created with v1.16+ include the following rule in the automatically generated default policy:
# Allow a token to look up the Vault version. This path is not subject to
# redaction like the unauthenticated endpoints that provide the Vault version.
path "sys/internal/ui/version" {
  capabilities = ["read"]
}
However, the default policy for existing Vault servers does not update automatically during the upgrade. You must update the policy manually for the Vault version to be displayed in the Vault UI.
No other functionality in the Vault UI is affected by this issue.
You can use the Vault CLI to update the default policy and allow the Vault UI to query the Vault server for version information:
$ vault policy read default > default-policy.hcl
$ cat >> default-policy.hcl <<'EOF'

# Allow a token to look up the Vault version. This path is not subject to
# redaction like the unauthenticated endpoints that provide the Vault version.
path "sys/internal/ui/version" {
  capabilities = ["read"]
}
EOF
$ vault policy write default ./default-policy.hcl
Default lease count quota enabled when upgrading from Vault versions before 1.9
Affected versions
- 1.16+
Issue
Vault began tracking version history as of version 1.9. As of version 1.16, new Vault installs automatically enable a lease count quota by consulting the version history. If the version history is empty on upgrade, Vault treats the upgrade as a new install and automatically enables a default lease count quota.
Before you upgrade Vault from a version prior to 1.9 to versions 1.16+, you should check the current number of unexpired leases via the vault.expire.num_leases metric.
If the number of unexpired leases is below the default lease count quota of 300000, no extra steps are required.
If the number of unexpired leases is greater than the default threshold of 300000, there is a two-step workaround to safely upgrade without the default lease count quota:
- Upgrade to any Vault version prior to 1.16 (between 1.9 and 1.15) to populate the version store.
- Upgrade to Vault version 1.16+.
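The decision described above can be sketched as a small helper. This is an illustrative aid only: the function plan_upgrade and its return strings are hypothetical, the version is passed as a (major, minor) tuple for simplicity, and treating a lease count of exactly 300000 as requiring the two-step path is a conservative assumption, since the text only covers counts below and above the threshold.

```python
DEFAULT_LEASE_COUNT_QUOTA = 300_000  # default quota value stated in this guide


def plan_upgrade(current_version, num_leases):
    """Return the ordered upgrade steps for reaching Vault 1.16+.

    current_version: (major, minor) tuple, for example (1, 8).
    num_leases:      value of the vault.expire.num_leases metric.
    """
    if current_version >= (1, 9):
        # Version history already exists, so the upgrade is not treated
        # as a new install and no default quota is enabled.
        return ["upgrade to 1.16+"]
    if num_leases < DEFAULT_LEASE_COUNT_QUOTA:
        # The default quota is enabled but is above current usage.
        return ["upgrade to 1.16+"]
    # Populate the version store first so 1.16+ sees an existing cluster.
    return ["upgrade to a version between 1.9 and 1.15", "upgrade to 1.16+"]


print(plan_upgrade((1, 8), 500_000))
# ['upgrade to a version between 1.9 and 1.15', 'upgrade to 1.16+']
print(plan_upgrade((1, 8), 1_000))
# ['upgrade to 1.16+']
```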
You can review, modify, and delete the global default quota at any point with the /sys/quotas/lease-count/default endpoint:
$ vault read sys/quotas/lease-count/default
$ vault delete sys/quotas/lease-count/default
$ vault write sys/quotas/lease-count/default max_leases=<# of max leases>
Refer to Protecting Vault with Resource Quotas for a step-by-step tutorial on quota tuning.
Refer to Lease Explosions for more information on lease management.
PKI OCSP GET requests can return HTTP redirect responses
If a base64 encoded OCSP request contains consecutive '/' characters, the GET request will return a 301 permanent redirect response. If the redirection is followed, the request will not decode as it will not be a properly base64 encoded request.
As a workaround, use OCSP POST requests, which are unaffected.
Impacted versions
Affects all current versions of 1.12.x, 1.13.x, 1.14.x, 1.15.x, 1.16.x
Azure secrets engine role creation failing
Affected versions
- 1.16.0, 1.16.1, 1.16.2
Issue
Creating Azure secrets engine roles by specifying the Azure App Registration Object ID as the application_object_id causes Vault to fail with the error "error loading Application: no application found". This was introduced by vault-plugin-secrets-azure v0.17.0, which incorrectly queries for an application by client_id and not application_object_id.
Workaround
Users can pass a client_id instead of an application_object_id into the application_object_id parameter. However, after upgrading to a version with a fix (1.16.3+), the user will need to remember to switch back to the application_object_id.
Performance Standbys revert to Standby mode on unseal
Affected versions
- 1.14.12
- 1.15.8
- 1.16.2
Issue
If you previously set a value for retention_months via the sys/internal/counters/config endpoint, upgrading to Vault Enterprise versions 1.14.12, 1.15.8, or 1.16.2 will cause Performance Standby nodes to revert to Standby mode.
Nodes running Vault Enterprise versions 1.14.12, 1.15.8, or 1.16.2 that are added to a cluster whose leader runs an older version see any previously set retention_months value and attempt to write the new minimum value of 48. The storage write results in a read-only error:
[ERROR] core: performance standby post-unseal setup failed: error="cannot write to readonly storage"
You can verify the status of your nodes by checking the /sys/health endpoint.
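As a rough illustration of that check, the sketch below classifies a node from an already-parsed sys/health response body. The function node_role and the sample JSON are hypothetical; the sketch assumes the response carries the boolean fields initialized, sealed, standby, and performance_standby, and it leaves fetching the response to whatever HTTP client you already use.

```python
import json


def node_role(health):
    """Classify a node from a parsed sys/health response (a dict)."""
    if not health.get("initialized", False):
        return "uninitialized"
    if health.get("sealed", False):
        return "sealed"
    # Check performance_standby before standby so a node that reports
    # both flags is classified by the more specific one.
    if health.get("performance_standby", False):
        return "performance standby"
    if health.get("standby", False):
        return "standby"
    return "active"


# Hypothetical response body from a node that reverted to Standby mode.
body = '{"initialized": true, "sealed": false, "standby": true, "performance_standby": false}'
print(node_role(json.loads(body)))  # standby
```

A node that you expect to serve as a Performance Standby but that reports as a plain standby is a candidate for the read-only storage error described above.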
Deployments that rely on scaling across Performance Standbys will now forward all requests to the active node, increasing the utilization of the active node.
Post-upgrade cluster membership
During the last step of a full upgrade, the old leader steps down, causing one of the Standby nodes to become leader.
A fix for the read-only storage error has been prioritized and escalated. The fix will be in releases 1.14.13, 1.15.9 and 1.16.3.
Important
If you have already upgraded to versions 1.14.12, 1.15.8, or 1.16.2, please refer to the workaround section for options.
Workaround
Once the leader of the cluster has been upgraded to version 1.14.12, 1.15.8, or 1.16.2, the workaround is to update the retention_months value on the active node via the sys/internal/counters/config endpoint:
$ vault write sys/internal/counters/config retention_months=48
This storage entry will be written to all nodes in the cluster, allowing them to immediately unseal as Performance Standbys.
After the new retention_months value is written to storage on the active node, adding new nodes to the cluster will not cause the read-only error.
Sending SIGHUP to vault standby node causes panic
Affected versions
- 1.13.4+
- 1.14.0+
- 1.15.0+
- 1.16.0+
Issue
Sending a SIGHUP to a Vault standby node running an enterprise build can cause a panic if there is a change to the license or reporting configuration. Active and performance standby nodes are not affected. It is recommended that operators stop and restart Vault nodes individually if configuration changes are required.
Workaround
Instead of issuing a SIGHUP, users should stop individual Vault nodes, update the configuration or license, and then restart the node.
New nodes added by autopilot upgrades provisioned with the wrong version
Affected versions
- 1.15.3 - 1.15.9
- 1.16.1 - 1.16.3
Issue
If autopilot_upgrade_version is not explicitly set in the Vault configuration file in the storage section, new non-active nodes will retain their original Vault version as opposed to the new version.
Workaround
Set the desired version in the configuration file as autopilot_upgrade_version=<version string>. This will allow all nodes to receive the proper version to upgrade to.