Host aggregates can be regarded as a mechanism to further partition an availability zone; while availability zones are visible to users, host aggregates are only visible to administrators. Host aggregates started out as a way to use Xen hypervisor resource pools, but have been generalized to provide a mechanism to allow administrators to assign key-value pairs to groups of machines. Each node can have multiple aggregates, each aggregate can have multiple key-value pairs, and the same key-value pair can be assigned to multiple aggregates. This information can be used in the scheduler to enable advanced scheduling, to set up Xen hypervisor resource pools or to define logical groups for migration. For more information, including an example of associating a group of hosts to a flavor, see Host aggregates and availability zones.
Availability zones are the end-user-visible logical abstraction for partitioning a cloud without knowledge of the physical infrastructure. Nova has no dedicated database model for this abstraction: an availability zone is simply a specific piece of metadata attached to an aggregate. Adding that metadata to an aggregate makes the aggregate visible to end users and consequently allows them to schedule onto a specific set of hosts (the ones belonging to the aggregate).
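The relationship between aggregates and availability zones can be sketched with a small in-memory model. This is an illustration only, not Nova's implementation: the `Aggregate` class, the helper functions, and the host and zone names are hypothetical, and the metadata key is assumed to be `availability_zone`.

```python
from dataclasses import dataclass, field


@dataclass
class Aggregate:
    """Toy stand-in for a host aggregate: a named group of hosts plus metadata."""
    name: str
    hosts: set = field(default_factory=set)
    metadata: dict = field(default_factory=dict)


def user_visible_zones(aggregates):
    """Return zone names exposed through the (assumed) availability_zone key."""
    return sorted({a.metadata["availability_zone"]
                   for a in aggregates if "availability_zone" in a.metadata})


def hosts_in_zone(aggregates, zone):
    """Return every host that belongs to an aggregate tagged with the zone."""
    hosts = set()
    for agg in aggregates:
        if agg.metadata.get("availability_zone") == zone:
            hosts |= agg.hosts
    return sorted(hosts)


aggregates = [
    # Tagged with AZ metadata, so end users can see and target "az-east".
    Aggregate("rack-1", {"cmp-01", "cmp-02"}, {"availability_zone": "az-east"}),
    # Plain key-value metadata only: an administrator-only grouping.
    Aggregate("ssd-nodes", {"cmp-02", "cmp-03"}, {"ssd": "true"}),
]

print(user_visible_zones(aggregates))
print(hosts_in_zone(aggregates, "az-east"))
```

Only the aggregate carrying the zone metadata contributes to what a user can see; the second aggregate stays an administrator-level grouping even though it shares a host with the first.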
That said, there are a few rules to know that diverge from an API perspective between aggregates and availability zones:
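By default, a host is part of a default availability zone even if it does not belong to any aggregate (the configuration option is named default_availability_zone).
Warning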
That last rule can be very error-prone. Although users can see the list of availability zones, they have no way to know whether the default availability zone name (currently nova) is listed because a host belongs to an aggregate whose AZ metadata key is set to nova, or because at least one host does not belong to any aggregate. Consequently, it is highly recommended that users never boot an instance by specifying an explicit AZ named nova, and that operators never set the AZ metadata for an aggregate to nova. Doing so explicitly attaches the instance AZ information to nova, which could break later move operations when either the host is moved to another aggregate or the user wants to migrate the instance.
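The ambiguity described in this warning can be made concrete with a short sketch. It assumes a simplified lookup in which a host reports the zone from its aggregate metadata and otherwise falls back to the default name; the function names, the dictionary layout, and the host names are all hypothetical.

```python
DEFAULT_AZ = "nova"  # the default zone name mentioned in the text


def zone_of_host(host, aggregates):
    """Return the zone from aggregate metadata if present, else the default."""
    for agg in aggregates:
        if host in agg["hosts"] and "availability_zone" in agg["metadata"]:
            return agg["metadata"]["availability_zone"]
    return DEFAULT_AZ


def listed_zones(hosts, aggregates):
    """Return the zone names a user would see for the given hosts."""
    return sorted({zone_of_host(h, aggregates) for h in hosts})


hosts = ["cmp-01", "cmp-02"]

# Deployment 1: cmp-02 is in no aggregate, so it falls back to the default.
implicit = [{"hosts": {"cmp-01"}, "metadata": {"availability_zone": "az-east"}}]

# Deployment 2: an operator explicitly tagged an aggregate "nova" (discouraged).
explicit = implicit + [
    {"hosts": {"cmp-02"}, "metadata": {"availability_zone": "nova"}},
]

# Both deployments present exactly the same zone list to the user.
print(listed_zones(hosts, implicit))
print(listed_zones(hosts, explicit))
```

Because the two listings are identical, a user looking at the zone list cannot tell which of the two situations applies.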
Note
An availability zone name must NOT contain ‘:’ because admin users use that character to specify the hosts where instances are launched during server creation. See Select hosts where instances are launched for more detail.
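A minimal sketch of why the colon is reserved, assuming the admin-only value takes a `ZONE:HOST` form. The helper names below are hypothetical and this is not the validation code Nova itself runs.

```python
def validate_az_name(name):
    """Reject zone names containing ':' since it is reserved as a separator."""
    if ":" in name:
        raise ValueError("availability zone name must not contain ':'")
    return name


def split_zone_and_host(value):
    """Split an assumed 'ZONE:HOST' value; host is None when no ':' is present."""
    zone, sep, host = value.partition(":")
    return zone, (host if sep else None)


print(split_zone_and_host("az-east"))
print(split_zone_and_host("az-east:cmp-01"))
```

If a zone were itself named with a colon, the split above could no longer tell where the zone name ends and the host name begins.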
There is a nice educational video about availability zones from the Rocky summit which can be found here: https://www.openstack.org/videos/vancouver-2018/curse-your-bones-availability-zones-1
There are several ways to move a server to another host: evacuate, resize, cold migrate, live migrate, and unshelve. Move operations typically go through the scheduler to pick the target host, unless a target host is specified and the request forces the server to that host by bypassing the scheduler. Only evacuate and live migrate can forcefully bypass the scheduler and move a server to a specified host, and even then it is highly recommended not to force the move and bypass the scheduler.
With respect to availability zones, a server is restricted to a zone if:
* The server was created with a POST /servers request containing the availability_zone parameter.
* The server create request did not contain the availability_zone parameter but the API service is configured for default_schedule_zone, in which case the server is scheduled to that zone by default.
* The server was unshelved by specifying the availability_zone with the POST /servers/{server_id}/action request using microversion 2.77 or greater.
If the server was not created in a specific zone then it is free to be moved to other zones, i.e. the AvailabilityZoneFilter is a no-op.
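The conditions above (a zone given at create time, a configured default_schedule_zone, or a zone given at unshelve) can be summarized as a small decision function. This is a sketch under the assumption that a zone supplied at unshelve takes precedence over the one recorded at create time; the function and its parameters are hypothetical and do not mirror Nova's internal data model.

```python
from typing import Optional


def restricted_zone(create_az: Optional[str] = None,
                    default_schedule_zone: Optional[str] = None,
                    unshelve_az: Optional[str] = None) -> Optional[str]:
    """Return the zone the server is restricted to, or None if free to move."""
    if unshelve_az:
        # Zone specified when unshelving (microversion 2.77 or greater).
        return unshelve_az
    if create_az:
        # Zone specified in the POST /servers request.
        return create_az
    # No explicit zone: fall back to default_schedule_zone, which may be unset.
    return default_schedule_zone


print(restricted_zone(create_az="zone-a"))
print(restricted_zone(default_schedule_zone="zone-b"))
print(restricted_zone())  # None: the server is not restricted to any zone
```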
Knowing this, it is dangerous to use evacuate or live migrate to force a server that is restricted to a zone onto a host in another zone, because that creates an inconsistency in the internal tracking of where the server should live and may require manually updating the database for that server. For example, if a user creates a server in zone A, the admin then force live migrates it to zone B, and the user later resizes it, the scheduler will try to move it back to zone A, which may or may not work, e.g. if the admin deleted or renamed zone A in the interim.
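The zone A / zone B scenario can be walked through with a toy zone filter. The sketch assumes the filter simply keeps hosts in the server's recorded zone and passes every host when no zone is recorded; `az_filter`, the `server` dictionary, and the host names are hypothetical.

```python
def az_filter(hosts_by_zone, requested_az):
    """Return candidate hosts; with no requested zone every host passes."""
    if requested_az is None:
        return sorted(h for hosts in hosts_by_zone.values() for h in hosts)
    return sorted(hosts_by_zone.get(requested_az, []))


hosts_by_zone = {"zone-a": ["cmp-01"], "zone-b": ["cmp-02"]}

# The user creates the server in zone A, so that zone is recorded.
server = {"requested_az": "zone-a", "host": "cmp-01"}

# The admin forces a live migration to a zone B host, bypassing the scheduler.
# Only the host changes; the recorded zone still says zone A.
server["host"] = "cmp-02"

# A later resize goes through the scheduler, which only offers zone A hosts.
candidates_after_force = az_filter(hosts_by_zone, server["requested_az"])

# If zone A was deleted or renamed in the interim, no candidates remain.
del hosts_by_zone["zone-a"]
candidates_after_delete = az_filter(hosts_by_zone, server["requested_az"])

print(candidates_after_force, candidates_after_delete)
```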
The cinder.cross_az_attach configuration option can be used to restrict servers and the volumes attached to servers to the same availability zone.
A typical use case for setting cross_az_attach=False is to enforce compute and block storage affinity, for example in a High Performance Compute cluster.
By default cross_az_attach is True, meaning that the volumes attached to a server can be in a different availability zone than the server. If set to False, then when creating a server with pre-existing volumes or attaching a volume to a server, the server and volume zone must match, otherwise the request will fail. In addition, if the nova-compute service creates the volumes to attach to the server during server create, it will request that those volumes are created in the same availability zone as the server, which must exist in the block storage (cinder) service.
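The effect of the option can be sketched as follows. The snippet parses a configuration fragment with Python's standard configparser and applies the matching rule described above. The `[cinder]` section name follows from the option's full name, cinder.cross_az_attach. The helper functions are hypothetical and only approximate what the compute service does.

```python
import configparser

# Illustrative configuration fragment that disables cross-zone attachment.
NOVA_CONF = """
[cinder]
cross_az_attach = False
"""


def load_cross_az_attach(text):
    """Read cinder.cross_az_attach, defaulting to True as described above."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser.getboolean("cinder", "cross_az_attach", fallback=True)


def check_attach(server_az, volume_az, cross_az_attach):
    """Raise if the zones differ while cross-zone attachment is disabled."""
    if not cross_az_attach and server_az != volume_az:
        raise ValueError(
            f"server zone {server_az!r} and volume zone {volume_az!r} must match"
        )


allowed = load_cross_az_attach(NOVA_CONF)
check_attach("zone-a", "zone-a", allowed)      # same zone: accepted
try:
    check_attach("zone-a", "zone-b", allowed)  # different zones: rejected
except ValueError as exc:
    print(exc)
```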
As noted in the Implications for moving servers section, forcefully moving a server to another zone could also break affinity with attached volumes.
Note
cross_az_attach=False is neither widely used nor extensively tested, and thus suffers from some known issues.
The OSAPI Admin API is extended to support the following operations:
Using the nova command you can create, delete and manage aggregates. The following section outlines the list of available commands.
* aggregate-list: Print a list of all aggregates.
* aggregate-create <name> [<availability_zone>]: Create a new aggregate with the specified details.
* aggregate-delete <aggregate>: Delete the aggregate by its ID or name.
* aggregate-show <aggregate>: Show details of the aggregate specified by its ID or name.
* aggregate-add-host <aggregate> <host>: Add the host to the aggregate specified by its ID or name.
* aggregate-remove-host <aggregate> <host>: Remove the specified host from the aggregate specified by its ID or name.
* aggregate-set-metadata <aggregate> <key=value> [<key=value> ...]: Update the metadata associated with the aggregate specified by its ID or name.
* aggregate-update [--name <name>] [--availability-zone <availability-zone>] <aggregate>: Update the aggregate's name or availability zone.
* host-list: List all hosts by service.
* hypervisor-list [--matching <hostname>] [--marker <marker>] [--limit <limit>]: List hypervisors.
* host-update [--status <enable|disable>] [--maintenance <enable|disable>] <hostname>: Put/resume host into/from maintenance.
* service-enable <id>: Enable the service.
* service-disable [--reason <reason>] <id>: Disable the service.
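As an illustration of how the commands listed above might be combined, the following sketch assembles a typical sequence: create an aggregate in a zone, add a host, set metadata, and show the result. It only builds and prints the command lines and does not execute them. The aggregate, zone, and host names are hypothetical, and `nova_cmd` is a helper invented for this example.

```python
import shlex


def nova_cmd(subcommand, *args):
    """Build the argument vector for one nova CLI invocation (not executed)."""
    return ["nova", subcommand, *args]


steps = [
    nova_cmd("aggregate-create", "rack-1", "az-east"),
    nova_cmd("aggregate-add-host", "rack-1", "cmp-01"),
    nova_cmd("aggregate-set-metadata", "rack-1", "ssd=true"),
    nova_cmd("aggregate-show", "rack-1"),
]

for argv in steps:
    # shlex.join quotes arguments where needed for safe copy-and-paste.
    print(shlex.join(argv))
```

To run the commands for real, each list could be passed to `subprocess.run(argv, check=True)` on a host where the nova client is installed and credentials are loaded.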
Except where otherwise noted, this document is licensed under Creative Commons Attribution 3.0 License. See all OpenStack Legal Documents.