This document explains how to reserve blocks of capacity by asking your account team to create a future reservation request for you. For other ways to get compute resources in AI Hypercomputer, see Choose a consumption option.
To help ensure that your workloads have the necessary resources, request a future reservation from Google. This process lets you reserve blocks of capacity for a defined duration, starting on a specific date and time that you choose. You can then use the reserved capacity to create virtual machine (VM) instances that match the reservation until the reservation period ends.
Limitations
Before you reserve blocks of capacity, consider the following:
You can reserve A4 and A3 Ultra machine types only.
After Google creates a draft future reservation request for you, the following limitations apply:
You can't modify the request details, including the share type.
After the request is submitted, approved, and its state changes to PROVISIONING, you can't cancel or delete it. You commit to pay for the requested capacity from the request's start time.
After Compute Engine creates an on-demand reservation to fulfill your requested capacity, the following limitations apply:
You can use or modify the reservation only after the request start time.
You can use the reservation only by specifically targeting it.
You can modify the reservation only to enable or disable usage in Vertex AI.
You can't delete the reservation. Compute Engine deletes it, and any VMs using it, at the request end time.
Before you begin
Select the tab for how you plan to use the samples on this page:
Console
When you use the Google Cloud console to access Google Cloud services and APIs, you don't need to set up authentication.
gcloud
In the Google Cloud console, activate Cloud Shell.
At the bottom of the Google Cloud console, a Cloud Shell session starts and displays a command-line prompt. Cloud Shell is a shell environment with the Google Cloud CLI already installed and with values already set for your current project. It can take a few seconds for the session to initialize.
REST
To use the REST API samples on this page in a local development environment, you use the credentials you provide to the gcloud CLI.
After installing the Google Cloud CLI, initialize it by running the following command:
gcloud init
If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.
For more information, see Authenticate for using REST in the Google Cloud authentication documentation.
Required roles
To get the permissions that you need to create a future reservation request, ask your administrator to grant you the Compute Future Reservation User (roles/compute.futureReservationUser) IAM role on the project.
For more information about granting roles, see Manage access to projects, folders, and organizations.
This predefined role contains the permissions required to create a future reservation request. To see the exact permissions that are required, expand the Required permissions section:
Required permissions
The following permissions are required to create a future reservation request:
- To allow Compute Engine to auto-create reservations: compute.reservations.create on the project
- To create a future reservation request: compute.futureReservations.create on the project
- To specify an instance template: compute.instanceTemplates.useReadOnly on the instance template
You might also be able to get these permissions with custom roles or other predefined roles.
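As a rough illustration of checking the permissions listed above, the following Python sketch compares a required list against the permissions that a caller actually holds. The function names are hypothetical. The request body shape assumes a testIamPermissions-style method, such as the one in the Resource Manager API, which accepts and returns a "permissions" list. The sketch doesn't call any API itself.

```python
# Permissions from the "Required permissions" list that apply to the project.
PROJECT_LEVEL_PERMISSIONS = [
    "compute.reservations.create",
    "compute.futureReservations.create",
]
# This one is granted on the instance template, so it is tracked separately.
TEMPLATE_LEVEL_PERMISSIONS = ["compute.instanceTemplates.useReadOnly"]


def build_test_permissions_body(permissions):
    """Return a JSON-serializable body for a testIamPermissions-style request."""
    return {"permissions": list(permissions)}


def missing_permissions(required, granted):
    """Return the required permissions that are absent from `granted`, in order."""
    granted_set = set(granted)
    return [p for p in required if p not in granted_set]


if __name__ == "__main__":
    # Hypothetical response: only one of the two permissions was granted.
    granted = ["compute.reservations.create"]
    print(missing_permissions(PROJECT_LEVEL_PERMISSIONS, granted))
```

If the returned list is empty, the caller holds every project-level permission in the list. Otherwise, the remaining entries are the ones to raise with an administrator.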
Quota
As part of the future reservation request process, Google manages quota for your reserved resources. You don't need to request quota. At the start time of your approved future reservation, Google increases your quota if you lack it for the reserved resources.
Overview
To reserve blocks of capacity, complete the following steps:
Request capacity through your account team. Contact your account team to specify the type and number of resources that you want to reserve.
Review and submit a draft reservation request. After Google creates a draft future reservation request, review it. If it looks correct, then submit the request for review. Google Cloud approves it within a few minutes.
Request capacity through your account team
Contact your account team and provide the following information for Google to create a draft future reservation request:
Project number: the number of the project where your account team creates the request and Compute Engine provisions the capacity.
Machine type: whether you want to reserve A4 (a4-highgpu-8g) or A3 Ultra (a3-ultragpu-8g) machine types.
Total count: the total number of VMs to reserve. You can only reserve multiples of two VMs. Block sizes and VM count per block vary based on machine type and availability. Your account team can provide more details for your request.
Zone: the zone where you want to reserve capacity. To review the available regions and zones for a GPU machine type, see GPU regions and zones.
Start time: the start time of the reservation period. You can start using the reserved capacity then. Format the start time as an RFC 3339 timestamp:
YYYY-MM-DDTHH:MM:SSOFFSET
Replace the following:
YYYY-MM-DD: a date formatted as a four-digit year, two-digit month, and two-digit day of the month, separated by hyphens (-).
HH:MM:SS: a time formatted as a two-digit hour using 24-hour time, two-digit minutes, and two-digit seconds, separated by colons (:).
OFFSET: the time zone formatted as an offset from Coordinated Universal Time (UTC). For example, to use Pacific Standard Time (PST), specify -08:00. To use no offset, specify Z.
End time: the end time of the reservation period. Format it as an RFC 3339 timestamp. Compute Engine automatically deletes the auto-created reservation, and any VMs using it, at this time.
Share type: whether only your project can use the auto-created reservation (LOCAL), or other projects can use the reservation (SPECIFIC_PROJECTS). This property can't change after you submit the request. To share reserved capacity with other projects in your organization, do the following:
If you haven't already, then ensure that the project where Google creates the request is allowed to create shared reservations.
Provide the numbers of the projects to share the reserved capacity with. You can specify up to 100 projects in your organization.
Reservation name: the name of the reservation that Compute Engine automatically creates to deliver your reserved capacity. Compute Engine only creates specifically targeted reservations.
Commitment name: if your reservation period is one year or longer, then you must purchase and attach a resource-based commitment to your reserved resources. You can purchase a commitment with a 1-year or 3-year plan. If you share the reserved capacity with other projects, then those projects get discounts only if they use the same Cloud Billing account as the project where you reserve capacity. For details, see Enable CUD sharing for resource-based commitments.
When Google creates the draft future reservation request, your account team contacts you.
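Before you contact your account team, it can help to check the details against the constraints described in the preceding list. The following Python sketch is one possible way to do that. The function names are hypothetical, and treating "one year or longer" as 365 days is an assumption made for illustration, not a documented rule.

```python
from datetime import datetime, timedelta, timezone

ALLOWED_MACHINE_TYPES = {"a4-highgpu-8g", "a3-ultragpu-8g"}
MAX_SHARED_PROJECTS = 100


def to_rfc3339(moment):
    """Format a timezone-aware datetime as YYYY-MM-DDTHH:MM:SSOFFSET."""
    if moment.tzinfo is None:
        raise ValueError("start and end times need an explicit UTC offset")
    text = moment.replace(microsecond=0).isoformat()
    # RFC 3339 allows "Z" as shorthand for a zero offset.
    return text[:-6] + "Z" if text.endswith("+00:00") else text


def check_request(machine_type, total_count, start, end, shared_projects=()):
    """Return a list of problems found in the request details (empty if none)."""
    problems = []
    if machine_type not in ALLOWED_MACHINE_TYPES:
        problems.append("machine type must be a4-highgpu-8g or a3-ultragpu-8g")
    if total_count <= 0 or total_count % 2 != 0:
        problems.append("total count must be a positive multiple of two")
    if end <= start:
        problems.append("end time must be after start time")
    if len(shared_projects) > MAX_SHARED_PROJECTS:
        problems.append("at most 100 projects can share the capacity")
    return problems


def needs_commitment(start, end):
    """Approximate the 'one year or longer' rule as 365 days (an assumption)."""
    return end - start >= timedelta(days=365)


if __name__ == "__main__":
    pst = timezone(timedelta(hours=-8))
    # Hypothetical reservation period, used only to show the format.
    start = datetime(2026, 1, 15, 9, 0, 0, tzinfo=pst)
    end = start + timedelta(days=90)
    print(to_rfc3339(start))  # 2026-01-15T09:00:00-08:00
    print(check_request("a4-highgpu-8g", 16, start, end))  # []
    print(needs_commitment(start, end))  # False
```

These checks cover only what this document states. Block sizes and availability still depend on your account team's guidance.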
Review and submit a draft reservation request
After you provide the type and amount of resources to reserve to your account team, Google creates a draft future reservation request. You can review the draft request and, if correct, submit it for review. You must submit the request before the request start time.
To review and submit a draft future reservation request, select one of the following options:
Console
In the Google Cloud console, go to the Reservations page.
Click the Future reservations tab. The Future Reservations table lists each future reservation request in your project, and each table column describes a property.
In the Name column, click the name of the draft request that Google created for you. A page that gives the details of the future reservation request opens.
In the Basic information section, verify that the request details, such as Dates and Share type, are correct. Also, if you requested a commitment, ensure that it's specified. If any of these details are incorrect, then contact your account team.
If everything looks accurate, click Submit. Google Cloud approves your request within a few minutes, and then Compute Engine creates an empty reservation with your requested resources.
gcloud
To view a list of future reservation requests in your project, use the gcloud beta compute future-reservations list command with the --filter flag set to PROCUREMENT_STATUS=DRAFTING:
gcloud beta compute future-reservations list --filter=PROCUREMENT_STATUS=DRAFTING
In the command output, look for the reservation request that has the name that you provided to your account team.
To view the details of the draft request, use the gcloud beta compute future-reservations describe command:
gcloud beta compute future-reservations describe FUTURE_RESERVATION_NAME \
    --zone=ZONE
Replace the following:
FUTURE_RESERVATION_NAME: the name of the draft future reservation request.
ZONE: the zone where Google created the request.
In the command output, verify that the request details, such as the reservation period and share type, are correct. Additionally, if you purchased a commitment, ensure that it's specified. If the details are incorrect, then contact your account team.
To submit the draft request for review, use the gcloud beta compute future-reservations update command with the --planning-status flag set to SUBMITTED:
gcloud beta compute future-reservations update FUTURE_RESERVATION_NAME \
    --planning-status=SUBMITTED \
    --zone=ZONE
Within a few minutes, Google Cloud approves your request, and then Compute Engine creates an empty reservation with your requested resources.
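If you want to script the gcloud steps above, one option is to assemble the same commands as argument lists and run them with Python's subprocess module. The following is a minimal sketch, not an official client. The helper names are hypothetical, and the run helper assumes an installed and authenticated gcloud CLI.

```python
import subprocess

GCLOUD_BASE = ["gcloud", "beta", "compute", "future-reservations"]


def list_drafts_command():
    """Command that lists future reservation requests still in DRAFTING."""
    return GCLOUD_BASE + ["list", "--filter=PROCUREMENT_STATUS=DRAFTING"]


def describe_command(name, zone):
    """Command that shows the details of one draft request."""
    return GCLOUD_BASE + ["describe", name, f"--zone={zone}"]


def submit_command(name, zone):
    """Command that submits the draft request for review."""
    return GCLOUD_BASE + [
        "update",
        name,
        "--planning-status=SUBMITTED",
        f"--zone={zone}",
    ]


def run(command):
    """Run a command and return its standard output as text."""
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Placeholders from this page; replace them before calling run().
    print(" ".join(describe_command("FUTURE_RESERVATION_NAME", "ZONE")))
```

Keeping command construction separate from execution lets you print and review the submit command before you run it, which matters because an approved request can't be canceled after it reaches PROVISIONING.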
REST
To view a list of future reservation requests in your project, make a GET request to the beta futureReservations.list method. In the request URL, include the filter query parameter and set it to status.procurementStatus=DRAFTING:
GET https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/zones/ZONE/futureReservations?filter=status.procurementStatus=DRAFTING
Replace the following:
PROJECT_ID: the ID of the project where Google created the draft future reservation request.
ZONE: the zone where the request exists.
In the request output, look for the reservation request that has the name that you provided to your account team.
To view the details of the draft request, make a GET request to the beta futureReservations.get method:
GET https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/zones/ZONE/futureReservations/FUTURE_RESERVATION_NAME
Replace FUTURE_RESERVATION_NAME with the name of the draft future reservation request.
In the response, verify that the request details, such as the reservation period and share type, are correct. Additionally, if you requested a commitment, ensure that it's specified. If the details are incorrect, then contact your account team.
To submit the draft request for review, make a PATCH request to the beta futureReservations.update method. In the request URL, include the updateMask query parameter and set it to planningStatus:
PATCH https://compute.googleapis.com/compute/beta/projects/PROJECT_ID/zones/ZONE/futureReservations/FUTURE_RESERVATION_NAME?updateMask=planningStatus
{
  "name": "FUTURE_RESERVATION_NAME",
  "planningStatus": "SUBMITTED"
}
Within a few minutes, Google Cloud approves your request, and then Compute Engine creates an empty reservation with your requested resources.
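The REST calls above can also be assembled in Python with the standard library alone. The following sketch builds the list URL and the PATCH request shown in this section. The function names are hypothetical. The access token is assumed to come from your gcloud CLI credentials, for example from the output of gcloud auth print-access-token. The sketch only constructs the request and doesn't send it.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://compute.googleapis.com/compute/beta"


def collection_url(project_id, zone):
    """URL of the futureReservations collection for a project and zone."""
    return f"{API_ROOT}/projects/{project_id}/zones/{zone}/futureReservations"


def list_drafts_url(project_id, zone):
    """URL for futureReservations.list, filtered to DRAFTING requests."""
    # urlencode percent-encodes the "=" inside the filter value as %3D.
    query = urllib.parse.urlencode({"filter": "status.procurementStatus=DRAFTING"})
    return f"{collection_url(project_id, zone)}?{query}"


def submit_request(project_id, zone, name, access_token):
    """Build, but don't send, the PATCH request that submits a draft."""
    url = f"{collection_url(project_id, zone)}/{name}?updateMask=planningStatus"
    body = json.dumps({"name": name, "planningStatus": "SUBMITTED"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Placeholders from this page; "TOKEN" stands in for a real access token.
    request = submit_request("PROJECT_ID", "ZONE", "FUTURE_RESERVATION_NAME", "TOKEN")
    print(request.get_method(), request.full_url)
    # To send it: urllib.request.urlopen(request)
```

Review the draft with a GET request before you send the PATCH request, because you can't modify the request details after submission.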