Pagination
Number	158
Permalink	aip.bybutter.com/158
State	Approved
Created	2019-02-18
Updated	2019-02-18

AIP-158

APIs often need to provide collections of data, most commonly in the List standard method. However, collections can be arbitrarily large and tend to grow over time, which increases lookup time as well as the size of the responses sent over the wire. Therefore, it is important that collections be paginated.

Guidance

RPCs returning collections of data must provide pagination at the outset, as it is a backwards-incompatible change to add pagination to an existing method.

// The request structure for listing books.
message ListBooksRequest {
  // The parent, which owns this collection of books.
  // Format: publishers/{publisher}
  string parent = 1 [
    (google.api.field_behavior) = REQUIRED,
    (google.api.resource_reference) = {
      child_type: "library.googleapis.com/Book"
    }];

  // The maximum number of books to return. The service may return fewer than
  // this value.
  // If unspecified, at most 50 books will be returned.
  // The maximum value is 1000; values above 1000 will be coerced to 1000.
  int32 page_size = 2;

  // A page token, received from a previous `ListBooks` call.
  // Provide this to retrieve the subsequent page.
  //
  // When paginating, all other parameters provided to `ListBooks` must match
  // the call that provided the page token.
  string page_token = 3;
}

// The response structure from listing books.
message ListBooksResponse {
  // The books from the specified publisher.
  repeated Book books = 1;

  // A token that can be sent as `page_token` to retrieve the next page.
  // If this field is omitted, there are no subsequent pages.
  string next_page_token = 2;
}
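
The request and response messages above imply a simple client loop: keep calling with the returned token until the token comes back empty. The following is a minimal sketch of that loop, not a definitive client. The names `iterate_books` and `fake_list_books` are hypothetical, and responses are modeled as plain dicts rather than generated proto classes.

```python
from typing import Callable, Iterator


def iterate_books(list_books: Callable[..., dict], parent: str,
                  page_size: int = 50) -> Iterator[dict]:
    """Yield every book under `parent` by following next_page_token."""
    page_token = ""
    while True:
        # Every argument other than page_token stays the same between calls.
        response = list_books(parent=parent, page_size=page_size,
                              page_token=page_token)
        yield from response.get("books", [])
        page_token = response.get("next_page_token", "")
        if not page_token:
            # An empty token is the only end-of-collection signal.
            return


def fake_list_books(parent: str, page_size: int, page_token: str) -> dict:
    """Hypothetical in-memory stand-in for a real ListBooks RPC.

    The token is a plain offset only to keep this stub short; a real
    service would hand out opaque tokens.
    """
    books = [{"name": f"{parent}/books/{i}"} for i in range(5)]
    start = int(page_token) if page_token else 0
    end = start + page_size
    return {
        "books": books[start:end],
        "next_page_token": str(end) if end < len(books) else "",
    }
```

Note that the loop does not stop on a short or empty page, because a page may contain fewer results than requested even when more pages remain.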

Request messages for collections should define an int32 page_size field, allowing users to specify the maximum number of results to return.
- The page_size field must not be required.
- If the user does not specify page_size (or specifies 0), the API chooses an appropriate default, which the API should document. The API must not return an error.
- If the user specifies page_size greater than the maximum permitted by the API, the API should coerce down to the maximum permitted page size.
- If the user specifies a negative value for page_size, the API must send an INVALID_ARGUMENT error.
- The API may return fewer results than the number requested (including zero results), even if not at the end of the collection.

Request messages for collections should define a string page_token field, allowing users to advance to the next page in the collection.
- The page_token field must not be required.
- If the user changes the page_size in a request for subsequent pages, the service must honor the new page size.
- The user is expected to keep all other arguments to the RPC the same; if any arguments are different, the API should send an INVALID_ARGUMENT error.

The response must not be a streaming response.

Response messages for collections should define a string next_page_token field, providing the user with a page token that may be used to retrieve the next page.
- The field containing pagination results should be the first field in the message and have a field number of 1. It should be a repeated field containing a list of resources constituting a single page of results.
- If the end of the collection has been reached, the next_page_token field must be empty. This is the only way to communicate "end-of-collection" to users.
- If the end of the collection has not been reached (or if the API cannot determine in time), the API must provide a next_page_token.

Response messages for collections may provide an int32 total_size field, providing the user with the total number of items in the list.
- This total may be an estimate (but the API should explicitly document that).
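
The page_size rules above can be sketched as a small normalization step that runs before the query. This is an illustration under stated assumptions: the default of 50 and maximum of 1000 are taken from the ListBooksRequest comments, while `effective_page_size` and the `InvalidArgument` exception are hypothetical names standing in for whatever the service framework provides.

```python
DEFAULT_PAGE_SIZE = 50   # documented default from the ListBooksRequest comments
MAX_PAGE_SIZE = 1000     # documented maximum from the ListBooksRequest comments


class InvalidArgument(ValueError):
    """Hypothetical stand-in for an INVALID_ARGUMENT status."""


def effective_page_size(page_size: int = 0) -> int:
    """Apply the default, coercion, and negative-value rules to page_size."""
    if page_size < 0:
        # Negative values are the only page_size input that is an error.
        raise InvalidArgument("page_size must not be negative")
    if page_size == 0:
        # Unset and 0 are indistinguishable for an int32; both mean "default".
        return DEFAULT_PAGE_SIZE
    # Oversized requests are coerced down rather than rejected.
    return min(page_size, MAX_PAGE_SIZE)
```

For instance, `effective_page_size(5000)` yields 1000 under these constants, and `effective_page_size()` yields 50.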

Skipping results

The request definition for a paginated operation may define an int32 skip field to allow the user to skip results.

The skip value must refer to the number of individual resources to skip, not the number of pages.

For example:

- A request with no page token and a skip value of 30 returns a single page of results starting with the 31st result.
- A request with a page token corresponding to the 51st result (because the first 50 results were returned on the first page) and a skip value of 30 returns a single page of results starting with the 81st result.

If the skip value moves the cursor past the end of the collection, the response must be 200 OK with an empty result set and must not include a next_page_token.
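
The skip arithmetic can be checked with a short sketch. It assumes the page token decodes to a zero-based offset (`start`), which is an implementation choice and not something this guidance requires; `page_with_skip` is a hypothetical helper, and rejecting a negative skip is this sketch's own assumption, since the guidance does not cover that case.

```python
from typing import List, Optional, Sequence, Tuple


def page_with_skip(items: Sequence[int], start: int, skip: int,
                   page_size: int) -> Tuple[List[int], Optional[int]]:
    """Return (page, next_start); next_start is None at end of collection.

    `start` is the zero-based position the page token points at, and
    `skip` counts individual resources, not pages.
    """
    if skip < 0:
        # Assumption of this sketch, not part of the guidance.
        raise ValueError("skip must not be negative")
    begin = start + skip
    page = list(items[begin:begin + page_size])
    end = begin + len(page)
    # Past the end: the slice is empty and no next position is offered.
    next_start = end if end < len(items) else None
    return page, next_start


# Results numbered 1..200 so that values match the "Nth result" wording.
results = list(range(1, 201))
first_page, _ = page_with_skip(results, start=0, skip=30, page_size=50)
second_page, _ = page_with_skip(results, start=50, skip=30, page_size=50)
past_end, next_start = page_with_skip(results, start=50, skip=500, page_size=50)
```

Here `first_page` begins with result 31, `second_page` begins with result 81, and `past_end` is empty with `next_start` set to `None`, matching the cases described above.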

Opacity

Page tokens provided by APIs must be opaque (but URL-safe) strings, and must not be user-parseable. This is because if users are able to deconstruct these, they will do so. This effectively makes the implementation details of your API's pagination become part of the API surface, and it becomes impossible to update those details without breaking users.

Warning: Base-64 encoding an otherwise-transparent page token is not a sufficient obfuscation mechanism.

For page tokens that do not need to be stored in a database, and that do not contain sensitive data, an API may obfuscate the page token by defining an internal protocol buffer message with any data needed, and sending the serialized proto, base-64 encoded.
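
A minimal sketch of the "serialized internal message, base-64 encoded" approach follows. To stay self-contained it uses Python's `struct` module as a stand-in for an internal protocol buffer message; that substitution, the token layout, and the function names are all assumptions of this example. The token also carries a short digest of the other request arguments, so that a token replayed with different arguments can be rejected with INVALID_ARGUMENT.

```python
import base64
import hashlib
import struct

# Hypothetical layout: 8-byte offset plus a 10-byte digest of the request
# arguments. struct stands in for a serialized internal proto message.
_LAYOUT = struct.Struct(">Q10s")


def _fingerprint(parent: str) -> bytes:
    """Digest of the arguments that must stay fixed while paginating."""
    return hashlib.sha256(parent.encode("utf-8")).digest()[:10]


def encode_page_token(offset: int, parent: str) -> str:
    raw = _LAYOUT.pack(offset, _fingerprint(parent))
    # The URL-safe alphabet avoids '+' and '/'; 18 bytes encode without padding.
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_page_token(token: str, parent: str) -> int:
    """Return the offset, or raise ValueError (map to INVALID_ARGUMENT)."""
    try:
        raw = base64.urlsafe_b64decode(token.encode("ascii"))
        offset, fingerprint = _LAYOUT.unpack(raw)
    except (ValueError, struct.error) as exc:
        raise ValueError("malformed page token") from exc
    if fingerprint != _fingerprint(parent):
        raise ValueError("page token does not match the request arguments")
    return offset
```

The token only records where to resume. It grants no access by itself, so the service still authorizes every request that presents one.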

Page tokens must be limited to indicating where to continue the pagination process. They must not provide any form of authorization to the underlying resources; authorization must be performed on the request as with any other request, regardless of the presence of a page token.

Expiring page tokens

Many APIs store page tokens in a database internally. In this situation, APIs may expire page tokens a reasonable time after they have been sent, in order not to needlessly store large amounts of data that is unlikely to be used. It is not necessary to document this behavior.

Note: While a reasonable time may vary between APIs, a good rule of thumb is three days.
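
For APIs that store page tokens, expiry can be sketched as a timestamp check at lookup time. The example below is an in-memory illustration only: `PageTokenStore` is a hypothetical class standing in for a database table, the three-day lifetime follows the rule of thumb above, and the injectable `clock` exists purely to make the behavior testable.

```python
import secrets
import time
from typing import Callable, Dict, Optional, Tuple

TOKEN_TTL_SECONDS = 3 * 24 * 60 * 60  # the three-day rule of thumb


class PageTokenStore:
    """Hypothetical in-memory stand-in for a database table of page tokens."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self._rows: Dict[str, Tuple[int, float]] = {}

    def issue(self, offset: int) -> str:
        """Store the resume position and return a random, URL-safe token."""
        token = secrets.token_urlsafe(16)
        self._rows[token] = (offset, self._clock() + TOKEN_TTL_SECONDS)
        return token

    def resolve(self, token: str) -> Optional[int]:
        """Return the stored offset, or None if unknown or expired."""
        row = self._rows.get(token)
        if row is None:
            return None
        offset, expires_at = row
        if self._clock() >= expires_at:
            # Expired rows are dropped so they stop occupying storage.
            del self._rows[token]
            return None
        return offset
```

Because the token is random and the position lives server-side, this style of token is opaque by construction.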

Backwards compatibility

Adding pagination to an existing RPC is a backwards-incompatible change. This may seem strange; adding fields to proto messages is generally backwards compatible. However, this change is behaviorally incompatible.

Consider a user whose collection has 75 resources, and who has already written and deployed code. If the API later adds pagination fields, and sets the default to 50, then that user's code breaks; it was getting all resources, and now is only getting the first 50 (and does not know to advance pagination). Even if the API set a higher default limit, such as 100, the user's collection could grow, and then the code would break.
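
The breakage described above can be reproduced in a few lines. This is a toy simulation, with hypothetical function names and dict-shaped responses, of a client written before pagination existed calling the method before and after the change.

```python
RESOURCES = [f"books/{i}" for i in range(75)]


def list_books_unpaginated() -> dict:
    """Original behavior: the whole collection in one response."""
    return {"books": list(RESOURCES)}


def list_books_paginated(page_size: int = 0, page_token: str = "") -> dict:
    """Behavior after pagination is added with a default page size of 50.

    The token is a plain offset only to keep the simulation short.
    """
    size = page_size or 50
    start = int(page_token) if page_token else 0
    end = start + size
    return {
        "books": RESOURCES[start:end],
        "next_page_token": str(end) if end < len(RESOURCES) else "",
    }


def existing_client(list_books) -> list:
    """Client deployed before pagination: reads one response, never loops."""
    return list_books()["books"]


before = existing_client(list_books_unpaginated)
after = existing_client(list_books_paginated)
```

In this simulation `before` holds all 75 resources while `after` holds only 50, and the unchanged client has no way to notice the `next_page_token` it is ignoring.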

For this reason, it is important to always add pagination to RPCs returning collections up front; pagination is consistently important, and it cannot be added later without causing problems for existing users.

Warning: This also means that presenting the pagination fields is not enough; pagination must actually be implemented, with a non-infinite default page size. An in-memory implementation (which might fetch everything and then paginate) is reasonable for initially-small collections.
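
An in-memory implementation of the kind mentioned above might look like the following sketch. It assumes `fetch_all` (a hypothetical callable) returns the whole collection in a stable order, that `page_size` has already been validated, and that the returned offset is wrapped in an opaque page token before it leaves the service.

```python
from typing import Callable, List, Optional, Sequence, Tuple

DEFAULT_PAGE_SIZE = 50  # a finite default, as the warning requires


def paginate_in_memory(fetch_all: Callable[[], Sequence],
                       offset: int,
                       page_size: int = 0) -> Tuple[List, Optional[int]]:
    """Fetch everything, then return one page and the next offset.

    The next offset is None once the end of the collection is reached,
    which corresponds to an empty next_page_token in the response.
    """
    everything = fetch_all()
    size = page_size or DEFAULT_PAGE_SIZE
    page = list(everything[offset:offset + size])
    next_offset = offset + len(page)
    return page, (next_offset if next_offset < len(everything) else None)
```

The design choice here is that clients see real pagination behavior from the first release, so the storage layer can later switch to a true paginated query without changing the API surface.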

Changelog

2020-08-24: Clarified that responses are not streaming responses.
2020-06-24: Clarified that page size is always optional for users.
2020-05-13: Added guidance for skipping results.
2019-08-01: Changed the examples from "shelves" to "publishers", to present a better example of resource ownership.
2019-07-19: Updated the opacity requirement from "should" to "must".
2019-02-12: Added guidance on the field being paginated over.