Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add pagination to list APIs. #422

Open
sgotti opened this issue Aug 10, 2023 · 10 comments
Open

Add pagination to list APIs. #422

sgotti opened this issue Aug 10, 2023 · 10 comments
Assignees
Labels
enhancement New feature or request

Comments

@sgotti
Copy link
Member

sgotti commented Aug 10, 2023

Current "list" APIs (with few exceptions) return all the results. This was meant to be only temporary since a sort of pagination is always needed to avoid returning too much data.

There are different options:

  • classic pagination style (see github v3 api). Classic pagination has the downside of becoming slow on the query side when asking for deeper pages (since the sql query will require the use of offset). To avoid this, some APIs, limit the maximum page number.
  • cursor based pagination. Here we can return data after the provided id and could use an opaque cursor that could contain data about the last id and also the API arguments (sorting by, filtering etc...). This will also avoid the use of sql offset in underlying sql queries. One of the cons is that this doesn't provide the ability to jump to a specific page but can only fetch the elements serially.

I'll prefer cursor based pagination (like already done for user/project get runs APIs).

NOTES

This will break current API but the APIs are not yet stable.

Additional required Actions

  • Due to API breakage changes to also agola-web are required since it currently doesn't expect paginated results and should handle them (add a "load more" button, automatically fetch when scrolling, fetch all the whole data doing multiple calls etc...).
@sgotti sgotti added the enhancement New feature or request label Aug 10, 2023
@alessandro-sorint
Copy link
Contributor

We can use a Cursor with this data

StartID    string
PointsNext bool
OrderBy    string
SortOrder  SortOrder
Filters    map[string]string
Limit      int

StartID contain the starting data id(excluded) to fetch
PointsNext if the user ask for next page is true
Filters can contains some filter, else it is empty

When the client will call the gateway API it will can provide some query values like limit, orderby, asc and some filters(if expected), for example:
/projects?orderby=name&asc&limit=10 or /projects?limit=10
There are some APIs(example /runs?start&limit=10) that use start query value that, if we implement cursor pagination, we can remove start(because it can use cursor to ask next page): @sgotti do you think is better to maintain start or remove it?

There are some agola cmd list. We should also add here the cursor asking him in the args. Do you can confirm?

The APIs will return the data and pagination for example:

{
   Orgs           []*OrgResponse
   PaginationInfo *htypes.PaginationInfo 
}

where PaginationInfo contains:

{
   NextCursor string
   PrevCursor string
}

The client will use for the next calls only NextCursor for ask the next page or PrevCursor for ask the previous page, if the values ​​are present(for example the first page don't have PrevCursor).
NextCursor and PrevCursor contain a base64 string of the Cursor described above.

The Gateway take the query values(limit, orderby,ecc.), or the cursor(that contains all the data needed), and use them to call the configstore/runservice.
For example /projects?cursor=eyJzdGFydF9pZCI6IjlhNTczOTVkLWZhZGYtNDU1NS1hZDdmLWMwMjRjNjlkZDA2OCIsInBvaW50c19uZXh0Ijp0cnVlLCJvcmRlcl9ieSI6Im5hbWUiLCJzb3J0X29yZGVyIjowLCJmaWx0ZXJzIjpudWxsLCJsaW1pdCI6Mn0%3D

The decode and management of the cursor is done by the configstore/runservice that will get the data and create the pagination(PaginationInfo) with the cursors.
The Gateway only copy the PaginationInfo to the reponse(and the data).

There are some APIs where it is not possible to implement cursor pagination, as secrets and variables, because they do tree querys when the client use tree parameter.

@sgotti
Copy link
Member Author

sgotti commented Aug 30, 2023

PointsNext if the user ask for next page is true

I'm not sure I understand this.

we can remove start(because it can use cursor to ask next page)

We could also keep start if useful for some api to fetch from a specific id with specific option. Then for the next results the cursor could be used.

There are some agola cmd list. We should also add here the cursor asking him in the args. Can you confirm?

The APIs will return the data and pagination for example:

{
   Orgs           []*OrgResponse
   PaginationInfo *htypes.PaginationInfo 
}

where PaginationInfo contains:

{
   NextCursor string
   PrevCursor string
}

The output should contain a cursor. But there will only be one cursor. It doesn't makes sense to have a prev or next cursor. It'll just be a cursor that point to the next batch of data based on the original query. If the original query has a sort order the cursor will use that.

The decode and management of the cursor is done by the configstore/runservice that will get the data and create the pagination(PaginationInfo) with the cursors.

It's the gateway that manages the exposed API and so it MUST also manage the cursor, since it could call multiple underlying services to create a response. The underlying services don't require a cursor since they aren't exposed.

There are some APIs where it is not possible to implement cursor pagination, as secrets and variables, because they do tree querys when the client use tree parameter.

I don't see why this won't work. Can you explain it?

Remember that if you're going to implement this, also agola-web should be changed and the related PRs should be created to be able to test them.

@alessandro-sorint
Copy link
Contributor

alessandro-sorint commented Aug 30, 2023

I remove PointsNext since we maintain only one cursor for the next page, so the cursor contains this data:

StartID    string
OrderBy    string
SortOrder  SortOrder
Filters    map[string]string
Limit      int

The response of the gateway can be, for example:

{
   Orgs           []*OrgResponse
   Cursor string
}

or we can create a generic response like this:

{
   Data any
   Cursor string
}

What do you like is better?

We keep start for some api where is already implemented.

@sgotti we should add the cursor to the command line list? And the command result must show the cursor to the user.
For example: agola org member list --cursor eyJzdGFydF9pZCI6IjlhNTczOTVkLWZhZGYtNDU1NS1hZDdmLWMwMjRjNjlkZDA2OCIsInBvaW50c19uZXh0Ijp0cnVlLCJvcmRlcl9ieSI6Im5hbWUiLCJzb3J0X29yZGVyIjowLCJmaWx0ZXJzIjpudWxsLCJsaW1pdCI6Mn0=

@sgotti
Copy link
Member Author

sgotti commented Aug 30, 2023

Filters map[string]string

Don't be rigid on this. Filters, but also the other fields could be different based on the api.

What do you like is better?

The latter doesn't make sense.

@sgotti we should add the cursor to the command line list? And the command result must show the cursor to the user.
For example: agola org member list --cursor eyJzdGFydF9pZCI6IjlhNTczOTVkLWZhZGYtNDU1NS1hZDdmLWMwMjRjNjlkZDA2OCIsInBvaW50c19uZXh0Ijp0cnVlLCJvcmRlcl9ieSI6Im5hbWUiLCJzb3J0X29yZGVyIjowLCJmaWx0ZXJzIjpudWxsLCJsaW1pdCI6Mn0=

By default the client could list all the entries by fetching all of them doing multiple calls to the api.

@alessandro-sorint
Copy link
Contributor

Filters map[string]string

Don't be rigid on this. Filters, but also the other fields could be different based on the api.
@sgotti I can define the Cursor as map[string]interface{}

@sgotti
Copy link
Member Author

sgotti commented Aug 30, 2023

@sgotti I can define the Cursor as map[string]interface{}

There's no need to make cursor the same type for every api.

@alessandro-sorint
Copy link
Contributor

@sgotti when the user call the last page do you think is better to return an empty cursor(so it know there are no other pages to ask) or the cursor with the last data id(the same as the other calls)?

  • In the first case the gateway must call the configstore/runservice APIs with a limit+1 to check if there are other pages.
  • In the second case Instead, the client will understand there are no other data when the gateway return no data or len data < limit.

@sgotti
Copy link
Member Author

sgotti commented Aug 31, 2023

@sgotti when the user call the last page do you think is better to return an empty cursor

I'll return an empty cursor.

In the first case the gateway must call the configstore/runservice APIs with a limit+1 to check if there are other pages.

The underlying services could return if there's more data themself.

@alessandro-sorint
Copy link
Contributor

@sgotti with the APIs list secrets/variables I think the pagination can not work when the client use the removeoverridden parameter, because this filter is implemented by the gateway.
For example we have this Secrets:

{
   "name": "secret1","parent_path": "user/user01",
   "name": "secret1","parent_path": "user/user01/subgroup1",
   "name": "secret1","parent_path": "user/user01/subgroup2",
   "name": "secret2","parent_path": "user/user01",
}

If the client use parameter limit=2 the Gateway should return {"name": "secret1","parent_path": "user/user01"} and {"name": "secret2","parent_path": "user/user01"}, but calling the configstore whith limit=2, and next making the overriden filter it will return only {"name": "secret1","parent_path": "user/user01"} that is wrong(it should return also {"name": "secret2","parent_path": "user/user01"}).
We should avoid this implementing the limit by the gateway(instead of by the configstore), but it can be much complex and with too many calls to the configstore.

I suggest to avoid/ignore limit when the client use removeoverridden parameter, or avoid limit for this APIs(secrets/variables).

@sgotti
Copy link
Member Author

sgotti commented Sep 1, 2023

@alessandro-sorint I'm not sure I understand. If the api returns reproducible output as it should, then the cursor will contain the information on the next batch of data. How you get them is an implementation detail.

Limit int

I don't think Limit should be part of the cursor but should be set every time by the call and on the server side there'll be an hard limit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants