Document `peps.json` and move it to the root #2584

AA-Turner · 2022-05-07T10:45:20Z

We currently have https://peps.python.org/api/peps.json, an undocumented file.

It seems that people have use cases for the file, even without official support (see #2567, #2583). I suggest moving the file to the root¹ for consistency with peps.rss and easier discoverability, and simultaneously documenting the file and the minimum guarantees we provide about it.
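For readers wondering what such a use case looks like, here is a minimal sketch of reading the file. The URL is the current one given above; the layout (a top-level object keyed by PEP number, each value carrying a `title` field) is an assumption, since the file is undocumented, and the sample entries are hypothetical so that the sketch runs offline.

```python
import json
from urllib.request import urlopen

# Current (undocumented) location named in this issue.
PEPS_JSON_URL = "https://peps.python.org/api/peps.json"


def fetch_peps(url: str = PEPS_JSON_URL) -> dict:
    """Download and decode the JSON index (needs network access)."""
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def titles_by_number(peps: dict) -> dict[int, str]:
    """Map PEP number -> title.

    Assumes a top-level object keyed by PEP number (as a string) whose
    values have a "title" field. This layout is an assumption, not a
    documented guarantee.
    """
    return {int(number): entry["title"] for number, entry in peps.items()}


# Hypothetical sample in the assumed layout (not real PEP data).
SAMPLE = json.loads("""
{
  "9001": {"title": "Example PEP One", "authors": "Ada Example, Bob Sample"},
  "9002": {"title": "Example PEP Two", "authors": "Ada Example"}
}
""")

print(titles_by_number(SAMPLE))
```

Calling `titles_by_number(fetch_peps())` would do the same against the live file, provided the assumed layout holds.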

In the issue that created it, Cam noted:

> Maybe put this under an /api/peps endpoint (at least once we're ready to more publicly expose it)? Then if there was need/desire in the future, we could have an authors endpoint, sub-endpoints api/peps/N to get a single PEP's metadata, etc.

I think the api/ is unneeded here -- for those use cases, we could instead generate peps.python.org/pep-NNNN.json and peps.python.org/authors.json.

I'd be interested in views.

A

¹ i.e. to https://peps.python.org/peps.json


AA-Turner · 2022-05-07T10:45:55Z

cc: @hugovk @CAM-Gerlach

Rosuav · 2022-05-07T10:59:10Z

👍 No need for the /api in the URL, the fact that it's a .json file implies that it's machine readable.

hugovk · 2022-05-07T13:20:57Z

I don't have a strong opinion, but slightly prefer including api in the base URL.

https://pypi.org/project/pepotron/ is already using the API, so a move would cause 404s. I can fix and release it immediately, and to be honest I doubt many people are using it, but it's still a break.

api.example.com and example.com/api are very common patterns for REST APIs. Looking up some best practices, I don't see anything saying whether api should be there, but it's often used in examples.

A benefit is we can also have the docs at https://peps.python.org/api/ (for example like https://pypistats.org/api/).

AA-Turner · 2022-05-07T21:01:11Z

> for REST APIs

I would argue that this is the wrong conceptual framing to use -- we are not in the business of providing a full programmatic API for PEPs, but simply a representation of the index as a JSON file for easier parsing and use.

In this spirit, the suggested authors.json and pep-NNNN.json files are technically unneeded, as one can parse all that data from the existing peps.json -- we would be providing them as a usability boon, rather than building out an API.
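To illustrate that point, a rough sketch of deriving both kinds of file from the combined index. The field names (`title`, and `authors` as a comma-separated string) and the sample entries are hypothetical; only the file name patterns come from the suggestion above.

```python
import json
from collections import defaultdict


def split_per_pep(peps: dict) -> dict[str, str]:
    """Return {file name: JSON text}, one pep-NNNN.json per entry."""
    return {
        f"pep-{int(number):04d}.json": json.dumps(entry, indent=2)
        for number, entry in peps.items()
    }


def authors_index(peps: dict) -> dict[str, list[int]]:
    """Return {author name: sorted PEP numbers}.

    Assumes each entry has an "authors" field holding a comma-separated
    string of names (hypothetical layout).
    """
    index = defaultdict(list)
    for number, entry in peps.items():
        for name in entry["authors"].split(","):
            index[name.strip()].append(int(number))
    return {name: sorted(numbers) for name, numbers in sorted(index.items())}


# Hypothetical sample in the assumed layout (not real PEP data).
SAMPLE = {
    "9001": {"title": "Example PEP One", "authors": "Ada Example, Bob Sample"},
    "9002": {"title": "Example PEP Two", "authors": "Ada Example"},
}

print(sorted(split_per_pep(SAMPLE)))
print(authors_index(SAMPLE))
```

Everything here is a pure transformation of the one file, which is why the extra files would be a convenience rather than new data.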

I would very much encourage someone to run a "PEP API" service if wanted, but I really don't want to be in the business of anything beyond serving static files in this repo, and I think api has connotations more in line with a bigger, or more fully featured entity than what we're actually providing.

I'm aware some things would break if we moved the URL, but it was explicitly introduced as experimental and undocumented, so anybody making use of the file is doing so at their own risk -- I don't think breakage should be a big argument here. (It will be a different story when we document it, which is why I'd like to do so sooner rather than later).

A

CAM-Gerlach · 2022-05-08T02:30:07Z

My opinion isn't that strong either, but I agree with @hugovk in also preferring api in the URL. In addition to the reasons Hugo mentioned, it draws a clear boundary between the regular user-facing content, which may include files in various formats including JSON, and the machine-readable API that is expected to be relatively stable (once we document it). This allows other users, like PEP-o-tron and @pfmoore's tools, to know what they can rely on, while leaving us more free to change things elsewhere without worrying about breakage.

In any case, before/as part of formally documenting the API, we should provide the data in a more structured, easily-consumable form that is abstracted from that in the source, instead of requiring all tools (including our own) to independently parse the authors, post history, dates, etc. that are each in one of several different non-standard formats. This would allow us to make further changes to the user input format (simplifying it, being more flexible in what we allow, accepting URLs instead of emails for authors, etc.) without having to worry about other tools being able to read it easily. It would also help address @AA-Turner's concerns originally raised on #2358 regarding a lack of structure in the data, difficulty in tools reading it and the format/parsing being tied to reST/Sphinx.
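As a rough sketch of what "abstracted from the source" could mean, the following normalizes raw header strings into a typed record. The `PEPRecord` class, the header names used as keys, the `Name <email>` author form and the `07-May-2022` date form are all assumptions made for illustration; real headers may use other variants that this does not handle.

```python
import re
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class PEPRecord:
    """Hypothetical structured form of a PEP's headers."""
    number: int
    title: str
    authors: tuple[str, ...]
    created: date


def parse_authors(text: str) -> tuple[str, ...]:
    """Split a comma-separated author list, dropping any <email> part."""
    names = (re.sub(r"\s*<[^>]*>", "", part).strip() for part in text.split(","))
    return tuple(name for name in names if name)


def parse_header_date(text: str) -> date:
    """Parse a date written like "07-May-2022" (assumed format).

    %b month names are locale-dependent; this relies on the default C locale.
    """
    return datetime.strptime(text.strip(), "%d-%b-%Y").date()


def to_record(headers: dict[str, str]) -> PEPRecord:
    """Convert raw header strings into a PEPRecord."""
    return PEPRecord(
        number=int(headers["PEP"]),
        title=headers["Title"].strip(),
        authors=parse_authors(headers["Author"]),
        created=parse_header_date(headers["Created"]),
    )


# Hypothetical raw headers (not a real PEP).
RAW = {
    "PEP": "9001",
    "Title": "Example PEP One",
    "Author": "Ada Example <ada@example.invalid>, Bob Sample",
    "Created": "07-May-2022",
}

record = to_record(RAW)
print(record)
```

With the parsing in one place, consumers would only ever see the record, so the input format could change without breaking them.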

In fact, as I've already been thinking about lately and discussed with @JelleZijlstra and @warsaw at PyCon, right now we parse the headers in three different places with three different sets of logic, and instead should just use the structured format above (with the parsing presumably in the PEP class) for all of them, which would be a lot simpler and more DRY, reliable, maintainable and extensible overall. But as that's getting a bit ahead of ourselves, I've opened #2587 for that.

CAM-Gerlach · 2022-05-08T02:54:26Z

> we are not in the business of providing a full programmatic API for PEPs

Using the royal "we", are we? 😂

> I really don't want to be in the business of anything beyond serving static files in this repo

I don't think anyone here is suggesting otherwise. An API is just a machine-readable interface to access some data or functionality; it does not require server-side interactivity, and that is exactly what is being proposed here. For example, the FSF API (full disclosure, I'm one of the maintainers) operates in essentially the same way as ours does.

What calling something an "API" fundamentally conveys is not a particular mechanism (REST, SOAP, etc.) but rather that the data is machine-readable and reasonably stable enough to be used programmatically, which it seems there's already interest in despite our not documenting or publicizing this at all. The value is in enabling the wider ecosystem to easily and reliably consume and enrich PEP metadata for a variety of uses with minimal friction so long as they use what's under api/, while conversely giving us more freedom with the internals and user-facing GUI.

> I'm aware some things would break if we moved the URL, but it was explicitly introduced as experimental and undocumented, so anybody making use of the file is doing so at their own risk -- I don't think breakage should be a big argument here.

Agreed there, but once we do, keeping anything we expect not to break under api (or some other subdir, if we want to bikeshed the name) makes it more clear where that can be expected to hold and where it doesn't.

pfmoore · 2022-05-08T10:23:11Z

> I would argue that this is the wrong conceptual framing to use -- we are not in the business of providing a full programmatic API for PEPs, but simply a representation of the index as a JSON file for easier parsing and use.

As someone who's just found out about the JSON file and is now using it rather than scraping the HTML page, I can confirm that all I need is "a representation of the index as a JSON file for easier parsing and use". I really don't care whether you call it that, or an "API". I don't have any preconceptions about what an "API" might be beyond "a representation as a JSON file" - so call it what you like. But please don't remove it just because of terminology.

onerandomusername · 2022-06-20T19:20:30Z

I'm in the same boat as pfmoore here. I'm using undocumented files and have to keep updating my code to stay with the changes, and it would be great if there was a supported json representation of each pep.

Right now I'm using the sphinx generated objects.inv file at the website root to get a list of all peps in the repo and then when a user requests the pep number, fetching the html file of the pep and parsing the headers and information out of it.
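For context, the first step of that workaround might look roughly like the sketch below, which decodes a version 2 Sphinx inventory (four plain-text header lines followed by a zlib-compressed body of `name domain:role priority uri dispname` lines). How PEPs are actually named inside the real objects.inv is not stated here, so the entry names in the usage example are hypothetical.

```python
import re
import zlib

# One inventory line: name, domain:role, priority, uri, display name.
_ENTRY = re.compile(r"(.+?)\s+(\S+)\s+(-?\d+)\s+?(\S*)\s+(.*)")


def parse_inventory(data: bytes) -> list[dict]:
    """Decode the raw bytes of a version 2 Sphinx objects.inv file."""
    parts = data.split(b"\n", 4)  # four header lines, then the body
    if len(parts) != 5 or not parts[0].startswith(b"# Sphinx inventory version 2"):
        raise ValueError("not a version 2 Sphinx inventory")
    body = zlib.decompress(parts[4]).decode("utf-8")
    entries = []
    for line in body.splitlines():
        match = _ENTRY.fullmatch(line.rstrip())
        if match is None:
            continue  # skip lines that do not look like entries
        name, role, _priority, uri, dispname = match.groups()
        if uri.endswith("$"):  # "$" is shorthand for the entry name
            uri = uri[:-1] + name
        if dispname == "-":  # "-" means "same as the name"
            dispname = name
        entries.append({"name": name, "role": role, "uri": uri, "title": dispname})
    return entries


# Hypothetical inventory built in memory so the sketch runs offline.
_HEADER = (
    b"# Sphinx inventory version 2\n"
    b"# Project: Example\n"
    b"# Version: \n"
    b"# The remainder of this file is compressed using zlib.\n"
)
_BODY = b"pep-9001 std:doc -1 pep-9001/ Example PEP One\npep-9002 std:doc -1 pep-9002/ -\n"
SAMPLE_INV = _HEADER + zlib.compress(_BODY)

for entry in parse_inventory(SAMPLE_INV):
    print(entry["name"], entry["uri"], entry["title"])
```

This only yields names, locations and display titles; the headers and body still have to be scraped from each HTML page, which is the fragile part a supported JSON file would remove.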

In the end, I use the majority of the headers and the html generated content of the pep.

The end result for me ends up being something like this:

So while I can continue doing what I am currently doing, patching it when inevitable updates to the website html occur, it would be beneficial to have an API (even of static json files!) that provides access to all of the content of each pep.

hugovk · 2023-10-12T05:57:49Z

Revisiting this.

Let's document the API at /api/index.html.
If we move the file to the root, we must also transparently redirect from the old to the new, perhaps indefinitely.

AA-Turner added the enhancement label May 7, 2022

AA-Turner self-assigned this May 7, 2022

AA-Turner mentioned this issue May 7, 2022

Is there a supported API/method for getting a list of PEP numbers and their titles? #2583

Closed

CAM-Gerlach added the infra Core infrastructure for building and rendering PEPs label May 7, 2022

CAM-Gerlach mentioned this issue May 8, 2022

Decouple and unify PEP header processing for rendering, PEP 0, JSON, RSS and linting #2587

Open

wookie184 mentioned this issue May 8, 2022

Simplify PEP cog to use PEP API python-discord/bot#2166

Closed

hugovk mentioned this issue Jun 8, 2022

Add support for topic indices #2579

Merged
