Lookup of name in annotation fails on reparented object #295

Open
mthuurne opened this issue Nov 27, 2020 · 6 comments · May be fixed by #723
@mthuurne
Contributor

twisted.logger.extractField() has return type typing.Any, but in the output this is not linked to the standard library documentation, despite the intersphinx inventory being available and other types being linked successfully.

What I think is happening here is that extractField() is implemented in twisted.logger._flatten, which imports Any from the typing module in the standard library. However, by the time the link is resolved, extractField() has been moved into the twisted.logger module, its public location. But this module does not import Any, so the lookup fails.

Note that doing the lookup before the move isn't correct either: when we link to other objects that get moved around, that link would be broken after the move. A similar problem occurs if we keep a reference to the original parent location and attempt to do a lookup there after all moves have happened, since the target would have been moved away by then. I think the only way to do it correctly is to look up the target Documentable in a code model that reflects the actual source structure and, when presenting, ask that Documentable where it resides in the output and link there.

In an earlier discussion of problems related to cyclic imports, I wrote:

As for a long term solution, I'm starting to think that reparenting is a bad idea: it means we lose information about how the code is structured. Instead of altering the code model, we could create a layer on top of it that contains the documentation structure. That would also allow us to thin down the interface of the model classes a bit, by moving presentation data out of the code model.

Linking annotations containing imported names would then work as follows:

  • look up the original code object in the same way Python does (trace it back recursively to source location)
  • look up where the code object ended up in the documentation layer (might not match source location if name was re-exported)
  • link to the documentation location
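The steps above could be sketched roughly as follows. This is an illustrative sketch, not pydoctor code: the CodeObject class, the public_path attribute, and the resolve_link helper are all hypothetical names.

```python
from typing import Dict, Optional

class CodeObject:
    """A node in a code model that reflects the actual source layout."""
    def __init__(self, source_path: str, public_path: Optional[str] = None):
        self.source_path = source_path
        # Where the object is presented in the docs; defaults to the
        # source location when the name was never re-exported.
        self.public_path = public_path or source_path

def resolve_link(code_model: Dict[str, CodeObject], name: str) -> Optional[str]:
    # Step 1: look up the object by its source location, the way Python would.
    obj = code_model.get(name)
    if obj is None:
        return None
    # Step 2: ask the object where it ended up in the documentation layer.
    return obj.public_path

model = {
    "twisted.logger._flatten.extractField": CodeObject(
        "twisted.logger._flatten.extractField",
        public_path="twisted.logger.extractField",
    ),
}
```

With this split, the lookup always succeeds against the true source structure, while the emitted link points at the public documentation page.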

@adiroiban replied:

+1 yes. I think that we should always have a layer with the true structure, and with that we can have a public API layer which does its best to remove duplicates and present the doc links using "canonical" paths.

I think that's a good way to look at it: objects have an implementation location, being the source file they're defined in, and a public API location, which is where the user will read about them in the documentation and import them from.

There are more issues related to reparenting, for example #184, in which the same name is re-exported in more than one location. So the mapping between implementation layer and public API layer might not be 1:1.
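The non-1:1 mapping can be pictured as a small table from implementation paths to public paths. This is a toy sketch with a hypothetical package layout, not pydoctor code; the tie-breaking rule shown (prefer the most top-level name) is just one possible policy.

```python
# One implementation location may map to several public re-export locations,
# as in issue #184.
impl_to_public = {
    "pkg._impl.Widget": ["pkg.Widget", "pkg.compat.Widget"],  # hypothetical
}

def canonical_location(impl_path):
    """Pick one 'canonical' public path; here, simply the most top-level one
    (fewest dots), falling back to the implementation path itself."""
    publics = impl_to_public.get(impl_path, [impl_path])
    return min(publics, key=lambda p: (p.count("."), p))
```

Whatever policy is chosen, the key point is that the public API layer must tolerate one-to-many mappings instead of assuming a single public home per object.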

@tristanlatr
Contributor

Keeping a layer with the "true structure" will also allow us to display classes in the module summary where they are actually defined, while linking to where they are exported. It can be confusing to see an empty module because all its classes are re-exported.

@mthuurne
Contributor Author

mthuurne commented Nov 30, 2020

Keeping a layer with the "true structure" will also allow us to display classes in the module summary where they are actually defined, while linking to where they are exported. It can be confusing to see an empty module because all its classes are re-exported.

Agreed.

I've had several cases where I was looking at a class or function in Twisted and wondered "where is this published?" Although it's usually not too hard to find out with the right grep pattern, being able to look it up on a private module's documentation page would be a lot easier.

@tristanlatr
Contributor

tristanlatr commented Aug 21, 2021

I've thought about this a bit, and I think we could maintain two different models concurrently.

One model would strictly represent the Python sources and would include all Python-language processing (expandName, etc.). This model does not have to be based on the current model.Documentable; actually, I think we should adopt docspec as the most "raw" object tree to start with. We should add some post-processing on the docspec objects to attach extra information like zope.interface or attrs support; this is what I'm starting with pydocspec.

Then, the current model could be simplified (deferring to the underlying objects) and focused on the public API location of the objects. The reparenting process itself could be kept, but it should be enhanced to understand multiple re-exports, so that it fixes #184.

A new attribute could be added, Documentable.spec: docspec.ApiObject, and the reverse link from the pydocspec module should be added as well, so that we can resolve names in a model that represents the true structure and then return the URL of the documentable in the "API documentation" model.

The workflow for constructing the model should be changed from:

AST - (ASTBuilder) -> model.Module

To:

AST - (ASTBuilder) -> pydocspec.Module - (DocspecBuilder) -> model.Module
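The two-stage pipeline above could be sketched like this. The builder functions and dict-based trees are stand-ins for illustration; the real pydocspec and pydoctor APIs differ.

```python
import ast

def build_spec_module(source: str, name: str) -> dict:
    """Stage 1: AST -> a 'spec' tree mirroring the true source structure."""
    tree = ast.parse(source)
    members = [node.name for node in tree.body
               if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef,
                                    ast.ClassDef))]
    return {"name": name, "members": members}

def build_doc_module(spec: dict) -> dict:
    """Stage 2: spec tree -> documentation model, keeping a back-link
    (the proposed Documentable.spec) to the underlying spec object."""
    return {"name": spec["name"], "spec": spec, "members": list(spec["members"])}

spec = build_spec_module("def extractField(fieldName, event): pass",
                         "twisted.logger._flatten")
doc = build_doc_module(spec)
```

The essential point is the back-link: the documentation model can always ask its spec object about the true source structure, even after reparenting has rearranged the documented tree.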

Tell me what you think,

Thanks

@mthuurne @adiroiban

@tristanlatr
Contributor

tristanlatr commented May 23, 2022

Forget about pydocspec; I think I'll try to go with something like I described here: #430 (comment)

I'll try to make it as simple as possible, but first we should settle on a rationale for re-exporting.

Here's what I suggest:

  • reparenting should not influence name resolution (expandName() and resolveName()) in the context of the reparented object.
    The ExportedName objects should act like aliases to the reparented object.
  • the reparenting process should:
    • move the Documentable object to the exported location and
      leave an ExportedName object in place at the implementation location (these objects will act like placeholders that redirect to the re-exported location).
      This needs to be a recursive operation, since the reparenting itself is recursive.
  • when there are two or more re-exports, the most top-level-looking name is chosen as the main object, and the others become aliases (let's call them ExportedNameAlias).
  • when a re-exported name is re-exported again, the new re-export should point to the implementation object directly, rather than to the first re-exported name.
  • the docstring linker should use the implementation location as the Documentable context, but further tweaks are probably needed to generate optimized URLs when possible (for instance when we are reparenting a module or a class) and full URLs when required.
  • re-exported names that are not part of the current system should create a (documented) alias.

So all in all, we need to create proper Documentable entries for aliases, and the linker needs a few improvements before this issue can be fixed.
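A minimal sketch of the ExportedName placeholder idea might look like this. The class names and the expand_name helper follow the proposal above but are assumptions, not actual pydoctor code.

```python
class Documentable:
    """Stand-in for pydoctor's model.Documentable."""
    def __init__(self, full_name):
        self.full_name = full_name

class ExportedName:
    """Placeholder left at the implementation location after reparenting;
    redirects name resolution to the re-exported object."""
    def __init__(self, target):
        self.target = target
    def resolve(self):
        return self.target

def expand_name(system, name):
    """Resolve a dotted name, following ExportedName placeholders."""
    found = system.get(name)
    if isinstance(found, ExportedName):
        return found.resolve()
    return found

# After reparenting: the real object lives at its public location...
obj = Documentable("twisted.logger.extractField")
system = {
    "twisted.logger.extractField": obj,
    # ...and a placeholder remains at the implementation location.
    "twisted.logger._flatten.extractField": ExportedName(obj),
}
```

Both the implementation path and the public path then resolve to the same object, which is exactly what the annotation lookup in this issue needs.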

@tristanlatr
Contributor

Here's what I suggest [...]

To be clear: when re-exporting a class and all its members to a new module, we need both the original class context and its parent to resolve names, so ExportedName objects must recurse over members just like the reparenting does. This will create a lot of nested ExportedName instances, so we should not try to document them all, only the top-level one, while keeping the original full structure. I think that's possible by making the nested ExportedName objects hidden.

Such an approach may sound rather complex, but it's probably the best way available with our current model.

The other path to look into (as @mthuurne said) is to resolve the annotation and docstring links at a time when no reparenting has taken place yet. But there is another issue with that: the linker currently generates relative URLs when it can, to avoid reloading the same page. If a function links to another function in the same module, the link will be relative; but if that function gets re-exported, the link will then be broken. So we need the re-export information to compute the right relative links.
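The relative-URL problem can be shown with a toy linker. This is not the actual pydoctor linker; the link function and page names are illustrative only.

```python
def link(source_page, target_page, fragment):
    """Emit a same-page fragment link when possible, else a full page URL."""
    if source_page == target_page:
        return "#" + fragment
    return target_page + ".html#" + fragment

# Before re-export: source and target both render on the _flatten page,
# so the linker emits a bare fragment.
before = link("twisted.logger._flatten", "twisted.logger._flatten", "extractField")

# After re-export, the target renders on twisted.logger's page instead;
# the old fragment-only link would dangle, so the link must be recomputed
# using the re-export information.
after = link("twisted.logger._flatten", "twisted.logger", "extractField")
```

This is why links cannot simply be resolved before reparenting: whether a relative link is valid depends on where the target ends up being rendered.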

(I have started a draft for improving the linker in the better-docstring-linker-context-inferface branch; this will allow better control over relative URLs.)

I'm starting to think that reparenting is a bad idea: it means we lose information about how the code is structured. Instead of altering the code model, we could create layer on top of it that contains the documentation structure. That would also allow us to thin down the interface of the model classes a bit, by moving presentation data out of the code model.

To address these concerns, the proposed approach will keep the original true structure of the code, but will move around what's going to be the actual documented object and replace it. But yes, it would be good to have the distinction between the model and the view-model; I just don't see a clear path towards such a design yet.

Tell me what you think,

@tristanlatr
Contributor

Disregard my last three comments.
I'm now pretty confident I know how to solve this issue via the path of least resistance.

See #723

This was referenced Sep 30, 2023