Poxy runs into "File name too long" OSError #21

Open
1 task done
tim-janik opened this issue Feb 28, 2023 · 13 comments
Assignees
Labels
bug Something isn't working

Comments

@tim-janik

Environment

version and/or commit hash:

Poxy via pip install on Ubuntu 22.04:

$ poxy --version
0.12.3

Describe the bug

I just tried to get a basic config going to generate docs from:

https://github.com/tim-janik/anklang/tree/trunk/ase

But that soon runs into (warnings omitted):

$ poxy
Poxy v0.12.3
Reading /opt/src/anklang/poxy.toml
Generating XML files with Doxygen 1.9.1
Post-processing XML files

*************

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/poxy/main.py", line 339, in main
    run(
  File "/usr/local/lib/python3.10/dist-packages/poxy/run.py", line 1621, in run
    postprocess_xml(context)
  File "/usr/local/lib/python3.10/dist-packages/poxy/run.py", line 726, in postprocess_xml
    if f.exists():
  File "/usr/lib/python3.10/pathlib.py", line 1290, in exists
    self.stat()
  File "/usr/lib/python3.10/pathlib.py", line 1097, in stat
    return self._accessor.stat(self, follow_symlinks=follow_symlinks)
OSError: [Errno 36] File name too long: '/tmp/poxy/opt_src_anklang/xml/namespace_ase_1_1_writ_converter_3_01_t_00_01_r_e_q_u_i_r_e_sv_3_01_9std_1_1is__base__of_3_01_serializable_00_01_t_01_4_1_1value_01_6_6_9_jsonipc_1_1_derives_vector_3_01_t_01_4_1_1value_01_6_6_9_jsonipc_1_1_derives_shared_ptr_3_01_t_01_4_1_1value_01_6_6std_1_1is__class_3_01_t_01_4.xml'

*************

You appear to have triggered an internal bug!
Please re-run poxy with --bug-report and file an issue at github.com/marzer/poxy/issues
Many thanks!

*************

Additional information

The poxy.toml file is basically copied from the toml example.

--bug-report

  • I have attached the zip file generated when using the --bug-report option

poxy_bug_report.zip

@tim-janik tim-janik added the bug Something isn't working label Feb 28, 2023
@marzer
Owner

marzer commented Mar 1, 2023

Oh, wow, what an enormous file name! That's an xml file generated by doxygen, not by poxy specifically, so I'm actually not sure how to avoid that, short of simplifying/refactoring the source C++ to make the doxygen symbol path shorter.

Can you show me some of the C++ code that is generating this? Specifically it looks like you're using a macro REQUIRES to do some SFINAE; any snippet with that would be helpful. Never mind, I did some digging myself, see below.

@marzer
Owner

marzer commented Mar 1, 2023

(cc @mosra - get a load of this behemoth filename Doxygen generated 😅)

@marzer
Owner

marzer commented Mar 1, 2023

Ok so I did some digging into your serialize.hh and found the culprit:

https://github.com/tim-janik/anklang/blob/3e64716fc0bc6dbb444f8bef12983b05f31af4d4/ase/serialize.hh#L454-L459

Since these are template specializations there's no way to make their Doxygen symbols shorter without also losing information (as the specialization's definition is its identity) - you can't do any clever preprocessor tricks to simplify it without also losing unique specializations.

That being said, since practically all template specializations are used simply as a form of static polymorphism and don't actually change interfaces, you can typically avoid emitting documentation for them entirely. My suggestion here would be to tell doxygen to ignore them entirely using a @cond/@endcond block, e.g.:

/// @cond

template<typename T>
struct WritConverter<T, /* specialization 1 */> { /* ... */};

template<typename T>
struct WritConverter<T, /* specialization 2 */> { /* ... */};

/// @endcond

@tim-janik
Author

Oh, wow, what an enormous file name! That's an xml file generated by doxygen, not by poxy specifically, so I'm actually not sure how to avoid that, short of simplifying/refactoring the source C++ to make the doxygen symbol path shorter.

Can you show me some of the C++ code that is generating this? Specifically it looks like you're using a macro REQUIRES to do some SFINAE; any snippet with that would be helpful.

It is the REQUIRESv<> macro, e.g. here: https://github.com/tim-janik/anklang/blob/trunk/ase/serialize.hh#L387
I can hack around that atm with:

sed 's/REQUIRESv<[^>]*>/REQUIRES<true>/' -i ase/serialize.hh

Which actually allows me to generate documentation for the ase/ dir with poxy.

PS: I just fail to get any file listed in the /files.html index, even adding @file doesn't help. Short of extract_all=true, but that includes anon namespaces even though poxy sets EXTRACT_ANON_NSPACES = NO. Do you need a separate bug report for that or am I missing some config option?

@tim-janik
Author

Ok so I did some digging into your serialize.hh and found the culprit:

https://github.com/tim-janik/anklang/blob/3e64716fc0bc6dbb444f8bef12983b05f31af4d4/ase/serialize.hh#L454-L459

Since these are template specializations there's no way to make their Doxygen symbols shorter without also losing information (as the specialization's definition is its identity) - you can't do any clever preprocessor tricks to simplify it without also losing unique specializations.

Yeah, is there a way to turn that into a Doxygen bug instead?
E.g. is poxy using a config that triggers Doxygen into generating too long file names?

@marzer
Owner

marzer commented Mar 1, 2023

It is the REQUIRESv<> macro

Looks like we replied at much the same time. See my suggestion above.

PS: I just fail to get any file listed in the /files.html index, even adding @file doesn't help.

That's an m.css feature, see here. In short, entities need to have some documentation (a brief, detail, a remark, function args, something) in order to appear in the generated HTML. In the case of files, merely using @file is insufficient, since that just tells doxygen "the following commands relate to this file", and isn't itself a piece of documentation - you'd need a @brief or similar, e.g.:

/// @file
/// @brief This is where the serializer lives.

E.g. is poxy using a config that triggers Doxygen into generating too long file names?

Nothing specific - I'm not (to my knowledge) overriding any options that would impact filename length. My guess is that Python's filename handling in pathlib is stricter on file name length than it needs to be, hence why Doxygen is OK but poxy falls down here. That's a guess, though. I haven't encountered this before. It's also reasonable to suggest that it is a bug in Doxygen for them to be generating filenames that long instead of using a hash or something because holy crap, but that doesn't help here 😅
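For reference, errno 36 on Linux is ENAMETOOLONG, which the operating system reports when a single path component exceeds the filesystem limit (commonly 255 bytes). The file name in the traceback appears to be longer than that, so the failure may come from the underlying `stat()` call rather than from pathlib itself. A minimal sketch of a defensive existence check follows; `safe_exists` is a hypothetical helper for illustration, not part of poxy:

```python
import errno
from pathlib import Path


def safe_exists(path: Path) -> bool:
    """Return True if `path` exists; treat an over-long name as 'does not exist'.

    Hypothetical helper for illustration only; not taken from poxy's source.
    """
    try:
        return path.exists()
    except OSError as exc:
        # ENAMETOOLONG is errno 36 on Linux, matching the traceback above.
        if exc.errno == errno.ENAMETOOLONG:
            return False
        # Any other OS-level failure is still unexpected, so re-raise it.
        raise
```

This only keeps the existence check from crashing. It does not make the over-long file usable, so the affected symbol would still need separate handling.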

I can do some long symbol renaming in the XML preprocess step but it's a nontrivial fix so it will have to wait until this weekend at the earliest. Is the @cond/@endcond workaround a reasonable one in the short-term?

@marzer
Owner

marzer commented Mar 1, 2023

OH! I've just realized that doxygen has the SHORT_NAMES option, which might actually already be a nice built-in fix for this. I'll do some testing with it at some point over the next few days.
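As a rough illustration of trying SHORT_NAMES by hand, the sketch below rewrites a Doxyfile's text so that a given option is set. `set_doxyfile_option` is a hypothetical helper; whether poxy exposes its generated Doxyfile for editing like this is an assumption, and the sketch only handles simple single-line `KEY = value` assignments (not `+=` or backslash-continued values).

```python
import re


def set_doxyfile_option(doxyfile_text: str, key: str, value: str) -> str:
    """Set `key = value` in Doxyfile text, replacing an existing assignment.

    Hypothetical helper: handles only single-line `KEY = value` entries.
    """
    pattern = re.compile(rf"^[ \t]*{re.escape(key)}[ \t]*=.*$", re.MULTILINE)
    line = f"{key} = {value}"
    if pattern.search(doxyfile_text):
        # Use a function as the replacement so backslashes in `value` stay literal.
        return pattern.sub(lambda _match: line, doxyfile_text, count=1)
    # Option not present yet: append it, keeping exactly one trailing newline.
    separator = "" if (not doxyfile_text or doxyfile_text.endswith("\n")) else "\n"
    return f"{doxyfile_text}{separator}{line}\n"
```

For example, `set_doxyfile_option(text, "SHORT_NAMES", "YES")` asks Doxygen for shorter, less readable output file names.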

@wroyca
Contributor

wroyca commented Mar 1, 2023

It's also reasonable to suggest that it is a bug in Doxygen for them to be generating filenames that long instead of using a hash or something because holy crap, but that doesn't help here

I can do some long symbol renaming in the XML preprocess step but it's a nontrivial fix

😅

I wonder if we can actually implement something like hashing on our end here--I'm unaware of doxygen internals specifically, but surely we can intercept in the middle?
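A minimal sketch of what such hashing could look like, assuming ASCII names (as in the traceback) and a 255-character per-component limit. `shorten_name` and `MAX_NAME_LEN` are hypothetical names for illustration; this is not how poxy or Doxygen derive file names.

```python
import hashlib

MAX_NAME_LEN = 255  # assumed per-component limit, common on Linux filesystems


def shorten_name(name: str, max_len: int = MAX_NAME_LEN) -> str:
    """Return `name` unchanged if it fits, else a readable prefix plus a hash.

    Assumes ASCII names, so character count equals byte count.
    """
    if len(name) <= max_len:
        return name
    stem, dot, ext = name.rpartition(".")
    if not dot:
        # No extension: rpartition put the whole name in `ext`, so swap it back.
        stem, ext = name, ""
    # Hash the full original name so distinct long names stay distinct.
    digest = hashlib.sha1(name.encode("utf-8")).hexdigest()[:16]
    suffix = f"_{digest}{dot}{ext}"
    keep = max(0, max_len - len(suffix))
    return stem[:keep] + suffix
```

Because Doxygen uses the IDs as file names, any reference to the original ID inside the XML would need the same mapping applied for the output to stay consistent.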

OH! I've just realized that doxygen has the SHORT_NAMES option, which might actually already be a nice built-in fix for this. I'll do some testing with it at some point over the next few days.

This is pretty neat! I'll have to remember this for future uses

@tim-janik
Author

PS: I just fail to get any file listed in the /files.html index, even adding @file doesn't help.

That's an m.css feature, see here. In short, entities need to have some documentation (a brief, detail, a remark, function args, something) in order to appear in the generated HTML. In the case of files, merely using @file is insufficient, since that just tells doxygen "the following commands relate to this file", and isn't itself a piece of documentation - you'd need a @brief or similar, e.g.:

/// @file
/// @brief This is where the serializer lives.

Thanks for that. But even pasting your snippet literally into my header file still yields an empty files.html index. Can you please point me to any working example of poxy generating a non-empty files.html ?

I can do some long symbol renaming in the XML preprocess step but it's a nontrivial fix so it will have to wait until this weekend at the earliest. Is the @cond/@endcond workaround a reasonable one in the short-term?

Thanks, there is really no need to rush anything for me, I'm just evaluating poxy atm, not using it in production or so ;-)
Also, it is really just one template case, the one in: https://github.com/tim-janik/anklang/blob/trunk/ase/serialize.hh#L455
I'll probably just comment it out for Doxygen based generations, as you said that won't affect the documentation much.

@marzer
Owner

marzer commented Mar 1, 2023

Can you please point me to any working example of poxy generating a non-empty files.html ?

Here you go:

@marzer
Owner

marzer commented Mar 1, 2023

Depending on how your project is structured you may also need a documentation file in the containing directory for the child files to appear, e.g. https://github.com/marzer/muu/blob/master/include/muu/folder.dox

(same goes for any other folder that forms part of the documented hierarchy)

@marzer
Owner

marzer commented Mar 1, 2023

@wroyca

I'm unaware of doxygen internals specifically, but surely we can intercept in the middle?

Yup, it's possible. The 'XML v2' stuff I was working on did this to stabilise some IDs and it worked OK. The main issue here would be Python falling down with long filenames; since Doxygen uses the IDs as filenames, I'd need to be able to interact with the filesystem to rename them, and it appears that even forming a path to do that will cause the OSError. There's probably a workaround (e.g. invoking the shell), but it looks like the SHORT_NAMES option might already have me covered 😄

@mosra

mosra commented Mar 1, 2023

FYI, for excessively long names there's another option, CASE_SENSE_NAMES, which is NO on OSes with case-insensitive default filesystems (Windows, macOS) and causes the crazy long underscored names to appear. Unless your codebase contains classes, namespaces or files that differ only by case (which is rather unlikely I think), it should be safe to set to YES. Compared to SHORT_NAMES you'll get the URLs still somewhat readable with that option.

sed 's/REQUIRESv<[^>]*>/REQUIRES<true>/'

What I often do for such constructs is listing them in the PREDEFINED option and defining them to empty. Ah, alright, it's not really a macro due to the <>. Then this is a useless suggestion.

I can do some long symbol renaming in the XML preprocess step but it's a nontrivial fix

I want to look into that for m.css itself eventually, because I suspect it'll need a lot more than just XML preprocess. And I could possibly implement INLINE_SIMPLE_STRUCTS and other useful output reorganization features alongside that.

4 participants