make latex able to rebuild only what is needed, just as make html? #13181

Open
dbitouze opened this issue Dec 16, 2024 · 9 comments

Comments

@dbitouze

Is your feature request related to a problem? Please describe.
I'm always frustrated when I have to recompile the whole PDF of my project, even though only one of its many source files has changed, whereas, for HTML output, only the corresponding HTML file is rebuilt.

Describe the solution you'd like
It would be nice if make latex, like make html, could rebuild only what is needed. For this, make latex would have to mimic the HTML builder, which seems to require producing in the build directory as many .tex files as there are source files. Hopefully it would then be possible for one extra "main" file to \input all the small "unit files". Is this possible?

@dbitouze dbitouze added the type:proposal a feature suggestion label Dec 16, 2024
@jfbu jfbu added this to the some future version milestone Dec 16, 2024
@jfbu
Contributor

jfbu commented Dec 16, 2024

This is an interesting idea which, however, will speed up only the make latex phase. The PDF build itself by pdflatex, lualatex, xelatex or uplatex will still redo the entirety of its job. One could imagine that the final PDF would be constructed by "joining" all the small PDFs, in which case, indeed, only one "small" PDF is rebuilt when one source file is modified. But I am personally unaware of any open source tool which does this and correctly handles cross-references and pagination: if one small PDF grows from 3 to 4 pages, clearly that requires modifying all page numbering in the other small PDFs and the page targets of cross-references.

I think a similar conversation was already started on this site, but I don't have a link at this time.

Anyway, thanks for the proposal.

@dbitouze
Author

@jfbu "Joining" all the small PDFs will make each of them start on a new page, at the top of the page, which isn't what is expected, is it?

IMHO, obtaining only a small PDF would be just for checking the layout of the modifications just done in a single source file.

@electric-coder

electric-coder commented Dec 17, 2024

able to rebuild only what is needed

The way HTML is organized, the build is split across several linked files (e.g. rebuilding a single page assumes no significant changes happened to the index). PDF is different in that it's compiled: to rebuild one part you have to rebuild the whole thing, compile it and finally compress it. (Not to mention any LaTeX magic that may also impose requirements on the final document that HTML does not.)

only the corresponding HTML file is rebuilt

With HTML output the build process isn't as simple as it may seem: if you change CSS, JS, or an included SVG, the build won't detect changes to those files, so any such change also requires rebuilding the whole project.

It's tedious, I know..!

@dbitouze
Author

@electric-coder I understand that the “normal” PDF output requires rebuilding the whole thing: currently make latex creates a single, monolithic (and possibly huge) .tex file that gathers the TeX counterparts of all the (.rst or .md) source files; so, when you want to have a look at the resulting PDF, you have only this single, monolithic file to compile.

My point here is: it would be nice for make latex to create:

  • a parent (main) .tex file,
  • for each (.rst or .md) source file, a child .tex (sub)file which would be \input into the main file.

Hence, if what is to be rebuilt is:

  • the whole PDF, just compile the parent file,
  • the (“small”) PDF(s) for the child .tex (sub)file(s) whose (.rst or .md) source file(s) have been modified, compile (say) a copy of the parent file which \inputs only these child .tex (sub)file(s) rather than all of them.
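
To make the parent/child layout above concrete, here is a minimal Python sketch of how the two parent files could be generated. It is only an illustration under stated assumptions: the names parent_tex, write_parents, main.tex and partial.tex are hypothetical, the preamble is a placeholder (a real Sphinx preamble is much longer), and nothing here reflects how the Sphinx LaTeX builder actually works.

```python
from pathlib import Path

# Placeholder preamble and postamble; a real Sphinx build writes a much longer preamble.
PREAMBLE = "\\documentclass{report}\n\\begin{document}\n"
POSTAMBLE = "\\end{document}\n"


def parent_tex(children):
    """Return the text of a parent file that uses \\input for each child, in order."""
    body = "".join(f"\\input{{{Path(c).as_posix()}}}\n" for c in children)
    return PREAMBLE + body + POSTAMBLE


def write_parents(build_dir, all_children, modified):
    """Write main.tex (all children) and partial.tex (modified children only)."""
    build_dir = Path(build_dir)
    build_dir.mkdir(parents=True, exist_ok=True)
    # Keep the document order in the partial file as well.
    wanted = set(modified)
    partial = [c for c in all_children if c in wanted]
    (build_dir / "main.tex").write_text(parent_tex(all_children), encoding="utf-8")
    (build_dir / "partial.tex").write_text(parent_tex(partial), encoding="utf-8")
    return partial
```

Compiling partial.tex would then give the "small" PDF, while main.tex would give the whole document; cross-references to targets outside the selected children would be unresolved in the partial build.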

@electric-coder

electric-coder commented Dec 17, 2024

@dbitouze LaTeX isn't my forte but I don't think that having "individual PDFs" would speed up the build of the final monolith due to requirements on internal consistency that HTML does not impose but the other formats do. So, if there's an omission in the index both PDF and LaTeX are likely to throw errors whereas a split HTML won't.

a copy of the parent file which \input

Your argument is somewhat conceptual but, as @jfbu hinted, that's just not how the other document's build process works, because there are requirements on internal consistency. It seems LaTeX itself doesn't provide the necessary underlying tooling: see "Does LaTeX have to reprocess included files that haven't changed?"

@jfbu
Contributor

jfbu commented Dec 17, 2024

@dbitouze

@jfbu "Joining" all the small PDFs will make each of them start on a new page, at the top of the page, which isn't what is expected, is it?

Indeed, of course.

IMHO, obtaining only a small PDF would be just for checking the layout of the modifications just done in a single source file.

I see. I understand the idea. I see one major difficulty regarding cross-references when working with \input. Like @electric-coder, LaTeX is not so much my forte (I am more at ease with Plain TeX), but I am vaguely aware of \include/\includeonly, which could help with the cross-references, although I have not practiced that for a long time. I also remember that \include does a \clearpage, which causes a problem if we are to structure the main .tex file as a bunch of \include of the "atomic" ones.

If we were to produce "atomic" ones via Sphinx we would also have to revisit the whole way cross references, especially footnotes, are resolved by the Sphinx LaTeX builder, which has been worked out with a "single document" in mind.

Sphinx does produce mark-up in the output called \sphinxstepscope which, as far as I recall, matches the original source files of the compiled project. It looks possible to give it a hacky LaTeX definition: if the latexmk build is triggered in a special way, such as \def\sourcefiletokeep{here the normalized name of the source file} followed by some \input of the big tex file, then \sphinxstepscope, if enriched into \sphinxstepscope{the normalized name of the source file this is starting}, could act as a toggle between "ignore all material from now on" and "keep all material from now on". But cross-references will cause issues on build.
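
The toggle idea can be illustrated outside TeX with a small Python filter over the body of the big .tex file. This is only a sketch of the logic, not a proposed implementation: it assumes the enriched form \sphinxstepscope{name} described above, which is hypothetical, and the function name keep_only is invented for the example.

```python
import re

# Hypothetical enriched marker: \sphinxstepscope{<normalized source name>}.
# This argument form is an assumption taken from the idea above, not existing Sphinx output.
MARKER = re.compile(r"\\sphinxstepscope\{([^}]*)\}")


def keep_only(body, source_to_keep):
    """Return the parts of `body` that follow a marker naming `source_to_keep`.

    Material before the first marker is dropped. Each marker switches between
    "keep all material from now on" and "ignore all material from now on",
    depending on the name it carries.
    """
    kept = []
    keeping = False
    pos = 0
    for m in MARKER.finditer(body):
        if keeping:
            kept.append(body[pos:m.start()])
        keeping = (m.group(1) == source_to_keep)
        pos = m.end()
    if keeping:
        kept.append(body[pos:])
    return "".join(kept)
```

A TeX-level definition would have to achieve the same effect at compile time, and it would still leave references to the skipped material unresolved.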

I think your problem is of realistic concern only for big projects which have a significant build time for the PDF build phase, and for most of them, checking the HTML build is a better alternative; by that I mean few people are really interested in checking the impact of a local change on the PDF build. I do think the issue is interesting to discuss, but since cross-references point not solely to other "web pages/documents" but have to know the final page number in the final big PDF, this whole topic looks like a rather difficult one, for a (as far as I can judge, but I may be wrong) somewhat "niche" (I hope this works in English) purpose.

@dbitouze
Author

@electric-coder

LaTeX isn't my forte but I don't think that having "individual PDFs" would speed up the build of the final monolith [...]

But that's not what I'm after :) My aim is not to speed up the construction of the final monolith, just to be able to construct a partial PDF containing only the few pages corresponding to the modified source file(s).

For example, our LaTeX FAQ is made up of almost 1400 source files (.md) and the complete PDF is over 2050 pages long. When a contributor modifies something in one of the 1400 source files (.md), for example the one corresponding to this short FAQ, they would be interested in checking the PDF output of the modified file, which would be quick to build because it contains only (or almost only) the one or two corresponding pages.

@dbitouze
Author

@jfbu

I see one major difficulty regarding cross-references when working with \input. Like @electric-coder, LaTeX is not so much my forte (I am more at ease with Plain TeX), but I am vaguely aware of \include/\includeonly, which could help with the cross-references, although I have not practiced that for a long time. I also remember that \include does a \clearpage, which causes a problem if we are to structure the main .tex file as a bunch of \include of the "atomic" ones.

You're right about \input, which would miss the cross-references if their targets are in files not “\inputed” in order to speed up the compilation of the partial PDF. And, indeed, \include/\includeonly is better in this respect, as long as the whole PDF has been built (and stabilized) at least once. But:

  • missing the cross-references (replaced by “??” by LaTeX) is most of the time acceptable if the purpose is just to check the PDF output,
  • indeed, \include does a \clearpage but it could be redefined into \input for the whole PDF.
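
As a rough sketch of the \include/\includeonly variant, the main file could be generated along these lines. The function name main_with_include and the placeholder document class are assumptions made for illustration; unit names are given without the .tex extension, as is conventional for \include.

```python
def main_with_include(units, only=None):
    """Build a main file using \\include, optionally restricted by \\includeonly.

    `units` are child file names without the .tex extension.
    `only`, if given, lists the units to actually typeset.
    """
    lines = ["\\documentclass{report}"]  # placeholder preamble
    if only:
        # \includeonly goes in the preamble and takes a comma-separated list.
        lines.append("\\includeonly{" + ",".join(only) + "}")
    lines.append("\\begin{document}")
    lines.extend(f"\\include{{{u}}}" for u in units)
    lines.append("\\end{document}")
    return "\n".join(lines) + "\n"
```

With \includeonly, LaTeX still reads the .aux files of the skipped units, so page numbers and cross-references from the last full build remain available, which is why the whole PDF has to have been built at least once.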

@electric-coder

electric-coder commented Dec 18, 2024

just to be able to construct a partial PDF containing only the few pages corresponding to the modified source file(s).

Let's split this sentence:

  1. just to be able to construct a partial PDF containing only the few pages

The two common alternatives:

A. Construct a second /doc directory to build a separate documentation containing links to only the sources you want to build, that is: a second main index toctree changed manually to only include the files that changed in your case. (One project using this kind of layout is the Pyramid docs, their motivation being to have one "organic" documentation and another inner API doc with a linear package/module layout for machine consumption.) You'll be running into this difficulty: "Can sphinx link to documents that are not located in directories below the root document?"

B. Same as the above but keeping a "development" version of your docs separately from the "production" version where you selectively wrap whatever is causing delays (the main toctree) in an ifconfig controlled via a conf.py option.

  2. the few pages corresponding to the modified source file(s)

Here Sphinx itself keeps track of changed files (the manual solutions above being obviously inconvenient). It's a lot easier to just use the alternative proposed by @jfbu:

checking the HTML build is a better alternative


Describe the solution you'd like

It would be nice to have make latex as make html able to rebuild only what is needed.

So a development convenience tool (as a builder CLI flag) to build just the latest changed sources as individual PDFs, ignoring any internal inconsistencies arising from incompleteness (a kind of --keep-going)... There's definitely a valid use case here. There's also a promising change in #12882 to speed up builds by an order of magnitude, which might make this kind of issue with large documents a non-problem.

3 participants