Greatly speed up doctests by compiling compatible doctests in one file #123974

GuillaumeGomez · 2024-04-15T17:34:38Z

I split the changes as much as possible to make it as easy as possible to review (even though it's a huge lot of code changes...).

The following tests are not included into the combined doctests:

compile_fail
#![no_std]
have invalid AST
test_harness
no capture
--show-output (because the output is nicer without the extra code surrounding it)

Everything else is merged into one file. If this one file fails to compile, we go back to the current strategy: compile each doctest separately.

Because of the edition codeblock attribute, I couldn't make them all into one file directly so I grouped them by edition, but it should be pretty anecdotic.

Now the interesting part, I ran it on a few crates and here are the results (with --doc to only include doctests):

crate	nb doctests	before this PR	with this PR
sysinfo	227	4.6s	1.11s
geos	157	3.95s	0.45s
core	4604	54.08s	23.20s
std	1147	13.88s	8.82s

r? @notriddle

GuillaumeGomez · 2024-04-16T18:19:36Z

Seems like where there are enough tests, libtest simply blocks and does nothing. I'll split tests by chunks of ~200 tests at once to start and then check how to improve output.

rustbot · 2024-04-17T14:04:06Z

Some changes occurred in run-make tests.

cc @jieyouxu

petrochenkov · 2024-04-17T14:45:47Z

There should probably be some opt out from this merging.

There are numerous reasons why two pieces of code may cause issues when combined together (they should compete for some common resource, in general, and that is not always detectable by a compiler).
This was previously attempted with run-pass tests from the compiler test suite, but it ended up being too brittle.

notriddle · 2024-04-17T15:01:55Z

How does this affect env_logger?

GuillaumeGomez · 2024-04-17T16:26:48Z

It fails in multiple tests:

running 28 tests
test src/fmt/humantime.rs - fmt::humantime::Formatter::timestamp (line 17) ... ok
test src/fmt/mod.rs - fmt (line 21) ... ok
test src/fmt/mod.rs - fmt (line 42) ... ok
test src/fmt/mod.rs - fmt::Formatter (line 121) ... ok
test src/lib.rs - (line 175) ... ok
test src/lib.rs - (line 234) ... ok
test src/logger.rs - logger::Builder::filter (line 382) ... ok
test src/logger.rs - logger::Builder::filter_level (line 360) ... ok
test src/logger.rs - logger::Builder::format (line 231) ... ok
test src/logger.rs - logger::Builder::filter_module (line 341) ... ok
test src/logger.rs - logger::Builder::write_style (line 436) ... ok
test src/logger.rs - logger::Logger::from_default_env (line 600) ... ok
test src/logger.rs - logger::Builder::target (line 415) ... ok
test src/logger.rs - logger::Logger::from_env (line 567) ... ok
test src/logger.rs - logger::Logger::from_env (line 576) ... ok
test src/lib.rs - (line 253) ... FAILED
test src/lib.rs - (line 214) ... FAILED
test src/lib.rs - (line 19) ... FAILED
test src/logger.rs - logger::Builder (line 22) ... FAILED
test src/logger.rs - logger::Builder::from_default_env (line 176) ... FAILED
test src/logger.rs - logger::Builder::from_env (line 96) ... FAILED
test src/logger.rs - logger::Builder::from_env (line 86) ... FAILED
test src/logger.rs - logger::Builder::parse_default_env (line 199) ... FAILED
test src/logger.rs - logger::Builder::new (line 56) ... FAILED
test src/logger.rs - logger::Builder::parse_env (line 123) ... FAILED
test src/logger.rs - logger::Builder::parse_env (line 138) ... FAILED
test src/logger.rs - logger::init_from_env (line 927) ... FAILED
test src/logger.rs - logger::try_init_from_env (line 890) ... FAILED

failures:

---- src/lib.rs - (line 253) stdout ----
thread 'src/lib.rs - (line 253)' panicked at src/logger.rs:499:14:
Builder::init should not be called after logger initialized: SetLoggerError(())
stack backtrace:
   0:     0x564cf5d0b200 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::h9cb9ba9de157d448
   1:     0x564cf5d3115b - core::fmt::write::h712ed17a4bea48d7
   2:     0x564cf5cf3ff9 - std::io::Write::write_fmt::ha9b4cf5497604558
   3:     0x564cf5d0afbe - std::sys_common::backtrace::print::h527405b0eb64d489
   4:     0x564cf5d0c09a - std::panicking::default_hook::{{closure}}::h7c8284f1aa6c5571
   5:     0x564cf5d0bdbf - std::panicking::default_hook::haf59e725ae5fe4cb
   6:     0x564cf5ca07fa - test::test_main::{{closure}}::h24de74550d40f595
   7:     0x564cf5d0c749 - std::panicking::rust_panic_with_hook::hc2212689298edd22
   8:     0x564cf5d0c4f6 - std::panicking::begin_panic_handler::{{closure}}::h2058e51712b5fdf6
   9:     0x564cf5d0b419 - std::sys_common::backtrace::__rust_end_short_backtrace::h1dc4a638a0e69753
  10:     0x564cf5d0c227 - rust_begin_unwind
  11:     0x564cf5b2dc16 - core::panicking::panic_fmt::h88caaa095c0a5b53
  12:     0x564cf5b2e586 - core::result::unwrap_failed::ha5ec7624a00a22e4
  13:     0x564cf5b391fe - core::result::Result<T,E>::expect::ha6de1769dcf31264
                               at /home/imperio/rust/rust/library/core/src/result.rs:1034:23
  14:     0x564cf5b3610d - env_logger::logger::Builder::init::h3eaa9af3496d3a3d
                               at /home/imperio/rust/env_logger/src/logger.rs:498:9
  15:     0x564cf5b31e74 - rust_out::__doctest_8::main::hc4539ff026f50669
  16:     0x564cf5b31ee3 - rust_out::__doctest_8::TEST::{{closure}}::hffdb7dfd4fa1d8d0
  17:     0x564cf5b30176 - core::ops::function::FnOnce::call_once::h74d127fff682a19d
  18:     0x564cf5ca5d92 - test::__rust_begin_short_backtrace::h377d4000c87befa2
  19:     0x564cf5cc18b2 - test::types::RunnableTest::run::h656e7414cfb3441c
  20:     0x564cf5ca6056 - test::run_test_in_process::hcb4c940c64146602
  21:     0x564cf5cae2bd - std::sys_common::backtrace::__rust_begin_short_backtrace::h6035574334cd54dd
  22:     0x564cf5cb89c5 - core::ops::function::FnOnce::call_once{{vtable.shim}}::hebcf07f42d5ca27a
  23:     0x564cf5d1757b - std::sys::pal::unix::thread::Thread::new::thread_start::h21bb0d8f9a9dc8fb
  24:     0x7fcbf7494ac3 - start_thread
                               at ./nptl/pthread_create.c:442:8
  25:     0x7fcbf7526850 - __GI___clone3
                               at ./misc/../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
  26:                0x0 - <unknown>

I implemented a fallback in case compilation fails. I could do the same for failing tests. Like that, we can support both current doctests with shared context and others who don't have have this shared context issue can enjoy having doctests running much faster. What do you think?

EDIT: Another solution would be to add a new codeblock attribute like "standalone" to say that a doctest should not be combined with others, but I'm not a big fan of this approach.

notriddle · 2024-04-17T17:24:58Z

I implemented a fallback in case compilation fails. I could do the same for failing tests.

That only helps with spurious failure. Spurious success is an even worse problem, if a test case accidentally relies on a global resource that another test set up. Any kind of merging has this problem, but it gets worse if the merging only happens sometimes.

This is also a problem with TEST_BATCH_SIZE, I think. If a test accidentally relies on a global resource set up by another test, then the tests might magically stop passing when 250 other tests get sandwiched between them.

GuillaumeGomez · 2024-04-17T17:57:11Z

The logic currently is as follows: we merge tests, if the compilation of the combined tests failed, we fallback to the current behaviour.

One thing I thought though: we could spawn a new process for each test. That would prevent them from using "in program" common resources.

This is also a problem with TEST_BATCH_SIZE, I think. If a test accidentally relies on a global resource set up by another test, then the tests might magically stop passing when 250 other tests get sandwiched between them.

Yes it's the current issue with env_logger, hence why I'm suggesting to either try running all tests in the batch independently or start a new process for each doctest.

notriddle · 2024-04-17T19:00:43Z

The troublesome possibility that I'm thinking of is:

You start by setting things up so that all your tests pass in a batch, because one test relies on a process-global resource that another one set up. This is what I called a "spurious passing test."
You change something so that compilation in a batch fails, and the system silently runs all your tests in separate processes instead.
The error you get (something like "unwrap() called on None") is very far away from the change that caused things to break. It's "spooky action at a distance," and we should do better than that.

GuillaumeGomez · 2024-04-17T19:54:59Z

What about changing the default way to run doctests starting the 2024 edition?

EDIT: With a new standalone attribute allowing a doctest to not be part of the combined doctests.

notriddle · 2024-04-17T20:21:54Z

That's better than implicit fallbacks.

GuillaumeGomez · 2024-04-17T20:40:16Z

Agreed. Should I add the new standalone doctest attribute in this PR as well?

bors · 2024-05-14T17:27:32Z

☔ The latest upstream changes (presumably #125120) made this pull request unmergeable. Please resolve the merge conflicts.

GuillaumeGomez · 2024-05-14T19:26:55Z

Fixed merge conflict (it was in the run-make test I updated).

bors · 2024-05-18T20:07:38Z

☔ The latest upstream changes (presumably #125254) made this pull request unmergeable. Please resolve the merge conflicts.

…e are too many

…valid code for combined doctests

…pdate associated test to check both combined and non-combined doctests

…he 2021 edition

To do so, AST error detection was improved in order to not filter out too many doctests.

GuillaumeGomez · 2024-05-19T13:06:53Z

Fixed merge conflict.

rustbot assigned notriddle Apr 15, 2024

rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. labels Apr 15, 2024