Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JENKINS-73090] Handle CR from LineTransformationOutputStream #9219

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

jglick
Copy link
Member

@jglick jglick commented May 1, 2024

See JENKINS-73090 for background.

Testing done

Without the fix, the cr test with the count set to a billion failed with

java.lang.OutOfMemoryError: Java heap space
	at java.base/java.util.Arrays.copyOf(Arrays.java:3745)
	at java.base/java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:120)
	at java.base/java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:95)
	at java.base/java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:137)
	at hudson.console.LineTransformationOutputStream.write(LineTransformationOutputStream.java:56)
	at hudson.console.LineTransformationOutputStream.write(LineTransformationOutputStream.java:75)
	at java.base/java.io.OutputStream.write(OutputStream.java:122)
	at hudson.console.LineTransformationOutputStreamTest.cr(LineTransformationOutputStreamTest.java:45)

(I am not leaving this count that high in the committed test since when it passes it takes over two minutes of high CPU on my laptop.)

I also tried installing the Timestamper plugin, enabling the global decorator (auto timestamps in all Pipeline builds), and running the Windows git clone pipeline mentioned in Jira. This produced timestamped output, though the output did seem to come all at once, though that seemed to be an aspect of Git not Jenkins so far as I could tell from looking at the jenkins.log file in the workspace temp dir; so I also tried

node('linux') {
    sh 'set +x; for i in `seq 00 99`; do printf "$i\r"; sleep 1; done'
}

which did produce live timestamped output, though the classic console does not reliably show it; Pipeline Graph View plugin does. (tail -f …/builds/…/log from a terminal actually updates the line in place.)

Proposed changelog entries

  • Treat lines of text (mainly in build logs) as completed by a single carriage return in addition to a newline or carriage return plus newline, avoiding an out of memory error if a large number of such lines are printed in sequence.

Before the changes are marked as ready-for-merge:

Maintainer checklist

Edit tasklist title
Beta Give feedback Tasklist Maintainer checklist, more options

Delete tasklist

Delete tasklist block?
Are you sure? All relationships in this tasklist will be removed.
  1. There are at least two (2) approvals for the pull request and no outstanding requests for change.
    Options
  2. Conversations in the pull request are over, or it is explicit that a reviewer is not blocking the change.
    Options
  3. Changelog entries in the pull request title and/or Proposed changelog entries are accurate, human-readable, and in the imperative mood.
    Options
  4. Proper changelog labels are set so that the changelog can be generated automatically.
    Options
  5. If the change needs additional upgrade steps from users, the upgrade-guide-needed label is set and there is a Proposed upgrade guidelines section in the pull request title (see example).
    Options
  6. If it would make sense to backport the change to LTS, a Jira issue must exist, be a Bug or Improvement, and be labeled as lts-candidate to be considered (see query).
    Options

@jglick
Copy link
Member Author

jglick commented May 2, 2024

#3580 👀

@NotMyFault NotMyFault requested a review from a team May 4, 2024 08:11
@jglick jglick requested a review from dwnusbaum May 9, 2024 13:56
Copy link
Member

@dwnusbaum dwnusbaum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this makes sense. It is quite awkward right now that LineTransformationOutputStream has different behavior to most other Java line-based APIs like BufferedReader.readLine and Files.readAllLines (makes it difficult to use these APIs together for things like reading log files).

I do wonder about potential performance impact. It might also be worth running the PCT to get an idea in advance of how many tests might be affected, if any.

@jglick
Copy link
Member Author

jglick commented May 9, 2024

I do wonder about potential performance impact.

Any particular concerns?

It might also be worth running the PCT

Good point, I can do that in bom and/or @cloudbees.

@dwnusbaum
Copy link
Member

I do wonder about potential performance impact.

Any particular concerns?

My worry is just that this is inner loop code that looks at every byte that gets written to a build log, so it seems possible that even the small changes here which add a field lookup and some branches could have measurable impact at scale. We don't have benchmarks to measure against though, and maybe JIT compilation, branch prediction, and out-of-order execution will make it a nonissue. Either way, unless the performance impact is massive (which I doubt), I think the change is probably worth it.

@jglick
Copy link
Member Author

jglick commented May 10, 2024

this is inner loop code that looks at every byte

Indeed it is. On the other hand it is already calling ByteArrayOutputStream.write(byte) on every byte, even when write(byte[], int, int) is called, which seems like the most obvious target for optimization. I would expect this to be negligible compared to the overhead of receiving input on the one side (typically from a TLS-encrypted Remoting pipe) and forwarding content in Delegating on the other hand (typically to a file handle ultimately), plus various other processing such as masking secrets.

In the particular case being fixed here, that of a large block of text separated by CR, this is not just a performance improvement but potentially necessary to avoid OOME. Presumably that case is unusual and most text is NL-delimited or occasionally CRNL.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants