
cmd/evm, eth/tracers: refactor structlogger + make it streamable #30806

Open · wants to merge 6 commits into base: master

Conversation

holiman (Contributor) commented Nov 25, 2024

This PR refactors the structlog a bit, so that it can be used in a streaming mode.

The command-line usage of structlog now uses the streaming mode, leaving the non-streaming mode of operation for `eth_call`.

There are two benefits to streaming mode:

  1. We do not have to maintain a long list of operations.
  2. We do not have to duplicate (or n-plicate) data, e.g. memory / stack / returndata, so that each entry has its own private slice.

Before:

[user@work go-ethereum]$ ./evm-master --debug --code 604019 run

#### TRACE ####
PUSH1           pc=00000000 gas=10000000000 cost=3

NOT             pc=00000002 gas=9999999997 cost=3
Stack:
00000000  0x40

STOP            pc=00000003 gas=9999999994 cost=0
Stack:
00000000  0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffbf

#### LOGS ####

After:

[user@work go-ethereum]$ ./evm-structlog --debug --code 604019 run
PUSH1           pc=00000000 gas=10000000000 cost=3

NOT             pc=00000002 gas=9999999997 cost=3
Stack:
00000000  0x40

STOP            pc=00000003 gas=9999999994 cost=0
Stack:
00000000  0xffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffbf

	// create a log
	if l.writer == nil {
		// Non-streaming, need to copy slices.
		log.Memory = slices.Clone(log.Memory)
Member:

I'm wondering if it makes more sense to write them to a buffer in this case and flush them at the end, if we want to support the non-streaming case. I don't see a case in the codebase where we use the structlogs directly, only ever in a formatted/textified form.

Reply:

The buffering is really due to them being included as items in a JSON list in the RPC response.

Member:

I mean, shouldn't we marshal them into a buffer here already, rather than storing them as individual logs? But I guess that would break external projects using the struct logger.

Contributor Author:

We kind of could, but then we would have to redefine the other RPC message so that it does not double-encode the already-encoded snippets. It would, however, give us a better estimate of how much data we are accumulating: instead of limiting to n items, we could limit to a certain number of MB.
Is there any other reason for you to suggest this?

Member:

Nope, basically those reasons. It will also limit the peak amount of memory, since we can free the un-encoded data quickly and don't have to keep it around until the end of the encoding. Average memory consumption will go up slightly, though (encoded data is bigger than unencoded).

Contributor Author:

What you are suggesting makes sense, though, because it simplifies a whole lot.

holiman (Contributor Author) commented Nov 26, 2024

I have now pushed some changes to the non-streaming encoder, which is used when returning "legacy" traces in the debug API.

To ensure that nothing substantial changed, this is how I tested it:

 go run ./cmd/geth --dev console --exec 'debug.traceCall({input: "0x600051600155602051600255"},"latest" , {"enableMemory": true})'  > ethcall_result

The expected response is

{
  failed: false,
  gas: 57606,
  returnValue: "",
  structLogs: [{
      depth: 1,
      gas: 49946818,
      gasCost: 3,
      op: "PUSH1",
      pc: 0,
      stack: []
  }, {
      depth: 1,
      gas: 49946815,
      gasCost: 6,
      op: "MLOAD",
      pc: 2,
      stack: ["0x0"]
  }, {
      depth: 1,
      gas: 49946809,
      gasCost: 3,
      memory: ["0000000000000000000000000000000000000000000000000000000000000000"],
      op: "PUSH1",
      pc: 3,
      stack: ["0x0"]
  }, {
      depth: 1,
      gas: 49946806,
      gasCost: 2200,
      memory: ["0000000000000000000000000000000000000000000000000000000000000000"],
      op: "SSTORE",
      pc: 5,
      stack: ["0x0", "0x1"],
      storage: {
        0000000000000000000000000000000000000000000000000000000000000001: "0000000000000000000000000000000000000000000000000000000000000000"
      }
  }, {
      depth: 1,
      gas: 49944606,
      gasCost: 3,
      memory: ["0000000000000000000000000000000000000000000000000000000000000000"],
      op: "PUSH1",
      pc: 6,
      stack: []
  }, {
      depth: 1,
      gas: 49944603,
      gasCost: 6,
      memory: ["0000000000000000000000000000000000000000000000000000000000000000"],
      op: "MLOAD",
      pc: 8,
      stack: ["0x20"]
  }, {
      depth: 1,
      gas: 49944597,
      gasCost: 3,
      memory: ["0000000000000000000000000000000000000000000000000000000000000000", "0000000000000000000000000000000000000000000000000000000000000000"],
      op: "PUSH1",
      pc: 9,
      stack: ["0x0"]
  }, {
      depth: 1,
      gas: 49944594,
      gasCost: 2200,
      memory: ["0000000000000000000000000000000000000000000000000000000000000000", "0000000000000000000000000000000000000000000000000000000000000000"],
      op: "SSTORE",
      pc: 11,
      stack: ["0x0", "0x2"],
      storage: {
        0000000000000000000000000000000000000000000000000000000000000001: "0000000000000000000000000000000000000000000000000000000000000000",
        0000000000000000000000000000000000000000000000000000000000000002: "0000000000000000000000000000000000000000000000000000000000000000"
      }
  }, {
      depth: 1,
      gas: 49942394,
      gasCost: 0,
      memory: ["0000000000000000000000000000000000000000000000000000000000000000", "0000000000000000000000000000000000000000000000000000000000000000"],
      op: "STOP",
      pc: 12,
      stack: []
  }]
}

The only diff between this PR and master is that this omits empty memory.

9d8
<       memory: [],
17d15
<       memory: [],

The early encoding means we don't have to copy slices explicitly; we just encode a json.RawMessage. This is ideal for keeping track of the response size. Earlier, the limit was on the number of elements, which is a very vague measure (e.g. each element may carry a megabyte of memory); we can now redefine the limit to mean a number of bytes.

MariusVanDerWijden (Member) left a comment:

LGTM, but I think @s1na should also take a look

	// Create a tracer which records the number of steps
	var steps = 0
	tracer := &tracing.Hooks{
		OnOpcode: func(pc uint64, op byte, gas, cost uint64, scope tracing.OpContext, rData []byte, depth int, err error) {
			steps++
		},
	}
Contributor:

I feel like at this point we are benching EVM execution :) It seems this benchmark was introduced in #23016 to test optimizations to the struct logger. What do you think about dropping it?

Reply:

Sgtm!

Contributor Author:

I fixed it instead in the latest commit. It's much slower now, though, due to the JSON encoding and the extra memory usage.

Contributor:

You were not kidding. It's 10x slower.

PR:

goos: darwin
goarch: arm64
pkg: github.com/ethereum/go-ethereum/eth/tracers
BenchmarkTransactionTraceV2-11                15          74619664 ns/op        87275606 B/op     897587 allocs/op
PASS

Master:

goos: darwin
goarch: arm64
pkg: github.com/ethereum/go-ethereum/eth/tracers
BenchmarkTransactionTrace-11                 163           6710361 ns/op         3798259 B/op      81648 allocs/op
PASS


// toLegacyJSON converts the structlog to json-encoded legacy form (StructLogRes).
//
// The differences between the structlog json and the 'legacy' json are:
Contributor:

Where do we use the non-legacy json format? I can't seem to find it.

Reply:

See my comment above, with `debug.traceCall`.

Contributor Author:

go run ./cmd/geth --dev console --exec 'debug.traceCall({input: "0x600051600155602051600255"},"latest" , {"enableMemory": true})'

The updated benchmark is a lot slower than previously, since previously the JSON encoding was deferred until later. Also, storing the data as JSON-encoded strings takes more space than the raw bytes.

The benchmark is thus renamed from the original BenchmarkTransactionTrace to BenchmarkTransactionTraceV2.