[BUG] Self network overloaded #274

Open
ShockedPlot7560 opened this issue Jun 13, 2023 · 9 comments
Labels
bug (Something isn't working), priority/high (High priority issues), unconfirmed

Comments

@ShockedPlot7560

Describe the bug
This bug may seem bizarre at first glance, and it took us several days of diagnosis before we came to suspect the proxy.

To put it simply, the proxy itself generates 20x more network traffic than normal towards a single server, which overloads that server's network and disrupts connections to the other servers.

This network load never goes out to the public network; it stays between the proxy and one particular server. We're talking about a 10-20Mb load, compared with not even 1Mb for normal Proxy <-> Server traffic. The traffic leaves the proxy, enters the PocketMine server (4.22.0) and is absorbed there (neither the proxy nor the PocketMine server sends it any further).

As you can see in the screenshot below, on the left we have the output of the proxy, and on the right the input of the PocketMine server.
image

The overload stops as soon as the PocketMine server is restarted, and then resumes after a random amount of time. We were unable to find any correlation between activity on our network and this sudden increase in load.

On the PocketMine server side, debugging showed that during peaks of this kind, logs coming from this line are spammed constantly.

A network capture at the interface between this proxy and the server enabled us to observe this kind of content being generated from the proxy:
image
172.20.0.4 is the proxy and 172.20.0.7 is the Minecraft server.

To Reproduce
Steps to reproduce the behavior:
We have not yet found a way to reproduce it.

Platform Information:

  • Server OS: Ubuntu
  • Java Version: Java 17 (Waterdog last pre-release)
  • Server Deployment Method: Custom docker build
ShockedPlot7560 added the bug and unconfirmed labels on Jun 13, 2023
@ShockedPlot7560
Author

Here are the logs during a spam burst:
image

@ShockedPlot7560
Author

After a number of occurrences, we noticed that the peaks all coincide with errors in packet decoding/decompression.

io.netty.handler.codec.DecoderException: java.util.zip.DataFormatException: invalid distance too far back

java.lang.IllegalArgumentException: Split packet part index out of range. Got 2, expected 0-1

The two errors above correspond to these problems:

@ShockedPlot7560
Author

This is caused by PocketMine's address ban (IP ban) feature. Once it is deactivated, everything works fine.

@TobiasGrether
Member

@ShockedPlot7560 So did disabling PocketMine's packet rate limiting by IP fix this issue? If not, are you using anything like ProxyTransport?

@ShockedPlot7560
Author

ShockedPlot7560 commented Jun 29, 2023

> @ShockedPlot7560 So did disabling PocketMine's packet rate limiting by IP fix this issue? If not, are you using anything like ProxyTransport?

Yes, by completely disabling the IP ban functionality in PHP with reflection. Before that, we had simply raised the packet count limit to PHP_INT_MAX, but that didn't seem to work properly (this issue is the proof).
However, the address ban triggers a serious bug on the Waterdog side.

We didn't use any plug-ins like ProxyTransport, apart from Stargate.

@TobiasGrether
Member

I'm a bit confused. You disabled the IP rate limiting in PocketMine and that fixed the issue, but it still proves this issue?

To investigate this issue a bit better, could you please do the following:

Utilise this as well as this class to gather some information regarding network data?

You can use this by calling the ProxyServer#setNetworkMetrics method with a custom implementation of these classes.

I can recommend using something like Prometheus or InfluxDB to record these data points.

This will contain lots of useful information that will help us resolve this issue. If you need any assistance, let me know.
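
As a rough illustration of what such a custom implementation could look like, here is a minimal sketch of a per-peer byte counter. The `TrafficMetrics` interface and its method names are hypothetical stand-ins invented for this example; the actual type expected by `ProxyServer#setNetworkMetrics` is not shown in this thread and will likely differ, so the callbacks would need to be adapted to it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical callback interface, NOT the real Waterdog one.
// It only stands in for "something the proxy calls on every send/receive".
interface TrafficMetrics {
    void bytesSent(String peer, int bytes);
    void bytesReceived(String peer, int bytes);
}

// Accumulates traffic per peer address so that a periodic job can export
// the totals to a time-series store and reveal which server is being flooded.
final class PerPeerTrafficMetrics implements TrafficMetrics {
    private final Map<String, LongAdder> sent = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> received = new ConcurrentHashMap<>();

    @Override
    public void bytesSent(String peer, int bytes) {
        sent.computeIfAbsent(peer, k -> new LongAdder()).add(bytes);
    }

    @Override
    public void bytesReceived(String peer, int bytes) {
        received.computeIfAbsent(peer, k -> new LongAdder()).add(bytes);
    }

    // Total bytes sent to the given peer so far (0 if never seen).
    long totalSent(String peer) {
        LongAdder adder = sent.get(peer);
        return adder == null ? 0L : adder.sum();
    }

    // Total bytes received from the given peer so far (0 if never seen).
    long totalReceived(String peer) {
        LongAdder adder = received.get(peer);
        return adder == null ? 0L : adder.sum();
    }
}
```

Keeping the counters per peer rather than global matters here, because the reported overload only affects the link between the proxy and one particular server.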

TobiasGrether added the priority/high label on Jun 29, 2023
@ShockedPlot7560
Author

> I'm a bit confused. You disabled the IP rate limiting in pocketmine and that fixed the issue, but it still proves this issue?

When the problem occurred, we had already raised PocketMine's packet limit to PHP_INT_MAX. Even so, the packet count sometimes seemed to reach that limit, which caused the proxy's IP to be banned at the PocketMine level. This ban made the proxy do something strange: within a few seconds it was spamming the server at up to 15MB/s, always with exactly the same packet.

We then decided to completely deactivate address banning in PocketMine using some rather hacky means. Since address banning has been properly deactivated, there have been no further problems.

When debugging locally, we managed to reproduce this spam simply by blocking the proxy's IP (we observed a sudden increase in network traffic).
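
To make the described behaviour easier to reason about, here is a toy model of a per-address packet limit with blocking. It is only a sketch under stated assumptions: the class name, the per-tick counting and the permanent block are all invented for illustration and are not PocketMine's actual implementation. What it shows is that every player behind a proxy shares the proxy's single address, so one block cuts off all of them at once.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy per-address limiter (hypothetical, not PocketMine code).
final class PerAddressPacketLimiter {
    private final int maxPacketsPerTick;
    private final Map<String, Integer> packetsThisTick = new HashMap<>();
    private final Set<String> blocked = new HashSet<>();

    PerAddressPacketLimiter(int maxPacketsPerTick) {
        this.maxPacketsPerTick = maxPacketsPerTick;
    }

    // Returns true if the packet is accepted, false if it is dropped.
    boolean onPacket(String address) {
        if (blocked.contains(address)) {
            return false; // a blocked address is dropped silently
        }
        int count = packetsThisTick.merge(address, 1, Integer::sum);
        if (count > maxPacketsPerTick) {
            blocked.add(address); // limit exceeded: block the whole address
            return false;
        }
        return true;
    }

    // Called once per tick to start a new counting window.
    void tick() {
        packetsThisTick.clear();
    }

    boolean isBlocked(String address) {
        return blocked.contains(address);
    }
}
```

In this model a blocked address gets no reply at all, which is the situation the proxy appears to be in when the spam starts.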

@TobiasGrether
Member

I suspect that once the IP gets banned, the proxy tries to retransmit the packets that PMMP has blocked, which causes this issue.
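
The suspected mechanism can be illustrated with a deliberately simplified model of a reliable-UDP sender. This is a sketch only: the resend-everything-every-tick policy and all names are assumptions made for the example, not the proxy's or RakNet's real retransmission logic. It compares the number of datagram transmissions when the peer acknowledges data with the case where the peer silently drops everything, as a banned address would experience.

```java
// Toy model (hypothetical): each tick the sender first resends every datagram
// that is still unacknowledged, then sends its new datagrams.
final class RetransmitModel {
    // Returns the total number of datagram transmissions over the given ticks.
    static long simulate(int ticks, int newDatagramsPerTick, boolean peerAcks) {
        long transmissions = 0;
        long unacked = 0;
        for (int t = 0; t < ticks; t++) {
            transmissions += unacked;             // retransmit the backlog
            transmissions += newDatagramsPerTick; // send fresh data
            unacked += newDatagramsPerTick;
            if (peerAcks) {
                unacked = 0; // everything sent this tick is acknowledged
            }
        }
        return transmissions;
    }

    public static void main(String[] args) {
        System.out.println(simulate(10, 5, true));  // prints 50
        System.out.println(simulate(10, 5, false)); // prints 275
    }
}
```

Even in this crude model, the same amount of useful data costs several times more transmissions once acknowledgements stop arriving, and the gap keeps growing with time.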

@ShockedPlot7560
Author

That is what was happening: if you look at the screenshot I posted above, you can see that it is the same RakNet packet every time.
