This repository has been archived by the owner on Jul 12, 2021. It is now read-only.

make sure ZeroDivisionError never happens #155

Open
wants to merge 2 commits into
base: master

webdawg commented Mar 31, 2016

While trying to get my electrum server synced and build the entire chain, it kept stopping with this error:

Exception in thread Thread-4:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/python2.7/site-packages/electrumserver/blockchain_processor.py", line 104, in do_catch_up
    self.catch_up(sync=False)
  File "/usr/lib/python2.7/site-packages/electrumserver/blockchain_processor.py", line 709, in catch_up
    self.print_time(n)
  File "/usr/lib/python2.7/site-packages/electrumserver/blockchain_processor.py", line 128, in print_time
    tx_per_second = (1-alpha2) * tx_per_second + alpha2 * num_tx / delta
ZeroDivisionError: float division by zero

^CINFO:electrum:Stopping Stratum

@@ -119,7 +119,7 @@ def set_time(self):
         self.time_ref = time.time()
 
     def print_time(self, num_tx):
-        delta = time.time() - self.time_ref
+        delta = time.time() - self.time_ref + 0000000000.000001
         # leaky averages
         seconds_per_block, tx_per_second, n = self.avg_time
         alpha = (1. + 0.01 * n)/(n+1)

time.time() is not necessarily monotonic, so this does not prevent delta from being 0.

delta = max(time.time() - self.time_ref, 1e-6) would be more robust
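
A minimal sketch of the clamp suggested above, written as standalone helpers rather than as the project's actual `print_time` method. The names `MIN_DELTA`, `clamped_delta`, and `update_tx_rate` are hypothetical; the rate update only mirrors the shape of the one line visible in the traceback.

```python
MIN_DELTA = 1e-6  # hypothetical floor; the value suggested in the review


def clamped_delta(now, time_ref, floor=MIN_DELTA):
    """Return the elapsed time, never smaller than `floor`.

    This covers a zero difference (timer too coarse) as well as a
    negative one (wall clock stepped backwards between readings).
    """
    return max(now - time_ref, floor)


def update_tx_rate(tx_per_second, num_tx, delta, alpha2):
    """Leaky-average update with the same shape as the traceback line."""
    return (1 - alpha2) * tx_per_second + alpha2 * num_tx / delta
```

With the clamp in place, a zero or negative reading no longer raises, but it does feed a very large instantaneous rate (`num_tx / 1e-6`) into the average, so the logged statistic can spike for that sample.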

Author

I like it, testing it right now.

Author

Been working for 13 days straight.

Collaborator

bauerj commented Apr 30, 2016

What caused this error for you? Is your server too fast or is time.time too inaccurate?

Author

webdawg commented Apr 30, 2016

I think time.time() is not accurate enough on my setup. I am still looking into it, but it may have to do with the clocksource on my Xen domU:

cat /sys/devices/system/clocksource/clocksource0/current_clocksource
xen

This is from an Arch Linux rolling release domU.
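
The same check can be sketched from Python, together with a rough measurement of how fine-grained `time.time()` actually is on a given machine. The helper names `read_clocksource` and `smallest_observed_step` are hypothetical, and the sysfs path is the one from the `cat` command above, so it is assumed to exist only on Linux.

```python
import time

CLOCKSOURCE_PATH = (
    "/sys/devices/system/clocksource/clocksource0/current_clocksource"
)


def read_clocksource(path=CLOCKSOURCE_PATH):
    """Return the active kernel clocksource name, or None if unreadable."""
    try:
        with open(path) as handle:
            return handle.read().strip()
    except (IOError, OSError):
        return None


def smallest_observed_step(samples=100000):
    """Smallest positive gap seen between consecutive time.time() calls.

    Returns None if no positive gap was observed at all.
    """
    smallest = None
    previous = time.time()
    for _ in range(samples):
        current = time.time()
        step = current - previous
        if step > 0 and (smallest is None or step < smallest):
            smallest = step
        previous = current
    return smallest
```

If the smallest step is large compared with the time between `set_time()` and `print_time()`, two readings can come back identical, which would be consistent with the zero delta in the traceback.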

To be honest, this is the first time I have ever had an issue with a clock source. The funny part is that 3-4 weeks later I am building a prototype Proxmox system with FreeBSD 10.3 on it, and now I get to experience the full extent of clock source and virtualization matters.

I am sure I will solve the problem locally once I get through the issues I am having with that system; some distros seem to do some weird things with clocksource.

I only do about 5-6 tx a second with the current setup, but it seems to work just fine with 24/7 uptime. I could dedicate more resources to it in the future, I suppose.

I am running this on the latest patched XenServer. I have no other reverse clock issues except for this one with this software. I am going to look into it more because I have a few other projects on a Proxmox system and a XenServer system that depend on stuff like this.

I know you guys do not want people throwing this on VPSes and the like, but the server I have it running on is an enterprise-hardware setup, and this is all default settings at the moment.

This is just a logging/stat feature, right? Would this have much impact on anything else, then? Would it not be worth it for others in the future?

Clocksource just does not seem simple anymore, heh. hpet seems to be the one that is most universally acceptable, but it is not as fast.

Things I am reading about tsc now:

https://superuser.com/questions/393969/what-does-clocksource-tsc-unstable-mean

The time stamp counter has, until recently, been an excellent high-resolution, low-overhead way of getting CPU timing information. With the advent of multi-core/hyperthreaded CPUs, systems with multiple CPUs, and "hibernating" operating systems, the TSC cannot be relied on to provide accurate results — unless great care is taken to correct the possible flaws: rate of tick and whether all cores (processors) have identical values in their time-keeping registers. There is no promise that the timestamp counters of multiple CPUs on a single motherboard will be synchronized. In such cases, programmers can only get reliable results by locking their code to a single CPU. Even then, the CPU speed may change due to power-saving measures taken by the OS or BIOS, or the system may be hibernated and later resumed (resetting the time stamp counter). In those latter cases, to stay relevant, the counter must be recalibrated periodically (according to the time resolution your application requires).

And then:

Quoting from the same article, some modern CPUs also provide a constant Time Stamp Counter: Recent Intel processors include a constant rate TSC (identified by the constant_tsc flag in Linux's /proc/cpuinfo). With these processors, the TSC reads at the processor's maximum rate regardless of the actual CPU running rate.
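
A small sketch of how the `constant_tsc` flag mentioned in the quote could be looked up. The function names `cpu_flags` and `has_constant_tsc` are hypothetical, and the parser assumes the usual `flags : ...` line layout of Linux's /proc/cpuinfo on x86.

```python
def cpu_flags(cpuinfo_text):
    """Collect the flags from every 'flags' line of /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return flags


def has_constant_tsc(cpuinfo_text):
    """True if any listed CPU advertises the constant_tsc flag."""
    return "constant_tsc" in cpu_flags(cpuinfo_text)


# Typical use on a Linux host (not executed here):
#     with open("/proc/cpuinfo") as handle:
#         print(has_constant_tsc(handle.read()))
```

Inside a guest such as a Xen domU, the flags the kernel reports may differ from those of the physical host, so this is only a first hint.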

I will have to look more into how to handle the xen clocksource.

Collaborator

bauerj commented May 1, 2016

Wow, that's interesting! And no, I've just been curious about this issue, I didn't want to imply that your patch is not worth merging 😃

Collaborator

bauerj commented May 1, 2016

However, wouldn't it be better to just catch the ZeroDivisionError and return? The output wouldn't be of much use anyway when there is no measurable time difference.
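
The alternative described here could look roughly like the sketch below. `tx_rate_or_none` is a hypothetical helper, not code from the project; a caller would skip the log line whenever it returns `None`.

```python
def tx_rate_or_none(num_tx, delta):
    """Return num_tx / delta, or None when no time difference was measured."""
    try:
        return num_tx / float(delta)
    except ZeroDivisionError:
        # No measurable elapsed time: the rate is undefined, so report nothing.
        return None
```

Unlike the `max(..., 1e-6)` clamp, this does not guard against a negative delta from a clock that stepped backwards; that case would still produce a negative rate.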

Author

webdawg commented May 1, 2016

No, the only reason I was talking like that is because of some of the comments in the past about running electrum-server on virtualized machines.

I can look into catching it.

Collaborator

bauerj commented May 27, 2016

@webdawg Any news?

Author

webdawg commented May 27, 2016

Been running since Apr 6 w/ delta = max(time.time() - self.time_ref, 1e-6)
