server stops suddenly / std::bad_alloc memory exhaustion #126

gits7r · 2015-09-26T08:57:19Z

Hi,

The server was running fine. I upgraded to the latest commits a few days ago. Now when I run electrum-server start, it looks like it starts (it takes some time), but when I run electrum-server getinfo (after ~1 minute) it says the server is not running. There is nothing interesting in the log files, except "starting TCP Server on ..." and "starting SSL server on ...".

bitcoind is working fine; I didn't touch it. I also tried restarting bitcoind, and then the electrum server started and ran for a few hours, but it died again with nothing in the log file. How can I debug this?

abitfan · 2015-09-26T13:01:47Z

You can try to run run_electrum_server directly and see if it spits out more info.

gits7r · 2015-09-26T13:44:17Z

INFO:electrum:Starting Electrum server on 127.0.0.1
ERROR:electrum:db init
Traceback (most recent call last):
  File "build/bdist.linux-x86_64/egg/electrumserver/storage.py", line 180, in __init__
    self.db_utxo = DB(self.dbpath, 'utxo', config.getint('leveldb', 'utxo_cache'))
  File "/usr/lib/python2.7/ConfigParser.py", line 359, in getint
    return self._get(section, int, option)
  File "/usr/lib/python2.7/ConfigParser.py", line 356, in _get
    return conv(self.get(section, option))
  File "/usr/lib/python2.7/ConfigParser.py", line 618, in get
    raise NoOptionError(option, section)
NoOptionError: No option 'utxo_cache' in section: 'leveldb'
INFO:electrum:Stopping Stratum
INFO:electrum:Initializing database
Traceback (most recent call last):
  File "/usr/local/bin/run_electrum_server", line 4, in <module>
    __import__('pkg_resources').run_script('electrum-server==1.0', 'run_electrum_server')
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 534, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python2.7/dist-packages/pkg_resources.py", line 1445, in run_script
    exec(script_code, namespace, namespace)
  File "/usr/local/lib/python2.7/dist-packages/electrum_server-1.0-py2.7.egg/EGG-INFO/scripts/run_electrum_server", line 256, in <module>
  File "build/bdist.linux-x86_64/egg/electrumserver/blockchain_processor.py", line 57, in __init__
  File "build/bdist.linux-x86_64/egg/electrumserver/storage.py", line 195, in __init__
  File "build/bdist.linux-x86_64/egg/electrumserver/storage.py", line 324, in put_node
AttributeError: 'Storage' object has no attribute 'db_utxo'
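
The first traceback above fails on a missing `utxo_cache` option in the `[leveldb]` section. A quick way to check a config file for that kind of gap is sketched below. It is a minimal sketch, not part of electrum-server: it is written for Python 3 (the server in this thread runs on Python 2.7), the helper name `missing_options` is hypothetical, and placing `hist_cache` and `addr_cache` in `[leveldb]` is an assumption (only `utxo_cache` is confirmed by the traceback). The sample path and value are made up.

```python
import configparser

# Option names mentioned in this thread. Only utxo_cache is confirmed to
# live in [leveldb]; putting the other two there is an assumption.
EXPECTED = {"leveldb": ["hist_cache", "utxo_cache", "addr_cache"]}


def missing_options(conf_text, expected=EXPECTED):
    """Return the (section, option) pairs that the config text lacks."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    missing = []
    for section, options in expected.items():
        for option in options:
            # has_option() is False when the section itself is absent too.
            if not parser.has_option(section, option):
                missing.append((section, option))
    return missing


# Hypothetical config fragment, for illustration only.
sample = """
[leveldb]
path = /tmp/electrum-db
hist_cache = 67108864
"""

for section, option in missing_options(sample):
    print("missing option %r in section %r" % (option, section))
```

To check a real file, read it with `open(path).read()` and pass the text to `missing_options`.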

ecdsa · 2015-09-26T17:17:39Z

you have to run run_electrum_server.py, not run_electrum_server

gits7r · 2015-09-26T20:51:30Z

Maybe the database was corrupt. I deleted electrum's database and downloaded it again from foundry. It started fine and has been working for the last few hours under normal parameters... slowly catching up.

Could my database just get corrupted on the fly, without anyone doing anything wrong? I know how to start/stop the server and never kill -9 electrum.

gits7r · 2015-09-27T07:56:51Z

@ecdsa I have downloaded a fresh leveldb dump from foundry and started again, and unfortunately it died again. There is a bug here. I ran run_electrum_server.py in the console and here is what I get:

INFO:electrum:Starting Electrum server on 127.0.0.1
INFO:electrum:Database version 3.
INFO:electrum:Pruning limit for spent outputs is 10000.
INFO:electrum:Blockchain height 375506
INFO:electrum:UTXO tree root hash: c5e8dca8fefc2e5f8ab198aac02824d9b0b3e08c414cd249fa62bb0d0408221a
INFO:electrum:Coins in database: 1463486254755852
INFO:electrum:catching up missing headers: 375492 375506
INFO:electrum:TCP server started on 127.0.0.1:50001
INFO:electrum:SSL server started on 127.0.0.1:50002
Exception in thread Thread-4:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
  File "/usr/lib/python2.7/threading.py", line 763, in run
  File "build/bdist.linux-x86_64/egg/electrumserver/blockchain_processor.py", line 83, in do_catch_up
  File "build/bdist.linux-x86_64/egg/electrumserver/blockchain_processor.py", line 657, in catch_up
  File "build/bdist.linux-x86_64/egg/electrumserver/blockchain_processor.py", line 413, in import_block
  File "build/bdist.linux-x86_64/egg/electrumserver/storage.py", line 625, in import_transaction
  File "build/bdist.linux-x86_64/egg/electrumserver/storage.py", line 585, in set_spent
  File "build/bdist.linux-x86_64/egg/electrumserver/storage.py", line 144, in get
  File "_plyvel.pyx", line 299, in plyvel._plyvel.DB.get (plyvel/_plyvel.cpp:4025)
  File "_plyvel.pyx", line 103, in plyvel._plyvel.db_get (plyvel/_plyvel.cpp:1891)
  File "_plyvel.pyx", line 80, in plyvel._plyvel.raise_for_status (plyvel/_plyvel.cpp:1698)
IOError: IO error: /home/bitnode/electrum-server/electrum-leveldb-utxo-10000/hist/30610918.ldb: Too many open files

What could be the issue? I have the correct limits set up in /etc/security/limits.conf for the user running electrum.

abitfan · 2015-09-27T08:24:21Z

If this is an Ubuntu install, you also need to edit /etc/pam.d/common-session and add:
session required pam_limits.so
To test that your changes are OK, log in as the user running electrum and run:
ulimit -n
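
The same check can be made from inside a Python process, which shows the limit the process actually inherited rather than the one a fresh login shell reports. This is a minimal sketch using the standard `resource` module; the helper names and the 65536 threshold (taken from the limits discussed in this thread) are illustrative only.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def limit_is_sufficient(soft, wanted=65536):
    """True if the soft limit is unlimited or at least `wanted`."""
    return soft == resource.RLIM_INFINITY or soft >= wanted


soft, hard = nofile_limits()
print("open files: soft=%s hard=%s" % (soft, hard))
if not limit_is_sufficient(soft):
    print("soft limit is below 65536; LevelDB may hit 'Too many open files'")
```

Run it as the same user, and started the same way, as the server itself, since limits set through PAM only apply to sessions that go through PAM.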

gits7r · 2015-09-27T08:29:01Z

@abitfan I am on Debian Jessie.
Unfortunately, here is what ulimit -n reports when run as the user running electrum:
sudo -u bitnode -i ulimit -n
1024

I have appended the following to /etc/security/limits.conf:
bitnode hard nofile 65536
bitnode soft nofile 65536

abitfan · 2015-09-27T09:31:46Z

Actually, the common-session change is required on Debian as well.

gits7r · 2015-09-27T11:17:54Z

@abitfan Can you let me know, step by step, what I need to do in order to enable it? Thanks.

abitfan · 2015-09-27T11:27:05Z

as root:
echo "session required pam_limits.so" >> /etc/pam.d/common-session

gits7r · 2015-09-27T16:23:49Z

I have done that. Now the limit is 65536 for 'bitnode', which is the user I run electrum-server as.
It still did not fix it. It starts and dies with nothing relevant in electrum.log. Running from the console I get the following:

sudo -u bitnode -i run_electrum_server.py
INFO:electrum:Starting Electrum server on 127.0.0.1
ERROR:electrum:db init
Traceback (most recent call last):
  File "build/bdist.linux-x86_64/egg/electrumserver/storage.py", line 180, in __init__
    self.db_utxo = DB(self.dbpath, 'utxo', config.getint('leveldb', 'utxo_cache'))
  File "build/bdist.linux-x86_64/egg/electrumserver/storage.py", line 129, in __init__
    self.db = plyvel.DB(os.path.join(path, name), create_if_missing=True, compression=None, lru_cache_size=cache_size)
  File "_plyvel.pyx", line 236, in plyvel._plyvel.DB.__init__ (plyvel/_plyvel.cpp:3129)
  File "_plyvel.pyx", line 80, in plyvel._plyvel.raise_for_status (plyvel/_plyvel.cpp:1698)
IOError: IO error: lock /home/bitnode/electrum-server/electrum-leveldb-utxo-10000/utxo/LOCK: Resource temporarily unavailable
INFO:electrum:Stopping Stratum
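
"Resource temporarily unavailable" on the LOCK file is the error a non-blocking file lock returns when another process already holds it, so one possible cause (not confirmed in this thread) is an earlier server instance that is still running. The sketch below probes a lock file from a separate Python 3 process. It assumes LevelDB takes a POSIX `fcntl` lock on the file, which is what `fcntl.lockf` also uses on Linux; the function name is hypothetical.

```python
import errno
import fcntl
import tempfile


def lock_is_held_elsewhere(lock_path):
    """Return True if another process holds a POSIX lock on lock_path.

    Run this from a separate process only: POSIX locks are per process,
    so probing from inside the process that owns the database would not
    detect its own lock and would release it when the file is closed.
    """
    with open(lock_path, "r+") as fh:
        try:
            fcntl.lockf(fh, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except OSError as exc:
            if exc.errno in (errno.EAGAIN, errno.EACCES):
                return True
            raise
        # We got the lock, so nobody else had it; give it back at once.
        fcntl.lockf(fh, fcntl.LOCK_UN)
        return False


# Demo on a fresh temporary file, which no other process has locked.
with tempfile.NamedTemporaryFile() as tmp:
    print("locked by another process:", lock_is_held_elsewhere(tmp.name))
```

If the probe reports a held lock on the database's LOCK file, listing the running server processes is the next thing to check before starting another instance.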

abitfan · 2015-09-28T06:06:50Z

Can you try this with a fresh db?

gits7r · 2015-09-30T15:17:11Z

OK. I have tried with a fresh DB 10 times. The DBs are correct; I checked the hash and everything.
I have set the limits properly like you said; the user running electrum now has soft 65536 and hard 65536. It always dies like this after a few seconds:

INFO:electrum:Starting Electrum server on 127.0.0.1
terminate called after throwing an instance of 'std::bad_alloc'
what(): std::bad_alloc

I am on the latest commit. What could be wrong?

shsmith · 2015-10-01T03:05:21Z

bad_alloc sounds like memory exhaustion. Try allocating more swap space.
You could also reduce the cache sizes via your electrum.conf hist_cache, utxo_cache and addr_cache settings.
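
To see how much memory the configured caches ask for in total, something like the following can help. It is a sketch under assumptions: Python 3, all three options living in the `[leveldb]` section (only `utxo_cache` is confirmed by the tracebacks in this thread), and the values being byte counts passed through to plyvel's `lru_cache_size`. The sample values are made up.

```python
import configparser

CACHE_OPTIONS = ("hist_cache", "utxo_cache", "addr_cache")


def total_cache_bytes(conf_text, section="leveldb"):
    """Sum the configured cache sizes; missing options count as 0."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return sum(parser.getint(section, opt, fallback=0) for opt in CACHE_OPTIONS)


# Hypothetical values, for illustration only.
sample = """
[leveldb]
hist_cache = 67108864
utxo_cache = 134217728
addr_cache = 16777216
"""

total = total_cache_bytes(sample)
print("configured LevelDB caches: %d MiB" % (total // 2**20))
```

The total is only a lower bound on what the process uses, since the caches are one of several consumers of memory.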

gits7r · 2015-10-01T08:42:46Z

My allocated swap space appears to be unused. This machine used to run electrum very well. Can it suddenly require more swap space?
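
Swap that looks unused after a crash says little about the moment of the crash, so it can help to log memory and swap headroom while the server starts. The sketch below parses `/proc/meminfo` on Linux (Python 3). The function names are hypothetical and the sample figures are invented for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into {field: value in KiB}."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def swap_used_kib(info):
    """Swap currently in use, in KiB."""
    return info["SwapTotal"] - info["SwapFree"]


# Invented sample resembling a machine whose swap is untouched.
sample = """MemTotal:        8167848 kB
MemAvailable:     512000 kB
SwapTotal:       5242876 kB
SwapFree:        5242876 kB
"""

info = parse_meminfo(sample)
print("swap used: %d KiB" % swap_used_kib(info))

# On Linux the live figures can be read the same way.
try:
    with open("/proc/meminfo") as fh:
        live = parse_meminfo(fh.read())
    print("live swap used: %d KiB" % swap_used_kib(live))
except (OSError, KeyError):
    print("/proc/meminfo not available on this system")
```

Calling this every few seconds from a loop while the server catches up would show whether usage climbs right before the process dies.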

gits7r · 2015-10-02T13:07:26Z

@shsmith @ecdsa I have increased the allocated RAM for this virtual machine from 8GB to 16GB and the swap space from 5GB to 8GB, and this seems to have fixed it -- now electrum is catching up with bitcoind height and updating leveldb.

Do we require more resources now to run electrum server?

gits7r · 2015-10-18T17:43:42Z

Tried 100 more times with different changes; it still won't work.
I think this is not related to electrum-server. It is probably caused by insufficient disk I/O capacity, since the server doesn't have an SSD (it has normal SATA drives, no RAID). This is a virtual machine hosted on shared hardware. On the same hardware I have another electrum server plus many other things, so I guess the disk just can't keep up, and the hypervisor doesn't allocate more disk I/O to this virtual machine in order to protect the others.

We already know that leveldb uses the disk heavily and needs an SSD. So I will close this, since I don't see a bug in electrum-server. The last log message is:
[18/10/2015-05:06:38] block 379312 (410 401.10s) 457255bb18a4ba9e792ab8f3e2b4d5fd34f3dccf7b008ed5d278622c31f3e280 (4.49tx/s, 255.59s/block) (eta 11.3 hours, 112 blocks)

You can see it takes a long time to process blocks. RAM, CPU and swap space are plentiful, but disk I/O is not.

EagleTM · 2015-11-09T13:18:15Z

I'm seeing the same issue here:
A server with 4 GB RAM and 4 GB swap dies after around a week of running, with caches at half the size of the new lower default (so they are not the issue).
Crash message: "terminate called after throwing an instance of 'std::bad_alloc' what(): std::bad_alloc"

It's definitely running out of memory / swap.
I've noticed the current git head pulls in like 90% of memory (with swap that'd be 8 GB on my system) on startup to catch up blocks.

This might be related to recent changes like "writing once per block" or the "ordering of tx" stuff. Server versions from June 2015 don't have this issue.

Unless the memory footprint can be reduced, we should recommend at least 8 GB of RAM (better 16 GB) for running electrum server.

lvets · 2015-12-06T22:23:15Z

Any update on this? I'm still seeing electrum-server taking 16GB of RAM + 4 GB swap on a server when processing blocks...
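
For reports like this it helps to quote resident and swapped memory for the server process itself rather than for the whole machine. Below is a minimal Python 3 sketch that reads `VmRSS` and `VmSwap` from `/proc/<pid>/status` on Linux. The function names are hypothetical and the sample numbers are invented.

```python
def memory_of(status_text):
    """Extract VmRSS and VmSwap (in KiB) from /proc/<pid>/status text."""
    wanted = {"VmRSS": None, "VmSwap": None}
    for line in status_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in wanted:
            wanted[key] = int(rest.split()[0])
    return wanted


def read_status(pid="self"):
    """Read the status file of a process; 'self' means this process."""
    with open("/proc/%s/status" % pid) as fh:
        return fh.read()


# Invented sample: 11 GiB resident, 6 GiB swapped out.
sample = "Name:\tpython\nVmRSS:\t11534336 kB\nVmSwap:\t 6291456 kB\n"
print(memory_of(sample))

# Live figures for the current process, where /proc is available.
try:
    print(memory_of(read_status()))
except OSError:
    print("/proc is not available on this system")
```

Passing the server's PID to `read_status` gives figures that can be compared directly between machines.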

EagleTM · 2015-12-12T12:27:03Z

We're investigating the issue. Thomas' server is using less than 2 gigs of RES RAM, while I'm at 11 gig. It might be the plyvel version. I've updated to 0.9 (from 0.2) recently - still running leveldb 1.9.x (2013) with it. Thomas is using leveldb 1.9.x and plyvel 0.8. Which plyvel versions are you using?
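
One way to answer the version question consistently is to ask the packaging metadata. This sketch assumes Python 3.8 or newer, where `importlib.metadata` is in the standard library; on the Python 2.7 setup used in this thread, `pip show plyvel` would be the closer equivalent. The helper name is hypothetical.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


print("plyvel:", installed_version("plyvel"))
```

This reports the plyvel package version only; the LevelDB library version it was built against has to be checked separately.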

For now I get a stable running server with 16 GB RAM + 16 GB swap. Around 6 GB of swap gets used, so I can recommend 24 GB of RAM + swap.

EagleTM · 2016-02-21T00:57:19Z

Sorry, no progress here currently. The RAM recommendations still stand. I've put them into the HOWTO for now.

gits7r closed this as completed Oct 18, 2015

EagleTM reopened this Nov 9, 2015

EagleTM changed the title ~~server stops suddenly~~ server stops suddenly / std::bad_alloc memory exhaustion Nov 9, 2015

EagleTM mentioned this issue Nov 9, 2015

Server doesn't accept new connections after a while #132

Open
