Add independent txindex #6

dcousens · 2017-03-16T05:17:55Z

This would mean we could run indexd on a pruning node, provided it is fully synchronized to start.


dcousens · 2017-04-19T04:57:08Z

The three options as I see them:

1. Ask bitcoind for the block at a given height (getblock), while our index maintains the transaction's offset into that block.
   - Easily cacheable, but probably high latency due to multiple applications doing IO.
2. Store all of the transaction data in our own database. You wouldn't bother with -txindex, and you'd definitely want to -prune the node. Not as simple to "reset" your index, though; you'd have to reset everything.
3. Use -txindex, and forgo the problem.
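A minimal sketch of the first option, assuming the index stores, for each txid, the block height plus a byte offset and length into the serialized block. `TxLocation`, `read_tx` and `fetch_block_hex` are hypothetical names used only for illustration; in practice the fetch helper would presumably wrap bitcoind's getblockhash and getblock RPCs.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class TxLocation:
    """Hypothetical index entry: where a transaction sits inside a block."""
    height: int  # height of the containing block
    offset: int  # byte offset of the transaction within the serialized block
    length: int  # serialized size of the transaction in bytes


def read_tx(index: Dict[str, TxLocation],
            fetch_block_hex: Callable[[int], str],
            txid: str) -> bytes:
    """Return the raw bytes of `txid` by slicing them out of its block.

    `fetch_block_hex` is an assumed helper that returns the hex-encoded
    serialized block at a given height (e.g. via the node's RPC).
    """
    loc = index[txid]
    block = bytes.fromhex(fetch_block_hex(loc.height))
    end = loc.offset + loc.length
    if end > len(block):
        raise ValueError("index entry points past the end of the block")
    return block[loc.offset:end]
```

The index itself stays small (three integers per transaction), at the cost of one block fetch per lookup, which is where the latency concern comes from.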

dcousens · 2017-05-10T13:05:22Z

If we maintain our own txindex, not only can we bundle it with #9 - but we can preserve the entire transaction for other analysis.

@Runn1ng the concern here is, this could severely blow out a disk in terms of space required...
Maybe optional?

karelbilek · 2017-05-11T13:37:18Z

What would be the motivation for having a pruning node and txindex at the same time? A user who prunes wants to save disk space, which is then negated by storing a txindex :)

Also what is the reasoning of having separate index here instead of relying on bitcoind?

karelbilek · 2017-05-11T13:44:51Z

Btw, getting back to bitcore fork (I am going back to it, because it does what I need :), but it's a bit painful to maintain because of the rather large patchset)

If you have addressindex, spentindex, timestampindex and txindex enabled, the disk space grows significantly; I think to about 2 or 3 times the size of the blockchain without the indexes.

unsystemizer · 2017-05-21T10:10:14Z

addrindex and txindex don't add nearly as much. If the exact figures are important I can look them up.
One motivation for having indexes with a pruned blockchain would be that you could fetch tx details later (not necessarily from the same bitcoind), no?

dcousens · 2017-05-21T10:52:23Z

@unsystemizer exactly

instagibbs · 2017-05-30T13:52:38Z

@Runn1ng I don't have any special info but txindex may someday get retired from bitcoind, especially if external indexes like this are successful. Core contributors in general are quite down on additional indexes due to complexity and interactions with consensus code.

unsystemizer · 2017-10-28T13:52:18Z

Just did a repeat of the same experiment somebody did with addrindex before:
a) Create txindex on testnet (mainnet is larger, so ...)
b) Use the max dbcache value (16GB) on bitcoind

root@indexd:~/.bitcoin/testnet3/blocks/index# du -sh
949M .
...
root@indexd:~/.bitcoin# tail -f /root/.bitcoin/testnet3/debug.log
...
2017-10-28 13:41:29 Cache configuration:
2017-10-28 13:41:29 * Using 1024.0MiB for block index database
2017-10-28 13:41:29 * Using 8.0MiB for chain state database
2017-10-28 13:41:29 * Using 15352.0MiB for in-memory UTXO set

Conclusions:

on mainnet, txindex likely cannot fit in dbcache (I haven't tried, but...)
on testnet, txindex can't fit in dbcache unless one wastefully allocates 15 GiB of RAM to in-mem UTXOs

It'd be valuable to be able to either load txindex in RAM (mainnet) or avoid wasting many GBs of RAM on caching in-memory UTXOs (testnet) in order to be able to fully cache txindex (edit: in indexd, of course)

KanoczTomas · 2017-10-29T14:45:04Z

@unsystemizer I have a feeling you are mixing up txindex and chainstate. The UTXO set in bitcoind is in the chainstate dir, while the txindex is in blocks/index. The chainstate is currently 2.8G for mainnet, a full index is 14G. edit: of course I could be wrong ... that is my understanding from a chat with the devs.

The dbcache switch is only used for the chainstate db in bitcoind. Just wanted to make sure you have the correct assumptions, not sure what you were trying to calculate. A 4G dbcache is effectively infinite space while syncing, as the UTXO set never reaches it (core devs use it as the max in benchmarks).

unsystemizer · 2017-10-31T08:44:46Z

> What would be the motivation of having at the same time pruning node and txindex?

@Runn1ng exactly that. It allows me to keep my indexes on a fast node (while running bitcoind on a slow node) and at the same time doesn't require bitcoind admin to worry about index maintenance. There are several other reasons, some of which I mentioned 2 comments above.
Regarding your point on txindex size from the other comment, I checked on both testnet and mainnet, and currently txindex occupies (approximately) 10% of block capacity (on a non-pruned node). If both addrindex and txindex are enabled (using Bitcoin Core with addrindex patch), they take (approximately) 25% of block data capacity (addrindex 15% and txindex 10%, roughly speaking).
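As a rough illustration of the ratios quoted above, the following sketch estimates index sizes from the size of the block data. The function name and the 100 GB input in the usage note are assumptions for illustration only; the 10% and 15% figures are the approximate values reported in this comment, not measurements of any particular node.

```python
# Approximate ratios reported in the comment above,
# expressed as fractions of the block data size.
TXINDEX_RATIO = 0.10
ADDRINDEX_RATIO = 0.15


def estimate_index_sizes(block_data_gb: float) -> dict:
    """Estimate index sizes (in GB) for a given amount of block data."""
    txindex = block_data_gb * TXINDEX_RATIO
    addrindex = block_data_gb * ADDRINDEX_RATIO
    return {
        "txindex_gb": txindex,
        "addrindex_gb": addrindex,
        "total_gb": txindex + addrindex,
    }
```

For a hypothetical 100 GB of block data, `estimate_index_sizes(100)` gives roughly 10 GB for txindex and 15 GB for addrindex, about 25 GB in total.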

dcousens · 2017-11-02T04:04:52Z

@unsystemizer the issue with a local transaction index is that we can't index into the .dat files themselves, as they may be in an indeterminate state (as bitcoind updates them).

We would have to maintain nearly the entire blockchain in our database, as the block headers only account for 80 bytes...

Hence why I suggest that sane users would want to prune... which could make any "catch up"/"resync" phase difficult if the data is missing.

I agree that you could use an external node for initial resync, then the local pruning node after that.

dcousens · 2017-11-02T04:11:39Z

Another alternative is if we could ask network peers for the blocks on the initial sync... then continue as normal with our pruned node.
We don't want to ask random peers directly, as we don't want indexd to have to verify consensus rules.

If the block was verified by bitcoind on the way... that'd be near perfect.

Maybe a new RPC call for pruned nodes?
fetchblock, with a condition the block has to be on the best chain.
For non-pruned nodes, it is an alias to getblock.

@theuni thoughts, could this be possible?

This would allow us to resync, using a pruned node, and therefore drop our dependency on -txindex by maintaining our own.

The option to fast synchronize via something like fast-dat-parser could still be done, as that is an offline-step to initialize the local database, and is more of an overall deployment consideration.
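The fetchblock idea above can be modelled as follows. This is only a sketch of the proposed semantics, not an existing bitcoind RPC: `fetch_block`, `best_chain`, `local_blocks` and `fetch_from_peer` are all hypothetical names, and in a real implementation the node (not indexd) would be responsible for validating whatever a peer returns.

```python
from typing import Callable, Dict, Optional, Set


def fetch_block(block_hash: str,
                best_chain: Set[str],
                local_blocks: Dict[str, bytes],
                fetch_from_peer: Callable[[str], Optional[bytes]]) -> bytes:
    """Model of the proposed (hypothetical) fetchblock call.

    - The block must be on the best chain, otherwise the call fails.
    - If the block is still on disk, this behaves like getblock.
    - If it was pruned, the node asks a peer for it instead.
    """
    if block_hash not in best_chain:
        raise LookupError("block is not on the best chain")

    block = local_blocks.get(block_hash)
    if block is not None:
        return block  # not pruned: same result as getblock

    block = fetch_from_peer(block_hash)  # pruned: go to the network
    if block is None:
        raise LookupError("no peer returned the block")
    return block
```

Because the best-chain check happens inside the node, indexd never has to verify consensus rules itself, which is the property the comment asks for.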

dcousens · 2017-11-02T06:43:12Z

In the meantime, with bitcoind running with -prune=1 (aka, manual RPC pruning only), indexd could use the pruneblockchain RPC command to signal where it is up to, and therefore when it is safe to prune.

This wouldn't help if the database is lost, but, it would stop there being too much data duplication.
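A minimal sketch of that signalling, assuming bitcoind runs with -prune=1 and that `rpc` is some callable which forwards a method name and arguments to the node's JSON-RPC interface. `signal_safe_prune`, `rpc` and the `keep_blocks` margin are hypothetical; only the pruneblockchain method name comes from bitcoind.

```python
from typing import Any, Callable, Optional


def signal_safe_prune(rpc: Callable[..., Any],
                      indexed_height: int,
                      keep_blocks: int = 288) -> Optional[Any]:
    """Ask the node to prune blocks that the index has already processed.

    `keep_blocks` is an illustrative safety margin below the indexed tip,
    so that a short reorg does not need data that was already pruned.
    Returns whatever the RPC returns, or None if nothing can be pruned yet.
    """
    target = indexed_height - keep_blocks
    if target <= 0:
        return None  # the index has not advanced far enough yet
    return rpc("pruneblockchain", target)
```

indexd would call this after each batch of blocks it has committed to its own database, so bitcoind never discards a block that the index has not seen.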

theuni · 2017-11-02T20:38:43Z

@dcousens If I'm understanding your question, I think bitcoin/bitcoin#10794 would do what you want?

dcousens · 2017-11-03T01:30:37Z

@theuni yes it would. Thanks for pointing that issue out.

dcousens added the feature label Mar 16, 2017
