This is a short snippet from a recent XRP Ninja post. We encourage you to check out r0bertz’s blog for more articles like this one.
There have been some problems with the validator ever since it crashed about a week ago. It no longer crashes now that I have added much more disk space, but it easily falls behind and needs several hours to catch up after each restart. I learned this from the state_accounting field in the output of “rippled server_info”. I created a metric and a chart from it. This is what the chart looks like for the last week.
The “full” mode state means it is fully synced and participating in the consensus process. You can see it is now asymptotically approaching 100. The value is the percentage of uptime the validator has spent in each mode.
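For readers who want to build a similar metric, here is a minimal sketch of how the percentage could be computed. It assumes the state_accounting object maps each mode name to an entry with a duration_us field (reported as a string of microseconds); the function name and the sample numbers are made up for illustration and are not taken from my validator.

```python
def mode_percentages(state_accounting):
    """Return the share of uptime, in percent, spent in each server mode.

    `state_accounting` is assumed to look like the object of the same name
    in the server_info output: {mode: {"duration_us": "...", ...}, ...}.
    """
    # duration_us is assumed to arrive as a string, so convert it to int.
    durations = {
        mode: int(info["duration_us"])
        for mode, info in state_accounting.items()
    }
    total = sum(durations.values())
    if total == 0:
        # No uptime recorded yet; avoid dividing by zero.
        return {mode: 0.0 for mode in durations}
    return {mode: 100.0 * d / total for mode, d in durations.items()}


# Hypothetical sample data, only in the shape described above.
sample = {
    "disconnected": {"duration_us": "1000000", "transitions": 1},
    "connected": {"duration_us": "4000000", "transitions": 1},
    "syncing": {"duration_us": "5000000", "transitions": 1},
    "tracking": {"duration_us": "0", "transitions": 1},
    "full": {"duration_us": "90000000", "transitions": 1},
}

if __name__ == "__main__":
    for mode, pct in mode_percentages(sample).items():
        print(f"{mode}: {pct:.1f}%")
```

Because the durations are counted since the server started, the “full” percentage can only creep toward 100 after a restart, which is why the chart approaches it slowly.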
I figured out why the disk was out of space. It was because the validator fell behind. When that happens, online delete is disabled. See this code:
I saw some “Not deleting” messages in the log which led me to the above code.
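If you want to check your own log for the same symptom, a rough sketch like the following could count those messages. It only looks for the literal text “Not deleting” that I saw; the function name and the log path are assumptions, so adjust them to your setup.

```python
def count_not_deleting(lines):
    """Count log lines that contain the "Not deleting" message."""
    return sum(1 for line in lines if "Not deleting" in line)


if __name__ == "__main__":
    # Hypothetical log location; use whatever your config points to.
    log_path = "/var/log/rippled/debug.log"
    with open(log_path, errors="replace") as log:
        print(count_not_deleting(log))
```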
I still can’t determine the exact reason why it fell behind. I suspect it was caused by some “insane” testnet peer, so I blocked some peers with iptables. But now it is in full mode almost 100% of the time and it still has an insane testnet peer.
However, I do know why it didn’t recover after the disk was enlarged. I believe it’s because I changed the db from RocksDB to NuDB and still had online_delete set in the config. I made the change because I learned NuDB works better on SSDs. But it doesn’t support update or delete; it is meant to be used in a full-history validator. It appears that when it tries to do online_delete, it starts to fall behind. Now I have switched back to RocksDB and everything is fine.
Please visit the XRP Ninja for more information.