Cannot Start HBase Master: SplitLogManager: Error Splitting

I could not start HBase from Cloudera Manager; the service reported errors. I was initially confused because the Master started fine while the RegionServers were stopped, but it went down as soon as I started a RegionServer. I traced the problem to an unexpected reboot of the node running the HBase Master. After the Master restarted, HBase could not continue reading from the transaction log (the write-ahead log) because it had become corrupt. I had to move the broken file out of the way before restarting the Master.

Digging through the Master log with sudo tail /var/log/hbase/hbase-cmf-hbase1-MASTER-ServerName.log.out, I found:

java.io.IOException: error or interrupted while splitting logs in [hdfs://ServerName:8020/hbase/.logs/ServerName,60020,1393982440484-splitting] Task = installed = 1 done = 0 error = 1

In that message, note the path that cannot be split:

hdfs://ServerName:8020/hbase/.logs/ServerName,60020,1393982440484-splitting
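
If the Master log is long, the path can be pulled out mechanically. This is a minimal sketch that assumes the message format shown above; extract_splitting_dirs is a hypothetical helper name of my own, not an HBase tool.

```shell
# Hypothetical helper: read Master log lines on stdin and print the HDFS path
# of every directory named in a "splitting logs in [...]" error message.
extract_splitting_dirs() {
  grep 'while splitting logs in \[' \
    | sed -n 's/.*splitting logs in \[\([^]]*\)].*/\1/p' \
    | sed 's|^hdfs://[^/]*||' \
    | sort -u
}

# Example usage (log path as in this post):
#   sudo cat /var/log/hbase/hbase-cmf-hbase1-MASTER-ServerName.log.out | extract_splitting_dirs
```

The second sed strips the hdfs://host:port prefix so the result can be passed straight to hadoop fs commands.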

Then search HDFS for it:

sudo -u hdfs hadoop fs -ls /hbase/.logs

Note that the file is 0 KB. Next, move the offending file:

sudo -u hdfs hadoop fs -mv /hbase/.logs/ServerName,60020,1393982440484-splitting /tmp/ServerName,60020,1393982440484-splitting.old
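
The move above can be wrapped so that it first checks the path exists and always keeps a copy under /tmp. This is a rough sketch only; hdfs_cmd and quarantine_splitting_dir are hypothetical names I made up, and the /tmp destination simply mirrors the command above.

```shell
# Thin wrapper around the HDFS shell, run as the hdfs user as in this post.
hdfs_cmd() { sudo -u hdfs hadoop fs "$@"; }

# Hypothetical helper: move a -splitting path to /tmp/<name>.old instead of
# deleting it, and print the new location on success.
quarantine_splitting_dir() {
  src="$1"
  dest="/tmp/$(basename "$src").old"
  # "hadoop fs -test -e" exits 0 only if the path exists in HDFS.
  if ! hdfs_cmd -test -e "$src"; then
    echo "not found in HDFS: $src" >&2
    return 1
  fi
  hdfs_cmd -mv "$src" "$dest" && echo "$dest"
}

# Example usage:
#   quarantine_splitting_dir "/hbase/.logs/ServerName,60020,1393982440484-splitting"
```

Moving rather than deleting keeps the log available in case it needs to be inspected or replayed later.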

Restart the HBase Master service. The moved splitting log can be replayed later to recover any lost data, but I did not look into that because there was no data to recover.

Note: hbck is a handy HBase command that checks the cluster for inconsistencies; with -fix it also attempts to repair them:

sudo -u hbase hbase hbck -fix
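
Since -fix changes cluster state, it may be worth running a read-only check first and repairing only if something is reported. This sketch assumes that hbck exits with a non-zero status when it finds inconsistencies, which I have not verified for every HBase version; hbck_cmd and check_then_fix are hypothetical names.

```shell
# Thin wrapper around hbck, run as the hbase user as in this post.
hbck_cmd() { sudo -u hbase hbase hbck "$@"; }

# Hypothetical helper: run a read-only hbck first and only attempt -fix when
# the check fails. ASSUMPTION: a non-zero exit status means inconsistencies.
check_then_fix() {
  if hbck_cmd; then
    echo "hbck reports no inconsistencies"
  else
    echo "hbck reported problems, attempting repair"
    hbck_cmd -fix
  fi
}

# Example usage:
#   check_then_fix
```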
