Developer's Closet
A place where I can put my PHP, SQL, Perl, JavaScript, and VBScript code.

18Jul/14
Sqoop Server Startup Failure: Upgrade required but not allowed

After an upgrade from CDH 5.0.2 to CDH 5.0.3, Sqoop failed to start with the following error: Server startup failure, Connector registration failed, Upgrade required but not allowed - Connector: generic-jdbc-connector.

To resolve this problem I had to add the following property to the Sqoop 2 Server Advanced Configuration Snippet (Safety Valve) for sqoop.properties. You can find this property under Cloudera Manager, Sqoop Service, Configuration, Sqoop 2 Server Default Group, and Advanced:

org.apache.sqoop.connector.autoupgrade=true

After the upgrade has completed successfully, the property can be removed.

Log File: /var/log/sqoop2/sqoop-cmf-sqoop-SQOOP_SERVER-servername01.ds-iq.corp.log.out

Server startup failure
org.apache.sqoop.common.SqoopException: CONN_0007:Connector registration failed
at org.apache.sqoop.connector.ConnectorManager.registerConnectors(ConnectorManager.java:236)
at org.apache.sqoop.connector.ConnectorManager.initialize(ConnectorManager.java:197)
at org.apache.sqoop.connector.ConnectorManager.initialize(ConnectorManager.java:145)
at org.apache.sqoop.core.SqoopServer.initialize(SqoopServer.java:51)
at org.apache.sqoop.server.ServerInitializer.contextInitialized(ServerInitializer.java:38)
at org.apache.catalina.core.StandardContext.listenerStart(StandardContext.java:4206)
at org.apache.catalina.core.StandardContext.start(StandardContext.java:4705)
at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:799)
at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:779)
at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:601)
at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:943)
at org.apache.catalina.startup.HostConfig.deployWARs(HostConfig.java:778)
at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:504)
at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1317)
at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:324)
at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:142)
at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1065)
at org.apache.catalina.core.StandardHost.start(StandardHost.java:840)
at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1057)
at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:463)
at org.apache.catalina.core.StandardService.start(StandardService.java:525)
at org.apache.catalina.core.StandardServer.start(StandardServer.java:754)
at org.apache.catalina.startup.Catalina.start(Catalina.java:595)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:289)
at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:414)
Caused by: org.apache.sqoop.common.SqoopException: JDBCREPO_0026:Upgrade required but not allowed - Connector: generic-jdbc-connector
at org.apache.sqoop.repository.JdbcRepository$3.doIt(JdbcRepository.java:190)
at org.apache.sqoop.repository.JdbcRepository.doWithConnection(JdbcRepository.java:90)
at org.apache.sqoop.repository.JdbcRepository.doWithConnection(JdbcRepository.java:61)
at org.apache.sqoop.repository.JdbcRepository.registerConnector(JdbcRepository.java:156)
at org.apache.sqoop.connector.ConnectorManager.registerConnectors(ConnectorManager.java:216)
... 28 more
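
After restarting the Sqoop 2 server with the property in place, a quick sanity check is to grep the same log for the connector registration (log path as above; adjust the host name for your environment):

grep -i "generic-jdbc-connector" /var/log/sqoop2/sqoop-cmf-sqoop-SQOOP_SERVER-servername01.ds-iq.corp.log.out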

1Jul/14
HBase All Regions in Transition: state=FAILED_OPEN

After I added a jar file to the HBase Master, I had a problem where regions failed to transition to a RegionServer. Below are the errors; removing the jar file from the hbase/lib folder resolved the problem (full path to the lib folder: /opt/cloudera/parcels/CDH-5.0.2-1.cdh5.0.2.p0.13/lib/hbase/lib/). There is a cleanup sketch after the log excerpt below. What tipped me off was the missing class definition: Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/ipc/CoprocessorProtocol.

Failed open of region=REGION.NAME,,4194066667839.6ea7d7ff9276f9c0e9b126c73e25bc54., starting to roll back the global memstore size.
java.lang.IllegalStateException: Could not instantiate a region instance.
at org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:3970)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4276)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4249)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4205)
at org.apache.hadoop.hbase.regionserver.HRegion.openHRegion(HRegion.java:4156)
at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.openRegion(OpenRegionHandler.java:475)
at org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler.process(OpenRegionHandler.java:140)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.GeneratedConstructorAccessor7.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at org.apache.hadoop.hbase.regionserver.HRegion.newHRegion(HRegion.java:3967)
... 10 more
Caused by: java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/ipc/CoprocessorProtocol
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
...

9:10:19.721 AM INFO org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler
Opening of region {ENCODED => 6ea7d7ff9276f9c0e9b126c73e25bc54, NAME => 'REGION.NAME,,4194066667839.6ea7d7ff9276f9c0e9b126c73e25bc54.', STARTKEY => '', ENDKEY => ''} failed, transitioning from OPENING to FAILED_OPEN in ZK, expecting version 28
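
For reference, the fix was simply removing the jar I had added from that lib folder on each affected host and then restarting the HBase roles (from Cloudera Manager in my case). Roughly, with the jar name below as a placeholder for whatever you dropped in:

cd /opt/cloudera/parcels/CDH-5.0.2-1.cdh5.0.2.p0.13/lib/hbase/lib/
sudo rm custom-coprocessor.jar   # placeholder name: remove the jar you added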

30Jun/14
Cron file failed to load: (username~) ORPHAN (no passwd entry)

This problem bothers me a little. The authentication server failed during a cron job that referenced a specific account. Ubuntu could not authenticate the account, and id username failed. As a result, the crontab under /var/spool/cron/crontabs/ was flagged as an orphan, and any time I tried to edit the cron file under /etc/cron.d/username-cron-file, the reload would fail:

cron[17959]: (*system*username-cron-file) RELOAD (/etc/cron.d/username-cron-file)
cron[17959]: Error: bad username; while reading /etc/cron.d/username-cron-file
cron[17959]: (*system*username-cron-file) ERROR (Syntax error, this crontab file will be ignored)
cron[17959]: (username~) ORPHAN (no passwd entry)

I deleted the spool entry and was able to recreate the cron file.
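
For reference, the cleanup looked roughly like this once the account resolved again (username and the cron file name are placeholders; adjust to match your setup):

# Confirm the account resolves before touching anything
id username
# Remove the orphaned per-user crontab from the spool
sudo rm /var/spool/cron/crontabs/username
# Touch the system cron file so cron re-reads it, then restart cron
sudo touch /etc/cron.d/username-cron-file
sudo service cron restart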

 

Filed under: Linux, Ubuntu
24Jun/14
Cloudera Manager HBase Check HFile Version

An under-documented feature in Cloudera Manager is the HBase Check HFile Version action. When upgrading from CDH 4.7 to CDH 5.0, I ran across instructions stating that the HBase upgrade will not succeed if any version 1 HFiles are present. Run "Check HFile Version" from the HBase service Actions menu to ensure that HBase is ready for the upgrade.

The check scans for v1 HFiles. Once the command reports "Process (###) has reached expected state", look for the message that no files with v1 were found and HBase can be upgraded. Otherwise, the affected HBase regions will have to be compacted first; I'll post more on that later.

In the command's stderr output, look for the following:

INFO util.HFileV1Detector: Count of HFileV1: 0
INFO util.HFileV1Detector: Count of corrupted files: 0
INFO util.HFileV1Detector: Count of Regions with HFileV1: 0
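
If you would rather run the equivalent check from a shell, my understanding is that the same v1 detector can be invoked through the hbase upgrade tool (this is an assumption about what the Cloudera Manager action wraps; verify against the upgrade docs for your CDH version):

sudo -u hdfs hbase upgrade -check   # look for "Count of HFileV1: 0" in the output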
12Jun/14
Cannot Start HBase Master: SplitLogManager: Error Splitting

I could not start HBase within Cloudera Manager; the service reported errors. I was initially confused because I could start the Master when the RegionServers were stopped, but as soon as I started a RegionServer, the Master went down. I tracked this problem down to an unexpected server reboot of the node running the HBase Master. After the Master restarted, HBase was not able to continue reading from the transaction log because the log had become unusable (corrupt). I had to move the broken file out of the way before restarting the Master.

Digging through the logs: sudo tail /var/log/hbase/hbase-cmf-hbase1-MASTER-ServerName.log.out, I discovered:

java.io.IOException: error or interrupted while splitting logs in [hdfs://ServerName:8020/hbase/.logs/ServerName,60020,1393982440484-splitting] Task = installed = 1 done = 0 error = 1

In the log file, look for the file that cannot be split:

hdfs://ServerName:8020/hbase/.logs/ServerName,60020,1393982440484-splitting

Then search HDFS for the file:

sudo -u hdfs hadoop fs -ls /hbase/.logs

Note that the file is 0 KB. Next, move the offending file:

sudo -u hdfs hadoop fs -mv /hbase/.logs/ServerName,60020,1393982440484-splitting /tmp/ServerName,60020,1393982440484-splitting.old

Restart the HBase Master service. The splitting log file can be replayed to recover any lost data, but I didn't look into that because there was no data to recover.

Note: Here is a fantastic HBase command to identify and fix any problems with HBase:

sudo -u hbase hbase hbck -fix
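
Tip: running hbck without -fix first produces a read-only consistency report, which is a safer way to see what (if anything) it would change:

sudo -u hbase hbase hbck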

7Jun/14
SQL Query for Recently Changed Objects

These queries may be useful for determining when objects were last changed. From this information we might infer when the last formal deployment of a database took place.

 

-- get most recently changed object of each type
SELECT  o.name
       ,SCHEMA_NAME(o.schema_id) AS 'SchemaName'
       ,COALESCE(OBJECT_NAME(o.parent_object_id), '') AS 'ParentName'
       ,o.type
       ,o.type_desc
       ,o.create_date
       ,o.modify_date
FROM    sys.objects o
JOIN    (SELECT    type
                  ,'LateDate' = MAX(modify_date)
         FROM      sys.objects
         WHERE     is_ms_shipped = 0
         AND       OBJECT_SCHEMA_NAME(object_id) <> 'tSQLt'
         GROUP BY  type
        ) xo
        ON  o.type = xo.type
        AND o.modify_date = xo.LateDate
ORDER BY o.modify_date DESC
        ,o.type
        ,o.name

-- get 20 most recently changed objects
SELECT TOP 20
        o.name
       ,SCHEMA_NAME(o.schema_id) AS 'SchemaName'
       ,COALESCE(OBJECT_NAME(o.parent_object_id), '') AS 'ParentName'
       ,o.type
       ,o.type_desc
       ,o.create_date
       ,o.modify_date
FROM    sys.objects o
WHERE   o.is_ms_shipped = 0
AND     o.parent_object_id = 0
--AND   o.type = 'U'
ORDER BY o.modify_date DESC

-- get 20 most recently changed objects that are not tSQLt or tSQLt tests
SELECT TOP 20
        SCHEMA_NAME(o.schema_id) AS 'SchemaName'
       ,o.name
       ,COALESCE(OBJECT_NAME(o.parent_object_id), '') AS 'ParentName'
       ,o.type
       ,o.type_desc
       ,o.create_date
       ,o.modify_date
FROM    sys.objects o
WHERE   o.is_ms_shipped = 0
AND     o.parent_object_id = 0
--AND   o.type = 'U'
AND     OBJECT_SCHEMA_NAME(o.object_id) NOT LIKE 'Z%'
AND     OBJECT_SCHEMA_NAME(o.object_id) <> 'tSQLt'
ORDER BY o.modify_date DESC
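
Along the same lines, a date-window variant is handy when you only care about a specific period. This is a quick sketch (the 7-day window is an arbitrary example):

-- get objects changed in the last 7 days
SELECT  SCHEMA_NAME(o.schema_id) AS 'SchemaName'
       ,o.name
       ,o.type_desc
       ,o.modify_date
FROM    sys.objects o
WHERE   o.is_ms_shipped = 0
AND     o.modify_date >= DATEADD(dd, -7, GETDATE())
ORDER BY o.modify_date DESC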

 

Filed under: SQL, Windows
30May/14
Fix an SQL Database Marked Suspect

If you have a database marked suspect, and do not have a backup to restore from, you can attempt to repair the database. This may lead to data loss and is most certainly not the best option (see: database backup).

Make sure there are no open connections to the database - stop all services that might attempt a connection. Run sp_who2 to check on open connections.
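
It can also help to confirm what state SQL Server currently reports for the database before and after each step. A simple check (DatabaseName is a placeholder):

SELECT name, state_desc, user_access_desc
FROM sys.databases
WHERE name = N'DatabaseName';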

Run the commands one at a time and fix any errors as they appear:

EXEC sp_resetstatus 'DatabaseName';
ALTER DATABASE DatabaseName SET EMERGENCY;
DBCC CHECKDB ('DatabaseName');
ALTER DATABASE DatabaseName SET SINGLE_USER WITH ROLLBACK IMMEDIATE;
DBCC CHECKDB ('DatabaseName', REPAIR_ALLOW_DATA_LOSS);
ALTER DATABASE DatabaseName SET MULTI_USER;

Restoring from a backup is the better way to fix this issue.

Filed under: SQL, Windows
23May/14
Configure Solr Using Cloudera Manager

Solr, also known as Cloudera Search within Cloudera Manager, is a distributed service for indexing and searching data stored in HDFS.

Add the Solr Service

Using Cloudera Manager, add a Solr Server to a host that is not hosting ZooKeeper or Oozie. Solr will take a lot of processing power and memory. You can co-locate a Cloudera Search server (solr-server package) with a MapReduce TaskTracker (MRv1) and an HDFS DataNode. When co-locating with MapReduce TaskTrackers, be sure that the resources of the machine are not oversubscribed. It's safest to start with a small number of MapReduce slots and increase them gradually.

Here is Cloudera’s current Solr guide: http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/Search/Cloudera-Search-User-Guide/Cloudera-Search-User-Guide.html

Creating Your First Solr Collection

To use Solr for the first time, you will have to create collections. Here is how: http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/Search/Cloudera-Search-Installation-Guide/csig_deploy_search_solrcloud.html (look under the heading Creating Your First Solr Collection).
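
In short, the collection's configuration (an instancedir) has to be generated and uploaded to ZooKeeper before the collection can be created. As I understand the guide, the steps look roughly like this (the solr_configs path and the collection1 name are just examples):

# Generate a template configuration directory on the local filesystem
solrctl instancedir --generate $HOME/solr_configs
# Edit the files under $HOME/solr_configs/conf (schema.xml in particular), then upload to ZooKeeper
solrctl instancedir --create collection1 $HOME/solr_configs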

By default, the Solr server comes up with no collections. Make sure that you create your first collection using the instancedir that you provided to Solr in previous steps by using the same collection name. (numOfShards is the number of SolrCloud shards you want to partition the collection across. The number of shards cannot exceed the total number of Solr servers in your SolrCloud cluster):

solrctl collection --create collection1 -s {{numOfShards}}

You should be able to check that the collection is active. For example, for the server myhost.example.com, you should be able to navigate to http://myhost.example.com:8983/solr/collection1/select?q=*%3A*&wt=json&indent=true and verify that the collection is active. Similarly, you should also be able to observe the topology of your SolrCloud using a URL similar to: http://myhost.example.com:8983/solr/#/~cloud

You will then be able to create a new core.

Creating a New Core

In Cloudera Manager, browse to the Solr Service. On the right of the menu, click on Solr Web UI. On the lower left menu, click on Add a New Core. Enter the collection created above, give the core a name, and submit.

22May/14
Purge Backup History from SQL msdb Database

I just about ran out of space on my Data partition and discovered that the msdb database was over 114 GB! In particular, the backupfile table had over 107,000,000 rows, which took up over 70 GB. Other backup-related tables took up the rest of the space. This is a real problem. The normal command I would run to purge backup history did not work because it runs in a transaction and would fill up the transaction log, so I had to run the stored procedure in a loop, one day at a time:
SET NOCOUNT ON
DECLARE @purge_date DATETIME
DECLARE @cutoff_date DATETIME

-- Set purge cutoff to n days back
set @cutoff_date = DATEADD(dd,DATEDIFF(dd,0,GETDATE())-180,0)

PRINT 'Start Date: ' +
CONVERT(VARCHAR(10),GETDATE(),121)
PRINT 'Purge backup history before Cutoff Date: ' +
CONVERT(VARCHAR(10),@cutoff_date,121)

WHILE 1 = 1
BEGIN

SET @purge_date = null

-- Find date of oldest backup set
SELECT @purge_date = DATEADD(dd,DATEDIFF(dd,0,MIN(backup_finish_date))+1,0) FROM msdb.dbo.backupset(NOLOCK)
WHERE backup_finish_date <= @cutoff_date

IF @purge_date is null or @purge_date > @cutoff_date
BEGIN
PRINT 'Purge backup history complete through: '+
CONVERT(VARCHAR(10),@cutoff_date ,121)
BREAK
END

PRINT CHAR(10)+CHAR(13) +
'Purging backup history before: ' +
CONVERT(VARCHAR(10),@purge_date,121) +CHAR(10)+CHAR(13)

SELECT [Backup Sets to be Deleted Count ] = COUNT(*) FROM msdb.dbo.backupset(NOLOCK)
WHERE backup_finish_date < @purge_date

EXEC msdb.dbo.sp_delete_backuphistory @purge_date

END

PRINT 'End Date: ' +
CONVERT(VARCHAR(10),GETDATE(),121)
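
To see which msdb tables are actually eating the space before (and after) the purge, sp_spaceused per table is a quick check:

USE msdb
EXEC sp_spaceused 'dbo.backupset'
EXEC sp_spaceused 'dbo.backupfile'
EXEC sp_spaceused 'dbo.backupmediafamily'
EXEC sp_spaceused 'dbo.backupmediaset'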

 

Filed under: SQL, Windows
1Apr/14
Clear Orchestrator Logs from Runbooks by Folder Name

I found it useful to clear Orchestrator Runbook logs by folder. This allowed me to clear smaller chunks of logs from Orchestrator, where in the past I was clearing the entire database, which was a very slow process. A small improvement, but it helps.

-- This script will remove logs from a folder in Orchestrator
-- enter a folder name into the folder variable

-- Session settings
set QUOTED_IDENTIFIER Off
SET NOCOUNT ON

-- Variables
declare @strsql varchar(1000),
  @ThePolicyId varchar(50),
  @USRCOUNT int,
  @PRINTUSRCOUNT VARCHAR (255),
  @SqlScript varchar(255),
  @PolicyId varchar(50),
  @FolderName VARCHAR(100),
  @NumberOfDaysToKeep INT

-- Folder name
SET @FolderName = 'Folder Name'
SET @NumberOfDaysToKeep = 180

-- Query for UniqueId number of all Runbooks in a folder
use Orchestrator
set @strsql = 'select p.UniqueID from [dbo].[POLICIES] p 
INNER JOIN [dbo].[FOLDERS] f ON p.ParentID = f.UniqueID
WHERE f.[Name] = ''' + @FolderName + '''
AND p.Deleted = 0
ORDER BY f.name, p.name'

IF OBJECT_ID('tempdb..#toRun') IS NOT NULL DROP TABLE #toRun

create table #toRun
(
  PolicyId uniqueidentifier
)

-- Insert into temp table
INSERT INTO #toRun EXEC(@strsql)

-- Comment out the query below to hide connections
--select * from #toRun

SELECT @USRCOUNT = (SELECT CONVERT(CHAR, COUNT(1)) FROM #toRun)
SET @PRINTUSRCOUNT = 'Attempting to run ' + CAST(@USRCOUNT AS VARCHAR(36)) + ' Runbooks' 
PRINT @PRINTUSRCOUNT 

-- Pause script for 1 second
WAITFOR DELAY '00:00:01' 

-- Begin cursor 
DECLARE CUSRRUN SCROLL CURSOR FOR 
SELECT PolicyId FROM #toRun 

OPEN CUSRRUN

FETCH FIRST FROM CUSRRUN INTO @PolicyId 
SELECT @SqlScript = 'DECLARE @Completed bit 
	SET @Completed = 0 
	WHILE @Completed = 0 EXEC sp_CustomLogCleanup 
	@Completed OUTPUT, 
	@FilterType=2,
	@YDays=''' + CAST(@NumberOfDaysToKeep AS VARCHAR) + ''',
	@PolicyID=''' + CAST(@PolicyId AS VARCHAR(36)) + ''''
PRINT 'Runbook: ' + CAST(@PolicyId AS VARCHAR(36))
PRINT @SqlScript 
EXEC(@SqlScript) 

WHILE @@FETCH_STATUS = 0 
BEGIN 
  FETCH NEXT FROM CUSRRUN into @PolicyId 
  WHILE @@FETCH_STATUS = 0 
  BEGIN 

    SELECT @SqlScript = 'DECLARE @Completed bit 
		SET @Completed = 0 
		WHILE @Completed = 0 EXEC sp_CustomLogCleanup 
		@Completed OUTPUT, 
		@FilterType=2,
		@YDays=''' + CAST(@NumberOfDaysToKeep AS VARCHAR) + ''',
		@PolicyID=''' + CAST(@PolicyId AS VARCHAR(36)) + ''''
    PRINT 'Runbook: ' + CAST(@PolicyId AS VARCHAR(36))
    PRINT @SqlScript 
    EXECUTE(@SqlScript) 

    FETCH NEXT FROM CUSRRUN INTO @PolicyId 

  END 
END 
CLOSE CUSRRUN 
DEALLOCATE CUSRRUN 

--End cursor
PRINT 'Finished running'
DROP TABLE #toRun

--select TOP 10 * from [dbo].[POLICIES] p 

--select TOP 10 f.Name,* from [dbo].[POLICIES] p 
--INNER JOIN [dbo].[FOLDERS] f ON p.ParentID = f.UniqueID
--WHERE f.[Name] = 'Cisco'
--AND p.Deleted = 0
--ORDER BY f.name, p.name

--select * from [dbo].[FOLDERS] f WHERE f.[Name] = 'Cisco' ORDER BY f.name

Notes:

When the sp_CustomLogCleanup stored procedure is executed, it identifies 200 policy instances to purge and returns either a 0 or a 1 upon completion. It identifies those 200 policy instances by executing one of the three additional stored procedures below, based on the log purging option selected:

  • sp_GetLogEntriesForDelete_FilterByEntries
  • sp_GetLogEntriesForDelete_FilterByDays
  • sp_GetLogEntriesForDelete_FilterByEntriesAndDays

The log purge utility has a hard-coded time out value for the execution of the stored procedure. While not recommended, you can change this value within the stored procedure to accommodate extremely large log counts.

If you choose to truncate instead, truncating the OBJECTINSTANCEDATA and OBJECTINSTANCES tables should be enough, as these two tables contain the bulk of the log data. You may not be able to truncate POLICYINSTANCES due to foreign key constraints.
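
For reference, the truncate route looks roughly like this (a sketch only; take a backup of the Orchestrator database first, and expect TRUNCATE to fail on any table referenced by a foreign key):

USE Orchestrator
TRUNCATE TABLE dbo.OBJECTINSTANCEDATA
TRUNCATE TABLE dbo.OBJECTINSTANCES
-- If TRUNCATE is blocked by foreign keys, fall back to a batched DELETE (slower and fully logged)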