How to join and view data from SQL and HDFS through Drill

Configure Drill to query SQL Server

Microsoft SQL Server can be reached by a jdbc driver they publish. The 4.0 version is compatible with jdk 7, while the 4.2 is compatible with jdk 8: sqljdbc4.jar.

Note: the jdbc drive needs a username/password so you have to turn on SQL Authentication on the SQL server and use local accounts – this is not a common SQL deployment.

Add the new mssql Storage to Drill:

1. Click Storage.

2. Type mssql under new Storage, click Create

3. Update the configuration with the following:

{
“type”: “jdbc”,
“driver”: “com.microsoft.sqlserver.jdbc.SQLServerDriver”,
“url”: “jdbc:sqlserver://servername:1433;databaseName=Test”,
“username”: “local_drill”,
“password”: “********”,
“enabled”: true
}

Query SQL Server

Example:   SELECT TableName, `Count`
FROM `mssql`.`dbo`.`rc`

Join and view data from SQL and Hadoop

Example:   SELECT distinct hd1.columns[0] as CountDate ,ms1.TableName,ms1.`Count`
FROM `hdfs`.`root`.`Test-1.csv` as hd1
join
`mssql`.`dbo`.`rc` as ms1
on hd1.columns[1] = ms1.TableName
where ms1.`count` = 1

Leave a Reply