{"id":1083,"date":"2016-05-03T10:36:33","date_gmt":"2016-05-03T18:36:33","guid":{"rendered":"http:\/\/www.developerscloset.com\/?page_id=1083"},"modified":"2018-05-04T11:56:45","modified_gmt":"2018-05-04T19:56:45","slug":"drill","status":"publish","type":"page","link":"https:\/\/www.developerscloset.com\/?page_id=1083","title":{"rendered":"Drill"},"content":{"rendered":"<p><a href=\"http:\/\/www.developerscloset.com\/wp-content\/uploads\/2018\/05\/drill.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"size-medium wp-image-1084 alignnone\" src=\"http:\/\/www.developerscloset.com\/wp-content\/uploads\/2018\/05\/drill-300x200.jpg\" alt=\"\" width=\"300\" height=\"200\" \/><\/a><\/p>\n<p>Apache Drill is an open-source software framework that supports data-intensive distributed applications for interactive analysis of large-scale datasets. Drill is the open source version of Google&#8217;s Dremel system which is available as an infrastructure service called Google BigQuery. One explicitly stated design goal is that Drill is able to scale to 10,000 servers or more and to be able to process petabytes of data and trillions of records in seconds. Drill is an Apache top-level project.<\/p>\n<p>Drill supports a variety of NoSQL databases and file systems, including HBase, MongoDB, MapR-DB, HDFS, MapR-FS, Amazon S3, Azure Blob Storage, Google Cloud Storage, Swift, NAS and local files. Using jdbc Drill is able to query SQL, Oracle, etc. A single query can join data from multiple datastores. 
For example, you can join a user profile collection in MongoDB with a directory of event logs in HDFS.<\/p>\n<h1><span class=\"ez-toc-section\" id=\"How_to_Use_Drill\"><\/span>How to Use Drill<span class=\"ez-toc-section-end\"><\/span><\/h1>\n<h2><span class=\"ez-toc-section\" id=\"WebUI\"><\/span>WebUI<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Query Drill through the WebUI at http:\/\/<em>servername<\/em>:8047<\/p>\n<h2><span class=\"ez-toc-section\" id=\"SQLLine\"><\/span>SQLLine<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>SQLLine comes with the Drill distribution. To start using Drill, go to the directory into which you extracted the tar file and run SQLLine:<br \/>\ncd \/opt\/apache-drill\/bin\/<\/p>\n<p>.\/sqlline -u jdbc:drill:zk=local<\/p>\n<p>You can now execute SQL queries to Drill. 
To exit from SQLLine, run the command:<\/p>\n<p>!quit<\/p>\n<p>You can view the available schemas using this query:<\/p>\n<p>SELECT * FROM INFORMATION_SCHEMA.SCHEMATA;<\/p>\n<p>Show databases:<\/p>\n<p>show databases;<\/p>\n<p>Show schemas:<\/p>\n<p>show schemas;<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Create_a_View\"><\/span>Create a View<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Create a view in Drill:<\/p>\n<p>CREATE VIEW `hdfs`.`root`.`lib\/drill\/views\/vwMyView` AS SELECT<br \/>\ncolumns[1] `zip_code`<br \/>\n,columns[2] `zip4`<br \/>\nFROM `hdfs`.`root`.`path\/to\/file.txt`<\/p>\n<p>Update the view:<\/p>\n<p>CREATE OR REPLACE VIEW `hdfs`.`root`.`lib\/drill\/views\/vwMyView` AS SELECT<br \/>\ncolumns[1] `zip_code`<br \/>\n,columns[2] `zip4`<br \/>\n,columns[3] `address1`<br \/>\nFROM `hdfs`.`root`.`path\/to\/file.txt`<\/p>\n<p>Delete the view:<\/p>\n<p>DROP VIEW `hdfs`.`root`.`lib\/drill\/views\/vwMyView`<\/p>\n<p>Query a view:<\/p>\n<p>SELECT * FROM `views`.`views`.`vwPlaylogs` LIMIT 100<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Nested_Data_Functions\"><\/span>Nested Data Functions<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>FLATTEN &#8211; separates the elements in a repeated field into individual records.<\/p>\n<p>SELECT h.foo.id, h.foo.name<br \/>\nFROM (<br \/>\nSELECT<br \/>\nFLATTEN(r.holidays) foo<br \/>\nFROM `hdfs`.`root`.`asset\/reference\/timeanddate\/sample` r<br \/>\n) h<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Query_dfs\"><\/span>Query dfs<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>dfs represents the local Linux file system.<\/p>\n<p>To query the file system, try this:<\/p>\n<p>SELECT * FROM dfs.root.`\/web\/logs`;<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Query_HBase\"><\/span>Query HBase<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Some observations:<\/p>\n<p>&#8220;LIKE&#8221; does not work well on character lists, so use &#8220;SIMILAR TO&#8221; 
(example: \u2026WHERE fieldX SIMILAR TO 'abc%'). For example, `hbase`.`TEST.TABLE1` has columns &#8220;row_key; S; SG&#8221;. In order to query a parameter (sub-field) in SG, you have to include \u201c, *\u201d in the SELECT. Without \u201c, *\u201d you will not get any results back for a specific parameter in column &#8220;SG&#8221;. This does not occur for column &#8220;S&#8221;.<\/p>\n<p>HBase data is stored as UTF-8; make sure you convert from UTF8 when you return your results:<\/p>\n<p>select CONVERT_FROM(row_key, 'UTF8') FROM hbase.`TEST.TABLE1` limit 10;<\/p>\n<p>Query all &#8216;US&#8217; &#8216;events&#8217; from the TABLE1 table in HBase:<\/p>\n<p>SELECT<br \/>\nCONVERT_FROM(TABLE1.S.TITLE, 'UTF8') AS Title<br \/>\n,CONVERT_FROM(TABLE1.S.A1, 'UTF8') AS Address<br \/>\n,CONVERT_FROM(TABLE1.S.CI, 'UTF8') AS City<br \/>\n,CONVERT_FROM(TABLE1.S.CNY, 'UTF8') AS Country<br \/>\n,CONVERT_FROM(TABLE1.S.DATETIME_LOCAL, 'UTF8') AS LocalDateTime<br \/>\n,CONVERT_FROM(TABLE1.S.VENUE, 'UTF8') AS Venue<br \/>\n,CONVERT_FROM(TABLE1.S.ZIP, 'UTF8') AS Zip<br \/>\n,CONVERT_FROM(TABLE1.SG.TYPE, 'UTF8') AS `Type`<br \/>\nFROM `hbase`.`TEST.TABLE1` TABLE1<br \/>\nWHERE CONVERT_FROM(TABLE1.SG.TYPE, 'UTF8') like '%events%'<br \/>\nAND CONVERT_FROM(TABLE1.S.CNY, 'UTF8') like 'US'<\/p>\n<p><strong>Count Rows:<\/strong><\/p>\n<p>Do NOT use select count(*) to count the number of rows; Drill will return a number MUCH lower than the actual number of rows. 
Instead, count on a specific column:<\/p>\n<p>select COUNT(q.S.TITLE) from `hbase`.`TEST.TABLE1` `q` limit 1;<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Query_HDFS\"><\/span>Query HDFS<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>List the top 100 rows from all files in a folder in HDFS:<\/p>\n<p>SELECT * FROM `hdfs`.`root`.`\/asset\/reference\/timeanddate` LIMIT 100;<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Query_Hive\"><\/span>Query Hive<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Select from a Hive table:<\/p>\n<p>SELECT firstname,lastname FROM `hive`.test.customers limit 10;<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Query_JSON\"><\/span>Query JSON<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>List the top 10 rows from all files in a folder in HDFS:<\/p>\n<p>SELECT * FROM `hdfs`.`root`.`\/asset\/reference\/timeanddate` LIMIT 10;<\/p>\n<p>Select from nested JSON:<\/p>\n<p>Data:<\/p>\n<p>{\"version\":2,\"holidays\":[{\"id\":1687,\"urlid\":\"romania\/new-year-day\",\"url\":\"http:\/\/www.timeanddate.com\/holidays\/romania\/new-year-day\",\"country\":{\"id\":\"ro\",\"name\":\"Romania\"},\"name\":\"New Year's Day\",\"oneliner\":\"New Year's Day (Anul Nou) and the following day, on January 1 and 2 respectively, are annual holidays in Romania.\",\"types\":[\"National holiday\"],\"date\":{\"iso\":\"2014-01-01\",\"datetime\":{\"year\":2014,\"month\":1,\"day\":1}},\"uid\":\"00069700000007de\"},<\/p>\n<p>Query:<\/p>\n<p>SELECT h.foo.id, h.foo.name<br \/>\nFROM (<br \/>\nSELECT<br \/>\nFLATTEN(r.holidays) foo<br \/>\nFROM `hdfs`.`root`.`asset\/reference\/timeanddate\/sample` r<br \/>\n) h<\/p>\n<h2><span 
class=\"ez-toc-section\" id=\"Query_View\"><\/span>Query View<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>To query a view:<\/p>\n<p>SELECT * FROM `views`.`views`.`vwSomeView` LIMIT 100<\/p>\n<p>&nbsp;<\/p>\n<h1><span class=\"ez-toc-section\" id=\"Configure_Drill\"><\/span>Configure Drill<span class=\"ez-toc-section-end\"><\/span><\/h1>\n<h2><span class=\"ez-toc-section\" id=\"Configure_Authentication\"><\/span>Configure Authentication<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Drill can be configured to use authentication, using pam, against LDAP. Using the instructions below an account is checked against \/etc\/passwd, test before installing by running id\u00a0<em>username<\/em>. If you see results (list of groups) then you are ready to apply security to Drill.<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Install_PAM_for_Drill\"><\/span>Install PAM for Drill<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Download the JPam-Linux_amd64-1.1.tgz tar file and copy to \/tmp\/:<\/p>\n<p>Create the folder structure and extract the files:<\/p>\n<p>sudo mkdir -p \/opt\/pam\/<\/p>\n<p>sudo tar zxvf \/tmp\/JPam-Linux_amd64-1.1.tgz &#8211;directory \/opt\/pam\/<\/p>\n<p>sudo cp \/opt\/pam\/JPam-1.1\/libjpam.so \/opt\/pam\/<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Configure_PAM_for_Drill\"><\/span>Configure PAM for Drill<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>Edit the drill-env.sh file:<\/p>\n<p>sudo vi \/opt\/apache-drill\/conf\/drill-env.sh<\/p>\n<p>Include the directory where the\u00a0<code>libjpam.so<\/code> file is located:<\/p>\n<p>export DRILLBIT_JAVA_OPTS=&#8221;-Djava.library.path=\/opt\/pam\/&#8221;<\/p>\n<p>Edit the drill-override.conf file:<\/p>\n<p>sudo vi \/opt\/apache-drill\/conf\/drill-override.conf<\/p>\n<p>Add the following configuration to the\u00a0<code>drill.exec<\/code>\u00a0block &#8211; simply enable security.user.auth if the section already exists:<\/p>\n<p>security.user.auth {<br \/>\nenabled:\u00a0true,<br 
\/>\npackages += \"org.apache.drill.exec.rpc.user.security\",<br \/>\nimpl: \"pam\",<br \/>\npam_profiles: [ \"sudo\", \"login\" ]<br \/>\n}<\/p>\n<p>Comment out the keystore block:<\/p>\n<p># Below SSL parameters need to be set for custom transport layer settings.<br \/>\n#javax.net.ssl {<br \/>\n# keyStore: \"\/keystore.file\",<br \/>\n# keyStorePassword: \"ks_passwd\",<br \/>\n# trustStore: \"\/truststore.file\",<br \/>\n# trustStorePassword: \"ts_passwd\"<br \/>\n#}<\/p>\n<p>Restart the drillbit:<\/p>\n<p>sudo \/opt\/apache-drill\/bin\/drillbit.sh restart<\/p>\n<h3><span class=\"ez-toc-section\" id=\"Test_Authentication\"><\/span>Test Authentication<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>You can test authentication by signing on through sqlline or the ODBC connection. For sqlline, you will first need to open sqlline and then connect; you will then be asked for your username and password.<\/p>\n<p>1. Open sqlline:<\/p>\n<p>\/opt\/apache-drill\/bin\/sqlline<\/p>\n<p>2. Connect to Drill, substituting localhost with your ZooKeeper address:<\/p>\n<p>!connect jdbc:drill:zk=localhost:2181<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Configure_Interfaces\"><\/span>Configure Interfaces<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Overview: <a class=\"external-link\" href=\"https:\/\/drill.apache.org\/docs\/interfaces-introduction\/\" rel=\"nofollow\"><u>https:\/\/drill.apache.org\/docs\/interfaces-introduction\/<\/u><\/a><\/p>\n<h3><span class=\"ez-toc-section\" id=\"ODBC\"><\/span>ODBC<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>MapR provides an ODBC 3.8 driver that connects Windows, Mac OS X, and Linux to Apache Drill and BI tools. Install the latest version of Apache Drill with the latest version of the MapR Drill ODBC driver. 
An ODBC driver that you installed with an older version of Drill probably will not work with an upgraded version of Drill.<\/p>\n<p>Download the Windows 64-bit version of the ODBC driver from MapR.<\/p>\n<p>This link has details on installing the ODBC driver on Windows: <a class=\"external-link\" href=\"https:\/\/drill.apache.org\/docs\/installing-the-driver-on-windows\/\" rel=\"nofollow\">https:\/\/drill.apache.org\/docs\/installing-the-driver-on-windows\/<\/a><\/p>\n<h3><span class=\"ez-toc-section\" id=\"JDBC\"><\/span>JDBC<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>You can connect to Drill through a JDBC client tool, such as SQuirreL, on Windows, Linux, and Mac OS X systems, to access all of your data sources registered with Drill. An embedded JDBC driver is included with Drill. Configure the JDBC driver in the SQuirreL client to connect to Drill from SQuirreL. This document provides instructions for connecting to Drill from SQuirreL on Windows.<\/p>\n<h2><span class=\"ez-toc-section\" id=\"Configure_Drill_to_use_HDFS_High_Availability\"><\/span>Configure Drill to use HDFS High Availability<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Drill&#8217;s Hive storage has not been written to use HDFS HA.<\/p>\n<p>Copy the hdfs-site.xml file into Drill&#8217;s conf folder, using the following as a template for which properties to include. 
The changes are highlighted in <span style=\"color: #0000ff\">blue<\/span>:<br \/>\n&lt;?xml version=\"1.0\" encoding=\"UTF-8\"?&gt;<br \/>\n&lt;!--Autogenerated by Cloudera Manager--&gt;<br \/>\n&lt;configuration&gt;<br \/>\n&lt;property&gt;<br \/>\n&lt;name&gt;dfs.nameservices&lt;\/name&gt;<br \/>\n&lt;value&gt;nameservice1&lt;\/value&gt;<br \/>\n&lt;\/property&gt;<br \/>\n&lt;property&gt;<br \/>\n&lt;name&gt;dfs.client.failover.proxy.provider.nameservice1&lt;\/name&gt;<br \/>\n&lt;value&gt;org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider&lt;\/value&gt;<br \/>\n&lt;\/property&gt;<br \/>\n&lt;property&gt;<br \/>\n&lt;name&gt;dfs.ha.automatic-failover.enabled.nameservice1&lt;\/name&gt;<br \/>\n&lt;value&gt;true&lt;\/value&gt;<br \/>\n&lt;\/property&gt;<br \/>\n&lt;property&gt;<br \/>\n&lt;name&gt;dfs.ha.namenodes.nameservice1&lt;\/name&gt;<br \/>\n&lt;value&gt;<span style=\"color: #0000ff\">namenode80<\/span>,<span style=\"color: #0000ff\">namenode171<\/span>&lt;\/value&gt;<br \/>\n&lt;\/property&gt;<br \/>\n&lt;property&gt;<br \/>\n&lt;name&gt;dfs.namenode.rpc-address.nameservice1.<span style=\"color: #0000ff\">namenode80<\/span>&lt;\/name&gt;<br \/>\n&lt;value&gt;<span style=\"color: #0000ff\">servername02:8020<\/span>&lt;\/value&gt;<br \/>\n&lt;\/property&gt;<br \/>\n&lt;property&gt;<br \/>\n&lt;name&gt;dfs.namenode.rpc-address.nameservice1.<span style=\"color: #0000ff\">namenode171<\/span>&lt;\/name&gt;<br \/>\n&lt;value&gt;<span style=\"color: #0000ff\">servername03:8020<\/span>&lt;\/value&gt;<br \/>\n&lt;\/property&gt;<br \/>\n&lt;\/configuration&gt;<\/p>\n<h1><span class=\"ez-toc-section\" id=\"Troubleshooting\"><\/span>Troubleshooting<span class=\"ez-toc-section-end\"><\/span><\/h1>\n<h2><span class=\"ez-toc-section\" id=\"Logging\"><\/span>Logging<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<h3><span class=\"ez-toc-section\" id=\"Drill\"><\/span>Drill<span 
class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>\/var\/log\/apache-drill\/<\/p>\n<p>Increase logging verbosity:<\/p>\n<p>ALTER SYSTEM SET `exec.errors.verbose` = true;<\/p>\n<h3><span class=\"ez-toc-section\" id=\"HBase\"><\/span>HBase<span class=\"ez-toc-section-end\"><\/span><\/h3>\n<p>For further HBase logging, you need to modify &#8220;$DRILL_HOME\/conf\/logback.xml&#8221; on all Drillbit nodes and add the following snippet (\/opt\/apache-drill\/conf\/logback.xml):<\/p>\n<p>&lt;logger name=&#8221;org.apache.drill.exec.store.hbase&#8221; additivity=&#8221;false&#8221;&gt;<br \/>\n&lt;level value=&#8221;debug&#8221; \/&gt;<br \/>\n&lt;appender-ref ref=&#8221;FILE&#8221; \/&gt;<br \/>\n&lt;\/logger&gt;<\/p>\n<p>You will then see additional HBase logs in the \/var\/log\/apache-drill\/drillbit.log\u00a0log file.<\/p>\n<p>Example log results:<\/p>\n<p>Took xxxx ms to get yyyyy records<\/p>\n<h2><span class=\"ez-toc-section\" id=\"UnknownHostException_nameservice1\"><\/span>UnknownHostException: nameservice1<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>There is a bug in Drill&#8217;s hive Storage when connecting to HDFS HA. When querying a hive database on HDFS HA, the query will throw the following error:<\/p>\n<div>&gt; select * from `hive`.`TEST`.TABLE1 limit 1;<\/div>\n<div>Error: SYSTEM ERROR: UnknownHostException: nameservice1<\/div>\n<div>[Error Id: a681966f-2a77-4853-9046-63d4038dbc71 on servername10:31010] (state=,code=0)<\/div>\n<p>Resolution: As of 11\/2\/2015, version 1.2, Drill has not fixed this bug. To\u00a0workaround this bug:<\/p>\n<ol>\n<li>NOT RECOMMENDED: Add nameservice1 to the server&#8217;s hosts file pointed to the IP Address of the current Active NameNode. Obviously this defeats the purpose of HDFS HA, but until Drill fixes this the workaround will help.<\/li>\n<li>The recommended solution is to use the configure Drill to use\u00a0HDFS HA by adding specific sections of the hdfs-site.xml file to Drill&#8217;s conf folder. 
See the configuration section above for details.<\/li>\n<\/ol>\n<h2><span class=\"ez-toc-section\" id=\"Query_Timeout_%E2%80%93_UnknownScannerException\"><\/span>Query Timeout &#8211; UnknownScannerException<span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p>Drill queries would hang and never complete, and canceling the queries would not work.<\/p>\n<p>In sqlline, run a query; if it hangs, press CTRL-C to cancel the query. If the results then display, you have this problem. The reason it fails is that the results are returned successfully, but the connection does not close on the RegionServer and the session hangs. To resolve the problem I had to restart the RegionServer.<\/p>\n<p>This problem produced many errors in \/var\/log\/apache-drill\/drillbit.log:<\/p>\n<blockquote><p>org.objectweb.asm.tree.analysis.AnalyzerException: Error at instruction 84: Expected an object reference, but found .<\/p>\n<p>at org.objectweb.asm.tree.analysis.Analyzer.analyze(Analyzer.java:294)<\/p>\n<p>at org.objectweb.asm.util.CheckMethodAdapter$1.visitEnd(CheckMethodAdapter.java:450)<\/p>\n<p>at org.objectweb.asm.MethodVisitor.visitEnd(MethodVisitor.java:877)<\/p>\n<p>&#8230;<\/p>\n<p>Caused by: org.objectweb.asm.tree.analysis.AnalyzerException: Expected an object reference, but found .<\/p>\n<p>at org.objectweb.asm.tree.analysis.BasicVerifier.copyOperation(BasicVerifier.java:80)<\/p>\n<p>at org.objectweb.asm.tree.analysis.BasicVerifier.copyOperation(BasicVerifier.java:47)<\/p>\n<p>at org.objectweb.asm.tree.analysis.Frame.execute(Frame.java:276)<\/p>\n<p>&#8230;<\/p><\/blockquote>\n<p>When you stopped the query, it would enter a PENDING or CANCELATION_REQUEST state and hang; eventually the following error would be thrown:<\/p>\n<blockquote><p>2015-12-14 17:22:16,786 [BitServer-1] WARN  o.a.d.exec.rpc.control.WorkEventBus - A fragment message arrived but there was no registered listener for that message: profile {<\/p>\n<p>state: 
FAILED<\/p>\n<p>error {<\/p>\n<p>error_id: &#8220;7277ec0e-cff6-41d9-9f43-3c3d2c84fa5b&#8221;<\/p>\n<p>endpoint {<\/p>\n<p>address: &#8220;servername04&#8221;<\/p>\n<p>user_port: 31010<\/p>\n<p>control_port: 31011<\/p>\n<p>data_port: 31012<\/p>\n<p>}<\/p>\n<p>error_type: SYSTEM<\/p>\n<p>message: &#8220;SYSTEM ERROR: RemoteWithExtrasException: org.apache.hadoop.hbase.UnknownScannerException: Name: 22491, already closed?\\n\\tat org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2092)\\n\\tat org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:31443)\\n\\tat org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2035)\\n\\tat org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:107)\\n\\tat org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)\\n\\tat org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)\\n\\tat java.lang.Thread.run(Thread.java:745)\\n\\n\\nFragment 1:0\\n\\n[Error Id: 7277ec0e-cff6-41d9-9f43-3c3d2c84fa5b on servername31:31010]&#8221;<\/p>\n<p>exception {<\/p>\n<p>exception_class: &#8220;org.apache.drill.common.exceptions.DrillRuntimeException&#8221;<\/p>\n<p>message: &#8220;org.apache.hadoop.hbase.client.ScannerTimeoutException: 600330ms passed since the last invocation, timeout is currently set to 60000&#8221;<\/p>\n<p>stack_trace {<\/p>\n<p>class_name: &#8220;org.apache.drill.exec.store.hbase.HBaseRecordReader&#8221;<\/p>\n<p>file_name: &#8220;HBaseRecordReader.java&#8221;<\/p>\n<p>line_number: 185<\/p>\n<p>method_name: &#8220;next&#8221;<\/p>\n<p>is_native_method: false<\/p>\n<p>}<\/p>\n<p>stack_trace {<\/p><\/blockquote>\n<p>Finally I saw the following error in the drill log:<\/p>\n<blockquote><p>2015-12-14 18:28:02,790 [2990f646-eb31-bcb7-5c41-55a4e1b1a9d8:frag:1:0] ERROR o.a.d.exec.rpc.RpcConnectionHandler &#8211; Failed to establish connection<\/p>\n<p>java.util.concurrent.TimeoutException: Timeout waiting for 
task.<\/p>\n<p>at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:276) ~[guava-14.0.1.jar:na]<\/p>\n<p>at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:96) ~[guava-14.0.1.jar:na]<\/p>\n<p>at org.apache.drill.exec.rpc.ReconnectingConnection$ConnectionListeningFuture.waitAndRun(ReconnectingConnection.java:115) ~[drill-java-exec-1.2.0.jar:1.2.0]<\/p>\n<p>&#8230;<\/p>\n<p>at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71]<\/p>\n<p>2015-12-14 18:28:02,812 [BitServer-1] INFO\u00a0 o.a.d.e.w.fragment.FragmentExecutor &#8211; Applying request for early sender termination for 1:0 -&gt; 0:0.<\/p>\n<p>2015-12-14 18:28:02,822 [BitServer-1] WARN\u00a0 o.a.d.exec.rpc.control.ControlServer &#8211; Message of mode REQUEST of rpc type 7 took longer than 500ms.\u00a0 Actual duration was 121081ms.<\/p>\n<p>2015-12-14 18:28:02,870 [2990f646-eb31-bcb7-5c41-55a4e1b1a9d8:frag:1:0] INFO\u00a0 o.a.d.e.w.fragment.FragmentExecutor &#8211; 2990f646-eb31-bcb7-5c41-55a4e1b1a9d8:1:0: State change requested RUNNING &#8211;&gt; FINISHED<\/p>\n<p>2015-12-14 18:28:02,870 [2990f646-eb31-bcb7-5c41-55a4e1b1a9d8:frag:1:0] INFO\u00a0 o.a.d.e.w.f.FragmentStatusReporter &#8211; 2990f646-eb31-bcb7-5c41-55a4e1b1a9d8:1:0: State to report: FINISHED<\/p>\n<p>2015-12-14 18:30:02,822 [WorkManager.StatusThread] ERROR o.a.d.exec.rpc.RpcConnectionHandler &#8211; Failed to establish connection<\/p>\n<p><span style=\"color: #ff0000\">java.util.concurrent.TimeoutException: Timeout waiting for task.<\/span><\/p>\n<p>at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:276) ~[guava-14.0.1.jar:na]<\/p>\n<p>at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:96) ~[guava-14.0.1.jar:na]<\/p>\n<p>at org.apache.drill.exec.rpc.ReconnectingConnection$ConnectionListeningFuture.waitAndRun(ReconnectingConnection.java:115) ~[drill-java-exec-1.2.0.jar:1.2.0]<\/p>\n<p>&#8230;<\/p><\/blockquote>\n<p>This pointed me to an 
HBase problem; since this was not a simple timeout, I restarted the HBase RegionServer and the problem was resolved. You should see an error corresponding to the UnknownScannerException in the HBase RegionServer log.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Apache Drill is an open-source software framework that supports data-intensive distributed applications for interactive analysis of large-scale datasets. Drill is the open source version of [&#8230;]<\/p>\n","protected":false},"author":2,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"jetpack_post_was_ever_published":false,"footnotes":""},"class_list":["post-1083","page","type-page","status-publish","hentry"],"jetpack_shortlink":"https:\/\/wp.me\/P1BQ8S-ht","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/www.developerscloset.com\/index.php?rest_route=\/wp\/v2\/pages\/1083","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.developerscloset.com\/index.php?rest_route=\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.developerscloset.com\/index.php?rest_route=\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.developerscloset.com\/index.php?rest_route=\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.developerscloset.com\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=1083"}],"version-history":[{"count":5,"href":"https:\/\/www.developerscloset.com\/index.php?rest_route=\/wp\/v2\/pages\/1083\/revisions"}],"predecessor-version":[{"id":1092,"href":"https:\/\/www.developerscloset.com\/index.php?rest_route=\/wp\/v2\/pages\/1083\/revisions\/1092"}],"wp:attachment":[{"href":"https:\/\/www.developerscloset.com\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=1083"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}