Developer's Closet A place where I can put my PHP, SQL, Perl, JavaScript, and VBScript code.

Configure Solr Using Cloudera Manager

Solr, also known as Cloudera Search within Cloudera Manager, is a distributed service for indexing and searching data stored in HDFS.

Add the Solr Service

Using Cloudera Manager, add a Solr Server to a host that is not hosting ZooKeeper or Oozie, because Solr takes a lot of processing power and memory. You can co-locate a Cloudera Search server (solr-server package) with a MapReduce TaskTracker (MRv1) and an HDFS DataNode. When co-locating with MapReduce TaskTrackers, make sure that the resources of the machine are not oversubscribed. It's safest to start with a small number of MapReduce slots and increase them gradually.

Here is Cloudera’s current Solr guide:

Creating Your First Solr Collection

To use Solr for the first time, you will have to create collections. Cloudera's guide explains how under the heading Creating Your First Solr Collection.

By default, the Solr server comes up with no collections. Create your first collection using the same collection name as the instancedir that you provided to Solr in previous steps. In the command below, numOfShards is the number of SolrCloud shards you want to partition the collection across; it cannot exceed the total number of Solr servers in your SolrCloud cluster:

solrctl collection --create collection1 -s {{numOfShards}}
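As a rough illustration of the shard limit described above, the sketch below builds the same solrctl command but refuses a shard count larger than the number of Solr servers. The function name build_create_cmd and the example values (2 shards, 3 servers) are assumptions made for illustration; only the solrctl collection --create syntax comes from the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of solrctl): print the create command,
# or fail if the shard count exceeds the number of Solr servers.
build_create_cmd() {
  local name="$1" shards="$2" servers="$3"
  if [ "$shards" -gt "$servers" ]; then
    echo "error: $shards shards exceeds $servers Solr servers" >&2
    return 1
  fi
  echo "solrctl collection --create $name -s $shards"
}

# Assumed example values: 2 shards on a 3-server SolrCloud cluster.
cmd="$(build_create_cmd collection1 2 3)"
echo "$cmd"
# On a host with solrctl installed, the printed command can then be run as-is.
```

Printing the command first, rather than running it directly, makes it easy to review the shard count before anything is created.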

You should be able to check that the collection is active. For example, for the server, you should be able to navigate to a URL ending in *%3A*&wt=json&indent=true (a match-all query, since %3A is the URL-encoded colon in *:*) and verify that the collection is active. Similarly, you should also be able to observe the topology of your SolrCloud through a similar URL on that server.

You will then be able to create a new core.
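The collection check described above can be sketched from the command line. This is a minimal sketch under stated assumptions: the host name solr01.example.com is hypothetical, port 8983 is assumed to be the Solr HTTP port (it is Solr's usual default), and the /solr/<collection>/select path is assumed to be the query handler. Only the *%3A*&wt=json&indent=true suffix comes from the text.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a match-all select URL for a collection.
select_url() {
  local host="$1" collection="$2" port="${3:-8983}"
  # %3A is the URL-encoded colon, so q=*%3A* is the match-all query *:*
  echo "http://${host}:${port}/solr/${collection}/select?q=*%3A*&wt=json&indent=true"
}

# Assumed example host; replace it with one of your Solr servers.
url="$(select_url solr01.example.com collection1)"
echo "$url"
# From a machine that can reach the server, fetch it with: curl -s "$url"
```

A JSON response, even one listing zero documents, suggests the collection is up and answering queries.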

Creating a New Core

In Cloudera Manager, browse to the Solr Service. On the right of the menu, click on Solr Web UI. On the lower left menu, click on Add a New Core. Enter the collection created above, give the core a name, and submit.
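If you prefer the command line to the web form, Solr's CoreAdmin API offers a CREATE action that does roughly the same job. The sketch below only builds the request URL. The host name, port 8983, and core name core1 are assumptions, and depending on your Solr version additional parameters (for example instanceDir) may be needed, so treat it as a starting point rather than the exact request the UI sends.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a CoreAdmin CREATE URL for a new core
# that belongs to an existing collection.
core_create_url() {
  local host="$1" core="$2" collection="$3" port="${4:-8983}"
  echo "http://${host}:${port}/solr/admin/cores?action=CREATE&name=${core}&collection=${collection}"
}

# Assumed example values; collection1 is the collection created earlier.
url="$(core_create_url solr01.example.com core1 collection1)"
echo "$url"
# From a machine that can reach the server, submit it with: curl -s "$url"
```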