Working with Apache Solr: A Developer’s Insights

MODIFIED ON: November 29, 2022 / ALIGNMINDS TECHNOLOGIES

The search module is a vital component of today’s web applications. Its quality and user-friendliness are critical to the growth of a business.

Two of the most important, most discussed, and most widely used search platforms are Apache Solr and Amazon CloudSearch. Both let you search your data by submitting HTTP requests and receive responses in either XML or JSON.

Apache Solr is open-source software. It is written entirely in Java and uses Lucene as its “engine”, adding full enterprise search server features and capabilities on top. Highly specialized search solution companies such as Lucidworks and Search Technologies may prefer to build plugins and modules on open-source code, which gives them more flexibility and control. When this article was first written, the stable release of Apache Solr was 3.3; newer versions have been released since (the setup below uses 4.1.0).

Why do we need search engines?

So, why do we really need a search engine?

Is a direct search in the database not enough?

These were the first questions that came to my mind when I was about to integrate Solr into one of our projects. I did some research based on them.

What do search engines actually do?

Search engines go through all the data and find matches where they exist. For a small web application, a direct database search is fine. But for applications with huge volumes of data, a direct database search becomes a heavy operation.

What search engines like Solr do is read the data from the database and keep a separate index of it. Periodically, or whenever the data is updated, we update the index as well. When we perform a search, it runs against the index and the matching results are fetched from the index. Because the index is a structure built for text lookup (Lucene uses an inverted index that maps terms to the documents containing them), the search does not have to scan every database row, which makes it much faster.
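
To make the idea of an index concrete, here is a minimal sketch of a toy inverted index in plain PHP. It is only an illustration of the principle, not how Lucene is implemented; the function names and the sample rows are made up for this example.

```php
<?php
// Toy inverted index: maps each lowercase term to the IDs of the rows containing it.
// This illustrates the idea only; Lucene's real index is far more elaborate.

/** Split text into lowercase word tokens. */
function tokenize(string $text): array
{
    return preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
}

/** Build term => [row IDs] from id => text rows. */
function buildIndex(array $rows): array
{
    $index = [];
    foreach ($rows as $id => $text) {
        foreach (array_unique(tokenize($text)) as $term) {
            $index[$term][] = $id;
        }
    }
    return $index;
}

/** Look up a single term; no row is scanned at query time. */
function lookup(array $index, string $term): array
{
    return $index[strtolower($term)] ?? [];
}

// Hypothetical sample rows standing in for database records.
$rows = [
    1 => 'Red cotton shirt',
    2 => 'Blue cotton jeans',
    3 => 'Red leather shoes',
];
$index = buildIndex($rows);
print_r(lookup($index, 'red'));    // IDs 1 and 3
print_r(lookup($index, 'cotton')); // IDs 1 and 2
```

The cost of building the index is paid once (and again on updates), while each search becomes a single array lookup instead of a scan over every row.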

Courtesy: viblo.asia

Solr setup

First, download the latest version of Solr from the official site. Solr is written in Java, so you also need a Java Runtime Environment (JRE) to run it.

$ cd solr-4.1.0/example/
$ java -jar start.jar

Once the JRE is installed and Solr is started, the web interface is available on port 8983. Open a web browser and go to http://localhost:8983/solr/ (assuming Solr is installed on your local machine). You should see the Solr admin interface.

If you look at the left-hand navigation, you will find “collection1”. A collection in Solr is something similar to a database table: you can query it. Click on the collection and choose “Query” from the submenu.

The first option is called “Request-Handler (qt)”, with the default value “/select”. Request handlers are a sort of pre-defined query. The next parameter is the query itself, and its default value “*:*” selects everything. If you execute the query, it selects all data from the index. For now, since the index is empty, it gives zero results.
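
The same query can be sent as a plain HTTP request. The sketch below only builds the URL; it assumes a default local install on port 8983 and the “collection1” collection mentioned above, and the helper name buildSelectUrl is made up for this example.

```php
<?php
// Build a Solr select URL. Host, port, and collection name are assumptions
// for a default local install; adjust them to your setup.
function buildSelectUrl(string $query, int $start = 0, int $rows = 10, string $core = 'collection1'): string
{
    $params = [
        'q'     => $query,  // the query string, e.g. *:* for everything
        'start' => $start,  // offset of the first result
        'rows'  => $rows,   // number of results to return
        'wt'    => 'json',  // response format
    ];
    // http_build_query URL-encodes the values for us.
    return 'http://localhost:8983/solr/' . rawurlencode($core) . '/select?' . http_build_query($params);
}

echo buildSelectUrl('*:*', 0, 2), PHP_EOL;
```

Pasting the printed URL into a browser should give the same result as running the query from the admin interface.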

Now we need to insert some data into the index from our database. First, include the Solr Service.php file (the file from the PHP client library, introduced below, that reads and writes data to the Solr index). Then fetch the current index contents and save them to a variable, say $results_old, so that $results_old->response->docs contains all the documents in the Solr index (for now it will be empty). Fetch whatever data you want to index from the database and map it to the Solr index fields. Once all the data is mapped, write it to the index as key-value pairs.

If you run a normal select query from the web interface once the index holds some data, you will get a result similar to the following:

{
  "responseHeader": {
    "status": 0,
    "QTime": 1,
    "params": {
      "indent": "true",
      "q": "*:*",
      "_": "1406616999120",
      "wt": "json",
      "rows": "2"
    }
  },
  "response": {
    "numFound": 258,
    "start": 0,
    "docs": [
      { "id": "1", "field_a": "abcd", "field_b": "xyzz", "field_c": "43234" },
      { "id": "2", "field_a": "efgh", "field_b": "xyzz", "field_c": "76545" }
    ]
  }
}
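
A response of this shape is easy to read in PHP with json_decode. The following is a minimal sketch that uses a trimmed, hard-coded copy of the response above instead of a live HTTP call; the variable names are illustrative.

```php
<?php
// A trimmed copy of the response shown above, as it would arrive over HTTP.
$json = '{"responseHeader":{"status":0,"QTime":1},'
      . '"response":{"numFound":258,"start":0,"docs":['
      . '{"id":"1","field_a":"abcd","field_b":"xyzz","field_c":"43234"},'
      . '{"id":"2","field_a":"efgh","field_b":"xyzz","field_c":"76545"}]}}';

$result = json_decode($json);
if ($result === null || $result->responseHeader->status !== 0) {
    throw new RuntimeException('Solr returned an error or invalid JSON');
}

// numFound is the total number of matches; docs holds only the requested rows.
echo 'Total matches: ', $result->response->numFound, PHP_EOL;
foreach ($result->response->docs as $doc) {
    echo $doc->id, ': ', $doc->field_a, PHP_EOL;
}
```

Note that numFound (258) is larger than the number of documents returned (2), because the query asked for only two rows.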

Now, how do you use Solr with your PHP project? You need the PHP library called solr-php-client. It offers an object-oriented interface to Solr, somewhat like the PHP Solr extension. However, this library is implemented entirely in PHP, so it can easily be used in any PHP environment. You may download it from: http://code.google.com/p/solr-php-client/downloads/list .

Once the library is added alongside your project files, go to the file SolrPhpClient/Apache/Solr/Service.php . This is the library's main file; it defines the Apache_Solr_Service class.

To connect your project with Solr, include the Service.php file in your code. You can then create the Solr object as shown below, where SOLRHOST, SOLRPORT, and SOLRNAME are constants you define for the Solr host, port, and path (for example localhost, 8983, and /solr/):

$solr = new Apache_Solr_Service(SOLRHOST, SOLRPORT, SOLRNAME);

Now we need to write the data retrieved from the database to the Solr index. First, initialize a Solr document:

$document = new Apache_Solr_Document();

Set each value on the document as a key-value pair, where the key is the Solr field name. (If the data comes from multiple tables, use a multidimensional array so that the table names form the parent keys.) After adding all your data to the Solr document, append it to a $documents array and send the batch to Solr:

if (!empty($documents)) {
    // Send the batch, then make it searchable and compact the index.
    $solr->addDocuments($documents);
    $solr->commit();
    $solr->optimize();
}

You have now written your data to the Solr index.
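
The mapping step described above can be sketched without the library. The column names, the FIELD_MAP constant, and the mapRowToFields helper below are all hypothetical; only the Solr field names (id, field_a, field_b) are taken from the sample response shown earlier.

```php
<?php
// Hypothetical mapping from database column names to Solr field names.
const FIELD_MAP = [
    'product_id' => 'id',
    'title'      => 'field_a',
    'category'   => 'field_b',
];

/** Convert one database row into a Solr field => value array, skipping unmapped or null columns. */
function mapRowToFields(array $row): array
{
    $fields = [];
    foreach (FIELD_MAP as $column => $solrField) {
        if (isset($row[$column])) {
            $fields[$solrField] = (string) $row[$column];
        }
    }
    return $fields;
}

// Rows as they might come back from a database query (sample data).
$rows = [
    ['product_id' => 1, 'title' => 'abcd', 'category' => 'xyzz', 'internal_note' => 'skip me'],
    ['product_id' => 2, 'title' => 'efgh', 'category' => null],
];

$docs = array_map('mapRowToFields', $rows);
echo json_encode($docs), PHP_EOL;
```

With solr-php-client, each of these arrays would then be copied field by field onto an Apache_Solr_Document before the batch is passed to addDocuments().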

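Now, to retrieve results from Solr, query for everything, starting at offset 0 and asking for the first 10 rows (a row count of 0 would return only the match count and no documents):

$results = $solr->search('*:*', 0, 10);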

In general, a search call from your code to Solr looks like this:

$results = $solr->search($searchQuery, $start, $rows, $queryParams);

These are the basic steps to integrate Solr with your project. You have connected your project to Solr, added your data to it, and performed a search.
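
When the search query comes from user input, characters that have a special meaning in the Lucene query syntax (such as +, :, or parentheses) should be escaped first. The helper below is a hand-written sketch for illustration, not part of the library's documented API, and the field names and filter values are assumptions.

```php
<?php
/** Backslash-escape characters that are special in the Lucene query syntax. */
function escapeSolrTerm(string $term): string
{
    return preg_replace('/([+\-!(){}\[\]^"~*?:\\\\\/]|&&|\|\|)/', '\\\\$1', $term);
}

$userInput = 'C++ (beta)';

// Query a specific field with the escaped user input.
$query = 'field_a:' . escapeSolrTerm($userInput);

// Extra query conditions go in an array: here a filter query and a sort order.
$params = ['fq' => 'field_b:xyzz', 'sort' => 'id asc'];

echo $query, PHP_EOL;
```

The $query and $params values would then be passed as the first and fourth arguments of the search call shown above.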

Distinctive advantages of Solr in my experience

Solr is one of the most widely used search engines today. Here are the main reasons why it is so often chosen.

  • Apache Solr has multilingual support, so you can use it for websites that are not in English.
  • Faceting is one of the important features used in eCommerce website search modules. Faceting allows you to categorize your results into sub-groups, which can be used as the basis for another search. Solr supports faceting natively, including field, query, and range facets.
  • As you type a search, suggestions of popular queries relevant to your input are presented, much like what we see in Google Search. This feature is called Auto Suggest. It can be implemented at the search engine level or at the search application level. Apache Solr has native support for it, and it can be set up in several ways: NGramFilterFactory, EdgeNGramFilterFactory, or TermsComponent. This feature of Apache Solr is usually combined with jQuery to create a powerful auto-suggestion experience in applications.

  • eCommerce sites can benefit from the “Find Similar” feature, as research suggests that users typically compare products before making a transaction and are likely to buy the better product. Apache Solr implements the “Find Similar” feature using handlers/components like MoreLikeThisHandler or MoreLikeThisComponent.
  • Sometimes we misspell the search term as we type; the search engine then corrects the spelling automatically and still presents the search results. This feature of presenting the user with spelling-corrected suggestions is called the “Did you mean…” feature. Apache Solr implements it with the Spellcheck search component.
  • Apache Solr ships with many algorithms, including cache implementations such as LRUCache and FastLRUCache. Being open source, Solr can also be extended with your own algorithms.
  • Solr is free to use. All you need to do is install Solr on your server, add the solr-php-client library, and write a small amount of code to connect your project to Solr.
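
To illustrate the Auto Suggest point from the list above: an edge n-gram approach indexes every leading prefix of a term, so that a partially typed word matches directly. The sketch below shows the idea in plain PHP; it is not Solr's implementation, the function names are made up, and it assumes ASCII input because it uses byte-based string functions.

```php
<?php
/** Return the leading-edge n-grams (prefixes) of a term, from $min to $max characters. */
function edgeNGrams(string $term, int $min = 1, int $max = 10): array
{
    $term = strtolower($term);
    $grams = [];
    $limit = min($max, strlen($term)); // byte length: ASCII input assumed
    for ($len = $min; $len <= $limit; $len++) {
        $grams[] = substr($term, 0, $len);
    }
    return $grams;
}

/** Index each phrase under every prefix so a suggestion lookup is a single array access. */
function buildSuggestIndex(array $phrases, int $min = 2): array
{
    $index = [];
    foreach ($phrases as $phrase) {
        foreach (edgeNGrams($phrase, $min, 10) as $gram) {
            $index[$gram][] = $phrase;
        }
    }
    return $index;
}

// Sample phrases standing in for popular queries.
$index = buildSuggestIndex(['solr', 'solar panel', 'search']);
print_r($index['so'] ?? []); // suggestions for the typed prefix "so"
```

In Solr itself, the equivalent behaviour comes from configuring EdgeNGramFilterFactory with minimum and maximum gram sizes on the field's index-time analyzer.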

– Radha R Krishnan