Solr - Alignminds Technologies

A search module is a vital component of today’s web applications. Its quality and user friendliness can be critical to the growth of a business. Two of the most important, most discussed, and most widely used search engines are Apache Solr and Amazon CloudSearch. Both are search platforms that let you search your data by submitting HTTP requests and receiving responses in either XML or JSON. Apache Solr is open source software. It is written entirely in Java and uses Lucene as the “engine”, but adds full enterprise search server features and capabilities. Highly specialized search solution companies such as Lucidworks and Search Technologies may prefer to build plugins and modules on open source code, which gives them more flexibility and control. The stable release of Apache Solr when this overview was written was 3.3; the setup steps below use the 4.1.0 distribution.

> Introduction

So, why do we really need a search engine? Is a direct search in the database not enough? This was the first question that came to my mind when I was about to integrate Solr into one of our projects, and I did some research on it. What a search actually does is go through every record and find the matches, if any exist. For a small web application, a direct database search is fine. But for applications with huge amounts of data, a direct database search becomes a really heavy process.

What search engines like Solr do is read the data from the database and keep a local index of it. Periodically, or whenever some data is updated, we update the index as well. When we perform a search, it runs against the index and the matching results are fetched from the index. Because the search runs against this purpose-built index rather than scanning every database row, it becomes tremendously fast and easy.
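To make the difference concrete, here is a minimal sketch in plain PHP of the idea behind an index. It is only an illustration and not how Solr or Lucene is actually implemented; the function names buildIndex and searchIndex and the sample rows are made up for this example.

```php
<?php
// Toy inverted index: maps each lowercase word to the ids of the rows containing it.
// Illustration only; a real search engine index is far more sophisticated.

function buildIndex(array $rows): array
{
    $index = [];
    foreach ($rows as $id => $text) {
        // Split on anything that is not a lowercase letter or a digit.
        $words = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
        foreach (array_unique($words) as $word) {
            $index[$word][] = $id;
        }
    }
    return $index;
}

function searchIndex(array $index, string $word): array
{
    // A single array lookup instead of scanning every row.
    return $index[strtolower($word)] ?? [];
}

// Hypothetical rows, as they might come from a database table (id => text).
$rows = [
    1 => 'Red cotton shirt',
    2 => 'Blue denim jeans',
    3 => 'Red leather shoes',
];

$index = buildIndex($rows);
print_r(searchIndex($index, 'red'));   // ids 1 and 3
print_r(searchIndex($index, 'denim')); // id 2
```

The index is built once (and refreshed when the data changes), while every search afterwards is a cheap lookup.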

[Image: solr1]

Courtesy: https://viblo.asia/acclou84/posts/l5y8Rr32Mob3

> Solr set up

First, download Solr from the official site (the commands below assume version 4.1.0). Solr is written in Java, so you also need a Java Runtime Environment (JRE) to run it. Then start the bundled example server:

$ cd solr-4.1.0/example/

$ java -jar start.jar

Once the JRE is installed and Solr is started, its web interface is available on port 8983. Open a web browser and go to http://localhost:8983/solr/ (assuming Solr is installed on your local machine). You should see something similar to the following image:

[Image: Solr admin interface]

If you look at the left-hand navigation you will find “collection1”. A collection in Solr is something similar to a database table: you can query it. Click on the collection and choose “query” from the submenu.

The first option is called “Request-Handler (qt)”, with the default value /select. Request handlers are a sort of pre-defined query. The next parameter is the query (q), and its default value *:* selects everything. If you execute the query, it will select all data from the index. For now, since the index is empty, it will give zero results.
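The same query can be sent from code as a plain HTTP request. The following is a small sketch of how the request URL could be assembled in PHP; the host, port and collection name match the example setup above, and the function name buildSelectUrl is made up for this illustration.

```php
<?php
// Builds the URL for a query against Solr's /select request handler.
// Host, port and collection name are assumptions matching the example setup.

function buildSelectUrl(string $query, int $start = 0, int $rows = 10): string
{
    $base = 'http://localhost:8983/solr/collection1/select';
    $params = [
        'q'     => $query, // "*:*" matches every document
        'start' => $start, // offset of the first result
        'rows'  => $rows,  // number of results to return
        'wt'    => 'json', // response format
    ];
    // http_build_query() URL-encodes the values for us.
    return $base . '?' . http_build_query($params);
}

echo buildSelectUrl('*:*', 0, 2), "\n";
// http://localhost:8983/solr/collection1/select?q=%2A%3A%2A&start=0&rows=2&wt=json
```

Opening the resulting URL in a browser should give the same response as the query form in the web interface.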

Now we need to insert some data from our database into the index. First include Service.php from the PHP client library described later in this post (it is the file that reads and writes data to the Solr index). Then fetch the current index contents and save them to a new variable, say $results_old, so that $results_old->response->docs contains all the data in the Solr index (for now it will be empty). Fetch whatever data you want to index from the database and map it to the Solr index fields. Once all the data is mapped, write it to the index as key => value pairs.
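The mapping step could look something like the sketch below. The database column names (product_id, title, category, stock_code) and the Solr field names are hypothetical; use whatever your own table and Solr schema define.

```php
<?php
// Maps one database row to the flat field => value array that will become a Solr document.
// Column names and field names here are hypothetical.

function mapRowToSolrFields(array $row): array
{
    return [
        'id'      => (string) $row['product_id'],  // unique key of the document
        'field_a' => trim($row['title']),
        'field_b' => strtolower($row['category']),
        'field_c' => (string) $row['stock_code'],
    ];
}

// Hypothetical rows, as returned by a database query.
$rows = [
    ['product_id' => 1, 'title' => ' abcd ', 'category' => 'XYZZ', 'stock_code' => 43234],
    ['product_id' => 2, 'title' => 'efgh',   'category' => 'xyzz', 'stock_code' => 76545],
];

$mapped = array_map('mapRowToSolrFields', $rows);
print_r($mapped[0]);
```

Keeping the mapping in one function makes it easy to adjust when the database columns or the Solr fields change.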

If you run a normal select query from the web interface, you will get a result similar to the following:

{
  "responseHeader": {
    "status": 0,
    "QTime": 1,
    "params": {
      "indent": "true",
      "q": "*:*",
      "_": "1406616999120",
      "wt": "json",
      "rows": "2"
    }
  },
  "response": {
    "numFound": 258,
    "start": 0,
    "docs": [
      { "id": "1", "field_a": "abcd", "field_b": "xyzz", "field_c": "43234" },
      { "id": "2", "field_a": "efgh", "field_b": "xyzz", "field_c": "76545" }
    ]
  }
}
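If you fetch such a response yourself, reading it in PHP is straightforward. This is a minimal sketch that uses a shortened copy of the sample response above; in a real project the JSON string would come from the HTTP response rather than being hard-coded.

```php
<?php
// Decodes a Solr JSON response and reads the total hit count and the returned documents.
// The JSON below is a shortened copy of the sample response shown above.

$json = <<<'JSON'
{
  "responseHeader": { "status": 0, "QTime": 1 },
  "response": {
    "numFound": 258,
    "start": 0,
    "docs": [
      { "id": "1", "field_a": "abcd", "field_b": "xyzz", "field_c": "43234" },
      { "id": "2", "field_a": "efgh", "field_b": "xyzz", "field_c": "76545" }
    ]
  }
}
JSON;

$result = json_decode($json);
if ($result === null || $result->responseHeader->status !== 0) {
    throw new RuntimeException('Unexpected Solr response');
}

// numFound is the total number of matches; docs holds only the rows that were requested.
echo 'Total matches: ', $result->response->numFound, "\n";
foreach ($result->response->docs as $doc) {
    echo $doc->id, ': ', $doc->field_a, "\n";
}
```

Note that numFound (258 here) can be much larger than the number of documents returned, which is controlled by the rows parameter.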

Now, how do you use Solr with your PHP project? You will need the PHP library called solr-php-client. It offers an object-oriented interface to Solr, somewhat like the PHP Solr extension. However, this library is implemented entirely in PHP, so it can easily be used in any PHP environment. You can download it from: http://code.google.com/p/solr-php-client/downloads/list .

Once the library is added to your project files, go to the file SolrPhpClient/Apache/Solr/Service.php. This is the main file of the library: it defines the Apache_Solr_Service class used below.

To connect your project with Solr, include the Service.php file in your code. You will then be able to create the Solr object as:

$solr = new Apache_Solr_Service(SOLRHOST, SOLRPORT, SOLRNAME);

Now we need to write the data retrieved from the database to the Solr index. First initialize a Solr document:

$document = new Apache_Solr_Document();

Save each piece of data as a key-value pair on the document (if the data comes from multiple tables, use a multidimensional array structure so that the table names become the parent keys). Collect the documents in an array, say $documents, and after adding all your data:

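// $documents is an array of Apache_Solr_Document objects built as described above
if (!empty($documents))
{
    $solr->addDocuments($documents); // send all documents to Solr
    $solr->commit();                 // make the new documents searchable
    $solr->optimize();               // optional: merge the index segments
}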

You have now written your data to the Solr index.

Now, to retrieve results from Solr:

$results = $solr->search('*:*', 0, 10); // match everything, start at offset 0, return up to 10 rows

In general, a search from your code looks like this:

$results = $solr->search($search_query, $start_val, $no_of_rows, $query_conditions);
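When the search query comes from user input, it is worth escaping the characters that have a special meaning in Solr's query syntax. The helper below is a hand-written sketch for illustration (the name escapeSolrTerm is made up, as are the field names); the filter and sort values show what the query conditions array might contain.

```php
<?php
// Escapes characters that have a special meaning in Solr's query syntax,
// so that raw user input is treated as plain text.

function escapeSolrTerm(string $term): string
{
    $special = ['\\', '+', '-', '&', '|', '!', '(', ')', '{', '}',
                '[', ']', '^', '"', '~', '*', '?', ':', '/'];
    $map = [];
    foreach ($special as $char) {
        $map[$char] = '\\' . $char;
    }
    // strtr() with an array replaces each character once and never rescans its own output.
    return strtr($term, $map);
}

$userInput = 'c++';
$search_query = 'field_a:' . escapeSolrTerm($userInput);

// Extra query conditions: a filter query and a sort order (hypothetical field names).
$query_conditions = [
    'fq'   => 'field_b:xyzz',
    'sort' => 'id asc',
];

echo $search_query, "\n"; // field_a:c\+\+

// With solr-php-client these would then be passed to the search call, for example:
// $results = $solr->search($search_query, 0, 10, $query_conditions);
```

Without escaping, a character such as ":" or "(" in the input could change the meaning of the query or cause a syntax error.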

These are the basic steps to integrate Solr with your project. You have connected your project with Solr, added your data to it, and performed a search.

> Distinctive advantages of Solr in my experience:

Solr is one of the most widely used search engines today. Here are the major reasons why it is so often chosen:

    • Apache Solr has multilingual support, so it is also useful for websites that are not in English.
    • Faceting is one of the important features used in ecommerce website search modules. Faceting allows you to categorize your results into sub-groups, which can be used as the basis for another search. Solr has built-in support for faceting.
    • As you type a search, suggestions of popular queries relevant to your input are presented, as shown in the following image, much like what we see in Google Search.

[Image: auto-suggest dropdown]

This feature is called Auto Suggest. It can be implemented at the search engine level or at the search application level. Apache Solr has native support for auto suggest, which can be built in several ways, using NGramFilterFactory, EdgeNGramFilterFactory, or TermsComponent. This feature of Apache Solr is usually used in conjunction with jQuery to create a powerful auto-suggestion experience in applications.
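To see why edge n-grams help with auto suggest, here is a small PHP sketch of the kind of tokens such a filter adds to the index. It only illustrates the idea and is not Solr's own code; the function name edgeNGrams and its default sizes are assumptions, and it works on bytes, so it is only suitable for plain ASCII words.

```php
<?php
// Produces the prefixes ("edge n-grams") of a word. Indexing these prefixes
// lets partial input such as "so" match the full word "solr".

function edgeNGrams(string $word, int $minSize = 1, int $maxSize = 15): array
{
    $grams = [];
    $limit = min($maxSize, strlen($word)); // never ask for more characters than the word has
    for ($size = $minSize; $size <= $limit; $size++) {
        $grams[] = substr($word, 0, $size);
    }
    return $grams;
}

print_r(edgeNGrams('solr'));         // s, so, sol, solr
print_r(edgeNGrams('search', 2, 4)); // se, sea, sear
```

Because every prefix is stored in the index, a suggestion lookup for what the user has typed so far becomes an ordinary exact-match search.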

  • Ecommerce sites can benefit from the “Find Similar” feature, as research suggests that users typically compare products before making a transaction and are likely to buy the product they judge to be better. Apache Solr implements the “Find Similar” feature using handlers/components like MoreLikeThisHandler or MoreLikeThisComponent.
  • Sometimes the search term is misspelled as we type it; the search engine can then correct the spelling automatically and present results for the corrected term. This feature of presenting the user with spelling-corrected suggestions is called the “Did you mean…” feature. Apache Solr implements it with the Spellcheck search component.
  • Apache Solr includes several cache implementations, such as LRUCache and FastLRUCache. Being open source, Solr can also be extended with your own algorithms.
  • Solr is free. All you need to do is install Solr on your server, add the solr-php-client library, and write a small amount of code to connect your project to Solr.
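As an example of the faceting feature mentioned in the list, the sketch below shows the request parameters that ask Solr for facet counts, and a helper that reshapes the counts. In its default JSON output Solr returns facet counts for a field as a flat list of alternating values and counts. The field name field_b, the sample counts and the function name facetCounts are made up for this illustration.

```php
<?php
// Request parameters that ask Solr to return facet counts for one field.
// "field_b" is a hypothetical field name.
$facetParams = [
    'facet'       => 'true',
    'facet.field' => 'field_b',
];

// Turns the flat list [value1, count1, value2, count2, ...] into value => count.
function facetCounts(array $flat): array
{
    $counts = [];
    for ($i = 0; $i + 1 < count($flat); $i += 2) {
        $counts[$flat[$i]] = $flat[$i + 1];
    }
    return $counts;
}

// Hypothetical facet data, shaped like facet_counts->facet_fields->field_b in a response.
$counts = facetCounts(['xyzz', 120, 'abcd', 45]);
print_r($counts); // xyzz => 120, abcd => 45
```

With solr-php-client, parameters like these could be passed as the query conditions argument of the search call, and the resulting counts used to render the "filter by" links of a shop.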

Radha R Krishnan