Here at anhaminha, we build software for legal firms. One of our products is a simple-to-use legal research engine. Since its database contains over 200,000 rows, full-text search was the only way to go.
The app is developed with Rails 3 and will be deployed on Heroku when ready. As Heroku integrates easily with websolr, we chose Solr as our search engine. However, it turned out not to be so easy.
Websolr lets you choose between the acts_as_solr and sunspot gems to index and search your models. acts_as_solr was our first choice, as it was considered to be the easy one. After following the documentation, I deployed the app to Heroku and received my first failure message:
2918f410013eb5ccc9f62769ce95418ffddc13a0/lib/websolr-acts_as_solr.rb:17: undefined method `info' for nil:NilClass (NoMethodError)
After some headache (the nil in question turned out to be Rails.logger), the issue was solved with the help of Heroku support (thanks a lot). In the Gemfile, I had:
gem 'websolr-acts_as_solr', :git => 'git://github.com/onemorecloud/websolr-acts_as_solr.git'
and replacing it with:
gem 'websolr-acts_as_solr'
solved the issue.
Expecting to get results soon, I deployed the app and opened up heroku console. The joy was short-lived. Issuing the command Model.rebuild_solr_index rewarded me with a nice new error:
/opt/local/lib/ruby/gems/1.8/gems/rest-client-1.3.1/lib/restclient/exceptions.rb:6:in `method_missing': stack level too deep (SystemStackError)
Now, that doesn’t tell much, does it? :)
Love led to anger, anger led to hate and I turned to the dark side. I switched to the sunspot gem.
Now, acts_as_solr is considered to be easier than sunspot, but I disagree. I found sunspot easier to use and far better documented than acts_as_solr. The wiki at http://wiki.github.com/outoftime/sunspot is an especially great help.
Now, if you have jumped on the Rails 3 wagon like me, the wiki explains how to integrate sunspot with Rails 3 and websolr. Remember:
- do not use the 'websolr-sunspot_rails' gem anymore; you should only have gem 'sunspot', :require => 'sunspot' in your Gemfile.
- do not try to use sunspot_rails, meaning no 'searchable' declaration in the model.
So what do you do? Following the wiki, add your Sunspot.setup declarations to the end of your initializer. Do not do this in the model.
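As a rough illustration, such an initializer might look something like the sketch below. The file name, the WEBSOLR_URL config var, and the :title and :body columns are assumptions made for the example; the wiki remains the authoritative source for the exact setup.

```ruby
# config/initializers/sunspot.rb  (file name is an assumption)

# Guard: do nothing if the sunspot gem has not been loaded.
if defined?(Sunspot)
  # Point Sunspot at the hosted index. WEBSOLR_URL is assumed to be the
  # config var set by the websolr add-on; check `heroku config` for yours.
  Sunspot.config.solr.url = ENV['WEBSOLR_URL'] if ENV['WEBSOLR_URL']

  # Declare the searchable fields here instead of using `searchable`
  # in the model. :title and :body are hypothetical columns.
  Sunspot.setup(Decision) do
    text :title, :body
  end
end
```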
Also, issue a heroku config command and note your websolr username and password. Go to http://www.websolr.com and log in to your account. Go into your index and choose Sunspot 1.1 Experimental as your client. Anything else will result in a sweet RSolr::RequestError: Solr Response: Bad Request error.
Well, all set, I deployed and opened up heroku console. I issued the sunspot index command: Sunspot.index(Decision.all). No error at first. With each passing second, I became more confident that I had succeeded this time. And then, suddenly:
/opt/local/lib/ruby/gems/1.8/gems/rest-client-1.3.1/lib/restclient/exceptions.rb:6:in `method_missing': stack level too deep (SystemStackError)
Well, Yoda was right. Dark side led to destruction. So, I went to sleep.
When I woke up, I had my revelation. Maybe the error was due to the sheer number of rows in the database. I opened up heroku console and issued a find_each command, which does the same thing but loads the records in batches:
Model.find_each(:batch_size => 10000) { |model| Sunspot.index(model) }
Yet, it resulted in the same SystemStackError. However, when I tried to open up heroku console again, I had a timeout error which kept popping up on my subsequent tries for a good 10 minutes. I knew Heroku was working behind my back.
When I was able to load the console again, I issued a simple Sunspot.commit, and there it was: my huge database was indexed and everything worked as intended. I don’t know whether the same would have happened after the Sunspot.index(Decision.all) command. But I don’t remember being unable to load the console after that, as I have a bad habit of trying the same thing over and over again hoping that the result may change.
Do not forget the Sunspot.commit! The documents you index do not show up in search results until you issue a commit.
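To make the two lessons above (work in batches, then commit) concrete, here is a minimal sketch. It is not what I actually ran, and I cannot promise it avoids the SystemStackError. The index_in_batches helper and the FakeIndexer class are hypothetical names invented for this example; the fake stands in for the Sunspot module so the snippet runs without a Solr server.

```ruby
# Index records slice by slice and commit after each slice.
# `indexer` is anything that responds to `index` and `commit`;
# in the real app you would pass the Sunspot module itself.
def index_in_batches(records, indexer, batch_size = 1000)
  total = 0
  records.each_slice(batch_size) do |batch|
    indexer.index(batch)  # one call per batch instead of one per record
    indexer.commit        # make this batch searchable before moving on
    total += batch.size
  end
  total
end

# Hypothetical stand-in for Sunspot that only records what it was asked to do.
class FakeIndexer
  attr_reader :indexed, :commits

  def initialize
    @indexed = []
    @commits = 0
  end

  def index(batch)
    @indexed.concat(batch)
  end

  def commit
    @commits += 1
  end
end

fake = FakeIndexer.new
count = index_in_batches((1..25).to_a, fake, 10)
puts "indexed #{count} records in #{fake.commits} commits"
# prints "indexed 25 records in 3 commits"
```

In the app itself, something along the lines of Decision.find_in_batches(:batch_size => 1000) { |batch| Sunspot.index(batch); Sunspot.commit } should do the same without loading every row at once. Whether to commit after every batch or only once at the end is a trade-off I have not measured.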
Well, I hope this will help people running into these problems.
More info at:
http://outoftime.github.com/sunspot/