Internship: An introduction

My internship: Global search @ ugent.be

Lagging a bit behind, I’m going to describe my internship.

In a nutshell, my internship is about search, a whole lot of search. The portal site of Ghent University (http://www.ugent.be) is moving to Plone as its main CMS, and a whole lot of data is going into that portal. People need to be able to search all of it.

So, the challenge is to implement a search system that covers as much of that content as possible, preferably in an open source environment! The idea is to use Apache Solr, which is built on Lucene, as the main search engine. The fun part is that we stay within the field of open source and still provide a strong search engine.

That being said, I’m currently working towards understanding Solr. I have no experience whatsoever with Solr (and only very little with Plone), so this is going to challenge me, and I like challenges!

So, now that you know what the assignment is, let’s break it down into steps, because we’re in a western world and we enjoy a clear path of steps. That’s obviously needed, since just starting head-on would end in failure.

First: Learn Solr

First I’m going to discover as much as possible about Solr. It so happens that the university’s library has a Solr implementation running for their search. I’ve also been given the book Solr 1.4 Enterprise Search Server by David Smiley and Eric Pugh to study. Using that book I hope to get a clear picture of Solr.

Next: Case study

As mentioned in the first step, the university’s library is running a Solr implementation. It just so happens that this implementation is barely documented at the moment. We proposed to document it for them, since they’ve been (and probably still will be) a great help in getting us up to speed with Solr.

I’m basically going to write up their techniques as documentation, case study style. That way I will have seen every part of a running Solr implementation, and I’ll have a better idea of how to implement it at the scale we want.

After that: Plone + Solr

When that’s done, the goal is to have a Plone plugin that integrates Solr as easily as possible into any Plone setup. We’ve found some evidence that people in the Plone community have already started such an effort (or once started it), so we may get in touch with them to finish the module.
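To give a rough idea of what such a plugin would do underneath, here is a minimal sketch in Python (the language Plone is written in) that builds the URL for a keyword search against Solr's HTTP select handler. This is only an illustration, not the plugin itself: the base URL is assumed to be the default of the example server that ships with Solr, and the function name build_search_url is made up for this post.

```python
from urllib.parse import urlencode

# Assumption: default address of the example server bundled with Solr,
# not the actual UGent setup.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_search_url(text, rows=10, base_url=SOLR_SELECT_URL):
    """Return the URL for a Solr keyword search that asks for JSON results.

    build_search_url is a hypothetical helper, named for this example only.
    """
    params = {
        "q": text,     # the user's search terms
        "rows": rows,  # maximum number of hits to return
        "wt": "json",  # response writer: JSON instead of the default XML
    }
    # urlencode escapes the values, so spaces and special characters are safe.
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_search_url("library opening hours", rows=5))
```

A Plone integration would fetch that URL and turn the hits into Plone search results, and it would also have to push content into Solr whenever it changes, which this sketch leaves out entirely.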

Extra extra

Every internship has its main goals plus some extras, in case the main goals are achieved early (or just in time to do more work). I have some extras too.

The extra is the expansion of the global search. The UGent portal site does not hold all the content hosted by the university, and we want as much of it as possible to be searchable, so we may look for ways to make the external UGent websites searchable too, possibly with a crawler (Nutch, maybe).

Internship category

Since it’s best to blog my progress (and findings, if any) and possibly help people along the way, I created this small subsection on my site. For people interested in Solr and/or Plone (possibly the improvement or development of a Plone module for Solr, although it seems such an effort is already under way), I hope to help as much as possible!
