Search engine vendors – good at eating their own dog food?
You probably know the saying “you should eat your own dog food”. It means that you should use your own products – anything else would tell your customers that you’re not trustworthy. Or would you buy a Mercedes if Dr. Dieter Zetsche drove a Toyota?
With search engine vendors it’s the same. You would surely expect these companies to use their own technology for searching their corporate websites, and of course they do. But how good are they at it? With most commercial search engine packages you can get average results from the out-of-the-box configuration. To add a really helpful service to your website, however, you need to tweak and tune both data and software so that the search can answer your users’ questions. No wonder that all vendors offer not only their software but also consulting on data aggregation, data cleansing and configuration of their products. So they must be experts at these tasks. But does this show on their own websites as well? Let’s choose three different vendors, navigate to their sites and ask them some questions – because they will know.
The selection of companies contains two main rivals in the high-end segment (Autonomy and FAST) and one with more moderate pricing (Thunderstone).
What can I do for you?
The better your products are, the more often you will find this query in your logs: people are looking for a job. The pattern is representative of the so-called “navigational queries”: the user expects an entry point to the job listings, from where he can navigate and browse the offerings to decide whether it’s worth submitting his curriculum vitae or not.
Will we find what we are looking for when searching for “jobs” in the three different sites?
Autonomy’s results for “jobs” list their products first (“Autonomy Service Manager” is a software component, not a job title), followed by information about solutions, services and investing. Finally, as the tenth entry, you find the job section you were looking for – if you don’t miss it, because the title just says “Autonomy Aims High!”. What about adding representative titles for the main navigation entries?
Results from FAST are exactly what you would want: the first item is a “Featured Content” box which leads you directly to the job openings. This is not part of the result set, however, but apparently the result of query analysis and a second query on (most likely more structured) data. Lesson learned: detecting the purpose of a query through some query analysis, and therefore knowing where to find what, will lead to better results than endless tuning of your engine’s weighting scheme.
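To illustrate the idea, here is a minimal sketch of such a query analysis step in Python. It is not how FAST actually implements its “Featured Content” box (I can only guess at that); the trigger words, titles and URLs are made up, and `fulltext_search` stands in for whatever engine you are running.

```python
import re

# Hypothetical table of navigational intents: trigger words -> featured entry.
# The words, titles and URLs are invented for illustration.
FEATURED_CONTENT = [
    ({"job", "jobs", "career", "careers", "vacancy", "vacancies"},
     ("Job openings", "/careers")),
    ({"ceo", "management", "executives"},
     ("Management team", "/company/management")),
]


def tokenize(query):
    """Lowercase the query and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", query.lower()))


def featured_for(query):
    """Return the featured entry whose trigger words match the query, or None."""
    tokens = tokenize(query)
    for triggers, entry in FEATURED_CONTENT:
        if tokens & triggers:
            return entry
    return None


def search(query, fulltext_search):
    """Run the intent lookup next to the normal full-text query.

    `fulltext_search` is a placeholder for the real engine call.
    Returns a pair (featured entry or None, ordinary result list).
    """
    return featured_for(query), fulltext_search(query)
```

The featured entry is looked up separately from the ranked result list, so it can be shown in its own box above the results without touching the engine’s weighting at all.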
When searching for “jobs” at Thunderstone you only get 4 results, none of them pointing to their jobs page (which does exist). Interesting is a sentence that shows up in the summary of one result: “If you wanted to know which employees were seeking new jobs at the present time, …”. Apparently they would have the tools to know if and how people are looking for jobs…
Who’s your boss?
Let’s try a “fact query”. You want to know who is in charge of the company. Searching for “CEO”, you would expect to find the name of the person, including CV and maybe contact information, with one click.
No results are found when trying to find Thunderstone’s CEO. It seems they decided not to list the management on the site, so obviously you won’t find it. Fair enough.
Autonomy’s CEO is found on the first page, but again at position 10. However, the result is the first match in the category “Company”. It seems that they use a fixed ranking scheme for the results (products and services first, then news, and at the end everything about the company). If the category refinement were visualized better, you would reach your goal a little more easily.
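A fixed category order like the one I suspect here can be sketched in a few lines of Python. This is only my guess at the behaviour, not Autonomy’s actual implementation; the category names follow the order observed above, while the scores and titles are invented.

```python
# Assumed fixed category order, as observed on the result page.
CATEGORY_ORDER = ["Products", "Services", "News", "Company"]


def rank(hits):
    """Sort hits by fixed category position first, relevance score second.

    Each hit is a tuple (category, score, title). Unknown categories go last.
    """
    position = {category: i for i, category in enumerate(CATEGORY_ORDER)}
    return sorted(
        hits,
        key=lambda hit: (position.get(hit[0], len(CATEGORY_ORDER)), -hit[1]),
    )


# Invented example: the best-scoring hit is in the "Company" category.
hits = [
    ("Company", 0.9, "CEO profile"),
    ("Products", 0.2, "Some product page"),
    ("News", 0.5, "Some press release"),
]
ranked = rank(hits)
```

With such a scheme the CEO profile ends up at the bottom even though it has the highest score, which would explain why the best match for “CEO” only shows up at position 10.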
First match for the CEO of FAST is the profile of John M. Lervik – perfect.
Let’s talk
The third and last discipline is “informational queries”. These contain a fully formulated request for information about a certain topic. Usually they do not consist of only one or two words but of a naturally formulated sentence. Answering these queries correctly often requires natural language processing (NLP) that is able to parse the sentence and make sense of it. Of course, people who are not experts in the language used (have a look at my English for example…) should not be penalized too hard. This discipline is what Autonomy is proud of (yes, that’s why there’s a text area instead of a single text field, which suggests that you should use more words), so expectations are high.
As an example query we want to know the following: “How to crawl an NTLM protected site”. Probably we won’t find too many technical details since they will be protected inside an extranet for existing customers. But nevertheless, we should find some hints about the security architecture of the products.
Thunderstone does not tell you about NTLM – no results found.
Same results at the FAST site. No results about crawling NTLM.
The only site that delivers you some information is indeed the one of Autonomy. And the first entry really points you to the security architecture where the difference between “mapped” and “unmapped” security is explained. Respect!
Conclusion
I know that this little crosscheck is far from being representative or even fair. The pure portal search is only one of the (minor?) goals of the large systems. Connectors to whatever data sources you could find in your company, security, personalization, push propagation, clustering and many more are parts of the whole, and I’m not interested in those for now.
However, since your company’s corporate website may be the first impression that a possible future customer gets, it is really important to prove to him that you are actually able to use your own products. How would you sell consulting otherwise? Maybe this argument applies a little less to lower-cost search engines that try to reach the masses with their software licenses rather than selling consultancy to the Fortune 500.
As far as their own portal search is concerned, FAST definitely leaves the best impression, even though I have no experience with their products so far (unlike the other two, which I have used in quite a few projects).
One last thing: my friends at Eurospider, who are doing consulting and implementation for us, don’t even have search functionality on their website. Any reason for that?






June 20th, 2007 at 12:04
Nice article Chris!
Concerning the Eurospider no-search-available: maybe there’s not enough information on their website to make it simpler to find using a search?
Regards,
Kay