Tuesday, August 21, 2007

ILTA Conference, Day 2, Session 2: Finding the Needle in the Enterprise Search Haystack

Chard Ergun, White & Case
Dora Martinez, Sheppard Mullin Richter & Hampton LLP
Mike Tominna, DLA Piper

Engine Selection Process

White & Case started working on enterprise search about 1 1/2 years ago. The special challenge for White & Case was to find a search engine that was language-independent and able to handle multiple offices.

They gathered search engine requirements through surveys, interviews, log files, and usage reports.
They built a search engine matrix with priorities and weights.
Started with 4 products from Gartner’s Magic Quadrant (Autonomy, perhaps Fast and Endeca?).
Built extensive test environment, had 6 million documents, 120 mailboxes, documents from Europe and America, and external web sites. They installed 4 search engines and spent 4-6 weeks on each. Each was ranked against the 30-item matrix.
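
A weighted matrix like the one described above can be sketched in a few lines of Python. This is only an illustration: the criteria names, weights, ratings, and engine labels are all hypothetical placeholders, not White & Case's actual matrix or results.

```python
# Hypothetical criteria and weights; the real matrix had about 30 items.
WEIGHTS = {
    "language_independence": 5,
    "security_model": 5,
    "scalability": 4,
    "data_source_coverage": 3,
}

# Hypothetical 1-5 ratings for placeholder engines, not real evaluation results.
RATINGS = {
    "Engine A": {"language_independence": 5, "security_model": 4,
                 "scalability": 4, "data_source_coverage": 5},
    "Engine B": {"language_independence": 2, "security_model": 5,
                 "scalability": 5, "data_source_coverage": 3},
}


def weighted_score(ratings, weights):
    """Return the weight-normalized average rating for one engine."""
    total_weight = sum(weights.values())
    return sum(weights[c] * ratings[c] for c in weights) / total_weight


def rank_engines(all_ratings, weights):
    """Return (engine, score) pairs sorted from best to worst."""
    scored = [(name, weighted_score(r, weights)) for name, r in all_ratings.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


ranking = rank_engines(RATINGS, WEIGHTS)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Dividing by the total weight keeps the score on the same 1-5 scale as the ratings, which makes it easier to compare engines when criteria are added or reweighted.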

Preferred “concept-based” search. Each attorney ran a test against what they had done in the last few months. Language independence was a major factor.

Attorneys liked the automated query guidance that provided suggestions for other terms.

White & Case uses a podcast / voicemail system and took Outlook offline. Autonomy allowed attorneys to search their C drives and these systems.

Autonomy respected underlying security model. It was scalable (9 million documents) and the support was excellent. They are a publicly traded company.

One box, one button, gives attorneys what they want. Keep adding new features without telling attorneys. Autonomy has other features but they only displayed the search. Hold monthly or weekly updates.

Autonomy is automatic, language independent.

Lessons learned:

There is no search engine that is perfect. They have advantages and disadvantages. Fine-tuning is therefore crucial. There will be extensive costs in setting them up.

Have to have a good security model, scalable platform, with extensive data source coverage.

They did no training, had a one-page Quick Chart, and had some marketing. If there is a high usability rate, you don’t need training.

They have Elite, iManage/Interwoven, and many other applications. Autonomy covers more than 400 data types. They have used it for more than a year.

Word of mouth is very important, but you have to test as well. There is no way back after the investment.

Screen shots/features

Can limit search to an application by icons
Remote channels pop up as you start to type a letter. Example was "comfort letter." A remote channel is a dynamic list of knowledge documents, or other channels, that shows you precedent or exemplar documents related to the one you are working on.

Michael Tominna/DLA Piper

DLA Piper was Recommind's first sale five years back.

The problem is that there is too much data from too many sources, with poor search in the different applications.

The solution is a product that produces meaningful results, that is scalable to millions of records, with the ability to crawl standard products (SQL, File Shares, XML, web, etc.), and has the ability to crawl via API.

A question to ask: will it create its own indexes or use the indexes of the applications?
With Recommind they are searching bios, 18 million documents, Elite accounting system including clients, matters, and time entries, biographical people information such as bar, languages, and so forth.

Faceted search avoids having to construct Boolean searches. Can limit by matter, industry group, author, applications, search for people or documents. Drill-down on matter includes MRA, CRA, who worked on it.
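
As a rough illustration of what faceted refinement does behind the scenes, here is a minimal Python sketch. It is not Recommind's implementation; the records, field names, and helper functions are hypothetical.

```python
from collections import Counter

# Hypothetical records; field names are illustrative only.
DOCS = [
    {"title": "Comfort letter", "matter": "M-100", "author": "Lee", "application": "DMS"},
    {"title": "Engagement letter", "matter": "M-100", "author": "Kim", "application": "DMS"},
    {"title": "Time entry summary", "matter": "M-200", "author": "Lee", "application": "Accounting"},
]


def refine(docs, **selected):
    """Keep only documents matching every selected facet value."""
    return [d for d in docs if all(d.get(f) == v for f, v in selected.items())]


def facet_counts(docs, facet):
    """Count how many documents fall under each value of one facet."""
    return Counter(d[facet] for d in docs)


# The user clicks a matter, sees author counts, then clicks an author.
hits = refine(DOCS, matter="M-100")
print(facet_counts(hits, "author"))
narrowed = refine(hits, author="Lee")
print([d["title"] for d in narrowed])
```

Each click adds one more equality filter, so the user ends up with the equivalent of an AND query without ever typing Boolean operators.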

Search exposed poor security in some areas such as HR.


Search technology will not fix all of the problems. Tweaking will be required. Slow performance (5-10 seconds) will kill it. Build out more than you need because the amount of data crawled is huge.

Dora Martinez / Director of Project Management at Sheppard Mullin

SM looked at SharePoint because they needed a single place to search full-text across multiple systems, with relevancy, and were also looking for a new portal.

SM did a requirements-gathering process through interviewing key constituents from each practice group, every department head, and every office manager, along with a firm-wide survey. They worked with an outside Microsoft Partner.

Results validated that the existing search tools were not doing the job, and that users wanted a "Google-like" enterprise search. Because SharePoint out of the box was not going to have all these features, they decided to work with XMLaw.

SharePoint will not respect security features and will not provide DMS functionality.

One challenge was that they were working with beta code up until the release date.


SharePoint let them index the CMS, the DMS, and many other data sources. There was no extra expense, and other search engines were outside SM's price point.

They enhanced the firm directory and used Best Bets to surface the most commonly used forms, such as HR forms.
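
The general idea behind Best Bets, curated results pinned above the regular hits for certain keywords, can be sketched as follows. This is a generic Python illustration, not SharePoint's API; the keyword mapping, form names, and function names are hypothetical.

```python
# Hypothetical curated mapping from trigger keywords to a recommended resource.
BEST_BETS = {
    "vacation": "HR: Vacation Request Form",
    "expense": "Accounting: Expense Report Form",
}


def best_bets(query, mapping=BEST_BETS):
    """Return curated results whose keyword appears as a word in the query."""
    words = set(query.lower().split())
    return [target for keyword, target in mapping.items() if keyword in words]


def search(query, organic_search):
    """Put Best Bets ahead of whatever the regular engine returns."""
    return best_bets(query) + organic_search(query)


# Stand-in for the real engine, returning a fixed hit list.
def fake_engine(query):
    return ["doc1"]


print(search("expense policy", fake_engine))
```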

Results are really, really fast (fraction of a second). Typically a user gets 8,000-9,000 documents. User can refine by client, matter, or author.

Their intranet has 4 searches on the home page, with a "Google"-like box up top and 3 on the left-hand side (Client-Matter and 2 others).

A document result includes a hyperlink to the matter information.

Results and Tips

Search was a big hit, with a lot of usage. Firms should continue to survey users to determine what they like and don't like. Remind users how great the system is.

They follow metrics and look into declines.

Users now stay in Outlook and the portal.

What resources were required for implementation?
SM--implementation of the portal took about 2 months; indexing took 2 weeks.
DLA Piper--had to recrawl 18 million documents, was an 8-week cycle. Relevancy tweaking was key (Doc type? partner documents?)
White & Case--knowledge bank implementation took a few weeks; then opened document management system, took 6 months.

Any client document uses?

One of White & Case's major clients is tapping into some of their internal documents across the Atlantic.

I asked about their experience with tapping into time and billing records.

White & Case doesn't directly offer an expertise search that joins biographies with data from the time & billing system, such as the number of hours on a certain type of matter. Autonomy taps into the index from this separate system.

SM allows drilling down into matter information, and attorneys have appreciated the ability to find invoices and so forth. The search does not integrate daily billing information, however.

Michael noted that just because your engine is crawling this information, that doesn't mean that you have to expose it.
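
One way to picture Michael's point, combined with the earlier note that search exposed weak security in areas such as HR, is a filter applied at query time. The Python sketch below is an assumption about how such a filter might look, not how any of the three products work; the source names, group names, and record fields are hypothetical.

```python
# Hypothetical source names; the allow list is the set the firm chooses to expose.
EXPOSED_SOURCES = {"dms", "bios", "matters"}

# Hypothetical crawled hits, each tagged with its source and permitted groups.
RESULTS = [
    {"title": "Merger memo", "source": "dms", "allowed_groups": {"corporate"}},
    {"title": "Salary review", "source": "hr", "allowed_groups": {"hr_staff"}},
    {"title": "Time entries", "source": "billing", "allowed_groups": {"corporate"}},
]


def visible_results(results, user_groups, exposed=EXPOSED_SOURCES):
    """Drop hits from unexposed sources, then drop hits the user may not read."""
    return [
        r for r in results
        if r["source"] in exposed and (r["allowed_groups"] & user_groups)
    ]


print([r["title"] for r in visible_results(RESULTS, {"corporate"})])
```

The billing records stay in the index, so they can be switched on later by adding the source to the allow list, without a recrawl.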
