Enterprise search is much like air and water: Users expect it to be available without a second thought. Google and ISYS continue to perfect their enterprise offerings to do just that.
Both the Google Search Appliance 4.6.4 and ISYS:web 8 produce quality Web search results, and both vendors offer desktop components to index local hard-disk content. The battle lines are drawn in how each crawls and federates content from enterprise systems — portals, databases, legacy systems, and external Web services — and the value enterprises receive.
Google Search Appliance 4.6.4
Much has improved since I first reviewed the GB-1001 2U server, especially how deep the system now reaches into relational databases and file shares. Furthermore, you can push non-Web-accessible information from portals and other internal systems to the appliance by employing code based on Google Enterprise APIs.
I found the new OneBox for Enterprise most intriguing. This set of APIs enables users to securely access business applications, such as CRM or BI systems, from the Google search box and have this information presented separately from public search results.
Dell now manufactures the Google Search Appliance; my test system was a re-badged PowerEdge 2950 with two dual-core Intel Xeon 5140 (2.33GHz) processors and 16GB of RAM. The system is still physically locked, and initial setup remains plug-and-play.
I connected the box to my network and temporarily plugged in my laptop to perform the initial configuration using a Web form. In about an hour, I was using my remote PC to create search collections, manage crawls, and customize search layout pages.
The Admin Console UI remains a collection of basic Web pages and forms accessed from a straightforward navigation tree. I had no trouble entering URLs to index and specifying continuous crawling to ensure that new content would be found right away and be included in search results.
I also set up KeyMatches, to give preference to specific results for common queries; Query Expansion, to enlarge a query to include multiple words with identical meanings; and Synonym lists. Changing the basic look of the search box and results was quick; more extensive changes didn’t take too much longer using the XSLT style-sheet editor.
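To make the distinction between these features concrete, here is a minimal sketch of how a promoted result and synonym-based expansion might interact. It is not the appliance's implementation; the `KEYMATCHES` and `SYNONYMS` tables, the URLs, and the word-count scoring are all assumptions made up for illustration.

```python
# Hypothetical configuration: neither the terms nor the URLs come from a real appliance.
KEYMATCHES = {"vacation policy": "http://intranet.example.com/hr/vacation.html"}
SYNONYMS = {"laptop": ["notebook"], "pto": ["vacation"]}


def expand_query(query):
    """Return the query terms plus any configured synonyms, preserving order."""
    expanded = []
    for term in query.lower().split():
        for candidate in [term] + SYNONYMS.get(term, []):
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


def search(query, index):
    """Rank documents by number of matched terms; list any KeyMatch URL first."""
    terms = expand_query(query)
    scored = []
    for url, text in index.items():
        words = text.lower().split()
        score = sum(1 for term in terms if term in words)
        if score:
            scored.append((score, url))
    # Highest score first; ties are broken alphabetically by URL.
    ranked = [url for _, url in sorted(scored, key=lambda pair: (-pair[0], pair[1]))]
    promoted = KEYMATCHES.get(query.lower().strip())
    if promoted:
        ranked = [promoted] + [url for url in ranked if url != promoted]
    return ranked
```

In this sketch, expansion widens what can match, while the KeyMatch entry overrides ranking for one exact query regardless of score.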
Crawling structured content in database systems follows much the same formula. I easily completed a form with the connection information for a Microsoft SQL Server 2000 database and designated the database rows and fields to crawl.
Because enterprise use of Microsoft SharePoint is so prevalent, I put indexing of WSS (Windows SharePoint Services) or SPS (SharePoint Portal Server) sites on my requirements list. Google currently handles this with an open source SharePoint Connector. For now, this is only sample code, and it takes a bit of configuring to make it work. Google representatives said they plan to release a new API and Connector framework in the first quarter of 2007. The new framework will build the SharePoint connector into the appliance’s software and enable easier crawling of Documentum, OpenText, and other enterprise document repositories.
The Google Search Appliance provides a solid range of security and access control, omitting documents from search results if users aren’t entitled to see them. The system indexes both public and restricted information and enforces document-level security policies at search time. Google also serves secure results with X.509 client certificates, a common requirement in government agencies.
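The idea of enforcing security at search time, rather than at index time, can be sketched in a few lines. This is only an illustration under simple assumptions: the `Document` class, its `allowed_groups` field, and the group names are hypothetical, and a real deployment would check entitlements against the source system rather than a local set.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    url: str
    # Hypothetical access list; an empty set is treated as "public".
    allowed_groups: set = field(default_factory=set)


def visible_results(hits, user_groups):
    """Drop any hit the user is not entitled to see, keeping the original order."""
    return [
        doc for doc in hits
        if not doc.allowed_groups or doc.allowed_groups & user_groups
    ]
```

Because the filter runs on every query, a change in a user's group membership takes effect without re-indexing the restricted documents.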
Search results were consistently top-quality. To start, I searched information protected by basic HTTP authentication, and I integrated the appliance with Lotus Notes to crawl a Lotus Domino server. The best pages were shown first, with similar results grouped into one cluster. New conveniences include number and date ranges that users can specify to narrow results.
I also examined several third-party OneBox for Enterprise solutions, which are quickly loaded through the appliance’s admin interface. The OneBox technology creates a trigger that determines whether the search is relevant to a OneBox module, such as finding customer information within your Salesforce.com account. Google then passes appropriate security credentials to the provider, gets the results in XML, transforms the data into HTML based on an XSL template, and presents the results to the user in line with their other search results.
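The flow described above (trigger, credentials, XML back from the provider, transformation to HTML) can be sketched roughly as follows. Everything here is a stand-in: the trigger pattern, the `fake_crm_provider` function, the token value, and the XML element names are invented for illustration and are not Google's OneBox schema. The transformation step is written in plain Python, whereas the appliance uses an XSL template.

```python
import re
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical trigger: queries such as "customer Acme" are routed to the CRM module.
TRIGGER = re.compile(r"^customer\s+(?P<name>.+)$", re.IGNORECASE)


def fake_crm_provider(name, credentials):
    """Stand-in for the remote provider; returns XML in a made-up schema."""
    if credentials != "valid-token":
        return "<results/>"
    return (
        "<results>"
        f"<record><title>{escape(name)}</title><field>Open orders: 3</field></record>"
        "</results>"
    )


def render_html(xml_text):
    """Convert provider XML to an HTML fragment (the appliance uses XSL for this step)."""
    root = ET.fromstring(xml_text)
    items = []
    for record in root.findall("record"):
        title = escape(record.findtext("title", default=""))
        detail = escape(record.findtext("field", default=""))
        items.append(f"<li><b>{title}</b>: {detail}</li>")
    return "<ul>" + "".join(items) + "</ul>" if items else ""


def onebox(query, credentials):
    """Return an HTML fragment if the query fires the trigger, else an empty string."""
    match = TRIGGER.match(query.strip())
    if not match:
        return ""
    return render_html(fake_crm_provider(match.group("name"), credentials))
```

A query that does not fire the trigger, or a user whose credentials the provider rejects, simply gets no module output alongside the ordinary results.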
This type of mashup is one of the more important developments in enterprise search. Users get relevant information from document management systems, Oracle purchase requisitions, SAS reports, and others within the featured area of the search results — all without any special steps.
ISYS:web 8
It may not have the consumer name-recognition of Google, but ISYS nonetheless has a strong following in midsize enterprises and governments. That’s because this solution is very easy to implement and use; it runs on common Windows hardware and doesn’t have any per-document charges or lease terms.
Along with adding many new features — including automatic entity extraction, SharePoint support, Search and Site Designers, and Best Bets — ISYS improved some of the top items from 2005’s Version 7.
These include a menu-assisted system for constructing complex Boolean queries and Quick View, which extracts and displays relevant portions of large documents to eliminate some downloading and opening of, say, large Acrobat or Word files.
I performed most of my ISYS:web 8 testing on a Windows 2000 Server because that was the closest environment to Google. I also looked at the companion ISYS:desktop 8 and its enhanced interface.
Another interesting addition, ISYS Search Designer, allowed me to view live search trends such as top search words or queries that had minimal hits despite my expectations of popularity. Based on these results, I could optimize the system quickly with the new Best Bets feature by placing a specific document as the first search result.
ISYS doesn’t have as many connectors to external systems as Vivísimo Velocity or exalead:one.
That said, ISYS Version 8 does crawl SQL databases, Web sites, Lotus Notes, and Unix servers, along with WSS and SPS. In the latter case, ISYS respected user permissions, returning only authorized content in searches. It handled XML documents, including RSS feeds, without a hiccup.
ISYS snappily indexed all sources, typically scanning 10GB to 12GB an hour — about 5 percent faster than Google, even though my Windows server was older Dell hardware with two classic 1.8GHz Xeon processors and only 2GB of RAM.
Searches are performed from the ISYS Web site, which runs under Microsoft IIS or the ISYS:web server. For larger installations, ISYS Federator allows users to make a single query against indexes on multiple ISYS:web servers.
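As a rough picture of what federation involves, the sketch below sends one query to several indexes and merges the hits by score. It is not how ISYS Federator works internally; the server names, the callable-per-server interface, and the assumption that scores from different indexes are directly comparable are all simplifications for illustration.

```python
import heapq


def federated_search(query, servers, limit=10):
    """Query every server and return the top hits overall, best score first.

    `servers` maps a hypothetical server name to a function that takes the
    query and returns (score, url) pairs.
    """
    hits = []
    for name, search_fn in servers.items():
        for score, url in search_fn(query):
            hits.append((score, name, url))
    # nlargest sorts by score, then by server name and URL to break ties.
    return heapq.nlargest(limit, hits)
```

In practice, scores produced by separate indexes often need normalizing before they can be merged this way.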