The Search Engine Meeting
 

April 24-25, 2006 • Fairmont Copley Plaza Boston • Boston, MA
(Preconference Workshops: Sunday, April 23)

 

 

 

General Conference - Day One: Monday, April 24, 2006
Keynote
9:00 am – 9:30 am
Dave Girouard, Google Enterprise
Searching The Long Tail
9:30 am – 10:00 am
Steve Papa, Endeca Technologies

Search logs reveal what Chris Anderson of Wired calls a "long tail": most people are looking for just a handful of your content items, and handfuls of people are looking for each of the rest. The challenge is that the myriad rare goals, taken together, may be more valuable to a company than the few blockbusters. Search alone falls short. A simple list of results brings back the whole tail, leaving it to the user to trudge through the whole thing to find the rare item that only he needs. But a company can't design paths through the site for each and every user. So what should it do?

The answer lies in a better understanding of human searching behavior, which is far richer than the traditional search model was designed for. Come find out about the four core activities of human information-seeking behavior and the interaction model that makes the most of this untapped resource.
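The long-tail claim above can be checked against any query log with a few lines of arithmetic. The following is a minimal sketch, not anything taken from the talk: the function name `head_tail_share` and the sample log are hypothetical, and the "head" is simply assumed to be the N most frequent distinct queries.

```python
from collections import Counter

def head_tail_share(queries, head_size):
    """Return (head_share, tail_share) of total query volume.

    The "head" is the head_size most frequent distinct queries;
    every other query belongs to the "tail".
    """
    counts = Counter(queries)
    total = sum(counts.values())
    if total == 0:
        return 0.0, 0.0
    head = sum(n for _, n in counts.most_common(head_size))
    return head / total, (total - head) / total

# Hypothetical log: two popular queries plus fifty one-off queries.
log = ["pricing"] * 30 + ["login"] * 20 + [f"rare-{i}" for i in range(50)]
head_share, tail_share = head_tail_share(log, head_size=2)
print(head_share, tail_share)
```

In this made-up log the fifty rare queries together account for as much volume as the two popular ones, which is the situation the abstract describes.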

Don't Just Change the Search Engine!
10:00 am – 10:30 am
Mike Moran, IBM

Only 34% of searches on corporate Web sites succeed, yet we often treat this as a technology problem only. While replacing a poor search engine is undoubtedly a good idea, too many of us stop there, ignoring the content and user interface issues that can make or break a search facility. By looking at search in a holistic way, you can get your company to take all the steps necessary to improve.

Medically Guided Search: New Technologies to Make it Good for Your Health
10:30 am – 11:00 am
Tony Gentile, Healthline Networks

Searching for health information on the Web is often like searching in a foreign language, as consumer and medical terms do not match. Often, consumers have only a limited amount of information about the health topic they are seeking, and they are not sure how to express it to retrieve useful search results. Medical search must be more intuitive to searchers' needs in order to be successful. The idea and implementation of concept search methods, using ontologies, categorization and facets, have been in place for some time. But they have yet to be truly successful in a context where the searcher is often not able to manipulate the language, like the world of medical information. In this area we posit that the representation of the search results and its aids is as relevant as, if not more relevant than, the search results themselves. We look at conceptual issues at the core of this problem and present architectural solutions that leverage ontologies mapping between the consumer world and the specialized world. Furthermore, by providing key visual and navigational tools to searchers, we allow them to disambiguate their requests or better understand the situation they are in.

During this presentation, Igor Perisic presents information about new breakthroughs in the areas of search and text mining technologies that are being used to bridge the gap between the needs of the average user and the inherent complexity of the sources containing that information. The selected area during this presentation will be the medical health space.

WYSIWYG Search Crafting
11:00 am – 11:30 am
Max Copperman, Knova Software, Inc

In internet search, users are responsible for the success of their search; search engines must be best-of-breed to maintain advertising revenue, but everyone's expectation is that if a search does not return the desired results, the user will try to formulate a new query that will. In enterprise search, the enterprise has an interest in the success of a search; for example, in Customer Service or Support it may deflect a call, or (if the search is performed by a call center agent) it may reduce call time or improve first call resolution. The enterprise has control of the content and owns the search results page in a way that an internet search engine does not. While an internet search engine may put energy into keeping content from gaming the engine, an enterprise search engine may put energy into supporting mechanisms for gaming the engine.

This presentation describes an effective approach to crafting search experiences, that is, crafting the user experience resulting from a search. Fundamental to the approach is a WYSIWYG editor for crafting search experiences in context. The tool is aimed at a "search administrator", the person responsible for improving search at the enterprise; this person knows the enterprise but is not an expert in search algorithms, knowledge representation, or linguistics. The search administrator runs a search in the editor, seeing what the end user will see, and can create, add, delete, move or edit search results, navigation links, and/or special-purpose widgets. After the editing session is saved, the end user will have the experience that the search administrator produced.

This presentation addresses several interesting questions raised by this approach. It is crucial to be able to craft an experience for a class of queries, rather than handling one query at a time. This raises the question of how to trigger a crafted search experience, that is, how to match end user queries against crafted searches. Search administrators must focus their efforts. What sort of tools would be appropriate to help a search administrator discover what searches to craft? There are other mechanisms that impact search, such as a dictionary or thesaurus. How do crafted searches (or their components) interact with any underlying knowledge representation? And finally, is such a tool only useful for enterprise search? Where might it fit in the Internet Search world?

Concept Searching Across RSS Feeds and Structured Content Repositories: A Business Use Case
11:30 am – 12:00 pm
Joseph Tragert, EBSCO

Latent semantic indexing engines, synthesizing a range of magazine and newspaper content, can create a powerful business information monitoring capability. While maintaining its long-time commitment to keyword search across structured content and metadata, EBSCO Publishing has recently developed “Executive Daily Brief,” which utilizes “Conceptual Search” (using a latent semantic indexing engine from Content Analyst LLC) with a folder-driven interface paradigm. Through the engine’s vector-based indexing algorithm, EDB is able to integrate real-time RSS feeds with journal and magazine content, and present conceptually related articles that are added to topic-specific folders (which can be defined by the users). The “concepts” are matched on the full text of the content being searched, rather than on subject headings or other metadata. In fact, the concept search yields appropriate results without the use of a pre-created thesaurus. Please join a developer of this new search system in a discussion on the pros and cons of Concept Searching, RSS-feed integration, and latent semantic indexing when applied to business information products.

Towards Restoring Conversation to Search
12:00 pm – 12:30 pm
Alan Feuer, Blossom Software

Even though many people have recognized that searching is an iterative process, most studies of Web-based search have focused on individual queries. In an attempt to understand how queries evolve over a search session, we collected usage-log data from approximately 200 community and site search facilities. This presentation contrasts usage of site-specific search to web-wide search and discusses our findings on search-session behavior.

Speeding Search - Faceting for Faster, Relevant Drill-down
1:30 pm – 2:00 pm
Claude Vogel, Convera

Research shows that few viewers scroll past the first page of search results. Unfortunately, this means valuable information may be missed or relationships overlooked. One solution is to make the search engine do more of the work. Faceting, which retrieves and organizes the most relevant information from all results and places it in more digestible first-page windows, is one option. More powerful and refined filtering mechanisms that do a better job of ranking results are another. This presentation describes the different options that exist for organizations requiring comprehensive web responses and specificity of results.

Challenges in Scaling Federated Search
2:00 pm – 2:30 pm
Abe Lederman, Deep Web Technologies

If bigger is better and more is better, then it is inevitable that future federated search applications will be required to scale to thousands of information sources. This scaling introduces problems not seen in today's much smaller deployments. Automated selection of the best sources to search for a particular query will become critical. Also critical will be better management of access to sources, given that sources are geographically dispersed and that there are differences in the search capabilities, service levels, and content type and quality provided by different content owners. As large numbers of search results pour in from many sources, it becomes equally important to rank documents effectively and efficiently and to organize them in useful ways, visually and otherwise. Key to searching, retrieving, ranking, organizing and presenting relevant results in this new decentralized, high-content world will be the ability to divide and conquer, i.e., to distribute the workload among different computers in different locations providing different services. This presentation examines the problems of scaling and possible solutions.
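The fan-out and merge steps described above can be sketched in a few lines. This is an illustrative sketch only, not the speaker's system: the names `federated_search`, `normalize`, `source_a` and `source_b` are hypothetical, each source is assumed to return non-negative raw scores, and dividing by each source's top score is just one simple way to make scores from different sources comparable.

```python
from concurrent.futures import ThreadPoolExecutor

def normalize(hits):
    """Rescale one source's raw scores to [0, 1] so sources are comparable."""
    if not hits:
        return []
    top = max(score for _, score in hits)
    return [(doc, score / top if top else 0.0) for doc, score in hits]

def federated_search(query, sources, limit=10):
    """Query every source in parallel and merge the hits into one ranking.

    `sources` maps a source name to a callable: query -> [(doc_id, raw_score)].
    """
    with ThreadPoolExecutor(max_workers=max(1, len(sources))) as pool:
        # Divide: submit one search task per source.
        futures = {name: pool.submit(search, query)
                   for name, search in sources.items()}
        # Conquer: collect and normalize each source's results.
        merged = []
        for name, future in futures.items():
            for doc, score in normalize(future.result()):
                merged.append((score, name, doc))
    # Highest normalized score first; ties broken by source name, then doc id.
    merged.sort(key=lambda item: (-item[0], item[1], item[2]))
    return merged[:limit]

# Hypothetical sources that score on very different scales.
def source_a(query):
    return [("a1", 8.0), ("a2", 4.0)]

def source_b(query):
    return [("b1", 0.9), ("b2", 0.3)]

results = federated_search("deep web", {"A": source_a, "B": source_b})
print([doc for _, _, doc in results])
```

A real deployment would also need source selection, timeouts and error handling per source, which are exactly the scaling problems the abstract raises.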

Faceted Navigation: An Alternative to Search and Browse
2:30 pm – 3:00 pm
Tom Reamy, KAPS Group

Faceted navigation has emerged as a different and powerful way to find information. Built on a strong theoretical foundation, faceted navigation combines the power of advanced searching with the ease of use of browsing. Facets are easier for users to understand because they are defined orthogonally, which enables both novice and sophisticated users to achieve great results. This presentation looks at how to build a good foundation for faceted navigation, how to define a system of facets, how to develop structure within facets through the use of faceted taxonomies, and how to put it all together to create an alternative to search and browse.
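As a rough illustration of why orthogonal facets are easy to navigate, the sketch below filters a small catalog by the facet values a user has selected and counts the values still available in the other facets. The function name `facet_counts`, the facet names and the catalog are all hypothetical, and items are assumed to be flat dictionaries with one value per facet.

```python
from collections import Counter, defaultdict

def facet_counts(items, selected):
    """Filter items by the selected facet values, then count what remains.

    `items` is a list of dicts mapping facet name -> value.
    `selected` maps facet name -> required value.
    Returns the matching items and, for each unselected facet,
    the number of matching items per value.
    """
    matches = [
        item for item in items
        if all(item.get(facet) == value for facet, value in selected.items())
    ]
    counts = defaultdict(Counter)
    for item in matches:
        for facet, value in item.items():
            if facet not in selected:
                counts[facet][value] += 1
    return matches, {facet: dict(c) for facet, c in counts.items()}

# Hypothetical catalog with three independent facets.
catalog = [
    {"type": "report", "topic": "search", "year": 2005},
    {"type": "report", "topic": "taxonomy", "year": 2006},
    {"type": "article", "topic": "search", "year": 2006},
]
matches, counts = facet_counts(catalog, {"type": "report"})
print(len(matches), counts)
```

Because each facet is independent of the others, a selection in one facet narrows the result set without changing what the remaining facets mean, so the counts can be shown as the next drill-down choices.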

The Hidden Side of the Metasearch (Federated Search) World (or Metasearch in the Big Bad World)
3:00 pm – 3:30 pm
Dr Peter Noerr, MuseGlobal

Metasearch (or federated search) engines search many sources simultaneously in order to get answers for users. A typical organization has some 250-300 sources, which may be internal, external, free or paid-for. A typical search done by an end-user often goes out to 40-50 sources simultaneously, and sometimes many more. The work of the metasearch (or federated) engine involves a complex mix of parsing, converting, reformatting, and pre-search, in-search and post-search processing. However, the real work and the real pain points for a federated search engine are the connectors to the various data sources. Because the source search engines vary in protocols, search languages, record structures, authentication methods and levels of implementation of all of these, it is necessary to have discrete connection code and profiles for each of them, and to allow each user organization to configure them as their own circumstances and commercial terms require. Creating, testing, deploying, configuring and maintaining these Source Packages is a daunting IT task. New Source Packages are required on short timescales, and fixes must be delivered within hours or days, as the service is down if the Source Package is not working. "Not working" almost invariably means there has been a change in the source -- sometimes as simple as a URL change, other times as complex as a new UI. Some sources, especially web sites, web search engines, subscription search engines, enterprise databases, and transaction systems, have stable lifetimes measured in days, and this must be catered for, or a federated search system sinks to the lowest common denominator. Because all of this can be quite labor-intensive, a great deal of the process needs to be automated. This presentation describes an architecture that makes this ultra-rapid building and selective software replacement possible, and describes a mechanism to handle the work and deploy the software.

Visualizing Emerging Intelligence through Text Mining
4:00 pm – 4:30 pm
Pete Cipollone, Factiva

With the explosion of both traditional and new information outlets, opportunities for gathering intelligence are everywhere and the pace of change is accelerating. On the forefront of new analysis techniques, text mining has become a powerful way to uncover the changing business issues and discover the emerging social trends that will drive business now and into the future. Text mining helps detect the complex signals of emerging trends over billions of pieces of information. Together with visualization tools, this new form of analysis provides a unique window to view the world.

Unlike data mining, which extracts information from highly structured databases, text mining extracts meaning from unstructured data, such as web pages or many years of news stories. In text mining, patterns such as word proximity and sentence structure, along with statistical approaches, uncover meaning in language and are used to extract important information for further analysis. Visualization tools often sit on top of the data to make the outputs easy to understand.
But to get a comprehensive view of emerging trends, text mining must tap all sources including mainstream press and consumer generated media such as blogs and message boards. With excellent analysis and visualization tools to make sense of the mined text, emerging trends can be gleaned by careful examination of differences over time found in content sources.

During this session, Factiva will discuss how text mining and visualization technologies can help provide a unique window to the world of information and generate relevant and meaningful results across billions of pages of data searched. Finally, it will showcase anecdotes of companies that have instituted and successfully utilized these tools in their organizations.

Searching & Mining
4:30 pm – 5:00 pm
Pascal Coupet, TEMIS

Combining search engines with text mining is becoming a standard for industry-specific knowledge management solutions. This presentation highlights different parts of the association between information retrieval and information analysis solutions. The first part focuses on text mining capabilities that enrich the metadata associated with documents with automatically extracted information. This information is then used by the search engine's index schema to retrieve information with more accuracy or completeness. We conclude on the complementarity of search engine and text mining solutions in answering customers' requirement: making sense of content from the ever-growing body of textual information. We illustrate these benefits with real-life examples.

Enterprise search as a productivity tool - or the power to search in context
5:00 pm – 5:30 pm
Laurent Proulx, Nstein Technologies

An optimal search experience is achieved with superior metadata creation and indexing tools. By exposing metadata created with linguistic-based text-mining tools, one can find pertinent information in a matter of seconds. Quick response time and pertinent results translate to important productivity gains in the context of an enterprise search.

Conference Mixer Cocktail Sponsored by Convera and TEMIS
6:00 pm – 8:00 pm



 

 
© 2009 - 2013, Information Today, Inc.