The Search Engine Meeting
 

April 27-28, 2009 • Fairmont Copley Plaza Boston • Boston, MA
(Preconference Workshops: Sunday, April 26)

General Conference - Day One: Monday, April 27, 2009
Why Today’s Search Engines are not Prepared to Drive Tomorrow’s Information Experiences
9:00 am – 9:30 am
Bjørn Olstad, Microsoft

The opening keynote explores how search engines will have to transform going forward. Three forcing functions for this transformation will be explored. The first is the pervasive enablement of information-driven user experiences. Search engines have come a long way from the original goal of linking 1.4 words to the right text documents. A demonstration will show how natural user interfaces, social computing and search can merge to power a new breed of experiences. Second, the presentation explores how data fusion of text, structured data and rich media will change content analytics. The Pharos research project for rich media access will be used to illustrate emerging architectural requirements. Finally, cloud computing and its impact on search will be analyzed. Search will have to consolidate information from client, cloud and on-premise repositories into broad information experiences.

The Variety of Goals and Applications of Semantic Approach to Search
9:30 am – 10:00 am
Dmitri Soubbotin, Semantic Engines LLC

This presentation compares different approaches to presenting search results to users. Various types of search queries have been identified based on user intent. Accordingly, different types of results are identified: a conventional list of links; a hierarchy or a cluster of concepts with underlying links; a direct answer. A multi-document summary of Web sources is introduced as a legitimate type of search result, using SenseBot as an example. A "semantic cloud" of key concepts is suggested as a means of controlling the focus of the summary. The idea is to give the user a quick answer, in many cases obviating the need to drill down into the sources.

Semantic analysis is discussed as a way to augment traditional search with its page ranking system. Examples of intelligent applications based on the approach are presented. Intersection between semantic search engines and Semantic Web is discussed as a mutually beneficial opportunity. Two major challenges facing semantic systems are the ambiguity of natural languages, and high infrastructure requirements. Some ways to deal with these challenges are discussed.

A Pragmatic Look at the Semantic Web
10:00 am – 10:30 am
Diane Burley, Burley Associates

Research shows that there are two types of site searchers: those who rely on the search bar and those who are link-dominant, who, like a spider, crawl from page to page using inline links, if those links exist. The challenge with the search bar is that unless the reader types in exactly the words that the journalist used, the story will go undiscovered. A story on “great stuffings for your holiday bird” may not appear if the reader happens to type in the word “dressing” and the story was not tagged properly. Move beyond the realm of synonyms to denotations and connotations and you arrive at the semantic web: a world filled with literal and figurative associations that could help readers find what they are looking for, regardless of whether they know they are looking for it. “Tagging” is the simple answer, while rich metadata are the crux of the semantic web.

The advancement of the semantic web is a transformative time for news sites. If simple tagging seems onerous, how is it possible that we could consistently and comprehensively semantically tag, and more importantly semantically associate, assets, be they article, image, motion or audio? The answer is multifaceted. In this presentation we take a rudimentary look at the components of the semantic web (tagging, taxonomies, authority files, knowledge bases) and at some of the tools that will help you automatically tag and associate. Further, how can we expose these rich metadata to create a better reader experience? Indeed, how can we expose these metadata on the back end so that we editors can research or package news with greater ease? Does automation obviate the need for mediation? Just how do editors thrive in the semantic web?

Break, Exhibition and Networking
10:40 am – 11:10 am
Semantic Coherence and a New Search Paradigm
11:10 am – 11:40 am
Frank Bandach, eeggi

This presentation discusses the engineering of an indexing-numeric language for the manipulation of semantics, grammar, concept novelty, responsiveness, disambiguation, translation, and its evolution into basic rationality towards a new search engine paradigm.

The Puzzle of Semantic Technologies
11:40 am – 12:10 pm
Dr. Kathleen Dahlgren, Cognition Technologies

Semantics is now center stage in search, with various approaches having been proposed. Most current approaches to Web 3.0, or the Semantic Web, primarily tag pages in a tagging language. Others use ontology, so that users can query "car" and see retrievals with "SUV" or "Porsche", or they present users with summaries or pull-downs based on ontology. Still other semantic approaches focus on syntax parsing in order to recover the formal semantics or argument structure of text and query. Another additive approach to semantics is the building of a Semantic Map. A Semantic Map contains word-level and contextual information that enables a search engine to do complete word sense disambiguation, or understanding, at the word level. Our goal should be a complete approach that treats all aspects of semantics, including sense disambiguation, ontology, synonymy, commonsense knowledge, aspect, information to assist in pronoun reference and discourse reasoning and any other information required to replicate full lexical and formal semantic reasoning.

Advanced Visualization of Search Results: More Risks, or More Chances?
12:10 pm – 12:40 pm
Martin Baumgärtel, Client Services, Velocitude

Many products have been deployed and numerous articles have been published about breakthroughs in the visualization of search results. Yet is search result visualization common practice in everyday information retrieval tasks? This presentation addresses the gap. Results from case studies and from the analysis of human-computer interaction are presented. Direct user feedback on the visualization of semantic relations is summarized and a general theory is derived. Whether you work on visualization or have investigated visualization technologies and designs to improve the search experience in your environment, this presentation will give you valuable advice, help in maintaining a realistic view, and methods to avoid common pitfalls.

Lunch, Exhibition, Networking
12:45 pm – 2:00 pm
Panel Review: Non-Text Search Technologies: Speech, Images, Video
2:00 pm – 3:00 pm
Moderator: Susan E. Feldman, Search and Discovery Technologies, IDC
How Video Gets Found: Changing Consumer Search Strategies for Audio and Video Online and Implications for Content Producers
Thomas Wilde
Classifying an Image with Accuracy and Speed: The Value of Parts-Based Representations
Naveen Agnihotri, Milabra

There is an explosion of untagged image data on the Internet, and a paucity of means to analyze the images and search them visually. This presentation explores using a parts-based approach to classify image features. While being highly accurate, this strategy also dramatically reduces the fingerprint of the image, which cuts down storage costs while facilitating searches based on content.

Mobile Voice Search
Mike Phillips, Vlingo

As mobile phones become more capable, they are increasingly becoming people's primary personal information, entertainment, and communication devices. Just as with the internet, search technology is the key to gaining broad access to a wide range of content, both on the device and in the network. But the search experience on the mobile device is constrained on the input side by the limited keypads or touch screens, and on the output side by the limited displays. The ability to search by voice solves at least one half of the problem and has been introduced on a number of mobile devices by Yahoo!, Vlingo, and others over the past year.

This presentation will cover the technical challenges, including the demands on the speech technology side as well as impact on the underlying search applications. It will also include an overview of the current state of the art, and will show results from the latest commercial deployments.

Google Looks Beyond the Laundry List
3:00 pm – 4:00 pm
Stephen E. Arnold, ArnoldIT.com

This presentation describes three of the technologies that are shifting Google from a service which requires the user to enter a query to a service that presents search within a user's context. Each of these technologies is in use in various Google services. Google's existing and better-known search methods are complemented by functions that operate automatically or semi-automatically to improve the user experience. First, Google's Chrome is a way for Google to connect the user to Google services and Google services to the user. One key component in Chrome is its ability to track a user's behavior, perform predictive analyses, and give the user access to containers or virtual machines. Chrome is not an operating system; Chrome is a connectivity mechanism that operates regardless of the user's computing operating system or device. Second, Google's janitor technology allows the company to "clean up" structured and unstructured information. One way to use the cleaned-up data is to produce an automatic dossier about a person, place or thing. Third, Google's dataspace technology provides an environment in which Google can generate new types of metadata about information processed by Google's indexing system.

Google has not issued public information about these innovations, but each is disclosed in open source documents such as technical papers, patent documents, and public presentations by Google professionals. The conclusion drawn from this review of three interesting Google innovations from the 2007-08 period is that the company is shifting from keyword queries to search-enabled applications. These applications present the user with solutions to information problems, not a laundry list of results.

Break, Exhibition and Networking
4:00 pm – 4:30 pm
Searching the Web More Effectively with Multiple Simultaneous Queries
4:30 pm – 5:00 pm
Francisco Corella, Pomcor
Karen Lewison, Pomcor

We describe a Web search facility that reduces the time and effort that it takes the user to home in on the desired results for difficult search problems. When the user enters a query the search facility anticipates possible follow-up queries, issues them immediately, and allows the user to browse the search results of the original query and these additional queries simultaneously. Additional queries may include a respelling of the original query, related queries, and/or sub-queries. (By sub-query we mean a query consisting of a subset of the search terms of the original query.) We describe a parallel algorithm that efficiently produces an optimal set of sub-queries and their results in the important special case where the original query has zero results; although it is rare for a query that targets the Web at large to have no results, the zero-result case is important for queries that target a particular site.

We have built a prototype of such a search facility as a client-side script, implemented on the Adobe Flex platform, and thus running on the Flash plug-in, that accesses the Yahoo search engine via the Yahoo Astra Web APIs library. The Yahoo search engine has not been modified for this purpose, so our innovations are implemented entirely on the client side. We point out, however, that it would be beneficial to transfer parts of the implementation to the server side, and explain how this could be done. The prototype only handles purely conjunctive queries, but we also describe a method for handling general Boolean queries, and we describe an extension of the parallel zero-result algorithm to the general case.
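The sub-query idea for the zero-result case can be illustrated with a minimal sketch. It is not the authors' algorithm: it runs sequentially rather than in parallel, it assumes that an "optimal" set means the maximal sub-queries that still return at least one result, and it uses a toy in-memory corpus (`DOCS`) and a hypothetical `count_results` function in place of a real search engine.

```python
from itertools import combinations
from typing import Callable, FrozenSet, Iterable, List

# Toy stand-in for a search backend (an assumption for illustration):
# each document is represented by the set of words it contains.
DOCS = [
    {"flex", "search", "client"},
    {"yahoo", "search", "api"},
    {"flex", "api"},
]


def count_results(terms: Iterable[str]) -> int:
    """Count documents containing every term (a purely conjunctive query)."""
    wanted = set(terms)
    return sum(1 for doc in DOCS if wanted <= doc)


def maximal_subqueries(
    terms: Iterable[str],
    count: Callable[[Iterable[str]], int] = count_results,
) -> List[FrozenSet[str]]:
    """Return the maximal proper sub-queries that have at least one result.

    Sub-queries are tried from largest to smallest. A candidate that is a
    subset of an already accepted sub-query is skipped, because it would
    only return a superset of results the user can already browse.
    """
    unique_terms = list(dict.fromkeys(terms))  # drop duplicates, keep order
    accepted: List[FrozenSet[str]] = []
    for size in range(len(unique_terms) - 1, 0, -1):
        for combo in combinations(unique_terms, size):
            candidate = frozenset(combo)
            if any(candidate < kept for kept in accepted):
                continue
            if count(candidate) > 0:
                accepted.append(candidate)
    return accepted


if __name__ == "__main__":
    query = ["flex", "yahoo", "client"]
    print(count_results(query))       # the full query has no results
    print(maximal_subqueries(query))  # largest sub-queries that do
```

This enumeration is exponential in the number of search terms, which is tolerable only for short queries; a production version would also issue the sub-queries concurrently, as the abstract describes.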

A Study of Evaluative Language in SMS Messages: Towards a Characterization of Opinion
5:00 pm – 5:30 pm
Marguerite Leenhardt, Université Paris 3 - Le Sémiopôle

The results produced by tools for analysing information exchange currently have significant commercial value. This study is a textual and linguistic evaluation of a corpus of text messages sent by mobile phone. The approach aims to bring out distributional characteristics at different levels of linguistic description, in order to model the linguistic content of the knowledge contained in the corpus. The goal is to contribute to the characterization of evaluative language in SMS messages. Looking ahead, we outline possible industrial applications of the analysis of such textual content, especially in marketing.

We support the idea that the subjective knowledge gained from a large body of messages can be used for automated analysis of the views contained in brief texts published on the web, such as messages posted on Twitter.

Conference Networking Cocktail Reception
5:45 pm – 7:00 pm



 

 
© 2009 - 2012, Information Today, Inc.