PreConference Workshops - Sunday, April 22, 2007
Google Version 2.0: The Impact beyond Search
1:30 pm – 4:30 pm
Stephen E. Arnold, ArnoldIT.com
The news about search engine market share was sobering in mid-2006. Depending on whose data one reviewed, Google searches accounted for anywhere from 45 percent to as much as 70 percent of query traffic. The problem was not the variance in the estimates of total traffic. Google's competitors -- Ask.com, Microsoft Live.com / MSN.com, and Yahoo.com -- lagged behind. More problematic was the disturbing fact that the gap was not closing. Google had a lead and was pulling ahead.
In 2005, Google redefined the landscape of advertising and "free" Web search. Now, Google -- despite the wide range of new products and services the company pumps out -- is asserting itself in media deals (MySpace.com, MTV, and the Associated Press). Google has expanded its search appliance to become the platform for accessing enterprise information. Google's denial of financial initiatives was abruptly reversed with the release of Google Checkout, an alternative for disgruntled eBay merchants. Since the release of "The Google Legacy" (Infonortics, Ltd.) in September 2005, consulting firms have picked up the arguments in that monograph and identified Google's technology as fuel for the so-called "Google effect."
For 2007, the tutorial has been revised to disclose important new developments at Google. The tutorial is not a Google "commercial". It focuses on emergent actions at Google and the impact of those actions on Amazon, eBay, Microsoft and Yahoo. Harry Collier and Stephen Arnold have once again collaborated to develop the information that will be presented to registrants at the 2007 Search Engine Meeting. The tutorial provides a look at several of the themes developed in the forthcoming "Google Version 2.0: Google as Application Platform", to be published in 2007.
This tutorial introduces attendees to Google Version 2.0 through pragmatic discussions of specific aspects of Google's technology and systems in the context of enterprise search, financial services and network services. Tutorial attendees will learn about:
- Google's OneBox as an application platform within organizations. Special attention will be given to the mechanism for loading hybrid applications on the OneBox so that licensees can access Google functions within a secure OneBox environment yet have secure access to network resources. Search is one function of OneBox. Attendees will learn about OneBox's use as an access mechanism to business intelligence, proprietary applications, and Microsoft content stores in Exchange and SharePoint.
- Google's technology changes, including the Google platform's ability to support global services such as financial services and network services. An update on Google's patents identified since April 2005 will be included. A new feature is the analysis of two patents labeled as advertising but in the context of enterprise information functions and the OneBox.
- Google's impact on Microsoft, including the injection of an additional $1.5 billion in capital infrastructure, the change in recruiting at Microsoft, and the issues associated with cannibalizing revenue with the "Google killer" Live.com service lineup. Google's impact on Amazon.com, eBay and Yahoo will also be discussed.
- Google's expanding advertising business and what it means for media-centric competitors. Issues discussed include projected revenue impact from the MySpace.com and MTV deals, changes in Google News, and Google's rich media activities. This module looks at infrastructure costs for Google competitors compared to infrastructure costs for Google itself. Rich media demands a more robust infrastructure, which has financial impacts on firms attempting to "out Google Google".
- The problems associated with Web traffic manipulation, search engine optimization (SEO), and social software, which Microsoft and Yahoo increasingly use as a way to counter Google's growing search presence. This is a new module. Social software, unlike other relevancy techniques, is subject to manipulation.
Tutorial attendees will receive access to links and information presented in the seminar. These special information resources will be available only to registered attendees and include:
- Online access to the Google patents in full text identified by ArnoldIT.
- A specially written paper summary of Google's OneBox.
- A drawing in which three names will be selected. These individuals will be invited to join Mr. Arnold for a free dinner that evening to discuss Google.
The format of the tutorial is a series of presentations, each averaging 30 to 40 minutes. These presentations will then be followed by an informal question and answer period. The tutorial will last approximately four hours and there will be three short breaks.
Text, Lies and Videotape: An Introduction to Text and Web Mining through Public Data Sets
1:30 pm – 4:30 pm
David D. Lewis, David D. Lewis Consulting
Both the volume and scope of textual and semistructured data, and the power of technological tools to extract value from it, have exploded in recent years. This tutorial will provide a tour of these technologies through examples based on some notorious public data sets, including the Enron emails, tobacco company documents released under the Master Settlement Agreement, and the Netflix $1 Million Prize movie recommendation data.
Tutorial attendees will learn about technologies for finding and organizing documents, extracting information from natural language, and transducing information from media such as speech, images and web pages. The inevitable errors, biases and ambiguities introduced by all these technologies pose particular challenges for data mining. We will cover approaches for exploration, prediction and decision making that are robust to the complexities of natural language.
Themes of the tutorial include:
- Similarities and differences in the processing of language data vs. language-like data (markup languages, category systems, metadata data models, tags, etc.).
- Cutting through marketing jargon and making cost-effective decisions in licensing or building text mining technology.
- The legal and ethical landscape of text mining.
- The roles of technological and manual approaches to text mining in an environment of cheap networked outsourcing.