Microsoft's Robert Scoble Discusses Search Engine Technology
By Andy Beal - February 04, 2004
Today search tools like X1 are most interesting because they index your hard drive and make it easy to search for email and files on your local drives. Microsoft Research has been working on a tool called "Stuff I've Seen" too, which is also quite interesting (both let you search email as well as files on your hard drive). But, these tools don't go far enough. First, they are bolted on top of the operating system. So, while they are indexing, your system often sees slowdowns. They can't design those to work properly with the operating system and with other applications that might need processor time.
Plus, to really make search work well, search engines need metadata: metadata that's added by the system as it keeps track of your usage of files, as well as metadata that application developers add into the system itself. In a lot of ways, weblogs are adding metadata to websites. When a weblog like mine links to a web site, we usually add some more details about that site. We might say it's a "cool site" for instance. Well, Google puts those words into its engine. That's metadata. (Technically, metadata is "data about data.") Now if you search for "cool site" you'll be more likely to find the site I just linked to. So, you can see how Google's engine is helped by metadata. But, we haven't been able to apply those lessons to the thousands of files on your hard drive. That will change.
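To make the "cool site" example concrete, here is a minimal sketch of how link text could act as metadata for the page being linked to. It assumes a toy in-memory index; the function names and example URLs are hypothetical, and this is an illustration of the idea rather than a description of how Google's engine actually works.

```python
from collections import defaultdict


def build_anchor_index(links):
    """Map each lowercased link-text word to the URLs it was used to describe.

    `links` is an iterable of (source_url, target_url, link_text) tuples.
    """
    index = defaultdict(set)
    for _source, target, link_text in links:
        for word in link_text.lower().split():
            index[word].add(target)
    return index


def search(index, query):
    """Return the URLs whose inbound link text contains every query word."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical links: a weblog calls one site "cool", another calls one "boring".
links = [
    ("http://weblog.example/", "http://site-a.example/", "cool site"),
    ("http://other.example/", "http://site-b.example/", "boring site"),
]
index = build_anchor_index(links)
print(search(index, "cool site"))
```

The words in the link never appear on the target page itself; they are supplied by the weblog doing the linking, which is what makes them metadata.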
[AB] Can you explain the problems faced with searching hard drives and what Microsoft is working on to help?
[RS] What if we did the same thing on your hard drive [as Google]? For instance, look at pictures. When I take pictures off of my Nikon, they have some metadata (for instance, inside the file is the date it was taken, along with the exposure information) but that metadata isn't useful for most human searches. For instance, how about if I wanted to search for "my wedding photos?" Neither X1 nor Windows XP's built-in search would find my wedding photos. Why? Because they have useless names like DSC0001.jpg and there's no metadata that says they are wedding photos.
Let's go forward a couple of years to the next version of Windows, code-named Longhorn. In Longhorn we're building a new file storage system, code-named WinFS. With WinFS, searching and metadata will be part of the operating system. For instance, you could just start typing in an address bar "W" and "E" and "D" and "D" and anything that started with WEDD would come up to the top. For instance, your wedding documents, spreadsheets, and photos.
But, WinFS goes further than X1 and other file search tools do today. It lets you (and developers of apps you'll use) add metadata to your files. So, even if you don't change the name of your files, you might click on one of the faces in a picture application and get prompted to type a name and occasion. So, you would click on your cousin Joe's face, type in "Joe Smith" and "Wedding."
Now whenever you search for wedding stuff, that photo will come up. And that's just the start. If you imported a group of photos into a wedding album, you'd be adding metadata for the search engine to use. In other words, you'll see a much nicer system for searching your local hard drive.
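The tagging and type-ahead behavior described above can be sketched in a few lines. This is only an illustration under stated assumptions: `FileEntry`, `add_tags`, and `prefix_search` are hypothetical names, the store is an in-memory list, and none of this reflects WinFS's actual design or API.

```python
from dataclasses import dataclass, field


@dataclass
class FileEntry:
    """A file name plus user- or application-supplied metadata tags."""
    name: str
    tags: set = field(default_factory=set)


def add_tags(entry, *tags):
    """Attach tags (stored lowercased) without renaming the file."""
    for tag in tags:
        entry.tags.add(tag.lower())


def prefix_search(entries, prefix):
    """Return names of files whose name or any tag starts with `prefix`."""
    prefix = prefix.lower()
    if not prefix:
        return []
    matches = []
    for entry in entries:
        terms = entry.tags | {entry.name.lower()}
        if any(term.startswith(prefix) for term in terms):
            matches.append(entry.name)
    return matches


photo = FileEntry("DSC0001.jpg")
budget = FileEntry("wedding-budget.xls")
notes = FileEntry("notes.txt")

# Clicking a face and typing a name and occasion adds metadata to the photo.
add_tags(photo, "Joe Smith", "Wedding")

print(prefix_search([photo, budget, notes], "WEDD"))
```

The photo keeps its camera-assigned name, yet it turns up alongside the spreadsheet because the search looks at tags as well as file names.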
[AB] It looks like Microsoft has things mapped out for offline searches, but can they compete with Internet search engines?
[RS] Now, if we're talking about the Internet, then Google has done an awesome job so far. I use Google dozens of times a day. Will MSN [search] be able to deliver more relevant results than Google? I don't know. Certainly that's not the case today. Will that change tomorrow? I'm waiting to see what the brains at MSN do.
One thing I do see is that in Longhorn, search will be nicer for customers. Google is working on making its toolbar the best possible experience. We're working on a whole raft of things too. I'm very excited about the future of search, no matter which way things go.
[AB] Let's look beyond the next couple of years. What new developments in search do you see happening in the next 3-5 years?
[RS] For Internet searches, I see social behavior analysis tools like Technorati becoming far more important. Why? Because people want different ways to see potentially relevant results. Google took us a long way toward that future, as Google's results are strongly influenced by how many inbound links a site has. But, now, let's go further, even further than Technorati has gone. Let's identify who really is keeping the market up to date on a certain field and give him/her more weight.
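One way to picture the difference between counting inbound links and giving trusted sources more weight is the sketch below. It assumes a hand-assigned authority score per linking site; the function name, site names, and weights are all hypothetical, and this is not the ranking method of Google, Technorati, or any real engine.

```python
def weighted_link_scores(links, authority, default_weight=1.0):
    """Score each target by summing the weight of every site linking to it.

    `links` is an iterable of (source, target) pairs. Sources missing from
    `authority` count as `default_weight`, so an empty `authority` dict
    reduces to a plain inbound-link count.
    """
    scores = {}
    for source, target in links:
        weight = authority.get(source, default_weight)
        scores[target] = scores.get(target, 0.0) + weight
    return scores


links = [
    ("blog-a", "site-x"),
    ("blog-b", "site-x"),
    ("expert-blog", "site-y"),
]

# Plain link counting: site-x wins with two inbound links.
plain = weighted_link_scores(links, authority={})

# Assumed weights: the writer who keeps the field up to date counts triple.
weighted = weighted_link_scores(links, authority={"expert-blog": 3.0})

print(sorted(plain, key=plain.get, reverse=True))
print(sorted(weighted, key=weighted.get, reverse=True))
```

With plain counts the site with more links ranks first; once the expert's link carries extra weight, the order flips.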
I also see that search engines that search just specific types of content (like Feedster) are going to be more important (Feedster only searches RSS and Atom syndication feeds).
Oh, and users are going to demand new ways of exporting searches. Google showed us that with News Alerts. Enter in a search term, like "Microsoft" and get emailed anytime a news source mentions Microsoft. Feedster goes further than that. There you can build an RSS feed from a search term. I have several of those coming into my RSS News Aggregator and find they are invaluable for watching what the weblogs are saying about your product, company, or market. For instance, one of my terms I built a feed for is "WinFS" -- I'll be watching to see how many people link to this article and if any of you have something interesting to say I'll even link back.
[AB] Let's look at your "wish list". Assuming there were no restrictions in technology, what new feature would you like to see introduced to search engines?
[RS] I want to see far better tools for searching photos -- and connecting relationships between all types of files and photos. For instance, why can't I just drag a name from my contact list to associate that name with a face in a photo? Wouldn't that help searching later on? In just 18 months I've taken 7400 photos. But I can't search any of them very well today without doing a lot of renaming and other work.
[AB] What impact do you see social networking having on the future of search engine technology?
[RS] We're already seeing an impact over on Feedster and Technorati. It's hard to tell what'll come in the future, but what would happen if everyone in the world had a weblog and was a member of Google's Orkut? Would that change how I'd search? Well, for one, it'd make me even more likely to search for people on services that linked together social spaces and weblogs. Heck, I can't remember my brother's email address, but Google finds his weblog (and I can send him an email there).
One other thing I'll be watching is how Longhorn's WinFS gets used by application developers to build new kinds of social systems. Today if you look at contacts, for instance, they are locked up in Outlook, or another personal information management program like ECCO. But, contacts in Outlook can't be used by other applications (particularly now because virus writers used the APIs in Outlook to send fake emails to all contacts in Outlook, so Microsoft turned those features off).
[AB] WinFS changes that. How?
[RS] By putting a "contacts" file type into the OS itself, rather than forcing applications developers to come up with their own contacts methodology.
What if ALL applications, not just Outlook, could use that new file type? What if we could associate that file type to social software services like Friendster, Tribe, Yahoo's personals, or Google's Orkut? Would that radically change how you would keep track of your contacts? Would that make contacts radically more useful? I think it would.
Already we're seeing systems like Plaxo keep track of contacts, but Plaxo is still unaware that I've entered my data into Google's Orkut and Friendster. Why couldn't I make a system that'd associate the data in all my social software systems? Including Outlook?
[AB] Do you foresee any problems with the WinFS approach?
[RS] Developers distrust Microsoft's intentions here. They also don't want to open up their own applications to their competitors. If you were a developer at AOL, for instance, do you see yourself opening up your contact system to, say, Yahoo or Google or Microsoft? That's scary stuff for all of us.
But, if the industry works together on common WinFS schemas (not just for contacts either, but other types of data too), we'll come away with some really great new capabilities. It really will take getting developers excited about WinFS's promise and getting them to lose their fears about opening up their data types.
[AB] Do you foresee a time when commercial search results (product/services) will be separated from informational search results (white papers/educational sites)? And do you think all commercial listings will eventually be paid only?
[RS] I don't see the system changing from the Google-style results today. Searchers just want to see relevant results. Paid-only searches won't bring the most relevant results.
[AB] What makes you say that?
[RS] Because I often find the best information on weblogs. Webloggers are never going to be able to afford to pay to be listed on search engines.
Commercial-only listings might be seen on cell phones or PDAs, though. If I were doing a cell phone service for restaurants in Seattle, for instance, I might be more likely to list just member sites. But, thinking about it, I still don't see such a system becoming popular enough without listing every restaurant in some way.
[AB] Speaking of cell phones, how do you see search engine technology impacting our use of PDAs and cell phones?
[RS] Not sure if search engine technology will impact it, but the mixture of speech recognition with search engines might change it a lot. When I'm using my cell phone I don't want to look at sites that have a lot to read (I'll save those for later when I'm in front of a computer or my Tablet PC) but, instead, I want to find the closest Starbucks. Look up movie listings. Find a nice place to have a steak dinner. Now cell phones are reporting e911 data, which means the cell phone system knows approximately where you're located and so can give you just one or two Starbucks, rather than all of the ones in Seattle.
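The "one or two Starbucks, rather than all of the ones in Seattle" idea amounts to sorting results by distance from the caller's approximate position. Below is a minimal sketch using the standard haversine great-circle formula; the store names and coordinates are made up for illustration, and how a phone network would actually supply the position is outside the scope of the example.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest(places, lat, lon, limit=2):
    """Return the `limit` places closest to (lat, lon).

    `places` is a list of (name, latitude, longitude) tuples.
    """
    by_distance = sorted(
        places, key=lambda place: haversine_km(lat, lon, place[1], place[2])
    )
    return by_distance[:limit]


# Hypothetical coffee shops with invented coordinates in the Seattle area.
places = [
    ("Store C", 47.700, -122.300),
    ("Store A", 47.610, -122.340),
    ("Store B", 47.620, -122.350),
]

# Approximate caller position, as a location-aware phone system might report it.
closest = nearest(places, 47.609, -122.338)
print([name for name, _lat, _lon in closest])
```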
[AB] If search engine users gave up a little of their privacy and allowed their search habits to be monitored, would this allow the search engines to provide better, customized results?
[RS] Yes. I already give Google the ability to watch my search terms (I use the Google Toolbar). But, it always must be a choice. People really hate it when you don't have strict privacy policies that are easy to understand and they hate it if you don't give them a choice to not report anything.
[AB] Robert, you've certainly opened our eyes to the future of search engine technology. Is there anything else you would like to add?
[RS] To echo what I said above, I hope the industry sees the opportunities that Longhorn's WinFS opens up. We can either work together and share data with each other, or we can be afraid and keep data to ourselves. It'll be an interesting time to watch in the next three years.
Many thanks to Robert Scoble, Microsoft employee and blogger extraordinaire. Please be sure to visit SearchEngineLowdown.com as we continue to highlight the thoughts and views on the future of search engine technology.
About the Author: Andy Beal is Vice President of Search Marketing for KeywordRanking.com and ProRanking.com, global leaders in professional search engine marketing. Highly respected as a source of search engine marketing advice, Andy has had articles published around the world and is a repeat speaker at Danny Sullivan's Search Engine Strategies conferences. Clients include Alaska Air, Peopleclick, Jos. A. Bank and NBC. You can reach Andy at firstname.lastname@example.org and view his daily SEO blog at http://www.searchenginelowdown.com/.