March 13th, 2008

Great idea on the part of Yahoo to begin incorporating semantic web information into its open search platform. How deep the semantics will go, and in how many directions, is still TBA, but I'm pleased to see interest in microformats and in more structured semantic data via RDF. I'll be even more pleased when we start to see working examples.

Marshall Kirkpatrick believes that Google will follow suit. I just don't see it. Google might embrace microformats, but the company has long pitted its algorithms against human annotation of data, and the semantic web is based on some human annotation, even if the annotation is based, indirectly, on checking an option in a page.

My biggest concern about all of this was that we would limit semantics to microformats. It's with relief that I see that Yahoo is going beyond just microformats into the broader scope of structured semantics based on RDF and its various serializations. Paul Miller also brings up other needed caveats:

The tools to create and embed that structure need to follow, of course. And issues that efforts like Dublin Core struggled with over a decade ago need to be thrashed out in some more detail, as the malicious, the malevolent, the careless and the mischievous rush to ‘game’ the rich structured data with which their web pages will soon be filled.

Putting pressure on the tool makers is essential, though probably not as essential as it once was, because most tools provide a plug-in infrastructure that enables expansion. Still, there's a lot more that tools can do, which is one reason why I've been so interested in Drupal: this tool is definitely ahead of the curve.

What's key to all of this is showing people what they can get if they go that little extra step. I read people who write reviews of books. If we start showing more intelligent search results based on a little additional information added to their writings, information that reflects that the work is a book review of a certain book by a certain author, etc., they will, most likely, be willing to spend a little time adding it.

Someday when I'm looking for a new book to download from the web, I'll be able to pull up a browser in my Kindle ebook reader and see all the reviews written about this book, online. Everywhere. We are so close to making this work, and I'm not normally the type to tap dance every time someone comes along breathing the words "semantic web" through lips moist with anticipation.

Yahoo should have received a hostile takeover bid a long time ago. Lately, the company has been galvanized.

Comments
1
Bud Gibson - 4:36 pm March 13, 2008

Shelley, I appreciate the perspective on semantic approaches and even microformats, but an issue of bias comes to mind, one that is there in all attempts to add semantics to web pages, starting with meta-tags. It's coverage and bias. The most semantically marked pages are not necessarily the most accurate. Further, they represent only a small portion of all possible pages on a topic.

I actually appreciate Google's big data approach in this regard, even if it throws away information.

2
Rip Off - 4:53 pm March 13, 2008

You ripped this story right out of RWW… hack

3
Shelley - 5:00 pm March 13, 2008

Rip Off, no, I used a satirical reference based on the title of the RWW post, which I then linked in my post.

Oh, I'm sorry — is the word "satirical" too many syllables for you? Would a picture book help?

Bud, we disagree.

4
Bud Gibson - 7:12 pm March 13, 2008

Well, after leaving my comment, I remembered that almost any search over a reasonably sized data set is biased, so I may actually disagree with myself :). That said, I probably do have a bias toward the big data approach, but in the end, the proof is in the pudding.

5
Marshall Kirkpatrick - 8:32 pm March 13, 2008

Thanks for your thoughts and for the superior title, Shelley.

6
Laurens Holst - 3:02 am March 14, 2008

Google might embrace microformats, but the company has long pit its algorithms against human annotation of data […]

Are you forgetting Google sitemaps? And the rel="nofollow" attribute?

~Grauw

7
Poster Maniac - 5:06 am March 14, 2008

I think Google might embrace microformats, but I know that they would not just copy everything that is going on with Yahoo. I do agree that Drupal has a lot of plug-ins that are really worth trying, and I also like your wish that when you download a book from the web you can find all the reviews of it. I think that is what we need, considering that Yahoo searches were really vain nowadays.
