What's real and what's hype

Wolfram Alpha

Shelley Tue, 03/10/2009 - 10:59

Sheila Lennon asked my opinion on Nova Spivack's recent writing about Wolfram Alpha, and posted my response, as well as other notes. Wolfram Alpha is the latest brainchild of Mathematica creator Stephen Wolfram, and is a stealth project to create a computational knowledge engine. To repeat my response:

First of all, it's not a new form of Google. Google doesn't answer questions. Google collects information on the web and uses search algorithms to provide the best resources given specific search criteria.

Secondly, I used Mathematica years ago. It's a great tool. And I imagine that WolframAlpha will provide interesting answers for specific, directed questions, such as "what is the nearest star" and the like. But these are the simplest of all queries, so I'm not altogether impressed.

Think of a question: who originated the concept of "a room of one's own". Chances are the Alpha system will return the writing where the term originated, Virginia Woolf's "A Room of One's Own", and the author, Virginia Woolf. At least, it will if the data has been input.

But one can search on the phrase "A room of one's own" and get the Wikipedia entry on the same. So in a way, WolframAlpha is more of a Wikipedia killer than a Google killer.

Regardless, when you look via Google, you get a link to Wikipedia, but you also get links to places where you can purchase the book, links to essays about the original writing, and so on. You don't get just a specific answer, you also get context for the answer.

To me, that's power. If I wanted answers to directed questions, I could have stayed with the Britannica years ago.

Nova Spivack's writing on the Alpha is way too fannish. And too dismissive of Google, not to mention the human capacity for finding the exact right answer on our own given the necessary resources.

Again, though, all we have is hearsay. We need to try the tool out for ourselves. But other than helping lazy school kids, I'm not sure how overly useful it will be. If it's free, yeah. If it's not, it will be nothing more than a novelty.

I also beg to differ with Nova when he states that Wolfram Alpha is like plugging into a vast electronic brain. Wolfram Alpha isn't brain-like at all.

The human brain is amazing in its ability to take bits and pieces of data and derive new knowledge. We are capable of learning, and extending, but we're really shite, to use the more delicate English variation of the term, when it comes to storing large amounts of data in an easily accessible form.

Large, persistent data storage with easy access is where computers excel. You can store vast amounts of data in a computer, and access it relatively easily using any number of techniques. You can even use natural language processing to query for the data.

Google uses bulk to store information, with farms of data servers. When you search on a term, you typically get hundreds of responses, sorted by algorithms that determine the timeliness of the data, as well as its relevancy. Sometimes the searches work; sometimes, as Sheila found when querying Google for directions to cooking brown rice in a crockpot, the search results are less than optimum.

Wolfram Alpha seems to take another approach, using experts to input information, which is then computationally queried to find the best possible answer. Supposedly if Sheila asked the same question of Wolfram Alpha, it would return one answer, a definitive answer about how to cook brown rice in a crockpot.

Regardless, neither approach is equivalent to how a human mind works. One can see this simply and easily by asking those around us, "How do I cook brown rice in a crockpot?" Most people won't have a clue. Even those who have cooked rice in a crockpot won't be able to give a definitive answer, as they won't remember all the details—all the ingredients, the exact measurements, and the time. We are not made for perfect recall. Nor are we equipped to be knowledge banks.

What we are good at is trying out variations of ingredients and techniques in order to derive the proper approach to cooking rice in a crockpot. In addition, we're also good at spotting potential problems in recipes we do find, and able to improve on them.

So, no, Wolfram Alpha will not be like plugging into some vast electronic brain. And we won't know how well it will do against other data systems until we all have a chance to try the application, ourselves. It most likely will excel at providing definitive answers to directed questions. I'm not sure, though, that such precision is in our best interests.

I also Googled for a brown rice crockpot recipe, using the search term, "brown rice crockpot". The first result was for RecipeZaar, which lists out several recipes related to crockpots and brown rice. There was no recipe for cooking just plain brown rice in a crockpot among the results, but there was a wonderful sounding recipe for Brown Rice Pudding with Coconut Milk, and another for Crocked Brown Rice on a Budget that sounded good, and economical. I returned to the Google results, and the second entry did provide instructions on how to cook brown rice in a crockpot. Whether it's the definitive answer or not, only time and experimentation will tell.

So, no, Google doesn't always provide a definitive answer to our questions. If it did, though, it really wouldn't be much more useful than Wikipedia, or our old friend, the Encyclopedia Britannica. What it and other search engines provide is a wealth of resources for most queries that not only typically provide answers to the questions we're asking, but also provide any number of other resources, and chances for discovery.

This, to me, is where the biggest difference will exist between our existing search engines and Wolfram Alpha: Alpha will return direct answers, while Google and other search engines return resources from which we can not only derive answers, but also make new discoveries. As such, Alpha could be a useful tool, but I'm frankly skeptical whether it will become as important as Google or other search engines, as Nova claims. I don't know about you all, but I get as much from the process of discovery as I do from the result.


Nova released a second article on Wolfram Alpha, calling it an answer engine, as compared to a search engine. In fairness, Nova didn't use the term "Google killer", but stating that the application could be just as important as Google does lead one to make such a mental leap. After all, we have human brains, and are flawed in this way.

As for artificial intelligence, I wrote my response to it in Twitter: It astonishes me that people spend years and millions on attempting to re-create what two 17-year-olds can make in the back seat of a car.

Correlation

Shelley Thu, 10/02/2008 - 09:45

I noticed a correlation between my last two posts on the lack of women at Ajax Experience and the seeming lack of RDF or semantic web applications. Both are based on perennial questions: Where are the women in technology? Where are the semantic web applications?

Next time I'm asked either, I think I'll answer that the women in technology are off building RDF-based semantic web applications. Yeah, that's the ticket.

Chromatic Hyperbole

Shelley Tue, 09/02/2008 - 14:18

It would be impossible to miss the excitement over Google's Chrome, though I would assume we would wait to actually see the product, first, before wetting our pants.

Yes, Google entering the browser marketplace is news, but some of the things I've been reading are, well, frankly asinine. For instance, Computerworld breathlessly writes, "Google's Chrome aims to kill Windows, make Web the OS of choice." A bit hard, wouldn't you say, when Chrome requires Windows just to be able to run?

Let's kill off Windows with our Web OS.

Cool.

...later...

Well, Windows is dead.

That's great! 

*pause* 

Uh, where's Chrome?

Well, you see...

Do we also need to remember our concerns about Google? You know, the whole privacy thing? Or are we a modern day bunch of Pavlovian dogs, trained to drool on cue whenever Google is involved?

There are issues associated with this browser, babes. First of all, as great as it is that Google is using Webkit for its infrastructure, it's also coming out with its own JavaScript engine. My first question is: is Google going to conform to standards? Or is it going to go its own little way, and just assume we'll tag along? Then there's the issue of the engine being multi-threaded—and here I thought Photoshop was going to be the only pig on my system.

My concerns aren't just related to JS. As I read somewhere—who knows where—we can now see why Google is footing the bill for Ian Hickson to head up the HTML5 effort. However, now that Google is "one of the browser competitors", how will this change the dynamic in all these standards groups? I'm not going to necessarily give HTML5 over to Google to define to its own Chrome standards. I imagine that some of the browser companies would feel the same.

And about those privacy concerns...exactly what kind of information is Google going to be collecting about us as we use the damn thing?

Frankly, I'm all for anything that weakens the abysmally tenacious hold IE6 and IE7 have on desktops, but I'm not sure yet another player in the field is what we need. Especially a player who, frankly, exhibits many of the same tendencies towards arrogance, as well as interest in complete dominance, as the company they supposedly "hate". I can understand Google's impatience with the other browser companies—but Google also has a tendency to act impulsively, and leave the rest of us to pick up the pieces.

As for web applications taking over the world, we're just now starting to hit against issues of broadband caps, not to mention the problems we've had with centralized services recently. Does Twitter ring a bell with you folks? How about Amazon's S3? GMail? In the last month, we've seen outages at a considerable number of centralized web services, and we haven't even started putting our critical operations into "the cloud".

Do you really want your business to hit a standstill because you've lost your internet connection, hit a broadband cap, or "the cloud" is not playing nicely at the moment? Seriously?

Look, yes. Get interested, yes. Peer around under the hood, and take it for a spin, most definitely yes. But get a grip--the web world as we know it hasn't suddenly come to an end just because Google has decided it wants to play the browser game, too.


Downloaded. Installed. Works fast. Chrome doesn't work on the Mac. Thanks to WebKit it does support XHTML and SVG. However, I've hit an odd rendering error for this page, which I don't get with my nightly WebKit download.


Matt Cutts did respond to privacy concerns about Chrome, though I wish he wouldn't categorize these concerns as being the paranoid ramblings of conspiracy theorists.

Watch the Birdie not the Hand: Scandal in Weblogging

Shelley Tue, 07/01/2008 - 18:55

There's pile-ons, and then there's pile-ons. Just when the people who owned Techmeme tried to generate a controlled burst of activity related to Loren Feldman, Shel Israel, and some stupid puppet (actually covered by the Guardian as news, to the everlasting embarrassment of the British), the real story was going on elsewhere, and not a hint of it anywhere to be seen. It was only when both Rafe and Seth posted on the recent BoingBoing/Violet Blue thing that I became aware of the latest fooflah.

BoingBoing no longer loves Violet Blue and has unpublished several posts related to her. Considering that Violet Blue seems, at least to me, to be a "BoingBoing" kind of gal— equal parts sex and narcissism—I was rather surprised to see such behavior from a "freedom" loving rag mag like BoingBoing. Surprised, but not so much that I would do more than read the Boing Boing post and then move on.

What stopped me and caught me long enough to read more and even comment here was what Teresa Nielsen Hayden wrote in the post at Boing Boing:

Bottom line is that those posts (not "more than 100 posts," as erroneously claimed elsewhere) were removed from public view a year ago. Violet behaved in a way that made us reconsider whether we wanted to lend her any credibility or associate with her. It's our blog and so we made an editorial decision, like we do every single day. We didn't attempt to silence Violet. We unpublished our own work. There's a big difference between that and censorship.

(emph. mine)

I really dislike the all too frequent occurrence of the "I know something awful about this person, but am above providing all the details" sort of smug, self-satisfied innuendo, which serves not only to generate attention in a carefully controlled way, but also to leave it to the reader's fevered imagination as to the heinous nature of the act committed to deserve such disapprobation. If you're going to condemn publicly, do so explicitly and cleanly, so that the other party at least has a fighting chance to defend themselves. Not this air-kiss-slap that passes too often as honorable behavior in Silicon Valley.

behaved in a way... What did Violet, that bad girl, do? Did she sleep with an entire Catholic school boys' choir? Knowing BoingBoing, the crew would look on this with favor. Maybe she kicks kittens. She does wear spikey shoes...does she kick kittens?

Perhaps Violet Blue secretly voted for George Bush. That might be enough, but how would the BoingBoing crew find out, unless Violet Blue got drunk on lemon drops and spilled the beans.

However, I should have remembered who the parties involved are with this little contretemps. According to several comments, the issue could be related to the fact that Violet Blue had trademarked her name, and then sued a porn star for using it. Who Violet Blew, indeed.

Oh. My. God. The infamy of the act. If this is true, then of course what else could the Boing Boing crew do but wash the Blue dust from their hands and disavow all knowledge of Violet. After all, a person who sues to protect their name is only one step away from supporting the AP. Or worse...the RIAA.

Living in Missouri, where we don't understand these things, I have to think there is more to this than Violet Blue suing to protect her name. However, all we're left with is the words hanging over all: "Violet behaved in a way that made us reconsider whether we wanted to lend her any credibility or associate with her." Petty words that demonstrate that perhaps being unpublished by an organization like Boing Boing is an actual testament to your character, rather than against it.

Two grown men fighting over a puppet, unpublished posts, and the quarrels of the rich and famous...and all we had for entertainment in Missouri this last week was a flood.

update

I would be remiss if I didn't point out one of the worthwhile comments made in the Israel/Feldman puppet fiasco. It was from a site called Hacking Cough, authored by Chris Edwards, who wrote:

Feldman called the puppet "more real": a classic bit of legerdemain. Israel was very real during the whole spat. He was angry. He was upset. He wanted to get even. Faced with what Feldman was doing to him, what would you want to do? Social media's advice: be real, be honest.

But nobody believed the advice. The sensible advice to Israel was to bottle it up, act nice. And that probably would have worked. Had Israel gritted his teeth and pretended that he really loved the puppet, he would probably have come out of the whole episode more famous and better off. In other words, ignore Naked Conversations: Be inauthentic. You can't blog or tweet your way out of a crisis any more than you can knit your way out of a burning building.

In other words, ignore Naked Conversations: be inauthentic. Very astute observation.

A Quiet Take on the AP

Shelley Sun, 06/29/2008 - 13:01

Some people are still "waiting" on the AP to deliver a definitive guide to what can or cannot be copied of the AP material without risk of a DMCA notice. We really don't need to wait, nor do we need anything from the AP. We have copyright laws in this country, and they include the concept of "fair use", which we can continue to use as a guide for our own writing.

People do need to look at how they quote and use others' work. If you feel that your use is justified and covered under Fair Use provisions, then full speed ahead and damn the consequences. You may be served a DMCA notice; you may not. Receiving one is not a judgment, and you won't be hauled off to jail. In fact, you don't even have to respond by pulling the material if you really feel you're on the side of the law.

I wouldn't necessarily expect that you would get legal help, though. This environment tends to favor the noisy and the known. If you're neither, chances are you'll be on your own if you get a DMCA notice. That doesn't mean you shouldn't feel free to quote others, or to use AP material. It just means that you have to accept the consequences of your actions when you publish online, and use others' material.

As for the AP's DMCA notices being supposedly based on title and lede/lead, alone, whereby the lede is the first few sentences of the story, I think we were misdirected into focusing on the content of each individual quote, rather than the context of all the quotes, combined.

AP licenses entire stories, but it also licenses a feed of AP news items reflecting just the title and lede of the story. You can see an example of licensed material at the Huffington Post. Notice that the copyrighted material in this context is not limited to an individual story, but to the grouping of titles and ledes for several different stories.

People have been making an assumption that the AP is upset that people are quoting one title, and one lede. We've ignored the hints given in relation to Drudge Retort that it was a pattern of posting, of quoting multiple titles and multiple ledes over time that ultimately resulted in the AP issuing the DMCA.

If we consider that the ledes are only 30 or 50 words, it seems unreasonable for the AP to resort to the DMCA. However, if something like the Drudge Retort duplicates 3, or 5, or more of these syndicated story titles and ledes, what the site is doing is actually "copying" what amounts to 10, 20, 30% or more of the AP copyrighted material, not a few words of an individual story, as first discussed.

If the AP charges a site like the Huffington Post to publish this syndicated set of titles/ledes at the site, and something like the Drudge Retort is duplicating a significant number from this set, using virtually the same titles and lede wording, without adding additional commentary, the Drudge Retort could very well be violating the AP's copyright, and doing so in such a way as to cause financial harm to the AP.

The issue really is, and the AP stressed this, copy and paste publication. If you copy and paste the title and the lede, and add no commentary, you're not adding value to what you're publishing. You're just duplicating the content. There's nothing wrong with pulling out an individual quote from a story you like and publishing it by itself. However, if your publication falls into a pattern that is very similar or even equivalent to an individual or group's copyrighted publication of the same, don't expect to get all huffy because you only publish a few words from each story.

We shouldn't extrapolate from the AP to something like delicious or the Planets (RDF, Drupal, Intertwingly, and others), because they're not the same. I don't know of anyone that licenses their syndication feed and would feel financial harm if this syndicated feed was republished with a group of others. The purpose of the Planets is to give exposure to individual publications/people who do not get exposure from being part of a major news source, like the AP. However, taking our syndicated feed and republishing it in its entirety at another site, which then runs ads that benefit the second site is a different story. In fact, if we decry the existence of "splogs" we should find ourselves on the side of the AP, if we're being intellectually honest.

Now, some would say that the AP really will go after us if we only publish one title and one lede. Please forgive me if I doubt any such thing would happen. Common sense would dictate this, if nothing else. And common sense is what we should be using when it comes to copyright and fair use.

I'm really not defending the AP so much as I am disappointed at how quickly people are willing to pile-on when the right stereotypes are triggered. We see the AP, big company, the Drudge Retort, small publication, and we become effectively blind—to both reason and fairness. More disturbingly, we become ripe for manipulation from those who care little for the consequences of the event, as long as the attention keeps flowing. The AP can protect itself, but the same cannot be said of every target of the pile-on effect.