Blue Cat Blog: More digitizing and indexing

The New York Times reports on competition among search services.

Perhaps the fiercest competition on the Internet these days is among sites offering new ways to search through more information. Yahoo and Microsoft each have hundreds of engineers trying to challenge Google's leadership, and dozens of minor players are trying to find ways of getting their services noticed. A9, Amazon.com's search service, recently sent vans with digital cameras onto the streets of some cities to take pictures of businesses. The photos were later displayed alongside telephone numbers in A9's phone directory.

The Times also featured an essay by Steven Johnson, who elaborated on it in his Tool For Thought blog posting. I have also been commenting on this trend. See "Giardia Bares All: Parasite genes reveal long sexual history" (research done by surveying genome data) and "Continuing to digitize and index the world."

Where will this take us? It's hard to imagine. Johnson is quite enthusiastic about a Mac tool that he uses. Even now, I find myself feeling handicapped when I don't have access to the web and I want to find something. These new tools not only help you find things, they offer (useful) associations you may not have thought about. Our coming ability to have so much more of the world (and human knowledge) at our fingertips (and within immediate reach of our minds) will change things in ways that we don't yet understand. As Johnson put it,

But 2005 may be the year when tools for thought become a reality for people who manipulate words for a living, thanks to the release of nearly a dozen new programs all aiming to do for your personal information what Google has done for the Internet. These programs all work in slightly different ways, but they share two remarkable properties: the ability to interpret the meaning of text documents; and the ability to filter through thousands of documents in the time it takes to have a sip of coffee. Put those two elements together and you have a tool that will have as significant an impact on the way writers work as the original word processors did.

I recently wondered how much it would cost to cover the current commitments of Social Security for people currently enrolled. I couldn't find out. (I found a couple of web sites that offered simulators, but I couldn't figure out how to get the answer I wanted.) Will answers to questions of this sort be more immediately available in the near future? Will we develop the habit of just reaching out and grabbing answers to questions that we currently don't even ask ourselves because we have no idea how to answer them? I suspect something like that will happen.

Another example. Last night we were discussing the 1992 election. We couldn't remember Ross Perot's name — although I thought the first name was one syllable beginning with 'R' and the second was two syllables. We also wanted to know the name of his running mate.

I was absolutely sure that I would be able to find the answer quickly. I wasn't disappointed. I asked Google for "1992" and "Presidential Election." It referred me to the U.S. presidential election, 1992 - Wikipedia page that listed all the 1992 candidates. (To try that search now, just highlight everything from '1992' through 'Election' in the preceding sentence. Release the mouse for a Google search.) (Perot's running mate was James Stockdale.)

I could think of no way of asking the question in terms of syllables or starting letters, though. In fact, a weakness in Google is that it has no way to ask for names. If you want to find me, you might look for "R. Abbott," or "Russ Abbott," or "Russell Abbott," or "Russell J. Abbott," or "Russell Joseph Abbott." There should be a special way to say you want to look for a person whose last name is "Abbott" and whose first name is "Russ" or "Russell." It should retrieve the results of all the preceding searches, knowing that names may be abbreviated to initials and that names may have additional intermediate names, but not too many of them.
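To make the name-matching idea above concrete, here is a rough sketch in Python of how such a search might expand one person into the variants to look for. This is only an illustration of the rule, not anything Google offers: the function names name_variants and or_query are made up for this example, and the middle names have to be supplied by hand, which a real name search would not require.

```python
from itertools import product


def name_variants(first_forms, last, middle_names=()):
    """List the ways a person's name might be written.

    first_forms  -- accepted forms of the first name, e.g. ["Russ", "Russell"]
    last         -- the last name, e.g. "Abbott"
    middle_names -- known middle names (an assumption of this sketch)
    """
    # The first name may also be abbreviated to an initial.
    firsts = list(first_forms) + [first_forms[0][0] + "."]

    # Each middle name may be omitted, abbreviated, or spelled out.
    middle_options = [("", m[0] + ".", m) for m in middle_names]

    variants = []
    for first in firsts:
        for middles in product(*middle_options):
            parts = [first] + [m for m in middles if m] + [last]
            variant = " ".join(parts)
            if variant not in variants:
                variants.append(variant)
    return variants


def or_query(variants):
    """Combine the variants into one query of quoted phrases joined by OR."""
    return " OR ".join('"{}"'.format(v) for v in variants)


if __name__ == "__main__":
    names = name_variants(["Russ", "Russell"], "Abbott", ["Joseph"])
    for name in names:
        print(name)
    print(or_query(names))
```

Joining the quoted variants with OR is a workaround that can be typed into a search box today. It still falls short of the feature described above, because it only covers the middle names you already know about.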

Not only will information be more readily available, we will expect it to be available (which will put greater pressure on institutions to be more transparent), and we will develop the habit of just reaching out and grabbing it whenever a stray thought crosses our minds.

More current work
There was recently an article in Wired News, "Information Wants to be Liquid," about the Liquid Information project, which is working on related technologies.

I'll be presenting a paper at the 4th IASTED International Conference on Web-Based Education in Switzerland next month on a system I call a Collaborative Knowledge Base (CKB), a web-based wiki-blog system, with some of these features, for the collaborative development of some domain area.

Sunday, January 30, 2005
