I have read numerous posts on the semantic web this past year or so. The latest is one by Marshall Kirkpatrick at ReadWriteWeb, in which he writes about an academic who warns us to pay attention to the question of whether the semantic web should have a gender.
The semantic web is largely inspired and advocated by Sir Tim Berners-Lee, who suggests it will be the next step in the evolution of the web. As early as 2001 he wrote an article in Scientific American describing this semantic web.
Most of the Web’s content today is designed for humans to read, not for computer programs to manipulate meaningfully. Computers can adeptly parse Web pages for layout and routine processing here a header, there a link to another page but in general, computers have no reliable way to process the semantic
The Semantic Web will bring structure to the meaningful content of Web pages, creating an environment where software agents roaming from page to page can readily carry out sophisticated tasks for users. Such an agent coming to the clinic’s Web page will know not just that the page has keywords such as “treatment, medicine, physical, therapy” (as might be encoded today) but also that Dr. Hartman works at this clinic on Mondays, Wednesdays and Fridays and that the script takes a date range in yyyy-mm-dd format and returns appointment times. And it will “know” all this without needing artificial intelligence on the scale of 2001’s Hal or Star Wars’s C-3PO. Instead these semantics were encoded into the Web page when the clinic’s office manager (who never took Comp Sci 101) massaged it into shape using off-the-shelf software for writing Semantic Web pages along with resources listed on the Physical Therapy Association’s site.
The semantic web describes a structure that allows machines to not only process data but also extract meaning (semantics) from it. The idea of course being that if software has access to this knowledge and meaning it could serve its user better.
Personally I would love to have a C-3PO friend walking alongside me (but then I want the Jedi sword as well). But honestly, right now it is hard for me to come up with viable scenarios in which this would really help me as a user. In the past I have worked in the field of Artificial Intelligence and have seen many promising technologies that were supposed to ultimately change the world we live in: neural networks, intelligent agents, natural language processing, speech recognition. Each of these technologies helped us dream of a world in which machines could understand humans and so serve them better. If anything, I have learned that this isn’t a simple problem to crack. Not just because the technology may provide fewer capabilities than expected, but even more so because humans are unpredictable in their behavior and in how they use the technology.
A simple example from the field of speech recognition. The company I worked for built a speech recognition tool that allowed users to call a phone number and ask for information about train departure times. The main driver behind this was cost reduction. Having an operator answer such questions is expensive. If the operator can be replaced by a machine, costs go down. While this sounds perfectly obvious, there were always two problems that needed to be tackled. One was obviously to train the software to recognize speech. That was a daunting task with many difficulties of its own. Just think about users talking in noisy surroundings. There were many other technological difficulties. But the hardest problem to resolve was the user who did unexpected things.
“Where are you traveling to?” -> I want to go to my uncle in San Francisco
“I’m sorry, I didn’t understand, where are you traveling to?” -> To my uncle
“I’m sorry, I do not understand, where are you traveling to?” -> Are you deaf, I said my uncle, three times already
See the problem in this conversation? The speech recognition software has very limited knowledge and is unable to process the user’s answer. All it really wanted to know was a destination (in this case a train station or city). The computer is of course asking the wrong question here, as it leaves the user with too many ways to answer. But once an answer isn’t recognized, it becomes increasingly difficult to get the user to answer correctly. The example is a bit exaggerated to show you what I mean, but believe me, it is nearly impossible to formulate a question in such a way that users will answer it the way you expect or want them to.
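To make the limitation concrete, here is a minimal sketch in Python of the kind of matching such a system might do once speech has been turned into text. It is not the tool described above: the station list and the function name `extract_destination` are made up for illustration, and a real recognizer is far more involved. The point is only that the software looks for one of a fixed set of destinations and has nothing to fall back on when the user answers in their own terms.

```python
# Hypothetical list of destinations the system knows about.
KNOWN_STATIONS = {"san francisco", "sacramento", "san jose"}


def extract_destination(utterance):
    """Return a known station mentioned in the utterance, or None.

    This is plain keyword spotting: no notion of uncles, context,
    or what was said earlier in the conversation.
    """
    text = utterance.lower()
    matches = sorted(station for station in KNOWN_STATIONS if station in text)
    return matches[0] if matches else None


for answer in [
    "I want to go to my uncle in San Francisco",
    "To my uncle",
    "Are you deaf, I said my uncle, three times already",
]:
    destination = extract_destination(answer)
    if destination is None:
        print("I'm sorry, I didn't understand, where are you traveling to?")
    else:
        print("Looking up trains to", destination)
```

Even this toy version only succeeds when the city name happens to be in the sentence. Once the user shortens the answer to "my uncle", the system is stuck repeating the same question.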
Back to the semantic web. It sounds like a lot of power is unleashed if it becomes possible for machines to “understand” what data means. And I’m sure that there will be cases and situations where this might come in handy for me as a user. But for now I remain skeptical about the power of the semantic web. There is so much more involved in understanding data. There are complex factors that can’t easily be modeled or handled by machines or algorithms. Just think about something as simple as mood. Marshall Kirkpatrick (who is much more of an expert on this than I am, BTW) gives an example of how knowledge can be added to data:
The semantic web today is based largely on what are called “triples” – sets of subject, predicate and object. For example Marshall Kirkpatrick [subject], loves [predicate] Punkin’ the Tabby Kitten [object]. (Hypothetical, I don’t have any kittens and please don’t send me any.)
Using these triples we can enrich data and add semantics to it. Now bring in the very human factor of mood. I love ice cream. Does that mean I love it all the time? No it doesn’t. Actually, I rarely eat ice cream, but I do when I feel like it. How can this be modeled into data? Depending on whether or not I had a great cup of coffee in the morning I might feel differently about ice cream in the afternoon etc. etc.
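As a rough illustration, the triples above can be sketched as a tiny in-memory store in Python. This is only an assumption about how one might represent them for the sake of the argument. The names `Triple` and `objects_of` are invented here, and real semantic web data uses formats such as RDF rather than plain tuples.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


# Two statements taken from the examples in the text.
triples = [
    Triple("Marshall Kirkpatrick", "loves", "Punkin' the Tabby Kitten"),
    Triple("I", "love", "ice cream"),
]


def objects_of(store, subject, predicate):
    """Return every object linked to the subject by the given predicate."""
    return [t.obj for t in store if t.subject == subject and t.predicate == predicate]


# A machine can now answer "what do I love?" ...
print(objects_of(triples, "I", "love"))
# ... but the triple says nothing about when, how often, or in what mood.
```

The statement is either in the store or it isn't. To capture "only when I feel like it" you would have to keep adding more statements about time, mood and circumstances, which is exactly where the modeling gets hard.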
What makes the semantic web such a difficult thing to implement in a useful way is, again, a combination of the limitations in the technology, but most of all the human factor. It just isn’t possible to model human behavior. There is mood, taste, circumstances, irrational behavior and all other types of complexities that we humans can deal with (barely) but machines can’t. Machines might infer semantics from data in the semantic web, but I feel that (unless the task or circumstance is extremely basic) it will add to the confusion the user already has when he interacts with people or machines on the web.
I welcome the research and development being done in the field of the semantic web. But until it provides practical solutions that actually help the user, I remain with many questions about its value. I sure hope that in the meantime people will start developing solutions to current problems on the web. Why not focus on a User-Centric Web first? Easier to do, and it provides the user with great value 😉
Totally with you on the user-centric web. I like the idea of machines understanding users and I think we’ll get there someday. However, we should start with solving some basic problems first and then build on the complexity. I’ve written about a scenario in which the semantic web will be helpful in getting the right information quickly when you’re searching on a mobile (speed and accuracy matter a lot more when you’re standing on a street corner looking for something with the world moving by than when you’re sitting at a desk in front of a PC). Check it out here:
Another thing, failure of speech recognition is not just because of technical fallacies (although they do play a big role) but also because of Social and Cognitive reasons. I’ve covered this in my post “Why isn’t Voice-based UI mainstream?”
Hi Sachendra, just read your post, excellent. The summary from another post by Alex Iskold:
* Spend less time searching
* Spend less time looking at things that do not matter
* Spend less time explaining what we want to computers
is interesting too. But I think even these three tasks will be quite difficult to implement well using a semantic web. Human behavior and interpretation still stand in the way, I think 😉
And yes there are many more reasons why speech recognition is difficult, but that wasn’t the topic of this post so I didn’t touch on all of them, good references though 😉
Excellent thought-provoking post.
It’s interesting that we complex human beings strive for simpler and simpler solutions – that is, simple in appearance (portable phone capability) but rich in deeper functionality (the iPhone). And sometimes the richness overwhelms the core functionality – maybe I just want to make a phone call.
Our very complexity (of human nature) drives our goal-setting and desires to ever more specific definition. We have a proliferation of products/tools that target as many users as possible, but still seek the perfect solution/tool/killer app. We’re back to that human nature thing – each of us yearning for the specific, personalized solution to our own (richly defined) needs.
It all makes for a fascinating and challenging journey in web application development.
@Fran thank you. My own feeling about this is that humans can deal with certain types of complexity much better than machines. I could never compete with a machine on complex calculations or fact recollection. But interpreting data is very difficult for both human and machine. We might just do a little better there because we can take more than just the data into account.
The semantic web question closely touches the fields of linguistics and psychology. The data and its representation can be viewed as signifier and signified (see: http://en.wikipedia.org/wiki/Course_in_General_Linguistics).
I strongly believe in the semantic web. However, a new language paradigm is needed to make its realization possible and avoid any code “duplication” (for the data and its representation).
A book I highly recommend on things and their representation: The Order of Things by Michel Foucault (see: http://en.wikipedia.org/wiki/The_Order_of_Things).
I have to agree that we should just find the problem that we feel is solvable and solve it. So much data has become available in the last few years that there must be problems that can now be solved but couldn’t have been before. And we should concentrate on them.
We are still not at the point where computers can understand human affairs, so we should stop marketing the semantic web as such. The only result of overmarketing will be a disappointed audience, and this is good for neither business nor academia.
And eventually with more data and more CPU power we’ll get computers to understand a bit more of human language and affairs. Eventually…
Andraz Tori, Zemanta