Glorious Generalist

Initial thoughts on Libraries and the Semantic Web

By Margaret
May 20, 2009

Ok, these aren’t quite my initial thoughts. I’ve been thinking about this since January 2008, when I was on the BOBCATSSS panel on Web 2.0/3.0 and Libraries. These are my thoughts after seeing Wolfram Alpha and reading about Google Squared.

First, here’s a great explanation of the semantic web. Basically, it means that a search will present data rather than documents. Here’s an example. Right now, if you want to know something about Mary Cassatt and do a Google search, your first link is going to be the Wikipedia page for her. If you do a search in Wolfram Alpha, you get this page of data (and, of course, a link to Wikipedia). With Google, you, a person, have to read the Wikipedia entry and extract the data. With Wolfram Alpha, the data is already extracted and presented in a tabular format.

(Incidentally, if you do that search in Credo Reference, you get a link to the biography of Mary Cassatt in France and the Americas: Culture, Politics, and History.)

The Wolfram Alpha results might not look like much, but that’s because biographical or artistic information requires some intellectual work beyond merely presenting data. Occupation, birth date, and death date are basic data that can be computed by, well, a computer. Try a math, science, or statistical question on it, and you’ll see where it really shines.
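To make the "data rather than documents" idea a little more concrete, here is a minimal sketch in Python. It is not how Wolfram Alpha actually works. It just assumes a tiny, hand-entered store of subject/predicate/object facts (the triple layout, the `facts_about` and `print_table` names, and the dates, which come from general reference knowledge rather than from any search result, are all my own illustration) and shows that once the facts are stored as data, producing a table takes no reading at all.

```python
# Hypothetical, hand-entered facts stored as (subject, predicate, object) triples.
# A real semantic web system would draw on a far larger, shared data set.
TRIPLES = [
    ("Mary Cassatt", "occupation", "painter"),
    ("Mary Cassatt", "birth date", "1844-05-22"),
    ("Mary Cassatt", "death date", "1926-06-14"),
]


def facts_about(subject, triples=TRIPLES):
    """Return a {predicate: object} table of everything stored about one subject."""
    return {p: o for s, p, o in triples if s == subject}


def print_table(subject):
    """Print the stored facts about a subject as a simple two-column table."""
    for predicate, value in facts_about(subject).items():
        print(f"{predicate:<12}{value}")


if __name__ == "__main__":
    print_table("Mary Cassatt")
```

The hard part, of course, is getting the facts out of documents and into that structure in the first place, which is exactly the work that takes judgment.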

For right now, then, the semantic web is doing what computers do better than people: extracting and tabulating data. In a search engine that’s a big deal already. But there’s still a long way to go before the Intelligent Web, which many believe is the next step. In the Intelligent Web, the search engine will not only be able to extract data, but also to apply critical analysis and subjectivity to the data. That is a long, long way off, but the semantic web can already help people with processing massive amounts of data. For instance, there is more scientific data published than any one person can read and understand. Even within specialized fields, it’s hard to stay on top of new developments. So pulling out the relevant data and presenting it can make it easier to spot patterns between studies. I’m sure this would have the same problems as current published meta-analyses, which can be problematic when they don’t reflect differences in research quality and methods.

Librarians already pull data out of documents and present it or interpret it for people. They also have the advantage, in many cases, of critical thinking and intelligence. So they do need to be concerned about semantic web technologies taking away the need for their expertise, or rather, they need to use their expertise to inform the creation of semantic web technologies. Friends, this means you must finally learn XML (just as a start). Librarians need to be concerned not only about the democracy of data but also about its integrity. That means the most popular website thanks to hot-shot SEO work shouldn’t be the provider of data if another, less popular site has better data. Judging sources this way is something librarians already do, but when it’s not immediately apparent what document the data is being pulled from, it’s more important that the back end be honest. Plus it will take a long time before all the books of the world are scanned and searchable as part of the semantic web, so we need to stay on top of that.

What else? Leave your thoughts below.

P.S. I’m currently not able to leave comments. Please email me if you are having a similar problem and I will try to figure out what’s going on.

Internet Television

OMG Gossip Girl!

Most of this year I wasn’t watching Gossip Girl, but I’m finally caught up. It took a few marathon sessions with my parents’ DVR, but I was able to watch the season finale tonight.

I won’t give away any secrets, in case you haven’t seen it yet (or, you know, don’t care), but mobile informatics was very important in this episode. So were full sentences and plotting via the text message medium. I suppose I’m young enough that this completely makes sense to me, but I’ve only had an unlimited text message plan for a short time, and a Twitter account for an even shorter time.

One of the fascinating but also oddly troubling aspects of Japan for me is the cell phone novel. If you don’t know about this, it’s common for people, especially teenage girls, to write novels on their cell phones and publish them online. Some of them are also eventually published in book form. These follow a standard romantic formula and, from what little I’ve seen, aren’t written that well. But they are a big deal.

I don’t know what it says about America that our version seems to be Texts From Last Night.

Cats Staycation

My day in photos

My day started like any other– waking up, cleaning up a bit, ruminating on the class I taught last night and whether it went all right. Then things got weird.

First, I have crazy cats. For instance, this is where Nasturtium decided she wanted to hang out. Once she was up there, it wasn’t so great. She got down pretty quickly and returned to her normal activities of sleeping and meowing at birds.

After this, I was hanging out around my apartment (I currently don’t work Thursdays, and was playing hooky from a meeting at work that I could have gone to, but wouldn’t have been paid for attending). Around 1 or so I got restless and decided to go for a walk to the dry cleaners to drop off some stuff, and then maybe around the neighborhood to the grocery store or the library.

Here’s where I ended up.

Yes, it is only 4.1 miles from my apartment to Diversey Harbor, and it is possible to walk right to the lake without turning or going much out of the way. My dirty little secret is that I love walking around Lincoln Park (the neighborhood). I also love open bodies of water, so this was an entertaining walk, and didn’t even take that long.