Monday, November 17, 2008

Week 11:

Comments:
To Susan on "The Deep Web":

https://www.blogger.com/comment.g?blogID=9004604055760573247&postID=7311586600377229072&page=1

To Nate on the OAI article:

https://www.blogger.com/comment.g?blogID=301150766198525940&postID=3368777487964238988&page=1

Web Search Engines:

I feel that it is important for the GYM (Google, Yahoo, Microsoft) search engines to remove dead links, especially given the huge number of Web pages out there. This article contained lots of useful facts, such as the amount of web data out there (400 terabytes), examples of infrastructures, and what a crawling algorithm uses... Part 2 of this series reviewed how the 400 terabytes of web data are indexed and gave a list of terms that reminded me of memorization days in high school. For the most part, the terms seemed to be processing and search engine terms. I guess now I know at least what some of the things mean when I mess with my browser settings.

Current Developments and Future Trends...:

Metadata?! Again! Super... In so many words, this article describes how the OAI (Open Archives Initiative) was initially developed as a means to federate access to diverse e-print archives through metadata harvesting and aggregation, and how development continues. This would have been an easier read if I had understood what they were talking about from the beginning. Some definitions or figures wouldn't have hurt. This article looks like it was written specifically for someone in the field.
The article later describes the developments and missions of the OAI, and it even reintroduces XML, HTTP, and Dublin Core, so I at least understood that part. After reading the ongoing challenges section, I agree that there should be universal use of Dublin Core tags. This is important because, as they state, a user must receive the exact information they need. The article lists more challenges and ends with future developments.
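
To make "metadata harvesting" a little more concrete for myself, here is a rough Python sketch of what a harvester might do with one response. This is only my own illustration, not anything from the article: the sample XML is hand-written and hypothetical (the identifier oai:example.org:1, the title, and the author are made up), and the function name harvest is my own. The only parts I am relying on as real are the OAI-PMH and Dublin Core XML namespaces and Python's standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written sample shaped like an OAI-PMH ListRecords
# response carrying one Dublin Core (oai_dc) record. Not real data.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample e-print</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
          <dc:date>2008-11-17</dc:date>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

# XML namespaces for OAI-PMH 2.0 and the Dublin Core element set.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def harvest(xml_text):
    """Collect each record's identifier and its Dublin Core fields."""
    root = ET.fromstring(xml_text)
    dc_prefix = "{" + NS["dc"] + "}"
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        ident = rec.findtext("oai:header/oai:identifier", default="", namespaces=NS)
        fields = {}
        for el in rec.iter():
            # Keep only elements in the Dublin Core namespace.
            if el.tag.startswith(dc_prefix):
                name = el.tag[len(dc_prefix):]
                # A Dublin Core field may repeat, so store values in a list.
                fields.setdefault(name, []).append((el.text or "").strip())
        records.append({"identifier": ident, "fields": fields})
    return records


if __name__ == "__main__":
    for record in harvest(SAMPLE):
        print(record["identifier"], record["fields"])
```

A real harvester would fetch responses like this over HTTP from many archives and merge the results, which I take to be the "aggregation" half. The sketch also hints at why consistent Dublin Core tags matter: the code can only pull out fields that every archive labels the same way.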

The Deep Web:

This was an interesting article that related web content to the depths of the ocean. It makes sense! Like the ocean floor, how much of the information on the World Wide Web is never seen? When you do a Google search, you see a row of result page numbers at the bottom of the page.
I know personally, I don't go past '3' ever! So how much are we missing?

Muddiest Point:

I found the "Deep Web" article very interesting and I can't help but wonder a few things... There are parts of the ocean floor that we've never seen, just like there's a bunch of information we never get to see on the web... How much are we missing? Could what we're missing be valuable at all? What exactly is filtered out and how?
