Thursday, November 02, 2006

Putting a course on content together (5)

Having looked at the value chain and the application fields, a course on content also needs to cover aspects such as search engines and preservation.

Searching. Content has become more findable due to search engines. The index principle known from books was extended when the first search engines of services like Dialog and SDC became available in the early seventies. These engines indexed every word in a document and put the words in various indexes (in and out of context, combinations, adjacent search, excluding search). A lot has happened since then, but the principle of Boolean operators has not changed. Companies like Verity added relevance as a measurable factor. But a breakthrough came with Google on the internet. Its search engine used the same principle of vector searching, but with improved algorithms. Recently we saw an improvement in the interface of search engines with Ms Dewey. It is clear that much attention should be paid to searching principles, the history of search engines and the various contemporary services and meta search services. Of course attention should also be paid to metadata design and search engine optimisation (SEO).
Besides text searching, attention should also be paid to audio and video searching. For the time being text searching is the main engine, but it is also a laborious process.
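
For a course, the principle of indexing every word and querying with Boolean operators can be shown in a few lines. What follows is only a minimal sketch in Python, not a description of how Dialog, SDC or any other service was built. All function names and the three sample documents are made up for illustration. The index stores word positions, which is what makes adjacent search possible.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lower-case the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Index every word: word -> document id -> list of positions."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for pos, word in enumerate(tokenize(text)):
            index[word][doc_id].append(pos)
    return index

def search_and(index, a, b):
    """Documents containing both words (Boolean AND)."""
    return set(index.get(a, {})) & set(index.get(b, {}))

def search_or(index, a, b):
    """Documents containing either word (Boolean OR)."""
    return set(index.get(a, {})) | set(index.get(b, {}))

def search_not(index, a, b):
    """Documents containing a but not b (excluding search)."""
    return set(index.get(a, {})) - set(index.get(b, {}))

def search_adjacent(index, first, second):
    """Documents where 'second' directly follows 'first' (adjacent search)."""
    hits = set()
    for doc_id in search_and(index, first, second):
        second_positions = set(index[second][doc_id])
        if any(p + 1 in second_positions for p in index[first][doc_id]):
            hits.add(doc_id)
    return hits

# Hypothetical sample documents, invented for this sketch.
DOCS = {
    1: "Digital preservation of web content",
    2: "Search engines index web content",
    3: "Preservation and search of digital archives",
}
INDEX = build_index(DOCS)

if __name__ == "__main__":
    print(search_and(INDEX, "web", "content"))
    print(search_not(INDEX, "preservation", "web"))
    print(search_adjacent(INDEX, "digital", "preservation"))
```

In this toy collection the adjacent search for "digital" followed by "preservation" matches only document 1, although both words also occur in document 3. Relevance ranking, as added later by companies like Verity and Google, is deliberately left out of the sketch.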

Preservation. I have pleaded several times for special attention to preservation. Sites and links disappear, never to be found again. And even if you can find them again, they might not be time-stamped; you will have to guess the date from the lay-out (one column, two columns or three columns and back to two columns again). Preservation also means using the system of digital object identifiers (DOI), and it requires a good understanding of repositories and software emulation.
Much can be learned from the Wayback Machine, the European Internet Archive and the EU digital library program.


Blog Posting Number 557
