
Dagobert Soergel
College of Information Studies, University of Maryland

Enriched thesauri as networked knowledge bases for people and machines


The presentation will address the opportunities offered through automation generally, and the Web environment in particular, for structuring thesaurus databases and presenting thesaurus data. It will argue for much richer thesaurus structures with much more information: differentiated relationships that allow an extension of thesauri to include precise representation of large amounts of factual information (some of which is included now, but only vaguely, such as organism RT disease rather than organism causes disease); full definitions and not just usage notes; priority levels for thesaurus information to guide display, such as showing a short definition while letting the user access a longer definition and definitions from many different places, including links into texts (parts of documents) that explicate a concept and links into graphical representations of concept relationships, such as causal influence graphs; and maintenance of information on meaningful sequencing of concepts. It will argue for more powerful displays that let the user explore hierarchic and network structures at various levels of detail and amount of information, such as coupled overview and detail windows, a choice between linear/text and graphical displays, and the use of color. As mentioned, adapting the level of detail and amount of information to the user's needs requires support from the thesaurus structure. The presentation will argue for connectedness: clickable relationships within one thesaurus and, more importantly, to specific entries in other thesauri (this requires a standard on how such links should be established and maintained in the face of constant change, including a standard on how to create anchors inside a thesaurus Web page and a standard on how to link to specific entries in a thesaurus that exists in the form of a Web-accessible database).
Ultimately, this would lead to a utility that would provide simultaneous access to many thesauri and integrate the information for the user. The presentation will argue for using the Web to support users in maintaining their own personal thesauri (possibly embedded in some large public thesaurus) and to create mechanisms for collaborative maintenance of thesauri. It will also argue for a thesaurus registry that would always direct the user or other systems to the proper URL (URIs for thesauri); such a registry could be used in conjunction with the Dublin Core facility for identifying the vocabulary of origin for subject metatags to let the user interact with any of these vocabularies directly. The presentation will also address the marriage of thesauri and other knowledge organization systems with dictionaries for natural language processing to create more powerful tools for sophisticated text understanding, translation, and retrieval.
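
As an illustration only, the enriched structure argued for above (differentiated relationships, definitions at several priority levels, and links to entries in other thesauri) might be sketched in Python as follows. The class names, field names, identifiers, and the `urn:example:` thesaurus identifier are all hypothetical and do not come from the presentation; this is a minimal sketch under those assumptions, not a proposed standard.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Relationship:
    """A typed link from one concept to an entry, possibly in another thesaurus."""
    rel_type: str          # a differentiated type such as "causes", not a vague "RT"
    target_id: str         # hypothetical identifier of the target entry
    target_thesaurus: str  # hypothetical identifier (e.g. a URI) of the target thesaurus


@dataclass
class Concept:
    concept_id: str
    label: str
    # Definitions keyed by priority level: 1 = short, higher = more detailed.
    definitions: Dict[int, str] = field(default_factory=dict)
    relationships: List[Relationship] = field(default_factory=list)

    def definition(self, max_level: int = 1) -> str:
        """Return the most detailed definition at or below max_level."""
        levels = [lvl for lvl in self.definitions if lvl <= max_level]
        return self.definitions[max(levels)] if levels else ""

    def related(self, rel_type: str) -> List[str]:
        """Return target ids of all relationships of the given type."""
        return [r.target_id for r in self.relationships if r.rel_type == rel_type]


# Placeholder data: the ids and definition texts are invented for the example.
organism = Concept(
    concept_id="org-001",
    label="Example organism",
    definitions={1: "Short definition.", 2: "Longer definition with context."},
    relationships=[
        Relationship("causes", "dis-001", "urn:example:disease-thesaurus"),
    ],
)

print(organism.definition())             # short definition for a compact display
print(organism.definition(max_level=2))  # longer definition on request
print(organism.related("causes"))        # entries linked by the precise relation
```

Keeping the relationship type and the target thesaurus as explicit fields is one way to let a display filter by relation and follow links across thesauri; a real system would also need the linking and anchoring standards the abstract calls for.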