What is it about?

This paper describes and evaluates DigiDoc MetaEdit, a tool for the semi-automatic indexing of HTML documents. The tool identifies and suggests keywords from a thesaurus based on the information embedded in each HTML document.

Why is it important?

Representing the contents of documents with keywords is an essential practice in areas such as information retrieval and e-commerce. DigiDoc MetaEdit identifies the most important keywords in an HTML document based on the information embedded in its HTML.

Perspectives

DigiDoc MetaEdit is a new tool for the semi-automatic indexing of HTML documents. It identifies and suggests keywords from a thesaurus according to the information embedded in each document. Keyword assignment can be parameterized by how frequently the terms appear in the document, by the relevance of their position, or by a combination of both.

Mari Vallez
Universitat Pompeu Fabra
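
The parameterization described above (term frequency, position relevance, or a combination of both) could be sketched roughly as follows. This is an illustrative sketch only, not DigiDoc MetaEdit's implementation: the function `suggest_keywords`, the `POSITION_WEIGHTS` values, the choice of zones (`title`, `h1`, body), and the mixing parameter `alpha` are all assumptions made for the example. Matching is naive substring counting, which a real system would replace with proper tokenization.

```python
from collections import defaultdict
from html.parser import HTMLParser

# Hypothetical position weights; the tool's actual values are not stated here.
POSITION_WEIGHTS = {"title": 3.0, "h1": 2.0, "body": 1.0}


class ZoneTextParser(HTMLParser):
    """Collects lowercased text per zone: title, h1, or body (everything else)."""

    def __init__(self):
        super().__init__()
        self.zone_stack = []
        self.text = defaultdict(list)

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1"):
            self.zone_stack.append(tag)

    def handle_endtag(self, tag):
        if self.zone_stack and self.zone_stack[-1] == tag:
            self.zone_stack.pop()

    def handle_data(self, data):
        zone = self.zone_stack[-1] if self.zone_stack else "body"
        self.text[zone].append(data.lower())


def suggest_keywords(html, thesaurus, alpha=0.5, top_n=5):
    """Rank thesaurus terms found in the document.

    alpha=1.0 scores by frequency only, alpha=0.0 by position only,
    and values in between combine the two (an assumed linear mix).
    """
    parser = ZoneTextParser()
    parser.feed(html)
    parser.close()
    zones = {zone: " ".join(chunks) for zone, chunks in parser.text.items()}

    scores = {}
    for term in thesaurus:
        needle = term.lower()
        # Naive substring counts per zone (a simplification).
        counts = {zone: text.count(needle) for zone, text in zones.items()}
        frequency = sum(counts.values())
        if frequency == 0:
            continue  # only suggest terms that actually occur
        # Position relevance: the best-weighted zone in which the term occurs.
        position = max(POSITION_WEIGHTS[z] for z, c in counts.items() if c)
        scores[term] = alpha * frequency + (1 - alpha) * position

    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]
```

Under these assumptions, a term that appears in the title outranks a term that appears equally often only in the body text, and terms absent from the document are never suggested. The final selection would still be left to a human indexer, in keeping with the semi-automatic approach.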

Read the Original

This page is a summary of: A semi-automatic indexing system based on embedded information in HTML documents, Library Hi Tech, June 2015, Emerald,
DOI: 10.1108/lht-12-2014-0114.
