Semantic Compression

Semantic compression is a method for compacting a domain-specific or general dictionary by reducing the linguistic heterogeneity of textual documents. It is achieved in two steps, using frequency dictionaries and a semantic network:

i) determining cumulated term frequencies to identify the target lexicon
ii) replacing less frequent terms with their hypernyms from the target lexicon
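
The two steps can be sketched in Python as follows. This is a minimal illustration, not the project's implementation: the toy semantic network HYPERNYM, the sample frequencies, and the function names are all hypothetical. It also assumes that a term's cumulated frequency is its own frequency plus the frequencies of all its hyponyms, and that the target lexicon is simply the top-ranked terms by that value.

```python
from collections import Counter

# Hypothetical toy semantic network: term -> direct hypernym (None for the root).
HYPERNYM = {
    "entity": None,
    "animal": "entity",
    "dog": "animal",
    "cat": "animal",
    "poodle": "dog",
    "spaniel": "dog",
    "siamese": "cat",
}


def cumulated_frequencies(freq, hypernym):
    """Step i (assumed definition): add each term's frequency to itself and to every ancestor."""
    total = Counter()
    for term in hypernym:
        node = term
        while node is not None:
            total[node] += freq.get(term, 0)
            node = hypernym[node]
    return total


def target_lexicon(freq, hypernym, size):
    """Keep the `size` terms with the highest cumulated frequency (ties broken alphabetically)."""
    total = cumulated_frequencies(freq, hypernym)
    ranked = sorted(hypernym, key=lambda t: (-total[t], t))
    return set(ranked[:size])


def compress(tokens, lexicon, hypernym):
    """Step ii: replace each term outside the lexicon with its nearest hypernym inside it."""
    out = []
    for tok in tokens:
        node = tok
        while node is not None and node not in lexicon:
            node = hypernym.get(node)
        # Terms with no hypernym in the lexicon (or unknown to the network) stay unchanged.
        out.append(node if node is not None else tok)
    return out


# Made-up frequency dictionary for illustration only.
freq = {"animal": 2, "dog": 5, "cat": 4, "poodle": 1, "spaniel": 1, "siamese": 1}
lexicon = target_lexicon(freq, HYPERNYM, size=4)
print(compress(["poodle", "chases", "siamese", "cat"], lexicon, HYPERNYM))
# prints ['dog', 'chases', 'cat', 'cat']
```

In this sketch the rare terms "poodle" and "siamese" are generalized to "dog" and "cat", while "chases", which is absent from the toy network, passes through untouched. A real system would take the lexicon size and the hypernym relation from its own frequency dictionary and semantic network.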

Semantic compression has already proved advantageous in information retrieval tasks: it improves their effectiveness (in terms of both precision and recall) and has some positive influence on efficiency as well. The most important features of semantic compression are:

- precise descriptors (reduced effect of language diversity: no language redundancy, a step towards a controlled dictionary)
- more compact lexicon (lower computational complexity)
- synthetic output that can be displayed as natural text (by applying inflexion and adding stop words)


Project SENECA - Semantic Networks and Categorization