FRACTALS TEXT MINING USING BIBLIOMETRICS AND DATABASE TOMOGRAPHY
Abstract
Database Tomography (DT) is a textual database analysis system consisting of two major components: (1) algorithms that extract multi-word phrase frequencies and phrase proximities (the physical closeness of multi-word technical phrases) from any type of large textual database, and (2) the interpretative capabilities of the expert human analyst, which those algorithms augment. DT was used to obtain technical intelligence from a Fractals database derived from the Science Citation Index/Social Science Citation Index (SCI). Phrase frequency analysis by technical domain experts identified the pervasive technical themes of the Fractals database, and phrase proximity analysis identified the relationships among those themes. Bibliometric analysis of the Fractals literature supplemented the DT results with author, journal, and institution publication and citation data.
The views in this paper are solely those of the authors, and do not represent the views of the Department of the Navy or any of its components, or the University of Karlsruhe.
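To make the two algorithmic quantities named in the abstract concrete, the following is a minimal illustrative sketch, not the DT software itself. It assumes that a "multi-word phrase" can be approximated by a word n-gram and that "proximity" means co-occurrence within a fixed window of words around a chosen theme phrase. The function names (`tokenize`, `phrase_frequencies`, `phrase_proximity`), the window size, and the sample text are all hypothetical choices made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_frequencies(tokens, n=2):
    """Count every n-word phrase (n-gram) in the token list."""
    return Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )


def phrase_proximity(tokens, theme, window=5, n=2):
    """Count n-word phrases lying within `window` words of each theme occurrence.

    Assumption: proximity is measured in word positions and ignores sentence
    boundaries. Only phrases that do not overlap the theme are counted.
    """
    theme_tokens = theme.lower().split()
    k = len(theme_tokens)
    neighbors = Counter()
    for i in range(len(tokens) - k + 1):
        if tokens[i:i + k] != theme_tokens:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + k + window)
        # Phrases that end before the theme starts.
        for j in range(lo, i - n + 1):
            neighbors[" ".join(tokens[j:j + n])] += 1
        # Phrases that start after the theme ends.
        for j in range(i + k, hi - n + 1):
            neighbors[" ".join(tokens[j:j + n])] += 1
    return neighbors


# Hypothetical sample text standing in for database records.
sample = (
    "Fractal dimension of rough surfaces. "
    "The fractal dimension measures surface roughness. "
    "Rough surfaces show a power law."
)
tokens = tokenize(sample)
frequencies = phrase_frequencies(tokens, n=2)
near_theme = phrase_proximity(tokens, "fractal dimension", window=3, n=2)
print(frequencies.most_common(3))
print(near_theme.most_common(3))
```

In this sketch the raw counts are only the input to the second component: a domain expert would still have to discard uninformative phrases and decide which high-frequency phrases are genuine themes and which neighboring phrases indicate meaningful relationships.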