Formal Concept Analysis (FCA) is a natural framework for learning from examples. Indeed, learning from examples results in sets of frequent concepts whose extent contains mostly these examples. In terms of association rules, this learning strategy can be seen as searching for the premises of rules whose consequence is fixed. In its most classical setting, FCA treats attributes as an unordered set. When the attributes of the context are partially ordered to form a taxonomy, Conceptual Scaling allows the taxonomy to be taken into account by producing a context completed with all attributes deduced from the taxonomy. The drawback, however, is that concept intents then contain redundant information. In this article, we propose a parameterized algorithm to learn rules in the presence of a taxonomy. It works on a non-completed context. The taxonomy is taken into account during the computation so as to remove all redundancies from intents. By simply changing one of its operations, this parameterized algorithm can compute various kinds of concept-based rules. We present instantiations of the parameterized algorithm to learn rules as well as to compute the set of frequent concepts.
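The redundancy mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the article's parameterized algorithm: it completes a toy context with taxonomy ancestors, as Conceptual Scaling would, and then strips from an intent every attribute already implied by a more specific one. The taxonomy, the context, and all function names are hypothetical and chosen only for illustration.

```python
# Hypothetical toy taxonomy: each attribute maps to its direct parent (None for roots).
PARENT = {"animal": None, "mammal": "animal", "dog": "mammal", "cat": "mammal", "pet": None}

# Hypothetical non-completed context: object -> attributes stated explicitly.
CONTEXT = {"rex": {"dog", "pet"}, "tom": {"cat", "pet"}}


def ancestors(attr, parent=PARENT):
    """Return the strict ancestors of attr in the taxonomy."""
    result = set()
    current = parent.get(attr)
    while current is not None:
        result.add(current)
        current = parent.get(current)
    return result


def complete(attrs, parent=PARENT):
    """Add every ancestor of every attribute (the completed row of the context)."""
    out = set(attrs)
    for a in attrs:
        out |= ancestors(a, parent)
    return out


def reduce_intent(attrs, parent=PARENT):
    """Drop attributes that are ancestors of another attribute in the set."""
    implied = set()
    for a in attrs:
        implied |= ancestors(a, parent)
    return set(attrs) - implied


def reduced_intent(objects, context=CONTEXT, parent=PARENT):
    """Shared attributes of the objects, computed on completed rows, without redundancy."""
    rows = [complete(context[o], parent) for o in objects]
    common = set.intersection(*rows) if rows else set()
    return reduce_intent(common, parent)


print(sorted(complete(CONTEXT["rex"])))       # completed row, contains redundant ancestors
print(sorted(reduced_intent(["rex", "tom"])))  # most specific shared attributes only
```

In this toy case the completed intent shared by the two objects contains both "mammal" and "animal", while the reduced intent keeps only "mammal" and "pet". The sketch completes rows first and reduces afterwards, whereas the abstract describes working directly on the non-completed context.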
We present an application of formal concept analysis aimed at representing a meaningful structure of knowledge communities in the form of a lattice-based taxonomy. The taxonomy groups together agents (community members) who develop a set of notions. If no constraints are imposed on how it is built, a knowledge community taxonomy may become extremely complex and difficult to analyze. We consider two approaches to building a concise representation, respecting the underlying structural relationships while hiding superfluous information: a pruning strategy based on the notion of concept stability and a representational improvement based on nested line diagrams and "zooming". We illustrate the methods on two examples: a community of embryologists and a community of researchers in complex systems.
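As a rough illustration of pruning by concept stability, the sketch below uses the usual definition of the stability of a concept: the fraction of subsets of its extent whose shared attributes are exactly its intent. It is a brute-force toy, not the authors' implementation, and the context (agents as objects, notions as attributes) and function names are hypothetical.

```python
from itertools import combinations

# Hypothetical toy context: agent -> notions the agent works on.
CONTEXT = {"a1": {"x", "y"}, "a2": {"x", "y"}, "a3": {"x"}}
ALL_NOTIONS = {"x", "y"}


def intent(agents, context=CONTEXT, all_attrs=ALL_NOTIONS):
    """Notions shared by every agent; all notions for the empty set of agents."""
    shared = set(all_attrs)
    for agent in agents:
        shared &= context[agent]
    return shared


def stability(extent, context=CONTEXT, all_attrs=ALL_NOTIONS):
    """Fraction of subsets of the extent that have the same intent as the extent."""
    members = sorted(extent)
    target = intent(members, context, all_attrs)
    hits = 0
    for size in range(len(members) + 1):
        for subset in combinations(members, size):
            if intent(subset, context, all_attrs) == target:
                hits += 1
    return hits / 2 ** len(members)


print(stability({"a1", "a2", "a3"}))  # concept with intent {"x"}
print(stability({"a1", "a2"}))        # concept with intent {"x", "y"}
```

A pruning strategy of the kind described would keep only concepts whose stability exceeds some chosen threshold. The enumeration is exponential in the extent size, so it is suitable only for small examples.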
Grand Challenges in Biodiversity Informatics.
Using Biodiversity Information Effectively.
In this paper, three kinds of genomic material, namely DNA sequences, protein sequences, and regions (domains), are used to compare methods of virus classification. Virus classes (categories) are divided by taxonomic level into three datasets covering 6 orders, 42 families, and 33 genera. To increase the robustness and comparability of the experimental results, only classes that contain at least 10 instances are selected, and each instance must contain at least one region name. Experimental results show that the approach using region names achieved the best accuracies, reaching 99.9%, 97.3%, and 99.0% for 6 orders, 42 families, and 33 genera, respectively. This paper not only reports exhaustive experiments that compare virus classifications using different genomic materials, but also proposes a novel approach to biological classification based on molecular biology instead of traditional morphology.
Microbial communities in rice plantations display distinct patterns in frequency of abundance and in their relationships with physical and chemical variables. The present study aims to estimate the diversity of cultivable spore-producing bacteria present in water samples collected in the rice-growing regions of southern Brazil, and to relate them to the physical and chemical characteristics of the water. The samples were obtained from the irrigation and drainage channels of irrigated rice plantations during the 2007/2008 crop year. The bacteria were characterized morphocytochemically and identified by sequencing after extraction of total DNA and PCR amplification of the 16S rRNA gene. The results revealed 152 bacterial isolates, including 13 cataloged taxa. Canonical Correspondence Analysis demonstrated that the first two axes explained 43.2% of the total variation in the composition of the taxa. Considering species frequencies, the drainage and irrigation channels in Cachoeirinha and the irrigation channel in Camaquã showed a similarity in bacterial composition of more than 80%, while the drainage channel in Camaquã was the most dissimilar.
Controlled microbial catabolism of sterols can serve biotechnological applications. Three soil-screened strains, Rhodococcus sp. CIP 105335 (strain GK1), strain GK3, and strain GK12, were found to possess a high capability for sterol degradation. Based on their morphology, morphogenetic cycle, physiology, and cell wall chemotype, they were identified as belonging to the genus Rhodococcus. In addition, the cholesterol oxidase of rhodococci is released free, linked to the cell surface layer, or both. This typical location was determined for the three strains and confirmed their membership in the specified genus. The nucleotide sequence of the 16S rRNA gene, determined for GK1 and GK12, showed that strain GK1 might be R. equi or a closely related species, and that isolate GK12 is a strain of Rhodococcus erythropolis.