This paper considers the problem and appropriateness of filling in missing conditional probabilities in causal networks by the use of maximum entropy. Results generalizing earlier work of Rhodes, Garside & Holmes are proved straightforwardly by the direct application of principles satisfied by the maximum entropy inference process under the assumed uniqueness of the maximum entropy solution. It is demonstrated, however, that the implicit assumption of uniqueness in the Rhodes, Garside & Holmes papers may fail even in the case of inverted trees. An alternative approach to filling in missing values using the limiting centre of mass inference process is then described. This approach does not suffer from this shortcoming, is trivially computationally feasible and, when the probabilities are objective (for example derived from frequencies), arguably enjoys more justification than taking maximum entropy values.
In an expert system having a consistent set of linear constraints it is known that the Method of Tribus may be used to determine a probability distribution which exhibits maximised entropy. The method is extended here to include independence constraints (Accommodation).
The paper proceeds to discuss this extension and its limitations, then advances a technique for determining a small set of independencies which can be added to the linear constraints required in a particular representation of an expert system called a causal network, so that the Maximum Entropy and Causal Networks methodologies give matching distributions (Emulation). This technique may also be applied in cases where no initial independencies are given and the linear constraints are incomplete, in order to provide an optimal ME fill-in for the missing information.
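As a toy illustration of an ME fill-in, the following sketch takes a two-node network A → B in which P(a) and P(b | a) are given but P(b | not a) is missing, and searches for the value of the missing probability that maximises the joint entropy. The network, the function names and the golden-section search are assumptions made for this illustration only; this is neither the Method of Tribus nor the technique developed in the paper.

```python
import math

def binary_entropy(x):
    """Entropy (in nats) of a Bernoulli(x) variable; zero at the endpoints."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * math.log(x) - (1.0 - x) * math.log(1.0 - x)

def joint_entropy(p_a, p_b_given_a, p_b_given_not_a):
    """Chain rule: H(A,B) = H(A) + P(a) H(B|a) + P(not a) H(B|not a)."""
    return (binary_entropy(p_a)
            + p_a * binary_entropy(p_b_given_a)
            + (1.0 - p_a) * binary_entropy(p_b_given_not_a))

def me_fill_in(p_a, p_b_given_a, tol=1e-10):
    """Golden-section search for the missing P(b | not a) maximising H(A,B).

    Assumes p_a < 1, so that the missing value actually affects the entropy.
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        x1 = hi - inv_phi * (hi - lo)
        x2 = lo + inv_phi * (hi - lo)
        if joint_entropy(p_a, p_b_given_a, x1) < joint_entropy(p_a, p_b_given_a, x2):
            lo = x1   # the maximum lies to the right of x1
        else:
            hi = x2   # the maximum lies to the left of x2
    return 0.5 * (lo + hi)

# Hypothetical inputs: P(a) = 0.3 and P(b | a) = 0.8 are known.
missing = me_fill_in(0.3, 0.8)
```

In this unconstrained toy case the search settles at approximately 0.5, the uniform value, because nothing ties the missing probability to the known ones. Additional linear or independence constraints, as considered in the paper, can move the ME fill-in away from the uniform value.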
We provide the exact likelihood ratio testing procedure for the scale parameter of the Erlang and gamma distributions when time-to-failure information is missing. This is an important result because the asymptotic χ²-test is oversized and thus inappropriate, especially for small merged samples. Small merged samples can arise even for large sample sizes, e.g. when individual times-to-failure are not available. Data sets with missing time-to-failure data can arise from field data collection systems. Real data and simulated examples are provided to illustrate the methods discussed.
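To make the idea of an exact (rather than asymptotic) likelihood ratio test concrete, here is a minimal sketch under strong simplifying assumptions that are not taken from the paper: only the total time on test is recorded, that total follows an Erlang distribution with known integer shape m, and the null hypothesis fixes the scale at theta0. All function names are hypothetical. The p-value is the exact null probability of a likelihood ratio statistic at least as large as the observed one, computed from the Erlang CDF instead of a χ² approximation.

```python
import math

def erlang_cdf(x, m):
    """CDF of a Gamma(shape=m, scale=1) variable for integer m >= 1."""
    if x <= 0.0:
        return 0.0
    term, total = 1.0, 1.0            # j = 0 term of sum_{j<m} x^j / j!
    for j in range(1, m):
        term *= x / j
        total += term
    return 1.0 - math.exp(-x) * total

def lr_statistic(u, m):
    """-2 log likelihood ratio for H0: scale = theta0, with u = total / theta0."""
    return 2.0 * (u - m - m * math.log(u / m))

def other_root(u_obs, m):
    """Second point on the opposite side of m with the same statistic as u_obs."""
    target = lr_statistic(u_obs, m)
    on_right = u_obs < m              # the root to find lies right of m
    if on_right:
        lo, hi = float(m), 2.0 * m
        while lr_statistic(hi, m) < target:
            hi *= 2.0
    else:
        lo, hi = 1e-300, float(m)
    for _ in range(200):              # plain bisection
        mid = 0.5 * (lo + hi)
        if (lr_statistic(mid, m) < target) == on_right:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def exact_lr_p_value(total_time, m, theta0):
    """Exact two-sided p-value of the likelihood ratio test of scale = theta0."""
    u_obs = total_time / theta0
    u_other = other_root(u_obs, m)
    u_lo, u_hi = min(u_obs, u_other), max(u_obs, u_other)
    # Under H0, u ~ Gamma(m, 1); reject for u below u_lo or above u_hi.
    return erlang_cdf(u_lo, m) + (1.0 - erlang_cdf(u_hi, m))

# Hypothetical data: total shape m = 5, total time 12.0, null scale 1.0.
p = exact_lr_p_value(12.0, 5, 1.0)
```

Because the statistic is not symmetric around its minimum, the two rejection tails carry unequal probability, which is one reason a χ² approximation can misstate the size of the test for small samples.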
Protein–protein interactions (PPIs) are important for understanding the cellular mechanisms of biological functions, but the reliability of PPIs extracted by high-throughput assays is known to be low. To address this, many current methods use multiple pieces of evidence from different sources of information to compute reliability scores for such PPIs. However, they often combine the evidence without taking into account the uncertainty of the evidence values, potential dependencies between the information sources used and missing values from some information sources. We propose to formulate the task of scoring PPIs using multiple information sources as a multi-criteria decision making problem that can be solved using data fusion to model potential interactions between the multiple information sources. Using data fusion, the contribution of each information source can be proportioned accordingly to systematically score the reliability of PPIs. Our experimental results showed that the reliability scores assigned by our data fusion method can effectively classify highly reliable PPIs from multiple information sources, with substantial improvement in scoring over conventional approaches such as the Adjust CD-Distance approach. In addition, the underlying interactions between the information sources used, as well as their relative importance, can also be determined with our data fusion approach. We also showed that such knowledge can be used to effectively handle missing values from information sources.