
  Bestsellers

  • Article (Open Access)

    A SURVIVAL ANALYSIS AND GRADE PREDICTION MODEL FOR LUNG SQUAMOUS CELL CARCINOMA BASED ON MULTIPLE-INSTANCE LEARNING AND MULTI-SCALE TRANSFORMER

    Respiratory diseases are now the third leading cause of death worldwide. Lung squamous cell carcinoma (LUSC) has one of the highest morbidity and mortality rates among respiratory diseases. Therefore, constructing a model based on pathological images for survival analysis and grade prediction of LUSC is of great significance for designing personalized solutions for LUSC. Current LUSC survival analysis and grade prediction models have two main issues. First, they lack high-performance multi-scale feature methods, which limits their ability to discriminate among LUSC case images. Second, the feature extractor is highly redundant, resulting in many repeated calculations. Excessive feature extraction can be regarded as noise, which not only increases the training cost of the model but also reduces its performance. This study addressed these limitations by proposing a multiple-instance learning-based multi-scale transformer (MSTrans-MIL) for LUSC survival analysis and grade prediction. The contributions of this study are as follows. First, we propose a feature sampling module (FSM) based on a self-attention mechanism, which reduces information redundancy in the input space and improves the model’s applicability. Second, we construct a multi-scale pathology feature extraction module based on self-supervised learning and introduce a convolution–transformer to adequately extract the local and global features of images at different scales. The multi-scale chains are also beneficial for understanding the interaction between the tissue microenvironment and tumor cells. In addition, we propose a multi-scale feature encoder with a sparse transformer to further reduce feature redundancy, and construct a multi-scale feature aggregation module using a gating unit to enhance the hierarchy of the feature representations and improve the robustness and accuracy of the model. Extensive ablation and comparison experiments demonstrate that the proposed MSTrans-MIL reduces feature redundancy and improves the prediction of LUSC grading and prognosis.
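
    The gated aggregation described above is in the spirit of gated attention pooling for multiple-instance learning. Below is a minimal PyTorch sketch of that generic construction (after Ilse et al.; not the authors’ exact module, and all dimensions are assumed):

        import torch
        import torch.nn as nn

        class GatedAttentionPooling(nn.Module):
            """Gated attention pooling over a bag of patch features."""
            def __init__(self, feat_dim=512, attn_dim=128):
                super().__init__()
                self.V = nn.Linear(feat_dim, attn_dim)   # tanh branch
                self.U = nn.Linear(feat_dim, attn_dim)   # sigmoid gate branch
                self.w = nn.Linear(attn_dim, 1)          # per-patch attention score

            def forward(self, h):                        # h: (num_patches, feat_dim)
                scores = self.w(torch.tanh(self.V(h)) * torch.sigmoid(self.U(h)))
                alpha = torch.softmax(scores, dim=0)     # weights sum to 1 over patches
                return (alpha * h).sum(dim=0)            # bag-level slide feature

        bag = torch.randn(100, 512)                      # 100 patch embeddings (assumed dims)
        slide_feature = GatedAttentionPooling()(bag)     # shape: (512,)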

  • Article (No Access)

    STOCHASTIC PERIODIC SOLUTION OF A NUTRIENT–PLANKTON MODEL WITH SEASONAL FLUCTUATION

    In this paper, a stochastic nutrient–plankton model with seasonal fluctuation is developed to investigate how seasonality and environmental noise affect the dynamics of aquatic ecosystems. First, we analyze the survival of plankton. Then, by using a Lyapunov function and Khasminskii’s theory for periodic Markov processes, we derive sufficient conditions for the existence of a positive periodic solution. Numerical simulations are carried out to provide a better understanding of the model, and the results indicate that seasonal fluctuation is beneficial to the coexistence of plankton species.

  • Article (No Access)

    MAXIMUM ENTROPY PRINCIPLE AND THE LOGISTIC MODEL

    This paper proposes the use of the maximum entropy principle to construct a probability model under constraints for the analysis of dichotomous data using the odds ratio adjusted for covariates. It gives a new understanding of the now famous logistic model. We show that we can do away with the hypothesis of linearity of the log odds and still use the model properly. From a practical point of view, the result implies that we do not have to discuss the plausibility of the linearity hypothesis relative to the data or the phenomenon under study. Hence, when using the logistic model, we do not have to discuss the multiplicative effect of the covariates on the odds ratio. This is a major gain in the use of the model if one does not have to establish or justify the multiplicative effect of, for instance, alcohol consumption when studying low birth weight babies.
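
    To make concrete the log-odds linearity hypothesis discussed above, recall the standard logistic model (textbook notation, not taken from the paper):

        \log \frac{P(Y=1 \mid x)}{P(Y=0 \mid x)} = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p,

    under which a unit change in x_j multiplies the odds by e^{\beta_j} whatever the values of the other covariates; it is exactly this multiplicative-effect reading that the maximum entropy derivation makes unnecessary to justify.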

  • Article (No Access)

    A Survival Analysis of Business Insolvency in ICT and Automobile Industries

    This study examines the differences in business insolvency between the information and communication technology (ICT) industry and the automobile industry, as well as their sub-categories of manufacturers and service providers, based on survival analysis. The results indicate that, unlike the technology diffusion model, the survival analysis provides clearer explanations of the differences in business insolvency among ICT, automobile, manufacturing, and service firms. We found that the insolvency of ICT firms is influenced by the short life cycle of the products and services that these firms provide to their customers, as compared with the much longer life cycle of the products of manufacturers. We also identified that ICT service providers’ business depends heavily on ICT manufacturers, but that to strengthen their survival ability they must continuously introduce technological innovations. In addition, the study examined the causes of business insolvency of ICT firms based on the characteristics of the ICT industry, including the swing effect, the bandwagon effect, and winner-take-all outcomes. The study results provide valuable insights for developing survival strategies for ICT and automobile firms. The study also contributes to the literature by showing that the business insolvency phenomenon can be better explained through statistical significance analysis.

  • Article (No Access)

    A TRANSCRIPTOME ANALYSIS BY LASSO PENALIZED COX REGRESSION FOR PANCREATIC CANCER SURVIVAL

    Pancreatic cancer is the fourth leading cause of cancer deaths in the United States, with five-year survival rates below 5% because it is rarely detected in its early stages. Identification of genes that are directly correlated with pancreatic cancer survival is crucial for pancreatic cancer diagnostics and treatment. However, no existing GWAS or transcriptome studies are available that address this problem. We apply lasso penalized Cox regression to a transcriptome study to identify genes that are directly related to pancreatic cancer survival. This method is capable of handling the right-censoring of survival times and the ultrahigh dimensionality of genetic data. A cyclic coordinate descent algorithm is employed to rapidly select the most relevant genes and eliminate the irrelevant ones. Twelve genes have been identified and verified to be directly correlated with pancreatic cancer survival time and can be used to predict future patients’ survival.
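
    In standard notation (not taken from the paper), the method maximizes the L1-penalized Cox partial log-likelihood over the coefficient vector \beta, with \delta_i the event indicator and R(t_i) the risk set at time t_i:

        \hat{\beta} = \arg\max_{\beta} \Bigg\{ \sum_{i:\,\delta_i=1} \Big[ x_i^{\top}\beta - \log \sum_{j \in R(t_i)} \exp(x_j^{\top}\beta) \Big] - \lambda \sum_{k=1}^{p} |\beta_k| \Bigg\}.

    Cyclic coordinate descent updates one coordinate of \beta at a time with a soft-thresholding step, which is what drives the coefficients of irrelevant genes exactly to zero.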

  • Article (No Access)

    Identification of genes associated with cancer prognosis in glioma: In silico data analyses of Chinese Glioma Genome Atlas (CGGA) and The Cancer Genome Atlas (TCGA)

    Glioma is a highly complex type of brain malignancy that usually has a poor prognosis. Despite the limited diagnostic precision for glioma, the survival time of affected patients varies broadly. Here, we conducted a detailed analysis of the differences in patient survival time to discover potential survival-related genes in glioma as well as their putative regulatory mechanisms. To contextualize the acquisition of these potential prognosis markers in large populations, particularly in China, we combined the CGGA and The Cancer Genome Atlas (TCGA) databases to identify genes that are significantly related to survival. Our workflow combined a series of analytical approaches, including differential analysis, survival analysis, co-expression analysis, clinical correlation analysis, ROC curve evaluation, and assessment of prediction ability. Our results indicate that four genes (PLAT, IGFBP2, BCAT1, and SERPINH1) could be used as independent prognostic marker genes. These genes also showed good prognostic ability in distinct populations, reiterating the robustness and value of these prognostic markers.

  • Article (No Access)

    Incorporating biological networks into high-dimensional Bayesian survival analysis using an ICM/M algorithm

    The Cox proportional hazards model has been widely used in cancer genomic research that aims to identify, from a high-dimensional gene expression space, genes associated with the survival time of patients. With the increase in expertly curated biological pathways, it is challenging to incorporate such complex networks when fitting a high-dimensional Cox model. This paper considers a Bayesian framework that employs an Ising prior to capture relations among genes represented by graphs. A spike-and-slab prior is also assigned to each of the coefficients for the purpose of variable selection. The iterated conditional modes/medians (ICM/M) algorithm is proposed for implementing Cox models. The ICM/M estimates hyperparameters using conditional modes and obtains coefficients through conditional medians. This procedure produces some coefficients that are exactly zero, making the model more interpretable. Comparisons of the ICM/M and other regularized Cox models were carried out on both simulated and real data. Compared to lasso, adaptive lasso, elastic net, and DegreeCox, the ICM/M yielded more parsimonious models with consistent variable selection. The ICM/M model also produced fewer false positives than the other methods and showed promising results in terms of predictive accuracy. In terms of computing times among the network-aware methods, the ICM/M algorithm is substantially faster than DegreeCox, even when incorporating a large complex network. The implementation of the ICM/M algorithm for the Cox regression model is provided in the R package icmm, available on the Comprehensive R Archive Network (CRAN).
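
    In generic notation (one common form of such priors; the paper’s exact parametrization may differ), each coefficient \beta_j carries a binary inclusion indicator \tau_j:

        \beta_j \mid \tau_j \sim (1-\tau_j)\,\delta_0 + \tau_j\, g, \qquad
        p(\tau) \propto \exp\Big( a \sum_j \tau_j + b \sum_{(j,k) \in E} \tau_j \tau_k \Big),

    where \delta_0 is a point mass at zero, g is a continuous “slab” density, and E is the edge set of the gene network; b > 0 encourages neighboring genes in the network to be selected together.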

  • Article (No Access)

    Knowledge Flow Determinants of Patent Value: Evidence from Taiwan and South Korea Biotechnology Patents

    Patents are important knowledge assets that enable firms to compete successfully and earn returns. To control spending on patent renewal fees, firms carefully analyze the value of their patents to determine whether patent rights need to be maintained, thereby rendering renewal decisions an opportunity to examine patent values. This study incorporates social network analysis and traditional patent indicators to empirically investigate the determinants affecting patent valuation by firms and research institutes in the Taiwanese and South Korean biotechnology industries. Results indicate that firms and research institutes value their patents differently. Certain indicators derived from knowledge flow and traditional patent indicators significantly affect patent value. Based on the findings, this study provides suggestions for policy-making.

  • Article (No Access)

    A THRESHOLD MODEL FOR CELL SURVIVAL

    This paper deals with a threshold model for gene damages in a cell subject to repair. The damages and repairs are considered to be stochastic events. The status of the cell is partitioned into a number of states representing the number of proliferative gene damages it has suffered. The probability of finding the cell with a certain number of gene damages at any time and the expected number of repairs in an arbitrary interval are obtained.
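
    A model of this kind can be sketched as a finite birth–death chain (a generic illustration under assumed constant rates, not the paper’s exact formulation). With damages arriving at rate \lambda, repairs occurring at rate \mu, P_n(t) the probability of n proliferative damages at time t, and the threshold state N absorbing, the forward equations read

        P_0'(t) = \mu P_1(t) - \lambda P_0(t),
        P_n'(t) = \lambda P_{n-1}(t) + \mu P_{n+1}(t) - (\lambda + \mu) P_n(t), \quad 1 \le n \le N-2,
        P_{N-1}'(t) = \lambda P_{N-2}(t) - (\lambda + \mu) P_{N-1}(t), \qquad P_N'(t) = \lambda P_{N-1}(t),

    and the expected number of repairs on an interval follows by integrating \mu \sum_{n=1}^{N-1} P_n(t) over that interval.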

  • Article (No Access)

    A new generalized version of Log-logistic distribution with applications in medical sciences and other applied fields

    In this paper, we study a two-parameter transmuted model of the log-logistic distribution (LLD), obtained via the quadratic rank transmutation map technique of Shaw and Buckley [1], as a new survival model for medical sciences and other applied fields. Statistical properties of the transmuted LLD (TLLD) are discussed comprehensively. Robust measures of skewness and kurtosis of the proposed model are also discussed, along with a graphical overview. The model parameters are estimated by the maximum likelihood (ML) method, followed by a Monte Carlo (MC) simulation procedure to investigate the performance of the ML estimators and the asymptotic confidence intervals of the parameters. Applications of the proposed model to real-life data are also presented.
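
    For reference, Shaw and Buckley’s quadratic rank transmutation map turns a baseline CDF F into (standard form; the paper’s notation may differ)

        G(x) = (1 + \lambda)\,F(x) - \lambda\,F(x)^2, \qquad |\lambda| \le 1,

    which reduces to the baseline at \lambda = 0; taking F to be the log-logistic CDF F(x) = x^{\beta} / (\alpha^{\beta} + x^{\beta}), x > 0, gives the transmuted LLD.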

  • Article (Open Access)

    SIMULATION OF INTERVAL CENSORED DATA IN MEDICAL AND BIOLOGICAL STUDIES

    This research looks at the simulation of interval-censored data when the survivor function of the survival time is known and the attendance probability of the subjects at follow-ups can take any value between 0 and 1. Interval-censored data often arise in medical and biological follow-up studies where the event of interest occurs somewhere between two known times. Regardless of the methods used to analyze these types of data, simulation of interval-censored data is an important and challenging step toward model building and prediction of survival time. The simulation itself is rather tedious and very computer-intensive due to the interval monitoring of subjects at prescheduled times and the subjects’ incomplete attendance at follow-ups. In this paper, data simulated by the proposed method are assessed using the bias, standard error, and root mean square error (RMSE) of the parameter estimates, with the survival time T assumed to follow the Gompertz distribution.
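
    A minimal Python sketch of this kind of simulation, under an assumed Gompertz parametrization with hazard h(t) = b e^{ct} and hence S(t) = exp(-(b/c)(e^{ct} - 1)); the visit schedule, parameters, and attendance probability below are illustrative, not the paper’s:

        import numpy as np

        rng = np.random.default_rng(1)
        b, c = 0.05, 0.2               # assumed Gompertz parameters
        p_attend = 0.7                 # attendance probability per scheduled visit
        visits = np.arange(1.0, 11.0)  # ten prescheduled follow-up times

        def sample_gompertz(n):
            """Inverse-CDF sampling: solve S(t) = U for t."""
            u = rng.uniform(size=n)
            return np.log1p(-(c / b) * np.log(u)) / c

        for t in sample_gompertz(5):
            attended = visits[rng.uniform(size=visits.size) < p_attend]
            left = attended[attended < t].max(initial=0.0)   # last attended visit before the event
            later = attended[attended >= t]
            right = later.min() if later.size else np.inf    # first attended visit after (else right-censored)
            print(f"event at {t:.2f} observed in ({left:.1f}, {right})")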

  • Article (No Access)

    Semi-Parametric Cure Rate Proportional Odds Models with Spatial Frailties for Interval-Censored Data

    In this work, we propose semi-parametric cure rate models with independent and dependent spatial frailties. These models extend the proportional odds cure models and allow for spatial correlation by including spatial frailties in the interval-censored data setting. Moreover, since these cure models are obtained by assuming that the event of interest occurs when any of the nonobserved risks is activated, we also study the complementary cure model, in which the event of interest occurs only when all of the nonobserved risks are activated. The MCMC method is used in a Bayesian approach for inferential purposes. We conduct influence diagnostics to detect possible influential or extreme observations that could distort the results of the analysis. Finally, the proposed models are applied to the analysis of a real data set.
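
    One standard instance of such a latent-risk construction (the first-activation scheme; the paper’s specification may differ) takes a Poisson(\theta) number of unobserved risks, each activating at a time with CDF F, so that the population survivor function is

        S_{\mathrm{pop}}(t) = \exp\{-\theta F(t)\}, \qquad \lim_{t \to \infty} S_{\mathrm{pop}}(t) = e^{-\theta},

    leaving a cured proportion e^{-\theta}; the complementary (last-activation) scheme instead triggers the event only once all latent risks have activated.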

  • Article (No Access)

    D-Measure: A Bayesian Model Selection Criterion for Survival Data

    A natural way of assessing the goodness of a model is to estimate its predictive capability. In this paper, we propose the D-measure, which assesses the goodness of a model by comparing, through the survival function, how close its predictions are to the observed data. The proposed D-measure can be used for all kinds of survival data in the presence of censoring. It can also be used to compare cure rate models, even in the presence of random effects or frailties. The advantages of the D-measure are verified via simulation, in which it is compared to the deviance information criterion, a widely used Bayesian model comparison criterion. The D-measure is illustrated on two real data sets.

  • Article (No Access)

    Survival analysis via Cox proportional hazards additive models

    The Cox proportional hazards model is commonly used to examine the covariate-adjusted association between a predictor of interest and the risk of mortality for censored survival data. However, it assumes a parametric relationship between covariates and mortality risk through a linear predictor. Generalized additive models (GAMs) provide a flexible extension of the usual linear model and are capable of capturing nonlinear effects of predictors while retaining additivity between the predictor effects. In this paper, we provide a review of GAMs and incorporate bivariate additive modeling into the Cox model for censored survival data, with applications to estimating geolocation effects on survival in spatial epidemiologic studies.
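
    Schematically (notation assumed, not the paper’s), the hazard in such a Cox additive model with a bivariate geolocation term takes the form

        \lambda(t \mid x) = \lambda_0(t)\,\exp\Big\{ \sum_{j=1}^{p} f_j(x_j) + g(\mathrm{lon}, \mathrm{lat}) \Big\},

    where each f_j is a smooth univariate function, g is a smooth bivariate surface over longitude and latitude, and \lambda_0 is the unspecified baseline hazard of the Cox model.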

  • Chapter (No Access)

    Chapter 120: Survival Analysis: Theory and Application in Finance

    This chapter outlines some commonly used statistical methods for studying the occurrence and timing of events, i.e., survival analysis, which is also called duration analysis or transition analysis in econometrics. Statistical methods for survival data usually comprise non-parametric, parametric, and semiparametric methods. While some non-parametric estimators (e.g., the Kaplan–Meier estimator and the life-table estimator) estimate survivor functions, others (e.g., the Nelson–Aalen estimator) estimate the cumulative hazard function. The most commonly used non-parametric test for comparing survivor functions is the log-rank test. Parametric models, such as the exponential model, the Weibull model, and the generalized Gamma model, are based on different distributional assumptions about survival time. The leading semiparametric regression model is the Cox proportional hazards (PH) model, which is estimated by the method of partial likelihood and does not require a distributional assumption about survival time. Applications to discrete-time data and the competing risks model are also introduced.
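
    A minimal Python sketch of the estimators and test named above, using the lifelines package (assumed installed); the data below are synthetic stand-ins for, e.g., firm duration records:

        import numpy as np
        import pandas as pd
        from lifelines import KaplanMeierFitter, CoxPHFitter
        from lifelines.statistics import logrank_test

        rng = np.random.default_rng(0)
        n = 200
        df = pd.DataFrame({
            "time": rng.exponential(10, n),     # duration until event or censoring
            "event": rng.integers(0, 2, n),     # 1 = event observed, 0 = censored
            "group": rng.integers(0, 2, n),     # a binary covariate / comparison group
        })

        # Kaplan-Meier estimate of the survivor function
        kmf = KaplanMeierFitter().fit(df["time"], event_observed=df["event"])

        # Log-rank test comparing the survivor functions of two groups
        a, b = df[df.group == 0], df[df.group == 1]
        lr = logrank_test(a["time"], b["time"], a["event"], b["event"])

        # Cox PH model, estimated by partial likelihood
        cph = CoxPHFitter().fit(df, duration_col="time", event_col="event")
        print(kmf.median_survival_time_, lr.p_value, cph.params_)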

  • Chapter (No Access)

    Chapter 1: Generalized Iterative Modeling for Clinical Omics Data Analysis

    Artificial intelligence has shown great potential in many aspects of human life, including biomedical research and clinical care. Methods such as artificial neural networks are capable of handling large volumes of data from various medical image modalities, or from “omics” technologies, including genomics, transcriptomics, proteomics, metabolomics, and glycomics. However, these methods often offer “black box” solutions in which the expressibility of the numerical models is not satisfactory. Here, we developed the generalized iterative modeling (GIM) method, extending conventional generalized linear models with a machine-learning twist. This method features the iterative shaping of highly expressive polynomial models with automatically determined combinations of clinical and omics variables. The models can be written in a few lines of mathematical equations, allowing human comprehension and interpretation. This also facilitates implementation on a wide diversity of hardware and software platforms. The GIM software is currently available for optimizing U-statistics, F-statistics, and the log-likelihood of the Cox proportional hazards model. Using real data, the performance of GIM was demonstrated to be better than that of generalized linear models and orthogonal partial least squares discriminant analysis. The source code of GIM can be found at the GitHub site https://github.com/khliang/GIM.

  • Chapter (Open Access)

    Automated phenotyping of patients with non-alcoholic fatty liver disease reveals clinically relevant disease subtypes

    Non-alcoholic fatty liver disease (NAFLD) is a complex, heterogeneous disease that affects more than 20% of the population worldwide. Some subtypes of NAFLD have been clinically identified using hypothesis-driven methods. In this study, we used data mining techniques to search for subtypes in an unbiased fashion. Using electronic signatures of the disease, we identified a cohort of 13,290 patients with NAFLD from a hospital database. We gathered clinical data from multiple sources and applied unsupervised clustering to identify five subtypes within this cohort. Descriptive statistics and survival analysis showed that the subtypes were clinically distinct and were associated with different rates of death, cirrhosis, hepatocellular carcinoma, chronic kidney disease, cardiovascular disease, and myocardial infarction. Novel disease subtypes identified in this manner could be used to risk-stratify patients and guide management.
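
    The abstract does not name the clustering algorithm; as a generic illustration of the workflow (unsupervised clustering followed by per-subtype survival curves), here is a sketch with k-means and Kaplan-Meier fits on synthetic data, assuming scikit-learn and lifelines are installed:

        import numpy as np
        from sklearn.cluster import KMeans
        from lifelines import KaplanMeierFitter

        rng = np.random.default_rng(0)
        X = rng.normal(size=(1000, 8))       # synthetic clinical feature matrix
        time = rng.exponential(10, 1000)     # follow-up times
        event = rng.integers(0, 2, 1000)     # 1 = event observed, 0 = censored

        labels = KMeans(n_clusters=5, n_init=10, random_state=0).fit_predict(X)
        for k in range(5):                   # survivor curve per putative subtype
            m = labels == k
            kmf = KaplanMeierFitter().fit(time[m], event[m], label=f"subtype {k}")
            print(kmf.median_survival_time_)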

  • Chapter (Open Access)

    PAGE-Net: Interpretable and Integrative Deep Learning for Survival Analysis Using Histopathological Images and Genomic Data

    The integration of multi-modal data, such as histopathological images and genomic data, is essential for understanding cancer heterogeneity and complexity for personalized treatments, as well as for enhancing survival prediction in cancer studies. Histopathology, the clinical gold-standard tool for diagnosis and prognosis in cancer, allows clinicians to make precise decisions about therapies, whereas high-throughput genomic data have been investigated to dissect the genetic mechanisms of cancers. We propose a biologically interpretable deep learning model (PAGE-Net) that integrates histopathological images and genomic data, not only to improve survival prediction, but also to identify genetic and histopathological patterns that cause different survival rates in patients. PAGE-Net consists of pathology-, genome-, and demography-specific layers, each of which provides comprehensive biological interpretation. In particular, we propose a novel patch-wise texture-based convolutional neural network, with a patch aggregation strategy, to extract global survival-discriminative features without manual annotation for the pathology-specific layers. We adapted the pathway-based sparse deep neural network Cox-PASNet for the genome-specific layers. The proposed deep learning model was assessed with the histopathological images and gene expression data of Glioblastoma Multiforme (GBM) from The Cancer Genome Atlas (TCGA) and The Cancer Imaging Archive (TCIA). PAGE-Net achieved a C-index of 0.702, which is higher than the results achieved with histopathological images only (0.509) and with Cox-PASNet (0.640). More importantly, PAGE-Net can simultaneously identify histopathological and genomic prognostic factors associated with patients’ survival. The source code of PAGE-Net is publicly available at https://github.com/DataX-JieHao/PAGE-Net.
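
    For reference, the C-index reported above measures the proportion of comparable patient pairs whose predicted risks are correctly ordered. A small sketch using lifelines’ concordance_index on synthetic numbers (not the paper’s data):

        import numpy as np
        from lifelines.utils import concordance_index

        rng = np.random.default_rng(0)
        event_times = rng.exponential(12, 100)       # observed survival times
        event_observed = rng.integers(0, 2, 100)     # 1 = death observed, 0 = censored
        risk = -event_times + rng.normal(0, 4, 100)  # a risk score: higher = worse prognosis

        # concordance_index expects predictions where larger = longer survival,
        # so a risk score is passed with its sign flipped.
        print(concordance_index(event_times, -risk, event_observed))  # ~0.5 random, 1.0 perfect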

  • Chapter (Open Access)

    Improving survival prediction using a novel feature selection and feature reduction framework based on the integration of clinical and molecular data

    The accurate prediction of a cancer patient’s risk of progression or death can guide clinicians in the selection of treatment and help patients in planning personal affairs. Predictive models based on patient-level data represent a tool for determining risk. Ideally, predictive models will use multiple sources of data (e.g., clinical, demographic, molecular, etc.). However, there are many challenges associated with data integration, such as overfitting and redundant features. In this paper, we aim to address those challenges through the development of a novel feature selection and feature reduction framework that can handle correlated data. Our method begins by computing a survival distance score for gene expression, which, in combination with a score for clinical independence, results in the selection of highly predictive genes that are non-redundant with clinical features. The survival distance score is a measure of variation of gene expression over time, weighted by the variance of the gene expression over all patients. Selected genes, in combination with clinical data, are used to build a predictive model for survival. We benchmark our approach against commonly used methods, namely lasso- and ridge-penalized Cox proportional hazards models, using three publicly available cancer data sets: kidney cancer (521 samples), lung cancer (454 samples), and bladder cancer (335 samples). Across all data sets, our approach, built on the training set, outperformed models using clinical data alone on the test set in terms of predictive power, with a c.Index of 0.773 vs 0.755 for kidney cancer, 0.695 vs 0.664 for lung cancer, and 0.648 vs 0.636 for bladder cancer. Further, we showed increased predictive performance of our method compared to lasso-penalized models fit to both gene expression and clinical data, which had c.Index values of 0.767, 0.677, and 0.645, as well as increased or comparable predictive power compared to ridge models, which had c.Index values of 0.773, 0.668, and 0.650 for the kidney, lung, and bladder cancer data sets, respectively. Therefore, our score for clinical independence improves prognostic performance compared to modeling approaches that do not consider combining non-redundant data. Future work will concentrate on optimizing the survival distance score in order to achieve improved results for all types of cancer.

  • Chapter (No Access)

    Chapter 9: Maximum Likelihood Estimation and Quasi-Maximum Likelihood Estimation

    Conditional probability distribution models have been widely used in economics and finance. In this chapter, we introduce two closely related popular methods to estimate conditional distribution models: Maximum Likelihood Estimation (MLE) and Quasi-MLE (QMLE). MLE is a parameter estimator that maximizes the model likelihood function of the random sample when the conditional distribution model is correctly specified, whereas QMLE is a parameter estimator that maximizes the model likelihood function of the random sample when the conditional distribution model is misspecified. Because the score function is a martingale difference sequence (MDS) and the dynamic Information Matrix (IM) equality holds when a conditional distribution model is correctly specified, the asymptotic properties of the MLE are analogous to those of the OLS estimator when the regression disturbance is an MDS with conditional homoskedasticity, and we can use the Wald test, the LM test, and the Likelihood Ratio (LR) test for hypothesis testing, where the LR test is analogous to the J·F test statistic. On the other hand, when the conditional distribution model is misspecified, the score function has mean zero, but it may no longer be an MDS and the dynamic IM equality may fail. As a result, the asymptotic properties of the QMLE are analogous to those of the OLS estimator when the regression disturbance displays serial correlation and/or conditional heteroskedasticity. Robust Wald tests and LM tests can be constructed for hypothesis testing, but the LR test can no longer be used, for a reason similar to the failure of the F-test statistic when the regression disturbance displays serial correlation and/or conditional heteroskedasticity. We discuss methods for testing the MDS property of the score function, the dynamic IM equality, and the correct specification of a conditional distribution model.
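
    In generic notation (a standard formulation, not necessarily the chapter’s), the (Q)MLE solves

        \hat{\theta} = \arg\max_{\theta} \sum_{t=1}^{n} \log f(Y_t \mid I_{t-1}; \theta).

    Under correct specification, the dynamic IM equality E[s_t s_t^{\top}] = -E[\partial s_t / \partial \theta^{\top}] holds and \sqrt{n}(\hat{\theta} - \theta_0) \xrightarrow{d} N(0, I(\theta_0)^{-1}); under misspecification one instead obtains the sandwich form

        \sqrt{n}\,(\hat{\theta} - \theta^{*}) \xrightarrow{d} N\big(0,\; H^{-1} V H^{-1}\big), \qquad H = -E\big[\nabla_{\theta}^{2} \log f\big], \quad V = \mathrm{avar}\Big(n^{-1/2} \sum_{t} s_t\Big),

    which is why robust Wald and LM tests remain valid while the LR test, whose asymptotic distribution relies on V = H, does not.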