World Scientific

IMPORTANCE MEASUREMENT OF THE INFLUENCING FACTORS OF LONG-TERM NURSING STATUS IN LONG-TERM NURSING INSURANCE BASED ON MULTIPLE LINEAR REGRESSION, RANDOM FOREST AND XGBOOST MODELS

    https://doi.org/10.1142/S0218348X22401776

    Long-term care for the elderly has become a prominent social problem worldwide as the share of people aged over 65 steadily increases in almost all countries. One possible solution is long-term care insurance provided by insurance companies. However, because long-term care involves many factors, companies need to classify care status types based on price, or to provide support through their organizational structures, such as departmental communication, business selection, and market segmentation. This research aims to fill a gap: there is no comprehensive research on the factors that affect the long-term care status of the elderly. To determine those factors, three machine learning (ML) algorithms are employed: multiple linear regression, random forest, and XGBoost. The factors and their important variables are then used to predict insurance pricing. The 2018 China Health and Retirement Longitudinal Study (CHARLS) data set is used to determine the factors that have key impacts on long-term care status in the elderly. Finally, all models are combined into a comprehensive model to achieve better prediction accuracy. The results show that the three ML models provide relatively consistent importance measures of the risk factors that determine the nursing status of the elderly. Compared with multiple linear regression, the prediction accuracy of random forest and XGBoost improved by 0.6% and 1%, respectively. In addition, when ratios of 2.6, 3.7, and 3.7 are assigned to the results of the three models, the prediction accuracy of the comprehensive model on the test set is 1.92% higher than that of multiple linear regression. The main innovation of this research is the construction of a comprehensive model, a weighted combination of the three models, with better prediction accuracy.
Ultimately, the long-term care insurance business can use the comprehensive model to classify the long-term care status of the elderly.
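
The weighted combination described in the abstract can be illustrated with a minimal sketch. The abstract gives only the ratios (2.6, 3.7, 3.7); it does not say how they are applied, so this example assumes a normalized weighted average of the three models' predictions, and it assumes the ratios map to multiple linear regression, random forest, and XGBoost in that order. The function name `combine_predictions` and the sample predictions are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence

# Ratios reported in the abstract. Mapping them to (multiple linear
# regression, random forest, XGBoost) in this order is an assumption.
RATIOS = (2.6, 3.7, 3.7)


def combine_predictions(
    model_predictions: Sequence[Sequence[float]],
    ratios: Sequence[float] = RATIOS,
) -> List[float]:
    """Blend per-model predictions with a normalized weighted average.

    This is an assumed combination rule, not the paper's confirmed method.
    """
    if len(model_predictions) != len(ratios):
        raise ValueError("need exactly one ratio per model")
    total = sum(ratios)
    if total <= 0:
        raise ValueError("ratios must sum to a positive value")
    # Normalize so the weights sum to 1 (2.6, 3.7, 3.7 -> 0.26, 0.37, 0.37).
    weights = [r / total for r in ratios]
    n_samples = len(model_predictions[0])
    if any(len(p) != n_samples for p in model_predictions):
        raise ValueError("all models must predict the same number of samples")
    return [
        sum(w * preds[i] for w, preds in zip(weights, model_predictions))
        for i in range(n_samples)
    ]


# Made-up care-status scores for three respondents (illustrative only).
mlr_pred = [1.0, 2.0, 3.0]
rf_pred = [1.0, 3.0, 3.0]
xgb_pred = [1.0, 3.0, 2.0]
blended = combine_predictions([mlr_pred, rf_pred, xgb_pred])
```

Under these assumptions, the two tree-based models together carry 74% of the weight, so the blended score follows them wherever they agree and disagree with the linear model.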