
  Bestsellers

  • Article (No Access)

    National scrap steel price forecasts using Gaussian process regression models

    Governments and investors have historically relied on price forecasts for a broad variety of commodities. Using time-series data covering 23 August 2013–15 April 2021, this study investigates the challenging problem of forecasting scrap steel prices, which are published daily at the national level for China. Prior research has not given adequate consideration to forecasts for this crucial commodity price. Here, Gaussian process regression models are developed using cross-validation procedures and Bayesian optimization, and price forecasts are constructed from them. Our empirical prediction technique produces reasonably accurate price estimates for the out-of-sample period covering 17 September 2019–15 April 2021, with a relative root mean square error of 0.1053%. Governments and investors may use such price prediction models to make informed decisions about the scrap steel industry.
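    As a rough illustration of the modeling pipeline described above, the sketch below fits a Gaussian process regression to a synthetic daily price series using lagged prices as features. The lag count, the RBF-plus-noise kernel, and the synthetic data are all assumptions for illustration; the paper's exact features, kernel, and Bayesian optimization setup are not given in the abstract.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
prices = 100.0 + np.cumsum(rng.normal(0, 1, 300))  # synthetic daily price series

# Lagged features: predict each day's price from the previous 5 days.
lags = 5
X = np.column_stack([prices[i:len(prices) - lags + i] for i in range(lags)])
y = prices[lags:]

split = int(0.8 * len(y))  # chronological train/test split
kernel = RBF(length_scale=10.0) + WhiteKernel(noise_level=1.0)
gpr = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gpr.fit(X[:split], y[:split])

pred = gpr.predict(X[split:])
rrmse = np.sqrt(np.mean(((pred - y[split:]) / y[split:]) ** 2)) * 100
print(f"relative RMSE: {rrmse:.4f}%")
```

    A chronological split stands in for the paper's out-of-sample period, and the relative root mean square error printed at the end matches the error measure the abstract reports.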

  • Article (Open Access)

    Forecasts of Residential Real Estate Price Indices for Ten Major Chinese Cities through Gaussian Process Regressions

    Due to the rapid growth of the Chinese housing market over the past ten years, forecasting home prices has become a crucial issue for investors and authorities alike. In this research, utilising Bayesian optimisation and cross-validation, we investigate Gaussian process regressions across various kernels and basis functions for monthly residential real estate price index projections for ten major Chinese cities from July 2005 to April 2021. The developed models provide accurate out-of-sample forecasts for the ten price indices from May 2019 to April 2021, with relative root mean square errors ranging from 0.0207% to 0.2818%. Our findings could be used individually, or in combination with other projections, to form views about residential real estate price index trends and to carry out additional policy analysis.
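    The kernel comparison mentioned above can be sketched as a cross-validated search over candidate kernels. The candidate set, the time-series splitter, and the synthetic monthly index below are illustrative assumptions; the paper's actual kernels, basis functions, and Bayesian optimisation procedure are not specified in the abstract.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, Matern, RationalQuadratic
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

rng = np.random.default_rng(1)
t = np.arange(190, dtype=float).reshape(-1, 1)         # monthly time index
index = 100 + 0.5 * t.ravel() + rng.normal(0, 2, 190)  # synthetic price index

search = GridSearchCV(
    GaussianProcessRegressor(normalize_y=True),
    param_grid={"kernel": [RBF(), Matern(nu=1.5), RationalQuadratic()]},
    cv=TimeSeriesSplit(n_splits=5),  # folds respect temporal ordering
    scoring="neg_root_mean_squared_error",
)
search.fit(t, index)
print("selected kernel:", type(search.best_params_["kernel"]).__name__)
```

    A time-series splitter is used rather than random folds so that each validation block follows its training block in time.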

  • Article (No Access)

    AUTOMATIC KERNEL REGRESSION MODELLING USING COMBINED LEAVE-ONE-OUT TEST SCORE AND REGULARISED ORTHOGONAL LEAST SQUARES

    This paper introduces an automatic robust nonlinear identification algorithm using the leave-one-out test score, also known as the PRESS (Predicted REsidual Sums of Squares) statistic, and regularised orthogonal least squares. The proposed algorithm aims to maximise model robustness via two effective and complementary approaches: parameter regularisation via ridge regression, and selection of an optimal model structure for generalisation. The major contributions are to derive the PRESS error in a regularised orthogonal weight model, to develop an efficient recursive formula for computing PRESS errors within the regularised orthogonal least squares forward regression framework, and hence to construct a model with good generalisation properties. Based on the properties of the PRESS statistic, the proposed algorithm achieves a fully automated model construction procedure without resorting to a separate validation data set for model evaluation.
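    The PRESS statistic the paper builds on has a well-known closed form for linear-in-the-parameters models: each leave-one-out residual equals the ordinary residual divided by one minus the corresponding hat-matrix diagonal. The sketch below verifies this identity for plain ridge regression on synthetic data; the paper's recursive formulation inside orthogonal forward regression is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 4))
y = X @ np.array([1.0, -2.0, 0.5, 0.0]) + rng.normal(0, 0.1, 50)

lam = 0.1  # ridge regularisation parameter
H = X @ np.linalg.solve(X.T @ X + lam * np.eye(4), X.T)  # ridge hat matrix
resid = y - H @ y
loo_resid = resid / (1 - np.diag(H))  # closed-form leave-one-out residuals
press = np.sum(loo_resid ** 2)        # PRESS statistic

# Brute-force check: refit with each sample actually held out.
brute = 0.0
for i in range(50):
    mask = np.arange(50) != i
    w = np.linalg.solve(X[mask].T @ X[mask] + lam * np.eye(4),
                        X[mask].T @ y[mask])
    brute += (y[i] - X[i] @ w) ** 2

print("closed form matches brute force:", np.isclose(press, brute))
```

    This closed form is what makes leave-one-out scoring cheap enough to drive automatic model construction: PRESS comes from a single fit instead of n refits.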

  • Article (No Access)

    Check Amount Recognition Based on the Cross Validation of Courtesy and Legal Amount Fields

    Check amount recognition is one of the most promising commercial applications of handwriting recognition. This paper describes a check reading system developed to recognize amounts on American personal checks. Special attention is paid to a reliable procedure for rejecting doubtful answers; for this purpose, the legal (worded) amount on a personal check is recognized along with the courtesy (digit) amount. For both the courtesy and legal amount fields, a brief description of all recognition stages, beginning with field extraction and ending with recognition itself, is presented. We also explain the problems existing at each stage and their possible solutions. The numeral recognizer used to read the amounts written in figures is described; it is based on matching input subgraphs to graphs of symbol prototypes. The main principles of the handwriting recognizer used to read amounts written in words are explained; it is based on the idea of describing handwriting by its most stable elements. The concept of the optimal confidence level of a recognition answer is introduced, and it is shown that the conditional probability of answer correctness is an optimal confidence level function. Algorithms for estimating the optimal confidence level in some special cases are described. A sophisticated algorithm for cross validation between legal and courtesy amount recognition results, based on the optimal confidence level approach, is proposed. Experimental results on real checks are presented: the recognition rate at a 1% error rate is 67%, and the recognition rate without rejection is 85%. Significant improvement is achieved thanks to legal amount processing, in spite of a relatively low recognition rate for this field.

  • Article (No Access)

    Bankcheck Recognition using Cross Validation Between Legal and Courtesy Amounts

    A bankcheck reading system using cross validation of both the legal and the courtesy amounts is presented in this paper. Some of the challenges posed by the task are

    (i) segmentation of the legal amount into words,

    (ii) location of boundaries between dollars and cents amounts, and

    (iii) high accuracy in terms of recognition performance.

    Word segmentation in the legal amount is a serious issue because of the nature of the data and patrons' writing habits, which tend to clump words together. We have developed a word segmentation algorithm based on character segmentation results to address this issue. The list of possible amounts generated by the word segmentation hypotheses is used as the lexicon for courtesy amount recognition. The order of magnitude of the amount is estimated during legal amount recognition. We treat the courtesy amount as a numeral string and apply the same word recognition scheme as used for the legal amount.

    Our approach to check recognition differs from traditional methods in two significant respects. First, our emphasis on the legal and the courtesy amounts is balanced: we use an accurate word recognizer which performs equally well on alpha words and digit strings. Second, our combination strategy is serial rather than the commonly used parallel method. Experimental results show that 43.8% of check images are correctly read with an error rate of 0%.

  • Article (No Access)

    Performance-Estimation Properties of Cross-Validation-Based Protocols with Simultaneous Hyper-Parameter Optimization

    In a typical supervised data analysis task, one needs to perform two tasks: (a) select an optimal combination of learning methods (e.g., for variable selection and classification) and tune their hyper-parameters (e.g., K in K-NN), also called model selection; and (b) provide an estimate of the performance of the final, reported model. Combining the two tasks is not trivial: when one selects the set of hyper-parameters that seems to provide the best estimated performance, this estimate is optimistic (biased/overfitted) because of the multiple statistical comparisons performed. In this paper, we discuss the theoretical properties of performance estimation when model selection is present, and we confirm that simple Cross-Validation with model selection is indeed optimistic (overestimates performance) in small-sample scenarios and should be avoided. We present in detail, and investigate the theoretical properties of, Nested Cross-Validation and a method by Tibshirani and Tibshirani for removing the estimation bias. In computational experiments with real datasets, both protocols provide conservative estimates of performance and should be preferred. These statements hold true even if feature selection is performed as preprocessing.
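    The contrast the paper analyses can be sketched in a few lines: the naive protocol reports the cross-validation score that was itself used to pick the hyper-parameter, while nested cross-validation repeats the selection inside each outer fold. The dataset, classifier, and grid below are illustrative assumptions, not the paper's experimental setup.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=80, n_features=20, random_state=0)
grid = {"n_neighbors": [1, 3, 5, 7, 9]}  # K in K-NN, tuned by model selection

# Naive protocol: report the CV score that was itself used to choose K.
inner = GridSearchCV(KNeighborsClassifier(), grid,
                     cv=KFold(5, shuffle=True, random_state=1))
inner.fit(X, y)
naive = inner.best_score_  # tends to be optimistic in small samples

# Nested CV: hyper-parameter selection is repeated inside each outer fold.
nested = cross_val_score(
    GridSearchCV(KNeighborsClassifier(), grid,
                 cv=KFold(5, shuffle=True, random_state=1)),
    X, y, cv=KFold(5, shuffle=True, random_state=2),
)
print(f"naive CV estimate:  {naive:.3f}")
print(f"nested CV estimate: {nested.mean():.3f}")
```

    With small samples such as this one, the naive estimate typically exceeds the nested one, which is exactly the optimism the paper quantifies.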

  • Article (No Access)

    SHALLOW WATER BATHYMETRY FROM MULTISPECTRAL SATELLITE IMAGES: EXTENSIONS OF LYZENGA'S METHOD FOR IMPROVING ACCURACY

    A high-resolution or complete bathymetric map of shallow water based on sparse point measurements (depth soundings) is often needed. One possible approach to producing such maps is passive remote sensing of water depth using multispectral imagery, as in the popular method proposed by Lyzenga et al. [2006]; however, its application has been limited by insufficient accuracy. To improve accuracy, we have developed three extensions of Lyzenga's method that address unrealistic optical and statistical assumptions in the method. The purpose of this paper is to compare the accuracy of Lyzenga's method, the three extensions, and the combination of the three extensions. The accuracy comparison was performed for two coral reef sites using cross-validation.

    The results indicated that, for both sites, the extended methods were more accurate than Lyzenga's method when sufficient training data were available. The most accurate single extension was the one derived by modeling the spatial autocorrelation in the error term of the regression model used in Lyzenga's method, and the combination of the three extensions was more accurate still.

    Implementing the extended methods is not difficult in terms of software availability or computational cost.
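    As a rough sketch of the baseline model being extended, Lyzenga-style bathymetry regresses depth on log-transformed, deep-water-corrected band radiances. The synthetic radiances, attenuation coefficients, and deep-water values below are assumptions for illustration; the spatial-autocorrelation extension the paper finds most accurate is not shown.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
depth = rng.uniform(1, 10, n)            # sounding depths in metres
deep = np.array([0.05, 0.04, 0.03])      # assumed deep-water radiance per band
k = np.array([0.10, 0.15, 0.25])         # assumed attenuation coefficients
# Radiance decays roughly exponentially with depth above the deep-water signal.
radiance = deep + 0.5 * np.exp(-2 * k * depth[:, None]) \
    + rng.normal(0, 1e-4, (n, 3))

# Depth is modelled as linear in the logs of deep-water-corrected radiances.
X = np.column_stack([np.ones(n),
                     np.log(np.clip(radiance - deep, 1e-9, None))])
coef, *_ = np.linalg.lstsq(X, depth, rcond=None)
pred = X @ coef
rmse = np.sqrt(np.mean((pred - depth) ** 2))
print(f"in-sample RMSE: {rmse:.3f} m")
```

    Fitting the linear coefficients against the depth soundings and validating on held-out soundings is what the cross-validation test in the paper assesses.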

  • Article (Free Access)

    Gaussian Process Regression Based Silver Price Forecasts

    Market participants have long placed great importance on price estimates for the primary metal commodities. To tackle the problem, we investigate the daily reported price of silver. The sample analyzed spans 13 years, from April 20, 2011 to April 19, 2024, and the price series under investigation has important implications for the commercial world. In this setting, Gaussian process regression models are developed using cross-validation strategies and Bayesian optimization procedures, and price forecasts are produced with the resulting methods. For the out-of-sample evaluation period from October 5, 2021 to April 19, 2024, our empirical forecasting approach yields reasonably accurate price estimates. The relative root mean square error for the silver price is 0.2257%, with a corresponding root mean square error of 0.0515, mean absolute error of 0.0389, and correlation coefficient of 99.967%. Such price forecasting models supply investors and governments with the information they need to make educated judgments about the silver market. The framework of Gaussian process regression with Bayesian optimization demonstrates good potential for modeling and forecasting sophisticated commodity price series for market participants.
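    The four accuracy measures quoted above can be made concrete with a short computation on an arbitrary forecast/actual pair. The numbers below are illustrative, and the relative RMSE shown is one common definition (root mean squared relative error); the paper's exact formula is not stated in the abstract.

```python
import numpy as np

actual = np.array([23.1, 23.4, 23.0, 22.8, 23.6])    # illustrative prices
forecast = np.array([23.0, 23.5, 23.1, 22.7, 23.5])  # illustrative forecasts

err = forecast - actual
rmse = np.sqrt(np.mean(err ** 2))                    # root mean square error
mae = np.mean(np.abs(err))                           # mean absolute error
rrmse = np.sqrt(np.mean((err / actual) ** 2)) * 100  # relative RMSE, percent
corr = np.corrcoef(actual, forecast)[0, 1] * 100     # correlation, percent

print(f"RMSE={rmse:.4f}  MAE={mae:.4f}  RRMSE={rrmse:.4f}%  corr={corr:.3f}%")
```

    Reporting the relative error alongside the absolute ones, as the abstract does, makes results comparable across commodities quoted at very different price levels.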

  • Article (No Access)

    Machine Learning Coffee Price Predictions

    Most market players have long attached great significance to price projections for basic agricultural commodities. In this research, we examine the daily published price of coffee in order to tackle the issue. The analytical sample runs from 2 January 2013 to 10 April 2024, a period of more than 12 years, and the price series under investigation has a significant influence on the business sector. In this setting, Gaussian process regression models are developed using Bayesian optimization techniques and cross-validation processes, and price forecasting methodologies follow from them. Using our empirical forecasting technique, we produce relatively accurate price projections for the out-of-sample assessment period, which runs from 3 January 2022 to 10 April 2024; the price forecasts of coffee have a relative root mean square error of 2.0500%. With access to price forecasting models and the required data, investors and governments can make educated decisions about the coffee market.

  • Article (No Access)

    Machine learning Brent crude oil price forecasts

    Forecasts of energy commodity prices have long been significant to many market players. Our research examines the daily price of Brent crude oil in order to address the issue. The price series under investigation has significant financial ramifications, and the sample spans 10 years, from April 7, 2014 to March 28, 2024. In this case, cross-validation procedures and Bayesian optimization approaches are used to construct Gaussian process regression methods, and the resulting models are used to generate price estimates. For the out-of-sample evaluation period of March 24, 2022 to March 28, 2024, our empirical prediction technique yields relatively accurate price projections, as indicated by a relative root mean square error of 0.2814%. Price prediction models provide governments and investors with the knowledge they need to make informed decisions regarding the crude oil market.

  • Article (No Access)

    Scrap steel price predictions for southwest China via machine learning

    Price forecasts for a wide range of commodities have long been a source of confidence for governments and investors. This study examines the difficult task of forecasting scrap steel prices, which are released daily for the southwest China market, leveraging time-series data spanning August 23, 2013 to April 15, 2021. Previous studies have not fully considered forecasts for this important commodity price. In this case, cross-validation procedures and Bayesian optimization techniques are used to develop Gaussian process regression strategies, from which price projections are built. Arriving at a relative root mean square error of 0.4691%, this empirical prediction approach yields fairly precise price projections throughout the out-of-sample stage spanning September 17, 2019 to April 15, 2021. Through the use of such price research models, governments and investors may make well-informed judgments about regional scrap steel markets.

  • Article (Open Access)

    Machine Learning-Based Scrap Steel Price Forecasting for the Northeast Chinese Market

    Throughout history, governments and investors have relied on price predictions for a broad spectrum of commodities. Using time-series data covering 08/23/2013–04/15/2021, this study investigates the challenging problem of predicting scrap steel prices, which are issued daily for the northeast China market. Previous research has not sufficiently taken into account forecasts for this significant commodity price. In this instance, Gaussian process regression methods are created using Bayesian optimisation approaches and cross-validation processes, and the resulting models are used to construct price forecasts. This empirical prediction methodology provides reasonably accurate price estimates for the out-of-sample period from 09/17/2019 to 04/15/2021, with a root mean square error of 9.6951, mean absolute error of 5.4218, and correlation coefficient of 99.9122%. Governments and investors can arrive at informed decisions regarding regional scrap steel markets by using such pricing research models.

  • Article (Free Access)

    Forecasts of China Mainland New Energy Index Prices through Gaussian Process Regressions

    Energy index price forecasting has long been a crucial undertaking for investors and regulators. This study examines the daily price prediction problem for the new energy index on the Chinese mainland market from January 4, 2016 to December 31, 2020, as insufficient attention has been paid in the literature to price forecasting for this crucial financial metric. Gaussian process regressions facilitate our analysis, and the training procedures of the models make use of cross-validation and Bayesian optimizations. From January 2, 2020 to December 31, 2020, the price was accurately projected by the created models, with an out-of-sample relative root mean square error of 1.8837%. The developed models may be utilized in investors' and policymakers' policy analysis and decision-making processes. Because the forecasting results provide reference information about the price patterns indicated by the models, they may also be useful in the construction of similar energy indices.

  • Chapter (No Access)

    HAS STATISTICS A FUTURE? IF SO, IN WHAT FORM?

    The mathematical foundations of statistics as a separate discipline were laid by Fisher, Neyman and Wald during the second quarter of the last century. Subsequent research in statistics and the courses taught in universities are mostly based on the guidelines set by these pioneers. Statistics is used in some form or other in all areas of human endeavor, from scientific research to the optimal use of resources for social welfare, prediction and decision-making. However, there are controversies in statistics, especially in the choice of a model for data, the use of prior probabilities, and subject-matter judgments by experts. The same data analyzed by different consulting statisticians may lead to different conclusions.

    What is the future of statistics in the present millennium, dominated by information technology encompassing the whole of communications, interaction with intelligent systems, massive databases, and complex information processing networks? The current statistical methodology, based on simple probabilistic models developed for the analysis of small data sets, appears inadequate to meet customers' needs for quick online processing of data and for making the information available for practical use. Some methods are being put forward in the name of data mining for such purposes. A broad review of the current state of the art in statistics, its merits and demerits, and possible future developments will be presented.

  • Chapter (No Access)

    CHECK AMOUNT RECOGNITION BASED ON THE CROSS VALIDATION OF COURTESY AND LEGAL AMOUNT FIELDS

    Check amount recognition is one of the most promising commercial applications of handwriting recognition. This paper describes a check reading system developed to recognize amounts on American personal checks. Special attention is paid to a reliable procedure for rejecting doubtful answers; for this purpose, the legal (worded) amount on a personal check is recognized along with the courtesy (digit) amount. For both the courtesy and legal amount fields, a brief description of all recognition stages, beginning with field extraction and ending with recognition itself, is presented. We also explain the problems existing at each stage and their possible solutions. The numeral recognizer used to read the amounts written in figures is described; it is based on matching input subgraphs to graphs of symbol prototypes. The main principles of the handwriting recognizer used to read amounts written in words are explained; it is based on the idea of describing handwriting by its most stable elements. The concept of the optimal confidence level of a recognition answer is introduced, and it is shown that the conditional probability of answer correctness is an optimal confidence level function. Algorithms for estimating the optimal confidence level in some special cases are described. A sophisticated algorithm for cross validation between legal and courtesy amount recognition results, based on the optimal confidence level approach, is proposed. Experimental results on real checks are presented: the recognition rate at a 1% error rate is 67%, and the recognition rate without rejection is 85%. Significant improvement is achieved thanks to legal amount processing, in spite of a relatively low recognition rate for this field.

  • Chapter (No Access)

    BANKCHECK RECOGNITION USING CROSS VALIDATION BETWEEN LEGAL AND COURTESY AMOUNTS

    A bankcheck reading system using cross validation of both the legal and the courtesy amounts is presented in this paper. Some of the challenges posed by the task are

    (i) segmentation of the legal amount into words,

    (ii) location of boundaries between dollars and cents amounts, and

    (iii) high accuracy in terms of recognition performance.

    Word segmentation in the legal amount is a serious issue because of the nature of the data and patrons' writing habits, which tend to clump words together. We have developed a word segmentation algorithm based on character segmentation results to address this issue. The list of possible amounts generated by the word segmentation hypotheses is used as the lexicon for courtesy amount recognition. The order of magnitude of the amount is estimated during legal amount recognition. We treat the courtesy amount as a numeral string and apply the same word recognition scheme as used for the legal amount.

    Our approach to check recognition differs from traditional methods in two significant respects. First, our emphasis on the legal and the courtesy amounts is balanced: we use an accurate word recognizer which performs equally well on alpha words and digit strings. Second, our combination strategy is serial rather than the commonly used parallel method. Experimental results show that 43.8% of check images are correctly read with an error rate of 0%.