Research Article (Open Access)

Development of calibration models for rapid determination of moisture content in rubber sheets using portable near-infrared spectrometers

Department of Food Engineering, Faculty of Engineering at Kamphaeng Saen, Kasetsart University, Nakorn Pathom, 73140, Thailand



Amornrit Puttipipatkajorn

http://orcid.org/0000-0003-3390-4113

Department of Computer Engineering, Faculty of Engineering at Kamphaeng Saen, Kasetsart University, Nakorn Pathom, 73140, Thailand

E-mail Address: fengarp@ku.ac.th

Corresponding author.


https://doi.org/10.1142/S1793545820500091

Abstract

Rubber sheets are one of the primary products of natural rubber and are the main raw material in various rubber industries. The quality of a rubber sheet can be visually examined by holding it against clear light to inspect for any specks and impurities inside, but its moisture content is difficult to evaluate by visual inspection, which might lead to unfair trading. Herein, we developed a rapid, robust and nondestructive near-infrared spectroscopy (NIRS)-based method for moisture content determination in rubber sheets. A set of 300 rubber sheets was divided into a calibration group (200 samples) and a prediction group (100 samples). The calibration set was used to develop NIRS calibration equations using three calibration models: Partial Least Squares Regression (PLSR), Least Square Support Vector Machine (LS-SVM) and Artificial Neural Network (ANN). Among the models investigated, the ANN model with first-derivative spectral preprocessing presented the best prediction, with a coefficient of determination ($R_{P}^{2}$) of 0.993, root mean square error of calibration (RMSEC) of 0.126% and root mean square error of prediction (RMSEP) of 0.179%. The results indicated that the proposed NIRS-ANN model will be able to reduce human error and provide a highly accurate estimate of the moisture content in a rubber sheet compared to traditional wet chemistry estimation methods according to AOAC standards.


1. Introduction

Natural rubber is an important economic crop in Southeast Asia. Natural rubber is used extensively by many manufacturing companies in the rubber industry. Applications include tires, tank liners and automobile parts. The natural rubber is often collected from smallholders in various forms and separated into specific grades based on a visual inspection. One form of early processing is called a rubber sheet, which can be of many types including ribbed smoked sheet (RSS), unsmoked sheet (USS) and air-dried sheet (ADS) depending on the drying methodology used. Among them, the USS has been more popular with growers in Thailand. Practically, the entire volume of this grade of rubber is produced by small-scale and medium-scale rubber growers, scattered throughout the rubber growing districts in the country. The popularity of this grade is mainly due to the simplicity and low cost of the processing machinery and the easily adoptable processing technology in the manufacturing process for any amount of latex. The rubber sheets are generally graded according to their color, moisture content, and consistency as well as observed impurities.

Near-infrared spectroscopy (NIRS) is a type of high-energy vibrational spectroscopy performed in the wavelength range 750–2500nm (13,333 to 4000cm$^{-1}$), which is the region between visible light and classical mid-infrared.¹ NIRS is a fast and nondestructive analytical method. It has proven its effectiveness for both qualitative and quantitative analyses in several fields. NIRS was first used in agricultural applications to measure moisture in grain.² Since then, it has been used for rapid analysis of moisture, protein and fat contents of a wide variety of agricultural and food products.³ Recent studies involving the NIR region have shown that NIRS is a suitable method for quantifying trace amounts of moisture in a rubber sheet, due to the strong combination absorption band for water at around 1940nm and the first, second and third overtones at 1450, 970 and 760nm, respectively.⁴ Another paper investigated the prediction of the dry rubber content in concentrated latex by using a portable NIRS (Avantes, the Netherlands) in the wavelength range 370–1085nm.⁵ The results showed that NIRS predicted accurately, with a high coefficient of determination ($R^{2} = 0.9741$) and a root mean square error of calibration (RMSEC) of 1.09%. In 2015, the NIR System 6500 was proposed for determining the moisture content of natural rubber in the form of cup lump rubber using the wavelength range 400–1100nm.⁶ Partial Least Squares Regression (PLSR) was selected to develop the calibration model, and the resulting model had good statistical results ($R^{2} = 0.98$, RMSEC = 1.68% and RMSEP = 1.48%). However, although this previous research showed that the NIRS technique was suitable for determining the dry rubber and moisture contents of natural rubber, the technique has not been implemented for field use.
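The correspondence between the wavelength and wavenumber limits quoted above follows from the unit conversion 1 cm = 10⁷ nm. A minimal sketch of that arithmetic is given here; the function name `nm_to_wavenumber` is only illustrative.

```python
def nm_to_wavenumber(wavelength_nm):
    """Convert a wavelength in nanometres to a wavenumber in cm^-1."""
    # 1 cm = 1e7 nm, so wavenumber (cm^-1) = 1e7 / wavelength (nm)
    return 1e7 / wavelength_nm


# Limits of the NIR region quoted in the text
for nm in (750, 2500):
    print(nm, "nm ->", round(nm_to_wavenumber(nm)), "cm^-1")
```

The two printed values reproduce the 13,333 and 4000cm$^{-1}$ limits stated for the 750–2500nm range.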

Interest in NIRS has increased due to the emergence of new mathematical approaches such as artificial neural network (ANN) models as well as the development of fiber optics that allows delocalization of the measurement. In recent years, the use of ANN models has spread to various applications in different fields. In particular, in the field of NIRS, ANN models have been shown to perform well on nonlinear problems, resulting in clear improvements in the models developed in a large number of applications.⁷,⁸ Therefore, the objective of this paper was to evaluate the potential of three different types of calibration models (Partial Least Squares Regression (PLSR), Least Square Support Vector Machine (LS-SVM), and ANN) using NIR spectroscopy in the wavelength range 900–1700nm for the prediction of the moisture content in a rubber sheet.

2. Materials and Methods

2.1. Rubber sheet process

The rubber sheets were sampled from a rubber plantation in eastern Thailand. They were prepared from latex by diluting the latex with clean and pure water prior to adding acid to promote coagulation. This dilution helped in achieving quality consistency in the final product. Latex coagulation was promoted by adding 1% diluted formic acid to the already diluted latex as developed by the Rubber Research Institute of Thailand. The addition of acid in diluted form assisted in achieving uniform acid distribution in the latex and thereby ensured complete coagulation and a soft coagulum to produce sheets free of air bubbles and stickiness. Water was squeezed out of the rubber coagulum using a series of rollers. This process was continued by pressing down on the marking roller to thin down the coagulum to form a sheet with even thickness (3mm). The rubber sheets were then hung up to allow the remaining water to drip off for about 1–2 days and to remove the surface moisture. Drying was completed after the rubber sheets had been continually dried in a chamber for about 10–15 days.

2.2. Portable NIRS design

The portable NIRS is designed for use in reflectance mode in the range 900–1700nm with 3.5nm resolution, as shown in Fig. 1. Its dimensions are $6.5 \times 17.0 \times 4.5$ cm. Inside, it contains the DLP NIR Scan Nano EVM (evaluation module) from Texas Instruments, Inc., USA and a light source for generating the near-infrared light. The portable NIRS is powered by a battery (Li-Polymer 3.7V 1800mAh) and is connected to a computer via a USB port or to a smartphone using Bluetooth Low Energy (BLE).

In this paper, the spectral data of rubber sheet were collected using the DLP NIR Scan_Nano_GUI software. Each measurement point was based on the average of 10 individual readings. The spectral data were then transferred to the MATLAB software package for spectral preprocessing and multivariate analysis.

2.3. Spectrum acquisition and correction

In the experiment, a rubber sheet was placed on an aluminum plate. The rubber sheet spectrum was acquired using the portable NIRS in reflectance mode. Due to the nonsmooth surface of the rubber sheet, the reflected light traveled different distances in different directions from the sample surface to the spectrometer detector, resulting in different spectral results from different locations. There are other factors affecting the spectral characteristics that may not be easily defined and these can cause a spectral shift in terms of linear and nonlinear translation and thus influence the performance of the calibration model. Consequently, spectral pretreatment is a necessary part of the spectral analysis and can improve the accuracy of the analysis results. For this reason, the spectral transformations were applied to the rubber sheet spectra to reduce the problems associated with noise, light-scattering and external effects prior to implementing the regression analysis. Presently, there are several chemometric pretreatment methods available such as moving average smoothing (MAS), multiplicative scatter correction (MSC), standard normal variate transformation (SNV) and first and second derivative transformation⁹ that could be applied to reduce the noise and normalize the spectra.¹⁰ In this paper, SNV and the first and second derivatives were selected to compare their effects on the proposed calibration models.
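To illustrate why derivative pretreatment helps against baseline effects, the following minimal sketch computes first and second derivatives of a spectrum by plain finite differences. This is an assumption made for brevity: the paper does not state how its derivatives were computed, and the function names, the `step_nm` parameter and the absorbance values are all hypothetical.

```python
def first_derivative(spectrum, step_nm=1.0):
    """Central-difference first derivative (two end points are dropped)."""
    return [(spectrum[i + 1] - spectrum[i - 1]) / (2.0 * step_nm)
            for i in range(1, len(spectrum) - 1)]


def second_derivative(spectrum, step_nm=1.0):
    """Central-difference second derivative (two end points are dropped)."""
    return [(spectrum[i + 1] - 2.0 * spectrum[i] + spectrum[i - 1]) / step_nm ** 2
            for i in range(1, len(spectrum) - 1)]


# Made-up absorbance values, and the same spectrum with a constant baseline offset
raw = [0.10, 0.12, 0.18, 0.30, 0.18, 0.12, 0.10]
shifted = [a + 0.5 for a in raw]

# The derivative of both spectra is the same up to rounding:
# an additive baseline offset is removed by differentiation.
print(first_derivative(raw))
print(first_derivative(shifted))
```

A smoothing derivative filter such as the Savitzky–Golay procedure⁹ would normally be preferred on noisy spectra, since simple differences amplify noise.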

2.4. Calibration models

2.4.1. Partial Least Squares Regression (PLSR)

The PLSR method is the most commonly used regression algorithm in the field of chemometric spectroscopy.¹¹ PLSR searches for a set of components or latent variables (LV) that performs a simultaneous decomposition of the spectral data ($X$) and the reference values ($Y$) as a product of a common set of orthogonal factors and a set of specific loadings, with the constraint that these components explain as much as possible of the covariance between $X$ and $Y$. Using too many LV in the model causes over-fitting, resulting in poor model performance. Therefore, the optimal number of latent variables is selected to avoid over-fitting while maximizing the covariance between the $X$ and $Y$ spaces.¹² In this paper, the optimal number of latent variables was determined using a cross-validation technique.¹³ The number of latent variables that resulted in the minimum mean square error (MSE) was considered optimal for the model. PLSR is a good alternative to classical multiple linear regression and principal component regression methods because it is more robust, in that the model parameters do not change very much when new calibration samples are taken from the total population. In this paper, the PLSR algorithm in MATLAB R2017a (The MathWorks, Natick, MA, USA) was used to build the calibration models for moisture content prediction in rubber sheets.
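The selection rule described above can be sketched as follows, assuming the cross-validated MSE has already been computed for each candidate number of latent variables. The function name, the optional `tolerance` argument (an illustrative extension that favours fewer LV when the MSE is nearly flat) and the MSE values are hypothetical and are not taken from the paper.

```python
def select_latent_variables(cv_mse, tolerance=0.0):
    """Pick a number of latent variables from cross-validated MSE values.

    cv_mse[k] is the cross-validated MSE of the model with k + 1 LV.
    With tolerance=0.0 this is the minimum-MSE rule; with tolerance > 0 it
    returns the smallest LV count whose MSE is within that relative margin
    of the minimum.
    """
    limit = min(cv_mse) * (1.0 + tolerance)
    for n_lv, mse in enumerate(cv_mse, start=1):
        if mse <= limit:
            return n_lv


# Hypothetical cross-validated MSE for models with 1 to 7 latent variables
cv_mse = [0.90, 0.40, 0.20, 0.110, 0.102, 0.100, 0.104]

print(select_latent_variables(cv_mse))                  # minimum-MSE rule
print(select_latent_variables(cv_mse, tolerance=0.05))  # within 5% of the minimum
```

With these made-up values the minimum-MSE rule selects 6 LV, while the 5% margin selects 5 LV.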

2.4.2. Least Square Support Vector Machine (LS-SVM)

The support vector machine (SVM) was initially developed for classification problems and was then expanded to treat regression problems. An SVM is a supervised classification and regression method capable of dealing well with both linear and nonlinear data.¹⁴ SVM has important roles in classification tasks in pattern recognition and machine learning¹⁴ and has proven to be a reliable and efficient method in NIR spectroscopy.¹⁵,¹⁶,¹⁷ LS-SVM is an alternate formulation of SVM that simplifies the training process and significantly reduces the computation time while maintaining similar performance to SVM.¹⁸ The implementation of SVM requires three tuning parameters ($\gamma$, $\sigma$, $\epsilon$), while LS-SVM requires only two parameters ($\gamma$, $\sigma$).¹⁹ In this paper, LS-SVM was implemented to establish a calibration model to correlate the mean spectra and the moisture content of the rubber sheets. In this model, the radial basis function (RBF) shown in Eq. (1) was selected as the kernel function. A grid-search technique using the LS-SVMlab toolbox was used to tune the regularization parameter ($\gamma$), which determines the trade-off between the training error minimization and smoothness,²⁰ and the RBF kernel function parameter ($\sigma^{2}$), which is the squared bandwidth of the Gaussian curve.²⁰

$$K(x_{i}, x_{j}) = e^{-\frac{\|x_{i} - x_{j}\|^{2}}{\sigma^{2}}}. \tag{1}$$
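As a minimal sketch of Eq. (1), the kernel can be evaluated for two spectra and assembled into the kernel matrix on which LS-SVM training operates. The function names and the example values are hypothetical; only the formula itself is taken from the text.

```python
import math


def rbf_kernel(x_i, x_j, sigma2):
    """RBF kernel of Eq. (1): exp(-||x_i - x_j||^2 / sigma^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-sq_dist / sigma2)


def kernel_matrix(spectra, sigma2):
    """Symmetric matrix of kernel values between all pairs of spectra."""
    return [[rbf_kernel(a, b, sigma2) for b in spectra] for a in spectra]


# Three made-up two-point "spectra"
spectra = [[0.0, 0.0], [1.0, 1.0], [0.1, 0.0]]
for row in kernel_matrix(spectra, sigma2=2.0):
    print([round(v, 4) for v in row])
```

Identical spectra give a kernel value of 1, and the value decays toward 0 as the spectra move apart, at a rate set by $\sigma^{2}$.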

2.4.3. Artificial neural network (ANN)

An ANN is a set of algorithms loosely modeled on biological neurons, in which the connection weights play a role similar to synaptic strength.²¹ It consists of interconnected neurons arranged in input, hidden and output layers. A node in each layer combines input from the data with a set of weights. An ANN has high processing speed, robustness, and generalization capabilities and is able to deal with large dimensional data spaces.²² In particular, a feed-forward back-propagation network is capable of distinguishing interesting features from voluminous and noisy datasets having distorted patterns.²³ This paper proposed a multilayer feed-forward neural network with one input layer, one hidden layer, and one output layer, as shown in Fig. 2. The number of inputs was fixed and equal to the number of spectral wavelengths. Each neuron of one layer was connected with every neuron of the previous layer. These connections were feed-forward only; no backward connection was allowed. At the neuron level, a bias was added to the weighted sum of the inputs and the Tan-sigmoid transfer function was applied. The output of a single neuron was calculated using Eq. (2):

$$y_{i}^{j} = f_{i}^{j}(x) = f\left(\sum_{k = 1}^{m} w_{k, i}^{j}\, y_{k}^{j - 1} + b_{i}^{j}\right), \tag{2}$$

Fig. 2. A neural network with feed-forward architecture and one hidden layer.

where $m$ is the number of inputs, $i$ is the index of the current neuron in layer $j$, $w_{k, i}^{j}$ is the synaptic weight factor for the connection of neuron $i$ in layer $j$ with neuron $k$ in layer $j - 1$, and $b_{i}^{j}$ is the bias of neuron $i$ in layer $j$. The training occurs in a supervised style. The basic idea is to present the input vector to the network and to calculate in the forward direction the output of each layer and the final output of the network. In the output layer, the desired values are known and therefore the weights can be adjusted using the Levenberg–Marquardt (LM) algorithm according to the gradient descent rule. The LM algorithm is one of the most powerful and rapid methods used in the training of feed-forward multilayer networks.²⁴ It is a second-order optimization method based on the Jacobian matrices of the partial derivatives of the cost function, and it descends through the error surface by also using information on the rate at which the slope changes. The LM algorithm is an effective modification of the Gauss–Newton method, which combines the excellent local convergence properties of that method with the consistent error decrease provided by the gradient descent method.²⁵
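The forward calculation of Eq. (2) can be sketched for a network with one hidden layer as shown below. This is only an illustration under stated assumptions: the weights are made-up numbers rather than trained values, the output neuron is assumed to be linear (the text does not specify the output transfer function), and `layer_output` and `forward` are hypothetical names. The Tan-sigmoid transfer function is mathematically the hyperbolic tangent.

```python
import math


def layer_output(inputs, weights, biases, transfer=math.tanh):
    """Eq. (2) for one layer: transfer(weighted sum of inputs + bias) per neuron.

    weights[i][k] is the weight from input k to neuron i of this layer.
    """
    return [transfer(sum(w * y for w, y in zip(w_row, inputs)) + b)
            for w_row, b in zip(weights, biases)]


def forward(spectrum, hidden_w, hidden_b, out_w, out_b):
    """One hidden layer (tanh) followed by a single output neuron."""
    hidden = layer_output(spectrum, hidden_w, hidden_b)
    # Assumption: linear output neuron, a common choice for regression
    return layer_output(hidden, out_w, out_b, transfer=lambda z: z)[0]


# Made-up example: 3 spectral inputs, 2 hidden neurons, 1 output
spectrum = [0.2, -0.1, 0.4]
hidden_w = [[0.5, -0.3, 0.8], [-0.6, 0.1, 0.2]]
hidden_b = [0.1, -0.2]
out_w = [[1.5, -0.7]]
out_b = [2.0]
print(forward(spectrum, hidden_w, hidden_b, out_w, out_b))
```

Training with the LM algorithm would adjust `hidden_w`, `hidden_b`, `out_w` and `out_b`; that step is not shown here.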

2.5. Model evaluation

The predictive performances of PLSR, LS-SVM, and ANN were compared using various parameters. The coefficient of determination defined in Eq. (3) gives information about the goodness of fit of a model in calibration ($R_{C}^{2}$) and prediction ($R_{P}^{2}$); its denominator is the sum of squared deviations of the reference values from their mean. The root mean square errors (RMSE) of calibration (RMSEC) and prediction (RMSEP), as shown in Eq. (4), represent the root mean square deviation between the reference values $y_{i}$ and the values $\hat{y}_{i}$ predicted by the model over the $n$ samples of the set.²⁶ The residual predictive deviations of calibration (RPD$_{C}$) and prediction (RPD$_{P}$) were calculated by dividing the standard deviation (SD) of the reference values by RMSEC or RMSEP, respectively,²⁷ and are described by Eq. (5). In general, a good model should yield high values of RPD$_{C}$, RPD$_{P}$, $R_{C}^{2}$, and $R_{P}^{2}$, while also producing low values of RMSEC and RMSEP, as well as only a small difference between them.

$$R^{2} = 1 - \frac{\sum_{i = 1}^{n} (y_{i} - \hat{y}_{i})^{2}}{\sum_{i = 1}^{n} (y_{i} - \bar{y})^{2}}, \tag{3}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i = 1}^{n} (y_{i} - \hat{y}_{i})^{2}}, \tag{4}$$

$$\mathrm{RPD} = \frac{\mathrm{SD}}{\mathrm{RMSE}}. \tag{5}$$
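Equations (3) to (5) can be written out as a short sketch. The reference and predicted values below are made up for illustration. The SD in the RPD is taken here as the sample standard deviation (n − 1 denominator), which is an assumption because the text does not state which form was used.

```python
import math
import statistics


def r_squared(y_ref, y_pred):
    """Coefficient of determination, Eq. (3)."""
    mean_ref = statistics.fmean(y_ref)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_ref, y_pred))
    ss_tot = sum((y - mean_ref) ** 2 for y in y_ref)
    return 1.0 - ss_res / ss_tot


def rmse(y_ref, y_pred):
    """Root mean square error, Eq. (4)."""
    return math.sqrt(sum((y - p) ** 2 for y, p in zip(y_ref, y_pred)) / len(y_ref))


def rpd(y_ref, y_pred):
    """Residual predictive deviation, Eq. (5); sample SD assumed."""
    return statistics.stdev(y_ref) / rmse(y_ref, y_pred)


# Made-up reference and predicted moisture contents (%db)
y_ref = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
print(r_squared(y_ref, y_pred), rmse(y_ref, y_pred), rpd(y_ref, y_pred))
```

Applying the same functions to the calibration set gives $R_{C}^{2}$, RMSEC and RPD$_{C}$, and applying them to the prediction set gives $R_{P}^{2}$, RMSEP and RPD$_{P}$.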

3. Results and Discussion

3.1. Sample preparation

In total, 300 rubber sheets were divided into two groups of 200 samples and 100 samples. The first group (calibration set) was used for developing the calibration model and the second group (prediction set) was used for testing the model. The spectrum data of the samples were collected using a portable NIRS. Then, the samples were weighed for initial mass and the moisture content was calculated using the gravimetric method and hot-air oven drying defined by AOAC standards.²⁸ Table 1 shows the rubber sheet moisture content statistics such as the range and mean for both the calibration and prediction sets. These values indicated that the sample range covered the moisture content of rubber sheets available in the market and was almost the same for calibration and prediction. This indicated good conditions for model testing because the moisture content of the prediction set was within the range of the calibration set.

**Table 1. Reference values for moisture content in rubber sheets analyzed using standard laboratory method.**
Sample set	No. of samples	Moisture content (%db)	Average $\pm$ Standard deviation
Calibration set	200	0.24–4.87	$2.06 \pm 1.48$
Prediction set	100	0.24–4.81	$2.00 \pm 1.50$

Note: db = dry basis.
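For reference, moisture content on a dry basis is the mass of water removed by drying divided by the dry mass of the sample. A minimal sketch of that calculation follows; the function name and the masses are hypothetical and are not measurements from this study.

```python
def moisture_content_db(wet_mass_g, dry_mass_g):
    """Moisture content on a dry basis (%db) from masses before and after drying."""
    water_mass_g = wet_mass_g - dry_mass_g
    return water_mass_g / dry_mass_g * 100.0


# Hypothetical sheet: 102.06 g before oven drying, 100.00 g after
print(round(moisture_content_db(102.06, 100.0), 2))
```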

During the forming of the rubber sheet, a pattern was printed on its surface using an engraved roller, which roughened the surface. Thus, the spectra acquired from different locations (Fig. 3(a)) on the rubber sheet surface were not exactly the same due to the light scattering associated with each location. For this reason, if only one spectrum were acquired from the sample, the calibration model might perform poorly. Therefore, four spectra were acquired at different locations on the rubber sheet surface. These spectra were processed using SNV (Fig. 3(b)) to reduce the light scattering effect at different locations. Then, the processed spectra were averaged to obtain a mean spectrum of the sample, and all samples (both calibration and prediction sets) were treated in the same way. Finally, the mean spectra of all samples were preprocessed using different techniques (SNV, the first derivative or the second derivative) before being passed to the model algorithm to compare their effects on model performance.

Fig. 3. SNV spectral processing based on different locations.
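The per-location SNV step followed by averaging can be sketched as below. SNV centres each spectrum on its own mean and scales it by its own standard deviation; the sample standard deviation is assumed here. The function names and the four location spectra are hypothetical: the spectra are one made-up shape with different gains and offsets to mimic scattering differences.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(a - mean) / sd for a in spectrum]


def mean_spectrum(spectra):
    """Average several spectra wavelength by wavelength."""
    return [statistics.fmean(column) for column in zip(*spectra)]


def sample_spectrum(location_spectra):
    """SNV each location spectrum, then average them into one spectrum per sample."""
    return mean_spectrum([snv(s) for s in location_spectra])


# One made-up spectral shape measured at four locations with different
# multiplicative gain and additive offset
base = [0.2, 0.4, 0.9, 0.5]
locations = [[a * gain + offset for a in base]
             for gain, offset in ((1.0, 0.0), (1.2, 0.1), (0.8, -0.05), (1.1, 0.2))]

print(sample_spectrum(locations))
```

In this idealized case the four SNV-processed spectra coincide, so their mean equals the SNV of the underlying shape; real location spectra would differ slightly, which is why they are averaged.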

3.2. Spectra characteristics of rubber sheets

The mean absorbance spectra of rubber sheets in the region 900–1700nm with different moisture contents (4.72%db, 2.43%db, 1.70%db, and 0.53%db) are shown in Fig. 4. In the region 925–1200nm in Fig. 4, the absorbance values of the spectra differ clearly, and the highest spectrum line corresponds to a high rubber content in the sample. The functional groups of the rubber polymer (CH, CH₃) in the sample strongly vibrated in this region,⁶ especially at 925, 1140, 1200, 1351, and 1659nm as shown in Fig. 5, where peaks were visible in the spectra preprocessed using the first derivative. The CH functional groups vibrated in the second and third overtone regions, around 1200 and 925nm,⁵ respectively, while the CH₃ functional groups vibrated in the second overtone region around 1140nm and 1351nm, and in the first overtone region at around 1659nm.²⁹ In the region 1400–1600nm in Fig. 4, the lowest line (0.53%db) has the lowest absorbance value, corresponding to the lowest moisture content, whereas the higher lines have higher absorbance associated with the higher moisture content of the samples. The spectra were clearly different in this region due to the water (OH functional groups) in the second overtone region, especially around 1410nm²⁹ (Fig. 5), where a peak was visible. When the spectra were preprocessed using the second derivative as shown in Fig. 6, peaks were visible near the peaks found in the first derivative spectrum. These peaks (933, 1125, 1185, 1211, 1333, 1365, 1444, and 1649nm) were related to the CH, CH₃, CH, CH, CH₃, CH₃, OH, and CH₃ functional groups,³⁰ respectively. The difference in the spectral characteristics indicated that NIR spectroscopy has the potential to discriminate among rubber sheets with different moisture contents.

Fig. 4. Mean spectra of the rubber sheets with different moisture contents.

Fig. 5. Mean spectra of rubber sheets with different moisture contents preprocessed by the first derivative.

Fig. 6. Mean spectra of rubber sheets with different moisture contents preprocessed by the second derivative.

3.3. Calibration model analysis

PLSR was used to build regression models of the moisture content in rubber sheets using the mean absorbance spectra in the calibration set. The optimal number of LV was determined for each spectral preprocessing option used (none, SNV, the first derivative, and the second derivative) based on cross-validation of each model. Table 2 shows the optimal LV as well as the performance of the PLSR models. Based on prediction, the best PLSR model for predicting the moisture content in rubber sheets was the one preprocessed using the first derivative, with $R_{P}^{2}$, RMSEP and RPD$_{P}$ values of 0.977, 0.227%db and 6.576, respectively. In general, for regression models, values of $R^{2}$ in the range 0.82–0.90 usually indicate good performance, while values higher than 0.90 indicate excellent performance.¹⁴ Similarly for RPD, a value greater than 2.0 indicates a good quantitative model, while a value greater than 3.0 indicates that the model is excellent.³¹ It should be noted that the optimal numbers of LV of the PLSR models were rather high because they were chosen as the values giving the lowest MSE in the cross-validation process. Alternatively, the number of latent variables could be reduced to 12, 13, 11, and 12 for no pretreatment, SNV, first derivative, and second derivative pretreatment, respectively, with only a slight deterioration in performance, because the MSE values differed very little from those obtained with the optimal number of latent variables. The established PLSR model showed excellent performance, with high values of $R_{P}^{2}$ and RPD$_{P}$ for predicting the moisture content in rubber sheets, indicating that the model could be used in the field.

**Table 2. Statistical comparison of different NIR calibration models.**
			Calibration set			Prediction set
Calibration model	Spectra pretreatment	LV	$R_{C}^{2}$	RMSEC	RPD_C	$R_{P}^{2}$	RMSEP	RPD_P
PLSR	None	20	0.986	0.176	8.373	0.977	0.229	6.555
	SNV	20	0.986	0.175	8.405	0.976	0.230	6.508
	1st derivative	16	0.986	0.177	8.312	0.977	0.227	6.576
	2nd derivative	14	0.981	0.206	7.131	0.969	0.263	5.559
LS-SVM	None	—	0.992	0.134	10.945	0.983	0.192	7.597
	SNV	—	0.991	0.141	10.453	0.983	0.193	7.588
	1st derivative	—	0.991	0.142	10.321	0.983	0.193	7.451
	2nd derivative	—	0.993	0.120	12.214	0.980	0.212	6.709
ANN	None	—	0.996	0.132	11.008	0.992	0.221	6.575
	SNV	—	0.995	0.165	8.647	0.988	0.223	6.398
	1st derivative	—	0.997	0.126	11.757	0.993	0.179	8.276
	2nd derivative	—	0.997	0.114	12.862	0.989	0.220	6.665

The regression coefficients of the best PLSR model (using first derivative spectral preprocessing) in Fig. 7 show the impact of spectral wavebands on the predicted moisture content of a rubber sheet. The wavebands at 1125 and 1389nm had a high positive correlation, while those at 1156nm and 1430nm had a high negative correlation with the model prediction of moisture content. The wavebands around 1125, 1156, and 1389nm were related to the CH₃ functional groups²⁹ of the rubber polymer (poly-isoprene), which is the primary chemical constituent of natural rubber, and the waveband at 1430nm was related to the OH functional groups of water³² in the rubber sheet. These significant wavebands of the model were closely related to the moisture content and dry rubber content of the rubber sheet. Based on these observations, NIR spectroscopy using the PLSR model based on the spectral range 900–1700nm was a feasible means of rapidly and nondestructively determining the moisture content in a rubber sheet.

Regression analysis using the LS-SVM algorithm was also used to build a calibration model based on the mean spectra of samples in the calibration set for predicting the moisture content in a rubber sheet. In this procedure, the parameters of the model (the regularization parameter ($\gamma$) and the squared bandwidth of the Gaussian curve ($\sigma^{2}$)) were tuned to find their optimal values using coupled simulated annealing (CSA) for the initial estimation and the simplex method for fine-tuning. The performance of the LS-SVM model with the optimal parameters for each spectral preprocessing option is shown in Table 2 and indicates that LS-SVM was slightly superior to PLSR in both the calibration and prediction sets, regardless of the preprocessing technique used. The best LS-SVM model for predicting the moisture content in a rubber sheet was the model with no spectral preprocessing, which produced a moisture content prediction for the prediction set with $R_{P}^{2}$, RMSEP and RPD$_{P}$ values of 0.983, 0.192%db and 7.597, respectively. LS-SVM models have performed better than PLSR models in many spectroscopic applications. Nevertheless, recent studies have reported that artificial intelligence methods such as neural network algorithms have been successful in regression tasks as well and can improve the model performance in many applications. Therefore, in this research, a neural network was used to build a regression model to compare its predictive performance with the two previous algorithms.

In order to compare the results of the prediction models developed using PLSR and LS-SVM with the ANN model, the same dataset was imported into the MATLAB R2017a software environment. In this paper, the ANN was designed with only one hidden layer due to the high computation requirements of the Jacobian matrix of the error function and the need to invert matrices whose size equals the number of network weights.²⁶ To avoid over-fitting with a small number of training samples (200 samples in the calibration set), the number of hidden neurons was varied from 1 to 16 units to find an optimal solution. A model with two units in the hidden layer was selected due to its minimum error (MSE) compared with the others. This is consistent with the observation that if the number of hidden nodes is too large for the number of training samples, the network converges more easily and fits the training data well, but does not generalize well to other data.²² Table 2 shows that the ANN model with the first derivative had the best result, with values for $R_{P}^{2}$, RMSEP, and RPD$_{P}$ of 0.993, 0.179%db and 8.276, respectively. Figures 8(e) and 8(f) present the scatter plots of the best results for the ANN model for both the calibration and prediction sets. In addition, it can be observed that the first-derivative spectra clearly resolved the wavelengths at about 925, 1200 and 1410nm. These wavelengths corresponded to the functional groups of the rubber polymer (CH, CH₃) and the second overtone of water (OH).⁴ Importantly, in this case the RMSEP decreased from around 0.22%db with the other spectral preprocessing methods to 0.179%db. These results indicated that the ANN model also had the best RMSEP value compared with the PLSR and LS-SVM models. Therefore, the LM algorithm had excellent convergence properties with fewer hidden nodes and this allowed for a more precise estimation of the prediction error.²⁶

Fig. 8. Scatter plot of predicted and reference moisture content for PLSR, LS-SVM, and ANN models using calibration and prediction sets.

4. Conclusion

The results indicated that NIRS had high potential as a tool for predicting the moisture content in a rubber sheet because it could identify the strong absorption bands of the functional groups of the rubber polymer and water in the near-infrared region. The current results show a strong correlation between moisture content and NIR spectral data. Regarding spectral preprocessing, the first derivative method gave the best outcomes for the PLSR and ANN models, because of its ability to resolve the wavelengths corresponding to the functional groups of rubber polymer and water. This paper developed and compared different prediction models (PLSR, LS-SVM, and ANN). The results showed that the ANN model provided the best outcome in predicting the moisture content compared with the other models in terms of the value of RMSEP. As a nonlinear model, the ANN model was able to exploit the nonlinearity that existed in the data. In terms of the complexity of the network, it should be noted that the network topology should be complex enough to capture this nonlinearity yet small enough to avoid over-fitting, as stated in the previous section. Based on this case, it can be confirmed that classical regression methods such as PLSR do not always provide the optimum option when dealing with spectroscopy.

Acknowledgment

This research was supported by the Faculty of Engineering at Kamphaeng Saen, Kasetsart University, Thailand.

References

1. C. Pasquini, “Near infrared spectroscopy: A mature analytical technique with new perspectives,” J. Anal. Chim. 1026, 8–36 (2018).
2. K. H. Norris, “Design and development of a new moisture meter,” J. Agric. Eng. 45(7), 370–372 (1964).
3. A. M. Davies, A. Grant, “Review: Near-infra-red analysis of food,” J. Food Sci. Technol. 22, 191–207 (1987).
4. R. Rittiron, W. Seehalak, “Moisture content in rubber sheet analyzed by transflectance near infrared spectroscopy,” J. Innov. Opt. Health Sci. 7, 1350068 (2014).
5. P. Sirisomboon, A. Kaewkuptong, P. Williams, “Feasibility study on the evaluation of the dry rubber content of field and concentrated latex of Para rubber by diffuse reflectance near infrared spectroscopy,” J. Near Infrared Spec. 21, 81–88 (2013).
6. S. Suchat, P. Theanjumpol, S. Karrila, “Rapid moisture determination for cup lump natural rubber by near infrared spectroscopy,” J. Ind Crop Prod. 76, 772–780 (2015).
7. V. R. Sharabiani, A. S. Naza, “Prediction of protein content of winter wheat by canopy of near infrared spectroscopy (NIRS) using partial least squares regression (PLSR) and artificial neural network (ANN) models,” J. Agric. Sci. 29(1), 43–51 (2019).
8. M. J. Martelo-Vidal, M. Vázquez, “Application of artificial neural networks coupled to UV-VIS-NIR spectroscopy for the rapid quantification of wine compounds in aqueous mixtures,” J. Food Eng. 13(1), 32–39 (2015).
9. A. Savitzky, M. J. Golay, “Smoothing and differentiation of data by simplified least squares procedures,” J. Anal. Chem. 36(8), 1627–1639 (1964).
10. N. M. Nawi, T. Jensen, G. Chen, “The application of spectroscopic methods to predict sugarcane quality based on stalk cross-sectional scanning,” J. Am. Soc. Sugar Cane Technol. 32, 16–27 (2012).
11. S. Wold, J. Trygg, A. Berglund, H. Antti, “Some recent development in PLS modeling,” J. Chemometr. Intell. Lab Syst. 58, 131–150 (2001).
12. X. Yu, H. Lu, D. Wu, “Development of deep learning method for predicting firmness and soluble solid content of postharvest Korla fragrant pear using Vis/NIR hyperspectral reflectance imaging,” J. Postharvest Biol. Technol. 141, 39–49 (2018).
13. R. Redaelli, M. Alfieri, G. Cabassi, “Development of a NIRS calibration for total antioxidant capacity in maize germplasm,” Talanta 154, 164–168 (2016).
14. J. Li, W. Huang, C. Zhao, B. Zhang, “A comparative study for the quantitative determination of soluble solids content, pH and firmness of pears by Vis/NIR spectroscopy,” J. Food Eng. 116(2), 324–332 (2013).
15. V. N. Vapnik, Statistical Learning Theory, Springer, New York (1998).
16. H. Y. Yu, X. Y. Niu, H. J. Lin, Y. B. Ying, B. B. Li, X. X. Pan, “A feasibility study on on-line determination of rice wine composition by Vis-NIR spectroscopy and least-squares support vector machines,” J. Food Chem. 113(1), 291–296 (2009).
17. A. Morellos, X. E. Pantazi, D. Moshou, T. Alexandridis, R. Whetton, G. Tziotzios, J. Wiebensohn, R. Bill, A. M. Mouazen, “Machine learning based prediction of soil total nitrogen, organic carbon and moisture content by using VIS-NIR spectroscopy,” J. Biosyst. Eng. 152, 104–116 (2016).
18. Z. Han, S. Cai, X. Zhang, Q. Qian, Y. Huang, F. Dai, G. Zhang, “Development of predictive models for total phenolics and free p-coumaric acid contents in barley grain by near-infrared spectroscopy,” J. Food Chem. 227, 342–348 (2017).
19. J. A. Suykens, J. Vandewalle, “Least squares support vector machine classifiers,” Neural Process. Lett. 9(3), 293–300 (1999).
20. F. Chauchard, R. Cogdill, S. Roussel, J. M. Roger, V. Bellon-Maurel, “Application of LS-SVM to non-linear phenomena in NIR spectroscopy: Development of a robust and portable sensor for acidity prediction in grapes,” J. Chemometr Intell Lab Syst. 71(2), 141–150 (2004).
21. K. De Brabanter, P. Karsmakers, F. Ojeda, C. Alzate, J. De Brabanter, K. Pelckmans, “LS-SVM toolbox user’s guide,” ESAT-SISTA Technical Report, 10-146, (2011).
22. D. Kruzlicova, J. Mock, B. Balla, J. Petka, M. Farkova, J. Havel, “Classification of Slovak white wines using artificial neural networks and discriminant techniques,” J. Food Chem. 112, 1046–1052 (2009).
23. S. Minaei, H. Bagherpour, M. A. Noghabi, M. E. Khorasani Fardvani, F. Forughimanesh, “A comparative study concerning linear and nonlinear models to determine sugar content in sugar beet by near infrared spectroscopy,” J. Food Biosci. Technol. 6(1), 13–22 (2016).
24. Y. Zhai, J. A. Thomasson, J. E. Boggess, R. Sui, “Soil texture classification with artificial neural networks operating on remote sensing data,” J. Comput Electron Agr. 54, 53–68 (2006).
25. M. T. Hagan, M. Menhaj, “Training feed forward networks with the Marquardt algorithm,” IEEE Trans. Neural Netw. 5, 989–993 (1994).
26. D. Pérez-Marín, A. Garrido-Varo, J. E. Guerrero, J. C. Gutiérrez-Estrada, “Use of artificial neural networks in near-infrared reflectance spectroscopy calibrations for predicting the inclusion percentages of wheat and sunflower meal in compound feeding stuffs,” J. Appl Spectrosc. 60(9), 1062–1069 (2006).
27. R. Stone, “Improved statistical procedure for the evaluation of solar radiation estimation models,” Solar Energy 51(4), 289–291 (1993).
28. AOAC, Official Methods of Analysis of AOAC International, 21st edition, https://www.aoac.org/official-methods-of-analysis-21st-edition-2019/ (2019).
29. G. Hans, B. Leblon, P. Cooper, A. L. Rocque, J. Nader, “Determination of moisture content and basic specific gravity of Populus tremuloides (Michx.) and Populus balsamifera (L.) logs using a portable near-infrared spectrometer,” J. Wood Mater Sci Eng. 10, 3–16 (2015).
30. B. G. Osborne, T. Fearn, P. H. Hindle, Practical NIR Spectroscopy with Applications in Food and Beverage Analysis, Longman Group UK Limited, UK (1993).
31. W. Saeys, A. M. Mouazen, H. Ramon, “Potential for onsite and online analysis of pig manure using visible and near infrared reflectance spectroscopy,” J. Biosyst. Eng. 91(4), 393–402 (2005).
32. M. Taurines, L. Brancheriau, S. Palu, D. Pioch, E. Tardan, N. Boutahar, P. Sartre, F. Meunier, “Determination of natural rubber and resin content of guayule fresh biomass by near infrared spectroscopy,” J. Ind Crops Prod. 134, 177–184 (2019).

Vol. 13, No. 02

History

Received 24 October 2019

Accepted 16 December 2019

Published: 6 February 2020

Information

This is an Open Access article. It is distributed under the terms of the Creative Commons Attribution 4.0 (CC-BY) License. Further distribution of this work is permitted, provided the original work is properly cited.
