In-situ monitoring of saccharides removal of alcohol precipitation using near-infrared spectroscopy
Abstract
As potentially unsafe components in herbal medicine (HM), saccharides can affect not only the drug appearance and stability, but also the drug efficacy and safety. The present study focuses on the in-line monitoring of batch alcohol precipitation processes for saccharide removal using near-infrared (NIR) spectroscopy. NIR spectra in the 4000–10,000 cm−1 wavenumber range are acquired in situ using a transflectance probe. These directly acquired spectra allow characterization of the dynamic variation of saccharides during alcohol precipitation. Calibration models based on partial least squares (PLS) regression have been developed for three saccharide impurities, namely glucose, fructose, and sucrose. Model errors are estimated as the root-mean-square errors of cross-validation (RMSECVs) for internal validation and the root-mean-square errors of prediction (RMSEPs) for external validation. The RMSECV values of glucose, fructose, and sucrose were 1.150, 1.535, and 3.067 mg⋅mL−1, and the RMSEP values were 0.711, 1.547, and 3.740 mg⋅mL−1, respectively. The correlation coefficients (r) between the NIR-predicted and reference values were all above 0.94. Furthermore, NIR predictions based on the constructed models improved our understanding of sugar removal and helped develop a control strategy for alcohol precipitation. The results demonstrate that, as an alternative process analytical technology (PAT) tool for monitoring batch alcohol precipitation processes, NIR spectroscopy is advantageous both for the efficient determination of quality characteristics (fast, in situ, and requiring no toxic reagents) and for evaluating process stability and repeatability.
1. Introduction
The safety and efficacy of herbal medicine (HM) depend not only on the quality of medicinal materials but also on the manufacturing processes. Owing to the complexity of HM, a number of purification technologies have been used by the pharmaceutical industry to manufacture HMs with high safety standards and efficacy. Efficacy ingredients, toxic and harmful components, and potentially unsafe components are considered the three main groups for analysis in the purification processes.1 The removal of potentially unsafe components can improve drug safety and reduce drug dosage. Although not considered efficacy components, saccharides, such as glucose, fructose, and sucrose, are commonly found in HM extracts. These saccharides can affect not only the drug appearance and stability, but also the efficacy and safety. Color formation, the degradation of glucose, fructose, and sucrose, and organic acid formation have been shown to be highly correlated.2 Coca et al.3 further confirmed that colored compounds were formed in juices and syrups due to sugar degradation and reactions between amine compounds and carbohydrates. Researchers have also focused on the transformation pathways and products of these saccharides. Yang and Montgomery4 and Asghari and Yoshida5 reported that fructose and other saccharides can produce 5-hydroxymethylfurfural (5-HMF) under certain conditions. 5-HMF, which has reported cytotoxicity, genotoxicity, and tumorigenicity, is limited to different safe concentrations in different pharmacopoeias.6 Therefore, further reduction and removal of saccharides during the manufacturing process is still needed to assure the high quality of the end products.
Alcohol precipitation is among the most effective methods for saccharide removal, with the advantages of being simple, rapid, easily scalable, and cost-effective.7 This method has also been identified as a key operation affecting the downstream processes and even product quality, based on previous research and risk assessment conducted in accordance with the ICH Q8 and ICH Q9 guidelines.8,9,10 Traditionally, various laboratory-based analytical techniques, such as colorimetric measurements, high-performance liquid chromatography (HPLC), and capillary electrophoresis, have been used for the quantitative determination of saccharides in HM, biological systems, fruit juice, and wine.11,12,13 However, these offline methods suffer from various disadvantages, such as time delays, nonrepresentative sample collection, analytical errors in sample preparation, and labor-intensive analysis.9,14
In contrast, process analytical technology (PAT) tools initiated by the Food and Drug Administration (FDA)15 may overcome the difficulties of offline laboratory techniques and allow at-line, in-line, or online measurements of critical process performance and quality attributes. Following this initiative, various spectroscopic (UV/vis, IR, and Raman) and chromatographic methods for process monitoring have been developed. Among these methods, near-infrared (NIR) spectroscopy is the most widely used in the pharmaceutical industry,16,17,18,19 mainly owing to its rapid turnaround (in seconds or minutes), nondestructive nature, high sensitivity, and minimal or no sample preparation.20 NIR spectroscopy, a type of vibrational spectroscopy, covers the wavelength range of 700–2500 nm (approximately 14,300–4000 cm−1). Absorptions in the NIR region are mainly due to overtones and combination bands of the fundamental vibrations found in the mid-IR region, chiefly those of hydrogen-containing bonds such as –OH, –NH, and –CH.21 Luypaert et al.22 published a review on pharmaceutical applications of NIR, including NIR technique development in the pharmaceutical field, where it could be applied to raw material identification, manufacturing process monitoring, and final product release. A book reviewing the pharmaceutical and medical applications of NIR spectroscopy by Ciurczak and Drennen17 describes NIR applications at different stages of manufacture, including blending, granulation, drying, and coating. Compared with conventional laboratory analysis, in-situ process measurement can save a significant amount of analysis time and reduce sampling errors.9 However, the application of NIR spectroscopy to HM manufacturing processes remains limited. HM manufacture is complex and involves many process stages and constituents, presenting a great challenge for in-line monitoring and quality control.
NIR spectroscopy can be used to measure many chemical compositions and physical properties (such as temperature, viscosity, and granularity) and can, therefore, play an important role in HM quality control and in improving continuous process control strategies, quality verification, and real-time release (RTR).23,24,25 However, NIR calibrations often require a sampling procedure for reference measurements. This procedure is particularly important in PAT applications using spectroscopic probes, because the reference values of withdrawn samples can be biased relative to the real process state.
Herein, the alcohol precipitation process of Danshen (Salvia miltiorrhiza Bunge) was selected for study. Danshen preparations made from aqueous extracts of Salvia miltiorrhiza root are among the most widely used HMs for the treatment of cardiovascular disease, heart stroke, and cerebrovascular disease.26,27,28 Potentially unsafe saccharide components, such as glucose, fructose, sucrose, and raffinose, are commonly found in Danshen extracts.1 In this work, NIR spectroscopy was evaluated as a method for the in-situ monitoring of batch alcohol precipitation processes for saccharide removal. An NIR transflectance probe was inserted directly into the alcohol precipitation system (in situ), and the corresponding spectral measurements were obtained within several seconds. The spectral results were subsequently used to model three key quality attributes: glucose, fructose, and sucrose. These attributes were selected because they are nonbioactive impurities and are typically used to characterize the effectiveness of batch alcohol precipitation processes. NIR predictions were then employed directly to evaluate the batch processes and enhance process understanding. All experiments were performed on a laboratory scale to simulate the industrial process of alcohol precipitation for saccharide removal from Danshen.
2. Experiment and Methods
2.1. Materials
Danshen extracts and 95% EtOH were obtained from Chiatai Qingchunbao Pharmaceutical Co., Ltd. (Hangzhou, China) and Changqin Chemical Co., Ltd. (Hangzhou, China), respectively. Standards for D-glucose, D-fructose, and sucrose were purchased from Sinopharm Chemical Reagent Co., Ltd. (Shanghai, China). HPLC-grade acetonitrile and formic acid were supplied by Merck (Darmstadt, Germany) and Tedia Company (Fairfield, OH, USA), respectively. Deionized water was prepared using a Milli-Q water purification system (Milford, MA, USA).
2.2. Experiment setup and design
The batch alcohol precipitation process was performed in a custom glass beaker with a maximum volume of about 4 L. The experimental setup is shown in Fig. 1. An aqueous solution of Danshen (approximately 1.23 kg) was introduced into the beaker and kept in a water bath at 20 °C under mechanical stirring with a three-impeller stirrer and a motor starter. Stirring was continued while 95% EtOH was added with a peristaltic pump. The rate of EtOH addition was set at 100 mL/min. The solution was then allowed to stand for about half an hour without stirring. The NIR transflectance probe was carefully inserted into the beaker at a fixed point to collect NIR spectra continuously throughout alcohol precipitation.

Fig. 1. Schematic of experiment setup for alcohol precipitation of Danshen.
All experiments were conducted on a laboratory scale. A design matrix with two factors, six center-point (CP) replicates, and two abnormal experimental runs was constructed. Two experimental factors, EtOH volume and initial density (solid concentration of the aqueous solution of Danshen), were included in the design of experiments (DoE), as summarized in Table 1. In addition to DoE, abnormal operation conditions were added to two batches (Table 2) named process deviation (PD) batches to simulate the real-world process deviations and assess model capability.
Batch name (abbreviation) | Initial density (×10³ kg⋅m−3) | EtOH volume (L) |
---|---|---|
Center point 1 (CP1) | 1.23 | 3.0 |
Center point 2 (CP2) | 1.23 | 3.0 |
Center point 3 (CP3) | 1.23 | 3.0 |
Center point 4 (CP4) | 1.23 | 3.0 |
Center point 5 (CP5) | 1.23 | 3.0 |
Center point 6 (CP6) | 1.23 | 3.0 |
Experiment 7 (DoE7) | 1.23 | 2.0 |
Experiment 8 (DoE8) | 1.18 | 3.0 |
Experiment 9 (DoE9) | 1.23 | 4.0 |
Experiment 10 (DoE10) | 1.28 | 3.0 |
Process deviation 11 (PD11) | 1.23 | 3.0 |
Process deviation 12 (PD12) | 1.23 | 3.0 |
Batch name (abbreviation) | Description of process deviation |
---|---|
Process deviation 11 (PD11) | Agitation failure: stirrer stopped at t = 16 min, restarted at t = 26 min |
Process deviation 12 (PD12) | Peristaltic pump failure: rate of EtOH addition increased to 200 mL/min at t = 11 min; decreased to 100 mL/min at t = 22 min; decreased to 50 mL/min at t = 32 min |
2.3. NIR instrumentation
NIR spectra were continuously collected in situ during alcohol precipitation using a Thermo Scientific™ Antaris II Fourier transform (FT) NIR analyzer equipped with a transflectance probe with an adjustable optical path length (S-650, Thermo Nicolet Corporation, WI, USA). The transflectance probe (path length fixed at about 2 mm) was connected directly to the NIR spectrometer via an optical fiber and inserted into the beaker to record the process spectra. Each spectrum was collected in the 4000–10,000 cm−1 range with a total of 64 scans at a resolution of 8.0 cm−1. The software was programmed to acquire one spectrum every minute. The total duration of the batch process was 80 min, comprising 10 min for raw material addition, 37 min for 95% EtOH addition, and 33 min for the standing phase.
2.4. Samples
To obtain a good sampling distribution, samples (2.0 mL) were taken at random intervals of 5–30 min for each batch. A total of 115 samples were collected throughout alcohol precipitation from the different experimental batches, and their glucose, fructose, and sucrose contents were analyzed by HPLC using the reference method (see Sec. 2.5). To obtain representative samples, both CP batches and DoE batches covering a wide range of saccharide concentrations were used during model development. The glucose, fructose, and sucrose models used in this study were developed based on 100 samples from these CP and DoE batches. A total of 15 samples from the PD batches (eight samples from PD11 and seven samples from PD12) were used as an external validation set.
2.5. Reference methods
The glucose, fructose, and sucrose contents were determined by HPLC with evaporative light scattering detection (ELSD). An Agilent 1100 series HPLC instrument (Agilent Technologies, Waldbronn, Germany) equipped with a vacuum degasser, a quaternary gradient pump, an autosampler, and a thermostatted column compartment was used. The signal from a Sedex75 ELSD (Sedere, France) was transmitted to the ChemStation for processing via an Agilent 35900E A/D interface (Agilent Technologies, Santa Clara, USA). Chromatographic separations were performed on a WATO44355 carbohydrate column (250 mm × 4.6 mm, 4 μm, Waters, MA) at 1.0 mL⋅min−1 using acetonitrile–water (78:22, v/v) as the mobile phase. The column temperature was set at 30 °C, and the injection volume was 10 μL. Samples of crude extract and alcohol precipitation liquid were diluted twofold and 20-fold, respectively, and centrifuged at 12,000 rpm for 10 min using a centrifuge (Eppendorf Co., Germany) before HPLC-ELSD analysis.
2.6. Chemometrics
Calibration models based on NIR spectra for glucose, fructose, and sucrose were developed using the partial least squares (PLS) algorithm.29 PLS regression analysis was performed to establish a linear relationship between two matrices, the NIR spectral data ($X_{n \times m}$) and the saccharide concentrations ($Y_{n \times k}$), which were recorded for n samples. The objective of PLS was to project the data onto a small number of latent variables (LVs), with scores $t_i$ and $u_i$ for the ith latent variable, and then to develop a regression model between $t_i$ and $u_i$:

$$u_i = b_i t_i + e_i,$$

where $b_i$ is the regression coefficient of the inner relation for the ith latent variable and $e_i$ is the residual vector.
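As an illustration of the projection onto latent variables described above, the following is a minimal sketch of a single-response PLS model (PLS1) fitted with the NIPALS algorithm. It is not the Unscrambler implementation used in this study; the function names `pls1_fit` and `pls1_predict` are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit a PLS1 model with NIPALS (illustrative sketch, not the authors' code)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean        # mean-centered working copies
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                      # weights: direction of maximum covariance with y
        w = w / np.linalg.norm(w)
        t = Xr @ w                         # X scores (latent variable)
        tt = t @ t
        p = Xr.T @ t / tt                  # X loadings
        c = (yr @ t) / tt                  # regression of y on the scores
        Xr = Xr - np.outer(t, p)           # deflate X
        yr = yr - c * t                    # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q) # regression vector in the original X space
    return coef, x_mean, y_mean

def pls1_predict(model, X):
    """Predict the response for new spectra from a fitted PLS1 model."""
    coef, x_mean, y_mean = model
    return (np.asarray(X, dtype=float) - x_mean) @ coef + y_mean
```

In a PLS1 setting such as the one used here, one such model would be fitted per saccharide, with the number of components chosen by cross-validation.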
PLS regression is a multivariate technique with no restriction on the number of wavelengths, and is especially useful in rank-deficient cases, that is, when the number of samples is smaller than the number of variables or when the variables are highly correlated.30 Further details on PLS modeling can be found in the literature.31 The two most commonly used validation methods are cross-validation and test set validation. k-fold cross-validation is a commonly used technique for estimating the prediction error. The samples are randomly partitioned into k subsamples; a single subsample is retained as the validation data to test the model, and the remaining k−1 subsamples are used as training data. The cross-validation process is then repeated k times (the folds), with each of the k subsamples being used exactly once as the validation data. The k results from the folds can then be averaged (or otherwise combined) to produce a single estimate. This method has the advantage that all observations are used for both training and validation, and each observation is used for validation exactly once. In this study, tenfold cross-validation was applied as the internal validation. All data were chemometrically processed using Unscrambler 9.6 (Camo, Oslo, Norway).
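The k-fold procedure described above can be sketched as follows. This is an illustrative outline only: the names `kfold_rmsecv`, `fit_line`, and `predict_line` are hypothetical, the straight-line model merely stands in for a PLS model, and the squared errors are pooled over all folds rather than averaged fold by fold, which is one of several ways of combining the k results.

```python
import math
import random

def kfold_rmsecv(x, y, fit, predict, k=10, seed=0):
    """Estimate RMSECV by k-fold cross-validation.

    fit(x_train, y_train) returns a model;
    predict(model, x_test) returns a list of predictions.
    """
    n = len(y)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)            # random partition of the samples
    folds = [idx[i::k] for i in range(k)]       # k disjoint, near-equal subsamples
    sq_err = 0.0
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = fit([x[i] for i in train], [y[i] for i in train])
        preds = predict(model, [x[i] for i in held_out])
        sq_err += sum((p - y[i]) ** 2 for p, i in zip(preds, held_out))
    return math.sqrt(sq_err / n)                # each sample is predicted exactly once

def fit_line(x, y):
    """Toy univariate least-squares model used only to exercise the procedure."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx

def predict_line(model, x):
    slope, intercept = model
    return [slope * a + intercept for a in x]
```

With k = 10 and 100 calibration samples, each fold would hold out 10 samples and train on the remaining 90.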
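The performance of the constructed model was evaluated using the correlation coefficient (r), the root-mean-square error of cross-validation (RMSECV), and the root-mean-square error of prediction (RMSEP). The r value between the NIR prediction and measured value was calculated as follows:

$$r = \frac{\sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2 \sum_{i=1}^{n} (y_i - \bar{y})^2}},$$

where $y_i$ and $\hat{y}_i$ are the reference and NIR-predicted values of sample i, $\bar{y}$ and $\bar{\hat{y}}$ are their respective means, and n is the number of samples.
RMSECV was determined from the cross-validation, calculated as follows:

$$\mathrm{RMSECV} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (\hat{y}_{\mathrm{CV},i} - y_i)^2},$$

where $\hat{y}_{\mathrm{CV},i}$ is the value predicted for calibration sample i when that sample was held out of the model, and n is the number of calibration samples.
RMSEP was used as a model accuracy indicator of the external testing, and was calculated as follows:

$$\mathrm{RMSEP} = \sqrt{\frac{1}{m} \sum_{i=1}^{m} (\hat{y}_i - y_i)^2},$$

where m is the number of samples in the external validation set.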
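As a worked check of these statistics, the sketch below computes the root-mean-square error and a correlation coefficient for the glucose external validation data listed in Table 4. The helper names `rmse` and `pearson_r` are hypothetical, and the use of the Pearson form for r is an assumption, since the exact expression used by the software is not stated here.

```python
import math

def rmse(pred, ref):
    """Root-mean-square error between predicted and reference values."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))

def pearson_r(pred, ref):
    """Pearson correlation coefficient (assumed form of r)."""
    mp, mr = sum(pred) / len(pred), sum(ref) / len(ref)
    num = sum((p - mp) * (r - mr) for p, r in zip(pred, ref))
    den = math.sqrt(sum((p - mp) ** 2 for p in pred) * sum((r - mr) ** 2 for r in ref))
    return num / den

# Glucose columns of Table 4 (mg/mL), external validation samples 1-15.
glucose_pred = [11.156, 4.162, 2.502, 2.533, 1.693, 1.27, 1.743, 1.743,
                7.074, 3.976, 2.196, 1.442, 0.964, 1.44, 1.888]
glucose_ref = [13.052, 4.310, 3.212, 2.385, 2.031, 1.731, 1.854, 1.854,
               6.224, 3.658, 2.687, 2.103, 2.139, 1.885, 1.851]

print(round(rmse(glucose_pred, glucose_ref), 3))  # 0.711, the glucose RMSEP in Table 4
```

The same `rmse` function applied to cross-validated predictions would give the RMSECV.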
Spectral preprocessing techniques,32 such as derivatives, standard normal variate (SNV), and multiplicative signal correction (MSC), are commonly applied to minimize the effects of physical differences between samples, especially particle size, and other undesirable systematic variations.33 In this study, considering the characteristics of the spectra, SNV was applied to minimize the effects of such interference. This method centers each spectrum around zero by subtracting its mean (additive adjustment) and then scales it by dividing by its standard deviation (multiplicative adjustment), with both statistics computed over all variables of that spectrum.
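The SNV transformation can be sketched in a few lines. The function name `snv` is hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption; the software used in the study may scale differently.

```python
import statistics

def snv(spectrum):
    """Standard normal variate: center and scale ONE spectrum by its own mean and SD."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)        # sample standard deviation (n - 1)
    return [(a - mean) / sd for a in spectrum]
```

Because each spectrum is corrected using only its own mean and standard deviation, a spectrum that differs from another only by a baseline offset and a multiplicative scaling maps to the same SNV-corrected spectrum.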
2.7. In-line monitoring advantages
Predictive saccharide values based on the constructed models were produced every minute to provide updates on the process status. In-line operation and integration with the predictive system reduced the sampling and test requirements, and, therefore, freed up resources. An increase in sampling frequency led to more effective monitoring and control of alcohol precipitation for saccharide removal. Process deviations and disturbance were detected in a timely fashion based on the process trajectory and evolution of batches for saccharide removal performance. The process trajectory of alcohol precipitation was determined using SIMCA-P+ 12.0.1 software (Umetrics, Umeå, Sweden).
3. Results and Discussion
3.1. NIR process spectra
NIR spectra of the beaker contents were measured every minute (in-situ measurement). Figure 2 shows the raw and SNV-preprocessed spectra of a batch run of alcohol precipitation from Danshen under normal operating conditions. Preprocessing eliminated most of the spectral variation due to physical differences. The processed NIR spectra exhibited intense absorbance bands in the 5300–7274 cm−1 wavenumber region. The peak at around 5361 cm−1 might be attributed to the combination of O–H stretching and bending, while the peak around 6888 cm−1 was likely due to the first overtone of O–H stretching in alcohol and water.34 The 5682–6024 cm−1 region was related to the first overtone of C–H stretching modes in CH3 and CH2 groups.35

Fig. 2. NIR spectra obtained every minute from in-situ monitoring during the alcohol precipitation: (a) raw NIR spectra and (b) preprocessed spectra with SNV.
3.2. Saccharides calibration
The variability observed among the 100 Danshen samples used to model the glucose, fructose, and sucrose parameters could be measured by dividing the standard deviation of each parameter by the corresponding mean (Table 3). Three PLS1 models, one each for glucose, fructose, and sucrose, were developed based on NIR spectra acquired in situ. For all calibration models, the best preprocessing method was found to be SNV and the optimal region was 5681–6024 cm−1 combined with 6329–7274 cm−1. Optimal PLS models of glucose, fructose, and sucrose, cross-validated with 100 samples, were obtained using four components, and the corresponding RMSECV values were 1.150, 1.535, and 3.067 mg⋅mL−1, respectively.
Sample set | Unit | Range | Sample number | Mean | S.D. |
---|---|---|---|---|---|
(a) Calibration set | | | 100 | | |
Glucose | mg⋅mL−1 | 1.02–17.75 | | 4.56 | 3.14 |
Fructose | mg⋅mL−1 | 1.47–25.71 | | 7.09 | 4.77 |
Sucrose | mg⋅mL−1 | 2.41–49.98 | | 12.21 | 8.43 |
(b) External validation set | | | 15 | | |
Glucose | mg⋅mL−1 | 1.73–13.05 | | 3.40 | 2.94 |
Fructose | mg⋅mL−1 | 2.86–22.41 | | 5.35 | 5.04 |
Sucrose | mg⋅mL−1 | 4.67–16.91 | | 7.47 | 3.82 |
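The spectral-region selection reported for the calibration models (5681–6024 cm−1 combined with 6329–7274 cm−1) can be sketched as a simple mask over the wavenumber axis. The function name `select_regions` is hypothetical, and the wavenumber grid in the usage note is an assumed example, since the data-point spacing of the recorded spectra is not given.

```python
# Wavenumber windows (cm^-1) reported as optimal for all three calibration models.
REGIONS = [(5681.0, 6024.0), (6329.0, 7274.0)]

def select_regions(wavenumbers, spectrum, regions=REGIONS):
    """Keep only the spectral points whose wavenumber falls inside any region."""
    keep = [i for i, w in enumerate(wavenumbers)
            if any(lo <= w <= hi for lo, hi in regions)]
    return [wavenumbers[i] for i in keep], [spectrum[i] for i in keep]
```

For example, on an assumed grid running from 4000 to 10,000 cm−1 in steps of 4 cm−1, the mask would retain 322 of the 1501 points, and only these would be passed to the PLS model.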
Figure 3 shows the reference and cross-validated PLS predicted values for glucose, fructose, and sucrose. Good agreement was found between the reference and predicted values, as reflected in both the high correlation coefficients and the low RMSECV values. The models were therefore considered useful for the quality control of alcohol precipitation. The RMSEP values of external validation for glucose, fructose, and sucrose (Table 4), obtained in line, were 0.711, 1.547, and 3.740 mg⋅mL−1, respectively. As shown in Table 4, higher sugar concentrations resulted in larger prediction errors. A possible explanation is that sampling at high sugar concentrations presented a significant challenge owing to the rapid decrease in sugar content at the beginning of the purification process with increasing ethanol mass fraction.

Fig. 3. PLS NIR-model validation plots for saccharides of Danshen: (a) glucose, (b) fructose, and (c) sucrose.
Sample | Glucose predictive (mg⋅mL−1) | Glucose reference (mg⋅mL−1) | Fructose predictive (mg⋅mL−1) | Fructose reference (mg⋅mL−1) | Sucrose predictive (mg⋅mL−1) | Sucrose reference (mg⋅mL−1) |
---|---|---|---|---|---|---|
1 | 11.156 | 13.052 | 17.177 | 17.177 | 28.048 | 14.030 |
2 | 4.162 | 4.310 | 6.250 | 6.250 | 10.928 | 11.574 |
3 | 2.502 | 3.212 | 3.984 | 3.984 | 7.369 | 8.060 |
4 | 2.533 | 2.385 | 3.924 | 3.924 | 6.704 | 6.025 |
5 | 1.693 | 2.031 | 2.714 | 2.714 | 4.887 | 5.418 |
6 | 1.27 | 1.731 | 2.133 | 2.133 | 4.498 | 4.757 |
7 | 1.743 | 1.854 | 2.843 | 2.843 | 5.01 | 4.673 |
8 | 1.743 | 1.854 | 2.843 | 2.843 | 5.01 | 4.673 |
9 | 7.074 | 6.224 | 10.932 | 10.932 | 19.102 | 16.912 |
10 | 3.976 | 3.658 | 6.027 | 6.027 | 10.137 | 8.991 |
11 | 2.196 | 2.687 | 3.504 | 3.504 | 6.349 | 6.480 |
12 | 1.442 | 2.103 | 2.401 | 2.401 | 4.186 | 5.288 |
13 | 0.964 | 2.139 | 1.694 | 1.694 | 3.488 | 5.284 |
14 | 1.44 | 1.885 | 2.411 | 2.411 | 4.432 | 5.079 |
15 | 1.888 | 1.851 | 3.076 | 3.076 | 5.44 | 4.839 |
RMSEP | 0.711 | | 1.547 | | 3.740 | |
3.3. Implementation of in-line monitoring
The ability of NIR spectroscopy to predict the trends of these saccharide parameters during the batch alcohol precipitation process could enhance process understanding. Figure 4(a) shows the sucrose values predicted by the PLS calibration during the dynamic alcohol precipitation process for one CP batch, compared with the reference measurements. Good agreement between the predicted and reference values was observed, except at timepoint 40. At timepoints 37–47, during the end stage of alcohol addition, precipitates cross-linked with each other through adsorption and agglomerated to form larger precipitates. NIR spectra showed interference due to these particle variations, which made prediction challenging. The sucrose trajectories of the six CP batches predicted using the sucrose PLS calibration are shown in Fig. 4(b), in which three stages of the batch process are distinguished (feeding at timepoints 1–10, EtOH addition at timepoints 11–47, and standing at timepoints 48–80). The predicted sucrose values remained constant during the feeding stage, until EtOH addition began. The sucrose content tended to decrease during the purification process, probably because the solubility of sucrose is significantly reduced with increasing ethanol mass fraction. The number of small particles increased and became flocculent, but did not aggregate until the purification had ended. When the alcohol content of the supernatant reached a certain level at timepoints 37–47 (the critical point), precipitates of small particles continued to cross-link into flocs, followed by aggregate formation. During this process, the variety of particles increased the complexity of the system, making the process trajectory highly volatile. The sucrose content in the supernatant was stable after about 60 min, implying that the system had reached equilibrium. The other two saccharides (not shown here) exhibited the same trends and behaviors.
Furthermore, the results in the chart highlighted the reproducibility of the CP batches based on in-line NIR measurement of saccharides.

Fig. 4. PLS models for in-line monitoring applications: (a) sucrose prediction during the alcohol precipitation for certain batch runs and (b) process trajectory and evolution of batches for sugar removal based on the six CP batches.
In the derived principal component analysis (PCA) model, the first principal component (PC) captured 99.9% of the overall variation, representing a good summary of batch variations. A control chart of the three saccharides based on the six CP batches is shown in Fig. 5(a). The control limits of the batch control chart corresponded to the average saccharide content ± 3 standard deviations. Two PD batches were used for validation, with both abnormal batches exceeding the control limits at certain timepoints. Agitation failure, in which the stirrer was stopped at timepoint 16, was introduced to create a process deviation during the alcohol precipitation of batch PD11. As a result, batch PD11 moved outside the control limits exactly at timepoint 16. The process trajectory fell back within the control limits once the stirrer returned to normal at timepoint 26. Similar behavior was observed in batch PD12 with a peristaltic pump failure. Contribution plots in Figs. 5(b) and 5(c) were used to determine which of the original variables were most related to these phenomena. The three saccharides were positively correlated during the alcohol precipitation process. The trajectories of the PD batches recovered and returned to within normal limits after the process disturbance disappeared. A short agitation break or peristaltic pump failure was shown to have an insignificant effect on the end product, which was in complete agreement with the conclusions of our previous work.9
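The ± 3 standard deviation control limits described above can be sketched for a single monitored variable as follows. This is a simplified illustration rather than the SIMCA-P+ batch model used in the study, which operates on PCA scores; the function names `control_limits` and `out_of_control` are hypothetical, and reference trajectories are assumed to be aligned timepoint by timepoint.

```python
import statistics

def control_limits(reference_batches):
    """Per-timepoint mean +/- 3 SD limits from normal (reference) batch trajectories."""
    limits = []
    for values in zip(*reference_batches):     # one timepoint across all batches
        mean = statistics.fmean(values)
        sd = statistics.stdev(values)
        limits.append((mean - 3 * sd, mean + 3 * sd))
    return limits

def out_of_control(trajectory, limits):
    """Return the 1-based timepoints at which a batch leaves the control limits."""
    return [t for t, (v, (lo, hi)) in enumerate(zip(trajectory, limits), start=1)
            if not lo <= v <= hi]
```

In use, the limits would be built from the predicted trajectories of the six CP batches, and each new NIR prediction from a running batch would be compared against the limits for its timepoint.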

Fig. 5. Process monitoring of the PD batches: (a) a control plot of the PD batches based on the six CP batches, (b) score contribution of batch PD11 at timepoint 24, and (c) score contribution of batch PD12 at timepoint 15.
According to these results, it was concluded that the process faults could be detected by developing a total sugar control chart based on an in-line monitoring model. While the HPLC method (used as the reference method here) took about 30min to provide a set of results for a sample, the NIR technique required only a few seconds to provide an estimate. The prediction accuracy and increased number of results provided by NIR spectroscopy allowed minimal manual sampling and traditional testing. Additional benefits included reduced worker exposure to alcohol mixtures and reduced environmental pollution caused by reagents.
4. Conclusion
NIR spectroscopy has been applied to the in-situ monitoring of a laboratory-scale batch alcohol precipitation process. Key parameters of glucose, fructose, and sucrose were modeled by PLS regression based on the in-situ NIR spectra. The internal and external validations of the models gave RMSECV values of 1.150, 1.535, and 3.067 mg⋅mL−1, and RMSEP values of 0.711, 1.547, and 3.740 mg⋅mL−1, for the above three saccharides, respectively. The models were found to be useful for predicting the trends of the saccharide parameters during the batch alcohol precipitation process. The application of in-situ NIR spectroscopy in this study confirms that NIR spectroscopy, combined with multivariate analysis, can be successfully applied to modeling and predicting the kinetic variables of alcohol precipitation (namely, glucose, fructose, and sucrose), despite the complexity inherent in studies of HM systems. The major advantages of the proposed method are the use of no toxic reagents, in-situ monitoring, and high cost-effectiveness. Real-time alcohol precipitation descriptors of this kind could be used at both laboratory and industrial scales. Future extensions of this work include the application of the same methods to laboratory and industrial samples, the determination of other quality parameters, such as tannins or other specific impurities, and the combination of process parameters with NIR spectroscopy to better understand the processes. This will allow nondestructive, accurate, and near-real-time determination, monitoring, and control of the whole batch process to be realized.
Conflict of Interest
The authors declare that there is no conflict of interest.
Acknowledgments
This work was supported by the State Administration of Traditional Chinese Medicine of Zhejiang Province Project (No. 2015ZQ022) and the Zhejiang TCM Health Science and Technology Project (No. 2015KYB110).