Prediction of holocellulose and lignin content of pulp wood feedstock using near infrared spectroscopy and variable selection

https://doi.org/10.1016/j.saa.2019.117515

Highlights

  • The holocellulose and lignin contents of multispecies hardwoods were predicted accurately by NIR spectroscopy.

  • The CARS method effectively enhanced the accuracy and robustness of NIR prediction models.

  • An efficient and concise quantitative analysis tool was proposed to guide future pulp feedstock quality assessment.

Abstract

Wood is the main feedstock for the pulp and paper industry. However, variations in chemical composition across multispecies and multisource feedstock strongly affect the continuity and stability of production. As a rapid and non-destructive analysis technique, near infrared (NIR) spectroscopy provides an alternative for on-line analysis of wood properties and feedstock quality control. Herein, NIR spectroscopy coupled with partial least squares (PLS) regression was used to predict the holocellulose and lignin contents of various wood species, including poplars, eucalyptus and acacias. To obtain more accurate and robust prediction models, several variable selection methods for optimizing NIR spectral variables were compared: competitive adaptive reweighted sampling (CARS), Monte Carlo-uninformative variable elimination (MC-UVE), successive projections algorithm (SPA), and genetic algorithm (GA). The results indicated that CARS was more efficient than the other methods both in eliminating uninformative variables and in enhancing the predictive performance of the models. CARS-PLS models showed significantly higher robustness and accuracy for each property while using the fewest variables in cross validation and external validation, demonstrating their applicability and reliability for predicting multispecies feedstock properties.

Introduction

As the main raw material in pulp and paper production, wood consists mainly of cellulose, hemicellulose, lignin, and extractives [1,2]. Its chemical composition is related to the properties of the finished products. Holocellulose, composed of cellulose and hemicelluloses, is an important index for evaluating the pulping potential of wood feedstock. Lignin and extractives directly affect pulp yield and properties. Lignocellulosic feedstock with high cellulose and low lignin contents is desirable for pulp and paper production [3,4]. However, chemical composition varies considerably among sources, among wood species, and even among different parts of an individual tree [5,6]. These variations in wood properties are especially large in the low-quality small logs and wood residues that are widely used by pulp and paper mills in China owing to the shortage of quality wood resources [7]. To obtain a product of even quality, it is necessary to monitor variations in the raw material entering the pulping process. However, traditional chemical analysis methods for assessing wood properties are not applicable to industrial online monitoring because they are time-consuming and costly [8–11]. As a rapid and non-destructive analysis technique, near infrared (NIR) spectroscopy provides an alternative for characterizing wood properties. Previous studies have demonstrated the potential of NIR spectroscopy to predict various chemical and physical properties of wood, such as chemical composition, basic density, structural changes, and fibre morphological characteristics [12–15].

A typical application of NIR to wood property analysis requires an initial calibration step, in which both NIR spectra and composition data are collected from a calibration sample set covering a specific content range of the component of interest and are then used to establish a calibration model through multivariate calibration algorithms [16,17]. Once the prediction model is established, the properties of unknown wood samples can be assessed within minutes from easily obtained NIR data.
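
The calibrate-then-predict workflow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the spectra are synthetic, the band shapes, noise level, content range and number of latent variables are all assumed for illustration, and the helper names (`pls1_fit`, `pls1_predict`, `simulate`) are hypothetical. PLS regression is written out with the NIPALS algorithm in NumPy so that the example is self-contained.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model by NIPALS; returns (x_mean, y_mean, coef)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                 # weight vector: direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                    # scores
        p = Xk.T @ t / (t @ t)        # X loadings
        c = (yk @ t) / (t @ t)        # y loading
        Xk = Xk - np.outer(t, p)      # deflate X and y before the next component
        yk = yk - c * t
        W.append(w); P.append(p); q.append(c)
    W, P = np.array(W).T, np.array(P).T
    coef = W @ np.linalg.solve(P.T @ W, np.array(q))
    return x_mean, y_mean, coef

def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean

# Synthetic "spectra" (assumed): one analyte band plus one interfering band and noise.
AXIS = np.linspace(0.0, 1.0, 200)
BAND = np.exp(-((AXIS - 0.4) / 0.05) ** 2)
INTERFERENT = np.exp(-((AXIS - 0.7) / 0.10) ** 2)

def simulate(n, rng):
    y = rng.uniform(18.0, 30.0, n)    # hypothetical component content, %
    z = rng.uniform(0.0, 1.0, n)      # level of the interfering component
    noise = rng.normal(0.0, 0.002, (n, AXIS.size))
    X = 0.01 * np.outer(y, BAND) + 0.2 * np.outer(z, INTERFERENT) + noise
    return X, y

rng = np.random.default_rng(0)
X_cal, y_cal = simulate(60, rng)      # calibration set: spectra + reference values
X_new, y_ref = simulate(10, rng)      # "unknown" samples (references kept only for checking)

model = pls1_fit(X_cal, y_cal, n_components=2)
y_pred = pls1_predict(model, X_new)
rmsep = np.sqrt(np.mean((y_pred - y_ref) ** 2))
print(f"RMSE of prediction on new samples: {rmsep:.3f}")
```

In practice the number of latent variables would be chosen by cross validation rather than fixed in advance as it is here.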

NIR absorption bands mainly reveal the overtones and combinations of vibrations from hydrogen-containing groups (C–H, O–H, and N–H). However, these bands are usually broad and highly overlapping, and can hardly be assigned directly to the chemical composition or molecular structure of an individual wood component [18,19]. Previously, NIR-based models for wood property prediction were usually developed using the full spectral range, which contains abundant noise, interference and uninformative variables [20,21]. Such collinearity and irrelevant information in NIR absorption signals easily lead to over-fitting during modelling, which greatly affects the robustness and reliability of the calibration [22]. Recently, some studies have begun to recognize the importance of optimizing spectral variables (wavelengths or wavenumbers) for NIR quantitative analysis of wood properties [5,23,24]. The spectral ranges associated with the property of interest can be selected from the full spectrum either manually or by mathematical selection methods. To avoid interference from the water band and redundant noise, Ishizuka et al. used the spectral bands between 6800 cm⁻¹ and 5800 cm⁻¹ and between 5000 cm⁻¹ and 4050 cm⁻¹ to measure the lignin and holocellulose content of decayed wood [25]. Fernández et al. used a reduced wavenumber range of 7500–5500 cm⁻¹ to create more accurate and robust calibration models for the analysis of olive tree pruning biomass [26]. However, these manual selection methods are mainly based on the band assignment of characteristic spectral absorptions, and the selected spectral ranges still include some redundant bands in order to avoid a loss of information.
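
Manual range selection of the kind reported by Ishizuka et al. amounts to masking the wavenumber axis. The sketch below shows one way this might be done; the wavenumber grid (10000 to 4000 cm⁻¹ in steps of 4 cm⁻¹), the placeholder spectra and the helper name `select_ranges` are assumptions for illustration, and only the two window limits come from the text.

```python
import numpy as np

# Assumed instrument grid: 10000 down to 4000 cm^-1 in 4 cm^-1 steps.
wavenumbers = np.arange(10000, 3996, -4, dtype=float)

def select_ranges(wavenumbers, ranges):
    """Boolean mask that is True inside any of the (high, low) wavenumber windows."""
    mask = np.zeros(wavenumbers.shape, dtype=bool)
    for high, low in ranges:
        mask |= (wavenumbers <= high) & (wavenumbers >= low)
    return mask

# Windows quoted in the text for lignin and holocellulose in decayed wood.
windows = [(6800, 5800), (5000, 4050)]
mask = select_ranges(wavenumbers, windows)

# Placeholder spectra (random numbers), one row per sample.
spectra = np.random.default_rng(0).normal(size=(5, wavenumbers.size))
reduced = spectra[:, mask]            # columns restricted to the selected windows
print(spectra.shape, "->", reduced.shape)
```

The reduced matrix can then be passed to the calibration algorithm in place of the full spectra.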

In contrast, mathematical selection methods based on various optimization algorithms can substantially eliminate irrelevant variables while improving the performance of the model, which makes it easier to apply NIR on portable or in-line instruments [27–29]. Li et al. successfully identified the most significant NIR variables for predicting the extractive content of eucalyptus heartwood using a significant multivariate correlation (sMC) algorithm [30,31]. Yu et al. combined uninformative variable elimination (UVE) with the successive projections algorithm (SPA) to simplify PLS models for the modulus of elasticity of Fraxinus mandschurica [32]. By comparing several variable selection strategies, Li et al. found that competitive adaptive reweighted sampling (CARS) was an efficient variable optimization strategy for enhancing the predictive performance of NIR models that estimate the chemical composition and theoretical ethanol yield of bioenergy sorghum [33]. However, few studies have compared variable selection algorithms for optimizing NIR models for the quantitative prediction of multispecies wood properties, especially for pulp and paper feedstock. Therefore, the present research used near infrared spectroscopy to rapidly predict the holocellulose and acid-insoluble lignin content of various hardwood species, including poplars, eucalyptus and acacias. Four variable selection methods based on different optimization strategies were compared for improving the predictive performance of PLS calibrations: competitive adaptive reweighted sampling (CARS), Monte Carlo-uninformative variable elimination (MC-UVE), successive projections algorithm (SPA), and genetic algorithm (GA). The main objective of this study was to propose an efficient and stable calibration model, constructed solely from relevant informative variables, for fast quantitative chemical analysis of wood properties.
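
To make the idea of mathematical variable selection concrete, the following is a simplified sketch of one of the four methods compared here, MC-UVE. PLS models are fitted to many random subsets of the calibration samples, and each variable is scored by the stability of its regression coefficient (mean divided by standard deviation across runs). The data are synthetic, the helper names are hypothetical, and the fixed number of retained variables (`n_keep`) is an assumption; in practice the cut-off is usually tuned, for example by cross validation. It is not the authors' code.

```python
import numpy as np

def pls1_coef(X, y, n_components):
    """Regression coefficients of a NIPALS PLS1 model fitted to mean-centred data."""
    Xk, yk = X - X.mean(axis=0), y - y.mean()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)
        t = Xk @ w
        p = Xk.T @ t / (t @ t)
        c = (yk @ t) / (t @ t)
        Xk, yk = Xk - np.outer(t, p), yk - c * t
        W.append(w); P.append(p); q.append(c)
    W, P = np.array(W).T, np.array(P).T
    return W @ np.linalg.solve(P.T @ W, np.array(q))

def mc_uve_stability(X, y, n_components=2, n_runs=200, frac=0.8, seed=0):
    """Stability of each variable: mean / std of its PLS coefficient over Monte Carlo runs."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    k = int(frac * n)
    coefs = np.empty((n_runs, X.shape[1]))
    for r in range(n_runs):
        idx = rng.choice(n, size=k, replace=False)   # random calibration subset
        coefs[r] = pls1_coef(X[idx], y[idx], n_components)
    return coefs.mean(axis=0) / coefs.std(axis=0)

# Synthetic spectra (assumed): an analyte band near 0.4 and an interfering band near 0.7.
rng = np.random.default_rng(1)
n_samples, n_vars = 60, 200
axis = np.linspace(0.0, 1.0, n_vars)
band = np.exp(-((axis - 0.4) / 0.05) ** 2)
interferent = np.exp(-((axis - 0.7) / 0.10) ** 2)
y = rng.uniform(18.0, 30.0, n_samples)
z = rng.uniform(0.0, 1.0, n_samples)
X = (0.01 * np.outer(y, band) + 0.2 * np.outer(z, interferent)
     + rng.normal(0.0, 0.002, (n_samples, n_vars)))

stability = mc_uve_stability(X, y)
n_keep = 10                                          # assumed cut-off, not tuned
selected = np.sort(np.argsort(np.abs(stability))[-n_keep:])
print("selected variable positions:", axis[selected].round(3))
```

CARS follows a related Monte Carlo scheme but shrinks the variable set step by step according to the absolute size of the PLS coefficients and keeps the subset with the lowest cross-validation error.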

Section snippets

Materials

In this study, all wood samples were obtained from actual manufacturing processes. Wood chips of eucalyptus and acacia were supplied by a pulp and paper mill in southern China (Gold East Paper Co. Ltd., Zhenjiang City, Jiangsu Province). To extend the diversity of wood properties, sampling was performed at different sites in the wood chip stacks, and approximately 500 g of chips was collected as a sample at each site. Finally, 43 eucalyptus chip samples and 43 acacia chip

Diversity of chemical composition content of wood feedstock

The diversity in chemical composition within the same wood species is mainly due to differences in cultivation background, growth status, transportation, and storage conditions. Both the eucalyptus and the acacia samples were mixtures of sapwood and heartwood from multiple regions of origin, and therefore displayed broad distributions of holocellulose and lignin content (Fig. 2(a), (b)). Poplar samples displayed relatively narrow chemical composition distributions because they originated from the same logging

Conclusion

This work demonstrated the utility of near infrared spectroscopy coupled with partial least squares regression for predicting the holocellulose and lignin content of multispecies pulp feedstock including poplar, eucalyptus and acacia. Proper spectral pre-processing and variable selection were crucial for obtaining accurate and robust models. MSC + 2ndDer pre-processing can efficiently resolve undesirable scatter effects and overlapping peaks. However, different variable selection strategies exhibited different
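
The MSC + 2ndDer pre-processing named above can be sketched as follows, taking MSC to mean multiplicative scatter correction, as the abbreviation is commonly expanded. The excerpt does not say how the second derivative was computed, so a plain finite-difference second derivative is used here as a stand-in for the smoothed (for example Savitzky-Golay) derivative that is common in practice. The spectra are synthetic and noise-free, and the function names are hypothetical.

```python
import numpy as np

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference (default: the mean spectrum)."""
    ref = spectra.mean(axis=0) if reference is None else reference
    corrected = np.empty_like(spectra, dtype=float)
    for i, s in enumerate(spectra):
        slope, intercept = np.polyfit(ref, s, 1)     # model: s ≈ intercept + slope * ref
        corrected[i] = (s - intercept) / slope       # remove additive and multiplicative scatter
    return corrected

def second_derivative(spectra, step=1.0):
    """Finite-difference second derivative along the spectral axis (two end points are lost)."""
    return np.diff(spectra, n=2, axis=1) / step ** 2

# Synthetic demonstration (assumed): one underlying spectrum distorted by
# sample-specific offsets and multiplicative factors.
axis = np.linspace(0.0, 1.0, 200)
base = np.exp(-((axis - 0.4) / 0.05) ** 2) + 0.5 * np.exp(-((axis - 0.7) / 0.10) ** 2)
rng = np.random.default_rng(0)
offsets = rng.uniform(-0.1, 0.1, size=8)
gains = rng.uniform(0.8, 1.2, size=8)
spectra = offsets[:, None] + gains[:, None] * base

corrected = msc(spectra)
derivs = second_derivative(corrected, step=axis[1] - axis[0])
print("spread before MSC:", spectra.std(axis=0).max())
print("spread after MSC: ", corrected.std(axis=0).max())
```

Because the simulated distortion is purely an offset and a gain, MSC collapses all spectra onto the mean spectrum; real spectra also carry chemical differences, which are retained.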

Declaration of competing interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Acknowledgements

This work was financially supported by the National Key Research and Development Program of China: High Efficiency Clean Pulping and Functional Product Production Technology Research (Grant Number: 2017YFD0601005).

References (50)

  • Å. Rinnan et al., Review of the most common pre-processing techniques for near-infrared spectra, TrAC Trends Anal. Chem. (2009)
  • R.K.H. Galvao et al., A method for calibration and validation subset partitioning, Talanta (2005)
  • T. Mehmood et al., A review of variable selection methods in partial least squares regression, Chemometr. Intell. Lab. (2012)
  • H. Li et al., Key wavelengths screening using competitive adaptive reweighted sampling method for multivariate calibration, Anal. Chim. Acta (2009)
  • W. Cai et al., A variable selection method based on uninformative variable elimination for multivariate calibration of near-infrared spectra, Chemometr. Intell. Lab. (2008)
  • H. Jiang et al., Comparison of algorithms for wavelength variables selection from near-infrared (NIR) spectra for quantitative monitoring of yeast (Saccharomyces cerevisiae) cultivations, Spectrochim. Acta, Part A (2019)
  • Z. Xiaobo et al., Variables selection methods in near-infrared spectroscopy, Anal. Chim. Acta (2010)
  • R. Leardi et al., Genetic algorithms applied to feature selection in PLS regression: how and when to use them, Chemometr. Intell. Lab. (1998)
  • S. Tsuchikawa et al., A review of recent near-infrared research for wood and paper (part 2), Appl. Spectrosc. Rev. (2013)
  • W. He et al., Rapid prediction of different wood species extractives and lignin content using near infrared spectroscopy, J. Wood Chem. Technol. (2013)
  • M.N. Uddin et al., Method for predicting lignocellulose components in jute by transformed FT-NIR spectroscopic data and chemometrics, Nordic Pulp & Paper Research Journal (2019)
  • L. Karlinasari et al., Near infrared (NIR) spectroscopy for estimating the chemical composition of (Acacia mangium Willd.) wood, J. Indian Acad. Wood Sci. (2014)
  • S. Park et al., Rapid prediction of the chemical information of wood powder from softwood species using near-infrared spectroscopy, BioResources (2018)
  • T. Wu et al., Analysis of mixed pulping raw materials of Eucalyptus globulus and Acacia mangium by near infrared spectroscopy technique combined with lasso algorithm, BioResources (2018)
  • T. Baldin et al., Evaluation of alternative sample preparation methods for development of NIR models to assess chemical properties of wood, BioResources (2018)