Application of near- and mid-infrared spectroscopy combined with chemometrics for discrimination and authentication of herbal products: A review

Abdul Rohman1*, Anjar Windarsih1, M. A. Motalib Hossain2, Mohd Rafie Johan2, Md. Eaqub Ali2, Nurrulhidayah Ahmad Fadzilah3
1Faculty of Pharmacy, Universitas Gadjah Mada, Yogyakarta, Indonesia.
2Nanotechnology and Catalysis Research Centre (NANOCAT), University of Malaya, Kuala Lumpur, Malaysia.
3International Institute for Halal Research and Training, International Islamic University Malaysia, Jalan Gombak, Kuala Lumpur, Malaysia.


INTRODUCTION
Herbal medicine is the most widely used form of complementary and alternative medicine throughout the world (Joos et al., 2012). An herb is a plant, or part of a plant, used for medicinal and therapeutic applications. Herbal medicines typically consist of plants or plant extracts containing active constituents, which frequently work synergistically (Folashade et al., 2012). Chemical constituents with medicinal benefits are referred to as active ingredients or active principles, such as the curcuminoids present in Curcuma species. The presence and levels of active components depend on several factors, including plant species, time and season of harvesting, soil type, and other environmental conditions (Heinrich, 2015). Currently, over 80% of the world population uses herbal medicines as preventive and promotive agents, in both developing and developed countries (Barnes, 2003). As a consequence, the increased use of herbal products has also led to their adulteration and abuse, which disappoints consumers and producers and, in some instances, can cause health problems (Bodeker et al., 2005).
The discrimination and authenticity of herbal products are emerging issues (Georgiev et al., 1999), especially in countries that have developed alternative medicines as primary health care, such as China, India, Germany, and Indonesia (Liang et al., 2004). Herbal authentication is mainly concerned with improper labeling and economic adulteration. Motivated by profit, high-quality herbal medicines may be adulterated with cheaper, lower-quality herbals to defraud the consumer. Adulteration also involves the substitution, in part or in whole, of expensive herbal components with cheaper and inferior herbal products. An authentic herbal product can be defined as one that complies with the description or labeling provided by the producer, including plant composition, geographic region of origin, and the variety or species of the ingredients (Jordan et al., 2010). Han et al. (2016) reported adulteration practices in Chinese herbal medicines. Using a DNA barcode database of traditional Chinese medicine (TCM), 1,436 samples representing 295 medicinal species from seven primary TCM markets in China were investigated. Of the 1,260 samples, approximately 4.2% were identified as adulterated. Herbal components such as Ginseng Radix et Rhizoma, Radix Rubi Parvifolii, Dalbergiae odoriferae Lignum, Acori Tatarinowii Rhizoma, Inulae Flos, Lonicerae Japonicae Flos, Acanthopanacis Cortex, and Bupleuri Radix are among the targets of adulteration. The survey also reported that adulterants were present in the Chinese market. To assure the quality of labeled herbal medicines, it is essential to establish methods for verifying their authenticity, either by checking the composition of herbal ingredients or by monitoring batch-to-batch reproducibility (Kulkarni et al., 2014).

DISCRIMINATION AND AUTHENTICATION TESTING
Several methods have been used for the identification, discrimination, and authentication of herbal ingredients, including macroscopic and microscopic evaluation, simple wet chemical tests such as color or precipitation reactions, and sophisticated instrumental techniques such as spectrophotometry, real-time polymerase chain reaction, and chromatography (especially thin layer chromatography, gas chromatography, and high-performance liquid chromatography) (Kamboj, 2012). In recent years, a new strategy for herbal medicine discrimination and authentication has been introduced, applying analysis of targeted compounds (the classical approach) and non-targeted analysis by fingerprint profiling and metabolomics (Riedl et al., 2015). In the classical approach, authenticity testing is based on the analysis of specific marker compounds that are indicative of certain herbal products. For example, curcuminoid content determined using a validated analytical method can be used to identify Curcuma species. Mixing other herbal materials into Curcuma longa decreases the curcuminoid content, which can therefore serve as an indicator of adulteration (Lestari et al., 2017). Targeted analysis has its own advantages and disadvantages. Its advantages are high sensitivity, high selectivity, and simple data analysis. Its disadvantages are that considerable effort must be spent on validating the analytical method, that it is time-consuming and typically involves extensive sample preparation and multiple analyses, and that only known compounds can be detected (Esslinger et al., 2014). Chromatography-based methods are ideal for the analysis of targeted compounds because of their capability to separate the analytes present in herbal medicines (Daszykowski and Walczak, 2006).
In fingerprinting approaches, many compounds or features are used for authentication by investigating response profiles and comparing them between authentic and adulterated herbal medicines. Spectroscopic methods, including near- and mid-infrared and nuclear magnetic resonance (NMR) spectroscopy, and chromatographic methods are commonly used for profiling patterns of authentic and adulterated herbal medicines (Rafi et al., 2015). Metabolomics is described as the study of small molecules and metabolites based on comprehensive chemical analysis, with the aim of detecting as many substances as possible. NMR spectroscopy and chromatographic techniques coupled with mass spectrometric detection, such as gas chromatography-mass spectrometry (GC-MS) and liquid chromatography-mass spectrometry (LC-MS), are widely used in metabolomics studies (Markley et al., 2017; Savorani et al., 2013). Non-targeted analyses offer high throughput, simple or no sample preparation, and the detection of unexpected additives or deviations; their main disadvantage is the need for sample databases and multivariate modeling (Esslinger et al., 2014).
The fingerprint and metabolomics approaches have been applied to the quality control of herbal medicines. In herbal authentication, fingerprinting and metabolomics represent the characteristic patterns of the components as obtained with a given analytical method (Razmovski-Naumovski et al., 2010). The data generated by non-targeted approaches (fingerprint profiling and metabolomics) are usually large, complex, and difficult to interpret; therefore, advanced statistical evaluation known as chemometrics is used (Small, 2006). Owing to their fingerprint nature, which allows fingerprint profiles of herbal components to be built, near-IR (NIR) and mid-IR (MIR) spectroscopies have been widely used for the discrimination and authentication of herbal components (Rohman et al., 2014).

INFRARED SPECTROSCOPY
Infrared (IR) spectroscopy is based on the interaction between electromagnetic radiation in the infrared region and the sample. It measures the vibrational energy levels of a compound, and each chemical bond has characteristic vibrational energy levels (Bunaciu et al., 2015). The IR region is divided into three parts, namely, NIR at wavenumbers of 14,000-4,000 cm−1 (wavelengths of about 714-2,500 nm), MIR at 4,000-400 cm−1 (2,500-25,000 nm), and far IR at 400-50 cm−1 (25-200 µm). NIR and MIR spectroscopies are commonly used for confirmation (identification) and for qualitative and quantitative analyses of herbal medicines and pharmaceutical products, and they offer an alternative to wet-chemical techniques (Lohumi et al., 2015).
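Because the IR regions are quoted both in wavenumbers and in wavelengths, it can help to convert between the two explicitly. The sketch below is illustrative only; the function names are hypothetical, and it relies solely on the relation wavelength (nm) = 10^7 / wavenumber (cm−1).

```python
def wavenumber_to_wavelength_nm(wavenumber_cm1: float) -> float:
    """Convert a wavenumber (cm^-1) to a wavelength (nm)."""
    if wavenumber_cm1 <= 0:
        raise ValueError("wavenumber must be positive")
    # 1 cm = 1e7 nm, so wavelength[nm] = 1e7 / wavenumber[cm^-1]
    return 1e7 / wavenumber_cm1


def wavelength_nm_to_wavenumber(wavelength_nm: float) -> float:
    """Convert a wavelength (nm) to a wavenumber (cm^-1)."""
    if wavelength_nm <= 0:
        raise ValueError("wavelength must be positive")
    return 1e7 / wavelength_nm


# Boundary between the NIR and MIR regions, and the lower MIR limit
nir_mir_boundary_nm = wavenumber_to_wavelength_nm(4000)  # 2500.0 nm
mir_lower_limit_nm = wavenumber_to_wavelength_nm(400)    # 25000.0 nm
print(nir_mir_boundary_nm, mir_lower_limit_nm)
```

The same helper can be used to check any spectral window reported in wavenumbers against an instrument specification given in nanometres.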
NIR and MIR spectroscopies have gained popularity in identification and authentication analysis owing to several advantages: they are rapid, with fast spectral acquisition; they are non-destructive, so samples analyzed by IR spectroscopy can subsequently be analyzed by other instrumental techniques such as chromatography; they are low in cost; they require minimal sample preparation; and they can be applied to liquid, semi-solid, and solid samples. The fast-growing application of IR spectroscopy has also been supported by advances in computer and software technology, especially in chemometrics (Mazivila and Olivieri, 2018). However, IR spectroscopy also has disadvantages arising from the nature of the spectra and from environmental conditions. The physical state of the samples and the testing environment can influence IR spectra, which complicates their interpretation. In addition, the spectra obtained are frequently complex, which makes the extraction of relevant information difficult. Fortunately, chemometrics can help to overcome these problems. IR spectroscopy and chemometrics are complementary and have been widely applied for authentication in several fields, including pharmaceutical products, namely, drugs (Mazivila and Olivieri, 2018), cosmetics (Rohman and Man, 2009), and herbal products (Rohman et al., 2014), as well as the food sector, including fats and oils (Baeten and Aparicio, 2000), honey (Mehryar and Esmaiili, 2011), and meat (Schmutzler et al., 2015).

CHEMOMETRICS
Chemometrics is defined as the application of mathematics and statistics to treat chemical data (Gemperline, 2006) and is considered part of analytical chemistry. Chemometrics can help analytical chemists with all steps of an analytical procedure, from experimental design and optimization through the extraction of chemical information to the final decision (Daszykowski and Walczak, 2006). Chemical data typically include properties and values of numerous compounds, as determined by instrumental methods, and have various sources of variance. Accordingly, the statistical evaluation of such data should use one or more multivariate statistical methods (chemometrics). Multivariate statistics allows the simultaneous analysis of several independent variables (factors) against several dependent variables or responses (Granato et al., 2018).
The most common types of chemometrics used in NIR and MIR spectroscopies for discrimination and authentication of herbal medicines include:
• Spectral processing using derivatization (first and second derivatives), standard normal variate (SNV), multiplicative scatter correction (MSC), filtering, wavelet transformations (WT), feature selection, and folding-unfolding. The main objectives of data pre-processing are to improve the accuracy and robustness of subsequent classification or quantitative analyses, to improve the interpretability of the data by transforming raw data into an understandable format, to detect and remove outliers, and to reduce the dimensionality of the data (Lasch, 2012). The first and second derivatives can eliminate baseline variations among samples and enhance small spectral differences. MSC corrects light scattering as well as additive and multiplicative spectral effects. SNV is a mathematical transformation normally applied to log(1/R) spectra to minimize slope variation and to correct for scatter effects. WT enables an infrared spectrum to be analyzed as the sum of wavelet functions with different spatial and frequency properties (Lai et al., 2011).
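The pre-processing steps named above can be sketched in a few lines of NumPy. This is a minimal illustration on synthetic spectra, not the procedure of any cited study: the function names are hypothetical, and the derivative uses simple finite differences (`numpy.gradient`) as a stand-in for the smoothed derivatives normally used in practice.

```python
import numpy as np


def reflectance_to_absorbance(reflectance):
    """Convert reflectance R to apparent absorbance log10(1/R)."""
    return np.log10(1.0 / np.asarray(reflectance, dtype=float))


def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row)."""
    X = np.asarray(spectra, dtype=float)
    return (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, ddof=1, keepdims=True)


def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference (default: mean spectrum)."""
    X = np.asarray(spectra, dtype=float)
    ref = X.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(X)
    for i, row in enumerate(X):
        slope, intercept = np.polyfit(ref, row, deg=1)  # row ~ slope * ref + intercept
        corrected[i] = (row - intercept) / slope
    return corrected


def derivative(spectra, order=1):
    """Finite-difference derivative along the spectral axis."""
    X = np.asarray(spectra, dtype=float)
    for _ in range(order):
        X = np.gradient(X, axis=1)
    return X


# Synthetic example: one band distorted by multiplicative and additive effects
x = np.linspace(0.0, 1.0, 200)
band = np.exp(-0.5 * ((x - 0.5) / 0.05) ** 2)
gains = np.array([0.8, 1.0, 1.3])
offsets = np.array([0.05, 0.0, 0.2])
raw = gains[:, None] * band[None, :] + offsets[:, None]

snv_spectra = snv(raw)
msc_spectra = msc(raw)
d1_spectra = derivative(raw, order=1)
```

In this idealized case, SNV and MSC both collapse the three distorted spectra onto a single curve, while the first derivative removes the additive offsets but leaves the multiplicative gains in place.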
Figure 1 shows the general steps in applying chemometric methods to NIR and MIR spectral data for the discrimination and authentication of herbal medicine products, including spectral treatment, classification, and quantification (Nunes, 2014).
Currently, a number of user-friendly chemometrics software packages, either free or commercial, are available for the statistical treatment of complex data. Each package has its own advantages and features. Unscrambler®, SIMCA®, SIRIUS®, and Pirouette® offer standard methods of multivariate statistical analysis, such as classification with principal component analysis (PCA) and soft independent modeling of class analogy (SIMCA) and multivariate calibration with principal component regression (PCR), partial least squares (PLS), and stepwise multiple linear regression (SMLR), but they have limited capacity for writing personal programs. On the other hand, Minitab® and Matlab® are designed to facilitate the writing of personal programs, while Grams® 32 is particularly useful for calibration modeling in quantitative analysis rather than for exploring a data matrix and for classification by different pattern recognition techniques (Gad et al., 2013; Rodionova and Pomerantsev, 2006). Chemoface, a free and user-friendly software interface, is also currently used for chemometric analysis (Nunes et al., 2012).

AUTHENTICATION OF HERBAL MEDICINE USING NEAR INFRARED SPECTROSCOPY
Near-infrared (NIR) spectroscopy is a fast and nondestructive analytical technique that provides chemical and physical information about samples (Roggo et al., 2007). The combination of NIR spectroscopy and multivariate data analysis offers many interesting perspectives for both qualitative and quantitative analyses, including the authentication of herbal medicines (Reich, 2005). The American Society for Testing and Materials has defined NIR spectroscopy as the interaction between samples and electromagnetic radiation in the NIR region, corresponding to wavelengths of 780-2,526 nm, located between the visible and MIR regions (Wang and Yu, 2015). The most prominent NIR absorption peaks originate from overtones and combinations of the fundamental vibrations that appear in the MIR region and are related to functional groups containing bonds to hydrogen, such as -O-H, -S-H, -C-H, and -N-H (Jamrógiewicz, 2012). Table 1 shows the application of NIR spectroscopy for authentication of herbal medicines.
Discrimination and spectral fingerprinting of Wolfiporia cocos (F.A. Wolf) Ryvarden & Gilb, a traditional Chinese medicine, have been performed with NIR spectroscopy and PCA. The identification and discrimination of W. cocos based on its geographical origin is one of the prerequisites for its worldwide acceptance because the therapeutic effect of W. cocos depends on its chemical components. The active compounds in W. cocos differ with geographical origin; hence, authentication based on geographical origin is essential. Powdered samples were scanned with an NIR spectrophotometer at wavenumbers of 10,000-4,000 cm−1 using 64 scans and a resolution of 4 cm−1. The NIR spectra of W. cocos were pre-treated successively with the Norris filter, mean centering, standardization, and the second derivative. After optimization of the spectral treatment, the wavenumbers of 7,501.74-4,088.35 cm−1 were finally used for discrimination and classification. PCA using these wavenumbers could discriminate between the poriae cutis and the inner part of the sclerotia of W. cocos in the PCA pattern space (Yuan et al., 2018).
Authentication of Picea abies L. Karst seed from five seed lots in Sweden, Finland, Poland, and Lithuania has been carried out using NIR reflectance spectra at wavelengths of 780-2,500 nm with a resolution of 0.5 nm (Farhadi et al., 2017). The classification model was built using OPLS-DA. Model performance was validated using two test sets, namely, internal validation with 600 seeds and external validation with 1,158 seeds. In internal validation, the test seeds came from the same seed lots used for modeling, whereas in external validation, they came from seed lots excluded from modeling. In internal validation, the OPLS-DA model correctly classified 99% of the Swedish, Finnish, and Polish seeds and 97% of the Lithuanian seeds. In external validation, it correctly assigned 81%, 96%, 98%, and 93% of the Swedish, Finnish, Lithuanian, and Polish seeds, respectively, to their classes. The mean classification accuracy was 99% and 95% for the internal and external test sets, respectively.
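The way PCA separates groups of spectra in its score space can be illustrated as follows. This is a minimal sketch on synthetic data, assuming two groups that differ in one absorption band; it is not the published workflow, and the derivative and smoothing steps described above are omitted for brevity.

```python
import numpy as np


def pca(spectra, n_components=2):
    """PCA of mean-centred spectra via singular value decomposition."""
    X = np.asarray(spectra, dtype=float)
    centred = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(centred, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vt[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, loadings, explained[:n_components]


rng = np.random.default_rng(0)
axis = np.arange(200)


def gauss(center, width=10.0):
    """Synthetic absorption band on the hypothetical spectral axis."""
    return np.exp(-0.5 * ((axis - center) / width) ** 2)


# Two hypothetical sample groups: group B carries an additional band
group_a = gauss(60) + rng.normal(0, 0.01, (10, axis.size))
group_b = gauss(60) + 0.3 * gauss(140) + rng.normal(0, 0.01, (10, axis.size))
spectra = np.vstack([group_a, group_b])

scores, loadings, explained = pca(spectra, n_components=2)
print("Variance explained by PC1:", round(float(explained[0]), 3))
```

Plotting the first column of `scores` against the second gives the familiar score plot, in which the two groups fall into separate clusters along PC1.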

Echinacea is one of the most popular herbs used in dietary supplements and commands a high market price. It has immune-stimulatory and anti-inflammatory properties and is used especially to alleviate cold symptoms (Tharun et al., 2017); therefore, the authentication of this plant is essential. Several parts of the plant are used in the manufacture of dietary supplement products, and NIR spectroscopy has been used to identify plant parts in order to comply with current good manufacturing practices (cGMPs). The differentiation of Echinacea angustifolia root, E. purpurea root, E. purpurea tops, and E. purpurea seed from various sources and E. pallida root from a single source, as well as adulterated E. angustifolia root, was performed using NIR spectroscopy in combination with SIMCA (Neal-Kababick and Flora, 2010). Powdered samples (40 mesh) were placed in scintillation vials and scanned in reflectance mode at wavenumbers of 12,000-4,000 cm−1 using 25 scans at a resolution of 8 cm−1. The NIR spectra were subjected to noise weighting, MSC normalization, and spectral derivatization before being analyzed using the SIMCA algorithm. The SIMCA model was able to properly identify authentic and adulterated Echinacea materials. SIMCA using absorbance values of NIR spectra as variables was also successfully used for the classification of 48 herbal samples commonly used in the food and pharmaceutical industries (Yang et al., 2013).
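The idea behind SIMCA, in which each class is described by its own PCA model and a new sample is accepted only if it fits that model, can be sketched as follows. This is a simplified illustration on synthetic spectra: the class name is hypothetical, and the acceptance limit (a multiple of the largest training residual) is an assumed heuristic, whereas full SIMCA implementations use statistically derived critical distances.

```python
import numpy as np


class SimcaClassModel:
    """PCA model of one class with a residual-distance acceptance limit."""

    def __init__(self, spectra, n_components=2, limit_factor=3.0):
        X = np.asarray(spectra, dtype=float)
        self.mean = X.mean(axis=0)
        _, _, vt = np.linalg.svd(X - self.mean, full_matrices=False)
        self.loadings = vt[:n_components]
        # Assumed heuristic: accept residuals up to limit_factor times
        # the largest residual seen in the training set.
        self.limit = limit_factor * self.residual(X).max()

    def residual(self, spectra):
        """Distance of each spectrum from the class PCA subspace."""
        centred = np.atleast_2d(np.asarray(spectra, dtype=float)) - self.mean
        reconstructed = centred @ self.loadings.T @ self.loadings
        return np.sqrt(((centred - reconstructed) ** 2).sum(axis=1))

    def accepts(self, spectra):
        return self.residual(spectra) <= self.limit


rng = np.random.default_rng(1)
axis = np.arange(200)


def band(center, width=8.0):
    return np.exp(-0.5 * ((axis - center) / width) ** 2)


def authentic(n):
    """Synthetic authentic spectra: two bands with varying intensity."""
    a = rng.uniform(0.8, 1.2, (n, 1))
    b = rng.uniform(0.4, 0.6, (n, 1))
    return a * band(50) + b * band(100) + rng.normal(0, 0.005, (n, axis.size))


model = SimcaClassModel(authentic(20), n_components=2)
genuine = authentic(5)
adulterated = genuine + 0.2 * band(150)  # hypothetical adulterant band

print(model.accepts(genuine), model.accepts(adulterated))
```

With one such model per authentic material, a sample rejected by every model is flagged as unknown or adulterated rather than being forced into the nearest class.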
Table 1 (continued)

Spectral acquisition: The reflectance spectral data were converted to absorbance using log(1/R) (R = reflectance).
Chemometrics: PCA and OPLS-DA for discrimination and classification.
Results: PCA exhibited distinguishable clusters between P. reniforme and P. sidoides, while OPLS-DA showed distinct groupings of P. reniforme and P. sidoides using seven main absorption peaks, which contain putative biomarkers responsible for the discrimination of the two species.

Herbal medicine: Hibiscus mutabilis L. and Berberidis radix
Issue: Rapid recognition of H. mutabilis and Berberidis radix through fingerprinting patterns.
Spectral acquisition: NIR spectra were collected in the region of 10,000-4,000 cm−1 with 60 scans and a resolution of 8 cm−1.
Chemometrics: PLS-DA, PCA, and LDA for classification.
Results: The PLS-DA model showed good classification of samples according to different collection parts, collection time, and different origins or various species belonging to the same genera of herbal medicines.
NIR spectroscopy combined with PLS-DA has been used for the fast identification of three varieties of Chrysanthemum, namely, Dabaiju, Huju, and Xiaobaiju (Chen et al., 2014). A total of 139 Chrysanthemum samples were analyzed and divided randomly into a calibration set (92 samples) and a prediction set (47 samples). The NIR diffuse reflectance spectra of the Chrysanthemum varieties were preprocessed using a first-order derivative (D1) and auto-scaling and then subjected to PLS-DA. Using absorbance values at wavenumbers of 10,000-4,000 cm−1, PLS-DA could differentiate the three varieties, with accuracy rates for Dabaiju, Huju, and Xiaobaiju of 97.60%, 96.65%, and 94.70%, respectively, in the calibration set and 95.16%, 86.11%, and 93.46%, respectively, in the validation (prediction) set.
NIR spectroscopy coupled with PLS multivariate calibration was used as a rapid method to discriminate Paeoniae Radix (dried root of Paeonia lactiflora Pallas, family Paeoniaceae) from the cultivation areas of Hangshao, Boshao, and Chuanshao, representing different geographical origins in China. The different levels of active components (paeoniflorin, albiflorin, and benzoylalbiflorin) in Paeoniae Radix, as determined by HPLC with UV detection, contribute to this discrimination. The NIR spectra of the samples were acquired at 10,000-4,000 cm−1 using 32 scans at a resolution of 16 cm−1 and recorded as absorbance, using air as the background. The diffuse reflectance NIR spectra were subjected to several treatments, including MSC, first derivative, and Savitzky-Golay smoothing, to correct the scattering effect and eliminate the baseline shift, giving a good correlation between the results obtained with NIR spectroscopy and those obtained using HPLC-UV. PCA could successfully classify Paeoniae Radix according to cultivation area using the contents of paeoniflorin, albiflorin, and benzoylalbiflorin as variables (Luo et al., 2008).

AUTHENTICATION OF HERBAL MEDICINES USING MIR SPECTROSCOPY
Among the infrared regions, MIR is the most commonly used for analytical purposes because of its fingerprint properties. The interaction between electromagnetic radiation in the MIR region and molecules can be analyzed in three different ways: emission, absorption, and reflection (Türker-Kaya and Huck, 2017). This interaction causes chemical bonds to vibrate at specific wavenumbers (frequencies), which depend on the mass of the constituent atoms, the shape of the molecule, and the stiffness of the bonds, according to Hooke's law (Baeten and Dardenne, 2002). The MIR region lies at wavenumbers of 4,000-400 cm−1 and can be segmented into four regions, namely, 4,000-2,500 cm−1 (X-H stretching vibrations), 2,500-2,000 cm−1 (the triple bond region), 2,000-1,500 cm−1 (the double bond region), and 1,500-400 cm−1 (the fingerprint region) (Karoui et al., 2010). The main advantages of MIR spectroscopy for the discrimination and authentication of herbal medicines lie in sample preparation. Herbal medicine samples can be tested rapidly and directly to obtain an MIR spectrum because they need not be separated or extracted, and the sample preparation procedure is nondestructive. The MIR spectral fingerprint also contains the "whole" chemical information on all components present in the herbal medicine (Sun et al., 2010). Table 2 lists the applications of MIR spectroscopy in combination with chemometrics for authentication of herbal medicines.
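The Hooke's law relationship can be made concrete with the harmonic oscillator expression for a diatomic bond, wavenumber = (1 / 2πc) · sqrt(k / μ), where k is the bond force constant and μ the reduced mass. The sketch below is illustrative: the function names are hypothetical, and the force constant used for carbon monoxide (about 1,857 N/m) is an assumed textbook-style value, not a figure from the studies cited here.

```python
import math

SPEED_OF_LIGHT_CM_S = 2.99792458e10  # speed of light in cm/s
AMU_KG = 1.66053907e-27              # atomic mass unit in kg


def harmonic_wavenumber(force_constant_n_per_m, mass1_amu, mass2_amu):
    """Harmonic stretching wavenumber (cm^-1) of a diatomic bond."""
    reduced_mass = (mass1_amu * mass2_amu) / (mass1_amu + mass2_amu) * AMU_KG
    angular_frequency = math.sqrt(force_constant_n_per_m / reduced_mass)
    return angular_frequency / (2.0 * math.pi * SPEED_OF_LIGHT_CM_S)


def mir_region(wavenumber_cm1):
    """Assign a wavenumber to one of the four MIR regions described above."""
    if 2500 <= wavenumber_cm1 <= 4000:
        return "X-H stretching region"
    if 2000 <= wavenumber_cm1 < 2500:
        return "triple bond region"
    if 1500 <= wavenumber_cm1 < 2000:
        return "double bond region"
    if 400 <= wavenumber_cm1 < 1500:
        return "fingerprint region"
    return "outside the MIR region"


# Carbon monoxide with an assumed force constant of 1857 N/m
co_stretch = harmonic_wavenumber(1857.0, 12.000, 15.995)
print(round(co_stretch), mir_region(co_stretch))
```

A stiffer bond (larger k) or lighter atoms (smaller μ) shift the band to higher wavenumbers, which is why X-H stretches appear at the high-wavenumber end of the MIR region.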
Fourier transform (FT) MIR spectroscopy in combination with chemometrics has been developed as a rapid tool for the classification of Baccharis species from the Atlantic Forest. For this purpose, 28 specimens were collected from different locations in Brazil. The samples were analyzed using an FT-MIR spectrometer with a diffuse reflectance (DRIFT) module at 4,000-400 cm−1, with 64 scans and a resolution of 4 cm−1. The MIR spectral data were pre-processed by normalization and auto-scaling. PCA was successfully used to classify the samples into five groups, namely, B. articulata, B. trimera, B. uncinella, B. organensis, and B. aracatubaensis (Lourenço et al., 2015). FT-MIR spectroscopy combined with PCA using absorbance values in the fingerprint region (2,000-400 cm−1) as variables was also successfully used for the classification of five herbal medicines from different locations in India, namely, Arjuna (Terminalia arjuna), Aswagandha (Withania somnifera), Aawala (Emblic myrobalan), Vaasa (Adhatoda vasica), and Tulsi (Ocimum sanctum) (Singh et al., 2010).
Panax ginseng C.A. Meyer, a popular herb commonly used for medicinal purposes, has been discriminated by cultivation age (5 and 6 years) and plant part (rhizome, tap root, and lateral root) using FT-MIR spectroscopy combined with PCA and PLS-DA, while partial least squares regression (PLSR) was used to predict the ages and parts of ginseng samples based on the number of PLS components. Each FT-MIR spectrum was collected at wavenumbers of 4,000-650 cm−1 with a resolution of 4 cm−1. The FT-MIR spectra were pre-processed using various normalization methods, including area normalization, minimum-maximum normalization, and vector normalization. Leave-one-out cross-validation was used to minimize model overfitting and to estimate the predictive capability of the classification models. PLS-DA could discriminate ginseng into three parts (tap root, rhizome, and lateral root) and classify ginseng of 5- and 6-year cultivation ages. PLSR using two PLS components could predict the ages and parts of ginseng with a low root mean square error of prediction (RMSEP) of 0.161 (Lee et al., 2017).
Two-dimensional (2D) correlation MIR spectroscopy (the measurement of MIR spectral changes over time) has been used for the authentication of Lignosus spp., medicinal mushrooms used as a folk remedy, especially for clearing heat and moistening the lungs. Lignosus rhinocerotis, the most common species of Lignosus, from different origins in Malaysia was differentiated. The 2D-MIR spectra at the combined wavenumbers of 1,800-1,300 cm−1, 1,300-900 cm−1, and 900-400 cm−1 could be applied to differentiate Lignosus rhinocerotis from different origins (Choong et al., 2014). 2D-MIR spectroscopy in combination with PCA using absorbance values at 1,500-1,000 cm−1 as variables could be successfully applied to differentiate Phyllagathis praetermissa from P. rotundifolia (Tan et al., 2011).
FT-MIR spectroscopy in combination with the multivariate analyses PCA and CVA has been used to discriminate Curcuma longa (turmeric), Curcuma xanthorrhiza (java turmeric), and Zingiber cassumunar (ginger) from different regions. The rhizomes of these plants have a similar color and are widely used in herbal medicines. FTIR spectra in the MIR region (4,000-400 cm−1) were subjected to SNV and to first and second derivatives. PCA using absorbance values at wavenumbers of 2,000-400 cm−1 as variables could be used to reveal patterns, groupings, similarities, and differences among the samples. These wavenumbers were preferred because they provide valuable information attributable to the chemical compounds present in the studied samples. Using the same variables, CVA gave better discrimination than PCA. The developed method could therefore be used for the identification and discrimination of the three closely related plant species (Rohaeti et al., 2015).
The quality of herbal medicines depends on the harvesting period and the cultivation area (geographical origin). FT-MIR spectroscopy using attenuated total reflectance (ATR) has been used for the classification of Dendrobium officinale, a tonic herb commonly used in traditional Chinese medicine. MIR spectra at wavenumbers of 4,000-550 cm−1 were used as variables in classification modeling. A random forest model could classify D. officinale from different harvesting periods with accuracy levels of 94.44% and 97.92% in the calibration and validation sets, respectively (Wang et al., 2018).
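The three normalization schemes and the RMSEP figure of merit mentioned above have simple definitions, sketched here with NumPy on a synthetic spectrum. The function names are hypothetical, and the area is computed with a hand-written trapezoidal rule over absolute absorbance values, which is one of several possible conventions.

```python
import numpy as np


def area_normalize(spectra, axis_values):
    """Scale each spectrum so that its integrated absolute area equals 1."""
    X = np.asarray(spectra, dtype=float)
    dx = np.abs(np.diff(np.asarray(axis_values, dtype=float)))
    area = 0.5 * ((np.abs(X[:, 1:]) + np.abs(X[:, :-1])) * dx).sum(axis=1, keepdims=True)
    return X / area


def min_max_normalize(spectra):
    """Rescale each spectrum to the range 0 to 1."""
    X = np.asarray(spectra, dtype=float)
    lo = X.min(axis=1, keepdims=True)
    hi = X.max(axis=1, keepdims=True)
    return (X - lo) / (hi - lo)


def vector_normalize(spectra):
    """Scale each spectrum to unit Euclidean length."""
    X = np.asarray(spectra, dtype=float)
    return X / np.linalg.norm(X, axis=1, keepdims=True)


def rmsep(y_true, y_predicted):
    """Root mean square error of prediction."""
    y_true = np.asarray(y_true, dtype=float)
    y_predicted = np.asarray(y_predicted, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_predicted) ** 2)))


# Synthetic spectra on a 4,000-650 cm^-1 axis with different overall intensity
wavenumbers = np.linspace(4000, 650, 300)
band = np.exp(-0.5 * ((wavenumbers - 1650) / 40.0) ** 2) + 0.05
spectra = np.vstack([0.5 * band, 1.0 * band, 2.0 * band])

area_spectra = area_normalize(spectra, wavenumbers)
minmax_spectra = min_max_normalize(spectra)
vector_spectra = vector_normalize(spectra)
```

All three schemes remove the overall intensity differences between these spectra; they differ in which property (area, range, or vector length) is fixed, which can matter when band shapes vary between samples.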
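A random forest classification with separate calibration and validation sets can be sketched with scikit-learn as follows. The spectra are synthetic, the three classes stand in for hypothetical harvesting periods that differ in one band intensity, and the split ratio and number of trees are assumptions rather than the settings of the cited study.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
axis = np.arange(120)
band = np.exp(-0.5 * ((axis - 60) / 6.0) ** 2)

# Hypothetical band intensity for each of three harvesting periods
intensities = [0.5, 0.75, 1.0]
X = np.vstack(
    [k * band + rng.normal(0, 0.02, (30, axis.size)) for k in intensities]
)
y = np.repeat(np.arange(3), 30)

# Stratified split into calibration and validation sets
X_cal, X_val, y_cal, y_val = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

forest = RandomForestClassifier(n_estimators=200, random_state=0)
forest.fit(X_cal, y_cal)

cal_accuracy = forest.score(X_cal, y_cal)
val_accuracy = forest.score(X_val, y_val)
print(f"Calibration accuracy: {cal_accuracy:.2%}, validation accuracy: {val_accuracy:.2%}")
```

The fitted model's `feature_importances_` attribute indicates which spectral variables drive the classification, which can help relate the result back to specific absorption bands.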

CONCLUSION
NIR and MIR spectroscopies are fingerprint analytical techniques commonly used for the discrimination and authentication of herbal medicines. Coupled with chemometric pattern recognition, NIR and MIR spectra can be treated so that they are more easily interpreted when making decisions about adulteration practices. The combination of NIR and MIR spectra with chemometrics offers a rapid and powerful technique for the discrimination and authentication of herbal medicines.

Figure 1. The general steps involved in the application of chemometrics methods to treat near- and mid-infrared spectra data to assess the authentication of herbal medicine products [adapted from Gad et al. (2013)].

Table 1. The application of NIR spectroscopy for authentication of herbal medicines.

Table 2. The application of MIR spectroscopy in combination with chemometrics for authentication of herbal medicines*.
Chemometrics: K-means, HCA, PCA, and SOM.
Results: K-means, HCA, PCA, and SOM were able to discriminate two medicinal seeds, H. niger and P. harmala, from other herbal samples (Qi et al., 2017).
*See Table 1 for abbreviations used.