Chromatographic fingerprinting by comprehensive two-dimensional chromatography: Fundamentals and tools
Introduction
The terms ‘profiling’ and ‘fingerprinting’ have been adopted for metabolomics [1,2] to refer to distinct analytical approaches capable of revealing compositional differences between samples. For profiling, analytical platforms are set to provide detailed information (retention, mass spectrum, detector response, etc.) on the qualitative and/or quantitative distribution of a sample's components. Profiling can be conducted on a targeted basis [3] if analytes of interest are defined a priori and monitored across all samples. However, if the analytical process is capable of generating individual yet distinctive features for all components, the process can be conceptually extended toward a comprehensive evaluation of all detected constituents and referred to as “untargeted profiling” [4,5]. Fingerprinting, as defined by Fiehn [2], is a high-throughput process capable of unravelling compositional differences between samples without necessarily achieving accurate quantitative data or compound identifications for all individual constituents. A fingerprint provides a comprehensive set of features, ideally corresponding to all chemical constituents, and aims to extract the non-evident chemical information contained in the whole signal acquired by an instrumental analytical technique. This information mining is carried out by applying the statistical and mathematical tools of multivariate chemometric analysis. Note that chemometrics does not work magic: the information to be mined must already be embedded in the analytical signal, even if it is hidden to an observer, and the analytical methods used to obtain that signal must be specifically designed and optimized with this crucial fact in mind.
Fingerprinting methodology can be effectively performed by different approaches:
1. The fingerprint is obtained directly from the sample in its natural state, without any pre-treatment except, if applicable, dissolution.
2. The fingerprint is recorded from a particular fraction or family of compounds after a separation or fractionation step. The fingerprint is thus specific to a compound family (e.g., the volatile organic compounds).
3. The fingerprint is obtained after a chemical reaction step (e.g., derivatization), so that the initial chemical composition of the sample is altered and new compounds are produced (e.g., the fatty acid methyl esters).
A sample's fingerprint can be considered as a totally unspecific signal when the first approach is applied and a partially specific signal when the second and third approaches are employed.
In this sense, signals from spectroscopic techniques fit this definition well, and nuclear magnetic resonance (NMR), chromatography, mass spectrometry (MS), and Fourier transform infrared spectroscopy (FT-IR) spectra are in fact the most popular fingerprinting methods in metabolomics [6]. Fingerprinting and related concepts have been extended to other fields, e.g., foodomics [7], sensomics [8,9], nutrimetabolomics [10], and petroleomics [11]. With the rapid evolution of analytical techniques, more stable and informative multidimensional platforms are now readily available, offering further possibilities to develop the concept of fingerprinting.
Regarding the analytical signals recorded by each technique, there is an established nomenclature for the different working data [6], based on the instrumental signals obtained with different measuring setups, namely with different detection systems. To fully understand a signal and extract its relevant information, it is important to define the terms usually employed during data treatment: dimension, way, order, vector, matrix, cube, tensor and array. The terms dimension, way and order refer to the type of signal acquired by the analytical instrument. Each analytical signal is described by a main dimension or way, which is related to the signal intensity, and one or more complementary dimensions or ways, which characterize the position of each intensity value within the signal. The number of complementary dimensions defines the data order. A conventional chromatogram (e.g., 1D GC-FID) is an instance of a two-way signal (retention times and detector intensities) and constitutes first-order data. In the particular case of signals defined by two chromatographic dimensions with two retention times, the term 2D chromatogram is applied, which in turn is a three-way signal. The terms vector, matrix and cube name the mathematical layout in which the working data are arranged once the acquired signal is exported from the instrument. For example, a vector contains first-order data (two-way signal), a matrix contains second-order data (three-way signal), and a cube contains third-order data. The term tensor is used to name all of these collectively. Finally, the term array refers to a structure consisting of a set of tensors that holds the working data from a group of samples. Every array has one additional dimension, i.e., the ordinal number of each sample, relative to the dimensionality of each sample's data.
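The nomenclature above can be made concrete with a small sketch. It assumes NumPy as the container for exported data and uses random numbers in place of real detector signals; the shapes, the variable names, and the helper functions `data_order` and `n_ways` are illustrative choices, not part of the nomenclature in [6].

```python
import numpy as np

rng = np.random.default_rng(0)

# Vector: first-order data (two-way signal: retention time + intensity),
# e.g., a conventional 1D GC-FID chromatogram.
chromatogram_1d = rng.random(600)

# Matrix: second-order data (three-way signal: two retention times + intensity),
# e.g., a 2D chromatogram with a single-channel detector.
chromatogram_2d = rng.random((30, 20))

# Cube: third-order data, e.g., a 2D chromatogram with a spectrum at each pixel.
chromatogram_2d_ms = rng.random((30, 20, 5))


def data_order(tensor: np.ndarray) -> int:
    """Order = number of complementary (position) dimensions of one sample's data."""
    return tensor.ndim


def n_ways(tensor: np.ndarray) -> int:
    """Ways = complementary dimensions plus the main (intensity) dimension."""
    return tensor.ndim + 1


# Array: the tensors of several samples stacked together; the sample index
# adds one dimension to the dimensionality of each sample's data.
samples_array = np.stack([rng.random((30, 20)) for _ in range(4)])

print(data_order(chromatogram_1d), n_ways(chromatogram_1d))
print(data_order(chromatogram_2d), n_ways(chromatogram_2d))
print(samples_array.shape)
```

Here the intensity dimension is implicit in the stored values, which is why the number of ways is one more than the number of axes of each tensor.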
Usually the raw chromatographic signal exported from the instrument consists of several thousand intensity values and could be used as a whole to apply fingerprinting. However, the number of elements may be reduced by applying mathematical methods (e.g., resampling) or scientific-technical operations (e.g., computing peak areas). This strategy is typical of profiling. The reduction of the number of elements may reduce the dimensionality, e.g., obtaining a peak-response vector (first-order data) from a 2D chromatogram (second-order data), although this is not always applicable. A tutorial on analytical chromatographic fingerprinting is provided by Cuadros et al. [6].
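As a rough illustration of these two kinds of reduction, the sketch below resamples a 2D chromatogram by block averaging and then collapses it into a peak-response vector by summing intensities inside rectangular regions. It assumes NumPy; the function names, the block-averaging scheme, and the rectangular regions are hypothetical simplifications rather than the procedures used in any specific software.

```python
import numpy as np


def block_average(image: np.ndarray, factor_y: int, factor_x: int) -> np.ndarray:
    """Resample a 2D chromatogram by averaging non-overlapping blocks.

    Rows and columns that do not fill a complete block are discarded.
    """
    ny = image.shape[0] // factor_y * factor_y
    nx = image.shape[1] // factor_x * factor_x
    trimmed = image[:ny, :nx]
    blocks = trimmed.reshape(ny // factor_y, factor_y, nx // factor_x, factor_x)
    return blocks.mean(axis=(1, 3))


def region_responses(image: np.ndarray, regions) -> np.ndarray:
    """Reduce second-order data to a first-order peak-response vector.

    Each region is a half-open rectangle (y0, y1, x0, x1) in pixel indices;
    the response is the summed intensity inside the rectangle.
    """
    return np.array([image[y0:y1, x0:x1].sum() for (y0, y1, x0, x1) in regions])


# Toy 2D chromatogram: 4 pixels along one axis, 6 along the other.
image = np.arange(24, dtype=float).reshape(4, 6)

resampled = block_average(image, factor_y=2, factor_x=3)            # still second-order
responses = region_responses(image, [(0, 2, 0, 3), (2, 4, 3, 6)])   # first-order

print(resampled.shape, responses.shape)
```

Block averaging keeps the data order unchanged while reducing the number of elements, whereas the region sums lower the order from two to one, which corresponds to the case described above where dimensionality is reduced.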
Most multidimensional analytical (MDA) platforms provide physico-chemical discrimination of a sample's constituents by chromatographic processes, e.g., gas chromatography (GC) and liquid chromatography (LC), accompanied by spectroscopic processes, e.g., MS, to achieve suitable specificity and selectivity, thereby expanding the discrimination potential. When chromatography is conducted by comprehensively coupling two separation dimensions, as in comprehensive two-dimensional chromatography (C2DC), the analytical output requires suitable processing to enable data visualization and interpretation.
In particular, in C2DC (e.g., GC × GC, LC × LC, or SFC × SFC), two columns are serially connected and components eluting from the first-dimension (¹D) column are periodically trapped and re-injected on-line into a second-dimension (²D) column. In GC × GC, this operation is governed by a modulator, e.g., a thermal or valve-based focusing interface with a brief modulation period (PM), typically between 0.5 and 8 s. The detector, connected to the end of the ²D column, produces sequential data values that vary as a function of the identity and amount of the eluting analytes. An analog-to-digital (A/D) converter collects the signal output at a fixed frequency and in sequential order. A two-dimensional chromatogram is therefore rendered by arranging the data values from a single modulation period (or cycle) as a column of pixels (picture elements), where each pixel corresponds to a single detector event. This process is known as rasterization. Pixel columns are sequenced along the abscissa (X-axis, left-to-right) according to ¹D separation time, and the two-dimensional data are presented in a right-handed Cartesian coordinate system, where the ordinate (Y-axis, bottom-to-top) corresponds to the elapsed ²D separation time [12].
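Rasterization as described above amounts to folding the sequential detector signal into one pixel column per modulation cycle. The following minimal sketch assumes NumPy, a constant acquisition frequency, and a modulation period that spans a whole number of data points; the function name `rasterize` and its parameters are illustrative and do not correspond to any vendor software.

```python
import numpy as np


def rasterize(signal, sampling_rate_hz: float, modulation_period_s: float) -> np.ndarray:
    """Fold a sequential detector signal into a 2D chromatogram.

    Returns an image of shape (points_per_cycle, n_cycles): each column holds
    one modulation cycle (abscissa = first-dimension time) and each row is a
    position within the cycle (ordinate = second-dimension time).
    """
    points_per_cycle = int(round(sampling_rate_hz * modulation_period_s))
    if points_per_cycle < 1:
        raise ValueError("modulation period is shorter than one sampling interval")
    signal = np.asarray(signal)
    n_cycles = len(signal) // points_per_cycle
    # Drop the incomplete final cycle, if any.
    trimmed = signal[: n_cycles * points_per_cycle]
    # Reshape so each row is one cycle, then transpose so cycles become columns.
    return trimmed.reshape(n_cycles, points_per_cycle).T


# Toy signal: 25 data points acquired at 5 Hz with a 2 s modulation period,
# i.e., 10 points per cycle and 2 complete cycles.
signal = np.arange(25)
image = rasterize(signal, sampling_rate_hz=5.0, modulation_period_s=2.0)
print(image.shape)
```

To display the ordinate bottom-to-top, as in the right-handed coordinate system mentioned above, a plotting routine would need to place row 0 at the bottom (for example, `origin="lower"` in Matplotlib's `imshow`).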
2D peak patterns generated by C2DC can be treated as a sample's unique fingerprint, with detected compounds providing minutiae features to be used for effective cross-comparative analysis. The term minutiae derives from fingerprint recognition technology, exploited in forensic applications, where it refers to ridge endings and ridge bifurcations on fingertips. Automatic biometric fingerprint verification systems localize and extract a set of minutiae from inked impressions, or detailed images of human fingertips, for cross-matching with stored templates [13].
By translating the concept of biometric fingerprinting into C2DC, any process that detects, re-aligns, and compares minutiae features extracted from 2D peak patterns across a series of 2D chromatograms can be classified as fingerprinting. Moreover, because, at the processing level, the 2D chromatographic fingerprint “contains unspecific and non-evident information which should be extracted by chemometric tools” [6], such an approach can be deemed “chromatographic fingerprinting”. This is in keeping with established views that chromatographic fingerprints refer “to the entire chromatogram from a certain test material which is distinctive of its composition” and that “chromatograms provide a specific and differentiating tool, as an identity card, which could be used in order to ‘identitate’ or identify a certain material” [6]. Fig. 1 illustrates how chromatographic signals can be processed according to fingerprinting or profiling principles to achieve a high level of information. The types of features available are introduced in Section 3.
In this review, by following this conceptual track, data processing approaches and workflows that comply with the above-mentioned definition are presented, illustrated by selected applications, and critically discussed in view of their capabilities to provide higher levels of information. If 2D chromatographic signals [6], together with all their metadata, informing about components' identity and physico-chemical characteristics (retention times, detector response, spectral signatures, etc.), are subjected to chromatographic fingerprinting, the overall process achieves a truly comprehensive meaning.
Analytical platforms, dimensions of information available and fingerprinting specificity
To maximize the information achievable by 2D chromatographic fingerprinting, the analytical platform must be appropriately configured, and sample preparation, the zeroth dimension of the system [14], should be tuned to avoid biases that compromise the meaning of the investigation. Moreover, as stated by Fiehn [2], to access hidden information in metabolomics, fingerprinting should take into consideration that the resolution of the analytical devices must be high enough to handle critical information.
Data processing principles and tools
Here, our discussion of data processing focuses on feature extraction for pattern recognition (PR), but these data-analysis steps may require preprocessing such as rasterization, modulation-phase adjustment, baseline correction, retention-time alignment, and peak detection. Some recent developments in these areas are discussed here as they relate to feature extraction and analysis, but several reviews discuss methodologies in these areas more comprehensively [12,[40], [41], [42], [43], [44]
Chromatographic fingerprinting with visual images and datapoint features
Comparative visualization is a chromatographic fingerprinting approach that provides prompt and intuitive evidence of compositional differences between sample pairs. It can be classified among datapoint-feature approaches, since chromatogram pairs are compared pixel-by-pixel with or without pattern re-alignment or transformation. It has been applied to reveal differences in petrochemical applications [[93], [94], [95]], food [80,[96], [97], [98], [99]], body fluids metabolites composition [
Challenging scenarios
Chromatographic fingerprinting faces several challenges when severe misalignment occurs between the chromatograms of a set. As previously discussed, in template-matching fingerprinting, retention-time variations can be compensated by applying suitable transformations (see Section 3.4). However, severe misalignment might need analyst supervision in setting critical processing parameters. Stilo et al. [141] tackled pattern misalignment and detection inconsistencies, such as those occurring in
Machine learning for effective data exploration
Generally, PR with C2DC has proceeded with established methods rather than by developing new ones. A fundamental division of PR is between supervised and unsupervised problems. For supervised PR, a training set of feature vectors with class labels (e.g., healthy or unhealthy) is provided; the training set is then used to develop one or more methods to discern differences between classes. For unsupervised PR, methods must discern both natural groupings/clusters and differences between those clusters.
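The distinction can be sketched with two deliberately simple stand-ins: a nearest-centroid classifier for the supervised case and k-means clustering for the unsupervised case. Both are chosen here only for illustration and are not methods singled out by the text; the feature vectors are synthetic, and all function names are hypothetical. The sketch assumes NumPy.

```python
import numpy as np


def nearest_centroid_fit(features, labels):
    """Supervised: use class labels to compute one centroid per class."""
    classes = np.unique(labels)
    centroids = np.stack([features[labels == c].mean(axis=0) for c in classes])
    return classes, centroids


def nearest_centroid_predict(features, classes, centroids):
    """Assign each feature vector to the class of the closest centroid."""
    distances = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    return classes[np.argmin(distances, axis=1)]


def kmeans(features, k, n_iter=50, seed=0):
    """Unsupervised: find k clusters without using any labels."""
    rng = np.random.default_rng(seed)
    centroids = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(n_iter):
        distances = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
        assignment = np.argmin(distances, axis=1)
        # Keep the old centroid if a cluster happens to be empty.
        new_centroids = np.stack([
            features[assignment == j].mean(axis=0) if np.any(assignment == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    distances = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    return np.argmin(distances, axis=1), centroids


# Synthetic feature vectors (e.g., three peak responses per sample).
rng = np.random.default_rng(42)
healthy = rng.normal(loc=0.0, scale=0.5, size=(20, 3))
unhealthy = rng.normal(loc=10.0, scale=0.5, size=(20, 3))
features = np.vstack([healthy, unhealthy])
labels = np.array(["healthy"] * 20 + ["unhealthy"] * 20)

# Supervised PR: labels are used for training, then new samples are classified.
classes, centroids = nearest_centroid_fit(features, labels)
new_samples = np.array([[0.2, -0.1, 0.3], [9.8, 10.1, 9.9]])
predicted = nearest_centroid_predict(new_samples, classes, centroids)

# Unsupervised PR: groupings are discovered from the features alone.
clusters, _ = kmeans(features, k=2)
print(predicted, clusters)
```

In the unsupervised case the cluster indices carry no class meaning on their own; relating them to "healthy" or "unhealthy" requires external knowledge about the samples.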
Concluding remarks
Chromatographic fingerprinting by C2DC is undoubtedly a profitable strategy for cross-comparative analysis of large sets of samples with almost comprehensive coverage of their constituent components. Dedicated processing of the instrumental fingerprints is necessary to extract meaningful high-level information from different types of features, while tackling issues related to retention-time misalignment and MS detection inconsistencies.
Multidimensional analytical platforms combining
Declaration of Competing Interest
The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Prof. Stephen E. Reichenbach has financial interests in GC Image, LLC. Dr. Federico Stilo, Dr. Ana M. Jimenez-Carvelo, Prof. Luis Cuadros-Rodriguez, Prof. Carlo Bicchi and Prof. Chiara Cordero declare no conflict of interest.
References (165)
- et al.
Current approaches and challenges for the metabolite profiling of complex natural extracts
J. Chromatogr., A
(2015)
- et al.
Chromatographic fingerprinting: an innovative approach for food “identitation” and food authentication - a tutorial
Anal. Chim. Acta
(2016)
Foodomics, foodome and modern food analysis
TrAC Trends Anal. Chem.
(2017)
- et al.
Characterization of odorant patterns by comprehensive two-dimensional gas chromatography: a challenge in omic studies
TrAC - Trends Anal. Chem.
(2019)
Chapter 4 data acquisition, visualization, and analysis
Compr. Anal. Chem.
(2009)
- et al.
Comparison of comprehensive two-dimensional gas chromatography in conventional and stop-flow modes
J. Chromatogr., A
(2006)
- et al.
Evaluation of conditions of comprehensive two-dimensional gas chromatography that yield a near-theoretical maximum in peak capacity gain
J. Chromatogr., A
(2015)
Sample dimensionality: a predictor of order-disorder in component peak distribution in multidimensional separation
J. Chromatogr., A
(1995)
- et al.
High concentration capacity sample preparation techniques to improve the informative potential of two-dimensional comprehensive gas chromatography-mass spectrometry: application to sensomics
J. Chromatogr., A
(2013)
- et al.
Black tea volatiles fingerprinting by comprehensive two-dimensional gas chromatography – mass spectrometry combined with high concentration capacity sample preparation techniques: toward a fully automated sensomic assessment
Food Chem.
(2017)
Application of comprehensive two-dimensional gas chromatography for the assessment of oil contaminated soils
J. Chromatogr., A
Comparative study of differential flow and cryogenic modulators comprehensive two-dimensional gas chromatography systems for the detailed analysis of light cycle oil
J. Chromatogr., A
Parallel dual secondary column-dual detection: a further way of enhancing the informative potential of two-dimensional comprehensive gas chromatography
J. Chromatogr., A
A procedure for comprehensive two-dimensional gas chromatography retention time locked dual detection
J. Chromatogr., A
Qualitative and quantitative analysis of vetiver essential oils by comprehensive two-dimensional gas chromatography and comprehensive two-dimensional gas chromatography/mass spectrometry
J. Chromatogr., A
Enhancing the chemical selectivity in discovery-based analysis with tandem ionization time-of-flight mass spectrometry detection for comprehensive two-dimensional gas chromatography
J. Chromatogr., A
Comprehensive two-dimensional gas chromatography coupled with time of flight mass spectrometry featuring tandem ionization: challenges and opportunities for accurate fingerprinting studies
J. Chromatogr., A
Concepts, selectivity options and experimental design approaches in multidimensional and comprehensive two-dimensional gas chromatography
TrAC - Trends Anal. Chem.
Metabolomic analysis in food science: a review
Trends Food Sci. Technol.
Trends in data processing of comprehensive two-dimensional chromatography: state of the art
J. Chromatogr. B Anal. Technol. Biomed. Life Sci.
Review of chemometric analysis techniques for comprehensive two dimensional separations data
J. Chromatogr., A
Management and interpretation of capillary chromatography-mass spectrometry data
Features for non-targeted cross-sample analysis with comprehensive two-dimensional chromatography
J. Chromatogr., A
Pattern recognition of jet fuels: comprehensive GC × GC with ANOVA-based feature selection and principal component analysis
Chemometr. Intell. Lab. Syst.
A principal component analysis based method to discover chemical differences in comprehensive two-dimensional gas chromatography with time-of-flight mass spectrometry (GC × GC-TOFMS) separations of metabolites in plant samples
Talanta
Pixel-level data analysis methods for comprehensive two-dimensional chromatography
Data Handl. Sci. Technol.
Tile-based Fisher-ratio software for improved feature selection analysis of comprehensive two-dimensional gas chromatography-time-of-flight mass spectrometry data
Talanta
Pixel-level data analysis methods for comprehensive two-dimensional chromatography
Data Handling Sci. Technol.
Review of the role and methodology of high resolution approaches in aroma analysis
Anal. Chim. Acta
Current state of comprehensive two-dimensional gas chromatography-mass spectrometry with focus on processes of ionization
TrAC - Trends Anal. Chem.
Advanced data handling in comprehensive two-dimensional gas chromatography
Comparison of two algorithmic data processing strategies for metabolic fingerprinting by comprehensive two-dimensional gas chromatography-time-of-flight mass spectrometry
J. Chromatogr., A
Identification of common molecular subsequences
J. Mol. Biol.
Improving the quality of biomarker candidates in untargeted metabolomics via peak table-based alignment of comprehensive two-dimensional gas chromatography-mass spectrometry data
J. Chromatogr., A
A peaklet-based generic strategy for the untargeted analysis of comprehensive two-dimensional gas chromatography mass spectrometry data sets
J. Chromatogr., A
Bayesian peak tracking: a novel probabilistic approach to match GCxGC chromatograms
Anal. Chim. Acta
Automating data analysis for two-dimensional gas chromatography/time-of-flight mass spectrometry non-targeted analysis of comparative samples
J. Chromatogr., A
Informatics for cross-sample analysis with comprehensive two-dimensional gas chromatography and high-resolution mass spectrometry (GCxGC-HRMS)
Talanta
Two-dimensional gas chromatographic profiling as a tool for a rapid screening of the changes in volatile composition occurring due to microoxygenation of red wines
Anal. Chim. Acta
Profiling analysis of volatile compounds from fruits using comprehensive two-dimensional gas chromatography and image processing techniques
J. Chromatogr., A
Peak pattern variations related to comprehensive two-dimensional gas chromatography acquisition
J. Chromatogr. A
Classification of gasoline data obtained by gas chromatography using a piecewise alignment algorithm combined with feature selection and principal component analysis
J. Chromatogr., A
Two-dimensional semi-parametric alignment of chromatograms
J. Chromatogr., A
Pixel-by-pixel correction of retention time shifts in chromatograms from comprehensive two-dimensional gas chromatography coupled to high resolution time-of-flight mass spectrometry
J. Chromatogr., A
BARCHAN: blob alignment for robust CHromatographic ANalysis
J. Chromatogr., A
Comprehensive multidimensional separations for the analysis of petroleum
J. Chromatogr., A
Comprehensive two-dimensional gas chromatography for biogas and biomethane analysis
J. Chromatogr., A
Profiling food volatiles by comprehensive two-dimensional gas chromatography coupled with mass spectrometry: advanced fingerprinting approaches for comparative analysis of the volatile fraction of roasted hazelnuts (Corylus avellana L.) from different ori
J. Chromatogr., A
Toward a definition of blueprint of virgin olive oil by comprehensive two-dimensional gas chromatography
J. Chromatogr., A
Urinary metabolic fingerprinting of mice with diet-induced metabolic derangements by parallel dual secondary column-dual detection two-dimensional comprehensive gas chromatography
J. Chromatogr., A