Untargeted metabolomics

Untargeted metabolomics usually involves comparing the metabolome of control and test groups to identify differences between their metabolite profiles. These metabolomic differences may be relevant to specific biological conditions.

There are typically three steps in an untargeted metabolomics workflow:

  1. Profiling (also known as differential expression) finds metabolites with statistically significant variations between control and test sample sets.
  2. Compound identification determines the chemical structure of the discovered metabolites.
  3. Interpretation, the final step, uncovers biological connections between the metabolites and yields insights for the next experiment.

Profiling

In an untargeted metabolomics workflow, analytical (technical) reproducibility and stringent data analysis are the keys to a successful experiment. High analytical reproducibility means that the variation seen in the data reflects biological variance rather than instrument variation; it also allows fewer samples to be run because fewer technical replicates are needed.
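Analytical reproducibility is often summarized as the relative standard deviation (%RSD, also called the coefficient of variation) of a feature across repeated injections of the same sample. The following is a minimal sketch of that calculation. The function name `percent_rsd`, the pooled QC design and the peak areas are illustrative assumptions, not part of the workflow described here.

```python
from statistics import mean, stdev


def percent_rsd(values):
    """Relative standard deviation (coefficient of variation) in percent."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; %RSD is undefined")
    return 100.0 * stdev(values) / m


# Hypothetical peak areas for one feature across five injections
# of the same pooled QC sample (made-up numbers for illustration).
qc_areas = [1020.0, 980.0, 1005.0, 995.0, 1000.0]

rsd = percent_rsd(qc_areas)
print(f"%RSD across QC injections: {rsd:.2f}")
```

A low %RSD for most features suggests that differences between groups are less likely to be instrument drift. The acceptable threshold depends on the platform and the study.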

The profiling workflow has four steps:

  • Sample preparation and data acquisition
    • Separation and detection of metabolites by GC-MS, LC-MS or IC-MS. For untargeted metabolomics, both the chromatographic separations and mass spectrometers must have high reproducibility to minimize analytical variation. 
    • Because untargeted metabolomics aims to be comprehensive, it is imperative that the analytical technologies perform the following functions:
      1. Cover a breadth of metabolites with diverse physico-chemical properties. This includes samples separated by GC, IC and HPLC and analyzed in both positive and negative ionization modes on the MS.
      2. Possess a large dynamic range to analyze the varying abundances of metabolites.
      3. Have high sensitivity in order to detect and quantify low abundance metabolites.
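      4. Possess high resolution accurate mass (HRAM) capability to resolve isobaric species (compounds with the same nominal mass but different exact masses).
      5. Perform HRAM MSn (multi-stage fragmentation) for compound identification and structural elucidation.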

  • Spectral pre-processing
    • Remove background noise and perform baseline correction, peak normalization and deconvolution. 
    • Use deconvolution programs suited to the distinct ionization and mass characteristics of GC-MS and LC-MS analytes.

  • Feature extraction
    • Locate and quantify all detectable metabolite features in each analyzed sample.

  • Statistical analysis
    • Metabolomics samples are typically complex, with multiple interactions between metabolites in a given biological state. To uncover significant changes, univariate and multivariate statistical analysis (chemometric methods) is combined with visualization tools to assess abundance relationships between different metabolites.
      1. Univariate methods are the most common statistical approach and analyze metabolite or compound features independently. Parametric tests are commonly used to assess differences between groups: Student's t-test for two groups and ANOVA (analysis of variance) for three or more.
      2. Multivariate methods analyze metabolite or compound features simultaneously and can identify relationships between them. Principal component analysis (PCA) is a common example.
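The normalization and univariate steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the method of any particular software package: it uses total-signal normalization (one of several possible schemes), assumes strictly positive intensities, and reports Welch's t statistic (a variant of Student's t-test that does not assume equal variances) alongside the log2 fold change. The feature names and intensities are hypothetical.

```python
from math import log2, sqrt
from statistics import mean, variance


def normalize_to_total(sample):
    """Scale one sample's feature intensities so that they sum to 1."""
    total = sum(sample.values())
    return {feature: value / total for feature, value in sample.items()}


def welch_t(a, b):
    """Welch's t statistic for two groups with possibly unequal variances."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return float("nan")  # no variation in either group; t is undefined
    return (mean(a) - mean(b)) / se


def compare_groups(control, test):
    """Return {feature: (log2 fold change, Welch t)} for test versus control."""
    control = [normalize_to_total(s) for s in control]
    test = [normalize_to_total(s) for s in test]
    results = {}
    for feature in control[0]:
        c = [s[feature] for s in control]
        t = [s[feature] for s in test]
        results[feature] = (log2(mean(t) / mean(c)), welch_t(t, c))
    return results


# Hypothetical intensities for three features in three samples per group.
control = [
    {"f1": 100, "f2": 200, "f3": 700},
    {"f1": 110, "f2": 190, "f3": 705},
    {"f1": 90, "f2": 210, "f3": 695},
]
test = [
    {"f1": 200, "f2": 100, "f3": 700},
    {"f1": 210, "f2": 90, "f3": 705},
    {"f1": 190, "f2": 110, "f3": 695},
]

results = compare_groups(control, test)
for feature, (lfc, t_stat) in results.items():
    print(f"{feature}: log2FC={lfc:+.2f}, t={t_stat:+.2f}")
```

Turning the t statistic into a p-value requires the t distribution, which the Python standard library does not provide. With SciPy installed, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same Welch test and returns the p-value directly.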

Compound identification

After profiling, the compounds or metabolites are typically identified or annotated.

IC-MS workflows and LC-MS workflows

High resolution accurate mass (HRAM) features derived from profiling experiments are searched against MS databases or MS/MS spectral libraries such as mzCloud, METLIN and HMDB.
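An accurate mass search typically compares the observed m/z with the theoretical m/z of each candidate and keeps candidates within a parts-per-million (ppm) tolerance. The sketch below illustrates this idea only. The three-entry `DATABASE`, the 5 ppm default tolerance and the restriction to [M+H]+ ions are assumptions made for the example, and the code does not use the API of mzCloud, METLIN or HMDB.

```python
# Monoisotopic masses of the most abundant isotopes (Da).
MONOISOTOPIC = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915}
PROTON = 1.007276  # mass added when forming an [M+H]+ ion (Da)


def neutral_mass(formula):
    """Monoisotopic mass of a neutral molecule given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in formula.items())


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def search(observed_mz, database, tol_ppm=5.0):
    """Return (name, ppm error) for [M+H]+ candidates within tolerance, best first."""
    hits = []
    for name, formula in database.items():
        theoretical_mz = neutral_mass(formula) + PROTON
        error = ppm_error(observed_mz, theoretical_mz)
        if abs(error) <= tol_ppm:
            hits.append((name, error))
    return sorted(hits, key=lambda hit: abs(hit[1]))


# Tiny illustrative database; a real search would query a full compound library.
DATABASE = {
    "glucose": {"C": 6, "H": 12, "O": 6},
    "alanine": {"C": 3, "H": 7, "N": 1, "O": 2},
    "citric acid": {"C": 6, "H": 8, "O": 7},
}

hits = search(181.0708, DATABASE)
for name, error in hits:
    print(f"{name}: {error:+.2f} ppm")
```

Accurate mass alone cannot separate isomers: fructose and galactose share the formula of glucose and would match equally well. This is one reason MS/MS spectral libraries are searched in addition to mass databases.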

GC-MS workflows

Accurate mass electron ionization (EI) fragment patterns are matched against the widely available NIST and Wiley libraries for compound identification.
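Matching a fragment pattern against a library amounts to scoring the similarity between a query spectrum and each reference spectrum. The sketch below uses a plain cosine similarity on spectra binned to integer m/z. It is an illustration of the general idea, not the scoring scheme of the NIST or Wiley search software, and the library entries and intensities are hypothetical.

```python
from math import sqrt


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity between two spectra given as {m/z bin: intensity}."""
    shared = set(spec_a) & set(spec_b)
    dot = sum(spec_a[mz] * spec_b[mz] for mz in shared)
    norm_a = sqrt(sum(v * v for v in spec_a.values()))
    norm_b = sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(query, library):
    """Return (name, score) of the library spectrum most similar to the query."""
    scored = ((name, cosine_similarity(query, ref)) for name, ref in library.items())
    return max(scored, key=lambda item: item[1])


# Hypothetical reference spectra (made-up fragments and intensities).
LIBRARY = {
    "compound A": {73: 100, 147: 40, 217: 25},
    "compound B": {73: 30, 129: 100, 204: 60},
}

# Hypothetical query spectrum extracted from a deconvoluted GC-MS peak.
query = {73: 95, 147: 45, 217: 20, 129: 5}

name, score = best_match(query, LIBRARY)
print(f"best match: {name} (cosine similarity {score:.3f})")
```

In practice a match score is usually combined with other evidence, such as retention index, before a compound is reported as identified.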

Figure: GC-MS untargeted metabolomics workflow

Interpretation

There are several ways to interpret and display the data once metabolites have been identified. For example, interactive graphical displays map identified metabolites onto pathways, which helps to deduce their function.

As knowledge of biological processes continues to grow, groups of metabolites related to the same biological process can be placed onto specific metabolic pathways. Many biological databases support this, such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) and MetaCyc.
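Placing metabolites onto pathways is, at its simplest, a lookup from each identified metabolite to the pathways it belongs to. The sketch below shows that grouping step with a small hand-written annotation table. The table is an assumption made for illustration and is not retrieved from KEGG or MetaCyc; `unknown_341` is a hypothetical unannotated feature.

```python
from collections import defaultdict

# Small hand-written annotation table (illustrative only, not a database export).
ANNOTATIONS = {
    "citrate": ["TCA cycle"],
    "succinate": ["TCA cycle"],
    "malate": ["TCA cycle"],
    "pyruvate": ["Glycolysis"],
    "glucose 6-phosphate": ["Glycolysis"],
}


def group_by_pathway(metabolites, annotations):
    """Group metabolites by pathway; also return those with no annotation."""
    grouped = defaultdict(list)
    unmapped = []
    for metabolite in metabolites:
        pathways = annotations.get(metabolite)
        if not pathways:
            unmapped.append(metabolite)
            continue
        for pathway in pathways:
            grouped[pathway].append(metabolite)
    return dict(grouped), unmapped


identified = ["citrate", "pyruvate", "succinate", "unknown_341"]
grouped, unmapped = group_by_pathway(identified, ANNOTATIONS)

for pathway, members in grouped.items():
    print(f"{pathway}: {', '.join(members)}")
print(f"not mapped: {', '.join(unmapped)}")
```

Keeping the unmapped list visible is a deliberate choice: features without a pathway annotation are common in untargeted data and should not be silently dropped.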

Figure: KEGG pathways, global and context-specific

Resources

  • Metabolomics Software Solutions