Current Capabilities

This page describes the documented production path for the current release.

What Is Built

SpectraSherpa is a spectroscopy workbench for first-pass method development, exploratory chemometrics, calibration review, and guided reporting. It is more than a Python package wrapped in a UI: the product combines spectroscopy-aware data handling, a visual workflow engine, model artifacts, reporting/export, optional scientific reference data, and Cloud AI assistance in one place.

Key capabilities built today:

  • Browser-based analysis UI for importing data, inspecting spectra, building workflows, running templates, reviewing plots, and exporting results.
  • Data transparency at import with file names, extensions, metadata, and data-matrix shape shown before users commit to modeling.
  • Spectroscopy-aware dataset model with spectral axes, sample metadata, processing history, and data-role semantics carried through the workflow.
  • Workflow DAG builder with reusable nodes for data, preprocessing, modeling, validation, plots, tables, reports, and exports.
  • Template library for PCA, PLS calibration, classification, SIMCA QC, MCR-ALS, peak workflows, and spectroscopy-specific starters.
  • Core chemometrics for PCA, PLS regression, KNN, PLS-DA, SIMCA-style classification/QC, MCR-ALS, peak finding, variable selection, and validation.
  • Model artifacts so useful calibrations/classifiers can be saved with the metadata needed for later review and application.
  • Report and export path for moving from interactive exploration to shareable scientific records.
  • Reference and synthesis workflows around NIST data and optional HITRAN/HAPI line-by-line synthesis.
  • Optional SpectroChemPy support for additional spectroscopy readers, datasets, and coordinate-aware algorithms.
  • Cloud Advisor and Ambient Guidance for LLM-enabled onboarding, interpretation drafts, and contextual next-step suggestions.
  • Extension surfaces for OSS users and developers to add nodes, providers, export behavior, and deployment-specific policy without rewriting the workbench.

Spectroscopy Focus

SpectraSherpa is currently documented for FTIR, NIR, Raman, and UV-VIS spectroscopy. The best-supported path is:

  1. Import spectra from user files, example datasets, or reference libraries.
  2. Inspect file names, extensions, metadata, and the data matrix.
  3. Apply spectral preprocessing.
  4. Run PCA, PLS calibration, classification, SIMCA QC, MCR-ALS, or peak/library workflows.
  5. Review plots, tables, metrics, and reports.
  6. Save models or export results.
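
The steps above can be sketched outside the UI with the NumPy and scikit-learn base that the workbench builds on. This is a minimal illustration, not SpectraSherpa's own API: the synthetic two-band spectra, the `snv` helper, and all variable names are assumptions made for the example.

```python
import numpy as np
from sklearn.decomposition import PCA


def snv(x):
    """Standard normal variate: center and scale each spectrum (row) on its own."""
    return (x - x.mean(axis=1, keepdims=True)) / x.std(axis=1, keepdims=True)


# Step 1 (stand-in for import): 20 synthetic spectra on a hypothetical wavenumber axis.
rng = np.random.default_rng(0)
wavenumbers = np.linspace(4000.0, 400.0, 200)  # cm^-1
band_a = np.exp(-0.5 * ((wavenumbers - 1700.0) / 40.0) ** 2)
band_b = np.exp(-0.5 * ((wavenumbers - 1100.0) / 60.0) ** 2)
fraction = rng.uniform(0.0, 1.0, size=20)  # mixing fraction of band A
scatter = rng.uniform(0.8, 1.2, size=20)   # multiplicative scatter per sample
spectra = scatter[:, None] * (
    fraction[:, None] * band_a + (1.0 - fraction[:, None]) * band_b
) + rng.normal(0.0, 0.005, size=(20, 200))

# Step 2: inspect the data matrix before committing to a model.
print("data matrix shape:", spectra.shape)

# Step 3: spectral preprocessing (SNV reduces the multiplicative scatter).
preprocessed = snv(spectra)

# Step 4: exploratory PCA on the preprocessed matrix.
pca = PCA(n_components=2)
scores = pca.fit_transform(preprocessed)

# Step 5: review the numbers that would normally back plots and tables.
print("scores shape:", scores.shape)
print("explained variance ratio:", pca.explained_variance_ratio_)
```

In the workbench these steps correspond to nodes in a workflow; the script form is only meant to show what each step does to the data matrix.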

Product Strengths

SpectraSherpa is designed as a spectroscopy application, not only a collection of numerical routines. Its current strengths are:

  • GUI-first workflow building: users can import, preprocess, model, validate, and report without writing notebooks for the common path.
  • Transparent data provenance: the Import view exposes loaded file names, extensions, metadata, and data-matrix shape so users can see what entered the analysis.
  • Reproducible workflow graphs: analysis steps are represented as connected nodes with parameters, inputs, outputs, plots, tables, and artifacts.
  • Spectroscopy-aware data handling: wavenumber/wavelength axes, spectral matrices, sample metadata, reference libraries, and file formats are first-class concepts.
  • Model and validation outputs: PLS, classification, SIMCA, PCA, and MCR workflows surface interpretable plots, metrics, and saved artifacts.
  • Report and export path: results can be carried from exploratory analysis into reports and portable outputs.
  • Optional AI assistance: Cloud adds Sherpa Advisor and Ambient Guidance for scientific onboarding, interpretation drafts, and contextual next-step suggestions.
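
The idea of a reproducible workflow graph can be illustrated with the Python standard library alone. The sketch below is an assumption-laden illustration, not the SpectraSherpa workflow engine: the node names, the `params` dictionary, and the `history` record are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical node names; each key maps to the set of nodes it depends on.
workflow = {
    "import_spectra": set(),
    "preprocess": {"import_spectra"},
    "pca_model": {"preprocess"},
    "scores_plot": {"pca_model"},
    "report": {"scores_plot", "pca_model"},
}

# Hypothetical per-node parameters, stored with the graph so a run can be replayed.
params = {
    "preprocess": {"method": "snv"},
    "pca_model": {"n_components": 2},
}

# A topological sort gives an execution order in which every input is ready first.
order = list(TopologicalSorter(workflow).static_order())

# Record what ran, with which inputs and parameters, as a simple processing history.
history = [
    {"node": name, "inputs": sorted(workflow[name]), "params": params.get(name, {})}
    for name in order
]

for step in history:
    print(step)
```

Keeping the graph and the parameters as plain data is what makes a run repeatable: the same dictionaries yield the same execution order and the same recorded history.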

Compared with generic visual analytics tools such as Quasar or Orange, SpectraSherpa's advantage is the product layer around spectroscopy: file provenance, spectral axes, workflow templates, chemometrics node contracts, scientific reporting, and deployment options that range from local OSS to managed enterprise Cloud.

Documented Scientific Scope

The public docs cover the following current capabilities:

  • CSV, JCAMP-DX, NumPy, MAT, and optional SpectroChemPy-backed spectral formats
  • FTIR, NIR, Raman, and UV-VIS data import and preprocessing
  • PCA exploratory analysis and diagnostics
  • PLS regression calibration, VIP scores, coefficients, and CV predictions
  • KNN, PLS-DA, and SIMCA classification
  • SIMCA-style acceptance/QC concepts
  • MCR-ALS and self-modeling curve-resolution workflows
  • Peak finding and library comparison
  • NIST reference workflows and synthetic FTIR examples
  • HITRAN/HAPI synthesis when the optional extra and API key are configured
  • Workflow templates, model artifacts, reports, and exports

Reference Foundations

NIST and HITRAN are both important spectroscopy foundations, but they enter SpectraSherpa differently.

  • NIST data supports reference-library and quantitative infrared workflows built on public scientific data resources such as the NIST Chemistry WebBook and NIST Quantitative Infrared data.
  • HITRAN/HAPI supports line-by-line gas-phase spectral synthesis when the optional extra, API key, and network access are configured.

SpectroChemPy is an optional software foundation for additional readers, example datasets, and coordinate-aware spectroscopy algorithms. NumPy, SciPy, pandas, and scikit-learn provide much of the numerical computing base.

Out of Scope for First-Run Onboarding

The production documentation does not currently cover unverified workflows such as design of experiments (DOE), 96-well workflows, HPLC, or NMR. Those may exist as exploratory code or future product directions, but they should not be treated as the supported first-run path until the data source, template, plots, metrics, and user story are verified.