Exploratory Nodes

Exploratory nodes reveal structure in spectral data (variance, clusters, pure-component estimates, and candidate peaks) before you commit to a supervised model.

PCA, Decomposition, and Curve Resolution

Node	Use When	Inputs	Outputs	Key Configuration
PCA (`model.pca`)	Explore variance, scores, loadings, outliers, and compressed features.	`default: Array2D`	`scores`; `loadings`; `explained_variance`; `model`	`n_components`; `standardized`; `scaled`. Requires SpectroChemPy. Keep scaling choices consistent with your spectroscopy convention.
PCA Transform (`model.pca_transform`)	Project new spectra into an already fitted PCA model.	`X_new: SpectralDataset`; `model: DecompositionResult`	`scores: ScoreMatrix`	No parameters. Apply the same preprocessing that was used when the PCA model was fitted.
NMF (`model.nmf`)	Resolve non-negative concentration-like and spectrum-like factors.	`default: SpectralDataset`	`concentrations`; `spectra`; `reconstruction_error`; `model`	`n_components`; `solver`; `max_iter`; `tol`. Input must be non-negative; use Clip Floor or baseline correction first if needed.
FastICA (`model.ica`)	Blind source separation when independent latent sources are plausible.	`default: SpectralDataset`	`sources`; `mixing_matrix`; `components`; `model`	`n_components`; `algorithm`; `fun`; `max_iter`; `tol`.
MCR-ALS (`model.mcr_als`)	Resolve mixture concentration profiles and pure spectra with constraints.	`default: SpectralDataset`	`C`; `St`; `residuals`; `ground_truth_comparison`; `model`	`n_components`; `non_negative_C`; `non_negative_St`; `max_iter`; `tol`; `normSpec`; validation indices. Requires SpectroChemPy.
EFA (`model.efa`)	Estimate evolving rank/component count in ordered mixture or process data.	`default: SpectralDataset`	`forward_eigenvalues`; `backward_eigenvalues`; `model`	`n_components`. Requires SpectroChemPy.
SIMPLISMA (`model.simplisma`)	Estimate pure variables/components by purity maximization.	`default: SpectralDataset`	`concentrations`; `spectra`; `purity_values`; `model`	`n_components`; `tol`; `noise`. Requires SpectroChemPy.
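
The fit-then-project relationship between PCA and PCA Transform can be sketched with plain NumPy. This is only an illustration of the idea, not the nodes' SpectroChemPy-backed implementation: the helper names `fit_pca` and `project` are hypothetical, the data are random, and the sketch mean-centers without any scaling.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit PCA by SVD on mean-centered data (no scaling in this sketch)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:n_components]                      # (n_components, n_variables)
    scores = U[:, :n_components] * s[:n_components]   # (n_samples, n_components)
    explained_variance = s[:n_components] ** 2 / (X.shape[0] - 1)
    # Hypothetical model container: just what projection needs.
    model = {"mean": mean, "loadings": loadings}
    return model, scores, explained_variance

def project(model, X_new):
    """Project spectra with the stored mean and loadings of a fitted model."""
    return (X_new - model["mean"]) @ model["loadings"].T

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))                         # synthetic stand-in for spectra
model, scores, explained_variance = fit_pca(X, n_components=3)
scores_again = project(model, X)                      # same data, same preprocessing
```

Projecting the training data reproduces the fitted scores only because the same centering is reused, which is the reason new spectra need the same preprocessing as the fitted model.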

SpectroChemPy's MCR-ALS and baseline documentation are useful background for constrained curve-resolution thinking: https://www.spectrochempy.fr/0.7.0/userguide/analysis/mcr_als.html and https://www.spectrochempy.fr/0.8.3/userguide/processing/baseline.html.
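
As a rough illustration of constrained curve resolution, the sketch below alternates least-squares updates of concentrations and spectra and enforces non-negativity by clipping. It is an assumption-laden toy, not SpectroChemPy's MCR-ALS: the function `mcr_als_sketch`, the synthetic two-component data, and the choice of the first and last mixtures as the initial spectral estimate are all made up for this example.

```python
import numpy as np

def mcr_als_sketch(D, St_init, max_iter=100, tol=1e-10):
    """Approximate D ≈ C @ St by alternating least squares with clipping."""
    St = np.clip(St_init, 0.0, None)
    prev_err = np.inf
    for _ in range(max_iter):
        # Update concentrations for fixed spectra, then clip negatives.
        C = np.clip(D @ np.linalg.pinv(St), 0.0, None)
        # Update spectra for fixed concentrations, then clip negatives.
        St = np.clip(np.linalg.pinv(C) @ D, 0.0, None)
        err = np.linalg.norm(D - C @ St)
        if abs(prev_err - err) < tol:   # stop when the fit no longer changes
            break
        prev_err = err
    return C, St, D - C @ St

# Synthetic noise-free mixtures of two Gaussian bands (hypothetical data).
x = np.linspace(0.0, 100.0, 201)
St_true = np.vstack([
    np.exp(-0.5 * ((x - 35.0) / 6.0) ** 2),
    np.exp(-0.5 * ((x - 60.0) / 8.0) ** 2),
])
c1 = np.linspace(0.9, 0.1, 12)
C_true = np.column_stack([c1, 1.0 - c1])
D = C_true @ St_true

St_init = D[[0, -1]]                    # first and last mixtures as initial spectra
C, St, residuals = mcr_als_sketch(D, St_init)
relative_residual = np.linalg.norm(residuals) / np.linalg.norm(D)
```

A small residual alone does not show that the resolved profiles are unique, which is why the constraints and the initial estimate matter.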

Clustering

Node	Use When	Inputs	Outputs	Key Configuration
HCA (`model.hca`)	Build hierarchical clusters and dendrograms from spectra or scores.	`default: Array2D`	`labels`; `cluster_summary`; `linkage_matrix`; `dendrogram_data`; `embedding`; `model`	`n_clusters`; `linkage`; `metric`. Ward linkage expects Euclidean distance.
K-Means (`model.kmeans`)	Partition samples into a chosen number of compact clusters.	`default: Array2D`	`labels`; `centroids`; `cluster_summary`; `embedding`; `model`	`n_clusters`; `n_init`; `max_iter`; `random_state`.
DBSCAN (`model.dbscan`)	Find density-based clusters and noise/outlier samples.	`default: Array2D`	`labels`; `cluster_summary`; `embedding`; `model`	`eps`; `min_samples`; `metric`. Tune `eps` carefully after scaling.
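
The note that Ward linkage expects Euclidean distance can be illustrated with SciPy's hierarchical clustering functions. This is a sketch on synthetic two-group data and assumes nothing about how the HCA node is implemented; the variable names simply echo the node's `linkage_matrix` and `labels` outputs.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(1)
# Two well-separated synthetic groups of 10 samples with 5 variables each.
group_a = rng.normal(loc=0.0, scale=0.1, size=(10, 5))
group_b = rng.normal(loc=3.0, scale=0.1, size=(10, 5))
X = np.vstack([group_a, group_b])

# Ward linkage is defined on Euclidean distances between observations.
linkage_matrix = linkage(X, method="ward", metric="euclidean")

# Cut the tree into at most two flat clusters (labels start at 1).
labels = fcluster(linkage_matrix, t=2, criterion="maxclust")
```

The linkage matrix has one row per merge, so 20 samples give 19 rows; it is also the input that `scipy.cluster.hierarchy.dendrogram` expects for plotting.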

Peak and Library Nodes

Node	Use When	Inputs	Outputs	Key Configuration
Peak Finding (`analysis.peak_finding`)	Detect candidate spectral peaks for interpretation, masking, or library workflows.	`default: SpectralDataset`	`peaks`; `annotated_spectrum`; `spectrum`	`height`; `threshold`; `distance`; `prominence`; `width`. These mirror the main controls in SciPy `find_peaks`.
Peak ID (`analysis.peak_id`)	Ask the configured primary LLM for tentative vibration assignments for detected peaks.	`peaks: Array1D`	assigned peak list	`compound`; `max_peaks`; `min_relative_height`. Treat as interpretive assistance, not proof.
Compare vs. Library (`analysis.compare_library`)	Rank a sample against selected reference spectra using HQI and cosine similarity.	`sample: SpectralDataset`; `library: SpectralDataset`	ranking dictionary with scores and diagnostics	`top_n`; `library_filter`; `hqi_mode`; diagnostic bands and overlap thresholds.
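
To make the ranking idea concrete, here is a small sketch that scores a sample against reference spectra with cosine similarity and a hit quality index. The HQI used here, the squared cosine, is one common definition and is an assumption; the options behind the node's `hqi_mode` are not described in this section. The function names and the synthetic library are hypothetical.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two spectra on a shared axis."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def hqi(a, b):
    """Assumed HQI: squared cosine similarity, between 0 and 1."""
    return cosine_similarity(a, b) ** 2

def rank_against_library(sample, library, top_n=3):
    """Return (name, hqi, cosine) tuples sorted by descending HQI."""
    rows = [
        (name, hqi(sample, ref), cosine_similarity(sample, ref))
        for name, ref in library.items()
    ]
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows[:top_n]

x = np.linspace(0.0, 100.0, 501)

def band(center, width=3.0):
    """Gaussian band used to build synthetic reference spectra."""
    return np.exp(-0.5 * ((x - center) / width) ** 2)

library = {
    "ref_a": band(30) + 0.5 * band(60),
    "ref_b": band(45),
    "ref_c": band(30) + 0.5 * band(80),
}
sample = 2.0 * library["ref_a"]   # scaled copy: both scores ignore overall intensity
ranking = rank_against_library(sample, library)
```

Both scores depend on spectral shape only, so they assume the sample and the references share the same axis and comparable preprocessing.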

SciPy documents the `find_peaks` controls for `height`, `threshold`, `distance`, `prominence`, and `width` here: https://docs.scipy.org/doc/scipy-1.16.0/reference/generated/scipy.signal.find_peaks.html.
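
A minimal call to `scipy.signal.find_peaks` with these controls might look like the following. The spectrum is synthetic, the axis values are an assumed wavenumber grid with unit spacing, and the thresholds are arbitrary illustration values rather than recommended settings for the Peak Finding node.

```python
import numpy as np
from scipy.signal import find_peaks

x = np.linspace(400.0, 1800.0, 1401)   # assumed axis, spacing of 1 unit

def band(center, height, width):
    """Gaussian band on the shared axis."""
    return height * np.exp(-0.5 * ((x - center) / width) ** 2)

# Two clear bands and one weak band that should fall below the height cutoff.
y = band(800, 1.0, 10) + band(1200, 0.6, 15) + band(1600, 0.05, 10)

# `distance` and `width` are measured in samples, not in axis units.
peaks, properties = find_peaks(
    y, height=0.1, distance=20, prominence=0.05, width=3
)
peak_positions = x[peaks]              # convert indices back to axis values
```

Because each condition was requested, `properties` also carries the matching diagnostics such as `peak_heights`, `prominences`, and `widths`.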

Practical Use

Use exploratory nodes to understand variation, outliers, clusters, pure-component estimates, and candidate spectral features before locking in a calibration or classification model. Prefer scores plots for sample structure, loadings or coefficients for variable interpretation, and residual/limit plots for model adequacy.
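
One way to read a residual plot is through the squared reconstruction residual (often called Q) of each sample under a low-rank PCA model. The sketch below uses synthetic data with one deliberately perturbed sample; the cutoff of three times the median is a hypothetical placeholder, not a statistical limit and not something the nodes are known to compute.

```python
import numpy as np

rng = np.random.default_rng(2)
# Rank-2 structured data with small noise (synthetic stand-in for spectra).
X = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 40))
X += 0.01 * rng.normal(size=X.shape)
X[-1] += 0.5 * rng.normal(size=40)      # last sample gets unmodeled variation

Xc = X - X.mean(axis=0)
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
P = Vt[:2]                              # loadings of a 2-component model
T = Xc @ P.T                            # scores
residuals = Xc - T @ P                  # part of each sample the model misses

q = np.sum(residuals ** 2, axis=1)      # squared residual per sample
limit = 3.0 * np.median(q)              # hypothetical cutoff for illustration
flagged = np.flatnonzero(q > limit)
```

Samples with large residuals are poorly described by the chosen number of components and are worth inspecting before any calibration or classification model is fitted.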