API Reference

Complete documentation of all RobustNMF.jl functions.

Data Generation and Preprocessing

Utilities for creating and preparing data for NMF.

RobustNMF.generate_synthetic_data — Function

generate_synthetic_data(m::Int, n::Int; rank::Int=10, noise_level::Real=0.0, seed=nothing)

Generate a synthetic non-negative data matrix X as W * H with random non-negative factors. Optionally adds Gaussian noise and clips negative values to keep X ≥ 0.

Arguments

m::Int: Number of rows of X.
n::Int: Number of columns of X.

Keyword Arguments

rank::Int=10: Rank of the factorization.
noise_level::Real=0.0: Standard deviation of Gaussian noise.
seed=nothing: Optional random seed for reproducibility.

Returns

X::Matrix{Float64}: Generated non-negative data matrix.
W::Matrix{Float64}: Left factor of size (m, rank).
H::Matrix{Float64}: Right factor of size (rank, n).

Side Effects

None.

Errors

None.

Notes

Uses a local RNG seeded with seed (if provided) for deterministic output.
When noise_level > 0, the result is clipped at 0.0 to enforce non-negativity.

Examples

julia> X, W, H = generate_synthetic_data(20, 15; rank=3, seed=42);

julia> size(W), size(H)
((20, 3), (3, 15))

julia> minimum(X) >= 0
true
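
The procedure described above can be sketched with the Julia standard library alone. The name `sketch_synthetic_data` and the choice of uniform factor entries in [0, 1) are assumptions made for illustration; the package's actual sampling scheme may differ.

```julia
using Random

# Hypothetical stand-in for illustration; not the RobustNMF.jl implementation.
function sketch_synthetic_data(m::Int, n::Int; rank::Int=10,
                               noise_level::Real=0.0, seed=nothing)
    # Seeded local RNG when a seed is given; otherwise the default RNG.
    rng = seed === nothing ? Random.default_rng() : MersenneTwister(seed)
    W = rand(rng, m, rank)   # assumed: uniform entries in [0, 1)
    H = rand(rng, rank, n)
    X = W * H
    if noise_level > 0
        X .+= noise_level .* randn(rng, m, n)  # Gaussian noise, std = noise_level
        X .= max.(X, 0.0)                      # clip to keep X non-negative
    end
    return X, W, H
end

X, W, H = sketch_synthetic_data(20, 15; rank=3, noise_level=0.1, seed=42)
```

Calling the sketch twice with the same seed should give identical matrices, which is the property the `seed` keyword is documented to provide.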

RobustNMF.add_gaussian_noise! — Function

add_gaussian_noise!(X::AbstractMatrix; σ::Real=0.1, clip_at_zero::Bool=true)

Add Gaussian noise with standard deviation σ to the matrix X in-place. Optionally clip negative entries to preserve non-negativity.

Arguments

X::AbstractMatrix: Data matrix to be corrupted.

Keyword Arguments

σ::Real=0.1: Noise standard deviation.
clip_at_zero::Bool=true: Enforce non-negativity after corruption.

Returns

X: The modified input matrix (in-place).

Side Effects

Modifies X in-place.

Errors

None.

Notes

Uses randn! to generate Gaussian noise.
If clip_at_zero=true, negative values are replaced by 0.0.

Examples

julia> X = abs.(randn(5, 5));

julia> add_gaussian_noise!(X; σ=0.2);

julia> minimum(X) >= 0
true
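
A minimal sketch of the in-place corruption described in the Notes, assuming a floating-point matrix. `sketch_add_gaussian_noise!` is a hypothetical name, not the package function.

```julia
using Random

# Hypothetical stand-in for illustration; not the RobustNMF.jl implementation.
function sketch_add_gaussian_noise!(X::AbstractMatrix; σ::Real=0.1, clip_at_zero::Bool=true)
    noise = similar(X, Float64)
    randn!(noise)                       # fill the buffer with N(0, 1) samples
    X .+= σ .* noise                    # scale to standard deviation σ and add in-place
    clip_at_zero && (X .= max.(X, 0))   # replace negative entries by 0
    return X
end

X = abs.(randn(5, 5))
Y = sketch_add_gaussian_noise!(X; σ=0.2)
```

Because the function returns the same array it mutates, `Y === X` holds after the call.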

RobustNMF.add_sparse_outliers! — Function

add_sparse_outliers!(X::AbstractMatrix; fraction::Real=0.01, magnitude::Real=5.0, seed=nothing)

Add sparse, large positive outliers to a fraction of the entries of X in-place.

Arguments

X::AbstractMatrix: Data matrix to be corrupted.

Keyword Arguments

fraction::Real=0.01: Fraction of entries to corrupt.
magnitude::Real=5.0: Maximum outlier amplitude.
seed=nothing: Optional random seed.

Returns

X: The modified input matrix (in-place).

Side Effects

Modifies X in-place.

Errors

None.

Notes

Corrupts max(1, round(Int, fraction * m*n)) entries.
Added outliers are sampled from Uniform(0, magnitude).

Examples

julia> X = zeros(10, 10);

julia> add_sparse_outliers!(X; fraction=0.05, seed=1);

julia> count(>(0.0), X) > 0
true
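
The two Notes above (entry count and uniform sampling) can be sketched as follows. The function name and the use of `randperm` to pick distinct positions are assumptions for illustration.

```julia
using Random

# Hypothetical stand-in for illustration; not the RobustNMF.jl implementation.
function sketch_add_sparse_outliers!(X::AbstractMatrix; fraction::Real=0.01,
                                     magnitude::Real=5.0, seed=nothing)
    rng = seed === nothing ? Random.default_rng() : MersenneTwister(seed)
    k = max(1, round(Int, fraction * length(X)))   # number of entries to corrupt
    idx = randperm(rng, length(X))[1:k]            # k distinct linear indices (assumed)
    for i in idx
        X[i] += magnitude * rand(rng)              # add a Uniform(0, magnitude) outlier
    end
    return X
end

X = zeros(10, 10)
sketch_add_sparse_outliers!(X; fraction=0.05, seed=1)
```

With a 10 × 10 matrix and `fraction=0.05`, at most five entries are changed.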

RobustNMF.normalize_nonnegative! — Function

normalize_nonnegative!(X::AbstractMatrix; rescale::Bool=true)

Shift X in-place so that the minimum becomes 0.0, and optionally rescale to [0, 1].

Arguments

X::AbstractMatrix: Input matrix.

Keyword Arguments

rescale::Bool=true: Whether to divide by the maximum value after shifting.

Returns

X: The normalized matrix (in-place).

Side Effects

Modifies X in-place.

Errors

None.

Notes

If rescale=true and the maximum after shifting is 0 (that is, X is constant), rescaling is skipped to avoid division by zero.

Examples

julia> X = [-1.0 2.0; 3.0 -4.0];

julia> normalize_nonnegative!(X);

julia> minimum(X), maximum(X)
(0.0, 1.0)
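
A minimal sketch of the shift-and-rescale step, worked on the matrix from the example above. `sketch_normalize_nonnegative!` is a hypothetical name used only for illustration.

```julia
# Hypothetical stand-in for illustration; not the RobustNMF.jl implementation.
function sketch_normalize_nonnegative!(X::AbstractMatrix; rescale::Bool=true)
    X .-= minimum(X)            # shift so the smallest entry becomes 0
    if rescale
        mx = maximum(X)
        mx > 0 && (X ./= mx)    # skip rescaling when X is constant
    end
    return X
end

X = [-1.0 2.0; 3.0 -4.0]
sketch_normalize_nonnegative!(X)
# Shifting by 4 gives [3 6; 7 0]; dividing by 7 maps the entries into [0, 1].

C = fill(2.0, 2, 2)
sketch_normalize_nonnegative!(C)   # constant input: shifted to zeros, not rescaled
```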

RobustNMF.load_image_folder — Function

load_image_folder(dir::AbstractString; pattern::AbstractString=".png", normalize::Bool=true)

Load images from a folder, convert to grayscale, flatten, and stack them as columns of X.

Arguments

dir::AbstractString: Path to the image directory.

Keyword Arguments

pattern::AbstractString=".png": File extension filter (matched via endswith).
normalize::Bool=true: Normalize output matrix to [0, 1].

Returns

X::Matrix{Float64}: One column per image.
(height, width): Original image dimensions.
filenames::Vector{String}: Loaded base file names.

Side Effects

Reads image files from disk.

Errors

ErrorException: If the directory does not exist or no files match pattern.
ErrorException: If images have inconsistent sizes.

Notes

Images are converted to grayscale and stored as Float64.
If normalize=true, normalize_nonnegative! is applied to X.

Examples

julia> # X, img_shape, filenames = load_image_folder("faces/")
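
The flatten-and-stack step can be sketched without any image library by treating each grayscale image as a plain matrix. `sketch_stack_images` is a hypothetical helper; reading and converting the actual files is left out.

```julia
# Hypothetical helper: stack equally sized grayscale images (plain matrices) as columns.
function sketch_stack_images(imgs::Vector{<:AbstractMatrix})
    isempty(imgs) && error("no images given")
    shape = size(first(imgs))
    all(img -> size(img) == shape, imgs) || error("images have inconsistent sizes")
    X = reduce(hcat, [Float64.(vec(img)) for img in imgs])   # one column per image
    return X, shape
end

imgs = [rand(4, 3) for _ in 1:5]
X, shape = sketch_stack_images(imgs)
first_image = reshape(X[:, 1], shape)   # recover the original layout of image 1
```

Keeping the `(height, width)` tuple alongside X is what makes it possible to reshape columns back into images later, for example when passing `img_shape` to the plotting functions.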

Visualization Functions

Convergence Plot

Track how the algorithm converges over iterations.

RobustNMF.plot_convergence — Function

plot_convergence(history::Vector;
                 title::String="NMF Convergence",
                 objective::Symbol=:auto,
                 ylabel::Union{Nothing,String}=nothing,
                 log_scale::Bool=true)

Plot the convergence history (history) produced by an NMF routine.

This function is intentionally objective-agnostic: depending on the algorithm, history may contain Frobenius reconstruction error (standard NMF), Huber loss (robust NMF), or another objective value.

Arguments

history::Vector: Objective values recorded per iteration.

Keyword Arguments

title::String: Plot title.
objective::Symbol=:auto: Hint for labeling the objective.
- :frobenius → "Frobenius Error"
- :huber → "Huber Loss"
- :l21 → "L2,1 Loss"
- :auto → "Objective" (neutral default)
ylabel::Union{Nothing,String}=nothing: Explicit y-axis label. If provided, this overrides objective.
log_scale::Bool=true: Use logarithmic scale for y-axis.

Returns

p::Plots.Plot: Plot object.

Side Effects

None.

Errors

None.

Notes

A logarithmic y-axis keeps small late-iteration improvements visible next to the large initial drop; it requires strictly positive history values.

Examples

julia> using RobustNMF, Plots

julia> X, _, _ = generate_synthetic_data(20, 12; rank=4, seed=4);

julia> _, _, history = nmf(X; rank=4, maxiter=500, tol=1e-5, seed=101);

julia> p = plot_convergence(history; objective=:frobenius);

julia> _, _, history = robustnmf(X; rank=4, maxiter=500);

julia> p = plot_convergence(history; objective=:huber);

julia> p isa Plots.Plot
true
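
The labeling rules for objective and ylabel can be summarized in a small helper. `sketch_convergence_label` is hypothetical, and falling back to "Objective" for unlisted symbols is an assumption rather than documented behavior.

```julia
# Hypothetical helper mirroring the documented y-axis label rules.
function sketch_convergence_label(objective::Symbol=:auto;
                                  ylabel::Union{Nothing,String}=nothing)
    ylabel !== nothing && return ylabel            # an explicit label overrides objective
    objective === :frobenius && return "Frobenius Error"
    objective === :huber && return "Huber Loss"
    objective === :l21 && return "L2,1 Loss"
    return "Objective"                             # :auto (and, by assumption, anything else)
end

label_default = sketch_convergence_label()
label_huber = sketch_convergence_label(:huber)
label_custom = sketch_convergence_label(:huber; ylabel="Loss")
```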

Basis Vectors

Visualize the learned basis vectors (W matrix).

RobustNMF.plot_basis_vectors — Function

plot_basis_vectors(W::AbstractMatrix; img_shape=nothing, max_components::Int=16,
                   title::String="Basis Vectors (W)", layout=nothing)

Visualize the basis vectors (columns of W) as heatmaps or images.

Arguments

W::AbstractMatrix: Basis matrix of size (m, rank).

Keyword Arguments

img_shape: Tuple (height, width) to reshape each basis vector as an image. If nothing, displays as 1D heatmaps.
max_components::Int=16: Maximum number of components to display.
title::String: Plot title.
layout: Custom layout tuple (rows, cols). If nothing, auto-computed.

Returns

p::Plots.Plot: Plot object showing the basis vectors.

Side Effects

None.

Errors

ErrorException: If img_shape does not match the length of a basis vector.

Notes

The number of displayed components is min(rank, max_components).
Layout is chosen to be roughly square when not provided.

Examples

julia> using RobustNMF, Plots

julia> X, _, _ = generate_synthetic_data(40, 30; rank=6, seed=1);

julia> W, _, _ = nmf(X; rank=6, maxiter=50, tol=1e-5, seed=42);

julia> p = plot_basis_vectors(W; max_components=6);

julia> p isa Plots.Plot
true
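
The component count and the "roughly square" layout from the Notes might be computed along these lines. Both helpers are hypothetical, and the exact rounding rule used by the package is an assumption.

```julia
# Hypothetical helpers for illustration; not the RobustNMF.jl implementation.
function sketch_basis_layout(W::AbstractMatrix; max_components::Int=16)
    k = min(size(W, 2), max_components)   # number of components shown
    cols = ceil(Int, sqrt(k))             # assumed rule for a near-square grid
    rows = ceil(Int, k / cols)
    return k, (rows, cols)
end

function sketch_basis_image(W::AbstractMatrix, j::Int, img_shape::Tuple{Int,Int})
    prod(img_shape) == size(W, 1) || error("img_shape does not match basis vector length")
    return reshape(W[:, j], img_shape)    # column j viewed as a height × width image
end

W = rand(100, 6)
k, layout = sketch_basis_layout(W)
img = sketch_basis_image(W, 1, (10, 10))
```

For six components this yields a 2 × 3 grid, and each 100-element column reshapes to a 10 × 10 image.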

Usage:

Each subplot shows one basis vector
For images: Shows meaningful parts (e.g., facial features, object components)
For text: Represents topics or themes

Reconstruction Comparison

Compare original data vs. reconstructed data side-by-side.

RobustNMF.plot_reconstruction_comparison — Function

plot_reconstruction_comparison(X_original::AbstractMatrix, X_recon::AbstractMatrix;
                               img_shape=nothing, n_samples::Int=5,
                               title::String="Reconstruction Comparison")

Compare original data with reconstructed data side by side.

Arguments

X_original::AbstractMatrix: Original data matrix.
X_recon::AbstractMatrix: Reconstructed data matrix (W * H).

Keyword Arguments

img_shape: Tuple (height, width) for reshaping columns as images.
n_samples::Int=5: Number of samples to display.
title::String: Plot title.

Returns

p::Plots.Plot: Plot object showing original vs reconstructed samples.

Side Effects

None.

Errors

AssertionError: If X_original and X_recon have different dimensions.
ErrorException: If img_shape is incompatible with column length.

Notes

For image data, each sample is displayed as an image pair.
For vector data, each sample is plotted as a line comparison.

Examples

julia> using RobustNMF, Plots

julia> X, _, _ = generate_synthetic_data(20, 12; rank=4, seed=3);

julia> W, H, _ = nmf(X; rank=4, maxiter=50, tol=1e-5, seed=100);

julia> p = plot_reconstruction_comparison(X, W * H; n_samples=3);

julia> p isa Plots.Plot
true

NMF Summary

Creates a comprehensive summary with basis vectors, reconstructions, and convergence in one figure.

RobustNMF.plot_nmf_summary — Function

plot_nmf_summary(X::AbstractMatrix,
                 W::AbstractMatrix,
                 H::AbstractMatrix,
                 history::Vector;
                 img_shape=nothing,
                 max_basis::Int=9,
                 max_samples::Int=4,
                 objective::Symbol=:auto,
                 convergence_ylabel::Union{Nothing,String}=nothing,
                 title::String="NMF Summary")

Create a summary visualization for an NMF result.

The summary consists of:

Basis vectors / components (columns of W)
Activation coefficients (rows of H)
Reconstruction comparison (original vs reconstructed data)
Convergence curve (objective value over iterations)

This function is algorithm-agnostic and works for:

Standard NMF (Frobenius objective)
Robust NMF with Huber loss
Legacy L2,1 NMF

Arguments

X::AbstractMatrix: Original non-negative data matrix (m × n).
W::AbstractMatrix: Basis matrix (m × r).
H::AbstractMatrix: Coefficient matrix (r × n).
history::Vector: Objective values recorded during optimization.

Keyword Arguments

img_shape: Optional tuple (height, width) if columns of X or W represent vectorized images.
max_basis::Int=9: Maximum number of basis vectors (columns of W) to visualize.
max_samples::Int=4: Maximum number of samples / activations to visualize.
objective::Symbol=:auto: Type of objective used to generate history.
- :frobenius → squared Frobenius reconstruction objective
- :huber → Huber loss (robust NMF)
- :l21 → L2,1 loss
- :auto → neutral label ("Objective")
convergence_ylabel::Union{Nothing,String}=nothing: Explicit y-axis label for the convergence plot. If provided, this overrides the label implied by objective.
title::String="NMF Summary": Overall title for the summary figure.

Returns

p::Plots.Plot: Plot object with a comprehensive summary.

Side Effects

None.

Errors

None.

Notes

Includes a text panel with error metrics and iteration count.
For image data, set img_shape to visualize basis vectors and reconstructions.

Examples

This example runs the summary once for standard NMF and once for robust NMF (Huber).

julia> using RobustNMF, Plots

julia> X, _, _ = generate_synthetic_data(30, 20; rank=5, seed=5);

julia> W, H, history = nmf(X; rank=5, maxiter=40, tol=1e-5, seed=102);

julia> p = plot_nmf_summary(X, W, H, history; objective=:frobenius, max_basis=4, max_samples=2);

julia> X, _, _ = generate_synthetic_data(30, 20; rank=5, seed=5);

julia> W, H, history = robustnmf(X; rank=5, maxiter=40, tol=1e-5, seed=103);

julia> p = plot_nmf_summary(X, W, H, history; objective=:huber, max_basis=4, max_samples=2);

julia> p isa Plots.Plot
true

Additional Visualization Functions

RobustNMF.plot_activation_coefficients — Function

plot_activation_coefficients(H::AbstractMatrix; max_samples::Int=10,
                             title::String="Activation Coefficients (H)")

Visualize the activation coefficient matrix H as a heatmap or as individual sample profiles.

Arguments

H::AbstractMatrix: Coefficient matrix of size (rank, n).

Keyword Arguments

max_samples::Int=10: Maximum number of samples to display (if showing individual profiles).
title::String: Plot title.

Returns

p::Plots.Plot: Plot object.

Side Effects

None.

Errors

None.

Notes

If n ≤ 100 and rank ≤ 50, a full heatmap is shown.
Otherwise, bar plots for up to max_samples samples are shown.

Examples

julia> using RobustNMF, Plots

julia> X, _, _ = generate_synthetic_data(30, 20; rank=5, seed=2);

julia> _, H, _ = nmf(X; rank=5, maxiter=50, tol=1e-5, seed=99);

julia> p = plot_activation_coefficients(H; max_samples=5);

julia> p isa Plots.Plot
true
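
The display rule from the Notes amounts to a simple size check. The helper names below are hypothetical and only restate the documented thresholds.

```julia
# Hypothetical helpers restating the documented thresholds.
# A full heatmap is used when H is small enough: n ≤ 100 and rank ≤ 50.
sketch_use_heatmap(H::AbstractMatrix) = size(H, 2) <= 100 && size(H, 1) <= 50

# Otherwise only the first few sample columns are shown as bar plots.
sketch_profile_columns(H::AbstractMatrix; max_samples::Int=10) = 1:min(size(H, 2), max_samples)

H_small = rand(5, 20)     # rank 5, 20 samples
H_large = rand(5, 500)    # rank 5, 500 samples
```

Showing the first `max_samples` columns is an assumption; the package may select samples differently.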

RobustNMF.plot_image_reconstruction — Function

plot_image_reconstruction(X::AbstractMatrix, W::AbstractMatrix, H::AbstractMatrix,
                          img_shape::Tuple{Int,Int}; indices=nothing, n_images::Int=5)

Specialized function for visualizing image reconstruction quality. Shows original, reconstructed, and difference images side by side.

Arguments

X::AbstractMatrix: Original image data (each column is a flattened image).
W::AbstractMatrix: Basis matrix.
H::AbstractMatrix: Coefficient matrix.
img_shape::Tuple{Int,Int}: Image dimensions (height, width).

Keyword Arguments

indices: Specific image indices to display. If nothing, randomly selected.
n_images::Int=5: Number of images to display.

Returns

p::Plots.Plot: Plot object.

Side Effects

None.

Errors

ErrorException: If img_shape does not match column length.

Notes

The difference image uses absolute error |X - W*H|.
If indices is provided, only the first n_images indices are used.

Examples

julia> using RobustNMF, Plots

julia> X, _, _ = generate_synthetic_data(100, 8; rank=5, seed=6);

julia> W, H, _ = nmf(X; rank=5, maxiter=40, tol=1e-5, seed=104);

julia> p = plot_image_reconstruction(X, W, H, (10, 10); n_images=3);

julia> p isa Plots.Plot
true
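
The three panels shown per image (original, reconstruction, absolute difference) can be computed as in the following sketch. `sketch_image_triplet` is a hypothetical helper that leaves out the plotting itself.

```julia
# Hypothetical helper for illustration; not the RobustNMF.jl implementation.
function sketch_image_triplet(X::AbstractMatrix, W::AbstractMatrix, H::AbstractMatrix,
                              img_shape::Tuple{Int,Int}, j::Int)
    prod(img_shape) == size(X, 1) || error("img_shape does not match column length")
    original = reshape(X[:, j], img_shape)
    recon = reshape(W * H[:, j], img_shape)      # reconstruct only column j
    difference = abs.(original .- recon)         # absolute error |X - W*H|
    return original, recon, difference
end

W = rand(100, 5)
H = rand(5, 8)
X = W * H                                        # exact factorization for the demo
original, recon, difference = sketch_image_triplet(X, W, H, (10, 10), 3)
```

Since X is built exactly from W and H here, the difference image is zero up to floating-point rounding.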

Key Parameters

Parameter	Default	Typical range	Notes
`rank`	-	5-50	Number of factors (usually 10-20)
`maxiter`	500	100-1000	Maximum iterations
`tol`	1e-4	1e-6 to 1e-3	Convergence tolerance
`delta`	1.0	0.5-2.0	Huber threshold (Robust NMF only)
`seed`	nothing	-	Random seed for reproducibility

About delta (Huber parameter):

Smaller values (0.5) → More robust to large outliers, but less precise
Larger values (2.0) → More precise on clean data, but less outlier-resistant
Start with delta=1.0 and tune based on your data
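
To make the effect of delta concrete, the sketch below evaluates the textbook Huber loss on three residuals, one of which is a large outlier. Whether RobustNMF.jl uses exactly this scaling is an assumption; `sketch_huber` is a hypothetical name.

```julia
# Textbook Huber loss for a single residual r with threshold delta:
# quadratic for |r| ≤ delta, linear beyond it.
sketch_huber(r::Real, delta::Real) =
    abs(r) <= delta ? 0.5 * r^2 : delta * (abs(r) - 0.5 * delta)

residuals = [0.1, 0.5, 10.0]                       # the last entry acts as an outlier

loss_small = sum(sketch_huber.(residuals, 0.5))    # delta = 0.5
loss_large = sum(sketch_huber.(residuals, 2.0))    # delta = 2.0
loss_squared = sum(0.5 .* residuals .^ 2)          # plain squared loss for comparison
```

The outlier contributes 4.875 with delta = 0.5, 18.0 with delta = 2.0, and 50.0 under the squared loss, which is why a smaller delta reduces the influence of large residuals.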

Performance Metrics

RMSE (Root Mean Square Error)

Measures average reconstruction error.

using Statistics  # provides mean
rmse = sqrt(mean((X - W*H).^2))

Lower is better
Standard NMF minimizes the Frobenius reconstruction error, which is equivalent to minimizing RMSE

MAE (Mean Absolute Error)

Measures average absolute reconstruction error.

using Statistics  # provides mean
mae = mean(abs.(X - W*H))

Lower is better
Less sensitive to a few large residuals than RMSE, so better suited for comparing robustness

Relative Error

Error as a fraction of the data magnitude (Frobenius norm); multiply by 100 to express it as a percentage.

using LinearAlgebra  # provides norm (Frobenius norm for matrices)
rel_error = norm(X - W*H) / norm(X)

Lower is better
Typical range: roughly 0.01 to 0.20 (1-20%)
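
As a worked example, the three metrics can be evaluated on a tiny hand-built factorization in which a single entry of X is perturbed by 1. The matrices are made up for illustration.

```julia
using LinearAlgebra, Statistics

W = [1.0 0.0; 0.0 1.0; 1.0 1.0]
H = [1.0 2.0; 3.0 4.0]
X = W * H            # exact product: [1 2; 3 4; 4 6]
X[1, 1] += 1.0       # perturb one entry so the residual is non-zero

R = X - W * H        # residual matrix: 1 at position (1, 1), 0 elsewhere

rmse = sqrt(mean(R .^ 2))          # sqrt(1/6)
mae = mean(abs.(R))                # 1/6
rel_error = norm(R) / norm(X)      # 1 / sqrt(85)
```

Here the single unit error gives an RMSE of about 0.41, an MAE of about 0.17, and a relative error of about 0.11.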
