
Preprocessing and Analysis

This page covers spectral and spatial preprocessing, saving processed cubes, and simple analysis workflows. For opening, calibrating, and visualizing datacubes, see Basics.

Refer to the HV SDK Usage Guide for conceptual details about lazy operations, interleave, and calibration. For the complete API reference, see the official HV SDK documentation.

Downloaded scripts

The downloadable .py files can be run without editing paths by setting environment variables such as HSI_EXAMPLE_BASE_DIR. See Running Downloaded Scripts for the full list of supported overrides.

Preprocessing

SNV

Standard Normal Variate normalizes each spectrum independently. It is useful when multiplicative scattering or intensity offsets dominate the spectra.

reflectance = open_reflectance_cube()
snv_reflectance = snv(reflectance)

line = reflectance.array_plane(300, hs.lines).T
snv_line = snv_reflectance.array_plane(300, hs.lines).T

print(f"Raw line shape: {line.shape}")
print(f"SNV line shape: {snv_line.shape}")

samples = [530, 962]
wavelengths = wavelengths_for_image(reflectance)

fig, axes = plt.subplots(1, 2, figsize=(12, 4), sharex=True)

for sample in samples:
    axes[0].plot(wavelengths, line[sample], label=f"sample {sample}")
    axes[1].plot(wavelengths, snv_line[sample], label=f"sample {sample}")

axes[0].set_title("Reflectance spectra")
axes[0].set_ylabel("Reflectance")
axes[1].set_title("SNV spectra")
axes[1].set_ylabel("SNV")

for ax in axes:
    ax.set_xlabel("Wavelength [nm]")
    ax.grid(True, alpha=0.3)
    ax.legend()

fig.tight_layout()
plt.show()

Download this script

info

The SDK snv() function returns a new lazy Image. It normalizes along the band axis, so each pixel spectrum is corrected independently while the spatial shape of the cube is preserved.
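
To make the normalization concrete, here is a minimal NumPy sketch of SNV on a BIP-ordered array. It is not the SDK implementation: the function name snv_numpy, the eps guard, and the use of the population standard deviation are assumptions made for illustration.

```python
import numpy as np


def snv_numpy(cube, eps=1e-12):
    """Standard Normal Variate over the last (band) axis.

    Hypothetical helper, not part of the HV SDK. Uses the population
    standard deviation (ddof=0); the SDK may choose differently.
    """
    mean = cube.mean(axis=-1, keepdims=True)
    std = cube.std(axis=-1, keepdims=True)
    return (cube - mean) / (std + eps)


# Two synthetic pixels: the second is the first scaled by 2 and offset by 0.1.
base = np.array([0.10, 0.20, 0.40, 0.30])
cube = np.stack([base, 2.0 * base + 0.1]).reshape(1, 2, 4)  # (lines, samples, bands)

normalized = snv_numpy(cube)

print(normalized.shape)                                   # spatial shape is preserved
print(np.allclose(normalized[0, 0], normalized[0, 1]))    # scale and offset are removed
```

Because each spectrum is centered and scaled by its own statistics, a gain and an offset applied to a whole spectrum cancel out, which is why the two synthetic pixels end up with the same SNV spectrum.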

Smoothing Spectra

Savitzky-Golay smoothing reduces spectral noise while preserving peak shape.

reflectance = open_reflectance_cube()

# The Savitzky-Golay filter supports only BIP and BIL interleave, so convert first.
reflectance = reflectance.to_interleave(hs.bip)
smoothed_reflectance = savgol_filter(reflectance, window_length=21, polyorder=3)

x = 450
y = 320
spectrum = reflectance[y, x, :].to_numpy_with_interleave(hs.bip)[0, 0, :]
smoothed_spectrum = smoothed_reflectance[y, x, :].to_numpy_with_interleave(hs.bip)[0, 0, :]
wavelengths = wavelengths_for_image(reflectance)

plt.plot(wavelengths, spectrum, alpha=0.5, label="raw")
plt.plot(wavelengths, smoothed_spectrum, label="smoothed")
plt.xlabel("Wavelength [nm]")
plt.ylabel("Reflectance")
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

Download this script

info

The SDK savgol_filter() returns a new lazy Image. The window length must be odd and should be chosen according to the spectral resolution and the width of the features you want to preserve.
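
The idea behind the filter can be sketched in plain NumPy: fit a low-order polynomial to each odd-length window by least squares and keep the fitted value at the window center. The helpers savgol_weights and savgol_smooth below are hypothetical names, the sketch ignores edge handling, and it is not how the SDK implements savgol_filter().

```python
import numpy as np


def savgol_weights(window_length, polyorder):
    """Least-squares weights that give the fitted value at the window center."""
    if window_length % 2 == 0:
        raise ValueError("window_length must be odd")
    if polyorder >= window_length:
        raise ValueError("polyorder must be smaller than window_length")
    half = window_length // 2
    offsets = np.arange(-half, half + 1, dtype=float)
    # Columns are offsets**0, offsets**1, ..., offsets**polyorder.
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row 0 of the pseudo-inverse maps window values to the polynomial's
    # constant term, which is the fitted value at offset 0.
    return np.linalg.pinv(design)[0]


def savgol_smooth(spectrum, window_length=21, polyorder=3):
    weights = savgol_weights(window_length, polyorder)
    # "valid" keeps only positions where the full window fits (no edge handling).
    return np.correlate(spectrum, weights, mode="valid")


# A cubic is reproduced exactly by a polyorder=3 filter, which shows why
# features that are smooth across the window keep their shape.
bands = np.arange(100, dtype=float)
cubic = 1e-4 * bands**3 - 0.01 * bands**2 + 0.5 * bands + 2.0
half = 21 // 2
smoothed = savgol_smooth(cubic, window_length=21, polyorder=3)

print(smoothed.shape)
print(np.allclose(smoothed, cubic[half:-half]))
```

An odd window length gives the window a well-defined center sample, which is where the fitted polynomial is evaluated.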

Band Selection

Band selection removes unneeded or noisy spectral regions and can make later processing faster. It is also useful for simulating what would happen if the camera captured fewer bands. On Hypervision cameras, reducing the number of captured bands can increase the maximum frame rate, which is important for real-time applications.

reflectance = open_reflectance_cube()

start_nm = 500
stop_nm = 1000

start_band = band_index_for_wavelength(start_nm, reflectance)
stop_band = band_index_for_wavelength(stop_nm, reflectance)

selected_range = reflectance[:, :, start_band:stop_band + 1]
every_fourth_band = reflectance[:, :, ::4]
rgb_band_indices = sorted([
    band_index_for_wavelength(650, reflectance),
    band_index_for_wavelength(550, reflectance),
    band_index_for_wavelength(460, reflectance),
])
selected_rgb_bands = reflectance.select_bands(rgb_band_indices)

print(f"Selected band range: {start_band}:{stop_band + 1}")
print(f"Selected range shape: {selected_range.shape}")
print(f"Every fourth band shape: {every_fourth_band.shape}")
print(f"Explicit band selection shape: {selected_rgb_bands.shape}")

Download this script

tip

Use wavelength-based selection when possible. Hard-coded band indices are harder to reuse across sensors or calibrations. Use slicing for contiguous ranges and strides, and use select_bands() with increasing band indices when you need a specific list of bands read as one smaller cube.
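
As a rough illustration of wavelength-based selection, the sketch below maps a target wavelength to the nearest band index in a wavelength array. The function nearest_band_index and the synthetic 400 to 1000 nm axis are assumptions for this example; the band_index_for_wavelength() helper used above may behave differently.

```python
import numpy as np


def nearest_band_index(target_nm, wavelengths_nm):
    """Index of the band whose center wavelength is closest to target_nm.

    Hypothetical helper for illustration only.
    """
    wavelengths_nm = np.asarray(wavelengths_nm, dtype=float)
    return int(np.argmin(np.abs(wavelengths_nm - target_nm)))


# Synthetic wavelength axis: 301 bands from 400 nm to 1000 nm (2 nm spacing).
wavelengths_nm = np.linspace(400.0, 1000.0, 301)

start_band = nearest_band_index(500, wavelengths_nm)
stop_band = nearest_band_index(1000, wavelengths_nm)
# Sort the indices so an explicit band list is in increasing order.
rgb = sorted(nearest_band_index(nm, wavelengths_nm) for nm in (650, 550, 460))

print(start_band, stop_band, stop_band - start_band + 1)
print(rgb)
```

The same wavelengths resolve to different indices on a sensor with a different spectral range or sampling, which is the reason to store wavelengths rather than indices in reusable scripts.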

Spatial and Spectral Binning

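Binning averages neighboring values along one axis. It can reduce data size, improve processing speed, and improve SNR when the noise is mostly random: for uncorrelated noise, averaging N values reduces the noise standard deviation by about a factor of √N. The tradeoff is reduced spatial or spectral resolution.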

Use the SDK binning() operation to keep the processing lazy instead of exporting the cube to NumPy first.

reflectance = open_reflectance_cube()

# Average groups of 4 neighboring samples (spatial columns)
sample_binned = reflectance.binning(4, hs.samples)

# Average groups of 4 neighboring lines (spatial rows)
line_binned = reflectance.binning(4, hs.lines)

# Average groups of 4 neighboring lines and samples (full 4x4 spatial binning)
spatial_binned = reflectance.binning(4, hs.lines).binning(4, hs.samples)

# Average groups of 6 neighboring bands (spectral channels)
spectral_binned = reflectance.binning(6, hs.bands)

print(sample_binned.shape)
print(line_binned.shape)
print(spatial_binned.shape)
print(spectral_binned.shape)

Download this script

tip

Spatial binning is useful when speed, file size, or random-noise reduction matters more than preserving the finest spatial detail. Spectral binning is useful when you can trade spectral resolution for smoother spectra or faster downstream processing.
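
For intuition about the resulting shapes, here is an eager NumPy sketch of mean binning along one axis. The helper bin_mean is hypothetical, and dropping an incomplete trailing group is a choice made for this sketch; it does not describe how the SDK binning() operation treats remainders.

```python
import numpy as np


def bin_mean(cube, factor, axis):
    """Average groups of `factor` neighboring values along a non-negative `axis`.

    Trailing values that do not fill a complete group are dropped
    (a choice made for this sketch only).
    """
    usable = (cube.shape[axis] // factor) * factor
    trimmed = np.take(cube, np.arange(usable), axis=axis)
    # Split the binned axis into (groups, factor), then average over factor.
    new_shape = (
        trimmed.shape[:axis]
        + (usable // factor, factor)
        + trimmed.shape[axis + 1:]
    )
    return trimmed.reshape(new_shape).mean(axis=axis + 1)


# Small synthetic cube in (lines, samples, bands) order.
cube = np.arange(8 * 12 * 6, dtype=float).reshape(8, 12, 6)

spatial = bin_mean(bin_mean(cube, 4, axis=0), 4, axis=1)  # 4x4 spatial binning
spectral = bin_mean(cube, 6, axis=2)                      # 6-band spectral binning

print(spatial.shape)
print(spectral.shape)
```

Chaining two one-axis bins gives the same values as averaging each 4x4 block directly, because every block contains the same number of values.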

Cropping

Cropping is ordinary SDK slicing. It is lazy, so the cropped data is not read until you access it, export it to NumPy, or write it to disk. Use the logical SDK order (lines, samples, bands).

reflectance = open_reflectance_cube()

x0, x1 = 250, 1050
y0, y1 = 100, 700

cropped = reflectance[y0:y1, x0:x1, :]

print(f"Original shape: {reflectance.shape}")
print(f"Cropped shape: {cropped.shape}")

Download this script

Saving a Processed Cube

The SDK can write a lazy processing pipeline directly. This means operations such as calibration, band selection, binning, cropping, and type conversion are evaluated as the output file is written, without first materializing the full processed cube in memory.

reflectance = open_reflectance_cube()

x0, x1 = 250, 1050
y0, y1 = 100, 700
start_band = band_index_for_wavelength(500, reflectance)
stop_band = band_index_for_wavelength(1000, reflectance)

processed = reflectance[y0:y1, x0:x1, start_band:stop_band + 1]
processed = (processed.clip(0, 1) * 255).ensure_dtype(hs.uint8)

OUTPUT_PATH = Path(os.environ.get("HSI_EXAMPLE_OUTPUT", "processed_reflectance_uint8.hdr"))
hs.write(processed, str(OUTPUT_PATH))

print(f"Saved processed reflectance cube to {OUTPUT_PATH} with shape: {processed.shape}")

Download this script

File Format

ENVI (.hdr) is recommended for HSI workflows because it supports richer metadata than many image formats. The SDK also supports other output formats (PAM and TIFF), but ENVI is usually the safest choice for processed hyperspectral cubes.

Type conversion (and scaling)

Reflectance calibration produces floating-point data, which is best for analysis. If you need to save disk space for visualization or downstream tools that do not require full precision, scale the clipped reflectance range 0..1 to 0..255 and convert to hs.uint8 as shown above. Keep the floating-point version if you need quantitative accuracy.
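
The precision cost of the 8-bit conversion can be estimated with a small NumPy sketch. It rounds to the nearest integer before casting, which is an assumption of this example; the text above does not state whether ensure_dtype() rounds or truncates.

```python
import numpy as np

# Synthetic reflectance values, including out-of-range samples.
reflectance = np.array([-0.05, 0.0, 0.25, 0.5, 0.999, 1.0, 1.3])

clipped = np.clip(reflectance, 0.0, 1.0)
scaled = clipped * 255.0
# Round before casting; a plain astype() would truncate toward zero instead.
as_uint8 = np.rint(scaled).astype(np.uint8)

# Map back to 0..1 to measure the quantization error.
restored = as_uint8 / 255.0
max_error = np.max(np.abs(restored - clipped))

print(as_uint8)
print(max_error, 0.5 / 255.0)
```

With rounding, the worst-case error is half a quantization step, about 0.002 in reflectance units, which is usually acceptable for visualization but not for quantitative analysis.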

Analysis

Analysis workflows help you understand structure in the datacube before moving to supervised modeling or deployment.

Band-Ratio Index

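A spectral index combines a small number of bands, here two, into one value per pixel. This is useful when prior knowledge tells you which wavelengths should separate the material or property of interest. The example below computes a normalized difference, (a - b) / (a + b), which stays between -1 and 1 for non-negative reflectance.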

reflectance = open_reflectance_cube()

band_a = band_index_for_wavelength(970, reflectance)
band_b = band_index_for_wavelength(750, reflectance)

selected_band_indices = sorted([band_a, band_b])
selected = reflectance.select_bands(selected_band_indices).to_numpy_with_interleave(hs.bip)
a = selected[:, :, selected_band_indices.index(band_a)]
b = selected[:, :, selected_band_indices.index(band_b)]

index = (a - b) / (a + b + 1e-8)

plt.imshow(contrast_stretch(index), cmap="viridis")
plt.title("Band-ratio index")
plt.axis("off")
plt.show()

Download this script

tip

The 1e-8 term avoids division by zero in pixels where a + b is zero, for example fully dark pixels.
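
A small NumPy sketch shows how the index behaves on a few hand-picked values, including a pixel where both bands are zero. The function name normalized_difference and the sample values are illustrative assumptions.

```python
import numpy as np


def normalized_difference(a, b, eps=1e-8):
    """(a - b) / (a + b), with eps guarding against a zero denominator."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return (a - b) / (a + b + eps)


# 2x2 synthetic band images; the top-right pixel is zero in both bands.
a = np.array([[0.30, 0.00], [0.10, 0.50]])
b = np.array([[0.10, 0.00], [0.30, 0.50]])

index = normalized_difference(a, b)

print(index)
print(np.isfinite(index).all())
```

Pixels where band a is brighter give positive values, pixels where band b is brighter give negative values, and equal or dark pixels sit at zero.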

Anomaly Detection From Background Pixels

If you have a background ROI, you can model its spectral distribution and highlight pixels that are far away from it. This example uses Mahalanobis distance on a spectrally downsampled cube.

The background pixels are selected from an HV Explorer annotation with select_mask_from_descriptor(). See Annotations and ROIs for the general pattern for loading annotations, selecting ROI pixels, and plotting ROIs on preview images.

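class MahalanobisDistance:
    def __init__(self, covariance_model):
        self.covariance_model = covariance_model

    def predict(self, pixels):
        # Note: sklearn's EmpiricalCovariance.mahalanobis() returns squared distances.
        # Reshape to one output value per pixel.
        return self.covariance_model.mahalanobis(pixels).reshape(-1, 1)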


reflectance = open_reflectance_cube()

ann_file = load_annotations()
background_annotation = find_background_annotation(ann_file)
background = reflectance.select_mask_from_descriptor(background_annotation.descriptor)
# Mask selection returns a compact Image containing only the selected pixels.
# In BIP order this is [selected_pixels, 1, bands].
background_pixels = background.to_numpy_with_interleave(hs.bip)[:, 0, :]

band_step = 8
band_indices = list(range(0, background_pixels.shape[1], band_step))
model = EmpiricalCovariance()
model.fit(background_pixels[:, band_indices])

preview_cube = reflectance[0:250, 0:400, :].select_bands(band_indices)
# Wrap the sklearn-style model so it can run lazily over an Image pipeline.
distance_pipeline = predictor(MahalanobisDistance(model))(preview_cube)

distance = distance_pipeline.to_numpy_with_interleave(hs.bip)[:, :, 0]

plt.imshow(contrast_stretch(distance), cmap="magma")
plt.title("Anomaly map")
plt.axis("off")
plt.show()

Download this script

tip

This is a good pattern when "anything different from the annotated background" is more important than assigning a known class.
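
The scoring step itself can be sketched without the SDK or scikit-learn. The example below fits a mean and covariance to synthetic background spectra and computes the squared Mahalanobis distance d² = (x - μ)ᵀ Σ⁻¹ (x - μ) for two candidate pixels. The helper names and the synthetic data are assumptions for illustration.

```python
import numpy as np


def fit_background(pixels):
    """Mean and inverse covariance of background spectra, shape (n_pixels, n_bands)."""
    mean = pixels.mean(axis=0)
    # bias=True divides by n, the maximum-likelihood estimate.
    covariance = np.cov(pixels, rowvar=False, bias=True)
    # The pseudo-inverse tolerates a nearly singular covariance matrix.
    precision = np.linalg.pinv(covariance)
    return mean, precision


def squared_mahalanobis(pixels, mean, precision):
    """Squared Mahalanobis distance of each row in `pixels` to the background."""
    centered = pixels - mean
    return np.einsum("ij,jk,ik->i", centered, precision, centered)


# Synthetic background: 500 three-band spectra with small random variation.
rng = np.random.default_rng(0)
background = rng.normal(loc=[0.2, 0.4, 0.6], scale=0.01, size=(500, 3))
mean, precision = fit_background(background)

candidates = np.array([
    [0.2, 0.4, 0.6],   # looks like background
    [0.5, 0.1, 0.9],   # clearly different spectrum
])
d2 = squared_mahalanobis(candidates, mean, precision)

print(d2)
```

Because the distance is scaled by the background covariance, a deviation counts as anomalous only when it is large relative to how much the background itself varies in that spectral direction.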