Preprocessing and Analysis
This page covers spectral and spatial preprocessing, saving processed cubes, and simple analysis workflows. For opening, calibrating, and visualizing datacubes, see Basics.
Refer to the HV SDK Usage Guide for conceptual details about lazy operations, interleave, and calibration. For the complete API reference, see the official HV SDK documentation.
The downloadable .py files can be run without editing paths by setting
environment variables such as HSI_EXAMPLE_BASE_DIR. See
Running Downloaded Scripts
for the full list of supported overrides.
Preprocessing
SNV
Standard Normal Variate normalizes each spectrum independently. It is useful when multiplicative scattering or intensity offsets dominate the spectra.
reflectance = open_reflectance_cube()
snv_reflectance = snv(reflectance)
line = reflectance.array_plane(300, hs.lines).T
snv_line = snv_reflectance.array_plane(300, hs.lines).T
print(f"Raw line shape: {line.shape}")
print(f"SNV line shape: {snv_line.shape}")
samples = [530, 962]
wavelengths = wavelengths_for_image(reflectance)
fig, axes = plt.subplots(1, 2, figsize=(12, 4), sharex=True)
for sample in samples:
    axes[0].plot(wavelengths, line[sample], label=f"sample {sample}")
    axes[1].plot(wavelengths, snv_line[sample], label=f"sample {sample}")
axes[0].set_title("Reflectance spectra")
axes[0].set_ylabel("Reflectance")
axes[1].set_title("SNV spectra")
axes[1].set_ylabel("SNV")
for ax in axes:
    ax.set_xlabel("Wavelength [nm]")
    ax.grid(True, alpha=0.3)
    ax.legend()
fig.tight_layout()
plt.show()
The SDK snv() function returns a new lazy Image. It normalizes along the
band axis, so each pixel spectrum is corrected independently while the spatial
shape of the cube is preserved.
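As a rough illustration of what SNV computes, the NumPy sketch below normalizes each pixel spectrum to zero mean and unit standard deviation along the band axis. It is not the SDK implementation: the function name `snv_numpy`, the `eps` guard, and the synthetic cube are assumptions made for this example, and it operates on an in-memory array rather than a lazy Image.

```python
import numpy as np


def snv_numpy(cube, eps=1e-12):
    """SNV over the last (band) axis of a (lines, samples, bands) array.

    Hypothetical helper for illustration; not part of the HV SDK.
    """
    cube = np.asarray(cube, dtype=np.float64)
    mean = cube.mean(axis=-1, keepdims=True)
    std = cube.std(axis=-1, keepdims=True)
    # eps guards against flat spectra whose standard deviation is zero.
    return (cube - mean) / (std + eps)


# Synthetic cube: 2 lines x 3 samples x 50 bands. Every pixel is the same
# base spectrum with a different multiplicative gain and additive offset.
rng = np.random.default_rng(0)
base = rng.random(50)
gain = np.arange(1, 7).reshape(2, 3, 1)
offset = np.linspace(0.0, 0.5, 6).reshape(2, 3, 1)
cube = gain * base + offset

normalized = snv_numpy(cube)
print(normalized.shape)
```

Because every synthetic pixel differs only by gain and offset, all spectra coincide after SNV, while the spatial shape of the array is unchanged.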
Smoothing Spectra
Savitzky-Golay smoothing reduces spectral noise while preserving peak shape.
reflectance = open_reflectance_cube()
# Savitzky-Golay filter only supports BIP and BIL
reflectance = reflectance.to_interleave(hs.bip)
smoothed_reflectance = savgol_filter(reflectance, window_length=21, polyorder=3)
x = 450
y = 320
spectrum = reflectance[y, x, :].to_numpy_with_interleave(hs.bip)[0, 0, :]
smoothed_spectrum = smoothed_reflectance[y, x, :].to_numpy_with_interleave(hs.bip)[0, 0, :]
wavelengths = wavelengths_for_image(reflectance)
plt.plot(wavelengths, spectrum, alpha=0.5, label="raw")
plt.plot(wavelengths, smoothed_spectrum, label="smoothed")
plt.xlabel("Wavelength [nm]")
plt.ylabel("Reflectance")
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()
The SDK savgol_filter() returns a new lazy Image. The window length
must be odd and should be chosen according to the spectral resolution and the
width of the features you want to preserve.
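To make the roles of the window length and polynomial order concrete, here is a minimal NumPy sketch of Savitzky-Golay smoothing for a single spectrum. It fits a polynomial to each window by least squares and keeps the fitted value at the window center. The function names and the edge padding are assumptions for this example; the SDK filter may treat the spectrum edges differently.

```python
import numpy as np


def savgol_coefficients(window_length, polyorder):
    """Weights that estimate the fitted polynomial value at the window center."""
    if window_length % 2 == 0:
        raise ValueError("window_length must be odd")
    if polyorder >= window_length:
        raise ValueError("polyorder must be smaller than window_length")
    half = window_length // 2
    offsets = np.arange(-half, half + 1)
    # Columns are offsets**0, offsets**1, ..., offsets**polyorder.
    design = np.vander(offsets, polyorder + 1, increasing=True)
    # Row 0 of the pseudo-inverse yields the constant term of the fit,
    # which is the polynomial's value at offset 0.
    return np.linalg.pinv(design)[0]


def smooth_spectrum(spectrum, window_length=21, polyorder=3):
    """Smooth a 1-D spectrum; edges are padded by repeating the end values."""
    coeffs = savgol_coefficients(window_length, polyorder)
    half = window_length // 2
    padded = np.pad(np.asarray(spectrum, dtype=np.float64), half, mode="edge")
    # Reversing the kernel turns np.convolve into a sliding dot product.
    return np.convolve(padded, coeffs[::-1], mode="valid")


# A cubic is reproduced exactly away from the edges when polyorder=3.
x = np.linspace(-1.0, 1.0, 101)
cubic = x**3 - 0.5 * x
smoothed = smooth_spectrum(cubic, window_length=21, polyorder=3)
print(np.abs(smoothed[10:-10] - cubic[10:-10]).max())
```

A wider window averages over more bands and suppresses more noise, but a feature narrower than the window can no longer be followed by the low-order polynomial and is flattened.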
Band Selection
Band selection removes unneeded or noisy spectral regions and can make later processing faster. It is also useful for simulating what would happen if the camera captured fewer bands. On Hypervision cameras, reducing the number of captured bands can increase the maximum frame rate, which is important for real-time applications.
reflectance = open_reflectance_cube()
start_nm = 500
stop_nm = 1000
start_band = band_index_for_wavelength(start_nm, reflectance)
stop_band = band_index_for_wavelength(stop_nm, reflectance)
selected_range = reflectance[:, :, start_band:stop_band + 1]
every_fourth_band = reflectance[:, :, ::4]
rgb_band_indices = sorted([
    band_index_for_wavelength(650, reflectance),
    band_index_for_wavelength(550, reflectance),
    band_index_for_wavelength(460, reflectance),
])
selected_rgb_bands = reflectance.select_bands(rgb_band_indices)
print(f"Selected band range: {start_band}:{stop_band + 1}")
print(f"Selected range shape: {selected_range.shape}")
print(f"Every fourth band shape: {every_fourth_band.shape}")
print(f"Explicit band selection shape: {selected_rgb_bands.shape}")
Use wavelength-based selection when possible. Hard-coded band indices are harder
to reuse across sensors or calibrations. Use slicing for contiguous ranges and
strides, and use select_bands() with increasing band indices when you need a
specific list of bands read as one smaller cube.
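The sketch below shows one way a wavelength-to-band lookup could work: pick the band whose center wavelength is closest to the target. The helper `nearest_band_index` and the evenly spaced 2 nm wavelength grid are hypothetical; the `band_index_for_wavelength()` helper used in the example above may be implemented differently.

```python
import numpy as np


def nearest_band_index(wavelengths_nm, target_nm):
    """Index of the band whose wavelength is closest to target_nm.

    Hypothetical helper for illustration only.
    """
    wavelengths_nm = np.asarray(wavelengths_nm, dtype=np.float64)
    return int(np.argmin(np.abs(wavelengths_nm - target_nm)))


# Assumed sensor grid: 301 bands from 400 nm to 1000 nm in 2 nm steps.
wavelengths_nm = np.linspace(400.0, 1000.0, 301)

start_band = nearest_band_index(wavelengths_nm, 500)
stop_band = nearest_band_index(wavelengths_nm, 1000)
# Python slices exclude the stop index, hence the + 1 to keep the last band.
band_slice = slice(start_band, stop_band + 1)

# Sorted so the indices are increasing, as an explicit band list requires.
rgb_band_indices = sorted(
    nearest_band_index(wavelengths_nm, nm) for nm in (650, 550, 460)
)
print(band_slice, rgb_band_indices)
```

Keeping the wavelengths in the code and resolving indices at run time means the same script still selects the intended spectral region on a sensor with a different band grid.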
Spatial and Spectral Binning
Binning averages neighboring values along one axis. It can reduce data size, improve processing speed, and improve SNR when the noise is mostly random. The tradeoff is reduced spatial or spectral resolution.
Use the SDK binning() operation to keep the processing lazy instead of
exporting the cube to NumPy first.
reflectance = open_reflectance_cube()
# Average groups of 4 neighboring samples (spatial columns)
sample_binned = reflectance.binning(4, hs.samples)
# Average groups of 4 neighboring lines (spatial rows)
line_binned = reflectance.binning(4, hs.lines)
# Average groups of 4 neighboring lines and samples (full 4x4 spatial binning)
spatial_binned = reflectance.binning(4, hs.lines).binning(4, hs.samples)
# Average groups of 6 neighboring bands (spectral channels)
spectral_binned = reflectance.binning(6, hs.bands)
print(sample_binned.shape)
print(line_binned.shape)
print(spatial_binned.shape)
print(spectral_binned.shape)
Spatial binning is useful when speed, file size, or random-noise reduction matters more than preserving the finest spatial detail. Spectral binning is useful when you can trade spectral resolution for smoother spectra or faster downstream processing.
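For intuition, the following NumPy sketch averages groups of neighboring values along one axis of an in-memory array. The helper `bin_mean` is hypothetical, and dropping trailing values that do not fill a complete group is an assumption of this sketch; this page does not state how the SDK handles a remainder.

```python
import numpy as np


def bin_mean(array, factor, axis):
    """Average groups of `factor` neighbors along a non-negative `axis`.

    Hypothetical helper for illustration; not the SDK binning() operation.
    """
    array = np.asarray(array, dtype=np.float64)
    n_groups = array.shape[axis] // factor
    # Assumption: trailing values that do not fill a whole group are dropped.
    trimmed = np.take(array, np.arange(n_groups * factor), axis=axis)
    grouped_shape = (
        trimmed.shape[:axis] + (n_groups, factor) + trimmed.shape[axis + 1:]
    )
    return trimmed.reshape(grouped_shape).mean(axis=axis + 1)


# Small synthetic cube in (lines, samples, bands) order.
cube = np.arange(8 * 12 * 6, dtype=np.float64).reshape(8, 12, 6)

line_binned = bin_mean(cube, 4, axis=0)
sample_binned = bin_mean(cube, 4, axis=1)
spatial_binned = bin_mean(bin_mean(cube, 4, axis=0), 4, axis=1)
spectral_binned = bin_mean(cube, 6, axis=2)

print(line_binned.shape, sample_binned.shape)
print(spatial_binned.shape, spectral_binned.shape)
```

For independent, zero-mean noise, averaging N values reduces the noise standard deviation by roughly a factor of sqrt(N), which is why 4x4 spatial binning can improve SNR noticeably at the cost of resolution.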
Cropping
Cropping is ordinary SDK slicing. It is lazy, so the cropped data is not read
until you access it, export it to NumPy, or write it to disk. Index in the
logical SDK order (lines, samples, bands), so the row range (y) comes before
the column range (x).
reflectance = open_reflectance_cube()
x0, x1 = 250, 1050
y0, y1 = 100, 700
cropped = reflectance[y0:y1, x0:x1, :]
print(f"Original shape: {reflectance.shape}")
print(f"Cropped shape: {cropped.shape}")
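The same index order can be checked on a plain NumPy array. This is only an analogy: the array below is a synthetic stand-in, and NumPy basic slicing returns a view of in-memory data rather than a lazy SDK Image.

```python
import numpy as np

# Synthetic stand-in cube in (lines, samples, bands) order.
cube = np.zeros((720, 1100, 3), dtype=np.uint8)

x0, x1 = 250, 1050  # sample (column) range
y0, y1 = 100, 700   # line (row) range

# Rows first, then columns; the stop index of each range is excluded.
cropped = cube[y0:y1, x0:x1, :]
print(cropped.shape)
```

The cropped height is y1 - y0 and the cropped width is x1 - x0, so swapping the x and y ranges silently produces a differently shaped region instead of raising an error.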
Saving a Processed Cube
The SDK can write a lazy processing pipeline directly. This means operations such as calibration, band selection, binning, cropping, and type conversion are evaluated as the output file is written, without first materializing the full processed cube in memory.
reflectance = open_reflectance_cube()
x0, x1 = 250, 1050
y0, y1 = 100, 700
start_band = band_index_for_wavelength(500, reflectance)
stop_band = band_index_for_wavelength(1000, reflectance)
processed = reflectance[y0:y1, x0:x1, start_band:stop_band + 1]
processed = (processed.clip(0, 1) * 255).ensure_dtype(hs.uint8)
OUTPUT_PATH = Path(os.environ.get("HSI_EXAMPLE_OUTPUT", "processed_reflectance_uint8.hdr"))
hs.write(processed, str(OUTPUT_PATH))
print(f"Saved processed reflectance cube to {OUTPUT_PATH} with shape: {processed.shape}")
ENVI (.hdr) is recommended for HSI workflows because it supports richer
metadata than many image formats. The SDK also supports other output formats
(PAM and TIFF), but ENVI is usually the safest choice for processed
hyperspectral cubes.
Reflectance calibration produces floating-point data, which is best for
analysis. If you need to save disk space for visualization or downstream tools
that do not require full precision, scale the clipped reflectance range
0..1 to 0..255 and convert to hs.uint8 as shown above. Keep the
floating-point version if you need quantitative accuracy.
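The effect of the 0..1 to 0..255 conversion can be sketched with NumPy on a handful of values. Note the assumption: NumPy's `astype(np.uint8)` truncates toward zero, and this page does not state whether the SDK conversion rounds or truncates, so the exact integer values may differ by one.

```python
import numpy as np

# Example reflectance values, including out-of-range ones.
reflectance = np.array([-0.1, 0.0, 0.25, 0.5, 1.0, 1.2], dtype=np.float32)

clipped = np.clip(reflectance, 0.0, 1.0)
# Assumption: truncation toward zero, as done by NumPy's astype.
as_uint8 = (clipped * 255.0).astype(np.uint8)

# Map back to 0..1 to measure what the quantization cost.
restored = as_uint8.astype(np.float32) / 255.0
max_error = float(np.abs(restored - clipped).max())

print(as_uint8)
print(reflectance.nbytes, as_uint8.nbytes)
```

The stored data is four times smaller than float32, and the round-trip error stays below one quantization step (1/255, about 0.004 reflectance units), which is usually fine for previews but not for quantitative work.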
Analysis
Analysis workflows help you understand structure in the datacube before moving to supervised modeling or deployment.
Band-Ratio Index
A spectral index compares two bands. This is useful when prior knowledge tells you which wavelengths should separate the material or property of interest.
reflectance = open_reflectance_cube()
band_a = band_index_for_wavelength(970, reflectance)
band_b = band_index_for_wavelength(750, reflectance)
selected_band_indices = sorted([band_a, band_b])
selected = reflectance.select_bands(selected_band_indices).to_numpy_with_interleave(hs.bip)
a = selected[:, :, selected_band_indices.index(band_a)]
b = selected[:, :, selected_band_indices.index(band_b)]
index = (a - b) / (a + b + 1e-8)
plt.imshow(contrast_stretch(index), cmap="viridis")
plt.title("Band-ratio index")
plt.axis("off")
plt.show()
The 1e-8 term keeps the denominator nonzero for pixels where both bands are zero.
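The index arithmetic can be checked on a few hand-picked values. The function name `normalized_difference` and the sample arrays are made up for this sketch.

```python
import numpy as np


def normalized_difference(a, b, eps=1e-8):
    """(a - b) / (a + b), with eps keeping the denominator away from zero."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return (a - b) / (a + b + eps)


# Two tiny 2x2 "band images" with hand-picked reflectance values.
a = np.array([[0.6, 0.0], [0.2, 0.5]])
b = np.array([[0.2, 0.0], [0.6, 0.5]])

index = normalized_difference(a, b)
print(index)
```

For non-negative reflectance the index stays within -1..1: it is positive where band a is brighter, negative where band b is brighter, and zero where they are equal, including pixels where both are zero.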
Anomaly Detection From Background Pixels
If you have a background ROI, you can model its spectral distribution and highlight pixels that are far away from it. This example uses Mahalanobis distance on a spectrally downsampled cube.
The background pixels are selected from an HV Explorer annotation with
select_mask_from_descriptor(). See
Annotations and ROIs for the general pattern
for loading annotations, selecting ROI pixels, and plotting ROIs on preview
images.
class MahalanobisDistance:
    def __init__(self, covariance_model):
        self.covariance_model = covariance_model

    def predict(self, pixels):
        return self.covariance_model.mahalanobis(pixels).reshape(-1, 1)
reflectance = open_reflectance_cube()
ann_file = load_annotations()
background_annotation = find_background_annotation(ann_file)
background = reflectance.select_mask_from_descriptor(background_annotation.descriptor)
# Mask selection returns a compact Image containing only the selected pixels.
# In BIP order this is [selected_pixels, 1, bands].
background_pixels = background.to_numpy_with_interleave(hs.bip)[:, 0, :]
band_step = 8
band_indices = list(range(0, background_pixels.shape[1], band_step))
model = EmpiricalCovariance()
model.fit(background_pixels[:, band_indices])
preview_cube = reflectance[0:250, 0:400, :].select_bands(band_indices)
# Wrap the sklearn-style model so it can run lazily over an Image pipeline.
distance_pipeline = predictor(MahalanobisDistance(model))(preview_cube)
distance = distance_pipeline.to_numpy_with_interleave(hs.bip)[:, :, 0]
plt.imshow(contrast_stretch(distance), cmap="magma")
plt.title("Anomaly map")
plt.axis("off")
plt.show()
This is a good pattern when "anything different from the annotated background" is more important than assigning a known class.
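To show what the distance measures, here is a minimal NumPy sketch of the squared Mahalanobis distance from a background distribution. The function name and the toy two-band data are assumptions for this example. scikit-learn documents `EmpiricalCovariance.mahalanobis()` as returning squared distances, so the sketch also returns squared values.

```python
import numpy as np


def squared_mahalanobis(pixels, background):
    """Squared Mahalanobis distance of each pixel from the background.

    pixels: (n_pixels, n_bands); background: (n_background, n_bands).
    Hypothetical helper for illustration only.
    """
    pixels = np.asarray(pixels, dtype=np.float64)
    background = np.asarray(background, dtype=np.float64)
    mean = background.mean(axis=0)
    # bias=True gives the maximum-likelihood covariance (divide by N).
    covariance = np.cov(background, rowvar=False, bias=True)
    # pinv tolerates a singular covariance, e.g. with too few background pixels.
    precision = np.linalg.pinv(covariance)
    centered = pixels - mean
    return np.einsum("ij,jk,ik->i", centered, precision, centered)


# Toy background with mean (1, 1) and identity covariance.
background = np.array([[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]])
pixels = np.array([[1.0, 1.0], [3.0, 1.0], [1.0, 4.0]])

distances = squared_mahalanobis(pixels, background)
print(distances)
```

With an identity covariance the result equals the squared Euclidean distance from the background mean. In real data the covariance rescales each direction by the background's own variability, so a small deviation in a normally stable band counts for more than a large one in a noisy band. Using fewer bands, as the `band_step` subsampling does, also makes the covariance easier to estimate from a limited number of background pixels.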