Annotations

The annotations module provides a lightweight way to work with region annotations stored in JSON files. Annotations can be loaded from disk, inspected in Python, converted to masks, and used to extract pixel samples from hsi.HSImage objects.

The Python API is available in the hsi.annotations module.

Loading annotation files

Annotation files are represented by hsi.annotations.AnnotationFile objects. An annotation file contains two things:

  • a list of annotations in annotations

  • the property definitions used by those annotations in property_desc

import hsi
import hsi.annotations

ann_file = hsi.annotations.open("annotations.json")

print(ann_file)
print(f"Number of annotations: {len(ann_file.annotations)}")
print(f"Defined properties: {list(ann_file.property_desc)}")

Inspecting annotations

Each hsi.annotations.Annotation contains a title, a UUID, a geometric descriptor, and a set of properties.

annotation = ann_file.annotations[0]

print(annotation.title)
print(annotation.uuid)
print(annotation.descriptor)
print(annotation.properties)

# Example: access individual property values
cls = annotation.properties["cls"]
value = annotation.properties["v"]
print(cls, value)

Property definitions are stored separately in property_desc. This is useful when you want to inspect the allowed labels or the meaning of a numeric value.

cls_property = ann_file.property_desc["cls"]
print(cls_property)

Working with descriptors

Each annotation has an hsi.annotations.Descriptor that defines its geometry. The current descriptor types are rectangles, polygons, and ellipses.

Descriptors expose their bounding box and can generate a boolean mask.

descriptor = annotation.descriptor

print(descriptor.x, descriptor.y)
print(descriptor.width, descriptor.height)

mask = descriptor.mask()
print(mask.shape)

Note

hsi.annotations.Descriptor.mask() returns a mask for the descriptor’s local bounding box, not a full-image mask. This is useful for inspection and custom processing of the annotation region itself.
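If a full-image mask is needed, the local mask can be placed at the descriptor's bounding-box offset. The sketch below uses plain NumPy with made-up dimensions; in real code, `x`, `y`, and the local mask would come from the descriptor, and the assumption that `y` indexes rows and `x` indexes columns should be checked against your data.

```python
import numpy as np

# Hypothetical stand-ins for descriptor.x, descriptor.y, and
# descriptor.mask(); real values come from an hsi.annotations.Descriptor.
image_height, image_width = 100, 120
x, y = 30, 40                                # bounding-box origin (column, row)
local_mask = np.ones((10, 15), dtype=bool)   # local mask for a 15x10 rectangle

# Embed the local mask at the descriptor's position in a full-image mask
full_mask = np.zeros((image_height, image_width), dtype=bool)
h, w = local_mask.shape
full_mask[y:y + h, x:x + w] = local_mask

print(full_mask.sum())  # 150 covered pixels
```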

Extracting pixels from an image

The most convenient way to use annotations together with image data is hsi.HSImage.select_mask_from_descriptor(). This method applies the descriptor in image coordinates and returns the covered pixels as a flattened image.

import hsi
import hsi.annotations

img = hsi.open("image.hdr")
ann_file = hsi.annotations.open("annotations.json")

annotation = ann_file.annotations[0]

selected = img.select_mask_from_descriptor(annotation.descriptor)
spectra = selected.to_numpy()

print(spectra.shape)

The output is flattened to \(L \times 1 \times B\), where \(L\) is the number of covered pixels and \(B\) is the number of bands in the source image.
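Because the middle axis always has length 1, the array can be reduced to a plain \(L \times B\) matrix when a 2-D layout is more convenient. The snippet below illustrates this with a stand-in NumPy array of the same shape:

```python
import numpy as np

# Stand-in for selected.to_numpy(): 50 covered pixels, 1 column, 224 bands
spectra = np.zeros((50, 1, 224))

# Drop the singleton axis to get an L x B matrix of pixel spectra
matrix = spectra[:, 0, :]
print(matrix.shape)  # (50, 224)
```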

This makes it straightforward to collect samples from multiple annotations.

samples = []

for annotation in ann_file.annotations:
    selected = img.select_mask_from_descriptor(annotation.descriptor)
    samples.append(
        {
            "title": annotation.title,
            "properties": annotation.properties,
            "spectra": selected.to_numpy(),
        }
    )

Writing annotation files

Annotation files can be written back to JSON using hsi.annotations.write().

import hsi.annotations

ann_file = hsi.annotations.open("annotations.json")

# Save a copy of the file
hsi.annotations.write(ann_file, "annotations_copy.json")

This is useful when you want to load annotation metadata, process it in Python, and store the result in the same format.

Training a simple linear SVC

Annotation files are also a convenient way to build labeled training data. A common pattern is to use one annotation property as the class label, map the stored label strings to class IDs using property_desc, extract the covered spectra for each annotation, and then fit a model on the resulting samples.

import numpy as np
from sklearn.svm import LinearSVC

import hsi
import hsi.annotations

img = hsi.open("image.hdr")
ann_file = hsi.annotations.open("annotations.json")

class_property = ann_file.property_desc["cls"]
class_to_id = {label: idx for idx, label in enumerate(class_property.labels)}

x_train = []
y_train = []

for annotation in ann_file.annotations:
    label_name = annotation.properties["cls"]
    label_id = class_to_id[label_name]

    selected = img.select_mask_from_descriptor(annotation.descriptor)
    spectra = selected.to_numpy_with_interleave(hsi.bip)[:, 0, :]

    x_train.append(spectra)
    y_train.extend([label_id] * len(spectra))

x_train = np.concatenate(x_train, axis=0)
y_train = np.asarray(y_train)

model = LinearSVC()
model.fit(x_train, y_train)

In this example, the cls property stores the class as a string, while property_desc["cls"] defines the full label set. The example converts each label string to a numeric class ID before training. Each covered pixel becomes one training sample, so the final training matrix has shape \(N \times B\), where \(N\) is the total number of selected pixels and \(B\) is the number of bands.
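Once fitted, the model classifies individual pixel spectra row by row. The snippet below trains on a synthetic \(N \times B\) matrix purely for illustration; with real data, `x_train` and `y_train` come from the annotation loop above.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the N x B training matrix built above:
# two classes with clearly separated mean spectra over 8 bands
x_train = np.concatenate([
    rng.normal(0.2, 0.05, size=(30, 8)),
    rng.normal(0.8, 0.05, size=(30, 8)),
])
y_train = np.array([0] * 30 + [1] * 30)

model = LinearSVC()
model.fit(x_train, y_train)

# Predict class IDs for new pixel spectra (rows are pixels, columns are bands)
new_spectra = np.array([[0.2] * 8, [0.8] * 8])
print(model.predict(new_spectra))  # [0 1]
```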

Note

This approach works best when each annotation covers a reasonably homogeneous region. If annotations contain mixed material, the extracted pixel spectra will also mix classes.