Annotations¶
The annotations module provides a lightweight way to work with region annotations stored in JSON files.
Annotations can be loaded from disk, inspected in Python, converted to masks, and used to extract pixel
samples from hsi.HSImage objects.
The Python API is available in the hsi.annotations module.
Loading annotation files¶
Annotation files are represented by hsi.annotations.AnnotationFile objects. An annotation file contains
two things:
- a list of annotations in annotations
- the property definitions used by those annotations in property_desc
import hsi
import hsi.annotations
ann_file = hsi.annotations.open("annotations.json")
print(ann_file)
print(f"Number of annotations: {len(ann_file.annotations)}")
print(f"Defined properties: {list(ann_file.property_desc)}")
Inspecting annotations¶
Each hsi.annotations.Annotation contains a title, a UUID, a geometric descriptor, and a set of
properties.
annotation = ann_file.annotations[0]
print(annotation.title)
print(annotation.uuid)
print(annotation.descriptor)
print(annotation.properties)
# Example: access individual property values
cls = annotation.properties["cls"]
value = annotation.properties["v"]
print(cls, value)
Property definitions are stored separately in property_desc. This is useful when you want to inspect the
allowed labels or the meaning of a numeric value.
cls_property = ann_file.property_desc["cls"]
print(cls_property)
Working with descriptors¶
Each annotation has an hsi.annotations.Descriptor that defines its geometry. The current descriptor types
are rectangles, polygons, and ellipses.
Descriptors expose their bounding box and can generate a boolean mask.
descriptor = annotation.descriptor
print(descriptor.x, descriptor.y)
print(descriptor.width, descriptor.height)
mask = descriptor.mask()
print(mask.shape)
Note
hsi.annotations.Descriptor.mask() returns a mask for the descriptor’s local bounding box, not a full-image
mask. This is useful for inspection and custom processing of the annotation region itself.
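If a full-image mask is needed, the local mask can be placed at the descriptor's offset by hand. A minimal numpy sketch, assuming a 2-D boolean mask from descriptor.mask() and the descriptor's x, y coordinates (the helper name and the concrete shapes below are illustrative, not part of the hsi API):

```python
import numpy as np

def embed_local_mask(local_mask, x, y, image_shape):
    """Place a descriptor-local boolean mask into a full-size image mask.

    local_mask: 2-D boolean array, e.g. from descriptor.mask()
    x, y: top-left corner of the descriptor's bounding box
    image_shape: (height, width) of the target image
    """
    full = np.zeros(image_shape, dtype=bool)
    h, w = local_mask.shape
    full[y:y + h, x:x + w] = local_mask
    return full

# Hypothetical 3x4 local mask placed at (x=2, y=1) in a 6x8 image
local = np.ones((3, 4), dtype=bool)
full = embed_local_mask(local, x=2, y=1, image_shape=(6, 8))
print(full.sum())  # 12 pixels set
```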
Extracting pixels from an image¶
The most convenient way to use annotations together with image data is
hsi.HSImage.select_mask_from_descriptor(). This method applies the descriptor in
image coordinates and returns the covered pixels as a flattened image.
import hsi
import hsi.annotations
img = hsi.open("image.hdr")
ann_file = hsi.annotations.open("annotations.json")
annotation = ann_file.annotations[0]
selected = img.select_mask_from_descriptor(annotation.descriptor)
spectra = selected.to_numpy()
print(spectra.shape)
The output is flattened to \(L \times 1 \times B\), where \(L\) is the number of covered pixels and \(B\) is the number of bands in the source image.
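Because the result is a regular array, standard numpy reductions apply directly. For instance, a per-annotation mean spectrum can be computed from the \(L \times 1 \times B\) array; the sketch below uses synthetic data in place of selected.to_numpy(), with the shapes as the only assumption:

```python
import numpy as np

# Stand-in for selected.to_numpy(): 5 covered pixels, 10 bands
spectra = np.random.rand(5, 1, 10)

# Collapse the singleton axis and average over the covered pixels
mean_spectrum = spectra[:, 0, :].mean(axis=0)
print(mean_spectrum.shape)  # (10,)
```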
This makes it straightforward to collect samples from multiple annotations.
samples = []
for annotation in ann_file.annotations:
    selected = img.select_mask_from_descriptor(annotation.descriptor)
    samples.append(
        {
            "title": annotation.title,
            "properties": annotation.properties,
            "spectra": selected.to_numpy(),
        }
    )
Writing annotation files¶
Annotation files can be written back to JSON using hsi.annotations.write().
import hsi.annotations
ann_file = hsi.annotations.open("annotations.json")
# Save a copy of the file
hsi.annotations.write(ann_file, "annotations_copy.json")
This is useful when you want to load annotation metadata, process it in Python, and store the result in the same format.
Training a simple linear SVC¶
Annotation files are also a convenient way to build labeled training data. A common pattern is to use one annotation
property as the class label, map the stored label strings to class IDs using property_desc, extract the
covered spectra for each annotation, and then fit a model on the resulting samples.
import numpy as np
from sklearn.svm import LinearSVC
import hsi
import hsi.annotations
img = hsi.open("image.hdr")
ann_file = hsi.annotations.open("annotations.json")
class_property = ann_file.property_desc["cls"]
class_to_id = {label: idx for idx, label in enumerate(class_property.labels)}
x_train = []
y_train = []
for annotation in ann_file.annotations:
    label_name = annotation.properties["cls"]
    label_id = class_to_id[label_name]
    selected = img.select_mask_from_descriptor(annotation.descriptor)
    spectra = selected.to_numpy_with_interleave(hsi.bip)[:, 0, :]
    x_train.append(spectra)
    y_train.extend([label_id] * len(spectra))
x_train = np.concatenate(x_train, axis=0)
y_train = np.asarray(y_train)
model = LinearSVC()
model.fit(x_train, y_train)
In this example, the cls property stores the class as a string, while property_desc["cls"] defines
the full label set. The example converts each label string to a numeric class ID before training. Each covered pixel
becomes one training sample, so the final training matrix has shape \(N \times B\), where \(N\) is the total
number of selected pixels and \(B\) is the number of bands.
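Since annotations of different sizes contribute very different numbers of pixels, it can be worth checking the class balance before fitting. A small sketch using np.bincount on the label vector (the labels and counts here are synthetic, and the label names are hypothetical):

```python
import numpy as np

# Synthetic stand-ins for y_train and class_property.labels
y_train = np.array([0, 0, 1, 1, 1, 2])
labels = ["soil", "vegetation", "water"]  # hypothetical label set

# Count training samples per class ID
counts = np.bincount(y_train, minlength=len(labels))
for name, count in zip(labels, counts):
    print(f"{name}: {count} samples")
```

A strongly imbalanced count suggests weighting classes (e.g. LinearSVC's class_weight="balanced") or subsampling large annotations.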
Note
This approach works best when each annotation covers a reasonably homogeneous region. If annotations contain mixed materials, the extracted pixel spectra will also mix classes.