camfi.datamodel.via module

Defines data structures relating to VGG Image Annotator. Depends on camfi.datamodel.geometry.

class camfi.datamodel.via.ViaFileAttributes(*, datetime_corrected: datetime.datetime = None, datetime_original: datetime.datetime = None, exposure_time: pydantic.types.PositiveFloat = None, location: str = None, pixel_x_dimension: pydantic.types.PositiveInt = None, pixel_y_dimension: pydantic.types.PositiveInt = None)

Bases: pydantic.main.BaseModel

Contains file-level metadata for a single photograph.

Parameters
  • datetime_corrected (Optional[datetime]) – Time image was taken (after taking the error of the camera’s clock into account).

  • datetime_original (Optional[datetime]) – Time image was taken (according to camera).

  • exposure_time (Optional[PositiveFloat]) – The exposure time of the image in seconds, as reported by the camera.

  • location (Optional[str]) – The location where the image was taken.

  • pixel_x_dimension (Optional[PositiveInt]) – The width of the image in pixels.

  • pixel_y_dimension (Optional[PositiveInt]) – The height of the image in pixels.
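The relationship between datetime_original and datetime_corrected can be sketched with the standard library alone. The helper make_offset_corrector below is hypothetical (it is not part of camfi) and assumes the camera clock error is a constant offset, measured once by comparing the camera clock with a trusted clock.

```python
from datetime import datetime
from typing import Callable


# Hypothetical helper, not part of camfi: builds a corrector for a constant clock error.
def make_offset_corrector(
    camera_time: datetime, true_time: datetime
) -> Callable[[datetime], datetime]:
    offset = true_time - camera_time

    def corrector(dt: datetime) -> datetime:
        # Shift a camera-reported time by the measured clock error.
        return dt + offset

    return corrector


# Assumed observation: the camera read 20:00 when the true time was 21:00.
corrector = make_offset_corrector(
    camera_time=datetime(2019, 11, 1, 20, 0, 0),
    true_time=datetime(2019, 11, 1, 21, 0, 0),
)

datetime_original = datetime(2019, 11, 14, 20, 30, 29)
datetime_corrected = corrector(datetime_original)
print(datetime_corrected)  # 2019-11-14 21:30:29
```

A callable with this shape (datetime in, datetime out) matches the Callable[[datetime.datetime], datetime.datetime] type that appears in the datetime_corrector signatures in this module.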

class camfi.datamodel.via.ViaMetadata(*, file_attributes: camfi.datamodel.via.ViaFileAttributes, filename: pathlib.Path, regions: list, size: int = -1)

Bases: pydantic.main.BaseModel

Combines file-level image metadata with a list of annotations contained within the image.

Parameters
  • file_attributes (ViaFileAttributes) – File-level image metadata.

  • filename (Path) – Relative path to image file.

  • regions (list[ViaRegion]) – List of flying insect annotations.

  • size (int = -1) – Not used. (Included for compatibility with VIA).

filter_regions(region_filters: camfi.datamodel.region_filter_config.RegionFilterConfig) None

Filters regions in-place.

Parameters

region_filters (RegionFilterConfig) – Filters to apply.

get_bounding_boxes() list

Calls .get_bounding_box on each region in self.regions.

Returns

boxes – list of bounding boxes, with one BoundingBox per item in self.regions.

Return type

list[BoundingBox]

get_labels() list

Returns a list of 1s with the same length as self.regions.

Returns

labels – [1, 1, 1, …]

Return type

list[int]
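As a rough illustration of how get_bounding_boxes and get_labels line up (one box and one label per region), here is a sketch that uses plain tuples in place of camfi's ViaRegion and BoundingBox. The bounding_box helper is hypothetical and computes a tight min/max box; whether camfi pads the box or treats its upper edge as exclusive is not covered by this sketch.

```python
from typing import List, Tuple

# Stand-ins for illustration only, not camfi types.
Polyline = Tuple[List[float], List[float]]  # (all_points_x, all_points_y)
Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def bounding_box(polyline: Polyline) -> Box:
    """Tight box around the points of a polyline (assumed convention)."""
    xs, ys = polyline
    return (min(xs), min(ys), max(xs), max(ys))


regions: List[Polyline] = [
    ([1, 3], [15, 13]),
    ([10, 12, 14], [5, 9, 7]),
]

boxes = [bounding_box(region) for region in regions]  # one box per region
labels = [1] * len(regions)  # a single class, so every label is 1

print(boxes)   # [(1, 13, 3, 15), (10, 5, 14, 9)]
print(labels)  # [1, 1]
```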

load_exif_metadata(root: pathlib.Path = PosixPath('.'), location: Optional[str] = None, datetime_corrector: Optional[Callable[[datetime.datetime], datetime.datetime]] = None) None

Extract EXIF metadata from an image file and put it in self.file_attributes.

Note: this will overwrite all contents in self.file_attributes.

EXIF tags loaded:
  • datetime_original: datetime

  • exposure_time: PositiveFloat

  • pixel_x_dimension: PositiveInt

  • pixel_y_dimension: PositiveInt

Extra tags:
  • datetime_corrected: datetime

    if datetime_corrector is set, this is calculated by calling datetime_corrector(datetime_original).

  • location: str

    set if location is set

Parameters
  • root (Path) – Root directory from which the relative path in self.filename is resolved. Defaults to current working directory. If a str is passed it will be coerced to a Path.

  • location (Optional[str]) – If set, stored as location in self.file_attributes.

  • datetime_corrector (Optional[DatetimeCorrector]) – If set, used to calculate datetime_corrected from datetime_original.

Returns

Operates in place.

Return type

None

Examples

>>> from camfi.datamodel.via import ViaFileAttributes, ViaMetadata
>>> metadata = ViaMetadata(
...     file_attributes=ViaFileAttributes(),
...     filename="camfi/test/data/DSCF0010.JPG",
...     regions=[],
... )
>>> metadata.load_exif_metadata()
>>> metadata.file_attributes.datetime_original
datetime.datetime(2019, 11, 14, 20, 30, 29)
>>> print(round(metadata.file_attributes.exposure_time, 6))
0.111111
>>> metadata.file_attributes.pixel_y_dimension
3456
>>> metadata.file_attributes.pixel_x_dimension
4608

Optionally specify root directory. Here we are loading the same file, but using a root parameter. Note that root may also be a relative path (as in this case). Absolute paths are also acceptable.

>>> metadata_with_root = ViaMetadata(
...     file_attributes=ViaFileAttributes(),
...     filename="data/DSCF0010.JPG",
...     regions=[],
... )
>>> metadata_with_root.load_exif_metadata(root="camfi/test")
>>> metadata_with_root.file_attributes == metadata.file_attributes
True

If location is set, it is stored in self.file_attributes.location.

>>> metadata = ViaMetadata(
...     file_attributes=ViaFileAttributes(),
...     filename="camfi/test/data/DSCF0010.JPG",
...     regions=[],
... )
>>> metadata.load_exif_metadata(location="cabramurra")
>>> metadata.file_attributes.location
'cabramurra'

If a time correction needs to be made (for example if the camera’s clock is known to have been incorrectly set), then we can correct the datetime by supplying a function to the datetime_corrector parameter.

>>> from datetime import timedelta
>>> metadata = ViaMetadata(
...     file_attributes=ViaFileAttributes(),
...     filename="camfi/test/data/DSCF0010.JPG",
...     regions=[],
... )
>>> metadata.load_exif_metadata(
...     datetime_corrector=lambda dt: dt - timedelta(days=30)
... )
>>> metadata.file_attributes.datetime_original
datetime.datetime(2019, 11, 14, 20, 30, 29)
>>> metadata.file_attributes.datetime_corrected
datetime.datetime(2019, 10, 15, 20, 30, 29)

read_image(root: pathlib.Path = PosixPath('.')) torch.Tensor

Read an image from a file.

Parameters

root (Path) – Root directory from which the relative path in self.filename is resolved. Defaults to current working directory. If a str is passed it will be coerced to a Path.

Returns

Image as RGB float32 tensor.

Return type

torch.Tensor[colour, height (y), width (x)]

Examples

>>> from camfi.datamodel.via import ViaFileAttributes, ViaMetadata
>>> metadata = ViaMetadata(
...     file_attributes=ViaFileAttributes(),
...     filename="camfi/test/data/DSCF0010.JPG",
...     regions=[],
... )
>>> image = metadata.read_image()
>>> image.shape == (3, 3456, 4608)
True
>>> image.dtype
torch.float32

Optionally specify root directory. Here we are loading the same file, but using a root parameter. Note that root may also be a relative path (as in this case). Absolute paths are also acceptable.

>>> metadata_with_root = ViaMetadata(
...     file_attributes=ViaFileAttributes(),
...     filename="data/DSCF0010.JPG",
...     regions=[],
... )
>>> image_with_root = metadata_with_root.read_image(root="camfi/test")
>>> image_with_root.allclose(image)
True

snap_to_bounds(bounds: camfi.datamodel.geometry.BoundingBox) None

Snaps regions to bounds. Operates in place.

Parameters

bounds (BoundingBox) – Regions which are not in bounds are snapped.

class camfi.datamodel.via.ViaProject(*, _via_attributes: dict, _via_img_metadata: dict, _via_settings: dict)

Bases: pydantic.main.BaseModel

Defines the structure of a VIA project file. Can be used for loading and saving VIA project data.

Parameters
  • via_attributes (dict) – Unused by camfi. (Included for compatibility with VIA).

  • via_img_metadata (dict[str, ViaMetadata]) – dict of {str: ViaMetadata} pairs. Keys can be arbitrary strings, however they usually bear some resemblance to the .filename attribute of the ViaMetadata instance.

  • via_settings (dict) – Unused by camfi. (Included for compatibility with VIA).
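The leading underscores in the constructor signature (_via_attributes, _via_img_metadata, _via_settings) correspond to the top-level keys of a VIA project file. The sketch below reads a tiny, made-up project with the standard json module only, to show the nesting that ViaProject, ViaMetadata and ViaRegion model. The image key and the region contents are invented for illustration.

```python
import json

# A minimal, made-up VIA project. Real files usually carry more settings.
raw = """
{
  "_via_settings": {},
  "_via_img_metadata": {
    "DSCF0010.JPG-1": {
      "filename": "DSCF0010.JPG",
      "size": -1,
      "regions": [
        {
          "shape_attributes": {
            "name": "polyline",
            "all_points_x": [1, 3],
            "all_points_y": [15, 13]
          },
          "region_attributes": {}
        }
      ],
      "file_attributes": {}
    }
  },
  "_via_attributes": {}
}
"""

project = json.loads(raw)

# One entry per image; each entry lists that image's annotations.
for img_key, meta in project["_via_img_metadata"].items():
    print(img_key, meta["filename"], len(meta["regions"]))
# DSCF0010.JPG-1 DSCF0010.JPG 1
```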

class Config

Bases: object

Sets pydantic.BaseModel configuration of ViaProject.

alias_generator()

__config__

alias of camfi.datamodel.via.Config

__or__(other: camfi.datamodel.via.ViaProject) camfi.datamodel.via.ViaProject

Returns a new ViaProject instance with via_img_metadata taken from combining self and other. via_attributes and via_settings are taken from self. If an image key appears in both projects, then the value from other is taken (as per the convention for | on dict in Python).
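The merge rule can be illustrated with plain dicts standing in for via_img_metadata. This is only a sketch of the dict convention the description refers to, not camfi's implementation. The values are placeholder strings.

```python
# Plain dicts stand in for the via_img_metadata of two projects.
self_images = {"a.jpg": "self-a", "b.jpg": "self-b"}
other_images = {"b.jpg": "other-b", "c.jpg": "other-c"}

# Equivalent to `self_images | other_images` on Python 3.9+.
merged = {**self_images, **other_images}

print(merged)
# {'a.jpg': 'self-a', 'b.jpg': 'other-b', 'c.jpg': 'other-c'}
```

The shared key b.jpg keeps its position from the left operand but takes its value from the right operand, and neither input dict is modified.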

filter_inplace(function: Callable[[camfi.datamodel.via.ViaMetadata], bool]) None

Filters images in self.via_img_metadata in-place.

Parameters

function (Callable[[ViaMetadata], bool]) – Called on each value in self.via_img_metadata to determine if it should be included in output.

filtered_copy(function: Callable[[camfi.datamodel.via.ViaMetadata], bool], deep: bool = False) camfi.datamodel.via.ViaProject

Filters images in self.via_img_metadata, returning a new ViaProject instance.

Parameters
  • function (Callable[[ViaMetadata], bool]) – Called on each value in self.via_img_metadata to determine if it should be included in output.

  • deep (bool) – If True, make a deep copy.

Returns

project – Copy of self with via_img_metadata filtered.

Return type

ViaProject
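The difference between filtering in place and returning a filtered copy can be sketched with a plain dict in place of via_img_metadata. The has_annotations predicate and the data are made up for illustration. The copy shown is shallow, which is assumed here to correspond to deep=False.

```python
from typing import Dict, List

# Stand-in for via_img_metadata: image key -> list of annotations.
images: Dict[str, List[str]] = {
    "a.jpg": ["moth"],
    "b.jpg": [],
    "c.jpg": ["moth", "moth"],
}


def has_annotations(regions: List[str]) -> bool:
    """Hypothetical predicate: keep images with at least one annotation."""
    return len(regions) > 0


# Filtered copy: a new dict is built, the original is left untouched.
filtered = {key: value for key, value in images.items() if has_annotations(value)}

# In-place filter: collect the keys first so the dict is not mutated mid-iteration.
for key in [k for k, v in images.items() if not has_annotations(v)]:
    del images[key]

print(sorted(filtered))  # ['a.jpg', 'c.jpg']
print(sorted(images))    # ['a.jpg', 'c.jpg']
```

Because the copy is shallow, filtered["a.jpg"] and images["a.jpg"] are the same list object. A deep copy would duplicate the values as well.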

formatted_json(**kwargs) str

Like json, but always uses by_alias=True, indent=2, and exclude_none=True.
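The effect of indent=2 and exclude_none=True can be approximated with the standard json module. The drop_none helper below is hypothetical and only mimics the idea of omitting fields whose value is None. It is not how pydantic implements exclude_none.

```python
import json
from typing import Any


def drop_none(value: Any) -> Any:
    """Recursively remove dict entries whose value is None (illustrative helper)."""
    if isinstance(value, dict):
        return {k: drop_none(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [drop_none(v) for v in value]
    return value


data = {
    "_via_img_metadata": {
        "img": {
            "filename": "DSCF0010.JPG",
            "file_attributes": {"location": None},
        }
    }
}

text = json.dumps(drop_none(data), indent=2)
print(text)
```

Unset attributes such as location disappear from the output, which keeps project files small when most images carry no metadata yet.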

load_all_exif_metadata(root: pathlib.Path = PosixPath('.'), location_dict: Optional[Mapping[pathlib.Path, Optional[str]]] = None, datetime_correctors: Optional[Mapping[pathlib.Path, Optional[Callable[[datetime.datetime], datetime.datetime]]]] = None, disable_progress_bar: Optional[bool] = True) None

Calls the .load_exif_metadata method on all ViaMetadata instances in self.via_img_metadata, extracting the EXIF metadata from each image file.

Parameters
  • root (Path) – Root directory from which the relative path in self.filename is resolved. Defaults to current working directory. If a str is passed it will be coerced to a Path.

  • location_dict (Optional[Mapping[Path, Optional[str]]]) – A mapping from filenames (i.e. relative paths to images under root) to location strings, which are passed to ViaMetadata.load_exif_metadata. Typically, an instance of camfi.util.SubDirdict should be used.

  • datetime_correctors (Optional[Mapping[Path, Optional[DatetimeCorrector]]]) – A mapping from filenames (i.e. relative paths to images under root) to DatetimeCorrector instances, which are passed to ViaMetadata.load_exif_metadata. Typically, an instance of camfi.util.SubDirdict should be used.

  • disable_progress_bar (Optional[bool]) – If True (default), progress bar is disabled. If set to None, disable on non-TTY.

Returns

Operates in place.

Return type

None
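Writing one mapping entry per image quickly becomes tedious when all images in a directory share a location or clock correction. The sketch below shows one way a per-directory lookup could work, assuming that a file inherits the value of its nearest parent directory present in the mapping. The lookup_by_parent function and the directory names are hypothetical, and the actual behaviour of camfi.util.SubDirdict is not described in this section.

```python
from pathlib import Path
from typing import Dict, Optional


def lookup_by_parent(table: Dict[Path, str], filename: Path) -> Optional[str]:
    """Return the value for filename, or for its nearest parent directory in table."""
    for candidate in [filename, *filename.parents]:
        if candidate in table:
            return table[candidate]
    return None


# Made-up layout: every image under this directory was taken at one site.
locations = {Path("2019-11_cabramurra"): "cabramurra"}

print(lookup_by_parent(locations, Path("2019-11_cabramurra/0001/DSCF0010.JPG")))
# cabramurra
print(lookup_by_parent(locations, Path("elsewhere/DSCF0001.JPG")))
# None
```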

Examples

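>>> from datetime import timedelta
>>> from pathlib import Path
>>> from camfi.datamodel.via import ViaProject
>>> with open("camfi/test/data/sample_project_images_included.json") as f: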
...     project = ViaProject.parse_raw(f.read())

The file which has been loaded contains no EXIF metadata.

>>> for meta in project.via_img_metadata.values():
...     print(meta.filename, str(meta.file_attributes.datetime_original))
DSCF0010.JPG None
DSCF0011.JPG None

After load_all_exif_metadata is called, the project does contain image metadata.

>>> project.load_all_exif_metadata(root=Path("camfi/test/data"))
>>> for meta in project.via_img_metadata.values():
...     print(meta.filename, str(meta.file_attributes.datetime_original))
DSCF0010.JPG 2019-11-14 20:30:29
DSCF0011.JPG 2019-11-14 20:40:32

If location_dict and/or datetime_correctors are set, the metadata will include location and/or datetime_corrected, respectively. Normally, these would be set as instances of camfi.util.SubDirdict, but for brevity we use a regular dict for each of them here.

>>> project.load_all_exif_metadata(
...     root=Path("camfi/test/data"),
...     location_dict={
...         Path("DSCF0010.JPG"): "loc0", Path("DSCF0011.JPG"): "loc1"
...     },
...     datetime_correctors={
...         Path("DSCF0010.JPG"): lambda dt: dt + timedelta(hours=1),
...         Path("DSCF0011.JPG"): lambda dt: dt - timedelta(hours=1),
...     },
... )
>>> for meta in project.via_img_metadata.values():
...     print(
...         meta.filename,
...         meta.file_attributes.location,
...         str(meta.file_attributes.datetime_corrected),
...     )
DSCF0010.JPG loc0 2019-11-14 21:30:29
DSCF0011.JPG loc1 2019-11-14 19:40:32

to_image_dataframe(tz: Union[None, str, datetime.tzinfo] = None) pandas.core.frame.DataFrame

Returns a Pandas DataFrame with one row per image.

Parameters

tz (Optional[tzinfo]) – If set, all datetime_corrected values will be converted to the specified timezone.

Returns

df – DataFrame with one row per image. Contains columns for each field in ViaFileAttributes, as well as img_key, filename, and n_annotations. Note: values in datetime_corrected (but not datetime_original) will be converted to pandas.Timestamp, with timezone conversion/localization applied if applicable.

Return type

pd.DataFrame
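The phrase "timezone conversion/localization" covers two different operations. Here is a standard-library sketch of the distinction, assuming that naive datetimes are localized (the timezone is attached without changing the wall-clock time) and aware datetimes are converted (same instant, different wall-clock time). The to_timezone helper is hypothetical, and the sketch does not claim to reproduce what camfi or pandas does internally.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def to_timezone(dt: datetime, tz: Optional[timezone]) -> datetime:
    """Localize naive datetimes and convert aware ones (assumed behaviour)."""
    if tz is None:
        return dt
    if dt.tzinfo is None:
        return dt.replace(tzinfo=tz)  # localization: wall-clock time unchanged
    return dt.astimezone(tz)  # conversion: same instant, new wall-clock time


aedt = timezone(timedelta(hours=11))  # fixed UTC+11 offset, for illustration

naive = datetime(2019, 11, 14, 21, 30, 29)
aware_utc = datetime(2019, 11, 14, 10, 30, 29, tzinfo=timezone.utc)

print(to_timezone(naive, aedt))      # 2019-11-14 21:30:29+11:00
print(to_timezone(aware_utc, aedt))  # 2019-11-14 21:30:29+11:00
```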

to_region_dataframe() pandas.core.frame.DataFrame

Returns a Pandas DataFrame with one row per region (annotation).

Returns

regions – DataFrame with a row for every annotation in self.via_img_metadata. Contains a column for every field in ViaFileAttributes and ViaRegionAttributes, as well as img_key, filename, and name columns.

Return type

pd.DataFrame
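The two table shapes (one row per image versus one row per region) can be sketched with lists of dicts. The nested input below is a simplified stand-in for a project, and the score field is a made-up region attribute. The real column sets are those described above.

```python
# Simplified stand-in for a project: image key -> flattened metadata.
images = {
    "img0": {
        "filename": "DSCF0010.JPG",
        "location": "loc0",
        "regions": [{"score": 0.9}, {"score": 0.4}],
    },
    "img1": {"filename": "DSCF0011.JPG", "location": "loc1", "regions": []},
}

# One row per image, with a count of its annotations.
image_rows = [
    {
        "img_key": key,
        "filename": meta["filename"],
        "location": meta["location"],
        "n_annotations": len(meta["regions"]),
    }
    for key, meta in images.items()
]

# One row per region: image-level fields are repeated on every row.
region_rows = [
    {"img_key": key, "filename": meta["filename"], "location": meta["location"], **region}
    for key, meta in images.items()
    for region in meta["regions"]
]

print(len(image_rows), len(region_rows))  # 2 2
```

Images without annotations still appear in the per-image rows (with n_annotations equal to 0) but contribute nothing to the per-region rows. Either list can be passed to pandas.DataFrame to obtain a table.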

class camfi.datamodel.via.ViaRegion(*, region_attributes: camfi.datamodel.via_region_attributes.ViaRegionAttributes, shape_attributes: Union[camfi.datamodel.geometry.PolylineShapeAttributes, camfi.datamodel.geometry.CircleShapeAttributes, camfi.datamodel.geometry.PointShapeAttributes])

Bases: pydantic.main.BaseModel

Combines region metadata with a geometry to define a complete annotation of a single flying insect motion blur.

Parameters
  • region_attributes (ViaRegionAttributes) – Metadata of annotation.

  • shape_attributes (Union[PolylineShapeAttributes, CircleShapeAttributes, PointShapeAttributes]) – Geometry of annotation.

get_bounding_box() camfi.datamodel.geometry.BoundingBox

Calls .get_bounding_box on self.shape_attributes to get the bounding box of the annotation.

Returns

box – Bounding box of annotation.

Return type

BoundingBox

in_box(box: camfi.datamodel.geometry.BoundingBox) bool

Returns True if all points in region are within bounding box.

Parameters

box (BoundingBox) – Box to test against.
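The results in the Examples for this method are consistent with a half-open containment test, where x0 and y0 are inclusive and x1 and y1 are exclusive. The sketch below reproduces those results with a stand-in Box dataclass. The half-open rule is inferred from the example outputs rather than taken from camfi's source, and Box and all_points_in_box are hypothetical names.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Box:
    """Stand-in for a bounding box, not camfi's BoundingBox."""
    x0: int
    y0: int
    x1: int
    y1: int


def all_points_in_box(xs: Sequence[int], ys: Sequence[int], box: Box) -> bool:
    # Lower edges inclusive, upper edges exclusive (inferred convention).
    return all(box.x0 <= x < box.x1 for x in xs) and all(
        box.y0 <= y < box.y1 for y in ys
    )


xs, ys = [1, 3], [15, 13]
print(all_points_in_box(xs, ys, Box(x0=1, y0=13, x1=4, y1=16)))  # True
print(all_points_in_box(xs, ys, Box(x0=1, y0=13, x1=3, y1=16)))  # False
```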

Examples

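>>> from camfi.datamodel.geometry import BoundingBox, PolylineShapeAttributes
>>> from camfi.datamodel.via import ViaRegion
>>> from camfi.datamodel.via_region_attributes import ViaRegionAttributes
>>> polyline = PolylineShapeAttributes(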
...     all_points_x=[1, 3],
...     all_points_y=[15, 13],
... )
>>> region = ViaRegion(
...     region_attributes=ViaRegionAttributes(),
...     shape_attributes=polyline,
... )
>>> region.in_box(BoundingBox(x0=1, y0=13, x1=4, y1=16))
True
>>> region.in_box(BoundingBox(x0=2, y0=13, x1=4, y1=16))
False
>>> region.in_box(BoundingBox(x0=1, y0=14, x1=4, y1=16))
False
>>> region.in_box(BoundingBox(x0=1, y0=13, x1=3, y1=16))
False
>>> region.in_box(BoundingBox(x0=1, y0=13, x1=4, y1=15))
False

passes_filter(filters: camfi.datamodel.region_filter_config.RegionFilterConfig) bool

Determine whether self passes filters.

Parameters

filters (RegionFilterConfig) – Filters to check.

Returns

passes – True if all filters are passed.

Return type

bool

snap_to_bounds(bounds: camfi.datamodel.geometry.BoundingBox) None

Moves self.shape_attributes so that it is completely contained within given bounds. If self.shape_attributes is a PolylineShapeAttributes, and it is moved, it is converted to a CircleShapeAttributes (and any invalid region attributes are removed). Operates in-place.

Parameters

bounds (BoundingBox) – Bounds to snap annotation to.
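As a rough picture of what snapping means for the simplest geometry, the sketch below clamps a single point into a box. It assumes a half-open box with an exclusive upper edge and integer pixel coordinates, so the largest valid coordinate is x1 - 1. The Box class and snap_point function are hypothetical, and camfi's handling of circles and of the polyline-to-circle conversion is not reproduced here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box:
    """Stand-in for a bounding box, not camfi's BoundingBox."""
    x0: int
    y0: int
    x1: int
    y1: int


def snap_point(point: Tuple[int, int], bounds: Box) -> Tuple[int, int]:
    """Clamp a point into bounds, treating x1 and y1 as exclusive (assumption)."""
    x, y = point
    snapped_x = min(max(x, bounds.x0), bounds.x1 - 1)
    snapped_y = min(max(y, bounds.y0), bounds.y1 - 1)
    return (snapped_x, snapped_y)


bounds = Box(x0=0, y0=0, x1=10, y1=10)
print(snap_point((-5, 20), bounds))  # (0, 9)
print(snap_point((4, 4), bounds))    # (4, 4)
```

Points already inside the bounds are returned unchanged, so applying the function twice gives the same result as applying it once.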