Hawk

Hawk-specific data loading, DMD wrappers, and visualisation.

Everything specific to the hawk flight dataset lives here: NPZ file loading, mean-shape retrieval via morphing_birds.Hawk3D, batch RMSE analysis, and marker-position plots. General-purpose DMD logic is in the other modules; this layer fills in hawk defaults.

Functions:

Name	Description
`load_bird_data`	Load a hawk NPZ file into a DataFrame.
`get_average_shape`	Return the mean hawk shape via `Hawk3D`.
`normalise_hawk_data`	Centre hawk markers by subtracting the mean shape.
`bin_hawk_dataframe`	Bin with hawk-specific column casting.
`run_hawk_dmd`	`run_dmd` with hawk defaults.
`run_sequence_dmd`	Load one hawk sequence and run DMD.
`batch_rmse_analysis`	DMD + RMSE on many sequences.
`plot_hawk_markers`	4x3 scatter grid of marker positions.
`plot_hawk_markers_shaded`	4x3 binned mean +/- 1 SD bands.
`plot_hawk_2d`	2x4 y-z projections.
`plot_single_sequence`	Single-sequence scatter.

load_bird_data

load_bird_data(bird_name: str, behaviour: str, perch_distance: str | None = None, bilateral: str = 'Bilateral', turn: str = 'Straight', verbose: bool = False) -> tuple[pd.DataFrame, np.ndarray]

Load hawk marker and info data from NPZ files.

Constructs a file path from the identifiers, loads the marker and info arrays from the NPZ archive, and merges them into a single DataFrame. Column names are read from ColumnNames.npz in the same directory.

Parameters:

Name	Type	Description	Default
`bird_name`	`str`	Individual hawk identifier, e.g. `'Toothless'`.	required
`behaviour`	`str`	Flight phase label used in the filename, e.g. `'Initial'`, `'Flapping'`.	required
`perch_distance`	`str`	Distance to perch, e.g. `'9m'`. If `None` the distance token is omitted from the filename.	`None`
`bilateral`	`str`	`'Bilateral'` (8 markers, left + right) or `'Unilateral'` (4 markers, right side only).	`'Bilateral'`
`turn`	`str`	Turn direction: `'Straight'`, `'Left'`, or `'Right'`. Only used when perch_distance is `'9m'`.	`'Straight'`
`verbose`	`bool`	If `True`, print the DataFrame shape and sequence count.	`False`

Returns:

Name	Type	Description
`wingbeat_df`	`DataFrame`	Combined info + marker DataFrame with one row per frame. Info columns include `seqID`, `frameID`, `time`, etc.; marker columns follow the `{side}_{position}_{axis}` naming convention.
`marker_column_names`	`ndarray`	1-D string array of marker column names, length `n_markers * 3`.
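
The filename rule described above can be sketched in isolation. The helper `build_npz_path` is hypothetical (it is not part of `birddmd`) and the directory name `samples` is a placeholder for `SAMPLES_DIR`; the sketch mirrors only the string formatting in the source listing and never touches the disk.

```python
from __future__ import annotations


def build_npz_path(
    samples_dir: str,
    bird_name: str,
    behaviour: str,
    perch_distance: str | None = None,
    bilateral: str = "Bilateral",
    turn: str = "Straight",
) -> str:
    """Hypothetical helper mirroring the filename rule in load_bird_data."""
    distance = "" if perch_distance is None else perch_distance
    if perch_distance == "9m":
        # The turn token is only inserted for 9 m flights.
        stem = f"{behaviour}_{distance}{turn}Turn{bird_name}_{bilateral}"
    else:
        stem = f"{behaviour}_{distance}{bird_name}_{bilateral}"
    return f"{samples_dir}/{stem}.npz"


print(build_npz_path("samples", "Toothless", "Flapping", "9m", turn="Left"))
# samples/Flapping_9mLeftTurnToothless_Bilateral.npz
print(build_npz_path("samples", "Toothless", "Initial"))
# samples/Initial_Toothless_Bilateral.npz
```

Note that `turn` is ignored for any distance other than `'9m'`, so passing `turn='Left'` with `perch_distance=None` resolves to the same file as a straight flight.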

Source code in src/birddmd/hawk.py

def load_bird_data(
    bird_name: str,
    behaviour: str,
    perch_distance: str | None = None,
    bilateral: str = "Bilateral",
    turn: str = "Straight",
    verbose: bool = False,
) -> tuple[pd.DataFrame, np.ndarray]:
    """Load hawk marker and info data from NPZ files.

    Constructs a file path from the identifiers, loads the marker and
    info arrays from the NPZ archive, and merges them into a single
    DataFrame.  Column names are read from ``ColumnNames.npz`` in the
    same directory.

    Parameters
    ----------
    bird_name : str
        Individual hawk identifier, e.g. ``'Toothless'``.
    behaviour : str
        Flight phase label used in the filename, e.g. ``'Initial'``,
        ``'Flapping'``.
    perch_distance : str, optional
        Distance to perch, e.g. ``'9m'``.  If ``None`` the distance
        token is omitted from the filename.
    bilateral : str
        ``'Bilateral'`` (8 markers, left + right) or ``'Unilateral'``
        (4 markers, right side only).
    turn : str
        Turn direction: ``'Straight'``, ``'Left'``, or ``'Right'``.
        Only used when *perch_distance* is ``'9m'``.
    verbose : bool
        If ``True``, print the DataFrame shape and sequence count.

    Returns
    -------
    wingbeat_df : pd.DataFrame
        Combined info + marker DataFrame with one row per frame.
        Info columns include ``seqID``, ``frameID``, ``time``, etc.;
        marker columns follow the ``{side}_{position}_{axis}`` naming
        convention.
    marker_column_names : np.ndarray
        1-D string array of marker column names, length
        ``n_markers * 3``.
    """
    pd_str = "" if perch_distance is None else perch_distance

    if perch_distance == "9m":
        path = (
            f"{SAMPLES_DIR}/{behaviour}_{pd_str}{turn}Turn{bird_name}_{bilateral}.npz"
        )
    else:
        path = f"{SAMPLES_DIR}/{behaviour}_{pd_str}{bird_name}_{bilateral}.npz"

    file = np.load(path, allow_pickle=True)
    col_names = np.load(f"{SAMPLES_DIR}/ColumnNames.npz")
    marker_cols = col_names["marker_column_names"]
    info_cols = col_names["info_column_names"]

    marker_df = pd.DataFrame(file["marker_data"], columns=marker_cols)
    info_df = pd.DataFrame(file["info_data"], columns=info_cols)
    wingbeat_df = pd.concat([info_df, marker_df], axis=1)

    if verbose:
        n_seq = wingbeat_df["seqID"].nunique()
        print(
            f"Loaded {bird_name} {bilateral.lower()}: "
            f"{wingbeat_df.shape}, {n_seq} sequences"
        )

    return wingbeat_df, marker_cols

get_average_shape

get_average_shape(n_markers: int, mean_shape_path: str | None = None) -> np.ndarray

Return the mean hawk shape via Hawk3D.

Instantiates a morphing_birds.Hawk3D object from the mean-shape CSV and returns either the full bilateral marker set or the right-side-only subset.

Parameters:

Name	Type	Description	Default
`n_markers`	`int`	Number of anatomical markers: 8 for bilateral (both wings) or 4 for unilateral (right side only).	required
`mean_shape_path`	`str`	Path to the mean-shape CSV file. Defaults to `data/mean_hawk_shape.csv` shipped with the package.	`None`

Returns:

Type	Description
`ndarray`	Shape `(1, n_markers, 3)` — the time-averaged marker positions in 3-D space.

Raises:

Type	Description
`ValueError`	If n_markers is not 4 or 8.

Source code in src/birddmd/hawk.py

def get_average_shape(
    n_markers: int,
    mean_shape_path: str | None = None,
) -> np.ndarray:
    """Return the mean hawk shape via ``Hawk3D``.

    Instantiates a ``morphing_birds.Hawk3D`` object from the mean-shape
    CSV and returns either the full bilateral marker set or the
    right-side-only subset.

    Parameters
    ----------
    n_markers : int
        Number of anatomical markers: 8 for bilateral (both wings) or
        4 for unilateral (right side only).
    mean_shape_path : str, optional
        Path to the mean-shape CSV file.  Defaults to
        ``data/mean_hawk_shape.csv`` shipped with the package.

    Returns
    -------
    np.ndarray
        Shape ``(1, n_markers, 3)`` — the time-averaged marker
        positions in 3-D space.

    Raises
    ------
    ValueError
        If *n_markers* is not 4 or 8.
    """
    if mean_shape_path is None:
        mean_shape_path = MEAN_SHAPE_PATH

    hawk3d = Hawk3D(mean_shape_path)
    if n_markers == N_BILATERAL_MARKERS:
        return hawk3d.markers
    if n_markers == N_UNILATERAL_MARKERS:
        return hawk3d.right_markers
    msg = (
        f"Expected {N_UNILATERAL_MARKERS} or {N_BILATERAL_MARKERS} markers,"
        f" got {n_markers}"
    )
    raise ValueError(msg)

normalise_hawk_data

normalise_hawk_data(markers: ndarray) -> tuple[np.ndarray, np.ndarray]

Centre hawk markers by subtracting the mean hawk shape.

Infers the marker count from the array shape, loads the corresponding mean shape via get_average_shape, and subtracts it from every frame.

Parameters:

Name	Type	Description	Default
`markers`	`ndarray`	Raw marker positions, either shape `(n_frames, n_markers, 3)` or flattened as `(n_frames, n_markers * 3)`.	required

Returns:

Name	Type	Description
`normalised`	`ndarray`	Mean-subtracted markers, same shape as markers.
`average_shape`	`ndarray`	The mean shape that was subtracted, shape `(1, n_markers, 3)`.

Raises:

Type	Description
`ValueError`	If markers is not 2-D or 3-D.
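
The shape inference and centring can be illustrated with plain NumPy. `normalise_data` is not shown on this page, so `subtract_mean_shape` below is only an assumed equivalent (a broadcast subtraction), and the random `mean_shape` stands in for the array that `Hawk3D` would supply.

```python
import numpy as np


def infer_n_markers(markers: np.ndarray) -> int:
    """Infer the marker count from a 3-D or flattened 2-D array."""
    if markers.ndim == 3:
        return markers.shape[1]
    if markers.ndim == 2:
        return markers.shape[1] // 3  # three coordinates per marker
    raise ValueError(f"Expected 2-D or 3-D data, got {markers.ndim}-D")


def subtract_mean_shape(markers: np.ndarray, mean_shape: np.ndarray) -> np.ndarray:
    """Assumed centring step: broadcast-subtract a (1, n_markers, 3) shape."""
    n_markers = infer_n_markers(markers)
    frames = markers.reshape(-1, n_markers, 3)
    centred = frames - mean_shape  # broadcasts over the frame axis
    return centred.reshape(markers.shape)  # hand back the caller's layout


rng = np.random.default_rng(0)
mean_shape = rng.normal(size=(1, 4, 3))  # stand-in for the Hawk3D mean shape
frames = mean_shape + rng.normal(scale=0.01, size=(5, 4, 3))
flat = frames.reshape(5, 12)

centred_3d = subtract_mean_shape(frames, mean_shape)
centred_2d = subtract_mean_shape(flat, mean_shape)
print(centred_3d.shape, centred_2d.shape)  # (5, 4, 3) (5, 12)
```

Both layouts give the same numbers; only the output shape follows the input.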

Source code in src/birddmd/hawk.py

def normalise_hawk_data(
    markers: np.ndarray,
) -> tuple[np.ndarray, np.ndarray]:
    """Centre hawk markers by subtracting the mean hawk shape.

    Infers the marker count from the array shape, loads the
    corresponding mean shape via `get_average_shape`, and
    subtracts it from every frame.

    Parameters
    ----------
    markers : np.ndarray
        Raw marker positions, either shape ``(n_frames, n_markers, 3)``
        or flattened as ``(n_frames, n_markers * 3)``.

    Returns
    -------
    normalised : np.ndarray
        Mean-subtracted markers, same shape as *markers*.
    average_shape : np.ndarray
        The mean shape that was subtracted, shape
        ``(1, n_markers, 3)``.

    Raises
    ------
    ValueError
        If *markers* is not 2-D or 3-D.
    """
    DIMS_2 = 2
    DIMS_3 = 3
    if markers.ndim == DIMS_3:
        n_markers = markers.shape[1]
    elif markers.ndim == DIMS_2:
        n_markers = markers.shape[1] // DIMS_3  # 3 coordinates (x, y, z) per marker
    else:
        msg = f"Expected 2-D or 3-D data, got {markers.ndim}-D"
        raise ValueError(msg)

    average_shape = get_average_shape(n_markers)
    return normalise_data(markers, average_shape), average_shape

bin_hawk_dataframe

bin_hawk_dataframe(dataframe: DataFrame, x_axis: str = 'HorzDistance', bin_size: float = 0.01) -> pd.DataFrame

Bin a hawk DataFrame, casting hawk-specific columns to float.

Wraps bin_dataframe_means, automatically casting HorzDistance and VertDistance to float before binning.

Parameters:

Name	Type	Description	Default
`dataframe`	`DataFrame`	Hawk flight DataFrame (e.g. from `load_bird_data`).	required
`x_axis`	`str`	Column to bin along, typically `'HorzDistance'` (spatial) or `'time'` (temporal).	`'HorzDistance'`
`bin_size`	`float`	Width of each bin in the units of x_axis (metres for distance, seconds for time).	`0.01`

Returns:

Type	Description
`DataFrame`	Binned DataFrame with one row per bin, containing group means of all numeric columns.
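
Since `bin_dataframe_means` is documented elsewhere, the following is only a sketch of the cast-then-bin pattern, assuming fixed-width bins measured from the smallest value and a plain mean per bin. The function name `bin_means` and the toy columns are illustrative, not part of the package.

```python
from __future__ import annotations

import numpy as np
import pandas as pd


def bin_means(
    df: pd.DataFrame, x_axis: str, bin_size: float, cast: list[str]
) -> pd.DataFrame:
    """Cast selected columns to float, then average rows within each bin."""
    out = df.copy()
    for col in cast:
        out[col] = out[col].astype(float)
    # Integer bin index for each row, measured from the smallest x value.
    offset = out[x_axis] - out[x_axis].min()
    bin_index = np.floor(offset / bin_size).astype(int).rename("bin")
    return out.groupby(bin_index).mean(numeric_only=True).reset_index(drop=True)


raw = pd.DataFrame(
    {
        # Distances stored as strings, as can happen after loading object arrays.
        "HorzDistance": ["0.000", "0.004", "0.012", "0.018"],
        "z": [1, 3, 5, 7],
    }
)
binned = bin_means(raw, "HorzDistance", bin_size=0.01, cast=["HorzDistance"])
print(binned)
```

Without the cast, a string-typed `HorzDistance` column would be dropped from a numeric mean, which is presumably why the wrapper casts it first.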

Source code in src/birddmd/hawk.py

def bin_hawk_dataframe(
    dataframe: pd.DataFrame,
    x_axis: str = "HorzDistance",
    bin_size: float = 0.01,
) -> pd.DataFrame:
    """Bin a hawk DataFrame, casting hawk-specific columns to float.

    Wraps `bin_dataframe_means`, automatically
    casting ``HorzDistance`` and ``VertDistance`` to float before
    binning.

    Parameters
    ----------
    dataframe : pd.DataFrame
        Hawk flight DataFrame (e.g. from `load_bird_data`).
    x_axis : str
        Column to bin along, typically ``'HorzDistance'`` (spatial) or
        ``'time'`` (temporal).
    bin_size : float
        Width of each bin in the units of *x_axis* (metres for
        distance, seconds for time).

    Returns
    -------
    pd.DataFrame
        Binned DataFrame with one row per bin, containing group means
        of all numeric columns.
    """
    return bin_dataframe_means(
        dataframe,
        x_axis,
        bin_size,
        numeric_cast_columns=["HorzDistance", "VertDistance"],
    )

run_hawk_dmd

run_hawk_dmd(data: ndarray, times: ndarray | None = None, n_modes: int = 6, d: int = 2, eig_constraints: set[str] | None = None, n_markers: int = 8, mean_shape_path: str | None = None, verbose: bool = True, **kwargs) -> DMDResult

Run DMD with hawk defaults.

Loads the mean hawk shape automatically and delegates to run_dmd. This is the recommended entry point for single-array hawk analyses where you already have the marker data in memory.

Parameters:

Name	Type	Description	Default
`data`	`ndarray`	Marker positions, shape `(n_frames, n_markers, 3)` or `(n_frames, n_markers * 3)`.	required
`times`	`ndarray`	Time vector, shape `(n_frames,)`. If `None`, a uniform time step is assumed.	`None`
`n_modes`	`int`	Number of DMD modes (SVD rank).	`6`
`d`	`int`	Hankel delay-embedding depth.	`2`
`eig_constraints`	`set of str`	Constraints passed to BOPDMD, e.g. `{"conjugate_pairs"}`. Defaults to `{"conjugate_pairs"}`.	`None`
`n_markers`	`int`	Number of anatomical markers (8 bilateral, 4 unilateral).	`8`
`mean_shape_path`	`str`	Path to the mean-shape CSV. Defaults to the bundled file.	`None`
`verbose`	`bool`	If `True`, print fitting diagnostics.	`True`
`**kwargs`		Additional keyword arguments forwarded to `run_dmd` (e.g. init_alpha).	`{}`

Returns:

Type	Description
`DMDResult`	Complete DMD results including eigenvalues, modes, reconstruction, etc.
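
The `d` parameter is described as a Hankel delay-embedding depth. How `run_dmd` applies it is not shown on this page, so the sketch below is the generic construction only: each embedded row stacks `d` consecutive frames. The function name `hankel_embed` and the row-wise orientation are assumptions; libraries differ on both.

```python
import numpy as np


def hankel_embed(snapshots: np.ndarray, d: int) -> np.ndarray:
    """Stack d consecutive frames into each row (time-delay embedding).

    snapshots has shape (n_frames, n_features); the result has shape
    (n_frames - d + 1, d * n_features).
    """
    n_frames = snapshots.shape[0]
    if not 1 <= d <= n_frames:
        raise ValueError("d must be between 1 and the number of frames")
    n_rows = n_frames - d + 1
    return np.hstack([snapshots[i : i + n_rows] for i in range(d)])


frames = np.arange(5 * 2).reshape(5, 2)  # 5 frames, 2 features
embedded = hankel_embed(frames, d=2)
print(embedded.shape)  # (4, 4)
print(embedded[0])     # [0 1 2 3]
```

Under this construction a depth of `d` leaves `n_frames - d + 1` usable rows, so the default `d=2` costs one frame.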

Source code in src/birddmd/hawk.py

def run_hawk_dmd(
    data: np.ndarray,
    times: np.ndarray | None = None,
    n_modes: int = 6,
    d: int = 2,
    eig_constraints: set[str] | None = None,
    n_markers: int = 8,
    mean_shape_path: str | None = None,
    verbose: bool = True,
    **kwargs,
) -> DMDResult:
    """Run DMD with hawk defaults.

    Loads the mean hawk shape automatically and delegates to
    `run_dmd`.  This is the recommended entry
    point for single-array hawk analyses where you already have the
    marker data in memory.

    Parameters
    ----------
    data : np.ndarray
        Marker positions, shape ``(n_frames, n_markers, 3)`` or
        ``(n_frames, n_markers * 3)``.
    times : np.ndarray, optional
        Time vector, shape ``(n_frames,)``.  If ``None``, a uniform
        time step is assumed.
    n_modes : int
        Number of DMD modes (SVD rank).
    d : int
        Hankel delay-embedding depth.
    eig_constraints : set of str, optional
        Constraints passed to BOPDMD, e.g. ``{"conjugate_pairs"}``.
        Defaults to ``{"conjugate_pairs"}``.
    n_markers : int
        Number of anatomical markers (8 bilateral, 4 unilateral).
    mean_shape_path : str, optional
        Path to the mean-shape CSV.  Defaults to the bundled file.
    verbose : bool
        If ``True``, print fitting diagnostics.
    **kwargs
        Additional keyword arguments forwarded to `run_dmd`
        (e.g. *init_alpha*).

    Returns
    -------
    DMDResult
        Complete DMD results including eigenvalues, modes,
        reconstruction, etc.
    """
    if eig_constraints is None:
        eig_constraints = {"conjugate_pairs"}
    avg = get_average_shape(n_markers, mean_shape_path)
    return run_dmd(
        data=data,
        times=times,
        n_modes=n_modes,
        d=d,
        eig_constraints=eig_constraints,
        average_shape=avg,
        n_markers=n_markers,
        verbose=verbose,
        **kwargs,
    )

run_sequence_dmd

run_sequence_dmd(bird_name: str, perch_dist: str, turn: str, behaviour: str, seqID: str, n_modes: int = 10, d: int = 2, eig_constraints: set[str] | None = None, min_seq_length: int | None = None, interpolate: bool = False, verbose: bool = True) -> DMDResult | None

Load one hawk sequence from disk and run DMD on it.

Convenience wrapper that chains load_bird_data, load_sequence_data, optional spline interpolation, and run_dmd into a single call.

Parameters:

Name	Type	Description	Default
`bird_name`	`str`	Individual hawk identifier, e.g. `'Toothless'`.	required
`perch_dist`	`str`	Distance to perch, e.g. `'9m'`.	required
`turn`	`str`	Turn direction: `'Straight'`, `'Left'`, or `'Right'`.	required
`behaviour`	`str`	Flight phase label, e.g. `'Initial'`, `'Flapping'`.	required
`seqID`	`str`	Unique sequence identifier within the loaded dataset.	required
`n_modes`	`int`	Number of DMD modes (SVD rank).	`10`
`d`	`int`	Hankel delay-embedding depth.	`2`
`eig_constraints`	`set of str`	Constraints passed to BOPDMD, e.g. `{"conjugate_pairs"}`. Defaults to `{"conjugate_pairs"}`.	`None`
`min_seq_length`	`int`	Minimum number of frames required. Sequences shorter than this are skipped. Defaults to `n_modes + 1`.	`None`
`interpolate`	`bool`	If `True`, apply cubic-spline interpolation to produce uniformly spaced time steps before fitting.	`False`
`verbose`	`bool`	If `True`, print loading and fitting diagnostics.	`True`

Returns:

Type	Description
`DMDResult or None`	Complete DMD results, or `None` if the sequence has fewer frames than min_seq_length.
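
The `interpolate` option resamples a sequence onto evenly spaced time steps. The package's `spline_interpolation` is not reproduced here; the sketch below swaps in linear interpolation with `np.interp` as a dependency-free stand-in for the cubic spline, purely to show the shape of the operation. The function name `resample_uniform` is hypothetical.

```python
from __future__ import annotations

import numpy as np


def resample_uniform(
    times: np.ndarray, markers: np.ndarray
) -> tuple[np.ndarray, np.ndarray]:
    """Resample (n_frames, n_features) data onto evenly spaced times.

    Linear interpolation is used here as a stand-in for the cubic
    spline named in the documentation.
    """
    uniform = np.linspace(times[0], times[-1], len(times))
    columns = [
        np.interp(uniform, times, markers[:, j]) for j in range(markers.shape[1])
    ]
    return uniform, np.column_stack(columns)


times = np.array([0.0, 0.01, 0.025, 0.03])        # uneven sampling
markers = np.column_stack([2.0 * times, -times])  # two linear features
new_times, new_markers = resample_uniform(times, markers)
print(np.diff(new_times))  # equal steps of 0.01
```

The frame count is unchanged; only the spacing between frames becomes uniform, which matters for methods that assume a constant time step.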

Source code in src/birddmd/hawk.py

def run_sequence_dmd(
    bird_name: str,
    perch_dist: str,
    turn: str,
    behaviour: str,
    seqID: str,
    n_modes: int = 10,
    d: int = 2,
    eig_constraints: set[str] | None = None,
    min_seq_length: int | None = None,
    interpolate: bool = False,
    verbose: bool = True,
) -> DMDResult | None:
    """Load one hawk sequence from disk and run DMD on it.

    Convenience wrapper that chains `load_bird_data`,
    `load_sequence_data`, optional spline
    interpolation, and `run_dmd` into a single
    call.

    Parameters
    ----------
    bird_name : str
        Individual hawk identifier, e.g. ``'Toothless'``.
    perch_dist : str
        Distance to perch, e.g. ``'9m'``.
    turn : str
        Turn direction: ``'Straight'``, ``'Left'``, or ``'Right'``.
    behaviour : str
        Flight phase label, e.g. ``'Initial'``, ``'Flapping'``.
    seqID : str
        Unique sequence identifier within the loaded dataset.
    n_modes : int
        Number of DMD modes (SVD rank).
    d : int
        Hankel delay-embedding depth.
    eig_constraints : set of str, optional
        Constraints passed to BOPDMD, e.g. ``{"conjugate_pairs"}``.
        Defaults to ``{"conjugate_pairs"}``.
    min_seq_length : int, optional
        Minimum number of frames required.  Sequences shorter than
        this are skipped.  Defaults to ``n_modes + 1``.
    interpolate : bool
        If ``True``, apply cubic-spline interpolation to produce
        uniformly spaced time steps before fitting.
    verbose : bool
        If ``True``, print loading and fitting diagnostics.

    Returns
    -------
    DMDResult or None
        Complete DMD results, or ``None`` if the sequence has fewer
        frames than *min_seq_length*.
    """
    if eig_constraints is None:
        eig_constraints = {"conjugate_pairs"}
    df, marker_cols = load_bird_data(
        bird_name=bird_name,
        behaviour=behaviour,
        perch_distance=perch_dist,
        turn=turn,
        verbose=verbose,
    )
    df = remove_time_duplicates(df)

    n_markers = len(marker_cols) // 3
    average_shape = get_average_shape(n_markers)
    markers, times = load_sequence_data(df, seqID, marker_cols)

    if min_seq_length is None:
        min_seq_length = n_modes + 1
    if markers.shape[0] < min_seq_length:
        if verbose:
            print(f"Sequence {seqID} too short ({markers.shape[0]} frames)")
        return None

    if interpolate:
        times, markers = spline_interpolation(times, markers)

    markers = markers.reshape(-1, n_markers, 3)
    return run_dmd(
        data=markers,
        times=times,
        n_modes=n_modes,
        d=d,
        eig_constraints=eig_constraints,
        average_shape=average_shape,
        n_markers=n_markers,
        verbose=verbose,
    )

batch_rmse_analysis

batch_rmse_analysis(df: DataFrame, seqIDs: list[str], marker_column_names: ndarray, average_shape: ndarray, n_modes: int = 6, d: int = 2, marker_names: list[str] | None = None, verbose: bool = False, per_frame: bool = False) -> pd.DataFrame

Run DMD on many hawk sequences and collect RMSE statistics.

Iterates over seqIDs, fits a DMD model to each sequence, and computes the root-mean-square reconstruction error. Sequences that fail (e.g. due to convergence issues) are silently skipped unless verbose is True.

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	Full hawk dataset containing all sequences (e.g. from `load_bird_data`).	required
`seqIDs`	`list of str`	Sequence identifiers to analyse.	required
`marker_column_names`	`ndarray`	1-D string array of marker column names, length `n_markers * 3`.	required
`average_shape`	`ndarray`	Mean hawk shape, shape `(1, n_markers, 3)`, subtracted before fitting.	required
`n_modes`	`int`	Number of DMD modes (SVD rank).	`6`
`d`	`int`	Hankel delay-embedding depth.	`2`
`marker_names`	`list of str`	Human-readable marker names used as suffixes for per-marker RMSE columns (e.g. `rmse_wingtip`). Defaults to `marker_0`, `marker_1`, etc.	`None`
`verbose`	`bool`	If `True`, print a message for each skipped sequence.	`False`
`per_frame`	`bool`	If `True`, each result row also contains: `rmse_per_frame` (1-D array of frame-level RMSE values); `rmse_per_frame_per_marker` (2-D array, shape `(n_frames, n_markers)`); and `horz_distance` (1-D array of horizontal distance per frame, only if the `HorzDistance` column exists in df).	`False`

Returns:

Type	Description
`DataFrame`	One row per successfully analysed sequence, with columns `seqID`, `total_rmse`, per-marker RMSE columns, and any available metadata from `HAWK_META_COLUMNS`.
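
The RMSE columns differ only in which axes the squared error is averaged over. The toy arrays below are made up for illustration; the reductions follow the ones in the source listing.

```python
import numpy as np

# Toy ground truth and reconstruction, shape (n_frames, n_markers, 3).
truth = np.zeros((2, 2, 3))
recon = np.zeros((2, 2, 3))
recon[:, 0, :] = 1.0  # marker 0 is off by 1 on every axis; marker 1 is exact

sq_err = (truth - recon) ** 2

total_rmse = float(np.sqrt(sq_err.mean()))                # scalar
rmse_per_marker = np.sqrt(sq_err.mean(axis=(0, 2)))       # shape (n_markers,)
rmse_per_frame = np.sqrt(sq_err.mean(axis=(1, 2)))        # shape (n_frames,)
rmse_per_frame_per_marker = np.sqrt(sq_err.mean(axis=2))  # shape (n_frames, n_markers)

print(rmse_per_marker)  # [1. 0.]
print(round(total_rmse, 4))  # 0.7071
```

Because the mean is taken before the square root, `total_rmse` is not the average of the per-marker values: here it is about 0.71 rather than 0.5.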

Source code in src/birddmd/hawk.py

def batch_rmse_analysis(
    df: pd.DataFrame,
    seqIDs: list[str],
    marker_column_names: np.ndarray,
    average_shape: np.ndarray,
    n_modes: int = 6,
    d: int = 2,
    marker_names: list[str] | None = None,
    verbose: bool = False,
    per_frame: bool = False,
) -> pd.DataFrame:
    """Run DMD on many hawk sequences and collect RMSE statistics.

    Iterates over *seqIDs*, fits a DMD model to each sequence, and
    computes the root-mean-square reconstruction error.  Sequences
    that fail (e.g. due to convergence issues) are silently skipped
    unless *verbose* is ``True``.

    Parameters
    ----------
    df : pd.DataFrame
        Full hawk dataset containing all sequences (e.g. from
        `load_bird_data`).
    seqIDs : list of str
        Sequence identifiers to analyse.
    marker_column_names : np.ndarray
        1-D string array of marker column names, length
        ``n_markers * 3``.
    average_shape : np.ndarray
        Mean hawk shape, shape ``(1, n_markers, 3)``, subtracted
        before fitting.
    n_modes : int
        Number of DMD modes (SVD rank).
    d : int
        Hankel delay-embedding depth.
    marker_names : list of str, optional
        Human-readable marker names used as suffixes for per-marker
        RMSE columns (e.g. ``rmse_wingtip``).  Defaults to
        ``marker_0``, ``marker_1``, etc.
    verbose : bool
        If ``True``, print a message for each skipped sequence.
    per_frame : bool
        If ``True``, each result row also contains:

        - ``rmse_per_frame`` — 1-D array of frame-level RMSE values.
        - ``rmse_per_frame_per_marker`` — 2-D array, shape
          ``(n_frames, n_markers)``.
        - ``horz_distance`` — 1-D array of horizontal distance per
          frame (only if the column exists in *df*).

    Returns
    -------
    pd.DataFrame
        One row per successfully analysed sequence, with columns
        ``seqID``, ``total_rmse``, per-marker RMSE columns, and any
        available metadata from ``HAWK_META_COLUMNS``.
    """
    n_markers = len(marker_column_names) // 3
    meta_cols = [c for c in HAWK_META_COLUMNS if c in df.columns]
    has_horz = per_frame and "HorzDistance" in df.columns
    rows = []

    for sid in seqIDs:
        try:
            seq_df = df[df["seqID"] == sid]
            metadata = seq_df.iloc[0][meta_cols].to_dict() if meta_cols else {}

            markers, times = load_sequence_data(df, sid, marker_column_names)
            markers = markers.reshape(-1, n_markers, 3)
            times_unique, unique_idx = np.unique(times, return_index=True)
            markers = markers[unique_idx]
            times = times_unique

            result = run_dmd(
                data=markers,
                times=times,
                n_modes=n_modes,
                d=d,
                eig_constraints={"conjugate_pairs"},
                n_markers=n_markers,
                average_shape=average_shape,
                verbose=False,
            )

            gt = markers[:-1]
            sq_err = (gt - result.reconstruction) ** 2
            total_rmse = float(np.sqrt(np.mean(sq_err)))
            rmse_per_marker = np.sqrt(np.mean(sq_err, axis=(0, 2)))

            row = {"seqID": sid, "total_rmse": total_rmse, **metadata}
            names = marker_names or [f"marker_{i}" for i in range(n_markers)]
            for i, name in enumerate(names):
                row[f"rmse_{name}"] = rmse_per_marker[i]

            if per_frame:
                row["rmse_per_frame"] = np.sqrt(np.mean(sq_err, axis=(1, 2)))
                row["rmse_per_frame_per_marker"] = np.sqrt(
                    np.mean(sq_err, axis=2)
                )  # shape (n_frames, n_markers)
            if has_horz:
                hd = seq_df["HorzDistance"].to_numpy(dtype=np.float64)
                row["horz_distance"] = hd[unique_idx][:-1]

            rows.append(row)
        except Exception as exc:
            if verbose:
                print(f"Skipping {sid}: {exc}")

    return pd.DataFrame(rows)

plot_hawk_markers

plot_hawk_markers(dataframe: DataFrame, _marker_column_names: ndarray, x_axis: str = 'HorzDistance') -> Figure

Plot a 4x3 scatter grid of marker positions.

Each row corresponds to one of the four marker positions (see MARKER_POSITIONS); each column shows the x, y, or z coordinate. Left and right markers are overlaid in the same colour.

Parameters:

Name	Type	Description	Default
`dataframe`	`DataFrame`	Hawk flight DataFrame with marker columns following the `{side}_{position}_{axis}` naming convention.	required
`_marker_column_names`	`ndarray`	1-D string array of marker column names. Not used by the function body; the plotted columns are constructed from `MARKER_POSITIONS`.	required
`x_axis`	`str`	Column to plot on the horizontal axis. `'HorzDistance'` (negated automatically) or `'time'`.	`'HorzDistance'`

Returns:

Type	Description
`Figure`	Matplotlib figure with 12 scatter subplots.
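
The grid layout reduces to a row-major index over positions and coordinates. The sketch below reproduces that mapping without any plotting; apart from `wingtip`, the position names are hypothetical placeholders for the real `MARKER_POSITIONS` values.

```python
# Hypothetical position names; the real values live in MARKER_POSITIONS.
positions = ["wingtip", "primary", "secondary", "tailtip"]
coordinates = ["x", "y", "z"]

layout = {}
for pi, pos in enumerate(positions):
    for ci, coord in enumerate(coordinates):
        idx = pi * 3 + ci  # row-major index into the flattened 4x3 axes array
        layout[idx] = (f"left_{pos}_{coord}", f"right_{pos}_{coord}")

print(layout[0])   # ('left_wingtip_x', 'right_wingtip_x')
print(layout[11])  # ('left_tailtip_z', 'right_tailtip_z')
```

Each subplot therefore expects both a `left_` and a `right_` column, so a unilateral DataFrame lacking the left-side columns would raise a `KeyError`.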

Source code in src/birddmd/hawk.py

def plot_hawk_markers(
    dataframe: pd.DataFrame,
    _marker_column_names: np.ndarray,
    x_axis: str = "HorzDistance",
) -> Figure:
    """Plot a 4x3 scatter grid of marker positions.

    Each row corresponds to one of the four marker positions (see
    ``MARKER_POSITIONS``); each column shows the x, y, or z
    coordinate.  Left and right markers are overlaid in the same
    colour.

    Parameters
    ----------
    dataframe : pd.DataFrame
        Hawk flight DataFrame with marker columns following the
        ``{side}_{position}_{axis}`` naming convention.
    _marker_column_names : np.ndarray
        1-D string array of marker column names.  Not used by the
        function body; the plotted columns are constructed from
        ``MARKER_POSITIONS``.
    x_axis : str
        Column to plot on the horizontal axis.  ``'HorzDistance'``
        (negated automatically) or ``'time'``.

    Returns
    -------
    Figure
        Matplotlib figure with 12 scatter subplots.
    """
    fig, axes = plt.subplots(4, 3, figsize=(6, 8), sharex=True, sharey=False)
    ax = axes.flatten()
    coordinates = ["x", "y", "z"]

    for pi, pos in enumerate(MARKER_POSITIONS):
        for ci, coord in enumerate(coordinates):
            idx = pi * 3 + ci
            left = f"left_{pos}_{coord}"
            right = f"right_{pos}_{coord}"
            negate = x_axis.lower().startswith("horz")
            x = -dataframe[x_axis] if negate else dataframe[x_axis]

            ax[idx].scatter(x, dataframe[left], s=2, color=MARKER_COLOURS[pos])
            ax[idx].scatter(x, dataframe[right], s=2, color=MARKER_COLOURS[pos])

            ax[idx].grid(
                True,
                which="major",
                axis="x",
                linestyle="-",
                linewidth=0.5,
                color="k",
                alpha=0.5,
            )
            ax[idx].xaxis.set_major_locator(MaxNLocator(4))

            if ci == 0:
                ax[idx].set_ylabel(pos, fontsize=10)
            if pi == len(MARKER_POSITIONS) - 1:
                ax[idx].set_xlabel(x_axis, fontsize=10)
            if pi == 0:
                ax[idx].set_title(coord, fontsize=12)

            ax[idx].yaxis.set_ticklabels([])
            _remove_spines(ax[idx])

    plt.tight_layout()
    return fig

plot_hawk_markers_shaded

plot_hawk_markers_shaded(dataframe: DataFrame, _marker_column_names: ndarray, x_axis: str = 'HorzDistance', bin_size: float = 0.01) -> Figure

Plot a 4x3 grid of binned mean +/- 1 SD bands for hawk markers.

Same layout as plot_hawk_markers, but instead of raw scatter points the data are binned along x_axis and the mean is drawn as a solid (left) or dotted (right) line with a shaded +/- 1 standard deviation band. For the x coordinate, absolute values are taken before the mean and standard deviation are computed.

Parameters:

Name	Type	Description	Default
`dataframe`	`DataFrame`	Hawk flight DataFrame with marker columns following the `{side}_{position}_{axis}` naming convention.	required
`_marker_column_names`	`ndarray`	1-D string array of marker column names. Not used by the function body.	required
`x_axis`	`str`	Column to bin along. `'HorzDistance'` (negated automatically) or `'time'`.	`'HorzDistance'`
`bin_size`	`float`	Width of each bin in the units of x_axis (metres for distance, seconds for time).	`0.01`

Returns:

Type	Description
`Figure`	Matplotlib figure with 12 shaded-band subplots.
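
The effect of taking absolute values on the x coordinate can be seen in a single bin. The numbers below are invented, and the sign convention (left markers at negative x, right markers at positive x) is an assumption made for the illustration.

```python
import numpy as np

# Toy lateral (x) positions falling in one bin.
left_x = np.array([-0.30, -0.32, -0.28])  # assumed: left side is negative x
right_x = np.array([0.31, 0.29, 0.30])    # assumed: right side is positive x

left_mean = np.abs(left_x).mean()
right_mean = np.abs(right_x).mean()
left_sd = np.abs(left_x).std(ddof=1)  # pandas Series.std() also defaults to ddof=1

band = (left_mean - left_sd, left_mean + left_sd)
print(round(left_mean, 3), round(right_mean, 3))  # 0.3 0.3
print(round(left_sd, 3))                           # 0.02
```

With absolute values both sides land on the same positive scale, so the solid and dotted lines can be compared directly within one subplot.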

Source code in src/birddmd/hawk.py

def plot_hawk_markers_shaded(
    dataframe: pd.DataFrame,
    _marker_column_names: np.ndarray,
    x_axis: str = "HorzDistance",
    bin_size: float = 0.01,
) -> Figure:
    """Plot a 4x3 grid of binned mean +/- 1 SD bands for hawk markers.

    Same layout as `plot_hawk_markers`, but instead of raw scatter
    points the data are binned along *x_axis* and the mean is drawn
    as a solid (left) or dotted (right) line with a shaded
    +/- 1 standard deviation band.  For the x coordinate, absolute
    values are taken before the mean and standard deviation are
    computed.

    Parameters
    ----------
    dataframe : pd.DataFrame
        Hawk flight DataFrame with marker columns following the
        ``{side}_{position}_{axis}`` naming convention.
    _marker_column_names : np.ndarray
        1-D string array of marker column names.  Not used by the
        function body.
    x_axis : str
        Column to bin along.  ``'HorzDistance'`` (negated
        automatically) or ``'time'``.
    bin_size : float
        Width of each bin in the units of *x_axis* (metres for
        distance, seconds for time).

    Returns
    -------
    Figure
        Matplotlib figure with 12 shaded-band subplots.
    """
    fig, axes = plt.subplots(4, 3, figsize=(6, 8), sharex=True, sharey=False)
    ax = axes.flatten()
    coordinates = ["x", "y", "z"]

    x_min = dataframe[x_axis].min()
    x_max = dataframe[x_axis].max()
    bins = np.arange(x_min, x_max + bin_size, bin_size)
    bins = np.around(bins, 3)
    bin_centres = bins[:-1] + bin_size / 2

    df = dataframe.copy()
    df["bins"] = pd.cut(df[x_axis], bins.tolist(), right=False, include_lowest=True)

    for pi, pos in enumerate(MARKER_POSITIONS):
        for ci, coord in enumerate(coordinates):
            idx = pi * 3 + ci
            left = f"left_{pos}_{coord}"
            right = f"right_{pos}_{coord}"
            colour = MARKER_COLOURS[pos]
            negate = x_axis.lower().startswith("horz")

            if coord == "x":
                lm = df.groupby("bins", observed=True)[left].apply(
                    lambda s: np.abs(s).mean()
                )
                ls = df.groupby("bins", observed=True)[left].apply(
                    lambda s: np.abs(s).std()
                )
                rm = df.groupby("bins", observed=True)[right].apply(
                    lambda s: np.abs(s).mean()
                )
                rs = df.groupby("bins", observed=True)[right].apply(
                    lambda s: np.abs(s).std()
                )
            else:
                lm = df.groupby("bins", observed=True)[left].mean()
                ls = df.groupby("bins", observed=True)[left].std()
                rm = df.groupby("bins", observed=True)[right].mean()
                rs = df.groupby("bins", observed=True)[right].std()

            vl = ~np.isnan(lm) & ~np.isnan(ls)
            vr = ~np.isnan(rm) & ~np.isnan(rs)
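            # Take centres from the observed bins themselves so that empty
            # bins (dropped by observed=True) cannot shift the x positions.
            lbc = np.array([iv.mid for iv in lm.index])[vl]
            rbc = np.array([iv.mid for iv in rm.index])[vr]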

            if np.any(vl):
                xv = -lbc if negate else lbc
                ax[idx].fill_between(
                    xv,
                    lm[vl] - ls[vl],
                    lm[vl] + ls[vl],
                    color=colour,
                    alpha=0.3,
                    edgecolor="none",
                )
                ax[idx].plot(
                    xv,
                    lm[vl],
                    color=colour,
                    linewidth=2,
                    linestyle="-",
                    label="Left" if idx == 0 else None,
                )

            if np.any(vr):
                xv = -rbc if negate else rbc
                ax[idx].fill_between(
                    xv,
                    rm[vr] - rs[vr],
                    rm[vr] + rs[vr],
                    color=colour,
                    alpha=0.3,
                    edgecolor="none",
                )
                ax[idx].plot(
                    xv,
                    rm[vr],
                    color=colour,
                    linewidth=2,
                    linestyle=":",
                    label="Right" if idx == 0 else None,
                )

            if ci == 0:
                ax[idx].set_ylabel(pos, fontsize=10)

            titles = ["abs(x)", "y", "z"]
            if pi == 0:
                ax[idx].set_title(titles[ci], fontsize=12)

            ax[idx].yaxis.set_ticklabels([])
            _remove_spines(ax[idx])

            if pi == len(MARKER_POSITIONS) - 1 and ci == 0:
                _format_bottom_axis(ax[idx], dataframe, x_axis, negate)

    plt.tight_layout()
    return fig
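
The binning step used above can be sketched in isolation. The snippet below is a minimal illustration on made-up data (the column names and values are placeholders, not the hawk dataset). It assumes some bins may be empty and therefore reads each bin centre from the interval midpoint rather than by position, which is one way to keep centres aligned with the grouped statistics when `observed=True` drops empty bins.

```python
import numpy as np
import pandas as pd

# Toy data: names and values are illustrative only.
df = pd.DataFrame(
    {
        "time": [0.00, 0.01, 0.02, 0.03, 0.31, 0.32],
        "left_wingtip_z": [1.0, 3.0, 2.0, 4.0, 5.0, 7.0],
    }
)
bin_size = 0.1

# Same edge construction as in the function above.
edges = np.arange(df["time"].min(), df["time"].max() + bin_size, bin_size)
edges = np.around(edges, 3)

# Left-closed bins; the two middle bins receive no samples here.
df["bins"] = pd.cut(df["time"], edges.tolist(), right=False, include_lowest=True)

grouped = df.groupby("bins", observed=True)["left_wingtip_z"]
mean = grouped.mean()
sd = grouped.std()

# Midpoint of each bin that actually appears in the grouped result.
centres = np.array([interval.mid for interval in mean.index])

# Band edges that would be passed to fill_between.
lower = mean - sd
upper = mean + sd
```

Only two of the four bins contain samples, so `mean`, `sd`, and `centres` each have two entries, and the centres refer to the first and last bin rather than the first two.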

plot_hawk_2d ¶

plot_hawk_2d(dataframe: DataFrame, marker_column_names: ndarray) -> Figure

Plot 2x4 y-z projections of hawk marker positions.

Each subplot shows one marker's lateral (y) vs. vertical (z) position across all frames, coloured by time. Useful for visualising the spatial envelope of wing kinematics.

Parameters:

Name	Type	Description	Default
`dataframe`	`DataFrame`	Hawk flight DataFrame with marker columns and a `time` column.	required
`marker_column_names`	`ndarray`	1-D string array of marker column names, length `n_markers * 3`, ordered x, y, z for each marker. Every third name (stride 3, the `_x` columns) is taken and cut at `_x` to obtain the marker base name.	required

Returns:

Type	Description
`Figure`	Matplotlib figure with 8 equal-aspect subplots.
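
As a rough sketch of how the base names are derived from `marker_column_names`, the snippet below uses two illustrative markers (the names are placeholders following the `left_<position>_<coord>` pattern, not a listing of the real dataset columns). It assumes the columns are ordered x, y, z per marker.

```python
import numpy as np

# Hypothetical column names for two markers, ordered x, y, z per marker.
marker_column_names = np.array(
    [
        "left_wingtip_x", "left_wingtip_y", "left_wingtip_z",
        "right_wingtip_x", "right_wingtip_y", "right_wingtip_z",
    ]
)

# Stride 3 picks out the x column of each marker.
x_columns = marker_column_names[::3]

# Cutting at "_x" leaves the marker base name.
base_names = [str(name).split("_x")[0] for name in x_columns]

# The y and z columns plotted against each other for each marker.
yz_columns = [(base + "_y", base + "_z") for base in base_names]
```

If the array were ordered differently (for example all x columns first), the stride-3 slice would no longer select one column per marker, so the ordering assumption matters.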

Source code in src/birddmd/hawk.py

def plot_hawk_2d(
    dataframe: pd.DataFrame,
    marker_column_names: np.ndarray,
) -> Figure:
    """Plot 2x4 y-z projections of hawk marker positions.

    Each subplot shows one marker's lateral (y) vs. vertical (z)
    position across all frames, coloured by time.  Useful for
    visualising the spatial envelope of wing kinematics.

    Parameters
    ----------
    dataframe : pd.DataFrame
        Hawk flight DataFrame with marker columns and a ``time``
        column.
    marker_column_names : np.ndarray
        1-D string array of marker column names, length
        ``n_markers * 3``, ordered x, y, z for each marker.  Every
        third name (stride 3, the ``_x`` columns) is taken and cut at
        ``_x`` to obtain the marker base name.

    Returns
    -------
    Figure
        Matplotlib figure with 8 equal-aspect subplots.
    """
    names = marker_column_names[::3]
    names = [m.split("_x")[0] for m in names]

    fig, axes = plt.subplots(
        2,
        4,
        figsize=(5, 8),
        sharex=True,
        sharey=True,
        tight_layout=True,
    )
    ax = axes.flatten()

    for ii, marker in enumerate(names):
        ax[ii].scatter(
            dataframe[marker + "_y"],
            dataframe[marker + "_z"],
            s=0.1,
            c=dataframe["time"],
        )
        ax[ii].set_aspect("equal", "box")
        ax[ii].axis("off")
        ax[ii].set_title(marker, fontsize=8)

    return fig

plot_single_sequence ¶

plot_single_sequence(dataframe: DataFrame, seq_num: int, marker_name: str = 'right_wingtip_z', x_axis: str = 'HorzDistance') -> Figure

Plot a single hawk sequence for one marker coordinate.

Selects the sequence at position seq_num (zero-indexed) from the unique seqID values and scatter-plots the chosen marker coordinate against the horizontal axis, coloured by time.

Parameters:

Name	Type	Description	Default
`dataframe`	`DataFrame`	Hawk flight DataFrame with `seqID`, `time`, marker, and x_axis columns.	required
`seq_num`	`int`	Zero-based index into the unique sequence IDs, taken in order of first appearance in the DataFrame (they are not sorted).	required
`marker_name`	`str`	Column name of the marker coordinate to plot on the vertical axis, e.g. `'right_wingtip_z'`.	`'right_wingtip_z'`
`x_axis`	`str`	Column for the horizontal axis. `'HorzDistance'` labels the axis as distance to perch; `'time'` labels it as time from take-off.	`'HorzDistance'`

Returns:

Type	Description
`Figure`	Matplotlib figure with a single scatter subplot.
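
The sequence-selection step can be sketched on a toy DataFrame. The values below are invented for illustration. Note that pandas `Series.unique` returns values in order of first appearance, so `seq_num` indexes into that order.

```python
import pandas as pd

# Toy data: sequence IDs deliberately appear out of alphabetical order.
df = pd.DataFrame(
    {
        "seqID": ["s3", "s3", "s1", "s1", "s2"],
        "time": [0.0, 0.1, 0.0, 0.1, 0.0],
        "right_wingtip_z": [0.10, 0.12, 0.20, 0.18, 0.30],
    }
)

# Unique IDs in order of first appearance: s3, s1, s2.
seq_ids = df["seqID"].unique()

# Position 1 therefore selects "s1", not "s2".
seq_num = 1
selected = df[df["seqID"] == seq_ids[seq_num]]
```

To index sequences in sorted order instead, one option is to sort the unique IDs (for example with `sorted(seq_ids)`) before indexing.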

Source code in src/birddmd/hawk.py

def plot_single_sequence(
    dataframe: pd.DataFrame,
    seq_num: int,
    marker_name: str = "right_wingtip_z",
    x_axis: str = "HorzDistance",
) -> Figure:
    """Plot a single hawk sequence for one marker coordinate.

    Selects the sequence at position *seq_num* (zero-indexed) from the
    unique ``seqID`` values and scatter-plots the chosen marker
    coordinate against the horizontal axis, coloured by time.

    Parameters
    ----------
    dataframe : pd.DataFrame
        Hawk flight DataFrame with ``seqID``, ``time``, marker, and
        *x_axis* columns.
    seq_num : int
        Zero-based index into the unique sequence IDs, taken in order
        of first appearance in the DataFrame (they are not sorted).
    marker_name : str
        Column name of the marker coordinate to plot on the vertical
        axis, e.g. ``'right_wingtip_z'``.
    x_axis : str
        Column for the horizontal axis.  ``'HorzDistance'`` labels the
        axis as distance to perch; ``'time'`` labels it as time from
        take-off.

    Returns
    -------
    Figure
        Matplotlib figure with a single scatter subplot.
    """
    seqIDs = dataframe["seqID"].unique()
    seqID = seqIDs[seq_num]

    fig, ax = plt.subplots(1, 1, figsize=(5, 5), tight_layout=True)
    # Filter once and reuse the single-sequence rows for every plotted column.
    sequence = dataframe[dataframe["seqID"] == seqID]
    ax.scatter(
        sequence[x_axis],
        sequence[marker_name],
        s=10,
        c=sequence["time"],
    )
    if x_axis.startswith("Horz"):
        ax.set_xlabel("Horizontal Distance to Perch (m)")
    elif x_axis.startswith("time"):
        ax.set_xlabel("Time from take-off jump (s)")
    else:
        ax.set_xlabel(x_axis)
    ax.set_ylabel(marker_name)
    return fig