Stats
Statistical analysis utilities: RMSE, variance explained, and sequence filtering.
stats
Provides the canonical RMSE implementation, sequence quality filtering, and variance-explained computation.
Functions:
| Name | Description |
|---|---|
| `compute_rmse` | Per-frame root mean square error. |
| `filter_sequences` | Return sequence IDs that pass quality criteria. |
| `variance_explained` | Fraction of variance captured by a reconstruction (R²-like). |
compute_rmse
Per-frame RMSE between reconstruction and ground_truth.
Both arrays must share the same shape, typically (n_frames, n_markers, 3).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `reconstruction` | `ndarray` | Reconstructed data. | required |
| `ground_truth` | `ndarray` | Original data to compare against. | required |
Returns:
| Type | Description |
|---|---|
| `ndarray` | RMSE per frame (one value per time step). |
Notes
Computed as:
RMSE(t) = √( mean( (x̂(t) - x(t))² ) )
where the mean is taken over all markers and coordinates at each time step.
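The formula above can be sketched in NumPy as follows. This is an illustrative reading of the definition, not the code in src/birddmd/stats.py; the name `per_frame_rmse` is hypothetical, and the sketch assumes the first axis indexes frames.

```python
import numpy as np


def per_frame_rmse(reconstruction: np.ndarray, ground_truth: np.ndarray) -> np.ndarray:
    """Illustrative per-frame RMSE (hypothetical helper, not the library source)."""
    if reconstruction.shape != ground_truth.shape:
        raise ValueError("reconstruction and ground_truth must share the same shape")
    squared_error = (reconstruction - ground_truth) ** 2
    # Assumption: axis 0 is the frame axis; average over every remaining axis
    # (markers and coordinates) so that one value per frame is left.
    reduce_axes = tuple(range(1, squared_error.ndim))
    return np.sqrt(squared_error.mean(axis=reduce_axes))


# Two frames, four markers, three coordinates.
truth = np.zeros((2, 4, 3))
recon = truth.copy()
recon[1] += 2.0  # every coordinate in the second frame is off by 2
rmse = per_frame_rmse(recon, truth)
```

In this toy input the first frame is reconstructed exactly and the second is offset uniformly by 2, so `rmse` holds one entry per frame: 0 for the first and 2 for the second.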
Source code in src/birddmd/stats.py
filter_sequences
filter_sequences(df: DataFrame, gap_threshold: float = DEFAULT_GAP_THRESHOLD, time_start_max: float = DEFAULT_TIME_START_MAX, min_frames: int = DEFAULT_MIN_FRAMES) -> list[str]
Return sequence IDs that pass quality filters.
Sequences are rejected if they contain time gaps larger than gap_threshold, start later than time_start_max, or have fewer than min_frames frames.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `df` | `DataFrame` | Must contain | required |
| `gap_threshold` | `float` | Maximum allowed gap between consecutive frames (seconds). | `DEFAULT_GAP_THRESHOLD` |
| `time_start_max` | `float` | Maximum allowed start time (seconds). | `DEFAULT_TIME_START_MAX` |
| `min_frames` | `int` | Minimum required frame count. | `DEFAULT_MIN_FRAMES` |
Returns:
| Type | Description |
|---|---|
| list of `str` | Sequence IDs passing all filters. |
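The three rejection rules described above can be sketched with pandas. This is a rough illustration, not the library's implementation: the column names `sequence_id` and `time`, the function name `filter_sequences_sketch`, and the default values are all assumptions, since this page does not show the required columns or the values of the `DEFAULT_*` constants.

```python
import pandas as pd

# Hypothetical column names; the real ones are not shown on this page.
SEQ_COL = "sequence_id"
TIME_COL = "time"


def filter_sequences_sketch(
    df: pd.DataFrame,
    gap_threshold: float = 0.1,   # placeholder, not DEFAULT_GAP_THRESHOLD
    time_start_max: float = 1.0,  # placeholder, not DEFAULT_TIME_START_MAX
    min_frames: int = 3,          # placeholder, not DEFAULT_MIN_FRAMES
) -> list[str]:
    """Return IDs of sequences that pass all three quality checks."""
    passing = []
    for seq_id, group in df.groupby(SEQ_COL, sort=True):
        times = group[TIME_COL].sort_values()
        if len(times) < min_frames:
            continue  # too few frames
        if times.iloc[0] > time_start_max:
            continue  # starts too late
        # diff() yields the gap between consecutive frames; the leading NaN
        # compares as False, so it never triggers a rejection.
        if (times.diff() > gap_threshold).any():
            continue  # contains a gap larger than the threshold
        passing.append(str(seq_id))
    return passing


frames = pd.DataFrame({
    SEQ_COL: ["a"] * 4 + ["b"] * 4 + ["c"] * 2 + ["d"] * 4,
    TIME_COL: [
        0.0, 0.05, 0.10, 0.15,  # a: passes every check
        0.0, 0.05, 0.50, 0.55,  # b: contains a 0.45 s gap
        0.0, 0.05,              # c: only two frames
        2.0, 2.05, 2.10, 2.15,  # d: starts after time_start_max
    ],
})
kept = filter_sequences_sketch(frames)
```

With these placeholder thresholds only sequence `a` survives; each of the other three sequences trips exactly one of the rules.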
Source code in src/birddmd/stats.py
variance_explained
Fraction of variance captured by reconstruction.
Analogous to R²:
VE = 1 - SS_res / SS_tot
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `original` | `ndarray` | Ground-truth data (any shape). | required |
| `reconstruction` | `ndarray` | Reconstructed data (same shape as `original`). | required |
Returns:
| Type | Description |
|---|---|
| `float` | Variance explained: in [0, 1] for a good fit, and negative when the residual sum of squares exceeds the total sum of squares. |
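The definition VE = 1 - SS_res / SS_tot can be sketched in NumPy as follows. The name `variance_explained_sketch` is hypothetical, and taking SS_tot about the global mean of `original` is an assumption borrowed from the usual R² definition; this page does not say how the library computes it.

```python
import numpy as np


def variance_explained_sketch(original, reconstruction) -> float:
    """Illustrative VE = 1 - SS_res / SS_tot (hypothetical helper)."""
    original = np.asarray(original, dtype=float)
    reconstruction = np.asarray(reconstruction, dtype=float)
    # Residual sum of squares: what the reconstruction fails to capture.
    ss_res = np.sum((original - reconstruction) ** 2)
    # Assumption: total sum of squares is taken about the global mean.
    ss_tot = np.sum((original - original.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


data = np.array([[1.0, 2.0], [3.0, 4.0]])
perfect = variance_explained_sketch(data, data)
mean_only = variance_explained_sketch(data, np.full_like(data, data.mean()))
poor = variance_explained_sketch(data, -data)
```

Under that assumption, an exact reconstruction gives 1, a reconstruction that only reproduces the mean gives 0, and a reconstruction worse than the mean (here the sign-flipped data) gives a negative value.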