
VAME step-by-step


tip

Also check out the published VAME Workflow Guide, with more hands-on recommendations, HERE.

tip

You can run an entire VAME workflow with just a few lines, using the Pipeline method.
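For orientation, a minimal sketch of that approach might look like the code below. The VAMEPipeline class name, its constructor arguments, and run_pipeline() are assumptions here, so check the Pipeline documentation for the exact interface.

# Sketch only: class and method names are assumptions, see the Pipeline docs
from vame.pipeline import VAMEPipeline

pipeline = VAMEPipeline(
    project_name="my_vame_project",
    poses_estimations=poses_estimations,   # list of pose estimation file paths
    source_software="DeepLabCut",
)
pipeline.run_pipeline()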

If you haven't yet, please install VAME:

pip install vame-py

The VAME workflow consists of four main steps, plus optional analysis:

  1. Initialize project: In this step we will start the project and get your pose estimation data into the movement format
  2. Preprocess: This step will perform cleaning, filtering and alignment of the raw pose estimation data
  3. Train the VAME model:
    • Split the input data into training and test datasets.
    • Train the VAME model to embed behavioural dynamics.
    • Evaluate the performance of the trained model based on its reconstruction capabilities.
  4. Segment behavior:
    • Segment pose estimation time series into behavioral motifs, using HMM or K-means.
    • Group similar motifs into communities, using hierarchical clustering.
  5. Visualization and analysis [Optional]:
    • Visualization and projection of latent vectors onto a 2D plane via UMAP.
    • Create motif and community videos.
    • Use the generative model (reconstruction decoder) to sample from the learned data distribution.

Let's start by importing the necessary libraries:

import vame
from vame.util.sample_data import download_sample_data
from pathlib import Path

Input data

To quickly try VAME, you can download sample data and use it as input. If you want to work with your own data, all you need to do is provide the paths to your pose estimation files as a list of strings. Optionally, you can also provide the paths to the corresponding video files.

# You can run VAME with data from different sources:
# "DeepLabCut", "SLEAP" or "LightningPose"
source_software = "DeepLabCut"

# Download sample data
ps = download_sample_data(source_software)
videos = [ps["video"]]
poses_estimations = [ps["poses"]]

print(videos)
print(poses_estimations)
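If you prefer to use your own recordings instead of the sample data, just build these lists yourself. The file paths below are hypothetical placeholders:

# Hypothetical example: point the lists at your own files
poses_estimations = [
    "/path/to/session_1_pose_estimation.h5",
    "/path/to/session_2_pose_estimation.h5",
]
videos = [
    "/path/to/session_1.mp4",
    "/path/to/session_2.mp4",
]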

Step 1: Initialize your project

VAME is organized around projects. To start a new project, you need to define a few basic things:

  • the project's name
  • the paths to the pose estimation files
  • the source software used to produce the pose estimation data
config_file, config_data = vame.init_new_project(
    project_name="my_vame_project",
    poses_estimations=poses_estimations,
    source_software="DeepLabCut",
)

This command creates a project folder, named after your project, in the working directory. Inside this folder you will find a configuration file called config.yaml, which holds the main parameters for the VAME workflow.

The videos and pose estimation files will be linked or copied to the project folder.

Let's take a look at the project's configuration:

print(config_data)

Now let's take a look at the formatted input dataset:

ds_path = Path(config_data["project_path"]) / "data" / "raw" / f"{config_data['session_names'][0]}.nc"
vame.io.load_poses.load_vame_dataset(ds_path)

Step 2: Preprocess the raw pose estimation data

The preprocessing step includes:

Cleaning low confidence data points

Pose estimation data points with confidence below the threshold will be cleared and interpolated.
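Conceptually, this is the same as masking points whose confidence falls below the threshold and interpolating over the resulting gaps. The snippet below only illustrates that idea with pandas; it is not VAME's internal implementation, and the 0.9 threshold is an arbitrary example value:

import pandas as pd

# Illustration only: x coordinates and confidences of one keypoint over 5 frames
x = pd.Series([10.0, 11.0, 250.0, 13.0, 14.0])
confidence = pd.Series([0.98, 0.95, 0.10, 0.97, 0.99])

threshold = 0.9                            # example confidence threshold
x_clean = x.mask(confidence < threshold)   # clear low-confidence points
x_clean = x_clean.interpolate()            # fill the gaps by interpolation
print(x_clean.tolist())                    # [10.0, 11.0, 12.0, 13.0, 14.0]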

Egocentric alignment using key reference points

Based on two reference keypoints, the data will be aligned to an egocentric coordinate system:

  • centered_reference_keypoint: The keypoint that will be centered in the frame.
  • orientation_reference_keypoint: The keypoint that will be used to determine the rotation of the frame.

As a consequence, the x and y coordinates of the centered_reference_keypoint and the x coordinate of the orientation_reference_keypoint become arrays of zeros and are then removed from the dataset.
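In plain numpy terms, this alignment translates each frame so that the centered keypoint sits at the origin and then rotates it so that the orientation keypoint falls on a fixed axis. The single-frame sketch below only illustrates the geometry; it is not VAME's internal code:

import numpy as np

# Illustration only: one frame with two reference keypoints (x, y)
snout = np.array([12.0, 7.0])     # centered_reference_keypoint
tailbase = np.array([4.0, 3.0])   # orientation_reference_keypoint

# 1. Translate so the centered keypoint is at the origin
tailbase_centered = tailbase - snout   # the snout itself becomes (0, 0)

# 2. Rotate so the orientation keypoint lies on the y-axis,
#    i.e. its x coordinate becomes ~0, as described above
theta = np.arctan2(tailbase_centered[1], tailbase_centered[0])
phi = np.pi / 2 - theta
rotation = np.array([
    [np.cos(phi), -np.sin(phi)],
    [np.sin(phi),  np.cos(phi)],
])
tailbase_aligned = rotation @ tailbase_centered
print(np.round(tailbase_aligned, 6))   # x component is ~0 after alignment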

Outlier cleaning

Outliers will be removed based on the interquartile range (IQR) method. This means that data points that are below Q1 - iqr_factor * IQR or above Q3 + iqr_factor * IQR will be cleared and interpolated.
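As a quick standalone illustration of that rule (not VAME's internal code, and with an arbitrary example iqr_factor):

import numpy as np

# Illustration only: a toy series with one obvious outlier
values = np.array([1.0, 1.2, 0.9, 1.1, 1.0, 25.0, 1.05, 0.95])
iqr_factor = 2.0   # example value

q1, q3 = np.percentile(values, [25, 75])
iqr = q3 - q1
lower, upper = q1 - iqr_factor * iqr, q3 + iqr_factor * iqr

outliers = (values < lower) | (values > upper)
print(values[outliers])   # [25.] -- this point would be cleared and interpolated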

Savitzky-Golay filtering

The data will be further smoothed using a Savitzky-Golay filter.
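For intuition, a Savitzky-Golay filter fits a low-order polynomial over a sliding window; scipy exposes it as scipy.signal.savgol_filter. The snippet below is a standalone illustration with arbitrary example parameters, not the settings VAME uses internally:

import numpy as np
from scipy.signal import savgol_filter

# Illustration only: smooth a noisy trajectory
t = np.linspace(0, 2 * np.pi, 100)
signal = np.sin(t)
noisy = signal + np.random.normal(scale=0.1, size=t.size)

# Example parameters: 15-sample window, 3rd-order polynomial
smoothed = savgol_filter(noisy, window_length=15, polyorder=3)
print(np.abs(noisy - signal).mean(), np.abs(smoothed - signal).mean())
# the smoothed trace sits closer to the underlying signal

With those concepts in mind, the whole preprocessing step is a single call: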

vame.preprocessing(
    config=config_data,
    centered_reference_keypoint="snout",
    orientation_reference_keypoint="tailbase",
)

Step 3: Train the VAME model

At this point, we will prepare the data for training the VAME model, run the training and evaluate the model.

We start by splitting the input data into train and test sets:

vame.create_trainset(config=config_data)

Now we can train the VAME model. This might take a while, depending on the dataset size and your hardware.

vame.train_model(config=config_data)

The model evaluation produces two plots: one showing the model's loss during training and the other showing the reconstruction and future prediction of an input sequence.

vame.evaluate_model(config=config_data)

Step 4: Segment behavior

Behavioral segmentation in VAME is done in two steps:

  1. Segmentation of pose estimation data into motifs
  2. Clustering motifs into communities
vame.segment_session(config=config_data)

This will perform the segmentation using two different algorithms: HMM and K-means. The results will be saved in the project folder.

Community detection is done by grouping similar motifs into communities using hierarchical clustering. For that you must choose:

  • segmentation_algorithm, which can be either "hmm" or "kmeans"
  • cut_tree, which is the cut level for the hierarchical clustering
vame.community(
    config=config_data,
    segmentation_algorithm="hmm",
    cut_tree=2,
)

Step 5: Visualization and analysis

from vame.visualization.motif import visualize_motif_tree
from vame.visualization.umap import visualize_umap
from vame.visualization.preprocessing import (
    visualize_preprocessing_scatter,
    visualize_preprocessing_timeseries,
)
from vame.visualization.model import plot_loss

# Inspect the preprocessing results
visualize_preprocessing_scatter(config=config_data)
visualize_preprocessing_timeseries(config=config_data)

# Plot the training losses of the VAME model
plot_loss(cfg=config_data, model_name="VAME")

# Plot the hierarchical tree of motifs
visualize_motif_tree(
    config=config_data,
    segmentation_algorithm="hmm",
)

# Project the latent vectors with UMAP, colored by community labels
visualize_umap(
    config=config_data,
    label="community",
    segmentation_algorithm="hmm",
)

Create motif and community videos

VAME only needs the pose estimation data to generate motifs and communities, but it also provides auxiliary functions to split the original videos into motif or community videos.

from vame.video import add_videos_to_project

add_videos_to_project(config=config_data, videos=videos)

Create motif videos to get insights about the fine grained poses:

vame.motif_videos(
    config=config_data,
    segmentation_algorithm='hmm',
)

Create community videos to get insights about behavior on a hierarchical scale:

vame.community_videos(
    config=config_data,
    segmentation_algorithm='hmm',
)