fiftyone.utils.video

Video utilities.

Copyright 2017-2025, Voxel51, Inc.
voxel51.com

Functions:

`extract_clip`(video_path, output_path[, ...])	Extracts the specified clip from the video.
`reencode_videos`(sample_collection[, ...])	Re-encodes the videos in the sample collection as H.264 MP4s that can be visualized in the FiftyOne App.
`transform_videos`(sample_collection[, fps, ...])	Transforms the videos in the sample collection according to the provided parameters using `ffmpeg`.
`sample_videos`(sample_collection[, ...])	Samples the videos in the sample collection into directories of per-frame images according to the provided parameters using `ffmpeg`.
`reencode_video`(input_path, output_path[, ...])	Re-encodes the video using the H.264 codec.
`transform_video`(input_path, output_path[, ...])	Transforms the video according to the provided parameters using `ffmpeg`.
`sample_video`(input_path, output_patt[, ...])	Samples the video into a directory of per-frame images according to the provided parameters using `ffmpeg`.
`sample_frames_uniform`(frame_rate[, ...])	Returns a list of frame numbers sampled uniformly according to the provided parameters.
`concat_videos`(input_paths, output_path[, ...])	Concatenates the given list of videos, in order, into a single video.
`exact_frame_count`(input_path)	Returns the exact number of frames in the video.

fiftyone.utils.video.extract_clip(video_path, output_path, support=None, timestamps=None, metadata=None, fast=False)

Extracts the specified clip from the video.

Provide either support or timestamps to this method.

When fast is False, the following ffmpeg command is used:

# Slower, more accurate option
ffmpeg -ss <start_time> -i <video_path> -t <duration> <output_path>

When fast is True, the following two-step ffmpeg process is used:

# Faster, less accurate option
ffmpeg -ss <start_time> -i <video_path> -t <duration> -c copy <tmp_path>
ffmpeg -i <tmp_path> <output_path>

Parameters:

video_path – the path to the video
output_path – the path to write the extracted clip
support (None) – the [first, last] frame number range to clip
timestamps (None) – the [start, stop] timestamps to clip, in seconds
metadata (None) – the fiftyone.core.metadata.VideoMetadata for the video
fast (False) – whether to use a faster-but-potentially-less-accurate strategy to extract the clip
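
The two strategies above can be sketched as plain command lists. This is only an illustration, not the library's implementation: the helper names `support_to_timestamps` and `build_clip_commands` and the default `tmp_path` are hypothetical, and the conversion from a [first, last] frame range to seconds assumes 1-based frame numbers and a constant frame rate.

```python
def support_to_timestamps(support, frame_rate):
    """Converts a [first, last] frame range to (start_time, duration) in seconds.

    Assumption: frame numbers are 1-based and the frame rate is constant.
    """
    first, last = support
    start_time = (first - 1) / frame_rate
    duration = (last - first + 1) / frame_rate
    return start_time, duration


def build_clip_commands(
    video_path, output_path, start_time, duration, fast=False, tmp_path="tmp_clip.mp4"
):
    """Returns the list of ffmpeg commands (as argument lists) for one clip."""
    seek = ["ffmpeg", "-ss", str(start_time), "-i", video_path, "-t", str(duration)]

    if not fast:
        # Slower, more accurate option: a single command that re-encodes
        return [seek + [output_path]]

    # Faster, less accurate option: stream-copy to a temporary file, then
    # convert the temporary file to the requested output
    return [
        seek + ["-c", "copy", tmp_path],
        ["ffmpeg", "-i", tmp_path, output_path],
    ]


start, duration = support_to_timestamps([31, 90], frame_rate=30.0)
for cmd in build_clip_commands("in.mp4", "out.mp4", start, duration, fast=True):
    print(" ".join(cmd))
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` if ffmpeg is installed.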

fiftyone.utils.video.reencode_videos(sample_collection, force_reencode=True, media_field='filepath', output_field=None, output_dir=None, rel_dir=None, update_filepaths=True, delete_originals=False, skip_failures=False, verbose=False, progress=None, **kwargs)

Re-encodes the videos in the sample collection as H.264 MP4s that can be visualized in the FiftyOne App.

If no output_dir is specified and delete_originals is False, any existing file that would be overwritten by an output with the same filename is renamed to <name>-original.<ext>.

By default, the re-encoding is performed via the following ffmpeg command:

ffmpeg \
    -loglevel error -vsync 0 -i $INPUT_PATH \
    -c:v libx264 -preset medium -crf 23 -pix_fmt yuv420p -vsync 0 -an \
    $OUTPUT_PATH

You can configure parameters of the re-encoding such as codec and compression by passing keyword arguments for eta.core.video.FFmpeg(**kwargs) to this function.

Note

This method will not update the metadata field of the collection after re-encoding. You can repopulate the metadata field if needed by calling:

sample_collection.compute_metadata(overwrite=True)

Parameters:

sample_collection – a fiftyone.core.collections.SampleCollection
force_reencode (True) – whether to re-encode videos that are already MP4s
media_field ("filepath") – the input field containing the video paths to transform
output_field (None) – an optional field in which to store the paths to the transformed videos. By default, media_field is updated in-place
output_dir (None) – an optional output directory in which to write the transformed videos. If none is provided, the videos are updated in-place
rel_dir (None) – an optional relative directory to strip from each input filepath to generate a unique identifier that is joined with output_dir to generate an output path for each video. This argument allows for populating nested subdirectories in output_dir that match the shape of the input paths
update_filepaths (True) – whether to store the output paths on the sample collection
delete_originals (False) – whether to delete the original videos after re-encoding
skip_failures (False) – whether to gracefully continue without raising an error if a video cannot be re-encoded
verbose (False) – whether to log the ffmpeg commands that are executed
progress (None) – whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
**kwargs – keyword arguments for eta.core.video.FFmpeg(**kwargs)
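
The interaction between output_dir, rel_dir, and the <name>-original.<ext> renaming can be illustrated with a small path-only sketch. The helpers `build_output_path` and `backup_path` are hypothetical names used for illustration. The sketch assumes POSIX-style paths, assumes that only the basename is kept when rel_dir is omitted, and leaves the file extension unchanged, so the library's actual behavior may differ.

```python
import posixpath


def build_output_path(input_path, output_dir, rel_dir=None):
    """Maps an input video path into ``output_dir``.

    If ``rel_dir`` is given, it is stripped from ``input_path`` and the
    remainder is joined with ``output_dir``, preserving nested subdirectories.
    Assumption: without ``rel_dir``, only the basename is kept.
    """
    if rel_dir is not None:
        rel_path = posixpath.relpath(input_path, rel_dir)
    else:
        rel_path = posixpath.basename(input_path)

    return posixpath.join(output_dir, rel_path)


def backup_path(path):
    """Returns the ``<name>-original.<ext>`` path for a file."""
    root, ext = posixpath.splitext(path)
    return root + "-original" + ext


print(build_output_path("/path/to/folderA/video1.mp4", "/tmp", rel_dir="/path/to"))
print(backup_path("/path/to/video.mp4"))
```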

fiftyone.utils.video.transform_videos(sample_collection, fps=None, min_fps=None, max_fps=None, size=None, min_size=None, max_size=None, reencode=False, force_reencode=False, media_field='filepath', output_field=None, output_dir=None, rel_dir=None, update_filepaths=True, delete_originals=False, skip_failures=False, verbose=False, progress=None, **kwargs)

Transforms the videos in the sample collection according to the provided parameters using ffmpeg.

If no output_dir is specified and delete_originals is False, any existing file that would be overwritten by an output with the same filename is renamed to <name>-original.<ext>.

In addition to the size and frame rate parameters, if reencode == True, the following basic ffmpeg command structure is used to re-encode the videos as H.264 MP4s:

ffmpeg \
    -loglevel error -vsync 0 -i $INPUT_PATH \
    -c:v libx264 -preset medium -crf 23 -pix_fmt yuv420p -vsync 0 -an \
    $OUTPUT_PATH

Note

This method will not update the metadata field of the collection after transforming. You can repopulate the metadata field if needed by calling:

sample_collection.compute_metadata(overwrite=True)

Parameters:

sample_collection – a fiftyone.core.collections.SampleCollection
fps (None) – an optional frame rate at which to resample the videos
min_fps (None) – an optional minimum frame rate. Videos with frame rate below this value are upsampled
max_fps (None) – an optional maximum frame rate. Videos with frame rate exceeding this value are downsampled
size (None) – an optional (width, height) for each frame. One dimension can be -1, in which case the aspect ratio is preserved
min_size (None) – an optional minimum (width, height) for each frame. A dimension can be -1 if no constraint should be applied. The frames are resized (aspect-preserving) if necessary to meet this constraint
max_size (None) – an optional maximum (width, height) for each frame. A dimension can be -1 if no constraint should be applied. The frames are resized (aspect-preserving) if necessary to meet this constraint
reencode (False) – whether to re-encode the videos as H.264 MP4s
force_reencode (False) – whether to re-encode videos whose parameters already satisfy the specified values
media_field ("filepath") – the input field containing the video paths to transform
output_field (None) – an optional field in which to store the paths to the transformed videos. By default, media_field is updated in-place
output_dir (None) – an optional output directory in which to write the transformed videos. If none is provided, the videos are updated in-place
rel_dir (None) – an optional relative directory to strip from each input filepath to generate a unique identifier that is joined with output_dir to generate an output path for each video. This argument allows for populating nested subdirectories in output_dir that match the shape of the input paths
update_filepaths (True) – whether to store the output paths on the sample collection
delete_originals (False) – whether to delete the original videos after transforming
skip_failures (False) – whether to gracefully continue without raising an error if a video cannot be transformed
verbose (False) – whether to log the ffmpeg commands that are executed
progress (None) – whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
**kwargs – keyword arguments for eta.core.video.FFmpeg(**kwargs)
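
One way to read the fps, min_fps, and max_fps parameters is as a clamp on the output frame rate. The following sketch uses a hypothetical helper, `resolve_output_fps`, and assumes that fps is applied first and then clamped by max_fps and min_fps. The source does not specify how the library combines these parameters, so treat this as an illustration only.

```python
def resolve_output_fps(input_fps, fps=None, min_fps=None, max_fps=None):
    """Returns the frame rate to resample to, or None if no change is needed.

    Assumption: ``fps`` is applied first, then clamped to [min_fps, max_fps].
    """
    output_fps = fps if fps is not None else input_fps

    if max_fps is not None and output_fps > max_fps:
        output_fps = max_fps  # downsample

    if min_fps is not None and output_fps < min_fps:
        output_fps = min_fps  # upsample

    return None if output_fps == input_fps else output_fps


print(resolve_output_fps(60, max_fps=30))
print(resolve_output_fps(10, min_fps=15))
print(resolve_output_fps(30, min_fps=15, max_fps=60))
```

Under these assumptions, a video whose frame rate already satisfies the constraints yields None, which corresponds to the case that force_reencode controls.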

fiftyone.utils.video.sample_videos(sample_collection, frames_patt=None, frames=None, fps=None, max_fps=None, size=None, min_size=None, max_size=None, original_frame_numbers=True, force_sample=False, media_field='filepath', output_field=None, output_dir=None, rel_dir=None, save_filepaths=False, delete_originals=False, skip_failures=False, verbose=False, progress=None, **kwargs)

Samples the videos in the sample collection into directories of per-frame images according to the provided parameters using ffmpeg.

By default, each folder of images is written using the same basename as the input video. For example, if frames_patt = "%06d.jpg", then videos with the following paths:

/path/to/video1.mp4
/path/to/video2.mp4
...

would be sampled as follows:

/path/to/video1/
    000001.jpg
    000002.jpg
    ...
/path/to/video2/
    000001.jpg
    000002.jpg
    ...

However, you can use the optional output_dir and rel_dir parameters to customize the location and shape of the sampled frame folders. For example, if output_dir = "/tmp" and rel_dir = "/path/to", then videos with the following paths:

/path/to/folderA/video1.mp4
/path/to/folderA/video2.mp4
/path/to/folderB/video3.mp4
...

would be sampled as follows:

/tmp/folderA/
    video1/
        000001.jpg
        000002.jpg
        ...
    video2/
        000001.jpg
        000002.jpg
        ...
/tmp/folderB/
    video3/
        000001.jpg
        000002.jpg
        ...

Parameters:

sample_collection – a fiftyone.core.collections.SampleCollection
frames_patt (None) – a pattern specifying the filename/format to use to store the sampled frames, e.g., "%06d.jpg". The default value is fiftyone.config.default_sequence_idx + fiftyone.config.default_image_ext
frames (None) – an optional list of lists defining specific frames to sample from each video. Entries can also be None, in which case all frames will be sampled. If provided, fps and max_fps are ignored
fps (None) – an optional frame rate at which to sample frames
max_fps (None) – an optional maximum frame rate. Videos with frame rate exceeding this value are downsampled
size (None) – an optional (width, height) for each frame. One dimension can be -1, in which case the aspect ratio is preserved
min_size (None) – an optional minimum (width, height) for each frame. A dimension can be -1 if no constraint should be applied. The frames are resized (aspect-preserving) if necessary to meet this constraint
max_size (None) – an optional maximum (width, height) for each frame. A dimension can be -1 if no constraint should be applied. The frames are resized (aspect-preserving) if necessary to meet this constraint
original_frame_numbers (True) – whether to use the original frame numbers when writing the output frames (True) or to instead reindex the frames as 1, 2, … (False)
force_sample (False) – whether to resample videos whose sampled frames already exist
media_field ("filepath") – the input field containing the video paths to sample
output_field (None) – an optional frame field in which to store the paths to the sampled frames. By default, media_field is used
output_dir (None) – an optional output directory in which to write the sampled frames. By default, the frames are written in folders with the same basename as each video
rel_dir (None) – a relative directory to remove from the filepath of each video, if possible. The path is converted to an absolute path (if necessary) via fiftyone.core.storage.normalize_path(). This argument can be used in conjunction with output_dir to cause the sampled frames to be written in a nested directory structure within output_dir matching the shape of the input video’s folder structure
save_filepaths (False) – whether to save the sampled frame paths in the output_field field of each frame of the input collection
delete_originals (False) – whether to delete the original videos after sampling
skip_failures (False) – whether to gracefully continue without raising an error if a video cannot be sampled
verbose (False) – whether to log the ffmpeg commands that are executed
progress (None) – whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead
**kwargs – keyword arguments for eta.core.video.FFmpeg(**kwargs)
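
The directory layouts described above can be reproduced with a short path-only sketch. The helper name `build_frames_patt` is hypothetical. The sketch assumes POSIX-style paths and a printf-style pattern such as "%06d.jpg", and it is not the library's implementation.

```python
import posixpath


def build_frames_patt(video_path, frames_patt="%06d.jpg", output_dir=None, rel_dir=None):
    """Returns the per-frame output pattern for one video.

    By default, frames go in a folder named after the video, next to it.
    With ``output_dir`` (and optionally ``rel_dir``), the folder is relocated.
    """
    # Drop the extension: /path/to/video1.mp4 -> /path/to/video1
    root = posixpath.splitext(video_path)[0]

    if output_dir is not None:
        if rel_dir is not None:
            # Keep the nested structure below rel_dir
            root = posixpath.relpath(root, rel_dir)
        else:
            # Assumption: without rel_dir, only the video's basename is kept
            root = posixpath.basename(root)

        root = posixpath.join(output_dir, root)

    return posixpath.join(root, frames_patt)


patt = build_frames_patt(
    "/path/to/folderA/video1.mp4", output_dir="/tmp", rel_dir="/path/to"
)
print(patt % 1)
```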

fiftyone.utils.video.reencode_video(input_path, output_path, verbose=False, **kwargs)

Re-encodes the video using the H.264 codec.

By default, the re-encoding is performed via the following ffmpeg command:

ffmpeg \
    -loglevel error -vsync 0 -i $INPUT_PATH \
    -c:v libx264 -preset medium -crf 23 -pix_fmt yuv420p -vsync 0 -an \
    $OUTPUT_PATH

You can configure parameters of the re-encoding such as codec and compression by passing keyword arguments for eta.core.video.FFmpeg(**kwargs) to this function.

Parameters:

input_path – the path to the input video
output_path – the path to write the output video
verbose (False) – whether to log the ffmpeg command that is executed
**kwargs – keyword arguments for eta.core.video.FFmpeg(**kwargs)
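
For readers who want to run the documented default command directly, the sketch below assembles it as an argument list. The helper `build_reencode_command` and its `crf` and `preset` arguments are hypothetical conveniences of this example. They are not keyword names accepted by eta.core.video.FFmpeg.

```python
def build_reencode_command(input_path, output_path, crf=23, preset="medium"):
    """Builds the default H.264 re-encoding command shown above."""
    return [
        "ffmpeg",
        "-loglevel", "error", "-vsync", "0", "-i", input_path,
        "-c:v", "libx264", "-preset", preset, "-crf", str(crf),
        "-pix_fmt", "yuv420p", "-vsync", "0", "-an",
        output_path,
    ]


cmd = build_reencode_command("input.avi", "output.mp4")
print(" ".join(cmd))
```

The list could be executed with `subprocess.run(cmd, check=True)` if ffmpeg is on the PATH. Note that `-an` drops the audio stream.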

fiftyone.utils.video.transform_video(input_path, output_path, fps=None, min_fps=None, max_fps=None, size=None, min_size=None, max_size=None, reencode=False, verbose=False, **kwargs)

Transforms the video according to the provided parameters using ffmpeg.

In addition to the size and frame rate parameters, if reencode == True, the following basic ffmpeg command structure is used to re-encode the video as an H.264 MP4:

ffmpeg \
    -loglevel error -vsync 0 -i $INPUT_PATH \
    -c:v libx264 -preset medium -crf 23 -pix_fmt yuv420p -vsync 0 -an \
    $OUTPUT_PATH

Parameters:

input_path – the path to the input video
output_path – the path to write the output video
fps (None) – an optional frame rate at which to resample the video
min_fps (None) – an optional minimum frame rate. If the video's frame rate is below this value, it is upsampled
max_fps (None) – an optional maximum frame rate. If the video's frame rate exceeds this value, it is downsampled
size (None) – an optional (width, height) for each frame. One dimension can be -1, in which case the aspect ratio is preserved
min_size (None) – an optional minimum (width, height) for each frame. A dimension can be -1 if no constraint should be applied. The frames are resized (aspect-preserving) if necessary to meet this constraint
max_size (None) – an optional maximum (width, height) for each frame. A dimension can be -1 if no constraint should be applied. The frames are resized (aspect-preserving) if necessary to meet this constraint
reencode (False) – whether to re-encode the video as an H.264 MP4 (see the description above)
verbose (False) – whether to log the ffmpeg command that is executed
**kwargs – keyword arguments for eta.core.video.FFmpeg(**kwargs)
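
The meaning of -1 in size and max_size can be illustrated with simple arithmetic. The helpers `fill_aspect` and `clamp_to_max` are hypothetical, and the rounding used here is an assumption. The library may round differently, for example to even dimensions as required by yuv420p.

```python
def fill_aspect(input_size, size):
    """Replaces a -1 dimension in ``size`` so the input aspect ratio is kept."""
    iw, ih = input_size
    w, h = size

    if w < 0 and h < 0:
        return iw, ih

    if w < 0:
        w = round(iw * h / ih)
    elif h < 0:
        h = round(ih * w / iw)

    return w, h


def clamp_to_max(input_size, max_size):
    """Shrinks ``input_size`` (aspect-preserving) to fit within ``max_size``.

    A -1 dimension in ``max_size`` means no constraint on that dimension.
    """
    iw, ih = input_size
    mw, mh = max_size

    scale = 1.0
    if mw > 0:
        scale = min(scale, mw / iw)
    if mh > 0:
        scale = min(scale, mh / ih)

    return round(iw * scale), round(ih * scale)


print(fill_aspect((1920, 1080), (-1, 540)))
print(clamp_to_max((1920, 1080), (1280, -1)))
```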

fiftyone.utils.video.sample_video(input_path, output_patt, frames=None, fps=None, max_fps=None, size=None, min_size=None, max_size=None, original_frame_numbers=True, verbose=False, **kwargs)

Samples the video into a directory of per-frame images according to the provided parameters using ffmpeg.

Parameters:

input_path – the path to the input video
output_patt – a pattern like /path/to/images/%06d.jpg specifying the filename/format to write the sampled frames
frames (None) – an iterable of frame numbers to sample. If provided, fps and max_fps are ignored
fps (None) – an optional frame rate at which to sample the frames
max_fps (None) – an optional maximum frame rate. If the video's frame rate exceeds this value, it is downsampled
size (None) – an optional (width, height) for each frame. One dimension can be -1, in which case the aspect ratio is preserved
min_size (None) – an optional minimum (width, height) for each frame. A dimension can be -1 if no constraint should be applied. The frames are resized (aspect-preserving) if necessary to meet this constraint
max_size (None) – an optional maximum (width, height) for each frame. A dimension can be -1 if no constraint should be applied. The frames are resized (aspect-preserving) if necessary to meet this constraint
original_frame_numbers (True) – whether to use the original frame numbers when writing the output frames (True) or to instead reindex the frames as 1, 2, … (False)
verbose (False) – whether to log the ffmpeg command that is executed
**kwargs – keyword arguments for eta.core.video.FFmpeg(**kwargs)
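
The effect of original_frame_numbers on the output filenames can be shown with a short sketch. The helper `frame_output_paths` is hypothetical and assumes that output_patt is a printf-style pattern such as /path/to/images/%06d.jpg.

```python
def frame_output_paths(output_patt, frames, original_frame_numbers=True):
    """Returns the image path written for each sampled frame number."""
    if original_frame_numbers:
        # Filenames carry the frame numbers of the source video
        return [output_patt % fn for fn in frames]

    # Filenames are reindexed as 1, 2, ...
    return [output_patt % idx for idx, _ in enumerate(frames, start=1)]


patt = "/path/to/images/%06d.jpg"
print(frame_output_paths(patt, [1, 4, 7]))
print(frame_output_paths(patt, [1, 4, 7], original_frame_numbers=False))
```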

fiftyone.utils.video.sample_frames_uniform(frame_rate, total_frame_count=None, support=None, fps=None, max_fps=None, always_sample_last=False)

Returns a list of frame numbers sampled uniformly according to the provided parameters.

Parameters:

frame_rate – the video frame rate
total_frame_count (None) – the total number of frames in the video
support (None) – a [first, last] frame range from which to sample
fps (None) – a frame rate at which to sample frames
max_fps (None) – a maximum frame rate at which to sample frames
always_sample_last (False) – whether to always sample the last frame

Returns:

a list of frame numbers, or None if all frames should be sampled
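
As a rough illustration of uniform sampling, the sketch below picks 1-based frame numbers at a fixed stride of frame_rate / fps. The function `uniform_frame_numbers` is a simplified, hypothetical stand-in and not the library's implementation. It ignores support, max_fps, and always_sample_last, and its rounding rule is an assumption.

```python
import math


def uniform_frame_numbers(frame_rate, total_frame_count, fps):
    """Returns 1-based frame numbers sampled at ``fps`` from a video.

    Returns None when ``fps`` is not lower than ``frame_rate``, mirroring the
    documented convention that None means all frames should be sampled.
    """
    if fps >= frame_rate:
        return None

    step = frame_rate / fps  # source frames per sampled frame

    frames = []
    k = 0
    while True:
        # Round half up to the nearest frame number
        fn = math.floor(1 + k * step + 0.5)
        if fn > total_frame_count:
            break

        frames.append(fn)
        k += 1

    return frames


print(uniform_frame_numbers(frame_rate=30, total_frame_count=10, fps=10))
```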

fiftyone.utils.video.concat_videos(input_paths, output_path, verbose=False)

Concatenates the given list of videos, in order, into a single video.

Parameters:

input_paths – a list of video paths
output_path – the path to write the output video
verbose (False) – whether to log the ffmpeg command that is executed
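
The source does not say which ffmpeg mechanism this function uses. As one common way to concatenate videos in order, the sketch below prepares input for ffmpeg's concat demuxer. The helpers `build_concat_list` and `build_concat_command` are hypothetical, and the stream copy (`-c copy`) assumes that all inputs share the same codecs and parameters.

```python
def build_concat_list(input_paths):
    """Builds the contents of a concat demuxer list file, one video per line."""
    lines = []
    for path in input_paths:
        # Single quotes inside a quoted path are written as '\''
        escaped = path.replace("'", "'\\''")
        lines.append(f"file '{escaped}'")

    return "\n".join(lines) + "\n"


def build_concat_command(list_path, output_path):
    """Builds an ffmpeg command that concatenates the listed videos."""
    return [
        "ffmpeg", "-f", "concat", "-safe", "0", "-i", list_path,
        "-c", "copy", output_path,
    ]


print(build_concat_list(["/path/to/a.mp4", "/path/to/b.mp4"]), end="")
print(" ".join(build_concat_command("list.txt", "out.mp4")))
```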

fiftyone.utils.video.exact_frame_count(input_path)

Returns the exact number of frames in the video.

Warning

This method uses the -count_frames argument of ffprobe, which requires decoding the video and can be very slow.

Parameters:

input_path – the path to the video

Returns:

the number of frames in the video
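
For reference, a typical standalone ffprobe invocation that uses -count_frames looks like the sketch below. The helper names are hypothetical, and the exact arguments that the library passes are not shown in the source, so this command is an assumption. Running `count_frames` requires ffprobe on the PATH and decodes the whole video.

```python
import subprocess


def build_count_frames_command(input_path):
    """Builds an ffprobe command that counts frames by decoding the video."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",  # first video stream only
        "-count_frames",
        "-show_entries", "stream=nb_read_frames",
        "-of", "csv=p=0",  # print the bare value
        input_path,
    ]


def count_frames(input_path):
    """Runs ffprobe and parses the frame count from its output."""
    result = subprocess.run(
        build_count_frames_command(input_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


print(" ".join(build_count_frames_command("/path/to/video.mp4")))
```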