fiftyone.utils.video

Video utilities.

Copyright 2017-2025, Voxel51, Inc.

Functions:

extract_clip(video_path, output_path[, ...])

Extracts the specified clip from the video.

reencode_videos(sample_collection[, ...])

Re-encodes the videos in the sample collection as H.264 MP4s that can be visualized in the FiftyOne App.

transform_videos(sample_collection[, fps, ...])

Transforms the videos in the sample collection according to the provided parameters using ffmpeg.

sample_videos(sample_collection[, ...])

Samples the videos in the sample collection into directories of per-frame images according to the provided parameters using ffmpeg.

reencode_video(input_path, output_path[, ...])

Re-encodes the video using the H.264 codec.

transform_video(input_path, output_path[, ...])

Transforms the video according to the provided parameters using ffmpeg.

sample_video(input_path, output_patt[, ...])

Samples the video into a directory of per-frame images according to the provided parameters using ffmpeg.

sample_frames_uniform(frame_rate[, ...])

Returns a list of frame numbers sampled uniformly according to the provided parameters.

concat_videos(input_paths, output_path[, ...])

Concatenates the given list of videos, in order, into a single video.

exact_frame_count(input_path)

Returns the exact number of frames in the video.

fiftyone.utils.video.extract_clip(video_path, output_path, support=None, timestamps=None, metadata=None, fast=False)

Extracts the specified clip from the video.

Provide either support or timestamps to this method.

When fast is False, the following ffmpeg command is used:

# Slower, more accurate option
ffmpeg -ss <start_time> -i <video_path> -t <duration> <output_path>

When fast is True, the following two-step ffmpeg process is used:

# Faster, less accurate option
ffmpeg -ss <start_time> -i <video_path> -t <duration> -c copy <tmp_path>
ffmpeg -i <tmp_path> <output_path>

Parameters:
  • video_path – the path to the video

  • output_path – the path to write the extracted clip

  • support (None) – the [first, last] frame number range to clip

  • timestamps (None) – the [start, stop] timestamps to clip, in seconds

  • metadata (None) – the fiftyone.core.metadata.VideoMetadata for the video

  • fast (False) – whether to use a faster-but-potentially-less-accurate strategy to extract the clip
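
As a rough illustration of the two strategies described above, the following sketch converts a support range into timestamps and assembles the corresponding ffmpeg argument lists. The helper names, the 1-based frame-to-time conversion, and the temporary path are assumptions made for this example; they are not taken from the library's internals.

```python
def support_to_timestamps(support, frame_rate):
    """Converts a [first, last] frame range into (start, stop) seconds.

    Assumption: frame numbers are 1-based and frame n covers the interval
    [(n - 1) / frame_rate, n / frame_rate).
    """
    first, last = support
    return (first - 1) / frame_rate, last / frame_rate


def build_clip_commands(video_path, output_path, start, stop, fast=False,
                        tmp_path="/tmp/clip.mp4"):
    """Returns the ffmpeg argument list(s) for the documented strategies.

    `tmp_path` is a hypothetical placeholder for the intermediate file.
    """
    duration = stop - start
    seek = ["ffmpeg", "-ss", str(start), "-i", video_path, "-t", str(duration)]
    if not fast:
        # Slower option: a single command that re-encodes the clip
        return [seek + [output_path]]

    # Faster option: stream-copy the clip first, then process the small copy
    return [
        seek + ["-c", "copy", tmp_path],
        ["ffmpeg", "-i", tmp_path, output_path],
    ]


start, stop = support_to_timestamps([31, 60], frame_rate=30.0)
commands = build_clip_commands("in.mp4", "out.mp4", start, stop, fast=True)
```

Each returned list could be passed to subprocess.run(...) on a machine where ffmpeg is installed.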

fiftyone.utils.video.reencode_videos(sample_collection, force_reencode=True, media_field='filepath', output_field=None, output_dir=None, rel_dir=None, update_filepaths=True, delete_originals=False, skip_failures=False, verbose=False, progress=None, **kwargs)

Re-encodes the videos in the sample collection as H.264 MP4s that can be visualized in the FiftyOne App.

If no output_dir is specified and delete_originals is False, then any original file that would be overwritten by an output with the same filename is renamed to <name>-original.<ext>.

By default, the re-encoding is performed via the following ffmpeg command:

ffmpeg \
    -loglevel error -vsync 0 -i $INPUT_PATH \
    -c:v libx264 -preset medium -crf 23 -pix_fmt yuv420p -vsync 0 -an \
    $OUTPUT_PATH

You can configure parameters of the re-encoding such as codec and compression by passing keyword arguments for eta.core.video.FFmpeg(**kwargs) to this function.

Note

This method will not update the metadata field of the collection after transforming. You can repopulate the metadata field if needed by calling:

sample_collection.compute_metadata(overwrite=True)

Parameters:
  • sample_collection – a fiftyone.core.collections.SampleCollection

  • force_reencode (True) – whether to re-encode videos that are already MP4s

  • media_field ("filepath") – the input field containing the video paths to transform

  • output_field (None) – an optional field in which to store the paths to the transformed videos. By default, media_field is updated in-place

  • output_dir (None) – an optional output directory in which to write the transformed videos. If none is provided, the videos are updated in-place

  • rel_dir (None) – an optional relative directory to strip from each input filepath to generate a unique identifier that is joined with output_dir to generate an output path for each video. This argument allows for populating nested subdirectories in output_dir that match the shape of the input paths

  • update_filepaths (True) – whether to store the output paths on the sample collection

  • delete_originals (False) – whether to delete the original videos after re-encoding

  • skip_failures (False) – whether to gracefully continue without raising an error if a video cannot be re-encoded

  • verbose (False) – whether to log the ffmpeg commands that are executed

  • progress (None) – whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead

  • **kwargs – keyword arguments for eta.core.video.FFmpeg(**kwargs)
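
The renaming rule for in-place updates described above can be sketched as follows. The helper name is hypothetical; only the <name>-original.<ext> pattern comes from the description.

```python
import posixpath


def original_backup_path(video_path):
    """Returns the <name>-original.<ext> path for the given video path."""
    root, ext = posixpath.splitext(video_path)
    return root + "-original" + ext


backup = original_backup_path("/path/to/video.mp4")
```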

fiftyone.utils.video.transform_videos(sample_collection, fps=None, min_fps=None, max_fps=None, size=None, min_size=None, max_size=None, reencode=False, force_reencode=False, media_field='filepath', output_field=None, output_dir=None, rel_dir=None, update_filepaths=True, delete_originals=False, skip_failures=False, verbose=False, progress=None, **kwargs)

Transforms the videos in the sample collection according to the provided parameters using ffmpeg.

If no output_dir is specified and delete_originals is False, then any original file that would be overwritten by an output with the same filename is renamed to <name>-original.<ext>.

In addition to the size and frame rate parameters, if reencode == True, the following basic ffmpeg command structure is used to re-encode the videos as H.264 MP4s:

ffmpeg \
    -loglevel error -vsync 0 -i $INPUT_PATH \
    -c:v libx264 -preset medium -crf 23 -pix_fmt yuv420p -vsync 0 -an \
    $OUTPUT_PATH

Note

This method will not update the metadata field of the collection after transforming. You can repopulate the metadata field if needed by calling:

sample_collection.compute_metadata(overwrite=True)

Parameters:
  • sample_collection – a fiftyone.core.collections.SampleCollection

  • fps (None) – an optional frame rate at which to resample the videos

  • min_fps (None) – an optional minimum frame rate. Videos with frame rate below this value are upsampled

  • max_fps (None) – an optional maximum frame rate. Videos with frame rate exceeding this value are downsampled

  • size (None) – an optional (width, height) for each frame. One dimension can be -1, in which case the aspect ratio is preserved

  • min_size (None) – an optional minimum (width, height) for each frame. A dimension can be -1 if no constraint should be applied. The frames are resized (aspect-preserving) if necessary to meet this constraint

  • max_size (None) – an optional maximum (width, height) for each frame. A dimension can be -1 if no constraint should be applied. The frames are resized (aspect-preserving) if necessary to meet this constraint

  • reencode (False) – whether to re-encode the videos as H.264 MP4s

  • force_reencode (False) – whether to re-encode videos whose parameters already satisfy the specified values

  • media_field ("filepath") – the input field containing the video paths to transform

  • output_field (None) – an optional field in which to store the paths to the transformed videos. By default, media_field is updated in-place

  • output_dir (None) – an optional output directory in which to write the transformed videos. If none is provided, the videos are updated in-place

  • rel_dir (None) – an optional relative directory to strip from each input filepath to generate a unique identifier that is joined with output_dir to generate an output path for each video. This argument allows for populating nested subdirectories in output_dir that match the shape of the input paths

  • update_filepaths (True) – whether to store the output paths on the sample collection

  • delete_originals (False) – whether to delete the original videos after transforming

  • skip_failures (False) – whether to gracefully continue without raising an error if a video cannot be transformed

  • verbose (False) – whether to log the ffmpeg commands that are executed

  • progress (None) – whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead

  • **kwargs – keyword arguments for eta.core.video.FFmpeg(**kwargs)
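
One plausible reading of how the frame rate and size parameters combine is sketched below. The function names and the precedence among the parameters are assumptions for illustration only; the library may resolve edge cases (such as rounding) differently.

```python
def resolve_fps(input_fps, fps=None, min_fps=None, max_fps=None):
    """Picks an output frame rate from the optional constraints."""
    out = input_fps if fps is None else fps
    if min_fps is not None and out < min_fps:
        out = min_fps  # too slow: upsample to the minimum
    if max_fps is not None and out > max_fps:
        out = max_fps  # too fast: downsample to the maximum
    return out


def infer_size(size, input_size):
    """Fills in a -1 dimension of `size` so the input aspect ratio is kept."""
    width, height = size
    in_width, in_height = input_size
    if width < 0 and height < 0:
        # Assumption: nothing was requested, so keep the input size
        return tuple(input_size)
    if width < 0:
        width = int(round(height * in_width / in_height))
    elif height < 0:
        height = int(round(width * in_height / in_width))
    return (width, height)


out_fps = resolve_fps(60, max_fps=30)
out_size = infer_size((-1, 360), (1920, 1080))
```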

fiftyone.utils.video.sample_videos(sample_collection, frames_patt=None, frames=None, fps=None, max_fps=None, size=None, min_size=None, max_size=None, original_frame_numbers=True, force_sample=False, media_field='filepath', output_field=None, output_dir=None, rel_dir=None, save_filepaths=False, delete_originals=False, skip_failures=False, verbose=False, progress=None, **kwargs)

Samples the videos in the sample collection into directories of per-frame images according to the provided parameters using ffmpeg.

By default, each folder of images is written using the same basename as the input video. For example, if frames_patt = "%06d.jpg", then videos with the following paths:

/path/to/video1.mp4
/path/to/video2.mp4
...

would be sampled as follows:

/path/to/video1/
    000001.jpg
    000002.jpg
    ...
/path/to/video2/
    000001.jpg
    000002.jpg
    ...

However, you can use the optional output_dir and rel_dir parameters to customize the location and shape of the sampled frame folders. For example, if output_dir = "/tmp" and rel_dir = "/path/to", then videos with the following paths:

/path/to/folderA/video1.mp4
/path/to/folderA/video2.mp4
/path/to/folderB/video3.mp4
...

would be sampled as follows:

/tmp/folderA/
    video1/
        000001.jpg
        000002.jpg
        ...
    video2/
        000001.jpg
        000002.jpg
        ...
/tmp/folderB/
    video3/
        000001.jpg
        000002.jpg
        ...

Parameters:
  • sample_collection – a fiftyone.core.collections.SampleCollection

  • frames_patt (None) – a pattern specifying the filename/format to use to store the sampled frames, e.g., "%06d.jpg". The default value is fiftyone.config.default_sequence_idx + fiftyone.config.default_image_ext

  • frames (None) – an optional list of lists defining specific frames to sample from each video. Entries can also be None, in which case all frames will be sampled. If provided, fps and max_fps are ignored

  • fps (None) – an optional frame rate at which to sample frames

  • max_fps (None) – an optional maximum frame rate. Videos with frame rate exceeding this value are downsampled

  • size (None) – an optional (width, height) for each frame. One dimension can be -1, in which case the aspect ratio is preserved

  • min_size (None) – an optional minimum (width, height) for each frame. A dimension can be -1 if no constraint should be applied. The frames are resized (aspect-preserving) if necessary to meet this constraint

  • max_size (None) – an optional maximum (width, height) for each frame. A dimension can be -1 if no constraint should be applied. The frames are resized (aspect-preserving) if necessary to meet this constraint

  • original_frame_numbers (True) – whether to use the original frame numbers when writing the output frames (True) or to instead reindex the frames as 1, 2, … (False)

  • force_sample (False) – whether to resample videos whose sampled frames already exist

  • media_field ("filepath") – the input field containing the video paths to sample

  • output_field (None) – an optional frame field in which to store the paths to the sampled frames. By default, media_field is used

  • output_dir (None) – an optional output directory in which to write the sampled frames. By default, the frames are written in folders with the same basename as each video

  • rel_dir (None) – a relative directory to remove from the filepath of each video, if possible. The path is converted to an absolute path (if necessary) via fiftyone.core.storage.normalize_path(). This argument can be used in conjunction with output_dir to cause the sampled frames to be written in a nested directory structure within output_dir matching the shape of the input video’s folder structure

  • save_filepaths (False) – whether to save the sampled frame paths in the output_field field of each frame of the input collection

  • delete_originals (False) – whether to delete the original videos after sampling

  • skip_failures (False) – whether to gracefully continue without raising an error if a video cannot be sampled

  • verbose (False) – whether to log the ffmpeg commands that are executed

  • progress (None) – whether to render a progress bar (True/False), use the default value fiftyone.config.show_progress_bars (None), or a progress callback function to invoke instead

  • **kwargs – keyword arguments for eta.core.video.FFmpeg(**kwargs)
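
The directory layouts shown in the examples above can be reproduced with a small path-mapping sketch. The helper names are hypothetical, and the code only mirrors the documented examples (POSIX-style paths, a printf-style pattern such as "%06d.jpg"); it is not the library's implementation.

```python
import posixpath


def frames_dir_for(video_path, output_dir=None, rel_dir=None):
    """Returns the folder in which a video's frames would be written."""
    root, _ = posixpath.splitext(video_path)
    if output_dir is None:
        # Default: a folder next to the video with the same basename
        return root
    if rel_dir is not None:
        # Keep the part of the path below rel_dir
        relative = posixpath.relpath(root, rel_dir)
    else:
        relative = posixpath.basename(root)
    return posixpath.join(output_dir, relative)


def frame_path(frames_dir, frames_patt, frame_number):
    """Fills the printf-style pattern with a frame number."""
    return posixpath.join(frames_dir, frames_patt % frame_number)


default_dir = frames_dir_for("/path/to/video1.mp4")
nested_dir = frames_dir_for(
    "/path/to/folderA/video1.mp4", output_dir="/tmp", rel_dir="/path/to"
)
second_frame = frame_path(nested_dir, "%06d.jpg", 2)
```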

fiftyone.utils.video.reencode_video(input_path, output_path, verbose=False, **kwargs)

Re-encodes the video using the H.264 codec.

By default, the re-encoding is performed via the following ffmpeg command:

ffmpeg \
    -loglevel error -vsync 0 -i $INPUT_PATH \
    -c:v libx264 -preset medium -crf 23 -pix_fmt yuv420p -vsync 0 -an \
    $OUTPUT_PATH

You can configure parameters of the re-encoding such as codec and compression by passing keyword arguments for eta.core.video.FFmpeg(**kwargs) to this function.

Parameters:
  • input_path – the path to the input video

  • output_path – the path to write the output video

  • verbose (False) – whether to log the ffmpeg command that is executed

  • **kwargs – keyword arguments for eta.core.video.FFmpeg(**kwargs)
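
For readers who want to reproduce the default command outside of FiftyOne, the sketch below assembles the documented ffmpeg invocation as an argument list. The function name is hypothetical, and the sketch does not model how keyword arguments alter the command.

```python
def default_reencode_command(input_path, output_path):
    """Returns the documented default re-encoding command as an argv list."""
    return [
        "ffmpeg",
        "-loglevel", "error", "-vsync", "0", "-i", input_path,
        # H.264 video, yuv420p pixel format, audio dropped (-an)
        "-c:v", "libx264", "-preset", "medium", "-crf", "23",
        "-pix_fmt", "yuv420p", "-vsync", "0", "-an",
        output_path,
    ]


command = default_reencode_command("/path/to/input.avi", "/path/to/output.mp4")
```

With ffmpeg installed, the list could be executed via subprocess.run(command, check=True).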

fiftyone.utils.video.transform_video(input_path, output_path, fps=None, min_fps=None, max_fps=None, size=None, min_size=None, max_size=None, reencode=False, verbose=False, **kwargs)

Transforms the video according to the provided parameters using ffmpeg.

In addition to the size and frame rate parameters, if reencode == True, the following basic ffmpeg command structure is used to re-encode the video as an H.264 MP4:

ffmpeg \
    -loglevel error -vsync 0 -i $INPUT_PATH \
    -c:v libx264 -preset medium -crf 23 -pix_fmt yuv420p -vsync 0 -an \
    $OUTPUT_PATH

Parameters:
  • input_path – the path to the input video

  • output_path – the path to write the output video

  • fps (None) – an optional frame rate at which to resample the video

  • min_fps (None) – an optional minimum frame rate. If the video's frame rate is below this value, it is upsampled

  • max_fps (None) – an optional maximum frame rate. If the video's frame rate exceeds this value, it is downsampled

  • size (None) – an optional (width, height) for each frame. One dimension can be -1, in which case the aspect ratio is preserved

  • min_size (None) – an optional minimum (width, height) for each frame. A dimension can be -1 if no constraint should be applied. The frames are resized (aspect-preserving) if necessary to meet this constraint

  • max_size (None) – an optional maximum (width, height) for each frame. A dimension can be -1 if no constraint should be applied. The frames are resized (aspect-preserving) if necessary to meet this constraint

  • reencode (False) – whether to re-encode the video (see main description)

  • verbose (False) – whether to log the ffmpeg command that is executed

  • **kwargs – keyword arguments for eta.core.video.FFmpeg(**kwargs)

fiftyone.utils.video.sample_video(input_path, output_patt, frames=None, fps=None, max_fps=None, size=None, min_size=None, max_size=None, original_frame_numbers=True, verbose=False, **kwargs)

Samples the video into a directory of per-frame images according to the provided parameters using ffmpeg.

Parameters:
  • input_path – the path to the input video

  • output_patt – a pattern like /path/to/images/%06d.jpg specifying the filename/format to write the sampled frames

  • frames (None) – an iterable of frame numbers to sample. If provided, fps and max_fps are ignored

  • fps (None) – an optional frame rate at which to sample the frames

  • max_fps (None) – an optional maximum frame rate. If the video's frame rate exceeds this value, it is downsampled

  • size (None) – an optional (width, height) for each frame. One dimension can be -1, in which case the aspect ratio is preserved

  • min_size (None) – an optional minimum (width, height) for each frame. A dimension can be -1 if no constraint should be applied. The frames are resized (aspect-preserving) if necessary to meet this constraint

  • max_size (None) – an optional maximum (width, height) for each frame. A dimension can be -1 if no constraint should be applied. The frames are resized (aspect-preserving) if necessary to meet this constraint

  • original_frame_numbers (True) – whether to use the original frame numbers when writing the output frames (True) or to instead reindex the frames as 1, 2, … (False)

  • verbose (False) – whether to log the ffmpeg command that is executed

  • **kwargs – keyword arguments for eta.core.video.FFmpeg(**kwargs)

fiftyone.utils.video.sample_frames_uniform(frame_rate, total_frame_count=None, support=None, fps=None, max_fps=None, always_sample_last=False)

Returns a list of frame numbers sampled uniformly according to the provided parameters.

Parameters:
  • frame_rate – the video frame rate

  • total_frame_count (None) – the total number of frames in the video

  • support (None) – a [first, last] frame range from which to sample

  • fps (None) – a frame rate at which to sample frames

  • max_fps (None) – a maximum frame rate at which to sample frames

  • always_sample_last (False) – whether to always sample the last frame

Returns:

a list of frame numbers, or None if all frames should be sampled
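
To make the idea of uniform sampling concrete, here is a minimal sketch that steps through 1-based frame numbers at a fixed stride. It is an illustrative approximation under stated assumptions (no support argument, a hypothetical function name, simple rounding), not the library's algorithm.

```python
def sketch_sample_frames_uniform(frame_rate, total_frame_count, fps=None,
                                 max_fps=None, always_sample_last=False):
    """Returns uniformly spaced 1-based frame numbers, or None for all frames."""
    target = frame_rate if fps is None else fps
    if max_fps is not None:
        target = min(target, max_fps)
    if target >= frame_rate:
        # Nothing to drop: every frame should be sampled
        return None

    step = frame_rate / target  # input frames per sampled frame
    frames = []
    position = 1.0
    while position <= total_frame_count:
        number = int(round(position))
        if not frames or number > frames[-1]:
            frames.append(number)
        position += step

    if always_sample_last and frames and frames[-1] < total_frame_count:
        frames.append(total_frame_count)
    return frames


every_third = sketch_sample_frames_uniform(30, 10, fps=10)
with_last = sketch_sample_frames_uniform(30, 9, fps=10, always_sample_last=True)
```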

fiftyone.utils.video.concat_videos(input_paths, output_path, verbose=False)

Concatenates the given list of videos, in order, into a single video.

Parameters:
  • input_paths – a list of video paths

  • output_path – the path to write the output video

  • verbose (False) – whether to log the ffmpeg command that is executed

fiftyone.utils.video.exact_frame_count(input_path)

Returns the exact number of frames in the video.

Warning

This method uses the -count_frames argument of ffprobe, which requires decoding the video and can be very slow.

Parameters:

input_path – the path to the video

Returns:

the number of frames in the video
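
For context, a standalone ffprobe invocation that counts frames by decoding might look like the sketch below. Only the -count_frames flag is stated in the warning above; the remaining options and both helper names are assumptions chosen for this example.

```python
def count_frames_command(input_path):
    """Builds an ffprobe argv list that decodes the first video stream."""
    return [
        "ffprobe", "-v", "error",
        "-count_frames",                 # decode the stream to count frames
        "-select_streams", "v:0",        # first video stream only
        "-show_entries", "stream=nb_read_frames",
        "-of", "default=nokey=1:noprint_wrappers=1",
        input_path,
    ]


def parse_frame_count(ffprobe_output):
    """Parses the single integer printed by the command above."""
    return int(ffprobe_output.strip())


command = count_frames_command("/path/to/video.mp4")
```

With ffprobe installed, the stdout of subprocess.run(command, capture_output=True, text=True, check=True) could be passed to parse_frame_count.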