Metadata is stored in the container format of a video, not in the encoded video data. There are many container formats, all with different capabilities. Open Source video players such as Bino or VLC typically rely on FFmpeg to read metadata and present it in a unified way. For stereoscopic 3D and surround video metadata, see FFmpeg’s stereo3d.h and spherical.h.
However, while FFmpeg can read the relevant metadata, it does not provide a reliable way to write it. You have to use tools specific to each container format.
In the following, only the Matroska container formats MKV and WebM and the MP4 container format are discussed, since only those provide relevant metadata capabilities. The metadata tool for MKV/WebM is mkvpropedit, which is part of the MKVToolNix package. For MP4, you can use ExifTool.
There are many ways to pack the left and right views into a video: you can put them on top of each other, place them side by side, interleave them in some way, or alternate between left and right views.
Interleaving and alternating are very bad ideas because they hamper video encoding: encoders rely heavily on finding similar regions both within a video frame and between consecutive frames, and these layouts place content from different views directly next to each other in space or time.
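The effect can be illustrated with a toy calculation. The sketch below is not a model of a real encoder; it assumes that each image row can be summarized by a single brightness value and that the two views differ by a constant offset, and it simply adds up how much neighboring rows differ in a top-bottom frame versus a row-interleaved frame. All names are made up for this illustration.

```python
def adjacent_row_difference(rows):
    """Sum of absolute differences between each row value and the next one."""
    return sum(abs(a - b) for a, b in zip(rows, rows[1:]))


# Toy views: every row is represented by one brightness value (an assumption
# made purely for illustration).
left = list(range(8))               # smooth gradient 0, 1, ..., 7
right = [v + 50 for v in left]      # same gradient, shifted brightness

# Top-bottom: all left rows followed by all right rows.
top_bottom = left + right
# Row-interleaved: left row, right row, left row, right row, ...
interleaved = [v for pair in zip(left, right) for v in pair]

print("top-bottom: ", adjacent_row_difference(top_bottom))
print("interleaved:", adjacent_row_difference(interleaved))
```

In the top-bottom frame only the single seam between the views is discontinuous, whereas in the interleaved frame every pair of neighboring rows is, which is the kind of content that prediction-based encoders handle poorly.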
I recommend the top-bottom layout, where the left view is placed on top of the right view, resulting in a single video frame that is twice as high as one of the views.
This is better than the left-right layout (where the left and right views are placed side by side) because the width and height of the resulting video frame are similar for typical 16:9 videos. This makes it more likely that the video dimensions fit into the size restrictions of hardware-based encoding and decoding capabilities. For example, NVIDIA GPUs typically support frames of up to 4096x4096 or 8192x8192 pixels, so a top-bottom frame is more likely to fit than a left-right frame.
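The size argument can be checked with a few lines of arithmetic. The following is a minimal sketch: the function names are hypothetical, and the 2560x1440 view size is just an example chosen to show the difference against an assumed square limit of 4096x4096 pixels.

```python
def packed_size(view_width, view_height, layout):
    """Return (width, height) of a frame packing two full-resolution views."""
    if layout == "top-bottom":
        return view_width, 2 * view_height
    if layout == "left-right":
        return 2 * view_width, view_height
    raise ValueError(f"unknown layout: {layout}")


def fits(size, limit):
    """True if both frame dimensions are within a square size limit."""
    width, height = size
    return width <= limit and height <= limit


# Example: two 2560x1440 (16:9) views, assumed hardware limit of 4096x4096.
for layout in ("top-bottom", "left-right"):
    size = packed_size(2560, 1440, layout)
    verdict = "fits" if fits(size, 4096) else "too large"
    print(f"{layout}: {size[0]}x{size[1]} {verdict}")
```

With these numbers the top-bottom frame is 2560x2880 and stays within the limit, while the left-right frame is 5120x1440 and exceeds it in width.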
There are layout variants where the right view comes first, but there is simply no reason to use them. There are also variants where the left and right view have only half the resolution vertically (for top-bottom) or horizontally (for left-right) so that a packed frame has the usual aspect ratio, but the resulting limitation of video quality is simply not necessary nowadays.
Only the Matroska container formats MKV and WebM have proper support for Stereo-3D metadata. There is a StereoMode element which can have values 3 for top-bottom and 1 for left-right. There are other values, but they are irrelevant for the reasons explained above.
Example to set top-bottom layout on video track 1:
mkvpropedit video.mkv --edit track:v1 --set stereo-mode=3
FFmpeg and Bino support this metadata; VLC does not.
There is also a metadata proposal by 3dtv.at, but it only applies to WMV files. While Bino supports this metadata, FFmpeg and VLC do not.
Both surround modes use an equirectangular map to store the surrounding sphere in a rectangular video frame. For 360° video, the frames have an aspect ratio of 2:1. For 180° video, the aspect ratio is 1:1 since it only encodes the half-sphere in front of the viewer.
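These aspect ratios follow from the equirectangular map, which gives every degree of longitude and latitude the same number of pixels: the frame always spans 180° vertically and either 360° or 180° horizontally. A small sketch, with a hypothetical function name and an arbitrary example resolution of 16 pixels per degree:

```python
def equirect_size(horizontal_degrees, pixels_per_degree):
    """Frame size of an equirectangular map with uniform angular resolution.

    The vertical extent is always 180 degrees (pole to pole); only the
    horizontal extent differs between 360 and 180 degree video.
    """
    width = horizontal_degrees * pixels_per_degree
    height = 180 * pixels_per_degree
    return width, height


for degrees in (360, 180):
    width, height = equirect_size(degrees, 16)
    print(f"{degrees} degrees: {width}x{height}, aspect ratio {width / height}")
```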
The Matroska container formats MKV and WebM have proper support for Surround metadata. There is a ProjectionType element which can have the value 1 for equirectangular projection. There are other values, but they are only relevant for special-purpose applications.
Example to set equirectangular projection on video track 1:
mkvpropedit video.mkv --edit track:v1 --set projection-type=1
For the MP4 container format, the Google Spherical Video specification describes the XMP GSpherical extension which can be set with exiftool:
exiftool \
-XMP-GSpherical:Spherical="true" \
-XMP-GSpherical:Stitched="true" \
-XMP-GSpherical:ProjectionType="equirectangular" \
video.mp4
Both the MKV/WebM and MP4 metadata are understood by FFmpeg, Bino, and VLC.
Note that all of the above metadata is only for 360° surround video. FFmpeg can report metadata for 180° video to applications, but there is no practical way to set this metadata for specific container formats.
You can also have stereoscopic surround videos, and the Matroska metadata can simply be combined.
Example to set top-bottom layout and equirectangular projection on video track 1:
mkvpropedit video.mkv --edit track:v1 --set stereo-mode=3 --set projection-type=1
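When many files need tagging, the command line can be assembled in a script. The following is a minimal sketch, not part of MKVToolNix: the helper and its parameter names are hypothetical, and it uses only the mkvpropedit options and the StereoMode and ProjectionType values given above.

```python
import shlex

# Matroska StereoMode values named in the text above.
STEREO_MODE = {"left-right": 1, "top-bottom": 3}
# Matroska ProjectionType value for equirectangular projection.
PROJECTION_TYPE = {"equirectangular": 1}


def mkvpropedit_args(path, layout=None, projection=None, track="v1"):
    """Build an mkvpropedit argument list (hypothetical helper).

    layout and projection are optional, so the same helper covers plain
    stereoscopic video, plain surround video, and the combination of both.
    """
    args = ["mkvpropedit", path, "--edit", f"track:{track}"]
    if layout is not None:
        args += ["--set", f"stereo-mode={STEREO_MODE[layout]}"]
    if projection is not None:
        args += ["--set", f"projection-type={PROJECTION_TYPE[projection]}"]
    return args


args = mkvpropedit_args("video.mkv", "top-bottom", "equirectangular")
# Only print the command here; nothing is executed.
print(shlex.join(args))
```

The printed command is the same as the example above. To actually apply it, the list could be passed to subprocess.run(args, check=True), assuming mkvpropedit is installed and on the PATH.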
The XMP GSpherical extension also has an additional field for stereoscopic mode that you can set to top-bottom or left-right.
Example to set top-bottom layout and equirectangular projection with exiftool:
exiftool \
-XMP-GSpherical:Spherical="true" \
-XMP-GSpherical:Stitched="true" \
-XMP-GSpherical:ProjectionType="equirectangular" \
-XMP-GSpherical:StereoMode="top-bottom" \
video.mp4
Both the MKV/WebM and MP4 combined metadata are understood by FFmpeg and Bino. VLC understands only the equirectangular part.
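The exiftool invocations for MP4 can be generated in the same scripted way. This is again only a sketch: the helper name is hypothetical, and it emits exactly the XMP GSpherical tags and values used in the examples above, adding StereoMode only when a layout is requested.

```python
def gspherical_args(path, stereo_mode=None):
    """Build an exiftool argument list for XMP GSpherical tags (hypothetical helper)."""
    tags = {
        "Spherical": "true",
        "Stitched": "true",
        "ProjectionType": "equirectangular",
    }
    if stereo_mode is not None:
        # Only the two layouts discussed in the text are accepted here.
        if stereo_mode not in ("top-bottom", "left-right"):
            raise ValueError(f"unsupported stereo mode: {stereo_mode}")
        tags["StereoMode"] = stereo_mode
    # No shell quoting is needed because the list is passed as separate arguments.
    tag_args = [f"-XMP-GSpherical:{name}={value}" for name, value in tags.items()]
    return ["exiftool"] + tag_args + [path]


for argument in gspherical_args("video.mp4", stereo_mode="top-bottom"):
    print(argument)
```

As before, the resulting list could be handed to subprocess.run(..., check=True), assuming exiftool is installed.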
Recommendations: