The LWN.net Video4Linux2 API series.
Before any application can work with a video device, it must come to an understanding with the driver about how video data will be formatted. This negotiation can be a rather complex process, resulting from the facts that (1) video hardware varies widely in the formats it can handle, and (2) performing format transformations in the kernel is frowned upon. So the application must be able to find out what formats are supported by the hardware and set up a configuration which is workable for everybody involved. This article will cover the basics of how formats are described; the next installment will get into the API implemented by V4L2 drivers to negotiate formats with applications.
A colorspace is, in broad terms, the coordinate system used to describe colors. There are several of them defined by the V4L2 specification, but only two are used in any broad way. They are:
This colorspace also covers the set of YUV and YCbCr representations. This representation derives from the need for early color television signals to be displayable on monochrome TV sets. So the Y (or "luminance") value is a simple brightness value; when displayed alone, it yields a grayscale image. The U and V (or Cb and Cr) "chrominance" values describe the blue and red components of the color; green can be derived by subtracting those components from the luminance. Conversion between YUV and RGB is not entirely straightforward, however; there are several formulas to choose from.
Note that YUV and YCbCr are not exactly the same thing, though the terms are often used interchangeably.
Quite a few other colorspaces exist; most of them are variants of television-related standards. See this page from the V4L2 specification for the full list.
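As noted above, there are several formulas for converting between YUV and RGB. The following is a minimal sketch of just one of them, assuming full-range 8-bit YCbCr with the BT.601 coefficients used by JFIF; the function names are made up for illustration, and other standards (or limited-range data) would need different constants.

```c
#include <stdint.h>

/* Clamp a floating-point value to 0..255 and round to nearest. */
static uint8_t clamp_u8(double v)
{
    if (v < 0.0)
        return 0;
    if (v > 255.0)
        return 255;
    return (uint8_t)(v + 0.5);
}

/*
 * Full-range BT.601 (JFIF-style) YCbCr to RGB conversion.
 * This is an assumed choice of formula, not the only valid one.
 */
static void ycbcr_to_rgb(uint8_t y, uint8_t cb, uint8_t cr,
                         uint8_t *r, uint8_t *g, uint8_t *b)
{
    double d_cb = (double)cb - 128.0;   /* chroma is centered on 128 */
    double d_cr = (double)cr - 128.0;

    *r = clamp_u8(y + 1.402 * d_cr);
    *g = clamp_u8(y - 0.344136 * d_cb - 0.714136 * d_cr);
    *b = clamp_u8(y + 1.772 * d_cb);
}
```

With both chroma values at their midpoint of 128, the result is a gray whose level equals Y, which matches the observation that Y alone yields a grayscale image.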
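As we have seen, pixel values are expressed as tuples, usually consisting of RGB or YUV values. There are two commonly-used ways of organizing those tuples into an image: packed formats, which store all of the values for a pixel together in memory, and planar formats, which put each component into a separate array.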
Packed formats might be more commonly used, especially with RGB formats, but both types can be generated by hardware and requested by applications. If the video device supports both packed and planar formats, the driver should make them both available to user space.
Color formats are described within the V4L2 API using the venerable "fourcc" code mechanism. These codes are 32-bit values, generated from four ASCII characters. As such, they have the advantages of being easily passed around and being human-readable. When a color format code reads, for example, 'RGB4', there is no need to go look it up in a table.
Note that fourcc codes are used in a lot of different settings, some of which predate Linux. The MPlayer application uses them internally. fourcc refers only to the coding mechanism, however, and says nothing about which codes are actually used - MPlayer has a translation function for converting between its fourcc codes and those used by V4L2.
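The way four characters turn into a 32-bit code can be sketched as follows. EXAMPLE_FOURCC and fourcc_to_string() are hypothetical local names; the packing shown (first character in the least significant byte) is the one used by the v4l2_fourcc() macro in <linux/videodev2.h>.

```c
#include <stdint.h>

/*
 * Hypothetical local macro mirroring the packing of v4l2_fourcc():
 * the first character lands in the low byte of the 32-bit value.
 */
#define EXAMPLE_FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

/* Decode a fourcc value back into a NUL-terminated four-character string. */
static void fourcc_to_string(uint32_t code, char out[5])
{
    for (int i = 0; i < 4; i++)
        out[i] = (char)((code >> (8 * i)) & 0xff);
    out[4] = '\0';
}
```

Under this packing, EXAMPLE_FOURCC('R', 'G', 'B', '4') evaluates to 0x34424752, and decoding that value gives back the string "RGB4".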
In the format descriptions shown below, bytes are always listed in memory order - least significant bytes first on a little-endian machine. The least significant bit of each byte is on the right; for each color field, the lighter-shaded bit is the most significant.
Name | fourcc
---|---
V4L2_PIX_FMT_RGB332 | RGB1
V4L2_PIX_FMT_RGB444 | R444
V4L2_PIX_FMT_RGB555 | RGBO
V4L2_PIX_FMT_RGB565 | RGBP
V4L2_PIX_FMT_RGB555X | RGBQ
V4L2_PIX_FMT_RGB565X | RGBR
V4L2_PIX_FMT_BGR24 | BGR3
V4L2_PIX_FMT_RGB24 | RGB3
V4L2_PIX_FMT_BGR32 | BGR4
V4L2_PIX_FMT_RGB32 | RGB4
V4L2_PIX_FMT_SBGGR8 | BA81

[Per-byte bit-layout diagrams (Byte 0 through Byte 3) not reproduced.]
When formats with empty space (shown in gray, above) are used, applications may use that space for an alpha (transparency) value.
The final format above is the "Bayer" format, which is generally something very close to the real data from the sensor found in most cameras. Each pixel carries a single color value: half of the pixels are green, arranged in a checkerboard pattern, while blue and red each get a quarter of the pixels. Essentially, green carries the more important intensity information, with the colors missing at each pixel being interpolated from its neighbors. This is a pattern we will see again with the YUV formats.
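As an illustration of the Bayer layout, here is a small sketch that reports which color a given sensor position holds. It assumes the BGGR ordering suggested by the SBGGR8 name (blue and green alternating on even rows, green and red on odd rows); the enum and function names are invented for this example.

```c
enum bayer_color { BAYER_BLUE, BAYER_GREEN, BAYER_RED };

/*
 * Color sampled at column x, row y of an assumed BGGR mosaic:
 *   even rows: B G B G ...
 *   odd rows:  G R G R ...
 */
static enum bayer_color bggr_color_at(unsigned int x, unsigned int y)
{
    /* Green sits wherever the row and column parities differ. */
    if ((x & 1) != (y & 1))
        return BAYER_GREEN;
    /* Matching parities: blue on even rows, red on odd rows. */
    return (y & 1) ? BAYER_RED : BAYER_BLUE;
}
```

A demosaicing routine would use a lookup like this to decide which two colors must be interpolated from neighboring pixels at each position.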
The packed YUV formats will be shown first. The key for reading this table is:
Name | fourcc
---|---
V4L2_PIX_FMT_GREY | GREY
V4L2_PIX_FMT_YUYV | YUYV
V4L2_PIX_FMT_UYVY | UYVY
V4L2_PIX_FMT_Y41P | Y41P

[Per-byte bit-layout diagrams (Byte 0 through Byte 3) not reproduced.]
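To make the packed YUYV layout concrete, the sketch below expands one four-byte group into two pixels. It assumes the usual YUYV byte order of Y0, U, Y1, V, with the two pixels sharing a single U and V value; struct yuv_pixel and the function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

struct yuv_pixel {
    uint8_t y, u, v;
};

/* Expand one 4-byte YUYV group (Y0 U Y1 V) into two pixels sharing U and V. */
static void yuyv_unpack_pair(const uint8_t src[4], struct yuv_pixel out[2])
{
    out[0].y = src[0];
    out[1].y = src[2];
    out[0].u = out[1].u = src[1];
    out[0].v = out[1].v = src[3];
}

/* Luminance of pixel x on a YUYV line; every pixel has its own Y byte. */
static uint8_t yuyv_luma_at(const uint8_t *line, size_t x)
{
    return line[2 * x];
}
```

UYVY holds the same information with the chrominance bytes first, so a matching helper would only need different indexes.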
There are several planar YUV formats in use as well. Drawing them all out does not help much, so we'll go with one example. The commonly-used "YUV 4:2:2" format (V4L2_PIX_FMT_YUV422P, fourcc 422P) uses three separate arrays. A 4x4 image would be represented like this:
Y plane:

Y0 | Y1 | Y2 | Y3
Y4 | Y5 | Y6 | Y7
Y8 | Y9 | Y10 | Y11
Y12 | Y13 | Y14 | Y15

U plane:

U0 | U1
U2 | U3
U4 | U5
U6 | U7

V plane:

V0 | V1
V2 | V3
V4 | V5
V6 | V7
As with the Bayer format, YUV 4:2:2 has one U and one V value for every other Y value; displaying the image requires interpolating across the missing values. The other planar YUV formats are:
A few other YUV formats exist, but they are rarely used; see this page for the full list.
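The three-array layout of planar YUV 4:2:2 described above comes down to some offset arithmetic, sketched here. The sketch assumes 8-bit samples, an even width, and planes stored back to back in Y, U, V order with no padding; real drivers may add per-line padding, and all names are invented for illustration.

```c
#include <stddef.h>

struct yuv422p_layout {
    size_t y_offset, u_offset, v_offset; /* start of each plane, in bytes */
    size_t chroma_width;                 /* U (or V) samples per row */
    size_t total_size;                   /* bytes in the whole image */
};

/* Describe a tightly packed planar 4:2:2 image (assumed: no line padding). */
static struct yuv422p_layout yuv422p_describe(size_t width, size_t height)
{
    struct yuv422p_layout l;

    l.chroma_width = width / 2;          /* one U and V per two Y values */
    l.y_offset = 0;
    l.u_offset = width * height;
    l.v_offset = l.u_offset + l.chroma_width * height;
    l.total_size = l.v_offset + l.chroma_width * height;
    return l;
}

/* Byte index of the U sample that covers pixel (x, y). */
static size_t yuv422p_u_index(const struct yuv422p_layout *l,
                              size_t x, size_t y)
{
    return l->u_offset + y * l->chroma_width + x / 2;
}
```

For the 4x4 example, this gives a 16-byte Y plane followed by 8-byte U and V planes, for 32 bytes in all, or two bytes per pixel on average.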
A couple of formats which might be useful for some drivers are:
There are a number of other, miscellaneous formats, some of them proprietary; this page has a list of them.
Now that we have an understanding of color formats, we can take a look at how the V4L2 API describes image formats in general. The key structure here is struct v4l2_pix_format (defined in <linux/videodev2.h>), which contains these fields:
All together, these parameters describe a buffer of video data in a reasonably complete manner. An application can fill out a v4l2_pix_format structure asking for just about any sort of format that a user-space developer can imagine. On the driver side, however, things have to be restrained to the formats the hardware can work with. So every V4L2 application must go through a negotiation process with the driver in an attempt to arrive at an image format that is both supported by the hardware and adequate for the application's needs. The next installment in this series will describe how this negotiation works from the device driver's point of view.
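As a rough sketch of how such a description hangs together for a simple packed format, the code below fills in a local stand-in structure. struct example_pix_format is hypothetical: it borrows a handful of field names (width, height, pixelformat, bytesperline, sizeimage) from struct v4l2_pix_format but is not the real definition, and the helper assumes lines are stored with no padding, which a real driver is free to change.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in with a few field names borrowed from
 * struct v4l2_pix_format; real code would include <linux/videodev2.h>.
 */
struct example_pix_format {
    uint32_t width;         /* image width in pixels */
    uint32_t height;        /* image height in pixels */
    uint32_t pixelformat;   /* fourcc code for the format */
    uint32_t bytesperline;  /* bytes between the starts of two lines */
    uint32_t sizeimage;     /* bytes needed for the whole image */
};

/* Fill in a request for a packed format, assuming no per-line padding. */
static void example_fill_packed(struct example_pix_format *fmt,
                                uint32_t width, uint32_t height,
                                uint32_t fourcc, uint32_t bits_per_pixel)
{
    fmt->width = width;
    fmt->height = height;
    fmt->pixelformat = fourcc;
    /* Round up so formats that are not byte-aligned still fit. */
    fmt->bytesperline = (width * bits_per_pixel + 7) / 8;
    fmt->sizeimage = fmt->bytesperline * height;
}
```

A 640x480 image at 16 bits per pixel, for instance, works out to 1280 bytes per line and 614400 bytes in total. In the real negotiation the driver may adjust any of these values to match what the hardware can do.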
Video4Linux2 part 5a: colors and formats
Posted Jan 25, 2007 4:34 UTC (Thu) by jwb (guest, #15467) [Link]
This excellent article reminds me of another excellent article, The Pixel Rosetta Stone: Packings and Colorspaces which is part of a much larger work, The Lurker's Guide to Video, which has been around for eons but remains somewhat obscure.
RGB332 doesn't look right
Posted Jan 27, 2007 0:29 UTC (Sat) by tbird20d (subscriber, #1901) [Link]
In your diagram for the RGB332 format, there should be two colors with 3 bits of information, and there are only 2 (for a total of 7 used bits instead of 8 bits). I think this is an error.
RGB332 doesn't look right
Posted Jan 27, 2007 0:43 UTC (Sat) by corbet (editor, #1) [Link]
A red bit got away from me - they can be awfully slippery sometimes. I found another one and stuck it in.
RGB332 doesn't look right
Posted Jan 30, 2007 21:22 UTC (Tue) by roelofs (guest, #2599) [Link]
A red bit got away from me - they can be awfully slippery sometimes.
You lost another one: RGB0. (Damned bits... This sort of thing never used to happen when the MCP was in control.)
Greg
RGB332 doesn't look right
Posted Jan 30, 2007 21:38 UTC (Tue) by corbet (editor, #1) [Link]
Hmph. That was a color problem from before I figured out which colors I really wanted to use. The bit wasn't gone, just hiding. Fixed.
RGB332 doesn't look right
Posted Jan 30, 2007 22:14 UTC (Tue) by roelofs (guest, #2599) [Link]
That was a color problem from before I figured out which colors I really wanted to use.
Everybody's a critic, I know, but... I think it might work better to stick with a single shade for each channel and use a bullet or mid-dot or something to mark the most significant bit. (Are these made out of mini-tables? That was my assumption, but I haven't looked at the source yet.) I find the pale blue and pale green to be much more similar to each other than to the "regular" blue and green, to the point that the very first case (light blue, dark blue, light green, ...) looks more like a single bit on a pale background than a two-bit grouping followed by a three-bit one.
To put it another way, the grouping of same-color bits is more important information than which individual bit is most significant, but the extreme paleness of the MSBs makes them the most visually striking part of the diagrams. (IMHO. :-) )
Thanks,
Greg
color spaces, whee!
Posted Jan 30, 2007 22:03 UTC (Tue) by roelofs (guest, #2599) [Link]
This colorspace also covers the set of YUV and YCbCr representations. This representation derives from the need for early color television signals to be displayable on monochrome TV sets.
Well, that's one (techno-historical) way to look at it... But perhaps more accurately, it derives from the physiology of the human eye, which basically has YUV- or YCbCr-like sensors in it. The fact that YUV/YCbCr better lend themselves to certain compression tricks (even lossless ones, which is still sort of amazing to me) plays a role, too.
Conversion between YUV and RGB is not entirely straightforward, however; there are several formulas to choose from.
Heck, conversion between RGB and RGB isn't entirely straightforward, either. Aside from "gamma" (~brightness/contrast), which most of us are vaguely aware of, there are a whole host of other, interlinked concepts involved: chromaticity, white point, color temperature, gamut, surround (ambient background lighting), rendering intent, ... In addition, there are separate exponential transfer functions (~"gamma") for cameras, monitors, display cards (LUTs!), etc.; you need to know details about both the source and the destination devices, and often there are files in between. It's a complicated mess, and it's really easy to get some (or all) of it wrong. :-/
Greg
__u32 pixelformat:
Posted Jan 31, 2007 4:06 UTC (Wed) by roelofs (guest, #2599) [Link]
the fourcc code describing the image format
...stored in little-endian format (not native!), in case anyone else was wondering.
Greg
Video4Linux2 part 5a: colors and formats
Posted Sep 7, 2007 11:04 UTC (Fri) by miku (subscriber, #35152) [Link]
Various links to v4l2 spec seems to be broken.
Video4Linux2 part 5a: colors and formats
Posted Nov 2, 2007 21:15 UTC (Fri) by jimparis (subscriber, #38647) [Link]
I believe the BA81 pattern is incorrect (the greens should be staggered, in a checkerboard pattern).
Copyright © 2007, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds