Figure 1: Generating an MLI: P1, P2, and P3 are pixels of the image; A, B, and C are structure numbers between 0 and 127; A' = A + 128 (high bit set), and I(P1, A) is the intensity of structure A at the point where the ray passing through P1 intersects it. The data stream representing the image in the figure is:
To generate the representation, we first choose a width w and height h for the target image, and select a viewing direction and lighting for the models. We color each of the structures in the model white and render it separately, recording the image and the Z-buffer for each structure's rendering. The information contained in these 2n images is sufficient for producing a target image with arbitrary color and transparency values for each structure, but this representation is as large as 2n uncompressed images, and contains a large percentage of extraneous information. Fortunately it is relatively easy to extract only the necessary data and arrange it in an order that leads to very simple and fast decoding.
Let be an array of pixel intensity values from the rendering of structure i, and let be an array of Z-buffer values from the rendering of structure i. For simplicity, we assume that the depths in the Z-buffer are normalized so that 0 represents the front clipping plane, some b > 0 represents the back clipping plane, and any value greater than b represents a background pixel.
For each pixel (x, y), we sort the values from front to back, giving a permutation such that We then define the following n arrays of size :
These arrays can be interpreted as follows: each pixel in is labeled with the structure number that appears in that pixel if all structures are fully opaque, or zero for background. For i > 1, each pixel in is labeled with the structure number that appears in that pixel if the structures that occupy the same pixel in are all turned off.
Finally, let to be the number of overlapping structures at pixel (x, y), so equals the smallest i for which , minus one. This allows us to define, for each pixel, the sequence
The sequence tells us how to render pixel (x, y) given an assignment of colors and transparencies to each of the n structures. Each successive pair of elements represents a (structure number, intensity value) tuple ordered from front to back, so to compute the pixel, we need only step along the sequence, blending the appropriate intensity of each structure's color in proportions dictated by its transparency.
In order to store the variable-length sequences sequentially without having to store the value for each pixel, we tag the first element of each sequence and insert placeholders when as follows: let equal zero with the high bit set if , and let equal with the high bit of the first element set, otherwise. This use of the high bit explains the requirement that . Our final representation R consists of the following sequence:
Figure 1 illustrates the derivation of an MLI representation from 3D models.
The data stream R is a compact representation of the original intensity and depth images; the length of R in bytes is:
Equivalently, letting g equal the number of background pixels (i.e. pixels (x, y) for which ), and a equal the mean of over all non-background pixels, we have . In other words, the length of the representation is no more than 2a times greater than the length of an uncompressed image, and in our experience, a tends to be relatively small compared to n (it was never greater than n/8 in our sample application). This is a significant improvement over the naive strategy of sending all 2n and arrays, and we will see in the next section that the ordering of data in R is conducive to rapid image construction since the front to back ordering is precomputed. Section 2.3 describes a simple compression scheme that further reduces the size required for the representation of MLIs.