Figure 1: Generating an MLI: P1, P2, and P3 are pixels of the
image; A, B, and C are structure numbers between 0 and 127; A'
= A + 128 (high bit set), and I(P1, A) is the intensity of
structure A at the point where the ray passing through P1
intersects it. The data stream representing the image in the figure
is:
To generate the representation, we first choose a width w and height
h for the target image, and select a viewing direction and lighting
for the models. We color each of the
structures in the
model white and render it separately, recording the image and the
Z-buffer for each structure's rendering. The information contained in
these 2n images is sufficient for producing a target image with
arbitrary color and transparency values for each structure, but this
representation is as large as 2n uncompressed
images,
and contains a large percentage of extraneous information. Fortunately
it is relatively easy to extract only the necessary data and arrange
it in an order that leads to very simple and fast decoding.
Let
be an
array of pixel intensity
values from the rendering of structure i, and let
be an
array of Z-buffer values from the rendering of
structure i. For simplicity, we assume that the depths in the
Z-buffer are normalized so that 0 represents the front clipping
plane, some b > 0 represents the back clipping plane, and any value
greater than b represents a background pixel.
For each pixel (x, y), we sort the values
from front to
back, giving a permutation
such that
We then define the following n arrays of size
:
These arrays can be interpreted as follows: each pixel in
is labeled with the structure number that appears in
that pixel if all structures are fully opaque, or zero for
background. For i > 1, each pixel in
is labeled
with the structure number that appears in that pixel if the structures
that occupy the same pixel in
are all turned off.
Finally, let
to be the number of overlapping structures at
pixel (x, y), so
equals the smallest i for which
, minus one. This allows us to define, for
each pixel, the sequence
The sequence
tells us how to render pixel (x, y) given
an assignment of colors and transparencies to each of the n
structures. Each successive pair of elements represents a
(structure number, intensity value) tuple ordered from front to
back, so to compute the pixel, we need only step along the sequence,
blending the appropriate intensity of each structure's color in
proportions dictated by its transparency.
In order to store the variable-length
sequences sequentially
without having to store the value
for each pixel, we tag the
first element of each sequence and insert placeholders when
as follows: let
equal zero with the high bit set if
, and let
equal
with the high bit of
the first element set, otherwise. This use of the high bit explains
the requirement that
. Our final representation R
consists of the following sequence:
Figure 1 illustrates the derivation of an MLI representation from 3D models.
The data stream R is a compact representation of the original intensity and depth images; the length of R in bytes is:
Equivalently, letting g equal the number of background pixels
(i.e. pixels (x, y) for which
), and a equal the
mean of
over all non-background pixels, we have
. In other words, the length of the
representation is no more than 2a times greater than the length of
an uncompressed
image, and in our experience, a tends
to be relatively small compared to n (it was never greater than
n/8 in our sample application). This is a significant improvement
over the naive strategy of sending all 2n
and
arrays, and we will see in the next section that the
ordering of data in R is conducive to rapid image construction since
the front to back ordering is precomputed. Section 2.3
describes a simple compression scheme that further reduces the size
required for the representation of MLIs.