THE MATHEMATICS BEHIND PERSPECTIVE

 

                  THE MATHEMATICS BEHIND PERSPECTIVE
                         by Max Pandaemonium


The illustration of perspective on a two-dimensional screen from a set 
of three-dimensional images can be accomplished by mapping a three- 
dimensional point onto a two-dimensional plane.  Taking it one point 
at a time, we find the point of intersection where the line-of-sight 
from the observer to this given point intersects a plane (which 
represents the screen of a computer, for instance).

I will use vector mathematics in the derivation that follows.  If you 
are not familiar with vector mathematics, you're going to have some 
trouble following along; but I will explain what I am doing as I go 
along, if not why it works mathematically.

Hereafter I will use this notation:  Figures, such as points and 
planes and lines, will be referenced by capital letters; values, such 
as vectors or real numbers, will be referenced by lowercase letters.  
If there is a point A, then the vector a (if not previously defined as 
something else) indicates the location of the point A from the origin.  
That is, if A is (a, b, c), then a = <a, b, c>.

Here is how I will proceed with the derivation.  First we will define 
the reference plane (screen) C.  Then we will find the line of sight, 
L, between the point in question and the observer.  Then we will find 
the point of intersection of the line and the plane, and will 
translate that into a two-dimensional location on the plane (i.e., the 
screen).

Let the point P be the location of the point we wish to map to the 
plane.  Therefore, p is the vector from the origin to the point P.  
Similarly we will define point V to be the location of the observer.  
Also, let a unit vector (i.e., having a length of 1) be the facing 
vector, r.  Thus r represents the direction in which the observer is 
looking, and therefore does not have to be back toward the origin; it 
can be in any direction.

We'll define the plane C to be located at the head of the facing 
vector r.  In other words, the primarily point in C, which we will 
call S (which represents the center of the screen) is given by

                              s = v + r
                                                                   (i)

Since we want this plane to be perpendicular to the facing
vector, r, we can declare that r is a normal vector for C.  Thus an 
equation of C (where u is a general vector) can be:

                      C:  r dot (u - v - r) = 0.
                                                                  (ii)

(This equation implies that r and u - v - r are always perpendicular.)

Now that C is defined, we will define the line of sight, L.  This is 
very simple, since a direction vector for this line might be p - v.

Here lies the main part of the derivation.  We must now find the 
point, which we shall call T, where the line L and the plane C 
intersect.  The vector t - v, logically, should be a scalar multiple 
of p - v.  Thus we shall say

                           t - v = k(p - v)
                                                                 (iii)

where k is some real number.

The pieces of the puzzle are almost ready to be fit together.  One 
clear objective is to find k, the scalar multiple that corresponds to 
finding T.  In doing this we will define a vector x, which will have 
its tail at S, the center of the screen, and its head at T, the mapped 
point.  This is useful because, when we want to transfer the three-
dimensional point T to a two-dimensional map on the screen, it will be 
useful to have the vector set up this way.  (Just as we always have 
vectors with their tails at the origins in Cartesian space, we would 
like a vector in this new two-dimensional map to have its tail at the 
"origin," or center of the screen.)  Thus, x is given by

                            x = t - v - r.
                                                                  (iv)

Substituting equation (iii) into (iv), we get

                           x = k(p - v) - r.
                                                                   (v)

Now we must have an equation to find k.  We know that t - v will lie 
on the plane if, according to our equation (i), the two vectors from S 
are perpendicular.  Equivalently,

                             r dot x = 0.
                                                                  (vi)

Substuting equation (v) into (vi),

                       r dot [k(p - v) - r] = 0.
                                                                 (vii)

Now we must substitute identifiers for these vectors to solve for k.  
We shall make the following definitions:

                            p = <px, py, pz>;
                            r = <rx, ry, rz>;
                            v = <vx, vy, vz>.
                                                                (viii)

Thus equation (vii) becomes

<rx, ry, rz> dot <k(px - vx) - rx, k(py - vy) - ry, k(pz - vz) - rz>
        = 0
                                                                  (ix)

Multiplying through the dot product,

 rx[k(px - vx) - rx] + ry[k(py - vy) - ry] + rz[k(pz - vz) - rz] = 0

k rx (px - vx) - rx^2 + k ry (py - vy) - ry^2 + k rz (pz - vz) - rz^2
        = 0

   k[rx(px - vx) + ry(py - vy) + rz(pz - vz)] = rx^2 + ry^2 + rz^2

                           rx^2 + ry^2 = rz^2
             k = ---------------------------------------.
                 rx(px - vx) + ry(py - vy) + rz(pz - vz)
                                                                   (1)

It is useful to consider some things about k at the present time.  
Ideally, k should be a positive number.  Consider:  If k = 0, then t - 
v is of zero length, and therefore P and V are coincident.  If k < 0, 
that means that P is not on the other side of C that V is; in simple 
terms, P is "behind" V.  If k is undefined (the denominator is zero), 
then there is no k that will define t - v -- in other words, line L is 
parallel to C, and therefore there will be no mapping.  In either of 
cases, the calculation can stop here, since the point will not be 
visible.

Now we still have the matter of transferring this three-dimensional 
space coordinate into a two-dimensional screen coordinate.  To do 
this, first we will find the value of x.  From equation (v), 
substituting the definitions in (viii), we get

                           x = k(p - v) - r

          x = k(<px, py, pz> - <vx, vy, vz>) - <rx, ry, rz>

       x = <k(px - vx) - rx, k(py - vy) - ry, k(pz - vz) - rz>.
                                                                   (x)

Now we must define two unit vectors, u and u', which represent the x'- 
and y'-axes of the new two-dimensional system.  Note that u and u' 
have their tails at S, and both u and u' must be contained in C.  In 
other words,

                       r dot u = r dot u' = 0.
                                                                  (xi)

Since we want to find these x' and y' values in the new two- 
dimensional system, we will find the scalar projection of vector x on 
both u and u' in turn.  This is a simple matter:

                            x' = u dot x;
                            y' = u' dot x.
                                                                 (xii)

If we take the value definitions

                          u = <ux, uy, uz>;
                          u' = <u'x, u'y, u'z>
                                                                (xiii)

then we can combine equations (x), (xii), and (xiii) to get

                            x' = u dot x;
                            y' = u' dot x

           x' = <ux, uy, uz> dot
                  <k(px - vx) - rx, k(py - vy) - ry, k(pz - vz) - rz>;
           y' = <u'x, u'y, u'z> dot
                  <k(px - vx) - rx, k(py - vy) - ry, k(pz - vz) - rz>

x' = ux[k(px - vx) - rx) + uy[k(py - vy) - ry] + uz[k(pz - vz) - rz];
y' = u'x[k(px - vx) - rx) + u'y[k(py - vy) - ry] +
        u'z[k(pz - vz) - rz]
                                                                   (2)

where k is calculated according to equation (1).  QED.


Note that proof above is quite transparent as to the actual values of 
u and u'.  Naturally they should be perpendicular, since they represent
x'- and y'-axes, but what should we set them to be?

A natural interpretation is to have the y'-axis, that is, the one
represented by the vector u', to be "up"; that is, to correspond to
the real z-axis.  A choice of u and u' for this might be

      u = <ry(rx^2 + ry^2 + rz^2), -rx(rx^2 + ry^2 + rz^2), 0>;
      u' = <rx rz, ry rz, -rx^2 - ry^2>.
                                                                   (3)

It should be noted quickly that this will not work if r and k are
parallel.  If this should be so (that is, rx and ry are simultaneously
zero), then this alternate form can be used:

                           u = <-rz, 0, 0>;
                           v = <0, -1, 0>.
                                                                  (3a)

                                 -)(- 

Comments

Popular posts from this blog

BOTTOM LIVE script

Fawlty Towers script for "A Touch of Class"