Below is a very well written and helpful article by Brian Florian of Secrets of Home Theater and Hi Fidelity. The site has many useful articles and product reviews for high-end home theater and audiophile sound equipment. Please enjoy the article.
An
Explanation of Film-to-Video Frame Rate Conversion for NTSC
To
better understand the upcoming concepts, one must be armed with some
basic knowledge of how film gets transferred to video, as well as the
nature of interlaced versus progressive display. As such, the
following information is not intended to be a definitive paper on the
subject, but should serve as a good introduction for all.
The
visuals and animations presented here, though large in file size, are
key and will reward repeat viewing.
Motion
pictures are not made of motion at all, but of numerous still images shown
in rapid succession. For the films we all watch at the theater, 24
frames are shown in one second (24 frames per second, or 24fps). The
NTSC television system differs from film in this regard, making it
complicated to show film on video.
Televisions
create their image by drawing (scanning) lines of light on the CRT
face, left to right, top to bottom, to produce a picture over the
entire screen. The resultant images that make up the motion picture
are comprised of two interlaced fields: that is, the first field
consists of all the odd lines (1 through 525), and the second field
consists of all the even lines (2 through 524). The result is that
only half of the video's display is drawn every 60th of a second. A
simulation of this is shown on the left. Field 1 is scanned, and then
Field 2 is scanned. Traditional talk quotes NTSC television as having
30 frames per second (as opposed to film's 24), each being comprised
of two interlaced fields. This is actually misleading: The NTSC
interlaced system shows 60 unique images per second, but each one uses
only half of the vertical resolution available on the display. Only if
the source material contained 30 unique frames per second could you say that two fields truly form a single frame; but in reality, video material such as the evening news is truly 60 unique fields per second. So we
don't want to think of interlaced televisions in terms of frames but
rather in terms of fields, interlaced fields, and 60 of them per
second.
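For readers who think in code, here is a small Python sketch (our own, purely illustrative) of how one full picture breaks apart into its odd-line and even-line fields:

```python
# A minimal sketch: split one full picture into its two interlaced fields.
# The "picture" is just a list of numbered scan lines; NTSC has 525 in total.

frame = [f"line {n}" for n in range(1, 526)]   # lines 1 through 525

field_1 = frame[0::2]   # odd lines: 1, 3, 5, ... 525
field_2 = frame[1::2]   # even lines: 2, 4, 6, ... 524

print(len(field_1), len(field_2))   # 263 262
# An interlaced display draws one of these fields every 1/60 of a second,
# alternating between them, so only half the lines are refreshed at a time.
```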
The
principal drawbacks of an interlaced display are (A) visible line
structure, (B) flicker caused by the rapid alternating of the fields,
and most important, (C) artifacts such as 'feathering' (also
referred to as 'combing') and 'line twitter'. Visual artifacts like
these last two occur anytime the subject or the camera is in a
different position from field to field. The subject will be in
one position for one field, and in another position for the next,
resulting in jagged edges (feathering) or shimmering horizontal lines
(twitter).
The
animation on the right shows an example of an interlaced display
trying to show a tomato moving from left to right. Each field shows
the tomato a little farther to the right than the previous. Because
the fields are interlaced, jagged vertical edges can't help but exist,
except for the last two fields (5 and 6), where the tomato is
stationary. The further back you are from an interlaced display (or
the smaller the display is), the less this and other artifacts are
noticed. If you want to see the effect in real life, just stick your
nose up to an interlaced TV. Focus on the edge of a stationary object and wait for it to move. You will notice the effect right away.
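To make the feathering concrete in code, the sketch below (with a crude text-art "tomato" of our own invention) weaves together two fields captured 1/60 of a second apart. Because the object moved between the two moments, the odd and even lines no longer line up:

```python
# Feathering / combing in miniature: two fields captured 1/60 s apart,
# with the "tomato" (the XX block) a little farther right in the second.

field_1 = ["..XX....", "..XX....", "..XX...."]   # odd lines, time t
field_2 = ["....XX..", "....XX..", "....XX.."]   # even lines, time t + 1/60 s

# Weave them back into one displayed frame, alternating odd/even lines.
woven = []
for odd_line, even_line in zip(field_1, field_2):
    woven.append(odd_line)
    woven.append(even_line)

for line in woven:
    print(line)
# ..XX....
# ....XX..      <- the comb-like, jagged edge appears wherever the
# ..XX....         two fields disagree about where the object is
# ....XX..
# ..XX....
# ....XX..
```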
At
left is an interlaced image of a skier. Not only is the flicker
annoying, but have a good look at the ski-pole: It comes and goes
because it's so fine that it can only be found in one of the two interlaced fields. This is line twitter. The artifact manifests itself when fine detail is less than two scan lines high, and it is exacerbated during vertical movement as the fields alternate. Often, fine detail is
filtered before being encoded to minimize these artifacts when played
back at home on your interlaced display device. Because of this, we
have yet to experience the full potential of DVD.
The
preceding basic knowledge of interlacing is necessary to understand
the transfer of film to video, because it is an important factor in
what we end up seeing.
Motion
picture photography is based on 24 frames per second. Time to call to
mind all that math you learned in school and realize that 24 doesn't
go into 60 very easily. To boil it down a little, our challenge is to
make 4 frames from the film fit as evenly as possible across 10 video
fields. We can't just double up the fields on every fourth film frame
or we'd get a real 'stuttered' look. Instead, a process known as 3-2 pulldown is used to create 10 video fields from 4 film frames. This form of telecine alternates between creating 3 fields from one film frame and 2 fields from the next. Hence the name 3-2.
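If it helps, the 3-2 cadence is easy to generate; here is a small Python sketch (the field labels are our own shorthand, not an encoder's actual output) turning four film frames into ten video fields:

```python
# 3-2 pulldown cadence: four film frames (A, B, C, D) -> ten video fields.
# Each film frame contributes alternately 3 fields and 2 fields.

film_frames = ["A", "B", "C", "D"]

fields = []
for i, frame in enumerate(film_frames):
    repeats = 3 if i % 2 == 0 else 2        # 3, 2, 3, 2, ...
    for _ in range(repeats):
        # Alternate top/bottom labels purely for illustration.
        parity = "top" if len(fields) % 2 == 0 else "bottom"
        fields.append(f"{frame}-{parity}")

print(len(fields))   # 10 fields from 4 film frames
print(fields)
# ['A-top', 'A-bottom', 'A-top', 'B-bottom', 'B-top',
#  'C-bottom', 'C-top', 'C-bottom', 'D-top', 'D-bottom']
```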
Consider
now our flow chart of the 3-2 pulldown performed on four frames of
this movie scene:
Pretty
cool, right? It is and it isn't. 3-2 pulldown inherits many of the artifacts we described when talking about interlaced video. Anytime a field follows one made from a different film frame (noted above by the "!" icon), there exists the possibility of anomalies in what we see, feathering and twittering being great examples.
Absolutely any difference between the two film frames that make up such a video frame (the last field of one frame and the first field of the next), be it in brightness, color, or especially motion, is going to result in some artifact as the two fields merge on screen.
Even our little animated synthesis of the final interlaced product,
which actually contains 10 interlaced pieces, shows evidence of such
anomalies as the flying police cars move ahead. Such is life.
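Carrying on with the same shorthand from the earlier sketch, it is simple to flag exactly where those trouble spots fall: every displayed video frame whose two fields come from different film frames gets a "!".

```python
# Pair the ten 3-2 pulldown fields into five interlaced video frames and
# flag the ones whose two fields come from different film frames.

fields = ["A-top", "A-bottom", "A-top", "B-bottom", "B-top",
          "C-bottom", "C-top", "C-bottom", "D-top", "D-bottom"]

for n in range(0, len(fields), 2):
    first, second = fields[n], fields[n + 1]
    mixed = first.split("-")[0] != second.split("-")[0]
    marker = "!" if mixed else " "
    print(f"video frame {n // 2 + 1}: {first:9} + {second:9} {marker}")

# Frames 2 and 3 (A-top + B-bottom, B-top + C-bottom) mix two film frames;
# those are the spots where feathering and twitter can show up.
```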
As
long as you are watching your movies on an ordinary interlaced
display, there is not much more to tell you. What you see at home is
pretty much what we've shown as the interlaced content in the above
illustration. But should you have the fortune to be using a
progressive display TV, the following comes into play.
Progressive
displays, such as high-performance CRT/LCD/DLP/D-iLA projectors and
the new HDTV-ready TVs, can show progressive scanned images as opposed
to interlaced. In order to do this, the display must scan at a higher
rate, 2x the speed of NTSC. Because we are scanning at twice the
speed, we can draw an entire frame in the same amount of time it takes
an interlaced system to draw a single field. We learned above that an
interlaced display shows 60 fields per second. But with
progressive, each "field" is now a complete picture
including all scan lines, top to bottom, so we will now call it a
frame, and we are showing 60 of those per second. (Of course, only 24 of those are unique if the source is film-based.) The benefits of a progressive display are no flicker, much less visible scan-line structure (permitting closer seating to the display), and none of the artifacts we described for the interlaced display, as long as the source material is progressive in nature (film or a progressive video camera).
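As a quick sanity check on that "twice the speed" claim, the arithmetic below uses simple round numbers (real NTSC timing includes blanking intervals and the 59.94 Hz detail noted at the end of this article):

```python
# Rough line-rate arithmetic: interlaced vs. progressive scanning.

lines_per_picture = 525
fields_per_second = 60      # interlaced: 60 half-pictures per second
frames_per_second = 60      # progressive: 60 full pictures per second

interlaced_lines_per_sec  = (lines_per_picture / 2) * fields_per_second
progressive_lines_per_sec = lines_per_picture * frames_per_second

print(interlaced_lines_per_sec)    # 15750.0 lines per second
print(progressive_lines_per_sec)   # 31500.0 lines per second -- twice the rate
```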
But
sources which are truly progressive in nature are hard to come by
right now. Movies on DVD are almost always decoded as
interlaced fields, yet all of the film's original frames are there, just broken up. What we're going to talk about next is how we take the interlaced content of DVD and recreate the full film frames so we can display them progressively. The term commonly used for restoring the progressive image is deinterlacing, though we think it is more correct
to call it re-interleaving, which is a subset of deinterlacing.
Deinterlacing
(or re-interleaving) involves assembling pairs of interlaced fields
into one progressive frame (1/60 of a second long), and showing it at
least twice to use up the same amount of time as two fields. The need
for 60 flashes on the screen each second stems from a biological
property called the Flicker Fusion Frequency: the number of flashes we need to see each second so that our brains fuse them into one continuous image with no visible flicker.
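Here is the happy case in miniature: a Python sketch (our own toy frame of 480 lines, roughly DVD's visible picture) that splits a film frame into its two fields and weaves them straight back together. Because both fields come from the same instant on film, the reconstruction is exact:

```python
# Re-interleaving at its simplest: weave the two fields of one film frame
# back together and confirm the original frame is recovered exactly.

def split_into_fields(frame):
    return frame[0::2], frame[1::2]            # (odd lines, even lines)

def weave(odd_field, even_field):
    frame = []
    for odd_line, even_line in zip(odd_field, even_field):
        frame.append(odd_line)
        frame.append(even_line)
    return frame

original = [f"line {n}" for n in range(1, 481)]    # 480 visible lines
odd, even = split_into_fields(original)

assert weave(odd, even) == original
# Nothing moved between the two fields, so there is no feathering at all --
# which is exactly why film material can be deinterlaced so cleanly.
```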
For
every film frame that had three fields made from it, the third field
is a duplicate of the first, and (if the MPEG-2 encoder is behaving
properly) won't even be stored on the DVD. Instead of encoding the
duplicate fields, the DVD flags repeat_first_field and top_field_first
are used to instruct the MPEG decoder where to place these duplicate
fields during playback.
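The sketch below is a simplified model, not actual decoder code, of how those two flags could drive the cadence at playback; the flag values are illustrative, not read from any particular disc:

```python
# A simplified model of 3-2 pulldown driven by MPEG-2's top_field_first and
# repeat_first_field flags.  Only two fields per frame are stored on the
# disc; the third field of the "3" frames is simply the first field shown
# one more time.

coded_frames = [
    {"name": "A", "top_field_first": True,  "repeat_first_field": True},
    {"name": "B", "top_field_first": False, "repeat_first_field": False},
    {"name": "C", "top_field_first": False, "repeat_first_field": True},
    {"name": "D", "top_field_first": True,  "repeat_first_field": False},
]

output_fields = []
for frame in coded_frames:
    first  = "top" if frame["top_field_first"] else "bottom"
    second = "bottom" if frame["top_field_first"] else "top"
    output_fields += [frame["name"] + "-" + first, frame["name"] + "-" + second]
    if frame["repeat_first_field"]:
        output_fields.append(frame["name"] + "-" + first)   # the duplicate field

print(output_fields)
# ['A-top', 'A-bottom', 'A-top', 'B-bottom', 'B-top',
#  'C-bottom', 'C-top', 'C-bottom', 'D-top', 'D-bottom']
```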
The
progressive output of a DVD player should assemble 2 fields from each
film frame and create a complete progressive one that looks just like
the original film frame. You should now be thinking that the DVD will
once again have 24 frames to show in one second. But the progressive
display is still expecting 60 complete frames per second. In order to
space them out, the DVD player shows the complete frames in this
order: 1, 1, 1, 2, 2, 3, 3, 3, 4, 4 and so on.
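That cadence is easy to write down programmatically; a small sketch:

```python
# The 3-2 cadence for a 60 frame-per-second progressive output:
# each film frame is shown alternately 3 times and 2 times.

def progressive_cadence(film_frames):
    shown = []
    for i, frame in enumerate(film_frames):
        shown += [frame] * (3 if i % 2 == 0 else 2)
    return shown

print(progressive_cadence([1, 2, 3, 4]))
# [1, 1, 1, 2, 2, 3, 3, 3, 4, 4]  -- 10 displayed frames per 4 film frames,
# so 24 film frames are spread across 60 displayed frames each second.
```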
This
form of display gives us a moving image very close to the original
film. It has a tendency to "judder" a bit though, as every
other film frame lasts 1/60 of a second longer than the previous one.
Even our little synthesis of the final product, which actually
contains 10 pieces, shows this judder. In the future, both the player
and the display could increase their display rate from 60 frames per second to 72 per second. At that point, each frame would only last 1/72 of a second, permitting the player to show every film frame three times (24 x 3 = 72), eliminating the motion judder, and also helping us with the Flicker Fusion Frequency problem (60 flashes per second are just barely enough in a well-lit viewing environment). This would look like: 1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4 and so on. 72 fps will only work with film-based sources though, as it is a multiple of 24. It will not work well with video sources, which are 60 fields per second.
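To see the judder (and its cure) in numbers, compare how long each film frame stays on screen under the two cadences (a rough sketch using the patterns just described):

```python
# On-screen time of each film frame under the 60 Hz and 72 Hz cadences.

def frame_durations(repeats_pattern, refresh_hz):
    return [round(r / refresh_hz, 4) for r in repeats_pattern]

# 60 Hz with 3-2 pulldown: frames alternate between 3 and 2 refreshes.
print(frame_durations([3, 2, 3, 2], 60))
# [0.05, 0.0333, 0.05, 0.0333]  -> uneven durations, hence the slight judder

# 72 Hz: every film frame gets exactly 3 refreshes (24 x 3 = 72).
print(frame_durations([3, 3, 3, 3], 72))
# [0.0417, 0.0417, 0.0417, 0.0417]  -> perfectly even, no judder
```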
The
re-interleaving process we've just covered is specific to 24fps film
material which is MPEG-2 decoded (as interlaced fields). It's really a
matter of putting the right fields together so it's fairly simple.
Deinterlacing native NTSC interlaced video material is much more
complicated. In such video material, each field is a unique image in
time, and deinterlacing it at an acceptable level requires motion-adaptive or motion-compensated algorithms to overcome the inherent problems of the interlaced
material. There is no best method, and the two mentioned are expensive
to implement.
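To give a flavor of what "motion-adaptive" means, here is a deliberately toy per-pixel decision (entirely our own simplification; real motion-adaptive and motion-compensated deinterlacers are far more elaborate): weave where the picture is static, interpolate where motion is detected.

```python
# A toy motion-adaptive rule for filling in one missing scan line of a field.

MOTION_THRESHOLD = 10   # arbitrary luma-difference threshold

def deinterlace_missing_line(line_from_other_field,  # same line, adjacent field
                             same_line_previous,     # that line, one frame earlier
                             same_line_next,         # that line, one frame later
                             line_above, line_below):  # neighbours in this field
    out = []
    for i in range(len(line_from_other_field)):
        motion = abs(same_line_previous[i] - same_line_next[i])
        if motion < MOTION_THRESHOLD:
            out.append(line_from_other_field[i])               # weave (static)
        else:
            out.append((line_above[i] + line_below[i]) // 2)   # interpolate (moving)
    return out

print(deinterlace_missing_line([50, 50, 50], [10, 10, 200], [10, 10, 90],
                               [40, 40, 80], [60, 60, 100]))
# [50, 50, 90]: the first two pixels are static and get woven; the third
# shows motion, so it is interpolated from the lines above and below.
```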
(Note:
NTSC does not really run at 60 Hz; it is technically 59.94 Hz. The
industry rounds it up to make it easier to read. If you did play back video at 60 Hz instead of 59.94 Hz, the timing would drift by a full field roughly every 17 seconds, about one frame every 33 seconds.)
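For the curious, the arithmetic behind that note (the exact NTSC rate is 60/1.001 Hz):

```python
# How quickly 60 Hz playback drifts against NTSC's true ~59.94 Hz field rate.

true_field_rate  = 60 / 1.001              # ~59.9401 fields per second
drift_per_second = 60 - true_field_rate    # ~0.0599 fields per second

seconds_per_field_of_drift = 1 / drift_per_second
print(round(seconds_per_field_of_drift, 1))      # ~16.7 s to drift by one field
print(round(2 * seconds_per_field_of_drift, 1))  # ~33.4 s to drift by a full frame
```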