This application claims the benefit, under 35 U.S.C. § 365 of International Application PCT/US2004/005690, filed Feb. 26, 2004, which was published in accordance with PCT Article 21(2) on Nov. 18, 2004 in English and which claims the benefit of U.S. provisional patent application No. 60/467,798, filed May 2, 2003.
This invention relates to the reproduction of analog optically recorded soundtracks and in particular to the restoration of recorded signal quality in variable density recordings.
Optical recording remains the predominant method for creating an analog motion picture soundtrack. Such optical recording can make use of a variable area method whereby illumination from a calibrated light source passes through a shutter that is modulated with the audio signal. The shutter opens in proportion to the intensity or level of the audio signal and results in the illumination beam from the light source being modulated in width. This varying width illumination exposes a monochromatic photographic film which when processed results in a black audio waveform envelope surrounded at the waveform extremities by a substantially clear or colored film base material. In this way the width of the exposed and developed film represents the instantaneous audio signal amplitude.
A second method exists for recording analog motion picture soundtracks wherein the audio signal causes the total width of the photographic audio track to be variably exposed. With this method, termed “variable density”, the exposure of the complete track width varies in accordance with the amplitude of the audio signal to produce a track which varies in optical transmissivity between a substantially clear or colored base film material having relatively high light transmissivity and low transmission or high density areas of exposed and developed photographic material. Thus the instantaneous audio signal amplitude is represented by a variation in the transmission of illumination through the exposed and developed film track width. This recording method suffers from a poor signal-to-noise ratio and from signal amplitude distortion resulting from exposing the film into areas where the transfer characteristic exhibits non-linearity. In addition, inter-modulation distortion results as sections of the film track immediately adjacent to the intended exposure areas become affected by both light diffraction around the recording slit and scattering within the film emulsion.
Hence, with either variable density or variable area recording methods, the audio modulation (sound) can be recovered by suitably gathering the illumination transmitted through the soundtrack area, typically by means of a photo detector.
The aforementioned analog film sound recording techniques incur imperfections caused by physical damage and contamination during recording, printing and subsequent handling of the film. Since these recording techniques use photographic film, the amount of light used in recording (Density) and the exposure time (Exposure) constitute critical parameters. The correct density for recording can be determined by a series of tests to determine the highest and lowest possible densities that fall within the linear portions of the transfer characteristic of the film.
Film stock on which sound is recorded is generally sensitive only to blue illumination. Such film stock typically employs a gray anti-halation dye to substantially reduce or eliminate halation effects. Halation occurs as the result of reflections from the back of the film base causing a secondary, unwanted exposure of the emulsion. Typically, a variable area track has a gamma between 0.5 and 1.6.
The frequency response of the variable density recording method is determined by various parameters, for example, the width of the slit through which the modulated light passes, the exposure of the film, and the Modulation Transfer Function (MTF) of the film which is directly related to light diffusion. The higher the exposure time the lower the frequency bandwidth of the recording.
Optimum density occurs as a result of a compromise among the signal-to-noise ratio, the inter-modulation distortion and non-linear exposures. An optimum density can be determined by test exposures to find an acceptably low value for inter-modulation distortion resulting from image spreading.
In addition to non-linear densities and inter-modulation distortion, other imperfections can result. For example, the density of the exposed or unexposed areas can vary randomly or can vary in sections across or along the soundtrack area. During audio track playback, such density variations directly translate into spurious noise components interspersed with the wanted audio signal.
A further source of audio track degradation occurs as the result of mechanical imperfections variously imparted to the film and/or incurred during reproduction. One such deficiency causes the film, or tracks thereon, to weave or move laterally with respect to a fixed transducer. Film weave can result in various forms of imperfection such as amplitude and phase modulation of the reproduced audio signal.
As discussed, analog optical recording methods remain inherently susceptible to physical damage and contamination of the film during handling. For example, dirt or dust can introduce transient, random noise events. Similarly, scratches in either the exposed or unexposed areas of the film can alter the optical transmission properties of the soundtrack and cause severe transient noise spikes. Furthermore, other physical or mechanical causes, such as the film perforations, improper film path lacing or related film damage, can introduce unwanted cyclical repetitive effects into the soundtrack. These cyclical variations can introduce spurious illumination and give rise to a low frequency buzz, for example having an approximately 96 Hz rectangular pulse waveform, rich in harmonics and interspersed with the wanted audio signal. Similarly, picture area light leakage into the soundtrack area can also cause image related audio degradation.
Conventional analog soundtrack readers reproduce the changes in light transmitted through the film together with all its imperfections. Heretofore, such readers have not offered any correction of the variable density track anomalies and deficiencies discussed previously. European patent EP 1091573 teaches compensations for the effects of variations in density or shading due to errors in printing and noise generated by the CCD imager scanning the track. However, the patent fails to address the effects of inter-modulation distortion, and in addition teaches the use of 8-bit signal quantization, which yields an unacceptably low signal-to-noise ratio on the order of 49 dB.
German patent application DE 197 29 201 A1 discloses a telecine, which scans analog optically recorded soundtracks. The disclosed apparatus scans the sound information signal and applies two-dimensional filtering to the output values. German application DE 197 33 528 A1 describes a system for stereo sound signals. An evaluation circuit provides only the left or the right sound signal or the sum signal of both as a monophonic output signal.
Clearly, a need exists for an arrangement that allows reproduction and processing of analog optically-recorded soundtracks to not only substantially eliminate the noted deficiencies but to enhance the quality of the reproduced audio signal.
Briefly, in accordance with a first aspect of the present principles, an analog optically recorded variable density soundtrack is restored by use of digital signal processing. An advantageous arrangement employs a line array imager, typically a CCD imager, to scan and form an image of the variable density track for storage as a digital signal in a memory system, typically a hard disk or array of such hard disks. The imager output signal is quantized with at least 12-bit resolution to obtain an acceptable signal-to-noise ratio of approximately 74 dB in the resulting audio signal. An audio signal is extracted from the stored soundtrack image and undergoes statistical processing by use of one or more methods to eliminate deficiencies and restore the quality.
The statistical processing techniques can include one or more of the following:
In another aspect of the present principles, the analog variable density optical soundtrack undergoes scanning by a 2048 pixel line scan CCD imager. Light from a light source passes through the soundtrack area of the film, for imaging by, and to substantially fill the width of, the CCD imager. The varying density of the soundtrack recording results in a corresponding variation of light imaged by the CCD imager. The output signal from the CCD is quantized with a 12-bit resolution and stored in a storage system, typically in the form of a RAID array. The exposure time of the CCD imager is synchronized with bi-phase drive signals that control the film transport, thereby providing an exposure rate of about 30,000 scans per second, which yields a nominal bandwidth of 15 kHz in the resulting soundtrack signal.
To compensate for the effects of film grain or granularity, which result in unwanted signal amplitude variations or random noise, one or more statistical processing methods are used.
In accordance with another aspect of the present principles, an apparatus for the playback of an analog optically recorded soundtrack comprises a transport means for transporting a film having such a soundtrack. A scanning means generates an image signal of the analog optically recorded soundtrack only. An alignment means aligns the scanning means such that the image signal of the soundtrack substantially fills the width of the scanning means. A processor processes the image signal to form an audio output signal.
In accordance with yet another aspect of the present principles, there is provided a method for eliminating positional variations of an analog optically recorded soundtrack on a film. The method comprises the steps of (a) transporting the film which includes a soundtrack with an audio representative envelope that is subject to positional variation, (b) forming a digital image of the soundtrack with said audio representative envelope during transport, (c) aligning the digital image of said soundtrack with an audio representative envelope to assure the positional variation of said soundtrack on the film and peaks of the audio representative envelope remain within the digital image, and (d) processing the digital image to separate only the audio representative envelope and form therefrom an audio output signal.
Another aspect of the present principles facilitates azimuth alignment of a scanning means during soundtrack playback. The apparatus comprises film transport for transporting a film including an analog optically recorded soundtrack. A scanning means generates an image signal of only the soundtrack and is aligned such that an image signal of the soundtrack substantially fills a width of the scanning means. An azimuth aligning means positions the scanning means such that equal density values of the image of said soundtrack are displayed concurrently with substantially the same brightness.
In a conventional film sound reproducer, light from source 10 passes through the film 20 and the track 25 so as to emerge with an intensity varying in accordance with the method employed for exposing the film to record the soundtrack. A photocell or solid-state photo detector (not shown) gathers the varying-intensity light. The photo sensor usually generates a current or voltage in accordance with the intensity of the transmitted light. The analog audio output signal from the photo sensor undergoes amplification and processing to alter the frequency content to improve or mitigate deficiencies in the acoustic properties of the recorded track. However, such frequency response manipulation is generally incapable of remedying the deficiencies without adversely affecting the wanted audio content.
In the inventive arrangement shown in
The bellows extension tube and lens of the optical group 75 are accurately adjusted to image the standardized recorded track positions. However, manual adjustments are provided to permit focusing, exposure and image size adjustment or zoom control to allow the recorded film area to substantially fill the maximum sensor width with a small area of the soundtrack. The mounting system of the camera 100 also facilitates both lateral and azimuth adjustments. A lateral adjustment (L), as seen in
The selection of the lens and other components of the optical group 75 is determined largely by the audio optical track width and the width of the imager array. An optical track of a 35 mm film has a standardized width of 2.13 mm, and the length of the CCD imager 110 is about 20.48 mm based on 2048 pixels with a pixel size of 10 microns. Thus, enabling the maximum width of a soundtrack of a 35 mm film to fill the imager width requires an image magnification of about 10:1. Similarly, for a 16 mm film whose optical track has a width of 1.83 mm, filling the imager width requires the addition of a 56 mm extension tube or bellows.
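By way of illustration only, the magnification figures follow from dividing the sensor length by the track width. The sketch below assumes the 2048 pixel, 10 micron sensor described here; the function and constant names are hypothetical and are not part of the described apparatus.

```python
PIXEL_COUNT = 2048        # pixels in the line array
PIXEL_PITCH_MM = 0.010    # 10 micron pixel size


def required_magnification(track_width_mm: float) -> float:
    """Magnification needed for a track of the given width to fill the sensor."""
    sensor_length_mm = PIXEL_COUNT * PIXEL_PITCH_MM  # about 20.48 mm
    return sensor_length_mm / track_width_mm


for label, width_mm in (("35 mm film", 2.13), ("16 mm film", 1.83)):
    print(f"{label}: {required_magnification(width_mm):.2f}:1")
```

The narrower 16 mm track needs a somewhat larger magnification than the 35 mm track, which is consistent with the additional extension tube or bellows mentioned for that format.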
The camera 100, for example an Aviiva type M2-CL camera, is controlled by frame grabber (CTRL) 200, for example a Matrox Meteor II CL digital board, which synchronizes the image capture and generation of a 12-bit digital signal representing the line scanned image of soundtrack 25 as the film 20 continuously traverses the projected beam of light. The CCD imager 110 has 2048 pixels and provides a parallel digital output signal 120, quantized to 12 bits and capable of operating with a pixel rate on the order of 60 MHz.
The digital image signal 120 represents 2048 successive measurements across the width of the soundtrack 25, which are captured as a 12-bit gray scale signal representing the instantaneous optical transmission of light through the soundtrack. This continuous succession of track width images (representing transmission/density measurements) undergoes storage, as a continuous digital image of the soundtrack 25, in a storage system 300, depicted as an exemplary RAID system.
Under control of the frame grabber 200 and responsive to user control, the camera 100 generates its 12-bit parallel digital output signal 120, in accordance with either the CameraLink or RS 622 output signal format. The use of a 2048 pixel line array sensor quantized to 12-bit resolution provides an adequate signal-to-quantizing-noise ratio of about 74 dB, with a resolution sufficient to capture the soundtrack envelope image without significant frequency response distortion. The frame grabber 200, which controls the camera 100, can provide synchronization to NTSC or HD television sync pulses via sync interface 250, and also permits an output data rate sufficient to capture soundtrack images at a normal operating speed of nominally 24 fps.
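The quoted figures of about 74 dB for 12-bit quantization and about 49 dB for 8-bit quantization are consistent with the common rule of thumb for an ideal uniform quantizer driven by a full-scale sine wave, namely SNR ≈ 6.02·N + 1.76 dB for N bits. The following minimal sketch applies that rule; the function name is illustrative, and the rule is a textbook approximation rather than a measurement of the described camera.

```python
def quantization_snr_db(bits: int) -> float:
    """Ideal signal-to-quantizing-noise ratio in dB for an N-bit uniform
    quantizer and a full-scale sine wave (textbook approximation)."""
    return 6.02 * bits + 1.76


for bits in (8, 12):
    print(f"{bits}-bit: {quantization_snr_db(bits):.1f} dB")
```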
In addition to the imaging considerations, the desired bandwidth of the processed audio signal must be considered. For example, if a reproduced audio bandwidth of 15 kHz is required, a sampling or image scanning rate of 30 kHz is needed. Thus with an exemplary sampling rate of 30 kHz, the camera 100 will output 2048 pixels represented as 12-bit words (3072 bytes) for each image scan (audio track line scan), producing an output data rate of 3072 × 30 × 10^3 bytes per second, or about 92.2 megabytes per second. Hence, one minute of soundtrack requires approximately 5.53 gigabytes of storage. Such storage capacity requirements can be provided by the RAID system 300, which typically comprises an Ultra Wide SCSI 160 drive.
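The scan rate, data rate and storage figures above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only; it assumes that the 12-bit words are stored packed (1.5 bytes per pixel), which is what the 3072 bytes per line figure implies, and the function names are hypothetical.

```python
PIXELS_PER_LINE = 2048
BITS_PER_PIXEL = 12


def line_rate_for_bandwidth(bandwidth_hz: float) -> float:
    """Nyquist criterion: scan at no less than twice the audio bandwidth."""
    return 2.0 * bandwidth_hz


def data_rate_bytes_per_s(lines_per_s: float) -> float:
    """Camera output data rate, assuming packed 12-bit words."""
    bytes_per_line = PIXELS_PER_LINE * BITS_PER_PIXEL / 8  # 3072 bytes
    return bytes_per_line * lines_per_s


rate = data_rate_bytes_per_s(line_rate_for_bandwidth(15_000))
print(f"{rate / 1e6:.2f} megabytes per second")
print(f"{rate * 60 / 1e9:.2f} gigabytes per minute")
```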
The apparatus of
The controller 400, together with the display 500 and keyboard 600 can comprise a personal computer. Alternatively, the controller 400 could comprise a custom processor integrated circuit, or combination of such circuits, coupled to the display 500 and keyboard 600. Regardless of its form, the controller 400 must support the high transfer rates associated with the camera data and requires at least 512 MB of RAM together with an Ultra SCSI 160 or fiber channel interface that can sustain the high transfer rates. In addition, the controller 400 should ideally include dual processors to allow parallel processing which can increase both processing speed and performance.
An operator activates the system of
Advantageously, the real time image provides not only pictures of the soundtrack but also shows the presence of interference generating illumination emanating from the sprocket holes or the picture area, which can contaminate the soundtrack. This unwanted light ingress can be eliminated by using the on-screen camera image to permit manipulation of optical group 75 to remove such unwanted audio contributions by carefully framing the soundtrack using picture zoom, pan and tilt as well as by manipulating the position of the light source with respect to the track. In addition, the soundtrack image can be examined in detail by electronically magnifying selectable sections of the display envelope to permit camera azimuth alignment when reproducing a test film known as a buzz track. The magnified image is presented with an electronically generated cursor line which permits the evaluation of any perturbations or anomalies in the audio modulation envelope.
With optimized azimuth alignment, modulation peaks appear concurrently with substantially equal magnitude but opposite polarity. An optimum azimuth adjustment will produce concurrently maximized envelope peaks. Misalignment of azimuth between the camera and the soundtrack can result in an image that captures temporally different audio information, such as can occur with a stereo audio track pair.
Following processing of the image during step 915, a check occurs during step 920 whether the operator should undertake alignment of the camera 100 of
Following camera image optimization, framing, focus, exposure, etc. to reduce misalignment, the operator selects the Record mode from the tool bar of
As described, after completing the scanning and storage steps 970 and 975, respectively, the digital soundtrack image undergoes processing during step 980. Such processing occurs upon operator selection of the Processing mode from the tool bar shown in
As discussed, the processing control panel shown in
Soundtrack deficiencies can result from the various causes described previously. However, more specifically, dirt, debris, transverse or diagonal scratches or longitudinal cinches in a negative can produce white spots when printed. These flaws generate clicks and crackles. Such white spots tend to affect the dark areas of the track and are more noticeable during quiet passages, whereas noise occurring during loud passages often originates in the clear areas of the print. Low frequency thuds or pops often result from relatively large holes or spots in a positive soundtrack formed as a consequence of processing problems. Hiss can result from a grainy or slightly fogged track area. A noise envelope that follows the wanted audio signal is often caused by inter-modulation distortion.
Although the scanned audio track is represented as a continuous intensity modulated image, sections of the image can be read from the storage system 300 and configured for processing using statistical techniques. A first algorithm was developed using a computer program such as Matlab® to estimate the instantaneous amplitude value of the audio signal as represented by the density of the film track and digitized as a single line scan. Statistical techniques can be used to estimate the density value that truly represents the amplitude of the audio signal. First, finding the average of the density values in the line vector comprised of 2048 pixels provides a good estimate of the true audio amplitude representation. This averaging process also serves to minimize the effects of noise resulting from unwanted variations in optical transmission across the track width.
The concept here is to obtain the instantaneous audio amplitude, which corresponds to the gray level value of the scanned image at a particular instant, by adding the gray level values of all of the pixels in one scanned line and dividing by the total number of pixels in that line. In this example, there are 2048 pixel elements on the line scan CCD array. Each element outputs a gray level that corresponds to the intensity of the audio track in that particular portion of the density track, and the track is scanned at 30,000 such lines per second. All of the individual pixel values obtained during the scanning of a line are added and the sum is divided by 2048, the number of pixels per line, to obtain the mean value to be used as the instantaneous audio level.
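The averaging step can be sketched as follows. This is a minimal illustration, not the Matlab® routine referred to above; it assumes each line scan is available as a sequence of 12-bit gray levels, and the function names are hypothetical.

```python
from statistics import fmean
from typing import Iterable, List, Sequence


def line_to_audio_sample(line: Sequence[int]) -> float:
    """Mean gray level of one line scan, used as the instantaneous audio level."""
    if not line:
        raise ValueError("empty line scan")
    return fmean(line)


def track_to_audio(lines: Iterable[Sequence[int]]) -> List[float]:
    """One audio sample per scanned line (e.g. 30,000 lines per second)."""
    return [line_to_audio_sample(line) for line in lines]
```

For a sensor of 2048 pixels, each call averages 2048 values, so random pixel-to-pixel variations such as film grain tend to average out while the density common to the whole line is retained.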
Scratches across the soundtrack can cause variations in light transmission, which produce transient or impulsive noise effects such as loud pops or clicks. This form of transient noise is advantageously eliminated by a second algorithm which is applied to the line image sections of the stored exemplary 12-bit digital envelope signal. This second algorithm uses a spatial image processing technique to derive the mean values of the pixels of each image section across the width of the track. These mean values are then used to generate the instantaneous audio amplitude of the track. The technique uses regression analysis with a weighted coefficient assigned to pixel values according to their relative deviation from the mean. If a pixel value deviates from the mean by more than a user-set threshold number of standard deviations, it is eliminated from the estimation process. In this way a linear approximation of the variations in density across the soundtrack width is obtained. The middle point of the data values across the line is then taken as the mean value used to estimate the amplitude of the audio, with very little effect from random noise and transient noise.
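One possible reading of this second algorithm is sketched below. The weighting scheme is not specified in detail here, so the sketch makes loud assumptions: pixels are given a weight of 1 if they lie within `threshold` standard deviations of the line mean and 0 otherwise, an ordinary least squares line is fitted to the surviving pixels, and the fitted value at the middle of the line is returned. The function name and the default threshold are hypothetical.

```python
from statistics import fmean, pstdev
from typing import Sequence


def robust_line_amplitude(line: Sequence[float], threshold: float = 2.0) -> float:
    """Estimate the audio amplitude of one line scan while ignoring outliers."""
    n = len(line)
    if n == 0:
        raise ValueError("empty line scan")
    mean = fmean(line)
    sigma = pstdev(line)
    # Assumed weighting: keep (weight 1) or reject (weight 0) each pixel.
    kept = [(i, v) for i, v in enumerate(line)
            if sigma == 0 or abs(v - mean) <= threshold * sigma]
    if not kept:
        return mean  # threshold too strict; fall back to the plain mean
    x_bar = fmean(i for i, _ in kept)
    y_bar = fmean(v for _, v in kept)
    sxx = sum((i - x_bar) ** 2 for i, _ in kept)
    sxy = sum((i - x_bar) * (v - y_bar) for i, v in kept)
    slope = sxy / sxx if sxx else 0.0
    intercept = y_bar - slope * x_bar
    # Value of the linear approximation at the middle of the line.
    return slope * (n - 1) / 2 + intercept
```

With a uniform line of gray level 1000 crossed by a ten pixel scratch at full scale, the scratch pixels are rejected and the estimate stays at 1000, whereas a plain mean would be pulled upward by the scratch.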
Often, density tracks are recorded beyond the linear portion of a film's response, extending into the toe and shoulder areas of the gamma curve. To compensate for the resulting amplitude distortion, an exponential curve can be chosen such that the toe's logarithmic shape can be linearized. A cubic function can be chosen to linearize the audio that falls in the shoulder portion of the gamma curve. Different slopes and lengths can be chosen for each segment, and listening tests can be performed to determine the best settings.
A vector with 4096 entries is generated to hold the values of the look up table. The 4096 coefficients are computed from the graph previously defined by the operator in the following manner: the entry N in the vector is calculated as N = F(X), where X is the pixel intensity value. In the case of the exponential function, N = e^X; in the linear portion, N = Slope*X + Intercept. With a pre-calculated look up table, the new intensity value N for a pixel X can be obtained without spending processor time evaluating the functions for each pixel.
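A minimal sketch of such a look up table follows. The table builder simply evaluates N = F(X) once for each of the 4096 possible 12-bit values. The example curve is purely hypothetical: the breakpoint `TOE_END`, the scaling of the exponential segment so that it meets the linear segment, and the unity slope are placeholders standing in for operator-chosen settings, and no shoulder segment is modeled.

```python
import math
from typing import Callable, List, Sequence

LUT_SIZE = 4096  # one entry per possible 12-bit pixel value


def build_lut(transfer: Callable[[int], float]) -> List[int]:
    """Evaluate N = F(X) once per pixel value X, clamped to the 12-bit range."""
    return [min(LUT_SIZE - 1, max(0, round(transfer(x)))) for x in range(LUT_SIZE)]


def apply_lut(lut: Sequence[int], pixels: Sequence[int]) -> List[int]:
    """Correct pixel values by table lookup instead of re-evaluating F."""
    return [lut[p] for p in pixels]


TOE_END = 512  # hypothetical end of the toe region


def example_transfer(x: int) -> float:
    """Hypothetical operator-defined curve: exponential toe, then linear."""
    if x < TOE_END:
        # Exponential segment scaled to pass through (0, 0) and (TOE_END, TOE_END).
        return TOE_END * math.expm1(x / TOE_END) / math.expm1(1.0)
    return 1.0 * x + 0.0  # linear segment: slope * X + intercept


lut = build_lut(example_transfer)
corrected = apply_lut(lut, [0, 256, 2048, 4095])
```

Because the table is built once, correcting a full line of 2048 pixels costs only 2048 index operations regardless of how expensive the curve function is.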
A further advantageous arrangement utilizes look up tables to provide compensation for pixel intensity values that occur in the non-linear toe and shoulder areas of the film transfer characteristic. The look up table provides linearizing correction values for densities that extend beyond the normal linear region of the film characteristics. A computer routine maps a linear density value that corresponds to the mean amplitude values calculated with the previous methods if it falls within the non-linear range of the film. The net result is an increase in the dynamic range and signal-to-noise ratio of the audio signal.
This technique seeks to linearize the non-linear portions of the gamma response curve for an audio film. An operator is provided with an interface, as seen in
As discussed previously, the statistical processing performed by the processor 400 can include regression analysis. Again, the idea is to linearize the gamma response of the variable density audio track. In this case, linear regression is used to interpolate the pixel values that lie in the toe, shoulder and any other areas that are non-linear. First, a data set of all the intensity values present in the track is gathered. Then, a least squares fit is performed on that data set to obtain the slope and intercept for the gamma response that best approximates the track, and that curve is used to create a look up table in the same manner described above. In this case, the value N = slope*X + intercept, where the slope and intercept are the values obtained from the linear least squares fit.
Another statistical processing technique capable of being implemented by the controller 400 of
During initial camera alignment the track image is observed at several film locations and, if film weave is apparent, the image centering can be adjusted to position the nominal center of the wandering soundtrack path in the middle of the display image. The image size is then adjusted such that the audio track fills the width of the CCD line array. Hence it can be appreciated that as the film weaves only the horizontal position, or distribution of the end pixels, varies. However, the mean of the pixel intensities, which represents the audio signal amplitude, remains substantially constant because, although the intensity envelope image moved, it remained on the sensor array. Thus the algorithm for converting the envelope image into an audio value advantageously eliminates and corrects the effects of film weave.
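The insensitivity of the line mean to lateral weave can be illustrated with a small simulation. The sketch below is a simplification under stated assumptions: the track image is modeled as a block of equal gray levels narrower than the sensor, the area outside the track reads a constant background level, and the weave is small enough that the track image stays on the sensor. The function name and the numeric values are hypothetical.

```python
from statistics import fmean
from typing import List, Sequence


def place_track(track: Sequence[int], sensor_width: int, offset: int,
                background: int = 0) -> List[int]:
    """Simulate one line scan with the track image starting at `offset`."""
    if offset < 0 or offset + len(track) > sensor_width:
        raise ValueError("track image falls off the sensor")
    line = [background] * sensor_width
    line[offset:offset + len(track)] = track
    return line


track = [1800] * 2000                      # uniform density across the track
centered = place_track(track, 2048, 24)    # nominal position
weaved = place_track(track, 2048, 40)      # same track shifted by 16 pixels
print(fmean(centered) == fmean(weaved))
```

The two line means are identical because the same pixel values are summed in both cases; only their positions on the sensor differ.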
The foregoing describes a technique for restoration of recorded signal quality in variable density recordings on motion picture film by scanning the soundtrack to yield a digital signal and then applying statistical processing techniques to that signal.
Filing Document | Filing Date | Country | Kind | 371c Date
---|---|---|---|---
PCT/US04/05690 | 2/26/2004 | WO | | 10/18/2005

Number | Date | Country
---|---|---
60467798 | May 2003 | US