The present disclosure relates to improvements for viewing images/video on screens. More particularly, it relates to methods and systems for compensating for local ambient conditions that alter how images/videos being displayed are viewed by the user.
Video and image displays (screens or projections) present digital image content with the contrast set by the content creator. Brightness and contrast can be manually adjusted by the display owner to counter environmental effects that alter the perceived brightness and contrast of a given display (such as ambient lighting, viewing angle and distance, eye condition, etc.). However, it is difficult to determine what settings match what the content creator originally intended, and overall brightness adjustments can cause oversaturation in areas that are already very bright.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, issues identified with respect to one or more approaches should not be assumed to have been recognized in any prior art on the basis of this section, unless otherwise indicated.
A screen for a display device (movie projector, television, computer monitor, smartphone, tablet computer, etc.) displays images by illuminating pixels at a fixed range of luminance according to the data provided for the image (e.g., 0 to 255).
Image and video content is typically coded to have luminance levels that assume ideal viewing conditions. In reality, the conditions are not ideal, particularly for devices that are used in brightly lit areas (e.g., outdoors), but other conditions can also affect the viewing experience, such as viewer age, eye condition, viewing angle, etc.
Being able to compensate for those conditions can improve the viewing experience and present the images in a condition close to what was originally intended, in terms of brightness and contrast.
An embodiment of the present invention is a method to modify an image displayed for a user on a target device in target surround conditions, said method comprising: determining an adjusted cone response based on a minimum target cone response and a delta cone response; calculating a target luminance from a local adaptation pooling and the adjusted cone response; and modifying the image by the target luminance to produce an adapted image.
A further embodiment of the present invention is a method for adjusting a target image to be viewed by a user comprising: modeling a target surround with the target image; modeling glare being added to the target image by a point spread function of an eye of the user; calculating local adaptation pooling from a local adaptation kernel and the glare; calculating an adjusted cone response; calculating a target luminance from the adjusted cone response; and generating an adjusted image by subtracting the glare from the target luminance.
Details of one or more implementations of the subject matter described in this specification are set forth in the accompanying drawings and the description below. Other features, aspects, and advantages will become apparent from the description, the drawings, and the claims. Note that the relative dimensions of the following figures may not be drawn to scale. Like reference numbers and designations in the various drawings generally indicate like elements, but different reference numbers do not necessarily designate different elements between different drawings.
As used herein, “point spread function” or “PSF” refers to a mathematical model that describes the response of an eye to a point source of illumination. The term “PH” refers to a distance to a screen or projection in terms of multiples of the screen/projection height.
As used herein, “adaptation” refers to the retina adapting to changing luminance levels. “Local adaptation pooling” refers to a model of per-pixel (or pixel block) adaptation based on spatial pooling (feature reduction by filtering over a region of field view). See, e.g., “A Model of Local Adaptation”, by Peter Vangorp et al. (ACM TOG 34(6)).
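One plausible form for such pooling, blending two Gaussian spatial scales (an assumption made here for illustration; the exact formulation is given in the cited paper), is

$$ La = \alpha \, n_1\big(g_{\sigma_1} * Ret\big) + (1-\alpha)\, n_2\big(g_{\sigma_2} * Ret\big) $$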
where, in some embodiments, α=0.654, * is the convolution operator, and the parameters for the Gaussian kernels g are σ1=0.428 deg and σ2=0.0824 deg. The n1 and n2 values are custom non-linearities which can be approximated with
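One common four-parameter sigmoid shape for such nonlinearities (purely an assumption for illustration; the fitted forms are given in the cited paper) is

$$ n_i(x) = d + \frac{a - d}{1 + e^{\,b\,(c - x)}} $$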
where, in some embodiments, for n1, a=3.46, b=70.2, c=2.38, and d=2.37, and for n2, a=2.19, b=66.8, c=2.37, and d=2.37.
Other values for the adaptation pooling curves can be derived under different conditions. See, e.g., “A Model of Local Adaptation”, supra. Typically, the functional form is steeper in the first few degrees off zero, then slopes off less steeply further from the zero mark (see, e.g., the pooling curves in the accompanying figures).
In the method, an image is brightness-compensated by maintaining the cone-response difference between the image and “black”. To do this, the cone-response difference is first calculated for ideal conditions and then used to adjust luminance levels under actual conditions.
In some embodiments, the cone response difference (or “cone delta”) can be determined by the following steps, as shown in the accompanying figure:
1) Model the image with ideal surround conditions (405):
An example of ideal surround conditions may be 5 cd/m² at a distance of 3× picture height. However, the ideal surround conditions can change depending on application (e.g., in-theater movie vs. at-home streaming vs. mobile device, etc.).
2) Model the glare being added to the image (with ideal surround) based on ideal optics for a human eye (e.g., healthy eyes of a 20-year-old person) (410). The modeling can be done by taking the inverse fast Fourier transform of the product of the Fourier-transformed point spread function (“PSF( )”) of the ideal eye (“eye”) with the fast Fourier transform of the image with ideal surround. This model can be referred to as the “retinal image” (“RetR”).
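For illustration, this frequency-domain convolution can be sketched in Python as follows (a minimal sketch; the helper name retinal_image is hypothetical, and the PSF is assumed to be sampled on the same grid as the image and centered in its array):

```python
import numpy as np

def retinal_image(img: np.ndarray, psf: np.ndarray) -> np.ndarray:
    """Convolve the (surround-modeled) image with the eye's PSF via the FFT."""
    # ifftshift moves the centered PSF to the origin so the convolution
    # does not spatially shift the result.
    otf = np.fft.fft2(np.fft.ifftshift(psf))
    return np.real(np.fft.ifft2(otf * np.fft.fft2(img)))
```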
3) The local adaptation pooling (“LaR”) can then be calculated by applying a local adaptation kernel (“∂”) to the retinal image (415).
The local adaptation kernel can be a convolution kernel based on the pooling curves discussed above.
4) From these, the cone response (“CR”) of the ideal eye under ideal conditions can be calculated (420).
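One plausible formulation, patterned on the standard Naka-Rushton response function and consistent with the getConeResponse(Ret, La) pseudocode later in this document (the exact equation is an assumption here), is

$$ CR = \frac{Ret^{\,n}}{Ret^{\,n} + \sigma(La)^{\,n}} $$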
where n is a response constant and σ( ) is the adaptation state that is reached at complete adaptation. An example range of values for n is from 0.7 to 1.
5) Steps 1-4 are repeated for a solid black background (“S”) in place of the image. This provides a cone response for glare separated from image content (425).
“Black” is defined as the light/glare in the user's eye at that pixel location given a black (zero luminance) image, but still in the given surround condition.
6) From these cone responses, the delta cone response (“ΔC”) can be determined (430).
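Consistent with the pseudocode later in this document (deltaConeResp=ConeResp_src-ConeResp_srcblk), this is the per-pixel difference

$$ \Delta C = CR_{image} - CR_{black} $$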
From this delta cone response, the luminance of an actual displayed image can be adjusted for glare.
In some embodiments, the image can be adjusted with the following steps, as shown in the accompanying figure:
1) Model the actual (i.e., target, “T”) surround with the target image (500) (“SurT”) (505). At a simplistic level this would include parameters such as the type of screen the user is looking at, the viewing angle between the user and the screen, the user's distance to the screen, and a flat field description of the ambient luminance conditions. Multiple parameters can be considered together. Basically, the parameters describe ways that the target image might be altered/warped from the creator's intended viewing presentation. For example, changing the distance to the screen would make the viewed images appear smaller and different screens have different glare/reflection characteristics. At a complex level this could include parameterized physical capture of the complex surround (e.g., visual scan of surround).
“Actual conditions” can be, for example, physically measured values, or they can be assumed values from predicted conditions, or even a mix of both.
2) Model the glare being added to the image by the optics of the user's eye (this varies from person to person) (510). This is the point spread function, “PSF”, of the eye, as described above; if needed, attenuate it by the pupil's diameter. The result is the target retinal image (“RetT”).
For example, and in simplistic terms, a smaller pupil diameter decreases the Nyquist threshold and results in a smaller/narrower PSF; likewise, a larger pupil diameter results in a wider PSF. In some embodiments the PSF can be given as a Gaussian:
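One plausible Gaussian form, in which the spread scales with the pupil diameter (an assumption consistent with the qualitative description above), is

$$ PSF(x) = \exp\!\left(-\frac{x^{2}}{2\,(s\,p)^{2}}\right) $$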
where x is the angle, s is a constant standard deviation, and p is the pupil diameter.
3) Given the target retinal image and the local adaptation kernel, calculate local adaptation pooling (515). This will provide, for every pixel, the target local adaptation.
4) From the target local adaptation and the retinal image for a “black” image (501) and actual surround conditions, calculate the minimum target cone response (“CTmin”) of the user's eye under these conditions (520).
5) Now add the delta cone response (calculated from ideal conditions) to the minimum target cone response to get an adjusted cone response (“CT”) (525).
6) From this, the target luminance can be calculated (530). This is a local luminance (e.g., per pixel).
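In equation form (the inversion assumes the Naka-Rushton-style response sketched above):

$$ C_T = C_{T,\min} + \Delta C, \qquad Lum_T = \sigma(La_T)\left(\frac{C_T}{1 - C_T}\right)^{1/n} $$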
7) Now subtract the glare luminance from the target luminance (535) to produce the adjusted (luminance) image (“ImageT”) (540) from the input image (500).
The process itself does not assume target display characteristics. Therefore, it is possible that the compensated image created may exceed the capabilities of the target display. In some embodiments, traditional tone mapping algorithms are employed to return the image into the range of the target display if it is detected that the compensated image exceeds the capabilities of the target display.
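As an illustration, a compensated image can be rolled back into range with a simple extended-Reinhard curve (one of many standard tone mapping operators; the function and its form are assumptions, not a method prescribed by this disclosure):

```python
import numpy as np

def tone_map_to_display(lin: np.ndarray, peak_nits: float) -> np.ndarray:
    """Roll out-of-range compensated luminance back into [0, peak_nits]."""
    if lin.max() <= peak_nits:
        return lin  # already within display capability
    l = lin / peak_nits
    l_white = l.max()  # maps the brightest compensated value to the display peak
    return peak_nits * l * (1.0 + l / l_white**2) / (1.0 + l)
```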
While this method targets maintaining the delta cone response of the eye, it is also possible to target the absolute cone response. This is most effective when the image covers the majority of the field of view. Surround conditions that cannot be changed will physically limit the extent to which the absolute cone responses can be reached. In the case of high ambient light, for example, getting a cone response near zero is unlikely no matter what is done to the image characteristics.
In some embodiments, CT in equation (18) is set as an absolute cone response value, instead of calculated from a sum.
1) From the source image (600), convert the image from PQ (perceptual quantization) space to linear space (luminance). (605)
Example pseudocode is: Lin_src=PQ2L(Img_src).
2) Add the target surround (ambient conditions) and add padding to simulate differences in viewing angles. (610) This can be accomplished by filter design.
An example of simulating the surround environment is creating an image with a particular display (“TV”) in a target surround environment by taking the image and padding it with surround values. If the viewer is at a smaller viewing angle (further from the TV), decrease the size of the image in relation to the surround. Likewise, if the viewing angle is larger (closer to the TV), increase the size of the image in relation to the padding. In both cases one can simply adjust the amount of padding to reach the appropriate viewing angle. For example, at 3 picture heights, the TV should take up approximately 40 degrees out of the 180-degree image.
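As a rough geometric illustration of sizing the padding (an assumption; the function name is hypothetical and the exact angular budget depends on the surround model used):

```python
import math

def image_field_fraction(ph_distance: float, aspect: float = 16 / 9,
                         field_deg: float = 180.0) -> float:
    """Fraction of the modeled field of view occupied by a screen
    viewed at ph_distance picture heights (simple geometric estimate)."""
    screen_deg = 2 * math.degrees(math.atan(aspect / (2 * ph_distance)))
    return screen_deg / field_deg  # pad the remainder with surround values
```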
Examples of source vs. target conditions include a source at 5 nit surround at 3 PH viewing distance, and a target at 300 nit surround at 6 PH distance (e.g., the actual viewing conditions have greater ambient lighting and a smaller apparent screen size than the source was designed for).
3) For each viewing condition of the source and target, calculate the retinal images (taking into account glare, straylight, point spread function) stripping away any padding. (615)
Example pseudocode is: Ret_src=calculateGlare(Lin_src, Source); Ret_tgt=calculateGlare(Lin_src, Target).
4) For each retinal image (source and target), calculate the local adaptation. (620)
Example pseudocode is: La_src=calculateAdaptation(Ret_src); La_tgt=calculateAdaptation(Ret_tgt).
5) For each surround condition (source and target), calculate the retinal images for a black image. (625)
Example pseudocode is: Ret_srcBlack=calculateGlare(black, Source); Ret_tgtBlack=calculateGlare(black, Target).
6) For the source and target images, find the cone responses. The cone responses from the black images are considered “minimum black due to glare”. (630)
Example pseudocode is: ConeResp_src=getConeResponse(Ret_src, La_src); ConeResp_srcblk=getConeResponse(Ret_srcBlack, La_src); ConeResp_tgtblk=getConeResponse(Ret_tgtBlack, La_tgt).
7) Find the delta cone response as the difference between the image cone response and the “minimum black due to glare” cone response for the source. (635)
Example pseudocode is: deltaConeResp=ConeResp_src-ConeResp_srcblk.
8) Calculate the target cone response as the sum of the target black response plus the delta cone response. (640)
Example pseudocode is: ConeResp_tgt=ConeResp_tgtblk+deltaConeResp.
9) Use the local adaptation model of the target image to drive the next step. Take this local adaptation model, the desired cone response, and back-calculate to get the required luminance level. While this is technically a circular relationship, the local adaptation of the target is a close enough approximation to adjust the image sufficiently. (645)
Example pseudocode is: Lin_tgt=getLuminance(ConeResp_tgt, La_tgt).
10) Subtract the retinal image (target) for black to remove the straylight from the surround and provide the correct contrast ratio. (650)
Example pseudocode is: Lin_tgt_noGlare=Lin_tgt-Ret_tgtBlack.
11) Convert the image back from linear space to display space (PQ) if needed (655) to provide the adapted image (690).
Example pseudocode is: Img_tgt=L2PQ(Lin_tgt_noGlare).
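Putting steps 1-11 together, a minimal NumPy/SciPy sketch follows. The PQ conversions follow SMPTE ST 2084; everything else (the Gaussian PSF stand-in, log-domain pooling, Naka-Rushton response, and the Condition fields) is an illustrative assumption rather than the disclosure's exact models, with names adapted from the pseudocode above:

```python
# Minimal end-to-end sketch of steps 1-11. Helper models are illustrative
# assumptions, not the exact models of this disclosure.
from dataclasses import dataclass

import numpy as np
from scipy.ndimage import gaussian_filter

# SMPTE ST 2084 (PQ) constants
M1, M2 = 2610 / 16384, 2523 / 32
C1, C2, C3 = 3424 / 4096, 2413 / 128, 2392 / 128

def pq2l(e: np.ndarray) -> np.ndarray:
    """PQ code values in [0, 1] -> absolute luminance in cd/m^2 (step 1)."""
    p = np.power(np.clip(e, 0.0, 1.0), 1 / M2)
    return 10000.0 * np.power(np.maximum(p - C1, 0.0) / (C2 - C3 * p), 1 / M1)

def l2pq(y: np.ndarray) -> np.ndarray:
    """Absolute luminance in cd/m^2 -> PQ code values in [0, 1] (step 11)."""
    p = np.power(np.clip(y, 0.0, 10000.0) / 10000.0, M1)
    return np.power((C1 + C2 * p) / (1.0 + C3 * p), M2)

@dataclass
class Condition:
    surround_nits: float   # flat-field ambient luminance around the screen
    pad: int               # padding width (pixels) simulating viewing angle
    psf_sigma_px: float    # eye-PSF width in pixels (pupil-dependent)

def calculate_glare(lin: np.ndarray, cond: Condition) -> np.ndarray:
    """Retinal image: pad with surround, blur by the eye PSF, strip padding (steps 2-3)."""
    p = cond.pad  # assumes a 2D luminance image and pad > 0
    padded = np.pad(lin, p, constant_values=cond.surround_nits)
    ret = gaussian_filter(padded, cond.psf_sigma_px)  # Gaussian stand-in for the PSF
    return ret[p:-p, p:-p]

def calculate_adaptation(ret: np.ndarray, sigma_px: float = 8.0) -> np.ndarray:
    """Local adaptation pooling as wide Gaussian pooling of log luminance (step 4)."""
    return np.exp(gaussian_filter(np.log(ret + 1e-6), sigma_px))

def get_cone_response(ret: np.ndarray, la: np.ndarray, n: float = 0.9) -> np.ndarray:
    """Naka-Rushton-style response with adaptation state sigma(La) = La (step 6)."""
    return ret**n / (ret**n + la**n)

def get_luminance(cr: np.ndarray, la: np.ndarray, n: float = 0.9) -> np.ndarray:
    """Back-calculate the luminance that yields cone response cr (step 9)."""
    cr = np.clip(cr, 1e-6, 1.0 - 1e-6)
    return (cr * la**n / (1.0 - cr)) ** (1.0 / n)

def compensate(img_src_pq: np.ndarray, source: Condition, target: Condition) -> np.ndarray:
    lin_src = pq2l(img_src_pq)                                     # step 1
    black = np.zeros_like(lin_src)
    ret_src = calculate_glare(lin_src, source)                     # steps 2-3
    ret_tgt = calculate_glare(lin_src, target)
    la_src = calculate_adaptation(ret_src)                         # step 4
    la_tgt = calculate_adaptation(ret_tgt)
    ret_src_blk = calculate_glare(black, source)                   # step 5
    ret_tgt_blk = calculate_glare(black, target)
    delta = (get_cone_response(ret_src, la_src)                    # steps 6-7
             - get_cone_response(ret_src_blk, la_src))
    cone_tgt = get_cone_response(ret_tgt_blk, la_tgt) + delta      # step 8
    lin_tgt = get_luminance(cone_tgt, la_tgt)                      # step 9
    lin_no_glare = np.maximum(lin_tgt - ret_tgt_blk, 0.0)          # step 10
    return l2pq(lin_no_glare)                                      # step 11
```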
This document proposes a local adaptation pooling method/system for modifying an image for display under non-ideal conditions. The method/system can include determining a delta cone response under ideal conditions, which is then used to adjust images shown under non-ideal conditions.
Embodiments of the present invention may be implemented with a computer system, systems configured in electronic circuitry and components, an integrated circuit (IC) device such as a microcontroller, a field programmable gate array (FPGA), or another configurable or programmable logic device (PLD), a discrete time or digital signal processor (DSP), an application specific IC (ASIC), and/or apparatus that includes one or more of such systems, devices or components. The computer and/or IC may perform, control, or execute instructions relating to the adaptive perceptual quantization of images with enhanced dynamic range, such as those described herein. The computer and/or IC may compute any of a variety of parameters or values that relate to the adaptive perceptual quantization processes described herein. The image and video embodiments may be implemented in hardware, software, firmware and various combinations thereof.
Certain implementations of the invention comprise computer processors which execute software instructions which cause the processors to perform a method of the disclosure. For example, one or more processors in a display, an encoder, a set top box, a transcoder or the like may implement methods related to adaptive perceptual quantization of HDR images as described above by executing software instructions in a program memory accessible to the processors. Embodiments of the invention may also be provided in the form of a program product. The program product may comprise any non-transitory medium which carries a set of computer-readable signals comprising instructions which, when executed by a data processor, cause the data processor to execute a method of an embodiment of the invention. Program products according to embodiments of the invention may be in any of a wide variety of forms. The program product may comprise, for example, physical media such as magnetic data storage media including floppy diskettes, hard disk drives, optical data storage media including CD ROMs, DVDs, electronic data storage media including ROMs, flash RAM, or the like. The computer-readable signals on the program product may optionally be compressed or encrypted.
Where a component (e.g. a software module, processor, assembly, device, circuit, etc.) is referred to above, unless otherwise indicated, reference to that component (including a reference to a “means”) should be interpreted as including as equivalents of that component any component which performs the function of the described component (e.g., that is functionally equivalent), including components which are not structurally equivalent to the disclosed structure which performs the function in the illustrated example embodiments of the invention.
According to one embodiment, the techniques described herein are implemented by one or more special-purpose computing devices. The special-purpose computing devices may be hard-wired to perform the techniques or may include digital electronic devices such as one or more application-specific integrated circuits (ASICs) or field programmable gate arrays (FPGAs) that are persistently programmed to perform the techniques or may include one or more general purpose hardware processors programmed to perform the techniques pursuant to program instructions in firmware, memory, other storage, or a combination. Such special-purpose computing devices may also combine custom hard-wired logic, ASICs, or FPGAs with custom programming to accomplish the techniques. The special-purpose computing devices may be desktop computer systems, portable computer systems, handheld devices, networking devices or any other device that incorporates hard-wired and/or program logic to implement the techniques.
For example, the techniques may be implemented on a computer system (1500) that includes a bus (1502) or other communication mechanism for communicating information, and a hardware processor (1504) coupled with the bus for processing information.
The computer system (1500) also includes a main memory (1506), such as a random-access memory (RAM) or other dynamic storage device, coupled to bus (1502) for storing information and instructions to be executed by processor (1504). Main memory (1506) also may be used for storing temporary variables or other intermediate information during execution of instructions to be executed by processor (1504). Such instructions, when stored in non-transitory storage media accessible to processor (1504), render computer system (1500) into a special-purpose machine that is customized to perform the operations specified in the instructions.
Computer system (1500) further includes a read only memory (ROM) (1508) or other static storage device coupled to bus (1502) for storing static information and instructions for processor (1504). A storage device (1510), such as a magnetic disk or optical disk, is provided and coupled to bus (1502) for storing information and instructions.
Computer system (1500) may be coupled via bus (1502) to a display (1512), such as a liquid crystal display (LCD) or light emitting diode display (LED), for displaying information to a computer user. An input device (1514), including alphanumeric and other keys, is coupled to bus (1502) for communicating information and command selections to processor (1504). Another type of user input device is cursor control (1516), such as a mouse, a trackball, or cursor direction keys for communicating direction information and command selections to processor (1504) and for controlling cursor movement on display (1512). This input device typically has two degrees of freedom in two axes, a first axis (e.g., x) and a second axis (e.g., y), that allows the device to specify positions in a plane.
Computer system (1500) may implement the techniques described herein using customized hard-wired logic, one or more ASICs or FPGAs, firmware and/or program logic which in combination with the computer system causes or programs computer system (1500) to be a special-purpose machine. According to one embodiment, the techniques as described herein are performed by computer system (1500) in response to processor (1504) executing one or more sequences of one or more instructions contained in main memory (1506). Such instructions may be read into main memory (1506) from another storage medium, such as storage device (1510). Execution of the sequences of instructions contained in main memory (1506) causes processor (1504) to perform the process steps described herein. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions.
The term “storage media” as used herein refers to any non-transitory media that store data and/or instructions that cause a machine to operate in a specific fashion. Such storage media may comprise non-volatile media and/or volatile media. Non-volatile media includes, for example, optical or magnetic disks, such as storage device (1510). Volatile media includes dynamic memory, such as main memory (1506). Common forms of storage media include, for example, a floppy disk, a flexible disk, a hard disk, a solid-state drive, magnetic tape, or any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, a RAM, a PROM, an EPROM, a FLASH-EPROM, NVRAM, or any other memory chip or cartridge.
Storage media is distinct from but may be used in conjunction with transmission media. Transmission media participates in transferring information between storage media. For example, transmission media includes coaxial cables, copper wire and fiber optics, including the wires that comprise bus (1502). Transmission media can also take the form of acoustic or light waves, such as those generated during radio-wave and infra-red data communications.
Various forms of media may be involved in carrying one or more sequences of one or more instructions to processor (1504) for execution. For example, the instructions may initially be carried on a magnetic disk or solid-state drive of a remote computer. The remote computer can load the instructions into its dynamic memory and send the instructions over a telephone line using a modem. A modem local to computer system (1500) can receive the data on the telephone line and use an infra-red transmitter to convert the data to an infra-red signal. An infra-red detector can receive the data carried in the infra-red signal and appropriate circuitry can place the data on bus (1502). Bus (1502) carries the data to main memory (1506), from which processor (1504) retrieves and executes the instructions. The instructions received by main memory (1506) may optionally be stored on storage device (1510) either before or after execution by processor (1504).
Computer system (1500) also includes a communication interface (1518) coupled to bus (1502). Communication interface (1518) provides a two-way data communication coupling to a network link (1520) that is connected to a local network (1522). For example, communication interface (1518) may be an integrated services digital network (ISDN) card, cable modem, satellite modem, or a modem to provide a data communication connection to a corresponding type of telephone line. As another example, communication interface (1518) may be a local area network (LAN) card to provide a data communication connection to a compatible LAN. Wireless links may also be implemented. In any such implementation, communication interface (1518) sends and receives electrical, electromagnetic or optical signals that carry digital data streams representing various types of information.
Network link (1520) typically provides data communication through one or more networks to other data devices. For example, network link (1520) may provide a connection through local network (1522) to a host computer (1524) or to data equipment operated by an Internet Service Provider (ISP). The ISP in turn provides data communication services through the worldwide packet data communication network now commonly referred to as the “Internet” (1528). Local network (1522) and Internet (1528) both use electrical, electromagnetic or optical signals that carry digital data streams. The signals through the various networks and the signals on network link (1520) and through communication interface (1518), which carry the digital data to and from computer system (1500), are example forms of transmission media.
Computer system (1500) can send messages and receive data, including program code, through the network(s), network link (1520) and communication interface (1518). In the Internet example, a server (1530) might transmit a requested code for an application program through Internet (1528), ISP on the Internet, local network (1522) and communication interface (1518).
The received code may be executed by processor (1504) as it is received, and/or stored in storage device (1510), or other non-volatile storage for later execution.
The methods herein can also be implemented on a decoder in the device, as shown in the accompanying figure.
A number of embodiments of the disclosure have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the present disclosure. Accordingly, other embodiments are within the scope of the following claims.
The examples set forth above are provided to those of ordinary skill in the art as a complete disclosure and description of how to make and use the embodiments of the disclosure, and are not intended to limit the scope of what the inventor/inventors regard as their disclosure.
Modifications of the above-described modes for carrying out the methods and systems herein disclosed that are obvious to persons of skill in the art are intended to be within the scope of the following claims. All patents and publications mentioned in the specification are indicative of the levels of skill of those skilled in the art to which the disclosure pertains. All references cited in this disclosure are incorporated by reference to the same extent as if each reference had been incorporated by reference in its entirety individually.
It is to be understood that the disclosure is not limited to particular methods or systems, which can, of course, vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to be limiting. As used in this specification and the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the content clearly dictates otherwise. The term “plurality” includes two or more referents unless the content clearly dictates otherwise. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the disclosure pertains.
Various aspects of the present invention may be appreciated from the following Enumerated Example Embodiments (EEEs):
This application claims priority to European patent application No. 23207885.7, filed Nov. 9, 2023, and U.S. Provisional Patent Application No. 63/588,077, filed Oct. 5, 2023, both of which are incorporated by reference herein in their entirety.