 
Patent Application
20240205627
The invention relates to the field of set-top boxes provided with at least one speaker and one camera.
There are plans to design set-top boxes (STBs) provided with new components in order to implement new functionalities.
These new components comprise, for example, one or more speakers, which make it possible for the set-top box to play back sound signals.
In reference to 
The set-top box 1, when it is in a “predefined” initial position and initial orientation, uses initial audio parameters such that the sound playback is optimised in an initial optimal listening position. This initial optimal listening position typically corresponds to the seated position of a user in an armchair or on a sofa, facing the television 2 and the set-top box 1, at a predefined distance from these. The initial audio parameters are, for example, defined in the factory, but could also be defined during a calibration phase performed at the time of installing the set-top box 1 at the user's home.
However, in service, it is quite possible that the position and/or the orientation of the set-top box 1 are modified by the user, either voluntarily or inadvertently. The initial audio parameters therefore no longer optimise the sound playback in the initial optimal listening position, although the user still uses that position. The user therefore no longer benefits optimally from the acoustical efficiency of their set-top box 1.
The invention aims to optimise the sound playback of a set-top box in service.
In view of achieving this aim, a method for optimising sound playback is proposed, implemented by a set-top box which comprises at least one speaker and to which at least one camera is secured, comprising the steps of:
The processing unit therefore uses the initial image and the current image to detect a change of position and/or orientation of the set-top box. A corrective action can thus be performed to correct the effects of this movement on the acoustical efficiency of the set-top box. The user can therefore best benefit from the qualities of the set-top box, even if this has been moved voluntarily or inadvertently.
In addition, an optimisation method such as described above is proposed, wherein the analysis of the current image and of the initial image comprises the steps of:
In addition, an optimisation method such as described above is proposed, wherein the analysis of the planar homography matrix comprises the steps of:
In addition, an optimisation method such as described above is proposed, wherein the analysis of the current image and of the initial image comprises the steps of:
In addition, an optimisation method such as described above is proposed, further comprising the steps of:
In addition, an optimisation method such as described above is proposed, wherein the corrective action comprises the steps of:
In addition, an optimisation method such as described above is proposed, comprising the steps of:
In addition, an optimisation method such as described above is proposed, wherein the new audio parameters comprise a gain applied on a current volume, which depends on the initial optimal listening position and on the new position and/or on the new orientation of the set-top box.
In addition, an optimisation method such as described above is proposed, wherein the set-top box comprises at least two speakers, the production of the new audio parameters comprising the step of adjusting an audio balance between said at least two speakers.
In addition, an optimisation method such as described above is proposed, wherein the set-top box comprises a first speaker and a second speaker, the optimisation method comprising the steps of:
In addition, an optimisation method such as described above is proposed, the optimisation method using, to define the new audio parameters, a predefined cross-reference table which associates precalculated audio parameters with distance and/or angle indications representative of the change of position and/or orientation of the set-top box.
In addition, an optimisation method such as described above is proposed, comprising the steps, to determine the new position of the set-top box, of:
In addition, an optimisation method such as described above is proposed, wherein the corrective action consists of defining a new optimal listening position associated with the new position and/or with the new orientation of the set-top box, and of indicating the new optimal listening position to the user, so that they use it.
In addition, an optimisation method such as described above is proposed, wherein the corrective action consists of emitting a message to a user of the set-top box, asking them to reposition the set-top box in the initial position and/or in the initial orientation.
In addition, a set-top box is proposed, comprising at least one speaker and to which at least one camera is secured, the set-top box further comprising a processing unit, wherein the optimisation method such as described above is implemented.
In addition, a computer program is proposed, comprising instructions which make the processing unit of the set-top box such as described above, execute the steps of the optimisation method such as described above.
In addition, a recording medium which can be read by a computer is proposed, on which the computer program such as described above is recorded.
The invention will be best understood in the light of the description below of particular, non-limiting embodiments of the invention.
Reference will be made to the accompanying drawings, among which:
    
    
    
    
    
    
    
    
    
    
In reference to 
The set-top box 11 is connected to the television 12 by an HDMI cable 13.
The set-top box 11 comprises, in this case, two speakers 14 comprising a first speaker 14a and a second speaker 14b.
The membrane of the first speaker 14a is located at a left face of the set-top box 11. The membrane of the second speaker 14b is located at a right face of the set-top box 11.
The set-top box 11 comprises an audio unit 16 arranged to acquire an audio stream and to produce first audio signals going to the first speaker 14a, and second audio signals going to the second speaker 14b, so as to playback sound signals corresponding to the audio stream.
The audio stream can be a single-channel or multichannel audio stream, can optionally accompany a video stream, and can come from any source, which is, for example, a broadcasting network (satellite television network, internet connection, digital terrestrial television (DTT) network, cable television network, etc.), another piece of equipment connected to the set-top box 11 (a CD, DVD or Blu-ray player, a smartphone, a tablet, etc.), or a storage medium (for example, a USB stick or a memory card connected to the set-top box 11).
The audio unit 16 comprises hardware and/or software components. These components comprise, in particular, one or more amplifiers. Some of these components implement an audio processing module 17 capable of applying and of modifying audio parameters to adjust the acoustical efficiency of the speakers 14a, 14b. A set of audio parameters forms an audio profile.
In particular, the audio processing module 17 makes it possible to distribute the channels of a multichannel audio stream between the speakers 14, so as to provide the spatialisation effect for the user. The audio processing module 17 can adapt the distribution, according to a position parameter of the user and to an angle defining the width of the optimal listening zone.
The set-top box 11 in addition comprises a camera 18.
The camera 18 is positioned at a central and upper portion of the front face 15 of the set-top box 11.
The set-top box 11 comprises an image processing module 19 arranged to acquire the images produced by the camera 18 and to apply signal processing algorithms on the images.
The set-top box 11 also comprises one or more microphones 20 arranged to capture sound signals present in the environment of the set-top box 11. The set-top box 11 in addition comprises an audio processing module 21 arranged to process and to record said captured sound signals.
The set-top box 11 also comprises a processing unit 22. The processing unit 22 comprises at least one processing component 23 (electronic and/or software), which is, for example, a “general-purpose” processor, a processor specialising in signal processing (or DSP, Digital Signal Processor), a microcontroller, or a programmable logic circuit, such as an FPGA (Field Programmable Gate Array) or an ASIC (Application Specific Integrated Circuit). The processing unit 22 also comprises one or more memories 24 (and, in particular, one or more non-volatile memories), connected to or integrated in the processing component 23. At least one of these memories 24 forms a recording medium which can be read by a computer, on which at least one computer program is recorded, comprising instructions which make the processing component 23 execute at least some of the steps of the optimisation method which will be described below.
The invention consists of comparing images of the scene located facing the set-top box 11 and captured at different points in time by the camera 18. This comparison aims to detect (and optionally also evaluate) movements of several objects of the scene and to deduce, from these movements, a possible change of position and/or orientation of the set-top box 11. If a change which can degrade the sound playback for the user is detected, a corrective action is performed.
In reference to 
The processing unit 22 acquires at least one initial image I0 produced by the camera 18, while the set-top box 11 is in an initial position and in an initial orientation: step E1. The initial image I0 is, for example, that of 
Then, the processing unit 22 acquires at least one current image In produced by the camera 18: step E2.
The processing unit 22 then analyses the current image In and the initial image I0 to detect a change of position and/or orientation of the set-top box 11: step E3.
The processing unit 22 then evaluates this change of position and/or orientation: step E4. If the change is not significant, and therefore has no impact on the acoustical efficiency, the method returns to step E2.
If the change of position and/or orientation is significant, the processing unit 22 performs at least one corrective action having the aim of optimising the sound playback following the change of position and/or orientation of the set-top box 11: step E5.
A first embodiment of the optimisation method is now described, in a more detailed manner.
In step E1, the set-top box 11 is in the initial position and in the initial orientation, which are the nominal position and orientation. In reference to 
It is noted that in this position and in the initial image I0, the axis X1 and the axis Z1 of the first system R1 associated with the set-top box 11 are parallel respectively to the axis X and the axis Y of a reference system associated with the scene. The axis Z1 is, in this case, the optical axis A0 of the camera 18 when the set-top box 11 is in the initial position and the initial orientation.
The processing unit 22 acquires at least one initial image I0. Optionally, one or more processing operations, for example edge-detection filtering and/or colour equalisation, are applied to the initial image I0 to prepare it for the next step. This image I0 is saved in the non-volatile memory.
The set-top box 11 therefore has been calibrated (for example, in the factory, or manually by the user upon its first start-up) for the initial optimal playback position U.
The coordinates of U in the system R1 (X1, Y1, Z1) are denoted (ux, uy, uz). The U position is therefore known.
The result of this calibration is that the set-top box 11, from its start-up following said calibration (that is, from its first start-up at the user's home if the calibration has been performed in the factory), uses initial audio parameters to optimise the sound playback. These initial audio parameters define a default sound profile. However, this adjustment makes it possible to obtain an optimised playback in the initial optimal playback position U only when the set-top box 11 is in the initial position and in the initial orientation.
In step E2, the processing unit 22 acquires one or more current images In (with n≠0), which are captured after the acquisition of the initial image I0.
In a first embodiment, the processing unit 22 acquires a new capture of the scene (i.e. a new current image In) upon each start-up of the set-top box 11.
In a second embodiment, the processing unit 22 acquires a capture of the scene at regular intervals, for example every second, every minute or every 30 minutes.
In a third embodiment, the processing unit 22 acquires, each day, a capture of the scene at a predefined time, for example, at 12:00 pm.
The processing unit 22 can validate the capture, and therefore accept it, only if this meets one or more of the predefined criteria. For example, a capture of the scene can be considered as accepted, if the luminosity L of the scene is greater than a predetermined luminosity threshold.
For example, it is defined that the capture is accepted, if:
  
    
  
This information is generally accessible directly on the sensor embedded in the camera 18.
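The acceptance test on luminosity can be sketched as follows. This is a minimal illustration, not the application's implementation: the function names and the threshold value are hypothetical, and the helper that averages pixel intensities is only an assumed fallback for the case where the sensor does not report the luminosity directly.

```python
from typing import Sequence


def mean_luminosity(pixels: Sequence[Sequence[float]]) -> float:
    """Average intensity of a greyscale image given as rows of values in [0, 1].

    Assumed fallback only: the text states that the luminosity is
    generally read directly from the camera sensor.
    """
    total = sum(sum(row) for row in pixels)
    count = sum(len(row) for row in pixels)
    if count == 0:
        raise ValueError("empty image")
    return total / count


def capture_is_accepted(luminosity: float, threshold: float) -> bool:
    """Accept the capture only if the luminosity L exceeds the threshold."""
    return luminosity > threshold


# Hypothetical threshold, chosen for illustration only.
LUMINOSITY_THRESHOLD = 0.3
image = [[0.2, 0.4], [0.6, 0.8]]
accepted = capture_is_accepted(mean_luminosity(image), LUMINOSITY_THRESHOLD)
```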
In step E3, the processing unit 22 analyses the current image(s) In and the initial image(s) I0 to detect a change of position and/or orientation of the set-top box 11. This movement can be, for example, defined only by a rotation angle about the vertical axis, in which case the movement is a change of orientation of the set-top box 11.
It is considered, for example, that the processing unit 22 analyses one single current image In and one single initial image I0.
In a first embodiment, the analysis of the current image In and of the initial image I0 consists of determining a planar homography matrix making it possible to pass from the current image In to the initial image I0, then of analysing said planar homography matrix to detect the change of position and/or orientation of the set-top box 11.
The planar homography transformation technique makes it possible to find the coordinates of a point in one image of a three-dimensional scene from the same point in another image of this same scene.
The article entitled “Creating Full View Panoramic Image Mosaics and Environment Maps” by Richard Szeliski and Heung-Yeung Shum, published on 3 Aug. 1997 in “SIGGRAPH '97: Proceedings of the 24th annual conference on Computer graphics and interactive techniques” (ISBN 978-0-89791-896-1), pages 251-258, deals with the transformation of images to obtain a panorama. It describes the transformation matrix M with 8 coefficients (m0 . . . m7), which makes it possible to pass from an image 1 to an image 2, each being a photograph of the same scene from a different viewpoint. The matrix M is a planar homography matrix.
It is noted that the matrix M can be decomposed according to the method described in the document “Decomposition of Homogenous 4×4 Matrices”, Rammi (rammi@caff.de), Apr. 14, 2020.
This decomposition makes it possible to express the matrix M as follows:
  
    
  
where P is a projection matrix, T is a translation matrix, R is a rotation matrix, H is a shearing matrix and S is a scaling matrix.
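For a point P1(x, y, 1) and a point P2(x′, y′, 1) in homogeneous coordinates, the matrix M is written:
$$P_2 \sim M P_1, \qquad M = \begin{pmatrix} m_0 & m_1 & m_2 \\ m_3 & m_4 & m_5 \\ m_6 & m_7 & 1 \end{pmatrix}$$
The (x′, y′) equations are given by:
$$x' = \frac{m_0 x + m_1 y + m_2}{m_6 x + m_7 y + 1}$$
$$y' = \frac{m_3 x + m_4 y + m_5}{m_6 x + m_7 y + 1}$$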
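If the coefficients are those of an identity matrix (m0 = m4 = 1 and m1 = m2 = m3 = m5 = m6 = m7 = 0), no movement has been detected.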
Otherwise, where M is not an identity matrix, a movement has been detected.
To perform the analysis of the planar homography matrix M, the processing unit 22 therefore compares said planar homography matrix M with an identity matrix.
The processing unit 22 does not detect a change of position, nor orientation of the set-top box 11 when an absolute value of a difference between each element of the planar homography matrix M and a corresponding element of the identity matrix is less than a predetermined first detection threshold. The processing unit 22 detects a change of position and/or orientation of the set-top box 11 otherwise.
Indeed, very small movements of the set-top box 11 and/or of objects in the environment will not have an actual impact on the quality of the sound playback. The processing unit 22 therefore adds a margin of error to the detection of an identity matrix.
In this case, the predetermined first detection thresholds are equal for all the elements of the matrix M and are, for example, equal to 1%.
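The element-wise comparison with the identity matrix can be sketched as follows. This is an illustrative sketch only: the function name is hypothetical, the matrix is assumed to be a 3×3 nested list whose last element is 1, and the 1% threshold is interpreted here as an absolute tolerance of 0.01 on each element.

```python
from typing import Sequence

# 3x3 identity matrix used as the "no movement" reference.
IDENTITY = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]


def movement_detected(m: Sequence[Sequence[float]], threshold: float = 0.01) -> bool:
    """Return True when M differs from the identity matrix.

    No change is detected when every |M[i][j] - I[i][j]| is strictly
    below the threshold; a change is detected otherwise.
    """
    return any(
        abs(m[i][j] - IDENTITY[i][j]) >= threshold
        for i in range(3)
        for j in range(3)
    )


# A matrix within the margin of error, and one with a clear translation term.
almost_identity = [[1.0, 0.0, 0.005], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
shifted = [[1.0, 0.0, 0.05], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
```

Using the same threshold for every element mirrors the example in the text; a per-element threshold table would be a straightforward variation.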
The processing unit 22 does not therefore detect a change of position, nor orientation of the set-top box 11, when:
  
    
  
  
    
  
  
    
  
  
    
  
  
    
  
  
    
  
In reference to 
Applying a movement in the reference frame of the scene amounts to moving the optimal listening position in the reference frame of the camera 18.
In the second system R2 (X2, Y2, Z2), associated with the set-top box 11 after its movement, the point U (ux, uy, uz), i.e. the initial optimal listening position, has as coordinates those of the point U′ (u′x, u′y, u′z) transformed by the matrix M.
Therefore, in R2: U ~ M × U′
  
    
  
  
    
  
The new optimal listening position becomes U′.
Following step E4, if the movement undergone by the set-top box 11 is significant, the processing unit 22 performs, in step E5, at least one corrective action having the aim of optimising the sound playback following the change of position and/or orientation of the set-top box 11.
In a first embodiment, if the change of position and/or orientation of the set-top box 11 is detected and considered significant, the processing unit 22 emits a message to the user of the set-top box 11 asking them to reposition the set-top box 11 in the initial position and/or in the initial orientation. The corrective action therefore consists of emitting this alert message.
The user is alerted, for example, by a message on the screen of the television 12, by a sound signal or by a light signal coming, for example, from a light-emitting diode integrated in the set-top box 11, or by a message sent by any radio means. This message asks the user to return the set-top box 11 into the initial position and/or into the initial orientation.
In a second embodiment, the processing unit 22 defines a new optimal listening position associated with the new position and/or with the new orientation of the set-top box 11. The new optimal listening position is therefore the position U′ in 
The processing unit 22 indicates to the user of the set-top box 11 the new optimal listening position U′, so that they use it.
The corrective action therefore consists of defining the new optimal listening position, and of indicating this new optimal listening position to the user.
In a third embodiment, the processing unit 22 determines the new position and/or the new orientation of the set-top box 11, and produces new audio parameters to optimise the sound playback, while the set-top box 11 is in the new position and/or the new orientation.
The new audio parameters can be generated at the initiative of the user. The processing unit 22 offers the user the option of triggering a calibration of the audio profile by automatic adjustment of the parameters.
Alternatively, the processing unit 22 performs this adjustment automatically as a background task, without intervention of the user.
Before performing the adjustment, the processing unit 22 verifies a reliability criterion on the detection of the new position and/or the new orientation of the set-top box 11.
The reliability criterion is that the value of the error, calculated according to equation (13) of the document mentioned above (Creating Full View Panoramic Image Mosaics and Environment Maps, Richard Szeliski and Heung-Yeung Shum), is less than (for example, less than or equal to) a predefined reliability threshold.
The error is given by:
  
    
  
This formula uses L0 and L1, which are the standardised intensities (between 0 and 1) of the reference image and of the newly captured image respectively.
Then, the sum of the squared differences of each pixel intensity “i” is calculated.
Finally, this result must be divided by the total number of pixels.
This gives the standardised overall intensity error between 0 and 1.
The following is had, if:
Indeed, if the initial image I0 and the current image In are completely uncorrelated, therefore without a common point, the processing unit 22 cannot reliably estimate the coordinates ux, uy of the point U in the system R2.
The predefined reliability threshold is, for example, equal to 40%.
The processing unit 22 considers that the initial image I0 and the current image In are not close enough if the error is greater than the predefined reliability threshold (40%, for example), which can be interpreted as the initial image I0 and the current image In overlapping by 60%.
In this case, the processing unit 22 does not adjust the audio parameters, but only alerts the user of the movement.
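The reliability check described above (sum of squared intensity differences, divided by the number of pixels, then compared with the threshold) can be sketched as follows. The function names are hypothetical, and the sketch assumes that the two images have already been put into pixel-to-pixel correspondence and flattened into sequences of intensities between 0 and 1.

```python
from typing import Sequence


def normalised_intensity_error(ref: Sequence[float], cur: Sequence[float]) -> float:
    """Mean of the squared differences between standardised intensities.

    `ref` holds the L0 values (reference image) and `cur` the L1 values
    (newly captured image), pixel by pixel. The result lies in [0, 1].
    """
    if len(ref) != len(cur) or len(ref) == 0:
        raise ValueError("images must be non-empty and of the same size")
    return sum((c - r) ** 2 for r, c in zip(ref, cur)) / len(ref)


def detection_is_reliable(error: float, threshold: float = 0.40) -> bool:
    """Reliability criterion: error less than or equal to the threshold."""
    return error <= threshold


reference = [0.0, 0.5, 1.0, 0.5]
current = [0.0, 0.5, 1.0, 0.5]
reliable = detection_is_reliable(normalised_intensity_error(reference, current))
```

When the criterion fails, the text indicates that the audio parameters are left unchanged and the user is only alerted.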
If the reliability criterion is verified, the processing unit 22 adjusts the audio parameters, such that the sound playback is again optimised in the initial optimal listening position U.
The corrective action therefore consists of adjusting the audio parameters, such that the sound playback is again optimised in the initial optimal listening position U. The automatic adjustment of the audio parameters according to the movement of the set-top box 11 can be done several ways.
The new audio parameters can comprise a gain applied onto a current volume, which depends on the initial optimal listening position U and on the new position and/or on the new orientation of the set-top box 11.
This solution is particularly suitable in the case where the set-top box 11 only comprises one single speaker (single-channel system).
The processing unit 22 adjusts the volume of the speaker by applying, for example, a gain onto the current volume which is proportional to the offset of the initial optimal listening position U from the optical axis A0 of the camera 18 (this offset is equal to the distance separating U from its orthogonal projection onto A0) or with respect to the abscissa of the new optimal listening point U′.
The translation matrix T and/or the rotation matrix R can be used to calculate the distances [OU] and [OU′] (O being the origin of the second system), and to take the ratio between the two distances to obtain a factor to be applied onto the current volume of the speaker.
Optionally, this gain can be limited, such that the total current volume (equal to the sum of the present current volume and of the gain), does not exceed a maximum authorised volume. This volume limit can be adjusted by the user, or be automatically levelled according to the current volume of the room captured by the microphones 20 of the set-top box 11.
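The distance-ratio gain and its limitation can be sketched as follows. Several points are assumptions made for illustration and are not stated in the text: which distance goes in the numerator of the ratio (the sketch uses [OU]/[OU′]), the fact that the factor is applied multiplicatively, and all function names.

```python
import math
from typing import Sequence


def distance_from_origin(point: Sequence[float]) -> float:
    """Euclidean distance between the origin O and a point."""
    return math.sqrt(sum(c * c for c in point))


def volume_factor(u: Sequence[float], u_prime: Sequence[float]) -> float:
    """Ratio between the distances [OU] and [OU'].

    Assumption: [OU] is the numerator. The text only says that the ratio
    between the two distances gives the factor.
    """
    d_u_prime = distance_from_origin(u_prime)
    if d_u_prime == 0:
        raise ValueError("U' coincides with the origin")
    return distance_from_origin(u) / d_u_prime


def adjusted_volume(current: float, factor: float, max_volume: float) -> float:
    """Apply the factor, never exceeding the maximum authorised volume."""
    return min(current * factor, max_volume)


factor = volume_factor((0.0, 0.0, 4.0), (0.0, 0.0, 2.0))
new_volume = adjusted_volume(20.0, factor, max_volume=50.0)
```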
The processing unit 22, to produce the new audio parameters, can also adjust an audio balance between the first speaker 14a and the second speaker 14b. This solution therefore requires at least two speakers (stereo system).
In reference to 
The processing unit 22 adjusts the balances according to the position of U in the current image In and, more specifically, according to the position of U with respect to the axis OY.
In 
The increase in volume depends, in this case, on the offset of the point U from the axis OY. In this case, for example, the processing unit 22 increases the volume of the speaker farthest from U by a ratio equal to the distance between U and the axis OY divided by the length of the half-image defined on the side of the axis OY where the point U is positioned.
In this case, the point U is located halfway (50%) across the current half-image located on the left of the axis OY, and the processing unit 22 increases the volume of the second speaker 14b by 50%.
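The balance rule can be sketched as follows. The sketch assumes, for illustration, that the axis OY is the vertical centre line of the current image and that the position of U is given as a pixel abscissa; the function name and the speaker labels are hypothetical.

```python
from typing import Optional, Tuple


def balance_boost(u_x: float, image_width: float) -> Tuple[Optional[str], float]:
    """Return which speaker to boost and by what ratio.

    The ratio is the distance between U and the axis OY divided by the
    length of the half-image on the side where U lies. The boosted
    speaker is the one farthest from U.
    """
    half = image_width / 2.0
    offset = u_x - half  # negative when U is on the left of the axis OY
    ratio = min(abs(offset) / half, 1.0)
    if offset < 0:
        return ("second", ratio)  # U on the left: boost the right speaker 14b
    if offset > 0:
        return ("first", ratio)   # U on the right: boost the left speaker 14a
    return (None, 0.0)            # U on the axis: balance unchanged


# U halfway across the left half-image of a 640-pixel-wide image.
speaker, ratio = balance_boost(160.0, 640.0)
```

With these inputs the sketch reproduces the example in the text: the second speaker 14b is boosted by 50%.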
In a multichannel system (two speakers or more), the processing unit 22 can use a so-called “ambisonic” spatialisation technique.
In reference to 
According to this embodiment, it is considered that the virtual sources do not change position after the movement of the set-top box 11.
The processing unit 22 first determines, by using the planar homography matrix M, a first angle β1 between a reference axis of the initial image I0 passing through the initial optimal listening position U, and a first current axis An1 passing through the initial optimal listening position U and the first speaker 14a in the current image In, and a second angle β2 between the reference axis and a second current axis An2 passing through the initial optimal listening position U and the second speaker 14b in the current image In.
The reference axis is, in this case, the optical axis A0 of the camera 18, which passes through the initial optimal listening position U and through the position of the camera 18 on the set-top box 11, when the latter is in the initial position and in the initial orientation.
The processing unit 22 also determines, by using the planar homography matrix M, a first distance d1 between the initial optimal listening position U and the first speaker 14a, and a second distance d2 between the initial optimal listening position U and the second speaker 14b.
The processing unit 22 uses the ambisonic method to place virtual sound sources 25 around the initial optimal listening position U, by calculating gains which depend on the first angle β1, on the second angle β2, on the first distance d1 and on the second distance d2, the new audio parameters comprising said gains.
The ambisonic method therefore requires determining the values (d1, β1) and (d2, β2), d1 being the first distance, β1 being the first angle, d2 being the second distance and β2 being the second angle.
This method requires defining the initial position and initial orientation of the set-top box 11 in the initial system (before its movement).
In 
The processing unit 22 is thus capable, by using the ambisonic method, of finding the gains to be applied onto each speaker 14a, 14b to reconstitute four virtual sources 25: the source C positioned at the centre in front of the user, the source G on the left, the source D on the right and the source R behind the user.
It is noted that in 
  
    
  
  
    
  
  
    
  
  
    
  
In this case, C, D, G and R represent the audio signals emitted by the respective virtual sources.
To define the new audio parameters, the processing unit 22 can use a predefined cross-reference table 26 which associates precalculated audio parameters with distance and/or angle indications representative of the change of position and/or orientation of the set-top box 11. This predefined cross-reference table 26, is, for example, stored in one of the non-volatile memories.
The predefined cross-reference table 26 comprises, for example, a plurality of triplets of values (Δx, Δz, Δθ) and parameters Gi,j, each triplet of values (Δx, Δz, Δθ) being associated with a set of values of gains Gi,j.
(Δx, Δz) represents a distance unit step in the system (X1, Z1), equal to 50 cm for example (this value corresponds to a distance before projection of the matrix P).
Δθ represents an angle step about the axis Y, equal to 15° for example (this value comes from the rotation matrix R).
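A lookup in such a cross-reference table can be sketched as follows. The table contents, the gain values and the function names are hypothetical placeholders; only the step sizes (50 cm and 15°) come from the examples in the text. The sketch quantises the measured movement to the nearest step and uses the resulting triplet of step indices as the key.

```python
from typing import Dict, Optional, Tuple

STEP_DISTANCE_CM = 50.0  # unit step for Δx and Δz (example value from the text)
STEP_ANGLE_DEG = 15.0    # unit step for Δθ (example value from the text)

# Hypothetical table: (Δx, Δz, Δθ) step indices -> gains for speakers 14a, 14b.
GAIN_TABLE: Dict[Tuple[int, int, int], Tuple[float, float]] = {
    (0, 0, 0): (1.0, 1.0),
    (1, 0, 0): (1.2, 0.9),
    (0, 0, 1): (0.9, 1.2),
}


def quantise(value: float, step: float) -> int:
    """Index of the step closest to the measured value."""
    return round(value / step)


def lookup_gains(
    dx_cm: float, dz_cm: float, dtheta_deg: float
) -> Optional[Tuple[float, float]]:
    """Return the precalculated gains, or None if the movement is not tabulated."""
    key = (
        quantise(dx_cm, STEP_DISTANCE_CM),
        quantise(dz_cm, STEP_DISTANCE_CM),
        quantise(dtheta_deg, STEP_ANGLE_DEG),
    )
    return GAIN_TABLE.get(key)


gains = lookup_gains(45.0, 5.0, 2.0)
```

A table keeps the run-time cost to a dictionary access, at the price of a resolution limited to the chosen steps.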
Now, a second embodiment of the sound playback optimisation method is described.
This embodiment uses an algorithm for detecting lines formed by the objects of the scene photographed by the camera 18. In reference to 
Again, 
In step E1, the processing unit 22 acquires the initial image I0.
In step E2, the processing unit 22 acquires the current image In.
In step E3, and in reference to 
The polar coordinates of each first line D0i are saved in a first database B0 stored, for example, in one of the non-volatile memories (so that they can be retrieved later). For example, if a straight line D01 is defined by the coordinates (ρ01, θ01), the first database B0 will comprise the association:
  
    
  
More generally, for each straight line D0i of the image I0, the database will comprise the association:
  
    
  
Optionally, only the straight lines with a vertical orientation can be considered, for example those whose angle θ lies within a predefined interval, which is, for example, the interval
  
    
  
The processing unit 22 thus only detects the translations on the abscissa axis and the rotations about the ordinate axis. In this way, the calculations are simplified, without degrading the detection, as this interval corresponds to the majority of cases of use. Indeed, it can be considered that the set-top box 11 is itself horizontally aligned and positioned on a flat surface.
Then, the processing unit 22 detects in the current image In, by using the Hough transform, a second number of second lines each having second polar coordinates.
The processing unit 22 applies for each current image In the same algorithm as that described for the initial image I0.
The processing unit 22 thus produces a second database B1, formed by straight lines:
  
    
  
The processing unit 22 thus evaluates a number of lines common to the initial image I0 and to the current image In.
The comparison between the initial image I0 and the current image In is made by counting the number of lines which are common to the two images by using the following algorithm.
In this case:
The algorithm is as follows:
  
    
      
        
        
        
        
          
            
            
          
        
        
          
            
            
            
          
          
            
            
            
          
          
            
            
            
          
          
            
            
            
          
        
      
      
        
        
        
          
            
            
          
        
      
      
        
        
        
        
          
            
            
            
          
          
            
            
            
          
          
            
            
            
          
          
            
            
            
          
          
            
            
          
        
      
    
  
The processing unit 22 detects a change of position and/or orientation of the set-top box 11 if the ratio of the number of common lines and of the total number of lines is less than a predetermined detection threshold.
The predetermined detection threshold is, for example, equal to 70%: the set-top box 11 is considered as having undergone a movement if L/Lmax, with Lmax = min(N, M), is less than 0.7. Step E5 can then be triggered. Otherwise, the method returns to step E2. The second database B1 can then be reset, and a new cycle restarts.
It is noted that, optionally, a tolerance T (Tρ, Tθ) can be introduced for the coordinates of the straight lines, in order to avoid detecting movements of the set-top box 11 which are too small and therefore have no actual impact on the audio efficiency. Each equality test in line 4 above would become, for example:
It can happen that L is not representative, because the image I0 or In lacks elements from which enough straight lines can be calculated.
The processing unit 22 therefore calculates a confidence index making it possible to estimate a confidence in the result of the detection, then compares the confidence index with a predetermined confidence threshold. The processing unit 22 then decides, according to the result of said comparison, to validate, or not, the result of the step of detecting the change of position and/or orientation.
The confidence index, in this case, depends on the first number N and on the second number M.
The confidence index is, in this case:
  
    
  
F is therefore the normalised difference between the first number and the second number.
The closer F is to 0, the more reliable the detection.
The processing unit 22 considers that the result of the detection step is reliable, only if:
F<U,
where U is the predetermined confidence threshold.
For example, the following is fixed: U=0.2.
The number of lines L counted as identical is then representative of the movement of the set-top box 11.
The processing unit 22 uses this confidence index only if N>2 and M>2. If one of these numbers is less than or equal to 1, the processing unit 22 considers that the detection is not reliable without even calculating the value of F.
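The validation logic described above can be sketched as follows. The normalisation used here, |N − M| / max(N, M), is an assumption chosen to match the description of F as a normalised difference that tends to 0 when N and M are close; the exact expression, the strictness of the comparison with U and the function names are hypothetical. Any value of N or M not strictly greater than 2 is treated as unreliable, which is one reading of the guard stated in the text.

```python
from typing import Optional


def confidence_index(n: int, m: int) -> Optional[float]:
    """Return F, or None when the index is not used.

    n: number of lines in the initial image (N).
    m: number of lines in the current image (M).
    """
    # Guard from the text: the index is only used when N > 2 and M > 2.
    if n <= 2 or m <= 2:
        return None
    # ASSUMPTION: normalisation by max(N, M); not confirmed by the source.
    return abs(n - m) / max(n, m)


def detection_is_reliable(n: int, m: int, u: float = 0.2) -> bool:
    """Validate the detection when F is below the confidence threshold U."""
    f = confidence_index(n, m)
    return f is not None and f < u
```

When `detection_is_reliable` returns False, the caller would acquire a new current image and repeat the detection steps.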
If the processing unit 22 does not validate the step of detecting the change of position and/or orientation, the processing unit 22 acquires a new current image and repeats the steps which have just been described.
The use of the Hough transform also makes it possible to proceed with a calculation of the effective movement of the set-top box 11.
The processing unit 22 compares the first database B0 and the second database B1. For all the lines present in the first database B0 and the second database B1, the processing unit 22 counts the number of identical angles θ, regardless of the (normal) distances ρ. Two angles are considered identical if their tangents are equal, or differ by no more than 0.001, for example. The processing unit 22 detects a particular angle which is the most present in the set of polar coordinates comprising the first polar coordinates of the first lines (first database B0) and the second polar coordinates of the second lines (second database B1).
This particular angle, which is the most represented in the first database B0 and the second database B1, is called θmax.
If the number of times that said particular angle is present in the set of polar coordinates is greater than a predetermined angle threshold, the processing unit 22 deduces from this that the set-top box 11 has undergone a lateral movement perpendicular to the optical axis Ao of the camera 18. More specifically, if the number of occurrences of lines oriented at θmax is greater than the predetermined angle threshold, for example equal to 80% of the total number of different angles referenced in the two databases B0, B1, the processing unit 22 deduces from this that the set-top box 11 has been translated in the horizontal plane of the optical axis Ao of the camera 18.
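The search for θmax and the comparison with the angle threshold can be sketched as follows. This is a minimal sketch under assumptions: lines are (ρ, θ) pairs with θ in radians, angles are grouped against the first angle seen in each group, and the threshold is taken literally from the wording above as 80% of the number of different angles. The function names are hypothetical.

```python
import math


def dominant_angle(b0, b1, tan_tol=0.001):
    """Return (theta_max, occurrences, distinct) over the pooled lines of B0 and B1.

    Two angles fall in the same group when their tangents differ by at most
    tan_tol. Each group is represented by the first angle that created it.
    """
    groups = []  # each entry: [representative theta, occurrence count]
    for _rho, theta in list(b0) + list(b1):
        for group in groups:
            if abs(math.tan(theta) - math.tan(group[0])) <= tan_tol:
                group[1] += 1
                break
        else:
            groups.append([theta, 1])
    if not groups:
        return None, 0, 0
    theta_max, occurrences = max(groups, key=lambda g: g[1])
    return theta_max, occurrences, len(groups)


def is_lateral_translation(b0, b1, ratio=0.8):
    """Literal reading of the text: occurrences > ratio * number of different angles."""
    theta_max, occurrences, distinct = dominant_angle(b0, b1)
    return theta_max is not None and occurrences > ratio * distinct
```

Other readings of the threshold (for instance a fraction of the total number of lines) would only change the last comparison.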
The processing unit 22 then estimates said lateral movement according to the distances of the first polar coordinates and of the second polar coordinates of those first lines and second lines whose angle is the particular angle.
For the set of straight lines {D0i, D1i} for which the angle is identical and the most represented (angle equal to θmax), the average lateral movement in pixels is thus expressed as:
If Dave>0, the processing unit 22 deduces from this that the set-top box 11 has moved to the left with respect to its initial position in the scene S.
If Dave<0, the processing unit 22 deduces from this that the set-top box 11 has moved to the right with respect to its initial position in the scene S.
If Dave≈0, the processing unit 22 deduces from this that the set-top box 11 has not been moved laterally.
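The three cases above can be sketched as follows. The expression used for Dave, the mean of ρ0i − ρ1i over the paired lines at θmax, is an assumption: the order of the subtraction, the pairing of the lines and the dead band `eps` used for "Dave≈0" are hypothetical. Only the mapping from the sign of Dave to a direction follows the text.

```python
from statistics import fmean


def average_lateral_shift(pairs):
    """Average shift in pixels over paired lines oriented at theta_max.

    pairs: sequence of (rho0, rho1), one per pair (D0i, D1i).
    ASSUMPTION: Dave is the mean of rho0 - rho1; the sign convention is not
    confirmed by the source.
    """
    return fmean(rho0 - rho1 for rho0, rho1 in pairs)


def lateral_direction(d_ave, eps=1.0):
    """Map Dave to a direction; eps is an assumed dead band in pixels."""
    if abs(d_ave) <= eps:
        return "none"
    return "left" if d_ave > 0 else "right"
```

For example, two pairs with distances (100, 90) and (50, 44) give an average shift of 8 pixels, which this sketch classifies as a movement to the left.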
The processing unit 22 thus detects that the set-top box 11 has been moved. The corrective action thus consists of emitting the alert message to alert the user of the movement.
It is noted that, once the analysis of the current image and of the initial image has made it possible to detect a change of position and/or orientation of the set-top box 11, and once the corrective action has been performed, a new cycle starts, to detect a subsequent change of position and/or orientation.
The optimisation method restarts.
The current image In becomes the new initial image I0, and the new audio parameters become the initial audio parameters. The set-top box acquires at least one new current image, then analyses the new current image and the new initial image to detect the subsequent change of position and/or orientation of the set-top box.
Naturally, the invention is not limited to the embodiments described, but includes any variant entering into the field of the invention such as defined by the claims.
The different steps of the optimisation method are not necessarily all implemented in the set-top box. All or some of these steps could be implemented in one or more different pieces of equipment, and for example remotely, on the cloud.
Although it has been indicated in this case that the camera is positioned at a central and upper portion of the front face of the set-top box, this configuration is not limiting. The camera could be off-centre. Likewise, the set-top box can be of any shape. The set-top box could have an asymmetrical geometric shape, or a circular or spherical shape.
The camera is not necessarily integrated in the set-top box, but must be secured to it (i.e. it undergoes the same movements); it could, for example, be mounted on a support fixed on the upper face of the set-top box.
In this case, a set-top box comprising two speakers located on its sides has been described. The invention naturally applies to different configurations, for example to set-top boxes comprising one single speaker, or to set-top boxes comprising four speakers, including a low-frequency speaker whose membrane opens onto a lower or upper face of the set-top box. It is noted that, in the configuration where the set-top box integrates such a low-frequency speaker, that speaker is not affected by the adjustment of the balance and, more generally, of the audio parameters.
| Number | Date | Country | Kind | 
|---|---|---|---|
| FR2213840 | Dec 2022 | FR | national |