This invention relates generally to the field of audio engineering and digital signal processing, and more specifically to systems and methods for enabling users to more easily self-fit a sound processing algorithm, for example by perceptually uncoupling fitting parameters on a 2D graphical user interface.
Fitting a sound personalization DSP algorithm is typically an automatic process: a user takes a hearing test, a hearing profile is generated, and DSP parameters are calculated and then output to the algorithm. Although this may objectively improve the listening experience by providing greater richness and clarity to an audio file, the parameterization may not be ideal, as the fitting methodology fails to take into account the subjective hearing preferences of the user (such as preference levels for coloration and compression). Moreover, navigating the tremendous number of variables that comprise a DSP parameter set, such as the ratio, threshold, and gain settings for every DSP subband, would be cumbersome and difficult.
Accordingly, it is an object of this invention to provide improved systems and methods for fitting a sound processing algorithm by first fitting the algorithm with a user's hearing profile, then allowing a user on a two-dimensional (2D) interface to subjectively fit the algorithm through an intuitive process, specifically through the perceptual uncoupling of fitting parameters, which allows a user to more readily navigate DSP parameters on an x- and y-axis.
The problems and issues faced by conventional solutions will be at least partially solved according to one or more aspects of the present disclosure. Various features according to the disclosure are specified within the independent claims, additional implementations of which will be shown in the dependent claims. The features of the claims can be combined in any technically meaningful way, and the explanations from the following specification as well as features from the figures which show additional embodiments of the invention can be considered.
According to an aspect of the present disclosure, provided are systems and methods for fitting a sound processing algorithm in a two-dimensional space using interlinked parameters.
Unless otherwise defined, all technical terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this technology belongs.
The term “sound personalization algorithm”, as used herein, is defined as any digital signal processing (DSP) algorithm that processes an audio signal to enhance the clarity of the signal to a listener. The DSP algorithm may be, for example: an equalizer, an audio processing function that works on the subband level of an audio signal, a multiband compressive system, or a non-linear audio processing algorithm.
The term “audio output device”, as used herein, is defined as any device that outputs audio, including, but not limited to: mobile phones, computers, televisions, hearing aids, headphones, smart speakers, hearables, and/or speaker systems.
The term “hearing test”, as used herein, is defined as any test that evaluates a user's hearing health, more specifically any test administered using a transducer that outputs a sound wave. The test may be a threshold test or a suprathreshold test, including, but not limited to, a psychophysical tuning curve (PTC) test, a masked threshold (MT) test, a pure tone threshold (PTT) test, and a cross-frequency simultaneous masking (xF-SM) test.
The term “coloration”, as used herein, refers to the power spectrum of an audio signal. For instance, white noise has a flat frequency spectrum when plotted as a linear function of frequency.
The term “compression”, as used herein, refers to dynamic range compression, an audio signal processing operation that reduces the signal level of loud sounds or amplifies quiet sounds.
One or more aspects described herein with respect to methods of the present disclosure may be applied in a same or similar way to an apparatus and/or system having at least one processor and at least one memory to store programming instructions or computer program code and data, the at least one memory and the computer program code configured to, with the at least one processor, cause the apparatus at least to perform the above functions. Alternatively, or additionally, the above apparatus may be implemented by circuitry.
One or more aspects of the present disclosure may be provided by a computer program comprising instructions for causing an apparatus to perform any one or more of the presently disclosed methods. One or more aspects of the present disclosure may be provided by a computer readable medium comprising program instructions for causing an apparatus to perform any one or more of the presently disclosed methods. One or more aspects of the present disclosure may be provided by a non-transitory computer readable medium, comprising program instructions stored thereon for performing any one or more of the presently disclosed methods.
Implementations of an apparatus of the present disclosure may include, but are not limited to, using one or more processors, one or more application specific integrated circuits (ASICs) and/or one or more field programmable gate arrays (FPGAs). Implementations of the apparatus may also include using other conventional and/or customized hardware such as software programmable processors.
It will be appreciated that method steps and apparatus features may be interchanged in many ways. In particular, the details of the disclosed apparatus can be implemented as a method, as the skilled person will appreciate.
Other and further embodiments of the present disclosure will become apparent during the course of the following discussion and by reference to the accompanying drawings.
In order to describe the manner in which the above-recited and other advantages and features of the disclosure can be obtained, a more particular description of the principles briefly described above will be rendered by reference to specific embodiments thereof, which are illustrated in the appended drawings. Understand that these drawings depict only example embodiments of the disclosure and are not therefore to be considered to be limiting of its scope; the principles herein are described and explained with additional specificity and detail through the use of the accompanying drawings, in which:
Various example embodiments of the disclosure are discussed in detail below. While specific implementations are discussed, it should be understood that these are described for illustration purposes only. A person skilled in the relevant art will recognize that other components and configurations may be used without departing from the spirit and scope of the disclosure.
Thus, the following description and drawings are illustrative and are not to be construed as limiting the scope of the embodiments described herein. Numerous specific details are described to provide a thorough understanding of the disclosure. However, in certain instances, well-known or conventional details are not described in order to avoid obscuring the description. References to “one embodiment” or “an embodiment” in the present disclosure can be references to the same embodiment or any embodiment; such references mean at least one of the embodiments.
Reference to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the disclosure. The appearances of the phrase “in one embodiment” in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Moreover, various features are described which may be exhibited by some embodiments and not by others.
The terms used in this specification generally have their ordinary meanings in the art, within the context of the disclosure, and in the specific context where each term is used. Alternative language and synonyms may be used for any one or more of the terms discussed herein, and no special significance should be placed upon whether or not a term is elaborated or discussed herein. In some cases, synonyms for certain terms are provided. A recital of one or more synonyms does not exclude the use of other synonyms. The use of examples anywhere in this specification including examples of any terms discussed herein is illustrative only and is not intended to further limit the scope and meaning of the disclosure or of any example term. Likewise, the disclosure is not limited to various embodiments given in this specification.
Without intent to limit the scope of the disclosure, examples of instruments, apparatus, methods and their related results according to the embodiments of the present disclosure are given below. Note that titles or subtitles may be used in the examples for convenience of a reader, which in no way should limit the scope of the disclosure. Unless otherwise defined, technical and scientific terms used herein have the meaning as commonly understood by one of ordinary skill in the art to which this disclosure pertains. In the case of conflict, the present document, including definitions, will control.
Additional features and advantages of the disclosure will be set forth in the description which follows, and in part will be obvious from the description, or can be learned by practice of the herein disclosed principles. The features and advantages of the disclosure can be realized and obtained by means of the instruments and combinations particularly pointed out in the appended claims. These and other features of the disclosure will become more fully apparent from the following description and appended claims or can be learned by the practice of the principles set forth herein.
It should be further noted that the description and drawings merely illustrate the principles of the proposed device. Those skilled in the art will be able to implement various arrangements that, although not explicitly described or shown herein, embody the principles of the invention and are included within its spirit and scope. Furthermore, all examples and embodiments outlined in the present document are principally intended expressly to be only for explanatory purposes to help the reader in understanding the principles of the proposed device. Furthermore, all statements herein providing principles, aspects, and embodiments of the invention, as well as specific examples thereof, are intended to encompass equivalents thereof.
The disclosure turns now to
To this extent,
Multiband dynamic processors are typically used to compensate for hearing impairments. In the fitting of a DSP algorithm based on a user's hearing thresholds, there are usually many parameters that can be altered, combinations of which lead to a desired outcome. In a system with a multiband dynamic range compressor, these adjustable parameters usually consist of at least a compression threshold for each band, which determines at which audio level the compressor becomes active, and a compression ratio, which determines how strongly the compressor reacts. Compression is applied to attenuate parts of the audio signal which exceed certain levels, so that lower-level parts of the signal can then be lifted via amplification. This is achieved via a gain stage in which a gain level can be added to each band.
According to aspects of the present disclosure, a two-dimensional (2D) space offers the opportunity to disentangle perceptual dimensions of sound to allow more flexibility during a fine-tuning fitting step, such as might be performed by or for a user of an audio output device (see, e.g., the example 2D interface of
O=t+(I−t)*r+g
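As a steady-state illustration, the relation above can be sketched in code. This is a minimal sketch, not taken from the disclosure, assuming levels are expressed in dB and that the ratio r is the output slope applied above the threshold t (so r = 0.5 corresponds to conventional 2:1 compression):

```python
def compressor_output(input_db, threshold_db, ratio, gain_db):
    """Steady-state output of a simple downward compressor, O = t + (I - t) * r + g.
    Levels are in dB; 'ratio' here is the output slope above threshold
    (e.g., 0.5 corresponds to conventional 2:1 compression)."""
    if input_db <= threshold_db:
        # Below threshold the compressor is inactive; only the gain stage applies.
        return input_db + gain_db
    return threshold_db + (input_db - threshold_db) * ratio + gain_db
```

For example, a -10 dB input through a band with a -30 dB threshold, a slope of 0.5, and 6 dB of gain yields -30 + 20 * 0.5 + 6 = -14 dB.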
In the context of providing a 2D fitting interface (such as the example 2D interface seen in
A more complex multiband dynamics processor than that of
Although this more complex multiband dynamics processor offers a number of benefits, it can potentially create a much less intuitive parameter space for some users to navigate, as there are more variables that may interact simultaneously and/or in an opaque manner. Accordingly, it can be even further desirable to provide systems and methods for perceptual disentanglement of compression and coloration in order to facilitate fitting with respect to complex processing schemes.
The output of this multiband dynamics processor can be calculated as:
O=[[(1−FFr)·FFt+I·FFr+FBt·FBc·FFr]/(1+FBc·FFr)]+g
Where O=output of multiband dynamics processor; I=input 401; g=gain 408; FBc=feed-back compressor 406 factor; FBt=feed-back compressor 406 threshold; FFr=feed-forward compressor 404 ratio; FFt=feed-forward compressor 404 threshold. Here again, as described above with respect to the multiband dynamics processor of the example of
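The combined feed-forward/feed-back relation above can likewise be sketched as a steady-state calculation (a hypothetical illustration; the variable names are chosen to mirror the symbols defined above):

```python
def multiband_stage_output(I, FFt, FFr, FBt, FBc, g):
    """Steady-state output of the combined feed-forward/feed-back stage:
    O = [((1 - FFr) * FFt + I * FFr + FBt * FBc * FFr) / (1 + FBc * FFr)] + g."""
    return ((1 - FFr) * FFt + I * FFr + FBt * FBc * FFr) / (1 + FBc * FFr) + g
```

Note that setting FBc to zero reduces the expression to FFt + (I − FFt) · FFr + g, i.e., the simpler feed-forward-only formula given earlier.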
Objective parameters may be calculated by any number of methods. For example, DSP parameters in a multiband dynamic processor may be calculated by optimizing perceptually relevant information (e.g., perceptual entropy), as disclosed in commonly owned U.S. Pat. No. 10,455,335. Alternatively, a user's masking contour curve in relation to a target masking curve may be used to determine DSP parameters, as disclosed in commonly owned U.S. Pat. No. 10,398,360. Other parameterization processes commonly known in the art may also be used to calculate objective parameters based off user-generated threshold and suprathreshold information without departing from the scope of the present disclosure. For instance, common fitting techniques for linear and non-linear DSP may be employed. Well known procedures for linear hearing aid algorithms include POGO, NAL, and DSL (see, e.g., H. Dillon, Hearing Aids, 2nd Edition, Boomerang Press, 2012).
Objective DSP parameter sets may be also calculated indirectly from a user hearing test based on preexisting entries or anchor points in a server database. An anchor point comprises a typical hearing profile constructed based at least in part on demographic information, such as age and sex, in which DSP parameter sets are calculated and stored on the server to serve as reference markers. Indirect calculation of DSP parameter sets bypasses direct parameter sets calculation by finding the closest matching hearing profile(s) and importing (or interpolating) those values for the user.
√((d5a − d1a)² + (d6b − d2b)² + …) < √((d5a − d9a)² + (d6b − d10b)² + …)
√((y1 − x1)² + (y2 − x2)² + …) < √((y1 − z1)² + (y2 − z2)² + …)
As would be appreciated by one of ordinary skill in the art, other methods may be used to quantify similarity amongst user hearing profile graphs, including, but not limited to, squared-distance comparisons that omit the square root, e.g., ((y1 − x1)² + (y2 − x2)² + … < (y1 − z1)² + (y2 − z2)² + …), or other statistical methods known in the art. For indirect DSP parameter set calculation, the closest matching hearing profile(s) between a user and other preexisting database entries or anchor points can then be used.
DSP parameter sets may be interpolated linearly, e.g., a DRC ratio value of 0.7 for user 5 (u_id)5 and 0.8 for user 3 (u_id)3 would be interpolated as 0.75 for user 200 (u_id)200 in the example of
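The anchor-matching and interpolation steps described above can be sketched as follows. This is a simplified illustration, assuming hearing profiles are represented as numeric vectors and DSP parameter sets as dictionaries (representations not specified by the disclosure):

```python
import math

def euclidean(a, b):
    """Euclidean distance between two hearing-profile vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def interpolate_parameters(user_profile, anchors):
    """Find the two closest anchor profiles and linearly interpolate each DSP
    parameter, weighting the closer anchor more heavily.
    anchors: list of (profile_vector, parameter_dict) pairs."""
    ranked = sorted(anchors, key=lambda a: euclidean(user_profile, a[0]))
    (p1, params1), (p2, params2) = ranked[0], ranked[1]
    d1, d2 = euclidean(user_profile, p1), euclidean(user_profile, p2)
    w = 0.5 if (d1 + d2) == 0 else d2 / (d1 + d2)  # closer anchor -> larger weight
    return {k: w * params1[k] + (1 - w) * params2[k] for k in params1}
```

A user profile equidistant from an anchor with a DRC ratio of 0.7 and one with a ratio of 0.8 would receive the interpolated value 0.75, consistent with the example above.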
The objective parameters are then output to a 2D fitting application, comprising a graphical user interface for determining user subjective preference. Subjective fitting is an iterative process. For example, returning to the discussion of
Although reference is made to an example in which the y-axis corresponds to compression values and the x-axis corresponds to coloration values, it is noted that this is done for purposes of example and illustration and is not intended to be construed as limiting. For example, it is contemplated that the x- and y-axes, as presented, may be reversed while maintaining the presentation of coloration and compression to a user; moreover, it is further contemplated that other sound and/or fitting parameters may be presented on the 2D fitting interface and otherwise utilized without departing from the scope of the present disclosure.
In some embodiments, the 2D-fitting interface can be dynamically resized or refined, such that the perceptual dimension display space from which a user selection of (x, y) coordinates is made is scaled up or down in response to one or more factors. The dynamic resizing or refining of the 2D-fitting interface can be based on a most recently received user selection input, a series of recently received user selection inputs, a screen or display size where the 2D-fitting interface is presented, etc.
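One plausible sketch of such refinement (a hypothetical illustration; the disclosure does not prescribe a particular scaling rule) shrinks the selectable window around the most recent pick while keeping it inside the unit square:

```python
def refine_window(last_pick, window_size, zoom=0.5, lo=0.0, hi=1.0):
    """Shrink the 2D selection window around the user's last (x, y) pick,
    clamping the new center so the window stays within [lo, hi] on both axes."""
    half = window_size * zoom / 2
    clamp = lambda v: max(lo + half, min(hi - half, v))
    x, y = last_pick
    return (clamp(x), clamp(y)), window_size * zoom
```

For instance, a pick near the right edge at (0.9, 0.5) with a full-size window of 1.0 would yield a half-size window re-centered at (0.75, 0.5), so successive selections explore a progressively finer region.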
For example, turning to
The initial selection step of
Here, parameters 1206, 1207 comprise a feed-forward threshold (FFth) value, a feed-back threshold (FBth) value, and a gain (g) value for each subband in the multiband dynamic processor that is subject to the 2D-fitting process of the present disclosure (e.g., such as the multiband dynamic process illustrated in
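As a purely hypothetical sketch of how a single (x, y) pick might be translated into per-subband parameter sets, each subband's FFth, FBth, and g values can be blended between reference presets; the preset names and linear blending rule below are assumptions for illustration, not taken from the disclosure:

```python
def parameters_from_xy(x, y, neutral, colored, compressed):
    """Blend per-subband parameter dicts ('FFth', 'FBth', 'g') between presets:
    x in [0, 1] moves toward the 'colored' preset, y toward 'compressed'.
    Each preset is a list of per-subband parameter dicts."""
    keys = ('FFth', 'FBth', 'g')
    return [{k: n[k] + x * (c[k] - n[k]) + y * (m[k] - n[k]) for k in keys}
            for n, c, m in zip(neutral, colored, compressed)]
```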
Although changes in a selected (x, y) or (coloration, compression) coordinate made parallel to one of the two axes would seemingly affect only the value represented by that axis (i.e., changes parallel to the y-axis would seemingly affect only compression while leaving coloration unchanged), the perceptual entanglement of coloration and compression means that neither value can be changed without causing a resultant change in the other value. In other words, when coloration and compression are entangled, neither perceptual dimension can be changed independently. For example, consider a scenario in which compression is increased by moving upwards, parallel to the y-axis. In response to this movement, compressiveness can be increased by lowering compression thresholds and making ratios harsher. However, depending on the content, these compression changes alone will often introduce coloration changes by altering the relative energy distribution of the audio, especially if the compression profile across frequency bands is not flat. Therefore, steady-state mathematical formulas are utilized to correct these effective level and coloration changes by adjusting gain parameters in such a way that the overall long-term frequency response for CE noise is not altered. In this way, a perceptual disentanglement of compressiveness from coloration is achieved in real time.
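For the simpler feed-forward formula given earlier (O = t + (I − t) · r + g), the gain correction described above can be sketched as follows. This is an illustrative assumption, computing the per-band gain offset that keeps the steady-state output for a reference input level unchanged after the compression parameters move:

```python
def compensating_gain(ref_level_db, old_params, new_params):
    """Return the gain offset (dB) that keeps the steady-state output
    O = t + (I - t) * r unchanged for a reference input level after the
    (threshold, ratio) pair changes from old_params to new_params."""
    def static_out(t, r):
        return t + (ref_level_db - t) * r
    return static_out(*old_params) - static_out(*new_params)
```

For example, lowering a band's threshold from -30 dB to -40 dB and its slope from 0.7 to 0.5 makes the band more compressive; for a -10 dB reference level, adding the returned offset of roughly 9 dB of gain restores that band's long-term output level, leaving the overall frequency response, and hence the perceived coloration, unchanged.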
O=[[(1−FFr)·FFt+I·FFr+FBt·FBc·FFr]/(1+FBc·FFr)]+g.
Specifically,
In some embodiments computing system 1600 is a distributed system in which the functions described in this disclosure can be distributed within a datacenter, multiple datacenters, a peer network, etc. In some embodiments, one or more of the described system components represents many such components each performing some or all of the function for which the component is described. In some embodiments, the components can be physical or virtual devices.
Example system 1600 includes at least one processing unit (CPU or processor) 1610 and connection 1605 that couples various system components, including system memory 1615, such as read-only memory (ROM) 1620 and random-access memory (RAM) 1625, to processor 1610. Computing system 1600 can include a cache of high-speed memory 1612 connected directly with, in close proximity to, or integrated as part of processor 1610.
Processor 1610 can include any general-purpose processor and a hardware service or software service, such as services 1632, 1634, and 1636 stored in storage device 1630, configured to control processor 1610 as well as a special-purpose processor where software instructions are incorporated into the actual processor design. Processor 1610 may essentially be a completely self-contained computing system, containing multiple cores or processors, a bus, memory controller, cache, etc. A multi-core processor may be symmetric or asymmetric.
To enable user interaction, computing system 1600 includes an input device 1645, which can represent any number of input mechanisms, such as a microphone for speech, a touch-sensitive screen for gesture or graphical input, keyboard, mouse, motion input, speech, etc. Computing system 1600 can also include output device 1635, which can be one or more of a number of output mechanisms known to those of skill in the art. In some instances, multimodal systems can enable a user to provide multiple types of input/output to communicate with computing system 1600. Computing system 1600 can include communications interface 1640, which can generally govern and manage the user input and system output. There is no restriction on operating on any particular hardware arrangement and therefore the basic features here may easily be substituted for improved hardware or firmware arrangements as they are developed.
Storage device 1630 can be a non-volatile memory device and can be a hard disk or another type of computer-readable medium that can store data accessible by a computer, such as flash memory cards, solid state memory devices, digital versatile disks, cartridges, random-access memories (RAMs), read-only memory (ROM), and/or some combination of these devices.
The storage device 1630 can include software services, servers, services, etc., that, when the code defining such software is executed by the processor 1610, cause the system to perform a function. In some embodiments, a hardware service that performs a particular function can include the software component stored in a computer-readable medium in connection with the necessary hardware components, such as processor 1610, connection 1605, output device 1635, etc., to carry out the function.
This application is a continuation-in-part of U.S. patent application Ser. No. 16/868,775 filed May 7, 2020 and entitled “SYSTEMS AND METHODS FOR PROVIDING PERSONALIZED AUDIO REPLAY ON A PLURALITY OF CONSUMER DEVICES”, which is a continuation of U.S. patent application Ser. No. 16/540,345 filed Aug. 14, 2019 and entitled “SYSTEMS AND METHODS FOR PROVIDING PERSONALIZED AUDIO REPLAY ON A PLURALITY OF CONSUMER DEVICES”, the contents of which are both herein incorporated by reference in their entirety.
Number | Date | Country
---|---|---
Parent 16540345 | Aug. 2019 | US
Child 16868775 | | US

Number | Date | Country
---|---|---
Parent 16868775 | May 2020 | US
Child 17203479 | | US