Obesity is a growing global epidemic and a major contributor to the development of chronic diseases such as type 2 diabetes, asthma, cardiovascular diseases, cancers, and musculoskeletal disorders. Obesity is often considered to be the result of a sustained energy imbalance whereby energy intake (EI) exceeds energy expenditure (EE), causing an accumulation of fat in the body. An accurate understanding of EI can help develop strategies for weight loss interventions. Despite decades of obesity research, accurate estimation of EI in humans remains a difficult challenge.
Energy intake is directly related to eating behavior, which is a complex interaction of a variety of physiological, emotional, social, cultural, environmental, and economic factors that influence the timing of an eating episode, amount of food intake, food choice and/or selection, and the way in which food is consumed. To address issues with accurate measurement of energy intake and eating behavior, research involving the automatic detection of eating episodes, recognition of the foods being consumed and measurement of the quantity and the manner of consumption in an eating episode is rapidly progressing.
In an embodiment, a non-contact chewing sensor is provided. The chewing sensor includes an optical proximity sensor specifically designed to monitor chewing. The chewing sensor may be attached to eyeglasses of a user or incorporated into an augmented reality headset of the user. The chewing sensor may include an IR light emitter and an IR light receiver. The IR emitter/receiver pair of the chewing sensor may be positioned over a muscle of the user that is involved in the chewing process, such as the temporalis muscle. The IR emitter of the sensor emits IR light onto the surface of the skin covering the muscle, where it is reflected. When the user chews, the amount of light that is received by the IR receiver changes due to the activation of the muscle. The amount of light received can be used as a signal to determine when the user is chewing and likely eating. Determining when and how long a user is eating may be useful for a variety of applications such as weight loss and scientific research. The chewing sensor may further include an eye gaze-aligned camera and other sensors that may be used to estimate portion sizes or amounts of foods that have been eaten.
In an embodiment, a food type and portion estimation sensor is provided. The portion estimation sensor comprises: a housing; a processing component contained in the housing; a distance sensor contained in the housing; a camera contained in the housing; and an inertial measurement unit contained in the housing. The processing component may be adapted to: receive an image of food from the camera; receive data measured by the inertial measurement unit; based on the data measured by the inertial measurement unit, determine a viewing angle of the camera; receive a distance measurement from the distance sensor; based on the determined viewing angle and the received distance, estimate a size of the food in the image; and based on the image captured by the camera, recognize the type of food being consumed.
In an embodiment, a portion estimation sensor is provided. The sensor includes a housing, a processing component contained in the housing, a distance sensor contained in the housing, a camera contained in the housing, and an inertial measurement unit contained in the housing. The processing component is adapted to: receive an image of a vessel from the camera; receive data measured by the inertial measurement unit; based on the data measured by the inertial measurement unit, determine a viewing angle of the camera; receive a distance measurement from the distance sensor; and based on the determined viewing angle and the received distance, estimate a size of the vessel in the image.
Implementations may include some or all of the following features. The data measured by the inertial measurement unit may include heading, pitch, and roll angle. Estimating a size of the vessel may include: calculating a first plane of an eating surface that includes the vessel; calculating a height of the vessel; calculating a second plane that includes a top of the vessel based on the calculated height; and estimating the size of the vessel based on the first plane, the second plane, the height, and the image. The sensor may further, based on the estimated size of the vessel, estimate a size of food in the vessel. The housing may be adapted to attach to a pair of glasses. The housing may be part of an AR or VR headset or goggles.
Other systems, methods, features and/or advantages will be or may become apparent to one with skill in the art upon examination of the following drawings and detailed description. It is intended that all such additional systems, methods, features and/or advantages be included within this description and be protected by the accompanying claims.
The components in the drawings are not necessarily to scale relative to each other. Like reference numerals designate corresponding parts throughout the several views.
Depending on the embodiment, the chewing sensor 100 may be configured to detect when the user or wearer is chewing (and not chewing) solid foods and/or sucking liquids. The chewing sensor 100 may further be configured to count the total number of chews or sucks made by the wearer. As may be appreciated, the number of chews made by a user may be useful for purposes of estimating the total amount of food consumed by the user over a particular time period, or for estimating the energy intake of the user.
The housing 105 of the chewing sensor 100 may be constructed from a lightweight material such as plastic. Other materials may be used. As shown, the housing 105 may be constructed to allow the chewing sensor 100 to be easily attached to the temples of a pair of eyeglasses. Depending on the embodiment, the housing 105 could be part of an existing pair of augmented reality or virtual reality glasses or headset. Furthermore, instead of glasses, the housing 105 could be adapted to be attached to a hat, helmet or other type of head-gear that can be worn by a user.
The IR sensor 160 may include an IR emitter and an IR receiver. The IR sensor 160 may be placed in the housing 105 such that when the chewing sensor 100 is worn by a user, the IR emitter faces a muscle that is used by the user when chewing. A suitable muscle is the temporalis muscle. The temporalis muscle may be preferable because it is pronounced when chewing and less pronounced during other jaw-moving activities such as talking. However, other muscles may be used.
The IR sensor 160 may include an IR emitter and an IR receiver as described above.
The sensor circuit 170 may receive as an input a signal from the IR receiver, and may output another signal that may be used by the processing component 180 to determine if the wearer of the chewing sensor 100 is chewing or not chewing. The sensor circuit 170 may allow the chewing sensor 100 to use less power and operate in direct sunlight without saturating the IR receiver.
The processing component 180 may receive output from the sensor circuit 170 and, based on a value of the output, may determine whether or not the wearer is chewing. Depending on the embodiment, the processing component 180 may determine that the wearer is chewing when computed metrics satisfy one or more criteria or when the output exceeds a threshold, for example. The processing component 180 may be implemented using a general-purpose computing device such as the computing device 1000 described below.
The processing component 180 may further store data about the wearer of the chewing sensor 100. The stored data may include the number of chews detected, the times when each chew was detected, and a duration of each chew. Depending on the embodiment, the processing component 180 may use the data to estimate how long the wearer ate, as well as the mass or amount of food that the wearer ate.
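For illustration, the sketch below shows one way such per-chew records could be stored and summarized. The GRAMS_PER_CHEW constant and the simple linear mass model are assumptions made for the example and are not taken from this disclosure.

```python
from dataclasses import dataclass
from typing import List

# Assumed average ingested mass per chew (grams); a real system would
# calibrate this per wearer and per food type.
GRAMS_PER_CHEW = 0.8


@dataclass
class ChewRecord:
    timestamp_s: float  # time the chew was detected, in seconds since start
    duration_s: float   # duration of the chew, in seconds


def eating_duration_s(chews: List[ChewRecord]) -> float:
    """Elapsed time between the first and last detected chew."""
    if not chews:
        return 0.0
    return chews[-1].timestamp_s - chews[0].timestamp_s


def estimated_mass_g(chews: List[ChewRecord]) -> float:
    """Simple linear estimate: ingested mass grows with the number of chews."""
    return len(chews) * GRAMS_PER_CHEW
```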
The I/O interface 190 may allow the chewing sensor 100 to receive and transmit data. The I/O interface 190 may include a wired interface such as USB and/or a wireless interface such as Bluetooth or Wi-Fi. Other interfaces may be used. The processing component 180 may periodically provide the stored and collected data about the wearer through the I/O interface 190. For example, a health or weight loss application may receive data for a wearer of the chewing sensor 100 that may be used by the application to determine if the wearer is adhering to a weight loss or eating plan.
The power source 195 may provide power to the chewing sensor 100. An example power source 195 is a battery. Other power sources 195 may be used. Depending on the embodiment, the power source 195 may be a rechargeable battery and may receive power from an external power supply.
In some embodiments, the chewing sensor 100 described above may be modified to allow for the estimation of portion size or to improve the estimation of the amount of food ingested. In particular, the chewing sensor 100 may be modified to include the camera 305, the inertial measurement unit 307 (IMU), and the distance sensor 309.
In some embodiments, the camera 305, IMU 307, and distance sensor 309 may be mounted on or embedded into an eyeglasses frame, may be mounted on or behind an ear, or may be mounted on a headset.
The IMU 307 may be a simple 3D accelerometer, a 6D accelerometer, or a 9D accelerometer. Other types of accelerometers, gyroscopes, or magnetometers may be used. The IMU may respond to the Earth's gravity field (and magnetic field, if a magnetometer is implemented) and may register the head pose of the wearer in terms of heading, pitch, and roll angles (μ, φ, ψ).
Food being eaten is usually located in a plane (902) normal to the Earth's gravity field, so that the food is stationary and does not move. The food is also normally located in front of the person consuming it. Thus, the viewing angle of the camera can be estimated from simple geometric equations. For example, the viewing angle with respect to the food plane may be estimated from the pitch angle as φ̂ = 90° − φ. Thus, the viewing angle of the camera needed for portion size estimation may be computed from the angles measured by the IMU 307.
The distance sensor 309 (also referred to as a ranging sensor or time-of-flight sensor) measures the distance to the scene with high accuracy (for example, 1 mm). The optical axis of the distance sensor may be co-aligned with the optical axis of the camera 305, pointing to a known region within the field of view of the camera 305. Thus, when the food being eaten is recognized to be located within this region of the image, the distance sensor 309 returns a valid distance measurement d to the food being eaten. The measured distance and viewing angle may then be utilized in conjunction with computer vision methods to estimate the portion size of the food being eaten. The measured portion size may be used in the estimation of energy intake or combined with other metrics (e.g., the amount of food eaten as determined by the chewing sensor 100).
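As a rough illustration of how the IMU-derived viewing angle and the distance reading might feed a first-order size estimate, the sketch below assumes a pinhole camera with a known focal length in pixels, a flat food region that is small relative to the field of view, and a distance measured along the optical axis; these assumptions and the function names are illustrative rather than the exact method of this disclosure.

```python
import math


def viewing_angle_deg(pitch_deg: float) -> float:
    """Viewing angle with respect to the food plane, per the text: 90 - pitch."""
    return 90.0 - pitch_deg


def food_area_cm2(pixel_area_px2: float, distance_mm: float,
                  focal_length_px: float, pitch_deg: float) -> float:
    """First-order area estimate for a flat food region on the eating surface.

    Assumes the region is small relative to the field of view, the distance
    is measured along the optical axis, and foreshortening of the surface is
    corrected by the sine of the viewing angle.
    """
    mm_per_px = distance_mm / focal_length_px            # ground sample distance
    alpha = math.radians(viewing_angle_deg(pitch_deg))   # angle to the food plane
    area_mm2 = pixel_area_px2 * mm_per_px ** 2 / max(math.sin(alpha), 1e-6)
    return area_mm2 / 100.0                              # mm^2 -> cm^2


# Example: a 20,000-pixel food region seen from 450 mm with f = 1,400 px
# and a pitch of 50 degrees.
print(round(food_area_cm2(20_000, 450.0, 1_400.0, 50.0), 1))
```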
In some embodiments, the chewing sensor 100 may first determine the size of a plate in front of a wearer using the geometry 500 described in the following paragraphs.
The principal point 505 is the intersection of the imaging plane with the optical axis 501. The 2D image coordinate system, Ci, may be defined to be in the image plane with its origin located at the principal point 505, the u-axis in the fast scan direction, and the v-axis in the slow scan direction of the camera sensor. Let p be the projection of P onto the image plane and let (ū, v̄)ᵀ be the coordinates of p in Ci. Then (ū, v̄)ᵀ is given by the following equation 2:
Finally, the image sampling performed by a camera sensor (CCD) of the camera 305 may be modeled. Let Cp be the pixel coordinate system associated with the digital image. The pixel coordinates are related to the image coordinates by the following equation 4, where scx and scy are scale factors (pixel/mm), ccx and ccy are the pixel coordinates of the principal point, and Kc is the distortion coefficient (pixel/mm):
Equation 5 is given by:
For purposes of volume estimation, the inverse of equation 5 is given by the following equation 6:
With the known orientation provided by the IMU 307, a right angle between the surface of the lens and the optical axis 501, and the projection relationship in equation 5, it can be shown that the inverse of function ƒ in equation 6 exists for a tabletop 601 according to the following equation 7:
Note that Z=0 in equation 7 represents the plane equation of the tabletop 601.
Also, it may be assumed that the roll and yaw of the sensor are zero, which results in the following equation 8:
From equation 4, the world coordinates of the tabletop 601 are related to the pixel coordinates by the following equation 9:
To calculate the translation and rotation matrices, the sensor pitch from the IMU 307 and distance readings from the distance sensor 309 are used. Depending on the embodiment, the camera 305 on the sensor 100 has an offset of 21 degrees.
Using the following equation 10, where dtof is the distance between the sensor 100 and the eating surface as determined by the distance sensor 309, equation 11 can be derived as:
From equation 9, the following equation 12 may be derived:
Finally, equation 13 is obtained, where T = [0; −h; 0]:
Equation 13 gives us the plane of the eating surface (Z=0).
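Since the bodies of equations 7 through 13 are not reproduced here, the sketch below illustrates the same idea (back-projecting a pixel onto the tabletop plane from the IMU pitch, the 21-degree offset mentioned above, and the distance-sensor reading) under its own stated conventions; it should not be read as a verbatim implementation of those equations.

```python
import math
import numpy as np


def table_point_mm(u: float, v: float, pitch_deg: float, d_tof_mm: float,
                   f_px: float, cx: float, cy: float,
                   offset_deg: float = 21.0):
    """Back-project pixel (u, v) onto the tabletop plane.

    Conventions used by this sketch (not necessarily those of equations
    7-13): world frame with X right, Y forward, and Z up, tabletop at
    Z = 0; roll and yaw are zero, as assumed in the text; the camera looks
    downward at omega = pitch + mounting offset; d_tof is measured along
    the optical axis, so the camera height is h = d_tof * sin(omega).
    Returns the (X, Y) coordinates of the pixel on the table, in mm.
    """
    omega = math.radians(pitch_deg + offset_deg)
    h = d_tof_mm * math.sin(omega)

    # Ray direction in the camera frame (x right, y down, z forward).
    ray_cam = np.array([(u - cx) / f_px, (v - cy) / f_px, 1.0])

    # Rotate the ray into the world frame (pitch-down only, no roll/yaw).
    r_wc = np.array([[1.0, 0.0, 0.0],
                     [0.0, -math.sin(omega), math.cos(omega)],
                     [0.0, -math.cos(omega), -math.sin(omega)]])
    ray_world = r_wc @ ray_cam

    # Intersect the ray from the camera centre (0, 0, h) with the plane Z = 0.
    t = h / -ray_world[2]
    point = np.array([0.0, 0.0, h]) + t * ray_world
    return point[0], point[1]
```

Back-projecting two opposite rim pixels this way and taking the distance between the returned points gives, for example, an estimate of a plate's diameter on the eating surface.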
First, the height 705 of the vessel is measured along the y-axis.
H=tan(ω)×H′ (14)
Once the height 707 of the vessel is calculated, the equation of the plane Z=H is obtained instead of Z=0.
h′=h−(H×sec(ω)) (15)
Once h′ is obtained, the processing component 180 may plug h′ into equation 11, followed by equations 12 and 13.
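Equations 14 and 15 translate directly into a short helper; in the sketch below, ω is taken to be the downward viewing angle and H′ the apparent rim height used in equation 14 (these interpretations are inferred, since the accompanying figures are not reproduced here).

```python
import math


def vessel_top_plane(h: float, big_h_prime: float, omega_deg: float):
    """Apply equations 14 and 15 to move from the table plane to the rim plane.

    h           -- camera height above the eating surface
    big_h_prime -- the quantity H' used in equation 14
    omega_deg   -- the angle omega, taken here as the downward viewing angle
    Returns (H, h_prime): the vessel height H from equation 14 and the
    adjusted height h' from equation 15 to plug back into equations 11-13
    for the plane Z = H.
    """
    omega = math.radians(omega_deg)
    big_h = math.tan(omega) * big_h_prime    # equation 14: H = tan(w) * H'
    h_prime = h - big_h / math.cos(omega)    # equation 15: h' = h - H * sec(w)
    return big_h, h_prime
```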
Once the equation of the plane is determined by the processing component 180 for both the table 601 and the top of the vessel 703, the processing component 180 may use the equation of the plane to calculate the dimensions of the vessel 703. The processing component 180 may calculate the dimensions of the vessel 703 based on where the vessel 703 intersects the planes Z=0 and Z=H in the image captured by the camera 305.
As may be appreciated, once the dimensions of the vessel 703 have been calculated, the processing component 180 may use the dimensions to calculate an amount of food contained in the vessel 703 (i.e., portion size). The processing component 180 may calculate the amount of food by determining the percentage of the vessel 703 that is taken up by the food. The percentage may be determined from the image captured by the camera 305. After determining the percentage, the amount of food may be determined using the percentage and the dimensions of the vessel 703. For example, the processing component 180 may determine that based on the dimensions of the vessel 703, the vessel 703 likely holds 12 ounces. The processing component 180 may further determine from the image of the vessel 703 that the food in the bowl takes up 70% of the vessel 703. The processing component 180 may then determine that the portion of food is approximately 8.4 ounces.
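The worked example above reduces to a one-line calculation; a minimal sketch:

```python
def portion_ounces(vessel_capacity_oz: float, fill_fraction: float) -> float:
    """Portion size as the fraction of the vessel occupied by food."""
    return vessel_capacity_oz * fill_fraction


# Example from the text: a 12-ounce vessel that is 70% full.
print(portion_ounces(12.0, 0.70))  # 8.4
```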
Once the portion size is determined, the processing component 180 may store the portion size along with other information about a current meal such as when the wearer of the chewing sensor 100 started eating, how long they ate, and how much they ate. The information may later be retrieved or viewed by the wearer, incorporated into a fitness or weight-loss application, or may be provided to a doctor or physician associated with the wearer.
The processing component 180 may further use computer vision techniques to identify the type of food being consumed by the wearer in the vessel 703. For example, a computer vision model that is trained to identify types of food may be applied to the image generated by the camera 305 and may generate a guess or estimate of the food that is in the vessel 703. Depending on the food in the vessel 703, the processing component 180 may retrieve known caloric information associated with the food, and in combination with the determined portion size, may determine the total number of calories eaten by the wearer of the sensor 100. The determined calories may be stored by the processing component 180.
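The sketch below shows how a recognized food label and an estimated portion could be combined into a calorie estimate; the food labels and per-ounce calorie densities are illustrative placeholders rather than values from this disclosure, and the label itself would come from whatever trained recognition model is applied to the camera image.

```python
# Illustrative calorie densities (kcal per ounce); a real system would key a
# nutrition database by the label returned from the food-recognition model.
CALORIES_PER_OZ = {
    "oatmeal": 18.0,
    "chili": 32.0,
    "ice cream": 58.0,
}


def estimated_calories(food_label: str, portion_oz: float) -> float:
    """Total calories = estimated portion size x calorie density of the food."""
    return CALORIES_PER_OZ.get(food_label, 0.0) * portion_oz


# Example: an 8.4-ounce portion recognized as chili.
print(round(estimated_calories("chili", 8.4)))  # ~269 kcal
```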
As may be appreciated, the chewing sensor 100 as described above may be used not only to collect information about the calories consumed by the wearer, but also to actively monitor the calories being eaten by the wearer and to notify the wearer when they have exceeded an allotted amount of calories. For example, a wearer of the chewing sensor 100 who would like to lose weight may be restricted to eating no more than 500 calories at each meal. Based on information from the IR sensor 160 and/or the camera 305, the processing component 180 may determine that the wearer is eating and, in response, may calculate the size of the portion of food in a vessel 703 that the wearer is eating from. The processing component 180 may then use computer vision techniques to guess the food that is being eaten by the wearer. Based on the guess, the processing component 180 may determine the total amount of calories in the vessel 703.
As the wearer eats the food, the processing component 180 may continue to receive images from the camera 305 and may continuously recalculate the amount of food that remains in the vessel 703. Based on the amount of food remaining, the processing component 180 may determine the total amount of calories that have been consumed by the wearer. Once the total amount exceeds the threshold amount (e.g., 500 calories) the sensor 100 may vibrate, beep, or take some action to indicate to the wearer they have exceeded their calorie amount for the meal.
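One minimal sketch of the monitoring loop described above, assuming a per-image estimate of the fraction of food remaining and a placeholder alert callback standing in for the vibration or beep:

```python
def monitor_meal(remaining_fractions, meal_calories_total: float,
                 calorie_limit: float = 500.0, alert=print) -> float:
    """Track calories consumed as the food in the vessel goes down.

    remaining_fractions -- per-image estimates of the fraction of food left
    meal_calories_total -- estimated calories in the full vessel
    alert               -- action taken when the limit is exceeded
                           (e.g., vibrate or beep on the sensor 100)
    """
    consumed = 0.0
    alerted = False
    for remaining in remaining_fractions:
        consumed = meal_calories_total * (1.0 - remaining)
        if consumed > calorie_limit and not alerted:
            alert(f"Calorie limit exceeded: {consumed:.0f} kcal consumed")
            alerted = True
    return consumed


# Example: a 900-kcal vessel eaten down to 30% remaining triggers the alert.
monitor_meal([1.0, 0.8, 0.55, 0.3], meal_calories_total=900.0)
```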
At 810, an image of a vessel is received. The image of the vessel may be received from a camera 305 of the sensor 100. The sensor 100 may be worn by a wearer on their head. For example, the sensor 100 may be attached to a pair of glasses or may be integrated into an AR headset. Depending on the embodiment, the image may have been taken by the camera 305 in response to the processing component 180 detecting that the wearer has begun eating. For example, the sensor 100 may detect movements of a muscle associated with eating using the IR sensor 160, or the processing component 180 may detect that the wearer has spoken a word, pressed a button, or made a gesture that is associated with eating. The vessel may be a plate, bowl, or other food container such as a takeout container.
At 820, data measured by an inertial measurement unit is received. The data may be received by the processing component 180 from the IMU 307. The data may include heading, pitch, and roll angle, for example.
At 830, a viewing angle of the camera is determined. The viewing angle may be determined by the processing component 180 using the data provided by the IMU 307.
At 840, a distance measurement is received. The distance measurement may be received by the processing component 180 from the distance sensor 309. The distance measurement may be the distance between the vessel and the sensor 100 worn by the wearer.
At 850, a size of the vessel in the image is estimated. The size of the vessel may be estimated by the processing component 180 using the viewing angle and the distance measurement. In embodiments where the vessel is flat or mostly flat (e.g., plates), the processing component 180 may calculate a first plane that corresponds to an eating surface that the vessel is resting on and may estimate the size of the vessel based on the calculated first plane only.
In embodiments where the vessel 703 is a bowl or has sides, the processing component 180 may additionally estimate a height of the vessel 703 using the distance measurement and viewing angle. The processing component 180 may then calculate a second plane that includes the top of the vessel 703. The first plane may represent a bottom of the vessel. The processing component 180 may then use the first plane, the second plane, and the image to estimate the size of the vessel 703. Depending on the embodiment, estimating the size of the vessel 703 may include estimating one or both of an area or a volume of the vessel 703.
At 860, a food portion is estimated based on the estimated vessel size. The portion size may be estimated by the processing component 180. In some embodiments, the processing component 180 may estimate the portion size by determining a percentage of the vessel that is covered by the food in the received image and estimating the size of the portion based on the percentage and the estimated size of the vessel. After estimating the size of the food portion, the processing component 180 may record the determined size and/or may estimate an amount of calories consumed by the wearer of the sensor 100.
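As an illustration of how steps 810 through 860 fit together, the toy sketch below makes several simplifying assumptions that are not part of this disclosure: the vessel is approximated as a hemispherical bowl, its rim diameter is measured along the foreshortened image axis, and the distance reading applies to the rim.

```python
import math

ML_PER_FLUID_OZ = 29.5735


def estimate_portion_oz(rim_minor_axis_px: float, fill_fraction: float,
                        pitch_deg: float, distance_mm: float,
                        focal_length_px: float) -> float:
    """Toy end-to-end pass through steps 810-860 under the assumptions above."""
    view_angle = math.radians(90.0 - pitch_deg)      # step 830: viewing angle
    mm_per_px = distance_mm / focal_length_px        # pinhole scale at the rim
    # Undo foreshortening of the circular rim along the tilt direction (~sin).
    diameter_mm = (rim_minor_axis_px / math.sin(view_angle)) * mm_per_px
    radius_cm = diameter_mm / 20.0                            # step 850: size
    capacity_ml = (2.0 / 3.0) * math.pi * radius_cm ** 3      # hemispherical bowl
    return fill_fraction * capacity_ml / ML_PER_FLUID_OZ      # step 860: portion


# Example: rim minor axis of 260 px, 70% full, pitch 50 deg, 450 mm, f = 1,400 px.
print(round(estimate_portion_oz(260, 0.70, 50.0, 450.0, 1_400.0), 1))
```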
At 910, light is emitted towards the skin of a wearer. The light may be emitted by an IR sensor 160 associated with chewing sensor 100. The IR sensor 160 may be part of the chewing sensor 100 and may be attached to the temples of a pair of glasses that are worn by the wearer. The IR sensor 160 may emit infra-red light towards the skin of the wearer, and in particular the skin of the wearer that covers a muscle associated with chewing such as the temporalis muscle.
At 920, a portion of the emitted light is received. A portion of the emitted light that is reflected off of the skin of the wearer is received by the IR sensor 160. As may be appreciated, the amount of the emitted light that is received changes based on the state of the temporalis muscle.
At 930, a magnitude of the received portion of light is determined. The magnitude of the light may be determined by a sensor circuit 170 associated with the chewing sensor 100. The sensor circuit 170 may receive a portion of light and may output a value that is proportional to the magnitude of the received portion of light.
At 940, it is determined, based on the determined magnitude, that the wearer is chewing. That the wearer is chewing may be determined by the processing component 180 comparing the magnitude to a magnitude that is associated with chewing. In one embodiment, the processing component 180 may compare the changes in magnitude observed over some duration of time. As may be appreciated, as a wearer chews, the determined magnitude may increase and decrease as the temporalis muscle is activated and deactivated. If the pattern of magnitude changes matches a pattern associated with chewing, the processing component 180 may determine that the wearer is chewing.
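One simple way the pattern comparison described above could be realized is threshold crossing with a refractory gap, sketched below; the threshold value and minimum spacing between chews are illustrative parameters, not values from this disclosure.

```python
def count_chews(samples, threshold: float, min_gap_samples: int = 10) -> int:
    """Count chew-like peaks in the reflected-light magnitude signal.

    A chew is registered whenever the signal rises above `threshold`
    after having dropped below it, with at least `min_gap_samples`
    between consecutive chews (to reject jitter around the threshold).
    """
    chews = 0
    above = False
    last_chew = -min_gap_samples
    for i, value in enumerate(samples):
        if not above and value >= threshold and i - last_chew >= min_gap_samples:
            chews += 1
            last_chew = i
            above = True
        elif value < threshold:
            above = False
    return chews


# Example: a noisy signal with three clear activations of the muscle.
signal = [0.1, 0.2, 0.9, 1.0, 0.3, 0.1, 0.8, 1.1, 0.2, 0.1, 0.0, 0.9, 0.8, 0.2]
print(count_chews(signal, threshold=0.7, min_gap_samples=3))  # 3
```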
At 950, one or more actions are performed based on the determined chewing. The one or more actions may include recording the time when the chewing was determined, counting the total number of chews for a wearer, recording the duration of time associated with the chewing, and estimating the size or portion of the food being eaten based on the number of chews. Other actions may be performed.
Numerous other general purpose or special purpose computing system environments or configurations may be used. Examples of well-known computing systems, environments, and/or configurations that may be suitable for use include, but are not limited to, personal computers, servers, handheld or laptop devices, multiprocessor systems, microprocessor-based systems, network personal computers (PCs), minicomputers, mainframe computers, embedded systems, distributed computing environments that include any of the above systems or devices, and the like.
Computer-executable instructions, such as program modules, being executed by a computer may be used. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. Distributed computing environments may be used where tasks are performed by remote processing devices that are linked through a communications network or other data transmission medium. In a distributed computing environment, program modules and other data may be located in both local and remote computer storage media including memory storage devices.
Computing device 1000 may have additional features/functionality. For example, computing device 1000 may include additional storage (removable and/or non-removable) including, but not limited to, magnetic or optical disks or tape. Such additional storage is illustrated by removable storage 1008 and non-removable storage 1010.
Computing device 1000 typically includes a variety of tangible computer readable media. Computer readable media can be any available tangible media that can be accessed by device 1000 and includes both volatile and non-volatile media, removable and non-removable media.
Tangible computer storage media include volatile and non-volatile, and removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Memory 1004, removable storage 1008, and non-removable storage 1010 are all examples of computer storage media. Tangible computer storage media include, but are not limited to, RAM, ROM, electrically erasable program read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by computing device 1000. Any such computer storage media may be part of computing device 1000.
Computing device 1000 may contain communications connection(s) 1012 that allow the device to communicate with other devices. Computing device 1000 may also have input device(s) 1014 such as a keyboard, mouse, pen, voice input device, touch input device, etc. Output device(s) 1016 such as a display, speakers, printer, etc. may also be included. All these devices are well known in the art and need not be discussed at length here.
It should be understood that the various techniques described herein may be implemented in connection with hardware or software or, where appropriate, with a combination of both. Thus, the methods and apparatus of the presently disclosed subject matter, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other machine-readable storage medium wherein, when the program code is loaded into and executed by a machine, such as a computer, the machine becomes an apparatus for practicing the presently disclosed subject matter. In the case of program code execution on programmable computers, the computing device generally includes a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. One or more programs may implement or utilize the processes described in connection with the presently disclosed subject matter, e.g., through the use of an application programming interface (API), reusable controls, or the like. Such programs may be implemented in a high level procedural or object-oriented programming language to communicate with a computer system. However, the program(s) can be implemented in assembly or machine language, if desired. In any case, the language may be a compiled or interpreted language and it may be combined with hardware implementations.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
This application claims priority to U.S. Provisional Patent Application Ser. No. 63/030,701, filed on May 27, 2020, entitled NON-CONTACT CHEWING SENSOR AND PORTION ESTIMATOR, the disclosure of which is hereby incorporated by reference in its entirety.
This invention was made with government support under grant No. 25165 awarded by the National Institutes of Health. The government has certain rights in the invention.