A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
The disclosure relates to the field of wellness maintenance programs with user-specific tracking of food consumption and more particularly to techniques for using crowdsourcing to determine nutritional content of foods depicted in an image.
Good nutrition is essential to wellness. There is a large body of science regarding nutrition (e.g., nutrition minimum daily allowances, nutrition interactions, caloric content, calories from fat, calories from carbohydrates, etc.), and much of the science can be used within a wellness program. For example, a nutrition component of a wellness program might recommend that a particular user should endeavor to consume an average of not more than 2500 calories per day, and that no more than 500 of those calories should come from fat.
While such an endeavor is easy to articulate, it is not so easy to administer. For example, a user might have the intention to “stick to” a nutrition regimen, but actually counting nutrition and calories as food is consumed can be challenging. Legacy techniques have required the user to type in the number of calories consumed (e.g., after eating a portion of guacamole, type in 400 calories for a large portion, and 300 calories for a small portion, type in an additional 250 calories for the tortilla chips). Other legacy techniques have attempted to aid the user by offering a list of foods and a means for selecting items from among the list. Such legacy techniques are deficient in that the burden placed on the user is far too high, especially when the user is asked to track intake for all foodstuff consumed each and every day, and on an ongoing basis.
Some attempts to reduce the burden placed on the user have involved use of crowdsourcing, where an image of the food that has been consumed (or that which is soon to be consumed) is sent to a crowd, and people in the crowd are asked to assess the caloric content of the food in the image. This technique falls short. First, the range of calories assessed by one member of the crowd versus another member of the crowd has been shown to be extreme, to the extent that the calorie counts from the crowd are deemed to be unreliable. Second, merely reporting calorie counts fails to account for nutritional content of the foods. What is needed is a technique or techniques to receive reliable descriptions of food items, and to use those reliable descriptions of food items to cross-reference into reliable nutrition corpora.
None of the aforementioned legacy approaches achieve the capabilities of the herein-disclosed techniques for using crowdsourcing consensus to determine nutritional content of foods depicted in an image. Therefore, there is a need for improvements.
The present disclosure provides an improved method, system, and computer program product suited to address the aforementioned issues with legacy approaches. More specifically, the present disclosure provides a detailed description of techniques used in methods, systems, and computer program products for using crowdsourcing to determine nutritional content of foods depicted in an image.
Some embodiments commence upon receiving a digital image of food or beverage items, and then transmitting the digital image to a repository configured to serve a plurality of accesses by a plurality of human members (e.g., a crowdsource repository). Members of the crowd generate food description annotations pertaining to aspects of the pictured food or beverage items. The food description annotations (e.g., menu picks, text descriptions) are used to look-up nutrition records. Some use cases correlate a set of food description annotations that are shared between two or more of the human members to generate a confidence score, and some use cases also receive food intake recommendations from the human members. The food intake recommendations can be recorded in a wellness profile which in turn can be used for progress tracking against nutrition goals.
Further details of aspects, objectives, and advantages of the disclosure are described below and in the detailed description, drawings, and claims. Both the foregoing general description of the background and the following detailed description are exemplary and explanatory, and are not intended to be limiting as to the scope of the claims.
Some embodiments of the present disclosure address the problem of determining nutritional content of food in a meal. More particularly, disclosed herein and in the accompanying figures are exemplary environments, methods, and systems for using crowdsourcing to determine nutritional content of foods depicted in a photograph.
A user can take a photo of his or her meal, and upload the photo to a crowdsourcing platform. The caloric content and other attributes of the meal can be identified or discussed by members of the crowd, and attributes of the meal that are deemed to be statistically reliable (e.g., using a statistical confidence interval and a threshold) and/or trending toward a consensus can be used in recording content and/or other aspects of the meal on behalf of the user. For example, if a user were to take a picture of a “big burger combo meal”, a viewer might be able to identify the foodstuff in the photo (e.g., as a “big burger combo meal”). Then, the identification can be used in a lookup operation to determine nutritional characteristics of the meal. For example, given an identification of foodstuff in a photo in the form of a text string comprising, “big burger combo meal”, a lookup operation might return a well-researched nutrition description. Continuing this example, a “big burger combo meal” might be known to comprise 980 calories, which includes 29 grams of fat, and includes 1040 mg of sodium. Still more nutrition-related information might be included in the nutrition description, such as pertaining to cholesterol, proteins, and a listing of vitamins and minerals (e.g., potassium and calcium) so as to track the contribution of the meal against daily requirements for those nutrients.
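Strictly as an illustrative sketch, the lookup operation described above might be expressed as follows. The `NutritionRecord` type, the `NUTRITION_DB` dictionary, and the `lookup_nutrition` function are hypothetical names introduced only for this example; the single entry uses the figures given above for the “big burger combo meal”.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NutritionRecord:
    calories: int
    fat_g: float
    sodium_mg: float


# Hypothetical stand-in for a nutrition corpus; the one entry carries the
# figures from the "big burger combo meal" example in the text.
NUTRITION_DB = {
    "big burger combo meal": NutritionRecord(calories=980, fat_g=29, sodium_mg=1040),
}


def lookup_nutrition(description: str) -> Optional[NutritionRecord]:
    """Return the nutrition record for a text description, or None if unknown."""
    key = " ".join(description.lower().split())  # normalize case and whitespace
    return NUTRITION_DB.get(key)


record = lookup_nutrition("  Big Burger   Combo Meal ")
```

Normalizing case and whitespace before the lookup is one possible design choice; it allows differently typed identifications from the crowd to resolve to the same record.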
The aforementioned crowdsourcing platform might be any crowdsourcing platform where users perform tasks free or for token amounts of money or for recognition. Strictly as one example, one crowdsourcing implementation called the “Mechanical Turk” supports many different projects. Such projects often comprise tasks where a given task requires very little time from a human in exchange for some small measure of notoriety and/or a small amount of compensation (e.g., winning a game or scoring points or receiving micro-payments).
Crowdsourcing projects can be hosted in a manner such that when a user chooses a task, the user would have a higher likelihood of “winning” and/or of receiving other recognition or compensation if a particular type of task is selected. Over time, users would tend to select tasks that hold a higher likelihood of returning recognition or compensation (e.g., winning a game or scoring points or receiving micro-payments).
In some cases the results of a crowd can be filtered so as to remove wrong or incomplete responses, and in some cases human or automated referees can be employed so as to select-in or reject-out crowdsourcing results so as to achieve acceptable statistical measures over a population of crowdsource-generated descriptions.
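Strictly as one example, such filtering might be sketched as shown below. The function name `filter_responses`, the pair-based response format, and the `rejected_by_referee` set are assumptions made for illustration and are not prescribed by this disclosure.

```python
def filter_responses(responses, rejected_by_referee=frozenset()):
    """Keep responses that are non-empty and were not rejected by a referee.

    responses: list of (participant_id, description) pairs (assumed format).
    rejected_by_referee: participant ids whose responses a referee rejected.
    """
    kept = []
    for participant_id, description in responses:
        text = (description or "").strip()
        if not text:
            continue  # incomplete response: nothing was described
        if participant_id in rejected_by_referee:
            continue  # rejected-out by a human or automated referee
        kept.append((participant_id, text))
    return kept


raw = [
    ("p1", "big burger combo meal"),
    ("p2", "   "),
    ("p3", None),
    ("p4", "pizza"),
]
clean = filter_responses(raw, rejected_by_referee={"p4"})
```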
There may be a time delay between the time that a user takes a photo of his or her meal (e.g., for upload to a crowdsourcing platform) and the time that the crowdsourcing results from the uploaded photo achieve acceptable statistical consensus (e.g., consensus or confidence based on the characteristics of a population of corresponding crowdsource-generated descriptions).
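One possible way to decide when an acceptable statistical consensus has been achieved is sketched below. The use of a Wilson score lower bound, the 95% z-value, and the 0.5 threshold are assumptions chosen for this example; the disclosure does not mandate any particular statistic.

```python
import math


def wilson_lower_bound(agreeing: int, total: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for the agreeing fraction."""
    if total == 0:
        return 0.0
    p = agreeing / total
    denom = 1 + z * z / total
    centre = p + z * z / (2 * total)
    margin = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total))
    return (centre - margin) / denom


def consensus_reached(agreeing: int, total: int, threshold: float = 0.5) -> bool:
    """True once the lower confidence bound meets the (assumed) threshold."""
    return wilson_lower_bound(agreeing, total) >= threshold


early = consensus_reached(2, 3)    # few responses shortly after upload
later = consensus_reached(18, 20)  # more responses after some delay
```

With only three responses the lower bound is low even though two of the three agree, which illustrates why a delay may occur before results are recorded on behalf of the user.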
The following figures and descriptions show and describe techniques for implementing crowdsourcing into a wellness program.
Some of the terms used in this description are defined below for easy reference. The presented terms and their respective definitions are not rigidly restricted to these definitions—a term may be further defined by the term's use within this disclosure.
Reference is now made in detail to certain embodiments. The disclosed embodiments are not intended to be limiting of the claims.
As shown in
The food images are sent to the crowd. Some of the images are sent to the crowd without the tags, and participants in the crowd are tasked with adding tags. Also, sometimes in parallel, some of the images are sent to the crowd with the tags, and referees 108 in the crowd 104 are tasked with adding tags selected from the given tags. Using the responses from the crowd in combination with responses from referees 108, a subset of the crowd can be identified (see reliable participants 1061). For example, tags received from the referees 108 can be compared to tags received from others, and tag responses that deviate statistically from the responses from the referees might be considered to be coming from unreliable participants. A set of responses from the crowd, in particular a selection of responses from reliable participants 1061, can serve to train a learning model or other classifier within the consensus processor.
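The comparison of participant tags against referee tags can be sketched as follows. This is a simplified stand-in: a fixed agreement threshold (`min_agreement`, an assumed parameter) is used in place of a formal statistical deviation test, and the function names and dictionary layout are hypothetical.

```python
def agreement_rate(participant_tags, referee_tags):
    """Fraction of commonly tagged images where the tag sets match exactly."""
    shared = [img for img in participant_tags if img in referee_tags]
    if not shared:
        return 0.0
    matches = sum(1 for img in shared if participant_tags[img] == referee_tags[img])
    return matches / len(shared)


def reliable_participants(crowd_tags, referee_tags, min_agreement=0.8):
    """Return ids of participants whose agreement meets the assumed threshold."""
    return sorted(
        pid
        for pid, tags in crowd_tags.items()
        if agreement_rate(tags, referee_tags) >= min_agreement
    )


referee_tags = {"img1": {"burger", "fries"}, "img2": {"eggs", "toast"}}
crowd_tags = {
    "alice": {"img1": {"burger", "fries"}, "img2": {"eggs", "toast"}},
    "bob": {"img1": {"pizza"}, "img2": {"eggs", "toast"}},
}
reliable = reliable_participants(crowd_tags, referee_tags)
```

Responses from the participants returned by such a function could then be selected as training data for a classifier.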
A configured consensus processor can be used to automatically add and possibly fill-out a set of food tags (e.g., food tag 112) pertaining to an image based on a smaller set of food tags. For example, if a crowd reports consensus that a photo of a “big burger meal” is accurately described as a “Big Whopping Meal” (e.g., from the menu of the “Big Whopping Burger” chain of restaurants), then the remaining nutrition information can be added to the food image.
The initial population process 113 can commence at any time and results of the initial population process can serve to tag images with known and/or refereed annotations, which annotations can be stored together with corresponding images in an annotated food image database. Also, at any point in time a newly photographed food item can be uploaded to crowd 1042 (see
An enterprise might sponsor a wellness program having incentives and having a wellness application 116 to aid employees in formulating wellness goals and for tracking achievement toward goals. To facilitate tracking, a user 105 can take a food photo 138 (e.g., using a smart phone), and upload it to a crowd using the computer-aided assistance of a wellness application user interface (e.g., a user interface pertaining to an uploader 118 and/or a user interface pertaining to a control panel 120). In exemplary situations, a user can interact with a tracking user interface 122, which in turn interacts with application logic 124. The application logic can receive an event 140 (e.g., the event attached to or corresponding to a respective food photo) and can perform a lookup (e.g., using lookup engine 126). The lookup engine in turn accesses a database of annotated food images 128 comprising an image 130, an image ID 132, a description of a food type 134, and a food quantity 136. Annotations for any number of, or ranges of, food types can be described in any form, including being described in a tabular form, such as is depicted in Table 1.
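Strictly as a sketch, a record in the database of annotated food images and a lookup keyed by image ID might be modeled as follows. The class names, field types, and the in-memory dictionary are assumptions made for illustration; only the four fields themselves come from the description above.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class AnnotatedFoodImage:
    image: bytes        # raw image data (image 130)
    image_id: str       # image ID 132
    food_type: str      # description of a food type 134
    food_quantity: str  # food quantity 136


class LookupEngine:
    """Hypothetical in-memory stand-in for lookup engine 126."""

    def __init__(self) -> None:
        self._by_id: Dict[str, AnnotatedFoodImage] = {}

    def store(self, record: AnnotatedFoodImage) -> None:
        self._by_id[record.image_id] = record

    def lookup(self, image_id: str) -> Optional[AnnotatedFoodImage]:
        # Returns None when the crowd has not yet annotated the photo.
        return self._by_id.get(image_id)


engine = LookupEngine()
engine.store(AnnotatedFoodImage(b"", "photo-001", "big burger combo meal", "1 meal"))
found = engine.lookup("photo-001")
missing = engine.lookup("photo-002")
```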
As heretofore mentioned, an enterprise might sponsor a wellness program including a wellness application 116 to aid employees in formulating wellness goals and for tracking achievement toward goals. To facilitate tracking, a user 105 can interact with a wellness application user interface which in turn interacts with application logic 124 to process instances of annotated food images 128. For example, application logic 124 may use a lookup engine 126 to retrieve annotated food images that correspond to a food photo 138 as photographed and uploaded by the user. In exemplary cases, the food image is photographed contemporaneously with the act of consuming the food, and the event 140 can codify aspects (e.g., time, date, place, etc.) of the act of consuming the food. Any number of events, together with a food photo 138, can be collected and stored, and at any moment in time, the nutritional content (e.g., calories, protein, carbohydrates, etc.) of the foodstuff in the food image can be queried from the database of annotated food images. Calorie counts 142 and other summary information (e.g., average sodium per day, etc.) can be returned from a query. In some cases the aforementioned query might not return food types and/or food quantity corresponding to the event as recorded by the food image. In such a case, aspects of the event are not included in the summary information, and at some later moment in time, the crowd might produce an annotated food image 110, which is then stored in the database of annotated food images 128 in a form that can be queried (e.g., by lookup engine 126).
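The summary query described above, including the exclusion of events that have not yet been annotated, might be sketched as follows. The function name `summarize`, the event and annotation layouts, and the sample values other than those of the “big burger combo meal” are hypothetical. In this sketch the sodium average is taken over days having at least one annotated event, which is an assumption.

```python
from collections import defaultdict
from datetime import date


def summarize(events, annotations):
    """Return (calories per day, average sodium per day, pending image ids).

    events: list of (day, image_id) pairs.
    annotations: image_id -> {"calories": int, "sodium_mg": float}.
    """
    calories_by_day = defaultdict(int)
    sodium_by_day = defaultdict(float)
    pending = []
    for day, image_id in events:
        nutrition = annotations.get(image_id)
        if nutrition is None:
            pending.append(image_id)  # not yet annotated; excluded from summary
            continue
        calories_by_day[day] += nutrition["calories"]
        sodium_by_day[day] += nutrition["sodium_mg"]
    days = len(sodium_by_day)
    avg_sodium = sum(sodium_by_day.values()) / days if days else 0.0
    return dict(calories_by_day), avg_sodium, pending


day1, day2 = date(2024, 1, 1), date(2024, 1, 2)
events = [(day1, "a"), (day1, "b"), (day2, "c"), (day2, "d")]
annotations = {
    "a": {"calories": 980, "sodium_mg": 1040},
    "b": {"calories": 300, "sodium_mg": 200},
    "c": {"calories": 500, "sodium_mg": 600},
}
calories, avg_sodium, pending = summarize(events, annotations)
```

Once the crowd later produces an annotation for a pending image, rerunning the query would include that event in the summary.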
Using the crowd, the food items pictured in a food photo 138 can be identified and, once identified, the nutritional content can be looked up in a nutrition database 129, possibly using a query processor 133 and a database engine 131. Ambiguity and/or omissions and/or errors in the database of annotated food images can be reduced by using data processing techniques within the consensus processor. Strictly as one example, to a first observer, a moderate portion of blanched lean chicken breast might look the same as a very large portion of mozzarella cheese, and the observer might report the image as containing a very large portion of mozzarella cheese. Yet multiple observers might see a moderate portion of blanched lean chicken breast, and a consensus across the multiple observers can improve reliability of crowd-sourced results. A subsystem for implementing various data processing techniques, including a consensus processor, is shown and discussed as follows.
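Strictly as an illustration of the multiple-observer example above, a simple plurality vote might be sketched as follows. The function name `consensus_label` and the use of a plurality vote are assumptions; other data processing techniques can be used within the consensus processor.

```python
from collections import Counter


def consensus_label(reports):
    """Return (most common report, share of observers who gave it)."""
    if not reports:
        return None, 0.0
    label, count = Counter(reports).most_common(1)[0]
    return label, count / len(reports)


reports = [
    "very large portion of mozzarella cheese",
    "moderate portion of blanched lean chicken breast",
    "moderate portion of blanched lean chicken breast",
    "moderate portion of blanched lean chicken breast",
    "moderate portion of blanched lean chicken breast",
]
label, share = consensus_label(reports)
```

The returned share can serve as a rough measure of how strongly the observers agree, so that the outlying report of the first observer does not determine the recorded annotation.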
As shown in
The transformation process 300 depicts a transformation where a food photo 138 is transmitted to an annotation service bureau 302 for transformation into an annotated food image. The annotation service bureau might perform crowdsourcing activities within particular standards and/or with particular rigor. In some cases, the annotation service bureau might produce tagged data that follows a particular syntax and/or schema and/or that conforms to specific grammar or that carries particular semantics.
As shown, the annotation service bureau 302 asks questions about the food photo (e.g., “What food items do you see?”) and records responses in the form of submitted answers to the question(s). Strictly as an example, answers might comprise a generic description (e.g., “a big breakfast”, etc.), or might be specific to an identified food item and quantity (e.g., “two sausage links”, “two bangers”, etc.). In some cases the food item might be described in some detail, possibly including the presence or absence of condiments and/or observation of a preparation method or methods (e.g., “two eggs easy over”, “toast with butter”). In some cases a description might use regional descriptions, and/or slang and/or multiple languages across multiple responses, or even within a particular response. For example, a generic description for a photo of a big breakfast depicting vegetables and sausages might be characterized by a regional or common name description (e.g., “bubble and squeak”).
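Because responses may use regional names or slang, responses might be normalized before being compared with one another. The following is a minimal sketch; the `SYNONYMS` table and the function name are hypothetical, and a practical synonym table would be far larger and possibly multilingual.

```python
# Hypothetical synonym table mapping regional terms to a common term.
SYNONYMS = {
    "bangers": "sausage links",
    "banger": "sausage link",
}


def normalize_description(text: str) -> str:
    """Lowercase a response and replace known regional terms word by word."""
    words = text.lower().split()
    return " ".join(SYNONYMS.get(word, word) for word in words)


a = normalize_description("Two Bangers")
b = normalize_description("two sausage links")
```

After normalization, the two example answers compare equal and can be counted as the same response.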
As heretofore described, attributes of the meal can be identified or offered by members of the crowd (e.g., the service bureau), and attributes of the meal that are deemed to be statistically reliable and/or trending toward a consensus can be used in recording aspects of the meal on behalf of the user. The characteristics that are used to deem an answer as statistically reliable and/or trending toward a consensus are configurable. Different configurations might be employed based on the makeup and/or geographic centroid of the crowd or service bureau. Moreover, different selections of syntax, schema, and semantics can be used in different data flows and/or in different environments. An XML-based example of a syntax, a schema, and corresponding semantics are discussed as follows.
As shown in
As another example, as shown in the XML representation 402, if a tag (e.g., an XML attribute) indicates the text, “Big breakfast” (see the tagging of the “Description” attribute in XML element “Consensus”), then a query to the nutrition database 129 might return a listing of a caloric range pertaining to a big breakfast. Such a returned caloric range might be reconciled with items described using the Item element. In some cases, the Description attribute might be present (or absent) and in some cases the Item elements might be present (or absent) and/or might be complete (or incomplete).
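Strictly as a sketch, an XML representation of this kind might be read as follows. The element name “Consensus”, the “Description” attribute, and the “Item” element come from the description above; the `Name` and `Quantity` attributes on each Item, and the sample document itself, are hypothetical and are used only to make the example self-contained.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the Item attributes are assumed.
SAMPLE = """
<Consensus Description="Big breakfast">
  <Item Name="sausage links" Quantity="2"/>
  <Item Name="eggs" Quantity="2"/>
</Consensus>
"""


def parse_consensus(xml_text):
    """Return (description or None, list of (name, quantity) pairs)."""
    root = ET.fromstring(xml_text.strip())
    description = root.get("Description")  # None when the attribute is absent
    items = [(item.get("Name"), item.get("Quantity")) for item in root.findall("Item")]
    return description, items


description, items = parse_consensus(SAMPLE)
```

Because both the Description attribute and the Item elements may be absent, the sketch returns `None` or an empty list rather than raising an error, leaving reconciliation to the caller.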
The calories and/or nutrition might be presented to the user, possibly in a tracking user interface 122. One embodiment of a tracking user interface is shown and discussed as pertaining to the following figures.
As shown in
The information presented under a profile tab 502 might comprise personal characteristics of the user, including wellness goals 506, wellness rewards, team composition, weight loss progress and/or totals, activity progress and/or totals, and other aspects of a user.
The tracker tab 504 can be clicked or touched so as to bring up a nutrition target user interface.
As shown in
In some embodiments a series of events (e.g., Breakfast, Lunch, Dinner, or Meal 1, Meal 2, Meal 3, etc.) can be presented in a tracking array 510. In some cases historical events (this morning's breakfast) or predicted events (tomorrow's dinner) are shown spanning a time period. The time period and characteristics of the events are configurable.
In exemplary cases, an event that falls in a shown time period can be clicked or touched, and various characteristics of that event can be shown graphically (e.g., as a photo, or as a dynamically-generated table). In the example discussed hereunder, Monday's breakfast included toast, sausage, eggs, and ham. The user used a photo upload user interface to facilitate taking a photo of the breakfast. One example of a photo upload interface is depicted in the following
Such data gathering can be accomplished within wellness application environment 1B00 so as to adjust meal reminders to schedules and/or to relate nutritional intake to schedules and/or to relate nutritional tracking and goals to characteristics of the user.
Further, the burden of correlating any of the aforementioned schedules and/or characteristics of a user to a nutrition plan can be facilitated by the presence of certain data items within a wellness application environment 1B00. For example, a wellness profile 103 can store work and vacation schedules as well as any aspects of the user as the user may wish to use in goal-setting and tracking. The crowd and consensus processor can be used to receive recommendations from a crowd. For example, an uploader can be used to transmit portions of wellness profile to the crowd repository. Wellness program recommendations can be received from the crowd, and the wellness program recommendations from the crowd can be correlated using a consensus processor. In some cases, successful tracking of progress to a goal can be demonstrated to an insurance carrier (e.g., by transmitting documentation pertaining to tracking of progress to a goal), and the insurance carrier might reduce a premium amount.
As previously mentioned, an event that falls in a shown time period can be clicked or touched, and various characteristics of that event can be shown graphically. In the example of
In some cases, the food photo can be retrieved from a user's photo stream or the user's cloud repository. In some cases, a wellness application is configured with a smart phone app, which app serves to upload the food photo to a repository (e.g., defined by a URL) for subsequent access by a crowd. A user can allow (or deny) access to location services (e.g., via GPS or triangulation), and can co-upload other images that give the crowd participant some situational context. For example, a photo of a “big burger combo meal” might only depict the top portion of a sesame bun, and not any of the meat and/or cheese contents that would be consumed. While a particular crowd participant might regard this photo to be ambiguous by itself, if a co-uploaded image included the name and/or setting of the dispensary (e.g., a photo of a “Big Burger Franchise” location), then the crowd participant might be able to identify the foodstuff in the photo unambiguously as a “big burger combo meal”. Other information can be provided together with an uploaded image, and/or other information can be provided in a separate co-uploaded image. For example, the additional information might include time of day, or the name of the wireless network being used and/or the names of other co-located wireless networks. In some cases the name of the establishment can be determined from context, and can be provided in an upload. For example, the name of a wireless network might contain the name of the establishment. Any of the aforementioned information can be included as a part of a food image, and/or can be captured and/or transmitted as metadata. Such metadata can comprise a GPS coordinate (e.g., longitude and latitude), an establishment name (e.g., as a text string or as an image), a wireless network name, and/or a timestamp.
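Strictly as one example, such metadata and the determination of an establishment name from a wireless network name might be sketched as follows. The `UploadMetadata` type, the `KNOWN_ESTABLISHMENTS` list, the sample network name, and the substring-matching rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class UploadMetadata:
    gps: Optional[Tuple[float, float]] = None    # (latitude, longitude), assumed order
    establishment_name: Optional[str] = None     # text form only in this sketch
    wireless_network_name: Optional[str] = None
    timestamp: Optional[str] = None              # e.g., ISO 8601 text


# Hypothetical list of establishments known to the wellness application.
KNOWN_ESTABLISHMENTS = ["Big Burger Franchise"]


def infer_establishment(meta, known=KNOWN_ESTABLISHMENTS):
    """Prefer an explicit name; otherwise look for a known name in the network name."""
    if meta.establishment_name:
        return meta.establishment_name
    ssid = (meta.wireless_network_name or "").lower()
    ssid = ssid.replace("_", " ").replace("-", " ")
    for name in known:
        if name.lower() in ssid:
            return name
    return None


at_restaurant = infer_establishment(
    UploadMetadata(wireless_network_name="Big_Burger_Franchise_Guest")
)
at_home = infer_establishment(UploadMetadata(wireless_network_name="HomeNet"))
```

Every field is optional in this sketch, consistent with a user being able to allow or deny access to location services.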
Referring to
As shown in
In addition to uploading food images to the crowd, any aspects of a user's wellness program (e.g., any aspects shown in tracker tab 504) can be uploaded to the crowd for comment. In some cases crowd participants can serve as wellness coaches, and might comment on a particular user's program. Further, components within the consensus processor subsystem 200 (e.g., consensus processor 111) might be used to correlate varied responses from the crowd.
A user's actual intake of food based on a series of food photos, and based on the heretofore-described lookups as compared with a target, can be shown in a chart. In this example, the chart is presented under the tracker tab 504.
In this example, the intake of “Calories” and “Protein” are shown as compared with a “Target” amount. In other instances, other intake amounts as may be pictured in a food photo (e.g., cups of coffee) may be shown on the same or a different chart. A chart such as exemplified in nutrition tracker user interface 5E00 can track any sorts of or combinations of intake, and a chart such as exemplified in nutrition tracker user interface 5E00 can track over any time period and/or time period granularity.
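The comparison of intake against a target that underlies such a chart might be computed as sketched below. The function name and the sample intake and target values are hypothetical and are not taken from any figure of this disclosure.

```python
def progress_against_targets(intake, targets):
    """Return {nutrient: (consumed, target, percent of target)}."""
    report = {}
    for nutrient, target in targets.items():
        consumed = intake.get(nutrient, 0)
        percent = round(100 * consumed / target, 1) if target else None
        report[nutrient] = (consumed, target, percent)
    return report


# Hypothetical totals derived from a series of annotated food photos.
intake = {"Calories": 1800, "Protein": 45}
targets = {"Calories": 2500, "Protein": 60}
report = progress_against_targets(intake, targets)
```

The same computation can be repeated per day, per week, or at any other time period granularity to supply the values plotted in the chart.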
According to one embodiment of the disclosure, computer system 700 performs specific operations by processor 707 executing one or more sequences of one or more instructions contained in system memory 708. Such instructions may be read into system memory 708 from another computer readable/usable medium, such as a static storage device or a disk drive 710. In alternative embodiments, hard-wired circuitry may be used in place of or in combination with software instructions to implement the disclosure. Thus, embodiments of the disclosure are not limited to any specific combination of hardware circuitry and/or software. In one embodiment, the term “logic” shall mean any combination of software or hardware that is used to implement all or part of the disclosure.
The term “computer readable medium” or “computer usable medium” as used herein refers to any medium that participates in providing instructions to processor 707 for execution. Such a medium may take many forms, including but not limited to, non-volatile media and volatile media. Non-volatile media includes, for example, optical or magnetic disks, such as disk drive 710. Volatile media includes dynamic memory, such as system memory 708.
Common forms of computer readable media include, for example, floppy disk, flexible disk, hard disk, magnetic tape, or any other magnetic medium; CD-ROM or any other optical medium; punch cards, paper tape, or any other physical medium with patterns of holes; RAM, PROM, EPROM, FLASH-EPROM, or any other memory chip or cartridge, or any other non-transitory medium from which a computer can read data.
In an embodiment of the disclosure, execution of the sequences of instructions to practice the disclosure is performed by a single instance of the computer system 700. According to certain embodiments of the disclosure, two or more computer systems 700 coupled by a communications link 715 (e.g., LAN, PSTN, or wireless network) may perform the sequence of instructions required to practice the disclosure in coordination with one another.
Computer system 700 may transmit and receive messages, data, and instructions, including programs (e.g., application code), through communications link 715 and communication interface 714. Received program code may be executed by processor 707 as it is received and/or stored in disk drive 710 or other non-volatile storage for later execution. Computer system 700 may communicate through a data interface 733 to a database 732 on an external data repository 731. Data items in database 732 can be accessed using a primary key (e.g., a relational database primary key). A module as used herein can be implemented using any mix of any portions of the system memory 708, and any extent of hard-wired circuitry including hard-wired circuitry embodied as a processor 707.
In the foregoing specification, the disclosure has been described with reference to specific embodiments thereof. It will, however, be evident that various modifications and changes may be made thereto without departing from the broader spirit and scope of the disclosure. For example, the above-described process flows are described with reference to a particular ordering of process actions. However, the ordering of many of the described process actions may be changed without affecting the scope or operation of the disclosure. The specification and drawings are, accordingly, to be regarded in an illustrative sense rather than in a restrictive sense.
The present application is related to co-pending U.S. patent application Ser. No. ______, entitled, “OPTIMIZING WELLNESS PROGRAM SPENDING” (Attorney Docket No. ORA140562-US-NP), filed on even date herewith; and the present application is related to co-pending U.S. patent application Ser. No. ______, entitled, “FORMING RECOMMENDATIONS USING CORRELATIONS BETWEEN WELLNESS AND PRODUCTIVITY” (Attorney Docket No. ORA140676-US-NP), filed on even date herewith, each of which is hereby incorporated by reference in its entirety.