The present disclosure is generally related to systems for individualized career counseling.
In the educational world, it is widely understood that the earlier a child engages in formative experiences, the quicker they will master requisite skill sets that afford success later in life. However, the pressure on teens to constantly make forward-thinking decisions, frequently alone, is immense. Significant others in the family, community, and educational ecosystems often feel, or are, uninformed. Moreover, they can be ignorant of the potential ripple effects on lives and society when their loved one finds the “right” career pathway.
Career counseling services can help but are often underfunded, underemphasized, and sometimes incompetent or unmotivated. In the worst cases, a bad counselor can have a profoundly negative effect on the future of a child or young adult.
Automated career path services exist, but can be nebulous and unhelpful. Further, many of these tools and programs only use the most basic analysis techniques, simply giving students a few choices of a field of study based on their interests.
Teens and young adults need guidance on how to achieve their career goals. Human guidance can only be one part of a better guidance structure. Technology that uses artificial intelligence to create recommendations for career, college, and extracurricular activities is needed to optimize career planning.
It is with these observations in mind, among others, that various aspects of the present disclosure were conceived and developed.
Corresponding reference characters indicate corresponding elements among the views of the drawings. The headings used in the figures do not limit the scope of the claims.
A computer-implemented system generates personalized recommendations for a user seeking educational goal and career advice based on characteristics of the user. The system obtains a profile of a user (e.g., a student) that includes personality information, demographics information, and other information about the user, and applies one or more machine-learning models formulated at a processor to generate a set of recommendations for the user based on the profile of the user. In one aspect, the profile of the user can include a personality profile, an academic grade profile, an emotional intelligence (EQ) profile, and a positive intelligence (PQ) profile. Optionally, the profile of the user can further include a demographics profile, a physical characteristic profile, a goals profile, and/or a preferences profile. The system generates a set of recommendations for the user based on the profile of the user, including a set of recommended activities (e.g., clubs, sports, etc.), a set of recommended study areas (e.g., college majors, trade school study areas, etc.), a set of recommended careers, and a set of recommended learning institutions.
The system is operable to retrieve questions from a database for administration to a user. Responses to these questions from the user can be applied to obtain the profile for the user. The questions can include a set of personality questions that the system uses to obtain a personality profile of the student—in one implementation, the personality profile can include OCEAN scores for each of 5 OCEAN personality factors: openness, conscientiousness, extraversion, agreeableness, and neuroticism; in other implementations, other personality characterization methods may be employed. The set of recommendations may be made based on the OCEAN scores, among other factors such as emotional intelligence (EQ), positive intelligence (PQ), academic grades, and interests/preferences of the user.
In some examples, the set of recommendations can be dependent upon one another—for instance, if a particular study area is recommended for a student, then the set of recommended learning institutions for the student can include learning institutions that provide high-quality education with respect to the particular study area. In a further aspect, the one or more machine learning models can adjust the set of recommendations over time based on a trajectory of the user and based on feedback from other users.
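By way of non-limiting illustration, the dependency between a recommended study area and the set of recommended learning institutions might be sketched as follows. The function name `institutions_for`, the per-program rating table, and the `min_rating` threshold are hypothetical and are assumed here only to make the example concrete.

```python
from typing import Dict, List

def institutions_for(study_areas: List[str],
                     program_ratings: Dict[str, Dict[str, float]],
                     min_rating: float = 4.0) -> List[str]:
    """Return institutions whose best rating in any recommended study area
    meets the (assumed) quality threshold, strongest program first."""
    matches = []
    for institution, ratings in program_ratings.items():
        best = max((ratings.get(area, 0.0) for area in study_areas), default=0.0)
        if best >= min_rating:
            matches.append((best, institution))
    # Sort by descending rating, then by name for a stable order.
    return [name for _, name in sorted(matches, key=lambda m: (-m[0], m[1]))]

# Illustrative data only; the ratings are invented for the example.
ratings = {
    "College A": {"physics": 4.5},
    "College B": {"physics": 3.0, "history": 4.8},
    "College C": {"physics": 4.9},
}
physics_schools = institutions_for(["physics"], ratings)
```

In this sketch, a change to the recommended study areas changes the institution list without any other input changing, which is the dependency described above.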
Embodiments of the present disclosure will be described more fully hereinafter with reference to the accompanying drawings in which like numerals represent like elements throughout the several figures and in which example embodiments are shown. However, embodiments of the claims may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. The examples set forth herein are non-limiting and are merely examples among other possible examples.
The system 100 can include a profile engine 106 that acquires information about the user to construct a profile of the user and store the profile at the one or more databases 120. The system 100 can further include a decision framework 108. The decision framework 108 can receive information about the user including the profile of the user and can apply one or more machine learning models to generate a set of recommendations for the user based on the profile. The system 100 can communicate the set of recommendations to the user through the user interface 190 at the user device 101.
The system 100 can include or otherwise communicate with the one or more databases 120 including a user database 122 that includes information such as the profile of the user, a training database 124 that includes training data for training one or more machine learning models of the decision framework 108, a question database 126 that includes questions and other directives for information acquisition, profile construction and feedback, and a recommendations information database 128 that includes information about each of a plurality of recommendations that can be considered by the system 100. The decision framework 108 of the system 100 can communicate with the recommendations information database 128 to generate the set of recommendations for the user.
The system 100 can store and access information indicative of the profile of the user at the user database 122. This information may include identifying information, the set of recommendations made by the decision framework 108, the user's answers to the questions in the question database 126, and any feedback from the user.
The system 100 can also include the training database 124, which may include data used to train the machine learning (ML) models used by the system 100 to make recommendations. The data in the training database 124 may be similar to the data in the user database 122, and can include labeled data and/or unlabeled data for supervised, unsupervised, or semi-supervised learning (e.g., to train the one or more machine learning models of the decision framework 108). Instead of (or in addition to) feedback data, the training database 124 may include data on other success or failure metrics that can be used to improve recommendations from the decision framework 108 over time. For example, the success or failure metrics can indicate if a person quit an activity, or if a person was ranked among the top of their college major, or if a person was terminated from their job or barred from practicing their career, etc. These metrics may be used to train the one or more machine-learning models of the decision framework 108 to make better recommendations. The system may include a training engine 112, which can train the one or more machine-learning models of the decision framework 108 to make recommendations using data from the training database 124. The training engine 112 may also use data from the user database 122 to train the one or more machine-learning models of the decision framework 108 if there is sufficient data.
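As one hedged illustration, the success or failure metrics described above might be reduced to training labels as sketched below. The event names and the rule that a failure event takes precedence over a success event are assumptions made for the example; the disclosure does not specify how outcomes are encoded.

```python
from typing import Iterable, Optional

# Hypothetical event names standing in for the outcomes described in the text.
NEGATIVE_EVENTS = {"quit_activity", "terminated_from_job", "barred_from_career"}
POSITIVE_EVENTS = {"top_ranked_in_major"}

def label_from_outcomes(events: Iterable[str]) -> Optional[int]:
    """Return 1 (success), 0 (failure), or None when no known outcome exists."""
    observed = set(events)
    if observed & NEGATIVE_EVENTS:
        return 0  # assumed precedence: any failure event marks the record as failure
    if observed & POSITIVE_EVENTS:
        return 1
    return None  # unlabeled record, usable for unsupervised or semi-supervised learning
```

Records that return `None` correspond to the unlabeled data mentioned above, while labeled records could be supplied to the training engine 112 for supervised learning.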
The question database 126 can include questions used to evaluate a user's personality, emotional intelligence, positive intelligence, school grades, and physical characteristics. These questions may be accessed by the appropriate sub-engine of the profile engine 106 and presented to the user at the user interface 190.
The recommendations information database 128 can include information about various recommendations that may be made by the system 100. For example, the recommendations information database 128 can include information about various activities, study areas, careers, and/or learning institutions that may be considered for recommendations. Examples include prerequisites/necessary skills, correlation information (e.g., that correlate aspects of the profile of the user such as personality traits to one or more recommendations, that correlate study areas to careers, etc.), statistics, etc. This information may be accessed by the decision framework 108 when generating the set of recommendations.
The system 100 can include a profile engine 106 that acquires information about the user for inclusion within the profile of the user. Information acquired by the profile engine 106 can be stored at the one or more databases 120 in association with the user. The profile engine 106 can include: a personality profile engine 160A that administers one or more personality questions to the user and stores results including information indicative of a personality profile of the user; an EQ profile engine 160B that administers one or more emotional intelligence questions to the user and stores results including information indicative of an EQ profile of the user; a PQ profile engine 160C that administers one or more positivity questions to the user and stores results including information indicative of a PQ profile of the user; and a grades profile engine 160D that requests academic grade information from the user and stores results including information indicative of an academic grade profile of the user. In some implementations, the profile engine 106 can further include at least one of: a physical characteristics profile engine 160E that requests physical characteristic information from the user and stores results including information indicative of a physical characteristics profile of the user, a demographics profile engine 160F that requests demographics information from the user and stores results including information indicative of a demographics profile of the user, a goals profile engine 160G that administers one or more goal-related questions to the user and stores results including information indicative of a goals profile of the user, and a preferences profile engine 160H that administers one or more preference-related questions to the user and stores results including information indicative of a preferences profile of the user. 
The personality profile, the EQ profile, the PQ profile, the academic grade profile, the demographics profile, the physical characteristics profile, the goals profile, and the preferences profile can be included within the profile of the user, stored at the user database 122 in association with the user, and can be used as input to the decision framework 108 to generate the set of recommendations for the user.
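For illustration only, the composite profile might be represented as a single record whose optional sub-profiles are flattened into one set of model inputs. The class name `UserProfile`, its field names, and the `model_inputs` method are hypothetical; the disclosure does not prescribe a storage layout.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class UserProfile:
    """Hypothetical container mirroring the sub-profiles named in the text."""
    user_id: str
    personality: Dict[str, float] = field(default_factory=dict)
    eq: Dict[str, float] = field(default_factory=dict)
    pq: Dict[str, float] = field(default_factory=dict)
    grades: Dict[str, str] = field(default_factory=dict)
    # Optional sub-profiles, present only in some implementations.
    physical: Optional[Dict[str, float]] = None
    demographics: Optional[Dict[str, str]] = None
    goals: Optional[Dict[str, str]] = None
    preferences: Optional[Dict[str, str]] = None

    def model_inputs(self) -> Dict[str, object]:
        """Flatten every populated sub-profile into 'subprofile.key' features."""
        features: Dict[str, object] = {}
        for name in ("personality", "eq", "pq", "grades",
                     "physical", "demographics", "goals", "preferences"):
            sub = getattr(self, name)
            if sub:  # skip sub-profiles that are missing or empty
                for key, value in sub.items():
                    features[f"{name}.{key}"] = value
        return features

profile = UserProfile(user_id="u1",
                      personality={"openness": 0.8},
                      grades={"math": "A"})
```

Keeping the optional sub-profiles as `None` until populated lets the same record serve implementations with and without those sub-engines.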
The decision framework 108 can include a plurality of recommendation sub-engines. For example, the decision framework 108 can include an activities engine 180A, which may recommend activities such as basketball, judo, piano, journalism, etc. These recommendations are based on the profile of the user obtained by the profile engine 106 using questions answered by the user via the user interface 190. The decision framework 108 can also include a study areas engine 180B, which may recommend areas of study such as trades and college majors (e.g., physics, economics, medicine, history, etc.). The decision framework 108 can also include a careers engine 180C, which may recommend careers such as earth science, accounting, botany, hairstyling, etc. based on the profile of the user. The decision framework 108 can also include a learning institutions engine 180D that can recommend specific colleges or other schools based on the profile of the user.
The system 100 can include a feedback engine 110, which may allow users to give feedback on a recommendation. Feedback may be given by answering feedback questions which may be retrieved from the question database 126 on the admin network 102, and the answers may be sent to the user database 122 on the admin network 102. The feedback engine 110 may require some amount of verification that the user has reached a certain step towards the recommendation. For example, the user may only give feedback if the user has tried the recommended activity once, tried the recommended activity for 6 months, studied the recommended major for a year, worked in the recommended career field for 3 years, etc.
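A minimal sketch of such a verification check follows, assuming that engagement is tracked in months and that one minimum applies per recommendation type. The table `MIN_MONTHS` and the function `may_give_feedback` are hypothetical names; the thresholds are taken from the examples in the preceding paragraph.

```python
# Assumed minimum engagement, in months, before feedback is accepted.
MIN_MONTHS = {
    "activity": 6,     # tried the recommended activity for 6 months
    "study_area": 12,  # studied the recommended major for a year
    "career": 36,      # worked in the recommended career field for 3 years
}

def may_give_feedback(kind: str, months_engaged: float) -> bool:
    """Return True when the user has engaged long enough to give feedback."""
    required = MIN_MONTHS.get(kind)
    if required is None:
        return False  # unknown recommendation type: reject rather than guess
    return months_engaged >= required
```

An implementation that accepts feedback after a single attempt could set the corresponding threshold to zero and verify the attempt separately.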
The system can include the network 103, e.g., the cloud or Internet, which may be a wired and/or wireless communication network. The communication network, if wireless, may be implemented using communication techniques such as Visible Light Communication (VLC), Worldwide Interoperability for Microwave Access (WiMAX), Long Term Evolution (LTE), Wireless Local Area Network (WLAN), Infrared (IR) communication, Public Switched Telephone Network (PSTN), radio waves, and other communication techniques known in the art. The communication network may allow ubiquitous access to shared pools of configurable system resources and higher-level services that can be rapidly provisioned with minimal management effort, often over the Internet. Such a network relies on sharing of resources to achieve coherence and economies of scale, like a public utility, while third-party clouds enable organizations to focus on their core businesses instead of expending resources on computer infrastructure and maintenance.
As discussed, the profile engine 106 can include the personality profile engine 160A that retrieves one or more personality questions from the question database 126, displays the one or more personality questions at the user interface 190, and receives responses from the user at the user interface 190. The personality profile engine 160A can analyze the responses from the user and generate a personality profile 260A for the user based on the responses. The personality profile engine 160A can determine one or more personality scores of the personality profile 260A of the user that quantify aspects of the personality of the user based on the responses to the one or more personality questions. In one example, the one or more personality scores can include OCEAN scores that quantify personality traits including openness, conscientiousness, extraversion, agreeableness, and neuroticism. The one or more personality scores can also include other personality quantifiers such as, but not limited to, Myers-Briggs Type Indicator (MBTI), Enneagram, and/or DISC assessment.
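One possible way to derive per-trait scores from questionnaire responses is sketched below. It assumes a five-point agreement scale, a per-question trait assignment, and reverse-keyed items, with each trait score taken as the mean of its item values. These scoring conventions, the `LIKERT` table, and the function `ocean_scores` are assumptions for illustration and are not stated in the disclosure.

```python
from typing import Dict, List, Tuple

# Assumed five-point scale; the intermediate labels are hypothetical.
LIKERT = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly Agree": 5,
}

def ocean_scores(responses: List[Tuple[str, str, bool]]) -> Dict[str, float]:
    """Average item values per trait.

    Each response is (trait, answer_label, reverse_keyed). A reverse-keyed
    item measures the opposite pole, so its value is flipped (1<->5, 2<->4).
    """
    totals: Dict[str, float] = {}
    counts: Dict[str, int] = {}
    for trait, label, reverse_keyed in responses:
        value = LIKERT[label]
        if reverse_keyed:
            value = 6 - value
        totals[trait] = totals.get(trait, 0) + value
        counts[trait] = counts.get(trait, 0) + 1
    return {trait: totals[trait] / counts[trait] for trait in totals}

scores = ocean_scores([
    ("extraversion", "Strongly Agree", False),  # value 5
    ("extraversion", "Disagree", True),         # reverse-keyed: 6 - 2 = 4
])
```

Other personality characterization methods mentioned above would require a different scoring function.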
A process applied at the personality profile engine 160A may begin with the personality profile engine 160A being initiated by the user. The user may select an option with the user interface 190, such as “Personality” or “Answer Personality Questions.” The personality profile engine 160A may select a question from the question database 126 that is a personality question. These questions will assess the user's personality based on the 5-dimensional OCEAN personality type index. The personality profile engine 160A may determine if this user has an answer in the user database 122 for the selected question, and may determine if the answer has been provided recently (e.g., within the last 6 months). If the question has already been answered, the personality profile engine 160A may skip to another question, or may proceed to evaluate the answers if all questions have been answered. Users may be able to edit previous answers if they choose. If the user has not already answered the question or wants to change their answer, or if the answer has not been verified recently, the personality profile engine 160A may prompt the user to answer the question. The options for the answer can be based on the question-answer format in the question database 126. For example, a personality question could have the user answer by agreeing or disagreeing with a statement such as “I see myself as extraverted and enthusiastic.” The user may have a range of answers from “Strongly Disagree” to “Strongly Agree”. The personality profile engine 160A may record the user's answer in the user database 122. The personality profile engine 160A may determine if there is another question in the question database 126 that is a personality question. If there is another personality question, the personality profile engine 160A may select the next question. If there are no other personality questions, the personality profile engine 160A may end.
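The selection of questions that still require a prompt might be sketched as follows. The sketch assumes that each stored answer carries a timestamp and approximates the six-month recency window as 183 days; `RECENCY_WINDOW` and `questions_to_ask` are hypothetical names.

```python
from datetime import datetime, timedelta
from typing import Dict, List

# "Within the last 6 months", approximated here as 183 days (an assumption).
RECENCY_WINDOW = timedelta(days=183)

def questions_to_ask(question_ids: List[str],
                     answered_at: Dict[str, datetime],
                     now: datetime) -> List[str]:
    """Return the questions with no stored answer or with a stale answer."""
    pending = []
    for question_id in question_ids:
        timestamp = answered_at.get(question_id)
        if timestamp is None or now - timestamp > RECENCY_WINDOW:
            pending.append(question_id)  # prompt the user for this question
    return pending

now = datetime(2024, 7, 1)
stored = {"q1": datetime(2024, 6, 1), "q2": datetime(2023, 1, 1)}
pending = questions_to_ask(["q1", "q2", "q3"], stored, now)
```

An empty result corresponds to the case in which all questions have been answered recently, at which point the answers could be evaluated.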
The profile engine 106 can include the EQ profile engine 160B that retrieves one or more emotional intelligence questions from the question database 126, displays the one or more emotional intelligence questions at the user interface 190, and receives responses from the user at the user interface 190. The EQ profile engine 160B can analyze the responses from the user and generate the EQ profile 260B for the user based on the responses. The EQ profile engine 160B can determine one or more EQ scores of the EQ profile 260B of the user that quantify aspects of the emotional intelligence of the user based on the responses to the one or more EQ questions.
The profile engine 106 can include the PQ profile engine 160C that retrieves one or more positivity questions from the question database 126, displays the one or more positivity questions at the user interface 190, and receives responses from the user at the user interface 190. The PQ profile engine 160C can analyze the responses from the user and generate the PQ profile 260C for the user based on the responses. The PQ profile engine 160C can determine one or more PQ scores of the PQ profile 260C of the user that quantify aspects of the positivity of the user based on the responses to the one or more PQ questions.
The processes applied by either the EQ profile engine 160B or the PQ profile engine 160C may begin with the EQ profile engine 160B or the PQ profile engine 160C being initiated by the user. The user may select an option with the user interface 190, such as “EQ/PQ” or “Answer Emotion/Positivity Questions.” The EQ profile engine 160B or the PQ profile engine 160C may be initiated after the personality profile engine 160A has ended. The EQ profile engine 160B or the PQ profile engine 160C may select a question from the question database 126 that is an EQ or PQ question. These questions will assess the user's emotional intelligence, positive intelligence, or both. The EQ profile engine 160B or the PQ profile engine 160C may determine if this user has an answer in the user database 122 for the selected question. Users may be able to edit previous answers if they choose. If the user has not already answered the question or wants to change their answer, the EQ profile engine 160B or the PQ profile engine 160C may prompt the user to answer the question. The options for the answer are based on the question-answer format in the question database 126. For example, an EQ question is likely to have the user answer by agreeing or disagreeing with a statement such as “I am flexible and willing to adapt to new conditions.” The user may have a range of answers from “Disagree completely” to “Agree completely.” A PQ question is likely to have the user answer by agreeing or disagreeing with a statement such as “I am often intrigued or fascinated.” The user may have a range of answers from “Not At All” to “Extremely.” The EQ profile engine 160B or the PQ profile engine 160C may record the user's answer in the user database 122. The EQ profile engine 160B or the PQ profile engine 160C may determine if there is another question in the question database 126 that is an EQ or PQ question.
If there is another EQ or PQ question, the EQ profile engine 160B or the PQ profile engine 160C may select the next question. If there are no other EQ or PQ questions, the EQ profile engine 160B or the PQ profile engine 160C may end.
The profile engine 106 can include the grades profile engine 160D that requests academic grade information from the user, displays multiple fields for entry of the academic grade information at the user interface 190 (e.g., course identifiers, when the course was taken, and associated letter or number grades that the user achieved in the course), and receives responses from the user at the user interface 190. The grades profile engine 160D can analyze the responses from the user and generate the grades profile 260D for the user based on the responses. The grades profile engine 160D enables quantification of one or more skills that the user may have and/or mastery of concepts demonstrated by the user based on the reported academic grades.
A process applied by the grades profile engine 160D may begin with the grades profile engine 160D being initiated by the user. The user may select an option with the user interface 190 such as “Grades” or “Answer Grades Questions.” The grades profile engine 160D may be initiated after the personality profile engine 160A, the EQ profile engine 160B and/or the PQ profile engine 160C have ended. The grades profile engine 160D may select a question from the question database 126 that is a grades question. These questions will assess the user's current or past grades in different school subjects. The grades profile engine 160D may determine if this user has an answer in the user database 122 for the selected question. Users may be able to edit previous answers if they choose. If the user has not already answered the question or wants to change their answer, the grades profile engine 160D may prompt the user to answer the question. The options for the answer are based on the question-answer format in the question database 126. For example, a grades question will likely have the user answer by responding with a letter grade to a question such as “what is your current grade in math?”. The user may have a range of answers from F to A+. The grades profile engine 160D may record the user's answer in the user database 122. The grades profile engine 160D may determine if there is another question in the question database 126 that is a grades question. If there is another grade question, the grades profile engine 160D may select the next question. If there are no other grade questions, the grades profile engine 160D may end.
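As a hedged example, letter grades in the range from F to A+ might be converted to numeric values and averaged per subject as sketched below. The grade-point table is a hypothetical convention chosen for the example (grading scales differ between schools), and `subject_averages` is an illustrative name.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical grade-point convention; real scales vary by institution.
GRADE_POINTS = {
    "A+": 4.3, "A": 4.0, "A-": 3.7,
    "B+": 3.3, "B": 3.0, "B-": 2.7,
    "C+": 2.3, "C": 2.0, "C-": 1.7,
    "D+": 1.3, "D": 1.0, "D-": 0.7,
    "F": 0.0,
}

def subject_averages(entries: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Average the grade points reported for each subject.

    Each entry is (subject, letter_grade). An unrecognized letter grade
    raises KeyError so that bad input is not silently scored.
    """
    totals: Dict[str, float] = {}
    counts: Dict[str, int] = {}
    for subject, letter in entries:
        points = GRADE_POINTS[letter.strip().upper()]
        totals[subject] = totals.get(subject, 0.0) + points
        counts[subject] = counts.get(subject, 0) + 1
    return {subject: totals[subject] / counts[subject] for subject in totals}

averages = subject_averages([("math", "A"), ("math", "B"), ("history", "F")])
```

The resulting per-subject values are one form in which the reported grades could be supplied to the decision framework 108.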
In some implementations, the grades profile engine 160D may be operable to import grades from a school portal or through another method. This may be easier on the user, as they could bypass the time-consuming process of entering each course and grade received by hand, and can also ensure that the grades recorded within the grades profile 260D are accurate with respect to transcripts that a college or other learning institution may receive from the user's school in the future. Further, this may avoid ambiguity that could arise from differences or confusion in course identifiers and standards associated with each. In addition, information imported into the grades profile 260D from a school portal may include additional contextual information such as comments from the instructor, quarterly checkpoint grades, and other information. Written comments from the instructor can, for example, be subjected to natural language processing methods to extract concepts and add context to the profile 206 of the user.
The profile engine 106 can include the physical characteristics profile engine 160E that requests physical characteristics information from the student and/or retrieves physical characteristics questions from the question database 126, displays multiple fields for entry of the physical characteristics information and responses to the physical characteristics questions at the user interface 190 (e.g., “How tall are you?” “How tall are your parents?” “How many push-ups can you reliably complete?” “How long does it take you to run 300 yards?” “How long does it take you to run a mile?”), and receives responses from the user at the user interface 190. The physical characteristics profile engine 160E can analyze the responses from the user and generate the physical characteristics profile 260E for the user based on the responses. The physical characteristics profile engine 160E enables quantification of one or more physical skills and/or attributes that the user may have based on the reported physical characteristics—these factors may be relevant for generating recommendations for activities such as sports, as well as for generating career recommendations for physically-intensive occupations such as athletes and first responders.
A process applied by the physical characteristics profile engine 160E may begin with the physical characteristics profile engine 160E being initiated by the user. The user may select an option with the user interface 190, such as “Physical Characteristics” or “Answer Physical Characteristics Questions.” The physical characteristics profile engine 160E may be initiated after the personality profile engine 160A, EQ profile engine 160B, PQ profile engine 160C, and/or grades profile engine 160D have ended. The physical characteristics profile engine 160E may select a question from the question database 126 that is a physical characteristics question. These questions will assess the user's physical characteristics, such as height, weight, physical condition, whether the user has any disabilities, etc. The physical characteristics profile engine 160E may determine if this user has an answer in the user database 122 for the selected question. If the question has already been answered, the physical characteristics profile engine 160E may skip to another question. Users may be able to edit previous answers if they choose. If the user has not already answered the question or wants to change their answer, the physical characteristics profile engine 160E may prompt the user to answer the question. The options for the answer can be based on the question-answer format in the question database 126. For example, a physical characteristics question is likely to have the user answer by responding with a numerical value to a question such as “What is your current height in inches?” The user may only be able to answer in realistic values such as between 20 and 100. The physical characteristics profile engine 160E may record the user's answer in the user database 122. The physical characteristics profile engine 160E may determine if there is another question in the question database 126 that is a physical characteristics question.
If there is another physical characteristics question, the physical characteristics profile engine 160E may select the next question. If there are no other physical characteristics questions, the physical characteristics profile engine 160E may end.
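The restriction of numeric answers to realistic values might be implemented along the following lines. The function name `parse_bounded_number` is hypothetical, and the default bounds of 20 and 100 are taken from the height example above.

```python
from typing import Optional

def parse_bounded_number(raw: str,
                         low: float = 20.0,
                         high: float = 100.0) -> Optional[float]:
    """Parse a numeric answer and accept it only inside [low, high].

    Returns None for non-numeric text or out-of-range values so that the
    caller can prompt the user again instead of storing the answer.
    """
    try:
        value = float(raw.strip())
    except ValueError:
        return None
    # The comparison also rejects NaN, which compares False with everything.
    if low <= value <= high:
        return value
    return None
```

Different questions (e.g., run times or push-up counts) would use different bounds, which could be stored with the question-answer format in the question database 126.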
The profile engine 106 can include the demographics profile engine 160F that requests demographics information from the student and/or retrieves demographics questions from the question database 126, displays multiple fields for entry of the demographics information and responses to the demographics questions at the user interface 190, and receives responses from the user at the user interface 190. The demographics profile engine 160F can analyze the responses from the user and generate the demographics profile 260F for the user based on the responses. The demographics profile engine 160F enables quantification of information such as location and background of the user that may be relevant for generating recommendations for activities and learning institutions. For example, a user who identifies as Navajo may be eligible for scholarships and/or participation in various clubs and academic societies due to their heritage. In another example, location and/or economic information reported by a user may be relevant for recommending learning institutions for the user.
The profile engine 106 can include the goals profile engine 160G that retrieves one or more goal-related questions from the question database 126, displays the one or more goal-related questions at the user interface 190, and receives responses from the user at the user interface 190. The goals profile engine 160G can analyze the responses from the user and generate the goals profile 260G for the user based on the responses. The goals profile engine 160G can help determine relevancy factors for the set of recommendations for the user—for example, a user may express that their goals may include outcomes such as landing a particular job, achieving a particular income level, attending a prestigious college, and graduating with little to no debt. Other goals may include, for example, a desire to travel, take care of family, and/or participate in humanitarian efforts. Goals expressed by the user may be relevant for recommending activities, study areas, careers, and learning institutions.
The profile engine 106 can include the preferences profile engine 160H that retrieves one or more preference-related questions from the question database 126, displays the one or more preference-related questions at the user interface 190, and receives responses from the user at the user interface 190. The preferences profile engine 160H can analyze the responses from the user and generate the preferences profile 260H for the user based on the responses. The preferences profile engine 160H can help determine relevancy factors for the set of recommendations for the user. For example, a user may express preferences about college prestige, location (e.g., distance from home, general region, weather preferences), cost (e.g., can include scholarship availability and tuition cost), attributes (e.g., historically black, technology-focused, Ivy League, etc.) and type (e.g., private, public, military, religious, etc.). Preferences may also include items pertaining to activities, study fields, and careers. For example, a user may indicate that they want a more “hands-on” career, that they may enjoy playing a certain genre of music, or may want to play a particular sport in college. The preferences profile engine 160H may provide one or more fields where a user can enter comments; these comments can be subjected to natural language processing methods to extract concepts and add context to the profile 206 of the user.
With additional reference to
The process may begin with the decision framework 108 being initiated by the user. The user may select an option with the user interface 190, such as “Recommendations” or “View Recommendations”. The base module 104 (
The activities engine 180A can receive information from the profile 206 of the user, including the personality profile 260A, the EQ profile 260B and the PQ profile 260C of the user. In some embodiments, the activities engine 180A can include an activities decision model 182A, which can be a machine-learning model trained to assign an activity relevancy label 282A to one or more activities for the user based on the profile 206 of the user. The activity relevancy label 282A can be a numeric relevancy value that might represent a probability that the associated activity would be relevant to the user, or that may represent a classification value (e.g., on a scale from 1-5 with 5 being most relevant, or a binary “yes” or “no”). Based on the personality profile 260A, the EQ profile 260B and the PQ profile 260C of the user, the activities engine 180A may initially construct the set of recommended activities 280A using the set of activity information and the set of activity correlation information.
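The two forms of relevancy label described above might be derived from a model's probability output as sketched below. The equal-width binning onto a 1-5 scale and the 0.5 threshold for the binary label are assumptions; the disclosure does not specify how a probability would be mapped to a classification value.

```python
def to_scale(probability: float) -> int:
    """Map a probability in [0, 1] onto a 1-5 relevancy class (5 = most relevant).

    Uses five equal-width bins; a probability of exactly 1.0 is clamped to 5.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return min(5, int(probability * 5) + 1)

def to_binary(probability: float, threshold: float = 0.5) -> str:
    """Map a probability onto a binary 'yes' / 'no' relevancy label."""
    return "yes" if probability >= threshold else "no"
```

For example, under these assumptions a probability of 0.5 corresponds to class 3 on the scale and to “yes” as a binary label.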
The activities engine 180A may then receive additional information about the user, such as the physical characteristics profile 260E, and may modify the set of recommended activities 280A based on the additional information. For example, if a user is or is not athletic, then the activities engine 180A may modify an activity relevancy label for one or more sports activities and update the set of recommended activities 280A accordingly to include or exclude one or more sports based on the modified relevancy labels.
The activities engine 180A may then receive further information about the user, such as the grades profile 260D, demographics profile 260F, goals profile 260G, and/or preferences profile 260H, and may modify the set of recommended activities 280A based on this information. For example, if a user demonstrates math skill as evidenced by their grades profile 260D, then the activities engine 180A may modify a relevancy label for one or more math-related activities (such as robotics or another engineering club) and update the set of recommended activities 280A accordingly to include one or more math-related activities based on the modified relevancy labels. In another example, if the goals profile 260G or the preferences profile 260H indicates that the user enjoys working with children, then the activities engine 180A may modify a relevancy label for one or more related activities (such as tutoring or another club that involves volunteering with kids) and update the set of recommended activities 280A accordingly to include one or more activities based on the modified relevancy labels. In a further aspect, the demographics profile 260F may indicate that a user may be at a statistical advantage or disadvantage, and based on this information the activities engine 180A may modify an activity relevancy label of one or more activities that may improve a probability of success for the user. For example, a user attending school in a small farming town may be at a slight disadvantage due to lack of resources and world exposure; as such, the activities engine 180A may modify an activity relevancy label to emphasize certain sports or clubs that may allow the user to gain skills and travel in order to give them a competitive edge.
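By way of non-limiting illustration, the relevancy-label adjustments described above can be sketched as follows. The sketch assumes a probability-style relevancy label between 0 and 1; the `Activity` class, the tag vocabulary, the adjustment amounts, and the 0.5 inclusion threshold are hypothetical choices made for illustration and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    name: str
    tags: frozenset   # hypothetical keywords, e.g. {"math"} or {"sports"}
    relevancy: float  # probability-style relevancy label in [0, 1]


def adjust_relevancy(activities, tag, delta):
    """Shift the relevancy label of every activity carrying `tag`, clamped to [0, 1]."""
    for activity in activities:
        if tag in activity.tags:
            activity.relevancy = min(1.0, max(0.0, activity.relevancy + delta))


def recommended(activities, threshold=0.5):
    """Return the names of activities whose relevancy label meets the threshold."""
    return sorted(a.name for a in activities if a.relevancy >= threshold)


activities = [
    Activity("robotics club", frozenset({"math"}), 0.4),
    Activity("tutoring", frozenset({"children"}), 0.45),
    Activity("soccer", frozenset({"sports"}), 0.6),
]
adjust_relevancy(activities, "math", +0.2)    # grades profile shows math skill
adjust_relevancy(activities, "sports", -0.3)  # physical profile: not athletic
print(recommended(activities))
```

In this sketch the set of recommended activities is simply recomputed from the labels after each adjustment, so later profile information can both add and remove activities.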
The process may begin with the activities engine 180A being initiated by the base module 104 and/or the decision framework 108. The activities engine 180A may select the user in the user database 122. The activities engine 180A may generate the set of recommended activities 280A based on the user's answers to personality questions as indicated within the profile 206 of the user. The answers to the questions may be analyzed to generate OCEAN scores for each of the five OCEAN personality factors: openness, conscientiousness, extraversion, agreeableness, and neuroticism. Activity recommendations may then be made based on the OCEAN scores. For example, a high score in openness may result in the activities engine 180A making the following activity recommendations. Sports: Archery, CrossFit, Jiu-Jitsu, Hapkido. Arts: Storytelling, Broadcasting, Culture, Drama. Mentoring: Drug and Alcohol Use Support, Field Trips, Advocacy, Home Visiting. Health: Self-Awareness, Nutrition, Body-Mind-Soul Development. Education: Career Exploration, Entrepreneurship, Foreign Language, Field Trips. Volunteerism: Multicultural Ministry, Outreach Events, Field Trips, Home Visiting. The activities engine 180A may, for example, make recommendations based on results that fall within the top or bottom 20% of the scale for each of the five factors, as the middle 60% may not show a clear association in either direction with a meaningful level of confidence. These five OCEAN factors, or the answers themselves, may be used as direct inputs into the activities decision model 182A, which can be a machine-learning model trained by the training engine 112. The activities engine 180A may alter the set of recommended activities based on the user's answers to EQ and PQ questions. The answers to these questions may be analyzed to generate a score for EQ and PQ. Activities may be added to the recommendations or removed based on these scores.
For example, if the user scores between 81 and 90 EQ, the following activities may be added to the recommendations if not already recommended. Sports: Roller Skating, Kung Fu, Hockey. Arts: Ballet, Animation, Choreography. Mentoring: Counseling, Motivation. Health: Balance and Flexibility, Stepping, Therapy. Education: Current Events, Ethics, Reading Skills. Volunteerism: Crisis Management, Ex-Offender Assistance, Homelessness. Some score ranges may result in no changes to the generated recommendations, such as below 61 EQ and below 21 PQ. These EQ and PQ scores, or the answers themselves, may be used as direct inputs into the activities decision model 182A. The activities engine 180A may alter the activity recommendations based on the user's answers to school grade questions. For example, students consistently performing poorly in math subjects may have activities that require a keen understanding of mathematics, such as robotics, removed from their recommendations. The user's grades may be used as direct inputs into the activities decision model 182A. The activities engine 180A may alter the activity recommendations based on the user's answers to physical characteristics questions. The user's physical characteristics may be used to change the physical activities recommended. Physical characteristics include height, weight, shoe size, etc.
Each physical characteristic may have a threshold for relevance. For example, the characteristic may be relevant if a user is in the top or bottom 20th percentile for height. A user in the top 20th percentile for height may have basketball and volleyball added to their recommendations if not already recommended. A user in the bottom 20th percentile for height may have basketball and volleyball removed from their recommendations. Which physical characteristics are relevant, or the characteristics themselves, may be used as direct inputs into the activities decision model 182A. The activities engine 180A may save the set of recommended activities 280A in the user database 122 associated with the selected user. The user can view these recommendations via the user interface 190.
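By way of non-limiting illustration, the two threshold rules described above (acting only on personality scores in the top or bottom 20% of the scale, and adjusting sports for users in the top or bottom 20th percentile for height) can be sketched as follows. The 0 to 100 score scale, the function names, and the representation of recommendations as a list of names are assumptions made for illustration.

```python
def extreme_band(score, scale_min=0.0, scale_max=100.0, tail=0.20):
    """Return "high" or "low" when a score falls in the top or bottom `tail`
    fraction of the scale, and None for the middle of the scale."""
    span = scale_max - scale_min
    if score >= scale_max - tail * span:
        return "high"
    if score <= scale_min + tail * span:
        return "low"
    return None


HEIGHT_SPORTS = ["Basketball", "Volleyball"]  # the sports named in the example


def apply_height_rule(recommendations, height_percentile):
    """Add or remove height-sensitive sports based on the user's height percentile."""
    recs = list(recommendations)
    if height_percentile >= 80:      # top 20th percentile: add if missing
        recs += [s for s in HEIGHT_SPORTS if s not in recs]
    elif height_percentile <= 20:    # bottom 20th percentile: remove
        recs = [s for s in recs if s not in HEIGHT_SPORTS]
    return recs


if extreme_band(85) == "high":       # e.g., a high openness score
    recs = ["Archery", "CrossFit", "Jiu-Jitsu", "Hapkido"]
    print(apply_height_rule(recs, height_percentile=90))
```

A score in the middle 60% returns `None`, so no rule fires for that factor, which mirrors the statement that the middle of the scale may not show a clear association.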
In some embodiments, as shown in
For example, the set of study area information can include information about a study area such as an identifier and one or more keywords or other attributes about the study area, such as participation requirements or criteria, skills learned and/or required, general and/or specific fields associated with the study area (e.g., art, culture, music, engineering, math, etc.), availability (e.g., would the student need to travel to a specific school to study a particular subject, or can they go almost anywhere?), graduate or post-graduate availability, investment (e.g., a student wanting to study engineering may require a minimum four-year investment of time and tuition, whereas a student wanting to study law may require a 4-year undergraduate degree before attending law school) and any other information that may be pertinent to a study area. Further, for example, the set of study area correlation information can include information about how each study area correlates with information that may be present in the profile 206 (such as compatible personality score ranges, preferences, etc.) and information about how each study area correlates with one or more activities, one or more careers (e.g., a user wishing to design computer hardware may want to consider degrees in electrical engineering or computer science), and/or one or more learning institutions (e.g., if a user wants to attend a prestigious arts school, then they may want to consider an arts major).
The study areas engine 180B can receive information from the profile 206 of the user, including the personality profile 260A of the user. In some embodiments, the study areas engine 180B can include a study areas decision model 182B, which can be a machine-learning model trained to assign a study area relevancy label 282B to one or more study areas for the user based on the profile 206 of the user. For example, based on the personality profile 260A of the user, the study areas engine 180B may initially construct the set of recommended study areas 280B using the set of study area information and the set of study area correlation information.
The study areas engine 180B may then receive additional information about the user, such as the grades profile 260D, the EQ profile 260B, the PQ profile 260C, the physical characteristics profile 260E, the demographics profile 260F, the goals profile 260G and the preferences profile 260H, and may modify the set of recommended study areas 280B based on the additional information. For example, if a user is good at math as evidenced by their grades profile 260D, then the study areas engine 180B may modify a study area relevancy label for one or more math-heavy study areas and update the set of recommended study areas 280B accordingly to include or exclude one or more study areas based on the modified study area relevancy labels. In a further aspect, the study areas engine 180B may receive activities information about the user, such as information about hobbies or clubs that the user enjoys and participates in and/or the set of recommended activities 280A (
Table 1 shows a portion of an example “Grades to Majors” correlation matrix that may be included within the set of study area correlation information of the study areas database 128B, where study areas correlate to academic courses based on relative importance of grades. For example, for a journalism major, grades in writing and speech courses are of high importance (assigned a score of “3”) but grades in calculus are of minimal importance (assigned a score of “1”). Based on this example, if journalism was initially included within the set of recommended study areas 280B for a user based on their personality profile, but their grades profile indicates that they do not have sufficient writing skill based on how they perform in writing courses, then the study areas engine 180B may modify a study area relevancy label for the “journalism” study area and update the set of recommended study areas 280B accordingly to de-emphasize or exclude journalism from the set of recommended study areas 280B based on the modified study area relevancy labels. In one aspect, the study areas engine 180B can obtain a grade relevance factor from the set of study area correlation information and modify one or more study area relevancy labels for the user based on the grades profile 260D with respect to the grade relevance factor.
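By way of non-limiting illustration, one possible use of the grade relevance factor is an importance-weighted grade average per study area. The journalism weights below are the ones given in the example above; the 4.0 grade-point scale, the dictionary layout, and the function name are assumptions made for illustration and do not reproduce Table 1.

```python
# Importance of each course's grade for a study area (3 = high, 1 = minimal).
GRADE_IMPORTANCE = {
    "journalism": {"writing": 3, "speech": 3, "calculus": 1},
}


def weighted_grade_score(study_area, grade_points):
    """Importance-weighted mean of the user's grade points for one study area.

    Courses the user has not taken are skipped. Returns None when none of the
    relevant courses appear in the grades profile.
    """
    weights = GRADE_IMPORTANCE[study_area]
    taken = {course: w for course, w in weights.items() if course in grade_points}
    total_weight = sum(taken.values())
    if total_weight == 0:
        return None
    return sum(w * grade_points[course] for course, w in taken.items()) / total_weight


# Hypothetical grades profile on a 4.0 scale: weak writing and speech, strong calculus.
grades = {"writing": 2.0, "speech": 2.0, "calculus": 4.0}
score = weighted_grade_score("journalism", grades)
print(round(score, 2))
```

Because writing and speech carry three times the weight of calculus, the strong calculus grade does little to offset the weaker writing and speech grades, which is the kind of signal that could be used to de-emphasize journalism for this user.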
Study area and learning institution recommendations may be separate recommendations or may be associated. For example, the study areas engine 180B may recommend mathematics as a major at any learning institution because of the ubiquity of math but may only recommend botany as a major at learning institutions where botany has a history of leading to a realistic career path. Answers to questions may be analyzed to generate a score for each of the five OCEAN personality factors: openness, conscientiousness, extraversion, agreeableness, and neuroticism. Recommendations may then be made based on the scores. For example, a high score in openness may result in the system recommending communications as a major. The study areas engine 180B may only make recommendations based on results that fall within the top or bottom 20% of the scale for each of the five factors, as the middle 60% may not show a clear association in either direction with a meaningful level of confidence. These five OCEAN factors, or the answers themselves, may be used as direct inputs into the study areas decision model 182B. The study areas engine 180B may alter the set of recommended study areas based on the user's answers to EQ and PQ questions. The answers to these questions may be analyzed to generate a score for EQ and PQ. Majors may be added to the recommendations or removed based on these scores. For example, if the user scores between 101 and 110 EQ, the study areas engine 180B may add psychology to the list of recommended study areas. Some score ranges may result in no changes to the generated recommendations, such as below 61 EQ and below 21 PQ. These EQ and PQ scores, or the answers themselves, may be used as direct inputs into the study areas decision model 182B. The study areas engine 180B may alter the set of recommended study areas based on the user's answers to school grade questions.
For example, a user with good grades in business, math, and computer-related courses may be recommended majors relevant to careers in business operations, business management, and entrepreneurship. The user's grades may be used as direct inputs into the study areas decision model 182B. The study areas engine 180B may alter the set of recommended study areas based on the user's answers to physical characteristics questions. The user's physical characteristics may not be considered unless there is a physical component to a recommended major or a reason physical characteristics may be relevant to study area selection, such as disability scholarships. Which physical characteristics are relevant, or the characteristics themselves, may be used as direct inputs into the study areas decision model 182B. The study areas decision model 182B may save the set of recommended study areas 280B in the user database 122 associated with the selected user. The user can view these recommendations at the user interface 190.
For example, the set of career information can include information about a career such as an identifier and one or more keywords or other attributes about the career, such as participation requirements or criteria, skills required, general and/or specific fields associated with the career (e.g., art, culture, music, engineering, math, etc.), availability (e.g., is this an exclusive career path in a competitive market, such as if the student wants to be elected President, or is this career path relatively accessible or in high demand?), graduate/post-graduate/certification requirements (e.g., would the student be able to get the job they want with a bachelor's degree alone, or would they need additional schooling?), investment (e.g., a student wanting to be an engineer may only need a 4-year degree at minimum, whereas a student wanting to become a lawyer may require a 4-year undergraduate degree before attending law school, and would then need to pass at least one bar exam in order to practice) and any other information that may be pertinent to a career path. Further, for example, the set of career correlation information can include information about how each career correlates with information that may be present in the profile 206 and information about how each career correlates with one or more activities, one or more study areas, and/or one or more learning institutions.
The careers engine 180C can receive information from the profile 206 of the user, including the personality profile 260A of the user. In some embodiments, the careers engine 180C can include a careers decision model 182C, which can be a machine-learning model trained to assign a career relevancy label 282C to one or more careers for the user based on the profile 206 of the user. For example, based on the personality profile 260A of the user, the careers engine 180C may initially construct the set of recommended careers 280C using the set of career information and the set of career correlation information.
The careers engine 180C may then receive additional information about the user, such as the grades profile 260D, the EQ profile 260B, the PQ profile 260C, the physical characteristics profile 260E, the demographics profile 260F, the goals profile 260G and the preferences profile 260H, and may modify the set of recommended careers 280C based on the additional information. For example, if a user is good at math as evidenced by their grades profile 260D, then the careers engine 180C may modify a career relevancy label for one or more math-heavy careers and update the set of recommended careers 280C accordingly to include or exclude one or more careers based on the modified career relevancy labels. In a further aspect, the careers engine 180C may receive activities information about the user and/or study areas information about the user, such as information about hobbies or clubs that the user enjoys and participates in, information about a subject a user is or wants to minor in, and/or information about a subject that a student has taken classes towards, and may use this information to modify a career relevancy label for one or more related careers and update the set of recommended careers 280C accordingly. For example, if a user participates in a robotics club, then career relevancy labels for one or more careers that employ skills that might have been learned due to participation in robotics may be modified accordingly. In another example, if a user is pursuing a technical degree as their major, but also has writing skills and/or shows deep interest in topics such as anthropology and history, then relevancy labels for one or more related careers may be modified accordingly; for example, the set of recommended careers 280C may be updated to include careers that involve technical writing or that could involve study of ancient technology.
Table 2 shows a portion of an example “Majors to Careers” correlation matrix that may be included within the set of career correlation information, where careers correlate to majors based on applicability. For example, if a user pursues an accounting degree, then the set of recommended careers 280C can include careers within the fields of business, tax preparation and analysis, auditing, management, and mathematics. In one aspect, the careers engine 180C can obtain a study area relevance factor from the set of career correlation information and modify one or more career relevancy labels for the user based on the set of recommended study areas and/or based on a grades profile 260D (e.g., for students who have already selected a study area) with respect to the study area relevance factor.
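By way of non-limiting illustration, a “Majors to Careers” lookup of the kind described above can be sketched as a simple mapping. The accounting entry uses the fields named in the example; the dictionary layout and function name are assumptions made for illustration and do not reproduce Table 2.

```python
MAJOR_TO_CAREER_FIELDS = {
    "accounting": [
        "business",
        "tax preparation and analysis",
        "auditing",
        "management",
        "mathematics",
    ],
}


def career_fields_for(majors):
    """Collect career fields for the given majors, preserving order and
    dropping duplicates. Majors missing from the matrix contribute nothing."""
    seen = set()
    ordered = []
    for major in majors:
        for career_field in MAJOR_TO_CAREER_FIELDS.get(major, []):
            if career_field not in seen:
                seen.add(career_field)
                ordered.append(career_field)
    return ordered


print(career_fields_for(["accounting"]))
```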
The careers engine 180C may generate the set of recommended careers 280C based on the college majors recommended to the user by the study areas engine 180B (
The learning institutions engine 180D can receive information from the profile 206 of the user, including the preferences profile 260H of the user. In some embodiments, the learning institutions engine 180D can include a learning institutions decision model 182D, which can be a machine-learning model trained to assign a learning institution relevancy label 282D to one or more learning institutions for the user based on the profile 206 of the user. For example, based on the preferences profile 260H of the user, the learning institutions engine 180D may construct the set of recommended learning institutions 280D using the set of learning institution information. The learning institutions engine 180D may also modify one or more learning institution relevancy labels 282D and/or the set of recommended learning institutions 280D based on the personality profile 260A, the EQ profile 260B, the PQ profile 260C, the grades profile 260D, and/or the demographics profile 260F.
For example, the set of learning institution information can include information about a learning institution such as an identifier and one or more keywords or other attributes about the learning institution, such as type, size, accreditation status, public or private status, prestige, requirements and pre-requisites, general and/or specific fields associated with the learning institution (e.g., art, culture, music, engineering, math, etc.), acceptance rate, cost, sports and clubs offered, and any other information that may be pertinent to a learning institution. Further, for example, the set of learning institution correlation information can include information about how each learning institution correlates with information that may be present in the profile 206 (such as compatible personality score ranges, preferences, etc.) and information about how each learning institution correlates with one or more activities, one or more study areas, and/or one or more careers.
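By way of non-limiting illustration, matching learning institution attributes against a preferences profile can be sketched as the fraction of stated preferences that an institution satisfies. The `Institution` fields, the preference keys, and the example institution are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Institution:
    name: str
    kind: str            # e.g., "public", "private", "military", "religious"
    annual_cost: int     # hypothetical yearly cost in dollars
    fields: frozenset    # fields associated with the institution


def preference_match(institution, preferences):
    """Fraction of the user's stated preferences that the institution satisfies.

    Only preferences the user actually expressed are checked. Returns 0.0 when
    no supported preference is present.
    """
    checks = []
    if "kind" in preferences:
        checks.append(institution.kind == preferences["kind"])
    if "max_annual_cost" in preferences:
        checks.append(institution.annual_cost <= preferences["max_annual_cost"])
    if "field" in preferences:
        checks.append(preferences["field"] in institution.fields)
    return sum(checks) / len(checks) if checks else 0.0


example = Institution("Example State", "public", 12000, frozenset({"engineering", "math"}))
prefs = {"kind": "public", "max_annual_cost": 15000, "field": "art"}
print(preference_match(example, prefs))
```

Such a fraction could serve as an initial learning institution relevancy label before other profile information modifies it.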
In a further aspect, as shown in
In another example, the system 100 may provide information to a user about one or more prerequisite courses that they may be able to take towards a study area that the user may want to target. The profile engine 106 can obtain information about activities and grades for the user, and the decision framework 108 may recommend one or more activities and/or courses that may help the user get on track or get ahead in their study area based on the profile of the user and based on information within the recommendations information database 128 that describe skills, prerequisites, and relative importance of course grades associated with the selected study area.
In a further example, the decision framework 108 may account for a time-aware trajectory of a user. For example, a user may first interact with the system 100 at age 12, while the user is still in middle school. At this age, the user may not have had the opportunity to take elective classes; as such, their academic record may not show any specialization in one or more topics that the user may be able to leverage. The user may also struggle with certain topics due to factors such as age, conditioning, and lack of world exposure. Over time, as the user interacts with the profile engine 106 during the following years, the user may demonstrate variations in their grades and interests, and the set of recommendations for that user could change. If the user demonstrates rapid improvement in several completely different areas, then the decision framework 108 may account for an ability of the user to learn about many different topics very quickly, as evidenced by information collected about the user over time, and may recommend activities, study areas, and career paths where the user may be able to leverage their ability as a “jack-of-all-trades”. Similarly, the system 100 may adjust physical characteristics of the user based on factors such as age, growth curve, and physical characteristics of their parents. Physical characteristics may also be considered in terms of the likelihood that they may change. For example, most 18-year-old users have less potential for height growth than most 12-year-old users. In another example, significant weight loss can usually be more achievable than significant height gain for an 18-year-old user; as such, the decision framework 108 may give weight and other physical characteristics that are more likely to change over time less importance than other physical characteristics that are somewhat “fixed”.
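By way of non-limiting illustration, one simple way to detect rapid improvement across several different areas is to fit a least-squares slope to each subject's grade history. The data layout (lists of year and grade-point pairs), the slope threshold, and the function names are assumptions made for illustration; the disclosure does not specify how trajectories are computed.

```python
def slope(points):
    """Least-squares slope of grade points over time for [(year, grade), ...]."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    denom = sum((x - mean_x) ** 2 for x, _ in points)
    if denom == 0:
        return 0.0  # a single observation or identical years: no trend
    return sum((x - mean_x) * (y - mean_y) for x, y in points) / denom


def fast_improving_subjects(history, min_slope=0.5):
    """Subjects whose grades rise by at least `min_slope` grade points per year."""
    return sorted(
        subject
        for subject, points in history.items()
        if len(points) >= 2 and slope(points) >= min_slope
    )


# Hypothetical grade history on a 4.0 scale, keyed by subject.
history = {
    "math": [(1, 2.0), (2, 3.0), (3, 4.0)],
    "art": [(1, 3.0), (2, 3.0), (3, 3.0)],
    "music": [(1, 1.0), (2, 2.0)],
}
print(fast_improving_subjects(history))
```

A user who appears in this list for several unrelated subjects could be treated as showing the “jack-of-all-trades” learning ability described above.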
In another example, the decision framework 108 may account for nuances within a given topic and identify overarching skills of a user as evidenced by their performance with respect to different topics over time and as evidenced by trajectories and skills of similar users. This may be accomplished by constructing and training the one or more machine learning models of the decision framework 108 to consider time-aware trajectories of users. Further, the one or more machine learning models of the decision framework 108 may be constructed and trained in such a way that allows the decision framework 108 to identify connections between reported traits of a user that can reveal nuances of the user that may not be immediately identifiable.
For instance, a user may struggle with arithmetic but could still be gifted at visualizing patterns and abstract mathematical concepts. As a result, their grades in math during middle school where arithmetic skills are more important may be average or below average until they reach more abstract topics such as geometry, along with other topics such as chemistry and music where practical application of pattern recognition and visualization skills may be vital, during which their grades may improve (but still may suffer due to difficulties with arithmetic). The user may not even notice this ability in themselves, and may even consider themselves to be “bad at math”—this sort of thinking is a common drawback of current career guidance services where a student is pressed to make decisions based on their own perception of themselves and the limited information available to them. To address this, the profile engine 106 can retrieve and display questions that may help quantify hidden or otherwise latent skills of the user, and the decision framework 108 can examine the profile of the user to identify overarching skills and strengths that the user may have based on received responses to the questions. Based on the information within the profile of the user, the decision framework 108 may identify that the user seems to do well with one or more topics—for example, topics involving pattern recognition and abstract visualization. As a result, the system 100 may inform the user that they seem to have a pattern recognition skill and recommend that the user participates in activities such as chess, computer science, and robotics that may foster their pattern recognition abilities and help them overcome their deficits by learning how to use math in practical application.
In a further aspect, the decision framework 108 may be operable to account for contextualized experiences of the user to avoid, for example, taking responses at face value. The question database 126 may include questions that the profile engine 106 may administer to the user regarding grades, after-school jobs, and participation in clubs, as well as questions that add context to each. For example, a user who reports an uncharacteristically low grade in a topic they normally excel in may indicate one or more reasons for the low grade, such as: a bad or chronically unavailable instructor, disruption in home stability, economic stressors that required the user to spend more time at work and less time on studies, and/or an illness that may have prevented the user from operating at their best—this contextual information can indicate that their low grade does not necessarily reflect their skill level or interest in a topic. Other contextual information can be more informative as to which recommendations are suitable or unsuitable for the user—for example, the user may have received a lower grade because they have simply lost interest in the topic; this contextual information can be valuable as it reflects that the user may be better suited for other things, and may not necessarily reflect a deficit in the user's skill level or abilities that are related to the topic. This contextual information may be saved within the profile of the user and included within the inputs provided to the decision framework 108 to help develop a more complete characterization of the user and provide informed recommendations.
Table 3 shows an example entry within the user database 122. The user database 122 may include the profile 206 of the user, user answers to questions, including feedback questions, and the set of recommendations 208 for that user. Each user may have a user ID which may be an internal identifier or may be an identifier of the user, such as a username.
Table 4 shows an example entry within the question database 126. The question database 126 may contain questions used to evaluate a user's personality, emotional intelligence, positive intelligence, school grades, and physical characteristics, and can also include feedback questions. Each entry may include a question ID and the text of the question. Each entry may include the format of the answer. For example, for a question about the user's current letter grade in math, the only answers possible may be the letters A-F with a + or − sign. Each entry may also include a question type such as personality, emotional intelligence, positive intelligence, school grades, and physical characteristics.
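By way of non-limiting illustration, a question entry with an answer format can be sketched as a small record whose format is a regular expression. The field names, the example identifier, and the choice to treat the + or − sign as optional are assumptions made for illustration and do not reproduce Table 4.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    question_id: str
    text: str
    answer_pattern: str  # regular expression describing the allowed answer format
    question_type: str   # e.g., "personality", "school grades"


LETTER_GRADE = Question(
    question_id="Q-EXAMPLE-1",  # hypothetical identifier
    text="What is your current letter grade in math?",
    # Letters A-F, optionally followed by "+", an ASCII hyphen, or a Unicode minus sign.
    answer_pattern=r"[A-F][+\-\u2212]?",
    question_type="school grades",
)


def is_valid_answer(question, answer):
    """Check that the whole answer matches the question's answer format."""
    return re.fullmatch(question.answer_pattern, answer.strip()) is not None


print(is_valid_answer(LETTER_GRADE, "B+"), is_valid_answer(LETTER_GRADE, "G"))
```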
The recommendations information database 128 can include the activities database 128A that includes information about various activities that may be considered for recommendation to the user, including the set of activity information descriptive of a plurality of activities and associated activity correlation information. The activities engine 180A can communicate with the activities database 128A to obtain the set of recommended activities 280A for the user based on the profile 206 of the user and the activity correlation information.
The recommendations information database 128 can also include a study areas database 128B that includes information about various study areas that may be considered for recommendation to the user, including the set of study area information descriptive of a plurality of study areas and associated study area correlation information. The study areas engine 180B can communicate with the study areas database 128B to obtain the set of recommended study areas 280B for the user based on the profile 206 of the user and the study area correlation information. In some aspects, the study area correlation information can also incorporate correlations between study areas and the set of recommended activities 280A (e.g., if a user is recommended a particular high school activity, then the set of recommended study areas for college may include degrees that are related to or that use skills associated with the particular high school activity).
The recommendations information database 128 can also include a careers database 128C that includes information about various careers that may be considered for recommendation to the user, including the set of careers information descriptive of a plurality of careers and associated careers correlation information for correlating aspects of the profile 206 of the user to obtain the set of recommended careers 280C for the user. The careers engine 180C can communicate with the careers database 128C to obtain the set of recommended careers 280C for the user based on the profile 206 of the user and the career correlation information. In some aspects, the career correlation information can also incorporate other sets of recommendations, such as the set of recommended study areas 280B (e.g., if a student is recommended a particular study area, then the set of recommended careers can include careers where the student may benefit from having a degree within the particular study area).
The recommendations information database 128 can also include a learning institutions database 128D that includes information about various learning institutions that may be considered for recommendation to the user, including the set of learning institution information descriptive of a plurality of learning institutions and associated learning institution correlation information. The learning institutions engine 180D can communicate with the learning institutions database 128D to obtain the set of recommended learning institutions 280D for the user based on the profile 206 of the user and the learning institution correlation information. In some aspects, the learning institution correlation information can also incorporate other sets of recommendations, such as the set of recommended study areas 280B and the set of recommended careers 280C (e.g., if a student is recommended a particular study area and/or a particular career, then the set of recommended learning institutions may include schools that are particularly known for having high-quality education with respect to that study area and/or career).
The training engine 112 may determine if there is feedback in the selected entry for the selected recommendation. Feedback may refer to answers to feedback questions asked of a user, such as “On a scale of 1 to 10, how much have you enjoyed playing basketball?” Feedback may also refer to measurable metrics of success; for example, data in the training database 124 may include class rankings for people by college major. If there is feedback for the selected recommendation, the training engine 112 may determine if the feedback in the selected entry for the selected recommendation is positive. For the basketball question, user answers above 5 may be considered positive feedback, and answers of 5 or below may be considered negative feedback. For class rankings, rankings in the top 50% may be considered positive feedback, and rankings in the bottom 50% may be considered negative feedback. In some embodiments, feedback may also be neutral; for example, answers to the basketball question within the range of 4-6 may be considered neutral.
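By way of non-limiting illustration, the feedback classification described above can be sketched for a 1 to 10 answer scale. The function name and the optional `neutral_band` parameter are assumptions made for illustration; the thresholds are the ones given in the examples.

```python
def classify_feedback(answer, neutral_band=None):
    """Classify a 1-10 feedback answer as "positive", "negative", or "neutral".

    Without a neutral band, answers above 5 are positive and answers of 5 or
    below are negative. When a band such as (4, 6) is supplied, answers inside
    it (inclusive) are neutral and the remaining answers fall back to the
    positive/negative split.
    """
    if neutral_band is not None:
        low, high = neutral_band
        if low <= answer <= high:
            return "neutral"
    return "positive" if answer > 5 else "negative"


print(classify_feedback(7), classify_feedback(5), classify_feedback(5, neutral_band=(4, 6)))
```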
If the feedback for the selected recommendation is positive, the training engine 112 may reinforce the machine learning model or portion of the machine learning model used to make the recommendation. Reinforcement may cause the machine learning model to be more likely to return the same output recommendation for the same or similar inputs. For example, user Bob was recommended a school with a good networking reputation because he is very outgoing and it would benefit his pursuit of a business career. This school was recommended over other more prestigious schools because of Bob's high EQ score. Bob decides to attend the school. Bob networks very well at the school, which greatly benefits his business career. When asked to give feedback, Bob responds positively to questions about the school, whether the recommendation was helpful, whether the school helped Bob in his career, etc. Due to this feedback, the machine learning model is adjusted such that the school is more likely to be recommended or ranked higher on a list of recommendations for users with a high EQ. To support this, the system may use a database of EQ scores of users like Bob, a database of schools ranked by emotional levels (e.g., outgoing activities, social events, network events, etc.), a database of surveys based on EQ for users like Bob, and a database of correlations (e.g., career value questions, grades). The correlated data from these databases may be used to compare initial recommendations with final results and to update the ML correlations database (learning) for new recommendations. If the machine learning model is a neural network, for example, this may be achieved by increasing the weights between the nodes that were active when making the selected recommendation. These changes may be scaled based on how positive the feedback is. For example, a user may be asked a feedback question such as “On a scale of 1 to 10, how much have you enjoyed playing basketball?” An answer of 10 may result in more reinforcement than an answer of 7.
If the feedback for the selected recommendation is not positive, the training engine 112 may adjust the machine learning model or portion of the machine learning model used to make the recommendation. Adjustment may cause the machine learning model to be less likely to return the same output recommendation for the same or similar inputs. For example, user Amy was recommended to study a foreign language because of a high openness score based on answers to personality questions. Amy has a C in English and decides to take a foreign language class. Amy enjoys her foreign language class, but her grades in English begin to drop due to studying two different languages. Due to this feedback, the machine learning model is adjusted such that learning a foreign language is less likely to be recommended or ranked lower on a list of recommendations for users with middle to low grades in English. To support this, the system may use a database of EQ scores of users like Amy, a database of activities ranked by school grades (e.g., math, English, history, etc.), a database of surveys based on school grades for users like Amy, and a database of correlations (e.g., career value questions, grades). The correlated data from these databases may be used to compare initial recommendations with final results and to update the ML correlations database (learning) for new recommendations. If the machine learning model is a neural network, for example, this may be achieved by decreasing the weights between the nodes that were active when making the selected recommendation or by making random changes to nodes that were not used when making the recommendation. These changes may be scaled based on how negative the feedback is. For example, a user may be asked a feedback question such as “On a scale of 1 to 10, how much have you enjoyed playing basketball?” An answer of 1 may result in more adjustment than an answer of 3.
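One way to picture feedback-scaled reinforcement and adjustment is the sketch below. It is illustrative only: the linear scaling around a midpoint of 5, the step size, the edge naming, and the functions `feedback_strength` and `apply_feedback` are all assumptions made for the example. The random changes to unused nodes mentioned above are not modeled.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[str, str]  # (input trait, recommendation), hypothetical naming


def feedback_strength(answer: int, midpoint: float = 5.0) -> float:
    """Signed strength: positive above the midpoint, negative below it."""
    return (answer - midpoint) / 5.0


def apply_feedback(weights: Dict[Edge, float], active_edges: Set[Edge],
                   answer: int, step: float = 0.1) -> Dict[Edge, float]:
    """Return new weights with only the active edges nudged by the feedback."""
    delta = step * feedback_strength(answer)
    return {edge: (w + delta) if edge in active_edges else w
            for edge, w in weights.items()}


weights = {("high_eq", "school_a"): 0.5, ("high_eq", "school_b"): 0.5}
active = {("high_eq", "school_a")}

after_10 = apply_feedback(weights, active, answer=10)  # strongest reinforcement
after_7 = apply_feedback(weights, active, answer=7)    # milder reinforcement
after_3 = apply_feedback(weights, active, answer=3)    # mild adjustment
after_1 = apply_feedback(weights, active, answer=1)    # strongest adjustment
```

Under these assumptions an answer of 10 raises the active weight more than an answer of 7, and an answer of 1 lowers it more than an answer of 3, matching the scaling described in the text.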
The training engine 112 may determine if the machine learning model has another recommendation output for the selected entry. If there is another recommendation, the training engine 112 may select the next recommendation. If there are no more recommendations, the training engine 112 may determine if there is another entry in the database where the new data was found. If there is another entry, the training engine 112 may select the next entry.
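The iteration over entries and recommendations can be sketched as a pair of nested loops. The data layout (each entry as a mapping from a recommendation name to an optional 1-to-10 answer) and the `reinforce` and `adjust` callbacks are hypothetical stand-ins for whatever the training engine actually uses.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical layout: recommendation name -> feedback answer, or None if absent.
Entry = Dict[str, Optional[int]]


def process_entries(entries: List[Entry],
                    reinforce: Callable[[str, int], None],
                    adjust: Callable[[str, int], None]) -> int:
    """Visit every recommendation in every entry and dispatch on its feedback."""
    handled = 0
    for entry in entries:                              # select the next entry
        for recommendation, answer in entry.items():   # select the next recommendation
            if answer is None:                         # no feedback recorded
                continue
            if answer > 5:                             # positive feedback
                reinforce(recommendation, answer)
            else:                                      # not positive
                adjust(recommendation, answer)
            handled += 1
    return handled


log: List[str] = []
entries = [{"basketball": 9, "chess club": None}, {"foreign language": 3}]
count = process_entries(
    entries,
    reinforce=lambda rec, ans: log.append(f"reinforce {rec}"),
    adjust=lambda rec, ans: log.append(f"adjust {rec}"),
)
```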
In the example of
In one example implementation, the Scikit-learn library was used to develop the machine learning model 400 for decision-tree classification. The features (corresponding to information available within the profile of the user) and target (corresponding to available study areas that may be selected for inclusion in the set of recommended study areas for the user) are as shown in Table 5 below:
The decision tree is also affected by the depth of the tree. Hence, during development, the implementation of the machine learning model 400 was benchmarked with the decision tree depth varied from 1 to 100 to find an optimal tree depth. The decision tree implementation of the machine learning model 400 was found to perform quite well at a depth of 16, yielding an accuracy of 0.959 when compared with ground truth data from the set of training data. While the accuracy may increase slightly at greater tree depths, it is desirable to keep the tree as shallow as possible. This implementation was found to have sharply diminishing returns for depths over 20. Keeping the tree shallow also helps ensure that the machine learning model 400 does not suffer from overfitting, keeping the model generalizable.
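A depth sweep of this kind might look like the following sketch using Scikit-learn's `DecisionTreeClassifier`. The data here is synthetic (generated with `make_classification`) and is not the data from the study; the held-out split, the 1-to-30 depth range (shortened from 1 to 100 to keep the example quick), and the 0.01 tolerance for picking a shallow tree are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data: 11 features and 14 target classes.
X, y = make_classification(n_samples=3000, n_features=11, n_informative=8,
                           n_redundant=0, n_classes=14,
                           n_clusters_per_class=1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# Fit one tree per candidate depth and record its held-out accuracy.
scores = {}
for depth in range(1, 31):
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    tree.fit(X_train, y_train)
    scores[depth] = tree.score(X_test, y_test)

best = max(scores.values())
# Prefer the shallowest tree whose accuracy is within a small tolerance of the best.
chosen_depth = min(d for d, s in scores.items() if s >= best - 0.01)
print(chosen_depth, round(scores[chosen_depth], 3))
```

Choosing the shallowest depth within a tolerance of the best score is one simple way to encode the preference for shallow trees once returns diminish.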
However, during development, training the machine learning model 400 on the larger dataset (using the answers to each TIPI questionnaire item as the feature set) was found to be time consuming and unnecessary. As such, dimension reduction techniques were explored to determine if accurate classification can be achieved at lower computational and time cost. In machine learning, dimension reduction can be considered a “black box”, and the feature sets derived from a higher dimension to lower dimensions may have no intuitive real-world meaning. In another example implementation of the machine learning model 400, aspects of the personality profile used as input included the OCEAN index, which reduces the input to five interpretable attributes (as opposed to ten features from the TIPI question set).
The five features derived from social science research in the form of the OCEAN index, along with gender, were used as features in this example implementation of the machine learning model 400. The decision tree of the machine learning model 400 considering OCEAN features as input was found to perform well at a depth of 17 with an accuracy of 0.92, and at a depth of 20 with an accuracy of 0.95. Here, the number of features was cut in half, reducing computational cost and time consumed, while the accuracy of this machine learning model 400 is still very close to the results obtained using the raw data.
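The text does not state how the five OCEAN attributes were computed from the ten TIPI answers. As an illustration only, the sketch below applies the published TIPI scoring key, in which each trait is the mean of one item and one reverse-scored item on a 1-to-7 scale; whether the example implementation used this key is an assumption. The TIPI reports emotional stability, which is the inverse of the neuroticism dimension in OCEAN. The function name `tipi_to_ocean` is hypothetical.

```python
from typing import Dict, Sequence

# (directly scored item, reverse-scored item) per trait, 1-based TIPI item numbers.
TIPI_KEY = {
    "extraversion": (1, 6),
    "agreeableness": (7, 2),
    "conscientiousness": (3, 8),
    "emotional_stability": (9, 4),
    "openness": (5, 10),
}


def tipi_to_ocean(answers: Sequence[int]) -> Dict[str, float]:
    """Reduce ten TIPI answers (1-7 each) to five trait scores."""
    if len(answers) != 10 or any(not 1 <= a <= 7 for a in answers):
        raise ValueError("expected ten answers on the 1-to-7 scale")
    scores = {}
    for trait, (item, reversed_item) in TIPI_KEY.items():
        direct = answers[item - 1]
        flipped = 8 - answers[reversed_item - 1]  # reverse-score on a 1-7 scale
        scores[trait] = (direct + flipped) / 2
    return scores


example = tipi_to_ocean([6, 2, 7, 3, 5, 2, 6, 1, 5, 3])
```

Unlike a learned projection, each of the five resulting features keeps a direct interpretation, which is the property the text highlights.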
A third example implementation of the machine learning model 400 was also explored that uses Scikit-learn's Principal Component Analysis (PCA) to reduce the dimension of user responses to five components. Unlike OCEAN (considered by the second example implementation of the machine learning model 400), PCA-based features do not have a real-world meaning; however, they represent the 10 TIPI questions. In this implementation, data (e.g., responses to personality-based questions) was first fit with PCA to obtain five principal components as features for application as input to the machine learning model 400. Scikit-learn was used to develop decision-tree classification at the machine learning model 400. The decision tree of this implementation of the machine learning model 400 performs well at a depth of 20, yielding an accuracy of 0.94.
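The PCA-then-tree arrangement can be sketched with a Scikit-learn pipeline. The answers and labels below are random placeholders rather than study data, so the fitted model is meaningless beyond showing the data flow; the sample count, the 1-to-7 answer scale, and the 14 label values are assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Placeholder data: 500 users, ten answers on a 1-to-7 scale, 14 study-area labels.
answers = rng.integers(1, 8, size=(500, 10)).astype(float)
study_area = rng.integers(0, 14, size=500)

# PCA reduces the ten answers to five components before the tree sees them.
model = make_pipeline(
    PCA(n_components=5),
    DecisionTreeClassifier(max_depth=20, random_state=0),
)
model.fit(answers, study_area)

components = model.named_steps["pca"].transform(answers)
predicted = model.predict(answers[:3])
```

Wrapping both steps in one pipeline ensures that new user responses are projected with the same fitted components before classification.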
A fourth implementation of a machine learning model was developed according to a multi-layer perceptron (MLP)-based deep neural network classification technique. This implementation of the machine learning model optimizes a log-loss function using Limited-memory BFGS (L-BFGS) or stochastic gradient descent. The classification task was done using a different solver. This implementation of the machine learning model achieved an accuracy of 0.66 with over 400 hidden neurons. While that is a moderately good result for a 14-class target variable, it was nowhere near the accuracy of the three decision tree implementations of the machine learning model discussed herein. Further, this implementation of the machine learning model took much longer to train. On an 8-core 4.4 GHz processor machine, training the neural network with 500 hidden neurons took over 21 minutes, while the decision trees were usually trained within as little as 10 seconds.
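A comparison of this kind could be set up as sketched below with Scikit-learn's `MLPClassifier` (using the `lbfgs` solver) and `DecisionTreeClassifier`. The data is synthetic, and the hidden-layer size and iteration cap are kept small so the example runs quickly; the timings and accuracies it produces say nothing about the figures reported above. The helper `fit_and_time` is hypothetical.

```python
import time

from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic 14-class stand-in data.
X, y = make_classification(n_samples=600, n_features=10, n_informative=8,
                           n_redundant=0, n_classes=14,
                           n_clusters_per_class=1, random_state=0)


def fit_and_time(model):
    """Fit the model and return (training seconds, training accuracy)."""
    start = time.perf_counter()
    model.fit(X, y)
    return time.perf_counter() - start, model.score(X, y)


mlp_seconds, mlp_acc = fit_and_time(
    MLPClassifier(hidden_layer_sizes=(50,), solver="lbfgs",
                  max_iter=200, random_state=0))
tree_seconds, tree_acc = fit_and_time(
    DecisionTreeClassifier(max_depth=20, random_state=0))

print(f"MLP:  {mlp_seconds:.3f}s, accuracy {mlp_acc:.3f}")
print(f"Tree: {tree_seconds:.3f}s, accuracy {tree_acc:.3f}")
```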
Based on the validation examples and results discussed in this section, the machine learning model 400 implementing a decision tree classifier with the use of OCEAN score as the dimension reduction technique was found to perform well for the task of generating the set of recommended study areas based on personality profile, and also provides an insight that can be used for other use cases. In these validation examples, the decision tree classifier was found to achieve over 90% accuracy with relatively little training data and training time. Other classification techniques like Deep Neural Network (DNN) were also explored; the DNN achieved usable accuracy of about 65 percent and consumed 120× more training time to get that accuracy. Neural networks took over 20 minutes for 500 hidden neurons (with an accuracy of 65 percent), while decision tree classifier training took less than 10 seconds.
The goal of this validation study was not to perform an absolute benchmarking and comparison, but to find the machine learning model configuration that worked best for the limited data available at the time. The decision tree was found to perform best for limited data, but this could change with different data and can be re-evaluated continuously. For example, other machine learning model configurations may perform better when generating the set of recommendations while considering a large amount of information of different types that may be obtained through the profile engine 106 (e.g., jointly considering not only personality, but also EQ, PQ, grades, physical characteristics, demographics, goals, preferences, and trajectory of the user over time).
Further, the decision tree implementations discussed herein rely on supervised learning—requiring fully-labeled training data to train the machine learning model 400. However, other possible implementations include, for example, unsupervised or semi-supervised learning techniques in which the machine learning model 400 is trained to generate the set of recommendations using semi-labeled data and/or unlabeled data. This may be useful for continuously improving the decision framework 108 over time based on user data, where the outcome of many users may be generally unknown.
In a further aspect, the decision framework 108 can apply preprocessing techniques to data to improve its usefulness and/or accuracy prior to application of the data to the machine learning model 400. For example, the activities engine 180A may receive information from the physical characteristics profile 260E of a user and adjust the information to consider age-grading, growth curve, and/or expected physical characteristics of a user in the future based on those of their parents. In another example, the study areas engine 180B may estimate grades trajectories of a user for future courses based on their past grades in related courses as indicated within the grades profile 260D of the user. In yet another example, the decision framework 108 may receive information from the demographics profile 260F of the user and adjust recommendations for activities and learning institutions accordingly—e.g., adjusting activity recommendations to include activities that may be available to the user based on ZIP code and other demographics data (example: agriculture clubs may be more readily available to students in rural areas while technology and business related clubs may be more readily available to students in urban areas), and adjusting activity and learning institution recommendations based on statistical advantages or disadvantages (example: students at a statistical disadvantage may be recommended one or more activities that may help them gain a competitive “edge”; students from wealthy areas and income levels may be less concerned with selecting an affordable college; students who graduate near the top of their class at a low-performing school may not inherently have the same statistical advantage as students who graduate near the top of their class at an exclusive high-performing school). As such, data obtained from the profile engine 106 may be pre-processed prior to application as input at the one or more machine learning models of the decision framework. 
Additional pre-processing operations can include application of natural language processing methods to written comments and other inputs associated with the user to extract concepts and add context that may be best expressed through language.
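As one illustration of such preprocessing, the sketch below projects a user's next grade from past grades in related courses. The disclosure does not specify an estimation method; the least-squares line, the 4.0 grade-point mapping, and the function name `project_next_grade` are assumptions chosen for simplicity.

```python
from typing import Sequence

GRADE_POINTS = {"A": 4.0, "B": 3.0, "C": 2.0, "D": 1.0, "F": 0.0}


def project_next_grade(past_grades: Sequence[str]) -> float:
    """Project the next grade point with a least-squares line through past terms."""
    points = [GRADE_POINTS[g] for g in past_grades]
    n = len(points)
    if n == 0:
        raise ValueError("at least one past grade is required")
    if n == 1:
        return points[0]  # no trend can be estimated from a single grade
    mean_x = (n - 1) / 2
    mean_y = sum(points) / n
    slope = (sum((x - mean_x) * (p - mean_y) for x, p in enumerate(points))
             / sum((x - mean_x) ** 2 for x in range(n)))
    projected = mean_y + slope * (n - mean_x)  # evaluate the line at the next term
    return max(0.0, min(4.0, projected))       # clamp to the grade-point scale


steady = project_next_grade(["B", "B", "B"])
declining = project_next_grade(["A", "B"])
rising = project_next_grade(["C", "B", "A"])
```

A projected value like this could be supplied alongside the raw grades as an additional input feature.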
Architecture 500 includes a neural network 510 defined by an example neural network description 501 in an engine model (neural controller) 530. The neural network 510 can represent a neural network implementation of the decision framework 108, including one or more of the activities decision model 182A, study areas decision model 182B, careers decision model 182C and learning institutions decision model 182D. The neural network description 501 can include a full specification of the neural network 510, including the neural network architecture 500. For example, the neural network description 501 can include a description or specification of the architecture 500 of the neural network 510 (e.g., the layers, layer interconnections, number of nodes in each layer, etc.); an input and output description which indicates how the input and output are formed or processed; an indication of the activation functions in the neural network, the operations or filters in the neural network, etc.; neural network parameters such as weights, biases, etc.; and so forth.
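A neural network description of the kind outlined above could be represented as a small data structure. The class and field names below (`LayerSpec`, `NetworkDescription`, `weight_shapes`) and the layer sizes are hypothetical, and fully connected layers are assumed.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class LayerSpec:
    name: str
    num_nodes: int
    activation: str  # e.g. "relu", "softmax", or "identity" for the input layer


@dataclass
class NetworkDescription:
    input_description: str   # how the input is formed
    output_description: str  # how the output is interpreted
    layers: List[LayerSpec] = field(default_factory=list)

    def weight_shapes(self) -> List[Tuple[int, int]]:
        """Shapes of the weight matrices between consecutive fully connected layers."""
        return [(a.num_nodes, b.num_nodes)
                for a, b in zip(self.layers, self.layers[1:])]


description = NetworkDescription(
    input_description="profile features (e.g., OCEAN scores, EQ, PQ, grades)",
    output_description="scores over candidate activities",
    layers=[
        LayerSpec("input", 8, "identity"),
        LayerSpec("hidden_1", 16, "relu"),
        LayerSpec("output", 4, "softmax"),
    ],
)
```

Weights and biases would be stored alongside this structure once training produces them; the shapes returned by `weight_shapes` indicate how much parameter storage each interconnection needs.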
The neural network 510 reflects the architecture 500 defined in the neural network description 501. In an example corresponding to the activities decision model 182A, the neural network 510 includes an input layer 502, which includes input data, such as data indicative of a profile of a user including a personality profile corresponding to one or more nodes 508. In one illustrative example, the input layer 502 can include data representing a portion of input data such as answers responsive to questions presented by the profile engine 106, a set of OCEAN scores representing the personality profile, along with an EQ profile, a PQ profile, a physical characteristics profile, a demographics profile, a grades profile, a preferences profile, and/or a goals profile. Input data can also include data about one or more recommendations stored within the recommendations information database 128, such as information about various activities, study areas, careers, and learning institutions and correlation information that describe correlations between various traits of the user and one or more recommendations, as well as correlations between recommendations (e.g., connections between study areas and careers).
The neural network 510 includes hidden layers 504A through 504N (collectively “504” hereinafter). The hidden layers 504 can include n number of hidden layers, where n is an integer greater than or equal to one. The number of hidden layers can include as many layers as needed for a desired processing outcome and/or rendering intent. The neural network 510 further includes an output layer 506 that provides an output (e.g., set of recommendations including a set of recommended activities, a set of recommended study areas, a set of recommended careers, and/or a set of recommended learning institutions) resulting from the processing performed by the hidden layers 504. In an illustrative example corresponding to the activities decision model 182A, the output layer 506 can provide the set of recommended activities based on the profile of the user provided to the input layer 502.
The neural network 510 in this example is a multi-layer neural network of interconnected nodes. Each node can represent a piece of information. Information associated with the nodes is shared among the different layers and each layer retains information as information is processed. In some cases, the neural network 510 can include a feed-forward neural network, in which case there are no feedback connections where outputs of the neural network are fed back into itself. In other cases, the neural network 510 can include a recurrent neural network, which can have loops that allow information to be carried across nodes while reading in input.
Information can be exchanged between nodes through node-to-node interconnections between the various layers. Nodes of the input layer 502 can activate a set of nodes in the first hidden layer 504A. For example, as shown, each of the input nodes of the input layer 502 is connected to each of the nodes of the first hidden layer 504A. The nodes of the hidden layer 504A can transform the information of each input node by applying activation functions to the information. The information derived from the transformation can then be passed to and can activate the nodes of the next hidden layer (e.g., 504B), which can perform their own designated functions. Example functions include convolutional, up-sampling, data transformation, pooling, and/or any other suitable functions. The output of the hidden layer (e.g., 504B) can then activate nodes of the next hidden layer (e.g., 504N), and so on. The output of the last hidden layer can activate one or more nodes of the output layer 506, at which point an output is provided. In some cases, while nodes (e.g., nodes 508A, 508B, 508C) in the neural network 510 are shown as having multiple output lines, a node has a single output and all lines shown as being output from a node represent the same output value.
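The layer-to-layer flow described above can be sketched as a forward pass through fully connected layers. The layer sizes, random weights, ReLU hidden activations, and softmax output are assumptions chosen for illustration, not details taken from the disclosure.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def softmax(x):
    shifted = np.exp(x - x.max())  # subtract the max for numerical stability
    return shifted / shifted.sum()


def forward(profile, layers):
    """layers: list of (weights, biases); ReLU on hidden layers, softmax on output."""
    activation = profile
    for weights, biases in layers[:-1]:
        # Every node of one layer feeds every node of the next layer.
        activation = relu(activation @ weights + biases)
    weights, biases = layers[-1]
    return softmax(activation @ weights + biases)


rng = np.random.default_rng(0)
sizes = [6, 8, 8, 3]  # input, two hidden layers, output
layers = [(rng.normal(0.0, 0.5, (m, n)), np.zeros(n))
          for m, n in zip(sizes, sizes[1:])]
profile = rng.uniform(0.0, 1.0, size=6)
scores = forward(profile, layers)
```

Each entry of `scores` can be read as the relative preference for one candidate recommendation.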
In some cases, each node or interconnection between nodes can have a weight that is a set of parameters derived from training the neural network 510. For example, an interconnection between nodes can represent a piece of information learned about the interconnected nodes. The interconnection can have a numeric weight that can be tuned (e.g., based on a training dataset), allowing the neural network 510 to be adaptive to inputs and able to learn as more data is processed.
The neural network 510 can be pre-trained to process the features from the data in the input layer 502 using the different hidden layers 504 in order to provide the output through the output layer 506. In an example corresponding to the activities decision model 182A, in which the neural network 510 is used to generate the set of recommended activities based on the profile of the user, the neural network 510 can be trained using training data that includes example profiles and associated activities that are labeled according to suitability for individuals represented within the example profiles. For instance, training data can be input into the neural network 510, which can be processed by the neural network 510 to generate outputs which can be used to tune one or more aspects of the neural network 510, such as weights, biases, etc.
In some cases, the neural network 510 can adjust weights of nodes using a training process called backpropagation. Backpropagation can include a forward pass, a loss function, a backward pass, and a weight update. The forward pass, loss function, backward pass, and weight update are performed for one training iteration. The process can be repeated for a certain number of iterations for each set of training data until the weights of the layers are accurately tuned.
For a first training iteration for the neural network 510, the output can include values that do not give preference to any particular class due to the weights being randomly selected at initialization. For example, if the output is a vector of probabilities over the candidate recommendations, the probability value for each candidate may be equal or at least very similar (e.g., for ten possible recommendations, each class may have a probability value of 0.1). With the initial weights, the neural network 510 is unable to determine low-level features and thus cannot make an accurate determination of which recommendation class applies. A loss function can be used to analyze errors in the output. Any suitable loss function definition can be used.
The loss (or error) can be high for the first training dataset (e.g., example profiles) since the actual values will be different from the predicted output. The goal of training is to minimize the amount of loss so that the predicted output comports with a target or ideal output. The neural network 510 can perform a backward pass by determining which inputs (weights) most contributed to the loss of the neural network 510, and can adjust the weights so that the loss decreases and is eventually minimized.
A derivative of the loss with respect to the weights can be computed to determine the weights that contributed most to the loss of the neural network 510. After the derivative is computed, a weight update can be performed by updating the weights of the layers. For example, the weights can be updated so that they change in the opposite direction of the gradient. A learning rate can be set to any suitable value, with a higher learning rate producing larger weight updates and a lower learning rate producing smaller weight updates.
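The training iteration (forward pass, loss, backward pass, weight update) can be illustrated with a single softmax layer trained by gradient descent. The synthetic data, the ten classes, the cross-entropy loss, and the learning rate of 0.5 are assumptions made for this sketch; a multi-layer network would propagate the same gradient back through each hidden layer.

```python
import numpy as np

rng = np.random.default_rng(0)
num_features, num_classes = 6, 10
X = rng.normal(size=(200, num_features))
true_W = rng.normal(size=(num_features, num_classes))
y = (X @ true_W).argmax(axis=1)  # synthetic labels


def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def loss_and_grad(W):
    probs = softmax(X @ W)                               # forward pass
    rows = np.arange(len(y))
    loss = -np.log(probs[rows, y]).mean()                # cross-entropy loss
    probs[rows, y] -= 1.0                                # backward pass: dLoss/dLogits
    grad = X.T @ probs / len(y)                          # dLoss/dW
    return loss, grad


# Small random initial weights give nearly uniform outputs (about 0.1 per class).
W = rng.normal(scale=1e-3, size=(num_features, num_classes))
initial_probs = softmax(X[:1] @ W)[0]

learning_rate = 0.5
losses = []
for _ in range(100):
    loss, grad = loss_and_grad(W)
    losses.append(loss)
    W -= learning_rate * grad                            # step against the gradient
```

With near-zero initial weights the first outputs sit close to 0.1 for each of the ten classes, and the recorded loss falls as the weights move against the gradient.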
The neural network 510 can include any suitable neural or deep learning network. One example includes a convolutional neural network (CNN), which includes an input layer and an output layer, with multiple hidden layers between the input and output layers. The hidden layers of a CNN include a series of convolutional, nonlinear, pooling (for downsampling), and fully connected layers. In other examples, the neural network 510 can represent any other neural or deep learning network, such as an autoencoder, a deep belief network (DBN), or a recurrent neural network (RNN).
The components shown in
Mass storage device 630, which may be implemented with a magnetic disk drive or an optical disk drive, is a non-volatile storage device for storing data and instructions for use by processor unit 610. Mass storage device 630 can store the system software for implementing embodiments of the present invention for purposes of loading that software into main memory 620.
Portable storage device 640 operates in conjunction with a portable non-volatile storage medium, such as a floppy disk, compact disk, or digital video disc, to input and output data and code to and from the computer system 600 of
Input devices 660 provide a portion of a user interface. Input devices 660 may include a touch-screen display, an alpha-numeric keypad, such as a keyboard, for inputting alpha-numeric and other information, or a pointing device, such as a mouse, a trackball, stylus, or cursor direction keys. Additionally, the system 600 as shown in
Display system 670 may include a liquid crystal display (LCD) or other suitable display device. Display system 670 receives textual and graphical information, and processes the information for output to the display device.
Peripherals 680 may include any type of computer support device to add additional functionality to the computer system. For example, peripheral device(s) 680 may include a modem or a router.
The components contained in the computer system 600 of
The present invention may be implemented in an application that may be operable using a variety of devices. Non-transitory computer-readable storage media refer to any medium or media that participate in providing instructions to a central processing unit (CPU) for execution. Such media can take many forms, including, but not limited to, non-volatile and volatile media such as optical or magnetic disks and dynamic memory, respectively. Common forms of non-transitory computer-readable media include, for example, a floppy disk, a flexible disk, a hard disk, magnetic tape, any other magnetic medium, a CD-ROM disk, digital video disk (DVD), any other optical medium, RAM, PROM, EPROM, a FLASH-EPROM, and any other memory chip or cartridge.
The functions performed in the processes and methods may be implemented in differing order. Furthermore, the outlined steps and operations are only provided as examples, and some of the steps and operations may be optional, combined into fewer steps and operations, or expanded into additional steps and operations without detracting from the essence of the disclosed embodiments.
It should be understood from the foregoing that, while particular embodiments have been illustrated and described, various modifications can be made thereto without departing from the spirit and scope of the invention as will be apparent to those skilled in the art. Such changes and modifications are within the scope and teachings of this invention as defined in the claims appended hereto.
This is a U.S. Non-Provisional Patent Application that claims benefit to U.S. Provisional Patent Application Ser. No. 63/370,185 filed 2 Aug. 2022, which is herein incorporated by reference in its entirety.
Number | Date | Country
---|---|---
63370185 | Aug 2022 | US