The present disclosure generally relates to methods and systems for classifying software components into different groups that cater to specific functions, based on their relevance and the information available about them. These category functions can serve different purposes including, but not limited to, business functions, technology stack functions and others.
There are more than 25 million open-source libraries, and the number of cloud APIs is growing rapidly, presenting a huge number of components for building today's applications. Understanding the spread of capabilities across such a large number of components requires representing them under a manageable number of categories. Abstracting the components under different categories helps to organize their listings in a more structured way for purposes such as analysis, searching and browsing.
Most software library components do not have a standardized categorization. Some are labelled by their authors, but those labels are not consistent with a reliable taxonomy nomenclature. Grouping the software components in a standard and consistent way will help in understanding the choices of available components in those specific categories. Visually representing these listings of software components will also give users a consistent expectation of the broad capability of the components.
When considering some of the systems and methods in the prior art, the above discussed drawbacks are evident. For example, United States Patent Application Publication Number 2010/0174670A1 discloses a pattern-based classification process that can use only very short patterns for classification and does not require a minimum support threshold. The training phase allows each training instance to “vote” for top-k, size-2 patterns, such as in a way that provides an effective balance between local, class, and global significance of patterns. Unlike certain approaches, the process need not make Boolean decisions on patterns that are shared across classes. Instead, these patterns can be concurrently added to all applicable classes and a power law-based weighing scheme can be applied to adjust their weights with respect to each class. However, the '670 publication describes data classification and hierarchical clustering based on patterns, but is silent on type and domain of data, data processing methods, scoring mechanism and ML techniques.
United States Patent Application Publication Number 2014/0163959A1 discloses an arrangement and corresponding method for multi-domain natural language processing. Multiple parallel domain pipelines are used for processing a natural language input. Each domain pipeline represents a different specific subject domain of related concepts. Each domain pipeline includes a mention module that processes the natural language input using natural language understanding (NLU) to determine a corresponding list of mentions, and an interpretation generator that receives the list of mentions and produces a rank-ordered domain output set of sentence-level interpretation candidates. A global evidence ranker receives the domain output sets from the domain pipelines and produces an overall rank-ordered final output set of sentence-level interpretations. However, the '959 publication describes multi-domain natural language processing for sentence-level interpretation but is silent about classification of software documents by hierarchically applying different techniques.
United States Patent Application Publication Number 2015/0127567A1 discloses a data mining system that extracts job opening information, derives relevant competencies for a given job, and derives relevant competencies for a given candidate. In some embodiments, the data mining performs authentication of relevant competencies before performing matching. The matching outputs can be used to provide data to a candidate indicating possible future competencies to obtain, to provide data to a teaching organization indicating possible future competencies to cover in their coursework, and to provide data to employers related to what those teaching organizations are covering. However, the '567 publication discloses processing of natural language for skilling and recruitment of human resources but is silent on classification and hierarchical representation of software documents.
U.S. Pat. No. 7,774,288 discloses records including category data that is clustered by representing the data as a plurality of clusters, and generating a hierarchy of clusters based on the clusters. Records including category data are classified into folders according to a predetermined entropic similarity condition. However, the '288 patent describes data classification and hierarchical clustering but is silent in terms of type and domain of data, data processing methods, scoring mechanism and ML techniques.
U.S. Pat. No. 8,838,606 discloses systems and methods for classifying electronic information or documents into a number of classes and subclasses through an active learning algorithm. In certain embodiments, seed sets may be eliminated by merging relevance feedback and machine learning phases. Such document classification systems are easily scalable for large document collections, require less manpower and can be employed on a single computer, thus requiring fewer resources. Furthermore, the classification systems and methods can be used for any pattern recognition or classification effort in a wide variety of fields, including electronic discovery in legal proceedings. However, the '606 patent is silent on type and domain of data, data processing methods, scoring mechanism and ML techniques.
U.S. Pat. No. 9,471,559 discloses creating training data for a natural language processing system that includes obtaining natural language input, the natural language input annotated with one or more important phrases; and generating training instances including a syntactic parse tree of nodes representing elements of the natural language input augmented with the annotated important phrases. In another aspect, a classifier may be trained based on the generated training instances. The classifier may be used to predict one or more potential important phrases in a query. However, the '559 patent describes automatic generation of phrases to use in training of question answering system based on annotation but is silent on taxonomy of software documents.
In view of the above examples and the drawbacks described in each, there is a need for a method and a system that classifies software components into a defined taxonomy structure, and that uses the different information available in the software component documentation and source code to understand the semantic context of the purpose of the software component.
The following presents a simplified summary of the subject matter in order to provide a basic understanding of some of the aspects of subject matter embodiments. This summary is not an extensive overview of the subject matter. It is not intended to identify key/critical elements of the embodiments or to delineate the scope of the subject matter. Its sole purpose is to present some concepts of the subject matter in a simplified form as a prelude to the more detailed description that is presented later.
The present disclosure provides an automated and consistent way of organizing these software components into different category dimensions so that users can browse them in a structured way.
The present disclosure is a new software system which classifies the software components into a defined taxonomy structure. The system uses the different information available in the software component documentation and source code to understand the semantic context of the purpose of the software component. The area of application of the software component is also understood by the system in different dimensions, for example, whether the component serves a database interaction function, or whether it can be further abstracted to the level of serving a finance business function such as a payment gateway.
Therefore, a system for classifying software components is disclosed herein including at least one processor that operates under control of a stored program including a sequence of program instructions to control one or more components. The components include a Component Categories Portal, a Categorizer, a Classification ML Model builder, a Natural Language Processing (NLP) Cleanup Services, an NLP Extractor, a Clustering Service, a Classification Service, a Dictionary Rules Service, and a Category Ranking Service. The Component Categories Portal views the software components with their classification details and the Categorizer is in communication with the Component Categories Portal to create an overall ranked classification for the software components. The Classification ML Model builder is in communication with the Categorizer to create machine learning models based on the software components with training for prediction tasks and the NLP Cleanup Services is in communication with the Classification ML Model builder to extract the needed sections of information associated with the software components and clean them up. The NLP Extractor is in communication with the NLP Cleanup Services to extract key software entities based on the software components for classification and the Clustering Service is in communication with the NLP Extractor to group similar software components together. The Classification Service is in communication with the Clustering Service to classify the software components and the Dictionary Rules Service is in communication with the Classification Service to provide software dictionary terms and rules based on the software components for classification ranking. Finally, the Category Ranking Service is in communication with the Dictionary Rules Service to compute final top ranked classifications for the software components.
In an embodiment, the Component Categories Portal is configured to view different categories for the software components and view the software components under a category. In an embodiment, the Categorizer is configured to invoke the Classification services to classify the software components, classify the software components based on different information collected and techniques, and apply the techniques including document classification, clustering and entity mapping to classify the software components. In an embodiment, the Classification ML Model builder is configured to create the machine learning models for classifying the software components based on different information sources, train a plurality of models with data extracted from the documentation and code, and provide model services to classify based on entity extraction, clustering and document classification techniques.
In an embodiment, the NLP Cleanup Services is configured to provide natural language processing services for removing unnecessary information from the extracted data. In an embodiment, the NLP Extractor is configured to provide natural language processing services to extract key software entities from the information collected based on the software components and train the plurality of models based on the software dictionary and the software component information. In an embodiment, the Clustering Service is configured to collect all the software component information and group the software component information into clusters having similar software components, and extract key definition terms from a cluster documentation associated with the clusters and other information collected earlier.
In an embodiment, the Classification Service is configured to predict a category of the software components classification based on the cluster documentation extracted from the software components and apply a threshold-based mechanism to report top categories of software components classification with corresponding confidence scores. In an embodiment, the Dictionary Rules Service is configured to provide the rules for ranking the classifications based on different parameters of project source code metrics and documentation maturity level and provide key software dictionary terms for the categories.
In an embodiment, the Category Ranking Service is configured to fetch the different classifications for the software components done with different services, evaluate classification scores based on the project source code metrics, evaluate the classification score based on the documentation maturity level, and apply rules to, and normalize, the classification scores in order to rank them. In an embodiment, the Repo Services is configured to provide integration services for connecting to Project Repository, retrieve the source code of the software component and the software component documentation, and save the software component information, the retrieved source code and the software component documentation to a database and file store.
A method associated with a system to classify software components into different categories is disclosed. The method is performed by at least one processor that operates under control of a stored program including a sequence of program instructions. A first instruction step includes collecting different sources of information about the software components. A second instruction step includes extracting required sections of the software component information from a software component documentation and a source code associated with the software component. A third instruction step includes pre-processing the extracted software component information using natural language processing techniques. A fourth instruction step includes fetching a dictionary for classification and rules associated with the classification. A fifth instruction step includes running a categorization process on the software components. A sixth instruction step includes ranking the different categorizations identified for the software components.
One implementation of the present disclosure is a system for classifying software components based on multiple information sources. The system includes one or more processors and memory storing instructions that, when executed by the one or more processors, cause the one or more processors to perform operations. The operations include retrieving a number of sources including a project documentation file, source code, and dependent project list associated with a software component, extracting a number of entities from the number of sources, processing the number of entities based on a machine learning model, mapping the number of entities to a set of rules, generating a number of categorizations based on the mapping of the number of entities to the set of rules, and ranking the number of categorizations based on the set of rules.
In some embodiments, the machine learning model uses natural language processing techniques for removing unnecessary information including hyperlinks, stopwords, and version information.
In some embodiments, the machine learning model is generated based on training data extracted from a number of project documentation files associated with the dependent project list.
In some embodiments, generating the number of categorizations includes providing a first categorization associated with a direct match between one or more entities of the number of entities with the set of rules and providing a second categorization associated with an indirect match between the one or more entities of the number of entities and the set of rules, the indirect match associated with a similarity score, the similarity score identified as equal to or greater than a threshold score.
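The distinction between a direct match and an indirect, threshold-gated match can be sketched as follows. This is only an illustration under stated assumptions: the disclosure does not name a similarity measure, so the character-based ratio from Python's `difflib.SequenceMatcher` is used as a stand-in, and the function name `match_entity`, the rule terms and the 0.8 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def match_entity(entity: str, rule_terms: list[str], threshold: float = 0.8):
    """Return (rule_term, match_kind, score) for the best match, or None.

    A direct match is an exact hit in the rule terms. An indirect match is
    the most similar rule term, accepted only when its similarity score is
    equal to or greater than the threshold. The similarity measure here is
    an assumption, not the one used by the disclosed system.
    """
    if entity in rule_terms:
        return entity, "direct", 1.0

    best_term, best_score = None, 0.0
    for term in rule_terms:
        score = SequenceMatcher(None, entity, term).ratio()
        if score > best_score:
            best_term, best_score = term, score

    if best_term is not None and best_score >= threshold:
        return best_term, "indirect", best_score
    return None


# Hypothetical rule terms for illustration only.
RULE_TERMS = ["spring-boot", "django"]
```

With these hypothetical terms, `match_entity("spring-boot", RULE_TERMS)` yields a direct match, `match_entity("springboot", RULE_TERMS)` yields an indirect match to `spring-boot`, and `match_entity("kafka", RULE_TERMS)` yields no match.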
In some embodiments, one or more of the entities are identified as at least one of a short description, a full description, features, code comments, project tags, and dependent libraries.
In some embodiments, ranking the number of categorizations based on the set of rules includes determining whether a categorization matches a name of the project documentation file.
In some embodiments, the operations further include presenting a user with the ranked categorizations.
Another implementation of the present disclosure is a method for classifying software components based on multiple information sources. The method includes retrieving a number of sources including a project documentation file, source code, and dependent project list associated with a software component, extracting a number of entities from the number of sources, processing the number of entities based on a machine learning model, mapping the number of entities to a set of rules, generating a number of categorizations based on the mapping of the number of entities to the set of rules, and ranking the number of categorizations based on the set of rules.
In some embodiments, the method includes presenting a user with the ranked categorizations.
Another implementation of the present disclosure is one or more non-transitory computer-readable media for classifying software components based on multiple information sources. The non-transitory computer-readable media store instructions thereon. The instructions, when executed by one or more processors, cause the one or more processors to retrieve a number of sources including a project documentation file, source code, and dependent project list associated with a software component, extract a number of entities from the number of sources, process the number of entities based on a machine learning model, map the number of entities to a set of rules, generate a number of categorizations based on the mapping of the number of entities to the set of rules, and rank the number of categorizations based on the set of rules.
The following drawings are illustrative of particular examples for enabling systems and methods of the present disclosure, are descriptive of some of the methods and mechanism, and are not intended to limit the scope of the present disclosure. The drawings are not to scale (unless so stated) and are intended for use in conjunction with the explanations in the following detailed description.
Like reference numbers and designations in the various drawings indicate like elements.
Persons skilled in the art will appreciate that elements in the figures are illustrated for simplicity and clarity and may represent both hardware and software components of the system. Further, the dimensions of some of the elements in the figure may be exaggerated relative to other elements to help to improve understanding of various exemplary embodiments of the present disclosure. Throughout the drawings, it should be noted that like reference numbers are used to depict the same or similar elements, features, and structures.
Exemplary embodiments now will be described. The disclosure may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey its scope to those skilled in the art. The terminology used in the detailed description of the particular exemplary embodiments illustrated in the accompanying drawings is not intended to be limiting. In the drawings, like numbers refer to like elements.
In some embodiments, the primary functional block of the present disclosure includes the Component Categories Portal 101 which has a User Interface for the user to view the different categories available for the software components with their classification details. The user can explore the software components under the categories as a list of software components for that category.
In some embodiments, the Categorizer 104 is in communication with the Component Categories Portal 101 and is responsible for the overall ranked classification of the software components. The Categorizer 104 invokes the services to classify the software components on multiple dimensions of information sources and techniques used. The Categorizer 104 classifies the software components based on different information collected and techniques. The techniques used include, but are not limited to, document classification, clustering and entity mapping.
In some embodiments, the Classification ML Model builder 105 is responsible for creating the machine learning models for classifying the software components based on the different information sources with training for prediction tasks. The Classification ML Model builder 105 trains multiple models with data extracted from the documentation, code, etc., and provides for multiple Model services to classify based on entity extraction, clustering and document classification techniques.
In some embodiments, the Natural Language Processing (NLP) Cleanup Services 106 is in communication with the Classification ML Model builder 105 to extract the needed sections of information associated with the software component and clean them up; in other words, it provides natural language processing services for removing the unnecessary information from the extracted content. Key technology-related words are retained during the cleanup process by using a special dictionary of software technology terms.
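A minimal sketch of such a cleanup pass is given below, assuming that hyperlinks, version strings and stopwords are what gets removed and that any token found in the technology dictionary is always kept. The stopword list, the dictionary contents, the regular expressions and the function name `clean_text` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical inputs: a tiny stopword list and a tiny dictionary of
# software technology terms. Real lists would be far larger.
STOPWORDS = {"a", "an", "and", "for", "is", "of", "the", "to", "with"}
TECH_TERMS = {"spring-boot", "rest", "api", "c++"}

URL_RE = re.compile(r"https?://\S+")
VERSION_RE = re.compile(r"\bv?\d+(?:\.\d+)+\b")


def clean_text(text: str) -> list[str]:
    """Return cleaned, lower-cased tokens, keeping dictionary terms intact."""
    text = URL_RE.sub(" ", text.lower())   # drop hyperlinks
    text = VERSION_RE.sub(" ", text)       # drop version strings such as 2.4.1
    tokens = []
    for raw in text.split():
        token = raw.strip(".,;:()[]\"'")
        if not token:
            continue
        if token in TECH_TERMS:
            # Technology terms are retained even if they contain symbols.
            tokens.append(token)
        elif token not in STOPWORDS and token.isalpha():
            tokens.append(token)
    return tokens
```

Checking the dictionary before the stopword and alphabetic filters is the design choice that keeps terms such as `spring-boot` or `c++` from being discarded as noise.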
In some embodiments, the NLP Extractor 107 is in communication with the NLP Cleanup Services 106 to extract key software entities based on the software components for classification. Hence, the NLP Extractor 107 uses natural language processing services to extract key software entities from the collected software component information. The trained models that are based on the software dictionary and the collected software component information are used with natural language processing to extract the information.
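As a rough illustration of entity extraction, the sketch below finds software dictionary terms in a piece of text. It is a simplification under stated assumptions: the disclosure uses trained models, whereas this stand-in performs only a dictionary lookup, and the function name `extract_entities` and the example dictionary are hypothetical.

```python
import re


def extract_entities(text: str, dictionary: set[str]) -> list[str]:
    """Return dictionary terms found in the text, ordered by first appearance.

    A plain dictionary lookup used as a stand-in for the trained models
    described in the disclosure.
    """
    lowered = text.lower()
    found = []
    for term in dictionary:
        # Require that the term is not embedded in a longer word or
        # hyphenated token.
        pattern = r"(?<![\w-])" + re.escape(term.lower()) + r"(?![\w-])"
        match = re.search(pattern, lowered)
        if match:
            found.append((match.start(), term))
    return [term for _, term in sorted(found)]


# Hypothetical dictionary for illustration only.
SOFTWARE_DICTIONARY = {"spring boot", "rest api", "spring mvc", "kafka"}
```

For example, `extract_entities("Spring Boot starter for building REST API services on Spring MVC", SOFTWARE_DICTIONARY)` returns the three matching terms in the order in which they appear in the text.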
In some embodiments, the Clustering Service 108 is in communication with the NLP Extractor 107 to group similar software components together. The Clustering Service 108 takes all the software component information and groups them into clusters having similar software components using the machine learning services and trained models. The Clustering Service 108 then extracts key definition terms from the cluster documentation associated with the clusters and other information collected earlier aided by the software technology terms dictionary.
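The grouping of similar software components can be sketched as below. The disclosure does not specify a clustering algorithm, so this example swaps in a simple greedy, single-pass grouping based on Jaccard similarity of token sets; the function names, the 0.3 threshold and the example components are assumptions made for illustration.

```python
def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_components(components: dict[str, set[str]],
                       threshold: float = 0.3) -> list[list[str]]:
    """Greedily place each component in the first sufficiently similar cluster.

    Each cluster is represented by the token set of its first member (its
    seed). This is an illustrative stand-in, not the disclosed method.
    """
    clusters: list[tuple[set[str], list[str]]] = []  # (seed tokens, members)
    for name, tokens in components.items():
        for seed, members in clusters:
            if jaccard(seed, tokens) >= threshold:
                members.append(name)
                break
        else:
            # No existing cluster was similar enough: start a new one.
            clusters.append((set(tokens), [name]))
    return [members for _, members in clusters]


# Hypothetical components described by their cleaned tokens.
EXAMPLE_COMPONENTS = {
    "orm-lite": {"database", "sql", "orm"},
    "pay-kit": {"payment", "gateway", "checkout"},
    "sql-mapper": {"database", "sql", "mapper"},
}
```

With the hypothetical data, `cluster_components(EXAMPLE_COMPONENTS)` groups `orm-lite` with `sql-mapper` and leaves `pay-kit` in its own cluster.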
In some embodiments, the Classification Service 109 is in communication with the Clustering Service 108 to classify the software components. The Classification Service 109 is used to predict the category of the software components classification based on the cluster documentation extracted by the Clustering Service 108. The Classification Service 109 applies a threshold-based mechanism to report the top categories of software components classification with their confidence scores, the classification being done using models trained on existing content labelled with the categories.
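A threshold-based report of top categories can be sketched as follows, assuming the classifier has already produced one confidence score per category. The function name `top_categories`, the 0.5 threshold, the limit of three and the example scores are hypothetical.

```python
def top_categories(scores: dict[str, float],
                   threshold: float = 0.5,
                   limit: int = 3) -> list[tuple[str, float]]:
    """Return up to `limit` (category, confidence) pairs at or above the
    threshold, ordered from highest to lowest confidence."""
    kept = [(category, score) for category, score in scores.items()
            if score >= threshold]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:limit]


# Hypothetical confidence scores for one software component.
EXAMPLE_SCORES = {
    "web-framework": 0.82,
    "database": 0.55,
    "security": 0.12,
    "messaging": 0.60,
}
```

Here `top_categories(EXAMPLE_SCORES)` drops `security`, which falls below the threshold, and reports the remaining three categories with their scores.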
In some embodiments, the Dictionary Rules Service 110 is in communication with the Classification Service 109 and provides software dictionary terms and rules based on the software components for classification ranking. The Dictionary Rules Service 110 provides the different rules for ranking the classifications based on different parameters of project source code metrics and documentation maturity level. The Dictionary Rules Service 110 provides the key software dictionary terms for categories.
In some embodiments, the Category Ranking Service 111 is in communication with the Dictionary Rules Service 110 to compute final top ranked classifications for the software components. The Category Ranking Service 111 fetches the different classifications for the software components done with different services. The Category Ranking Service 111 then evaluates the classification score based on the project source code metrics. The Category Ranking Service 111 evaluates the classification score based on the documentation maturity level. The Category Ranking Service 111 applies rules to, and normalizes, the classification scores in order to rank the classifications.
In some embodiments, the Repo Services 112 provides integration services for connecting to Project Repository 116. This enables the System 100 to get the source code of the software component and the software component documentation. After fetching this information, the Repo Services 112 saves the software component information, source code and documentation to the database and file store.
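One possible shape for the final ranking step is sketched below. The disclosure does not give a scoring formula, so everything here is an assumption: each classification is taken to be a (category, score, source) triple, the weights stand in for rules derived from project source code metrics and documentation maturity level, and normalization divides by the total so that the ranked scores sum to one.

```python
from collections import defaultdict


def rank_categories(classifications: list[tuple[str, float, str]],
                    source_weights: dict[str, float]) -> list[tuple[str, float]]:
    """Combine weighted scores per category, normalize, and sort descending.

    `classifications` holds (category, score, source) triples produced by
    different services; `source_weights` maps a source to its weight.
    Both the data shape and the weighting rule are hypothetical.
    """
    totals: dict[str, float] = defaultdict(float)
    for category, score, source in classifications:
        totals[category] += score * source_weights.get(source, 1.0)

    grand_total = sum(totals.values())
    if grand_total == 0:
        return []

    ranked = [(category, total / grand_total) for category, total in totals.items()]
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Hypothetical classifications and weights for illustration only.
EXAMPLE_CLASSIFICATIONS = [
    ("web-framework", 0.8, "docs"),
    ("web-framework", 0.6, "code"),
    ("database", 0.4, "docs"),
]
EXAMPLE_WEIGHTS = {"docs": 0.5, "code": 1.0}
```

With these hypothetical inputs, `rank_categories(EXAMPLE_CLASSIFICATIONS, EXAMPLE_WEIGHTS)` places `web-framework` ahead of `database`.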
In some embodiments, in step 302, the Source Code 308 is parsed and code comments are collected. Then the documentation is analysed to extract the contextual sections for the software component. The information sections include but are not limited to Short Description 310, Full description 311, Features 312, Code Comments 313, Project Tags 314, Dependent Libraries 315, and Release notes (not shown). Below is a sample output from step 302:
In some embodiments, Process 300 further includes steps 303, 304, 305, and 306, described in further detail below in regards to
Referring to
Referring to
'predicted_topics': {'spring-boot', 'spring', 'spring-mvc'}
Referring to
In regards to the sample output of step 305, for every predicted topic, the respective matches found in the rule book are displayed above. The first value of each array contains the index from the rule book, the second value is a subcategory of the software document under test, the third value is a category of the software document under test, the fourth value indicates whether the entry will be considered as a technical label of the software document under test, the fifth value indicates whether the match from the rule book is a direct match or not, and the sixth value indicates whether the name of the software document under test matches the predicted topic or not.
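The six-value array described above can be given named fields as in the sketch below. The field names, the value types (an integer index and boolean flags) and the example rows are assumptions; the ordering helper, which prefers name matches and then direct matches, is likewise only one plausible reading of how such flags might feed into ranking.

```python
from typing import NamedTuple


class RuleMatch(NamedTuple):
    """One rule book match; field order follows the six values described."""
    rule_index: int           # index from the rule book
    subcategory: str          # subcategory of the document under test
    category: str             # category of the document under test
    is_tech_label: bool       # treated as a technical label
    is_direct_match: bool     # direct (True) or indirect (False) match
    name_matches_topic: bool  # document name matches the predicted topic


def order_matches(rows: list[list]) -> list[RuleMatch]:
    """Convert raw six-value arrays to RuleMatch and order them so that
    name matches come first, then direct matches (hypothetical rule)."""
    matches = [RuleMatch(*row) for row in rows]
    return sorted(matches,
                  key=lambda m: (m.name_matches_topic, m.is_direct_match),
                  reverse=True)


# Hypothetical rows, not taken from the sample output.
EXAMPLE_ROWS = [
    [7, "orm", "database", True, False, False],
    [12, "web-framework", "application-development", True, True, True],
]
```

Naming the fields avoids positional indexing such as `row[4]` when the flags are consulted later in the ranking step.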
Referring again to
In regards to the sample output of step 306, the key technologyCategoryDetails contains all the possible categories as an array. The example here contains only one category along with its computed rank. The concluded labels, the category, the subcategory and the probability of classification are found against the keys techLabels, primaryTechCategory, primaryTechSubcategory, and primaryTechProbability respectively.
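A consumer of such an output might read the keys as in the sketch below. Only the five top-level key names come from the description; the inner field names `category` and `rank`, the assumption that a lower rank number is better, the use of a list for techLabels, and all example values are hypothetical.

```python
def summarize_result(result: dict) -> str:
    """Build a one-line summary from the documented top-level keys."""
    labels = ", ".join(result["techLabels"])
    return (f"{result['primaryTechCategory']} / "
            f"{result['primaryTechSubcategory']} "
            f"(p={result['primaryTechProbability']:.2f}; labels: {labels})")


def best_category(result: dict):
    """Return the entry with the lowest rank number, or None if empty.

    Assumes each entry is a dict with hypothetical 'category' and 'rank'
    fields and that rank 1 is the best rank.
    """
    details = result["technologyCategoryDetails"]
    return min(details, key=lambda entry: entry["rank"]) if details else None


# Hypothetical record shaped after the keys named in the description.
EXAMPLE_RESULT = {
    "technologyCategoryDetails": [{"category": "web-framework", "rank": 1}],
    "techLabels": ["spring-boot", "spring"],
    "primaryTechCategory": "application-development",
    "primaryTechSubcategory": "web-framework",
    "primaryTechProbability": 0.9,
}
```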
As will be appreciated by one of skill in the art, the present disclosure may be embodied as a method and system. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, a software embodiment or an embodiment combining software and hardware aspects. It will be understood that the functions of any of the units as described above can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts performed by any of the units as described above.
Instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function/act performed by any of the units as described above.
Instructions may also be loaded onto a computer or other programmable data processing apparatus like a scanner/check scanner to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions/acts performed by any of the units as described above.
In the specification, there have been disclosed exemplary embodiments of the present disclosure. Although specific terms are employed, they are used in a generic and descriptive sense only and not for purposes of limitation of the scope of the present disclosure.
This application claims the benefit of and priority to U.S. Provisional Patent Application No. 63/154,381 filed Feb. 26, 2021, the entire disclosure of which is incorporated by reference herein.
Number | Name | Date | Kind |
---|---|---|---|
5953526 | Day et al. | Sep 1999 | A |
7322024 | Carlson et al. | Jan 2008 | B2 |
7703070 | Bisceglia | Apr 2010 | B2 |
7774288 | Acharya et al. | Aug 2010 | B2 |
7958493 | Lindsey et al. | Jun 2011 | B2 |
8010539 | Blair-Goldensohn et al. | Aug 2011 | B2 |
8051332 | Zakonov et al. | Nov 2011 | B2 |
8112738 | Pohl et al. | Feb 2012 | B2 |
8112744 | Geisinger | Feb 2012 | B2 |
8219557 | Grefenstette et al. | Jul 2012 | B2 |
8296311 | Rapp et al. | Oct 2012 | B2 |
8412813 | Carlson et al. | Apr 2013 | B2 |
8417713 | Blair-Goldensohn et al. | Apr 2013 | B1 |
8452742 | Hashimoto et al. | May 2013 | B2 |
8463595 | Rehling et al. | Jun 2013 | B1 |
8498974 | Kim et al. | Jul 2013 | B1 |
8627270 | Fox et al. | Jan 2014 | B2 |
8677320 | Wilson et al. | Mar 2014 | B2 |
8688676 | Rush et al. | Apr 2014 | B2 |
8838606 | Cormack et al. | Sep 2014 | B1 |
8838633 | Dhillon et al. | Sep 2014 | B2 |
8935192 | Ventilla et al. | Jan 2015 | B1 |
8943039 | Grieselhuber et al. | Jan 2015 | B1 |
9015730 | Allen et al. | Apr 2015 | B1 |
9043753 | Fox et al. | May 2015 | B2 |
9047283 | Zhang et al. | Jun 2015 | B1 |
9135665 | England et al. | Sep 2015 | B2 |
9176729 | Mockus et al. | Nov 2015 | B2 |
9201931 | Lightner et al. | Dec 2015 | B2 |
9268805 | Crossley et al. | Feb 2016 | B2 |
9330174 | Zhang | May 2016 | B1 |
9361294 | Smith | Jun 2016 | B2 |
9390268 | Martini et al. | Jul 2016 | B1 |
9471559 | Castelli et al. | Oct 2016 | B2 |
9558098 | Alshayeb et al. | Jan 2017 | B1 |
9589250 | Palanisamy et al. | Mar 2017 | B2 |
9626164 | Fuchs | Apr 2017 | B1 |
9672554 | Dumon et al. | Jun 2017 | B2 |
9977656 | Mannopantar et al. | May 2018 | B1 |
10305758 | Bhide et al. | May 2019 | B1 |
10474509 | Dube et al. | Nov 2019 | B1 |
10484429 | Fawcett et al. | Nov 2019 | B1 |
10761839 | Migoya et al. | Sep 2020 | B1 |
10922740 | Gupta et al. | Feb 2021 | B2 |
11023210 | Li et al. | Jun 2021 | B2 |
11238027 | Frost et al. | Feb 2022 | B2 |
11256484 | Nikumb et al. | Feb 2022 | B2 |
11288167 | Vaughan | Mar 2022 | B2 |
11294984 | Kittur et al. | Apr 2022 | B2 |
11295375 | Chitrapura et al. | Apr 2022 | B1 |
11301631 | Atallah et al. | Apr 2022 | B1 |
11334351 | Pandurangarao | May 2022 | B1 |
11461093 | Edminster et al. | Oct 2022 | B1 |
11474817 | Sousa et al. | Oct 2022 | B2 |
11704406 | Lee et al. | Jul 2023 | B2 |
11893117 | Segal et al. | Feb 2024 | B2 |
11966446 | Socher et al. | Apr 2024 | B2 |
12034754 | O'Hearn et al. | Jul 2024 | B2 |
20010054054 | Olson | Dec 2001 | A1 |
20020059204 | Harris | May 2002 | A1 |
20020099694 | Diamond et al. | Jul 2002 | A1 |
20020150966 | Muraca | Oct 2002 | A1 |
20020194578 | Irie et al. | Dec 2002 | A1 |
20040243568 | Wang et al. | Dec 2004 | A1 |
20060090077 | Little et al. | Apr 2006 | A1 |
20060104515 | King et al. | May 2006 | A1 |
20060200741 | Demesa et al. | Sep 2006 | A1 |
20060265232 | Katariya et al. | Nov 2006 | A1 |
20070050343 | Siddaramappa et al. | Mar 2007 | A1 |
20070168946 | Drissi et al. | Jul 2007 | A1 |
20070185860 | Lissack | Aug 2007 | A1 |
20070234291 | Ronen et al. | Oct 2007 | A1 |
20070299825 | Rush et al. | Dec 2007 | A1 |
20090043612 | Szela et al. | Feb 2009 | A1 |
20090319342 | Shilman et al. | Dec 2009 | A1 |
20100106705 | Rush et al. | Apr 2010 | A1 |
20100121857 | Elmore et al. | May 2010 | A1 |
20100122233 | Rath | May 2010 | A1 |
20100174670 | Malik et al. | Jul 2010 | A1 |
20100205198 | Mishne et al. | Aug 2010 | A1 |
20100205663 | Ward et al. | Aug 2010 | A1 |
20100262454 | Sommer et al. | Oct 2010 | A1 |
20110231817 | Hadar et al. | Sep 2011 | A1 |
20120143879 | Stoitsev | Jun 2012 | A1 |
20120259882 | Thakur et al. | Oct 2012 | A1 |
20120278064 | Leary et al. | Nov 2012 | A1 |
20130103662 | Epstein | Apr 2013 | A1 |
20130117254 | Manuel-Devadoss et al. | May 2013 | A1 |
20130254744 | Sahoo | Sep 2013 | A1 |
20130326469 | Fox et al. | Dec 2013 | A1 |
20140040238 | Scott et al. | Feb 2014 | A1 |
20140075414 | Fox et al. | Mar 2014 | A1 |
20140122182 | Cherusseri et al. | May 2014 | A1 |
20140149894 | Watanabe et al. | May 2014 | A1 |
20140163959 | Hebert et al. | Jun 2014 | A1 |
20140188746 | Li | Jul 2014 | A1 |
20140297476 | Wang et al. | Oct 2014 | A1 |
20140331200 | Wadhwani et al. | Nov 2014 | A1 |
20140337355 | Heinze | Nov 2014 | A1 |
20150127567 | Menon et al. | May 2015 | A1 |
20150220608 | Crestani Campos et al. | Aug 2015 | A1 |
20150331866 | Shen et al. | Nov 2015 | A1 |
20150378692 | Dang et al. | Dec 2015 | A1 |
20160253688 | Nielsen et al. | Sep 2016 | A1 |
20160350105 | Kumar et al. | Dec 2016 | A1 |
20160378618 | Cmielowski et al. | Dec 2016 | A1 |
20170034023 | Nickolov et al. | Feb 2017 | A1 |
20170063776 | Nigul | Mar 2017 | A1 |
20170154543 | King | Jun 2017 | A1 |
20170177318 | Mark et al. | Jun 2017 | A1 |
20170220633 | Porath et al. | Aug 2017 | A1 |
20170242892 | Ali et al. | Aug 2017 | A1 |
20170286541 | Mosley et al. | Oct 2017 | A1 |
20170286548 | De et al. | Oct 2017 | A1 |
20180046609 | Agarwal et al. | Feb 2018 | A1 |
20180067836 | Apkon et al. | Mar 2018 | A1 |
20180107983 | Tian et al. | Apr 2018 | A1 |
20180114000 | Taylor | Apr 2018 | A1 |
20180189055 | Dasgupta | Jul 2018 | A1 |
20180191599 | Balasubramanian et al. | Jul 2018 | A1 |
20180329883 | Leidner et al. | Nov 2018 | A1 |
20180349388 | Skiles | Dec 2018 | A1 |
20190026106 | Burton et al. | Jan 2019 | A1 |
20190229998 | Cattoni | Jul 2019 | A1 |
20190278933 | Bendory et al. | Sep 2019 | A1 |
20190286683 | Kittur et al. | Sep 2019 | A1 |
20190294703 | Bolin et al. | Sep 2019 | A1 |
20190303141 | Li et al. | Oct 2019 | A1 |
20190311044 | Xu et al. | Oct 2019 | A1 |
20190324981 | Counts et al. | Oct 2019 | A1 |
20200097261 | Smith et al. | Mar 2020 | A1 |
20200110839 | Wang et al. | Apr 2020 | A1 |
20200125482 | Smith et al. | Apr 2020 | A1 |
20200133830 | Sharma et al. | Apr 2020 | A1 |
20200293354 | Song et al. | Sep 2020 | A1 |
20200301672 | Li et al. | Sep 2020 | A1 |
20200301908 | Frost et al. | Sep 2020 | A1 |
20200348929 | Sousa et al. | Nov 2020 | A1 |
20200356363 | Dewitt et al. | Nov 2020 | A1 |
20210049091 | Hikawa et al. | Feb 2021 | A1 |
20210065045 | Kummamuru | Mar 2021 | A1 |
20210073293 | Fenton et al. | Mar 2021 | A1 |
20210081189 | Nucci et al. | Mar 2021 | A1 |
20210081418 | Silveira et al. | Mar 2021 | A1 |
20210141863 | Wu et al. | May 2021 | A1 |
20210149658 | Cannon et al. | May 2021 | A1 |
20210149668 | Gupta et al. | May 2021 | A1 |
20210303989 | Bird et al. | Sep 2021 | A1 |
20210349801 | Rafey | Nov 2021 | A1 |
20210357210 | Clement et al. | Nov 2021 | A1 |
20210382712 | Richman et al. | Dec 2021 | A1 |
20210397418 | Nikumb et al. | Dec 2021 | A1 |
20210397546 | Cser et al. | Dec 2021 | A1 |
20220012297 | Basu et al. | Jan 2022 | A1 |
20220083577 | Yoshida et al. | Mar 2022 | A1 |
20220107802 | Rao et al. | Apr 2022 | A1 |
20220215068 | Kittur et al. | Jul 2022 | A1 |
20230308700 | Perez | Sep 2023 | A1 |
Number | Date | Country |
---|---|---|
108052442 | May 2018 | CN |
10-2020-0062917 | Jun 2020 | KR |
WO-2007013418 | Feb 2007 | WO |
WO-2020086773 | Apr 2020 | WO |
Entry |
---|
Chung-Yang et al. “Toward Single-Source Of Software Project Documented Contents: A Preliminary Study”, [Online], [Retrieved from Internet on Sep. 28, 2024], <https://www.proquest.com/openview/c15dc8b34c7da061fd3ea39f1875d8e9/1?pq-origsite=gscholar&cbl=237699> (Year: 2011). |
Lampropoulos et al., “REACT—A Process for Improving Open-Source Software Reuse”, IEEE, pp. 251-254 (Year: 2018). |
Leclair et al., “A Neural Model for Generating Natural Language Summaries of Program Subroutines,” Collin McMillan, Dept. of Computer Science and Engineering, University of Notre Dame Notre Dame, IN, USA, Feb. 5, 2019. |
Schweik et al, Proceedings of the OSS 2011 Doctoral Consortium, Oct. 5, 2011, Salvador, Brazil, pp. 1-100, http://works.bepress.com/charles_schweik/20 (Year: 2011). |
Stanciulescu et al, “Forked and Integrated Variants in an Open-Source Firmware Project”, IEEE, pp. 151-160 (Year: 2015). |
Zaimi et al, “An Empirical Study on the Reuse of Third-Party Libraries in Open-Source Software Development”, ACM, pp. 1-8 (Year: 2015). |
Iderli Souza, An Analysis of Automated Code Inspection Tools for PHP Available on Github Marketplace, Sep. 2021, pp. 10-17 (Year: 2021). |
Khatri et al, “Validation of Patient Headache Care Education System (PHCES) Using a Software Reuse Reference Model”, Journal of System Architecture, pp. 157-162 (Year: 2001). |
Lotter et al, “Code Reuse in Stack Overflow and Popular Open Source Java Projects”, IEEE, pp. 141-150 (Year: 2018). |
Rothenberger et al, “Strategies for Software Reuse: A Principal Component Analysis of Reuse Practices”, IEEE, pp. 825-837 (Year: 2003). |
Tung et al, “A Framework of Code Reuse in Open Source Software”, ACM, pp. 1-6 (Year: 2014). |
M. Squire, “Should We Move to Stack Overflow?” Measuring the Utility of Social Media for Developer Support, 2015 IEEE/ACM 37th IEEE International Conference on Software Engineering, Florence, Italy, 2015, pp. 219-228, doi: 10.1109/ICSE.2015.150. (Year: 2015). |
S. Bayati, D. Parson, T. Susnjak and M. Heidary, “Big data analytics on large-scale socio-technical software engineering archives,” 2015 3rd International Conference on Information and Communication Technology (ICoICT), Nusa Dua, Bali, Indonesia, 2015, pp. 65-69, doi: 10.1109/ICoICT.2015.7231398. (Year: 2015). |
Andreas Dautovic, “Automatic Assessment of Software Documentation Quality”, published by IEEE, ASE 2011, Lawrence, KS, USA, pp. 665-669 (Year: 2011). |
Number | Date | Country |
---|---|---|
20220291921 A1 | Sep 2022 | US |
Number | Date | Country |
---|---|---|
63154381 | Feb 2021 | US |