1. Field of the Invention
The invention disclosed and claimed herein generally pertains to crowdsourcing, wherein software service providers are engaged online to perform specified tasks. More particularly, the invention pertains to crowdsourcing of the above type, wherein tasks are of varying complexity, and are generated automatically according to specifications provided by an entity such as the task owner or requester.
2. Description of the Related Art
As is known by those of skill in the art, Web 2.0 Technologies® have significantly enhanced interactive information sharing and collaboration over the Internet. This has enabled crowdsourcing to develop as an increasingly popular approach for performing certain kinds of important tasks. In a crowdsourcing effort or procedure, a large group of organizations, individuals and other entities that desire to provide pertinent services, such as a specific community of providers or the general public, are invited to participate in a task that is presented by a task requester. Examples of such tasks include, but are not limited to, developing specified software components or the like.
At present, a crowdsourcing platform may serve as a broker or intermediary between the task requester and software providers who are interested in undertaking or participating in task performance. Crowdsourcing platforms generally allow requesters to publish or broadcast their challenges and tasks, and further allow participating providers that are successful in completing the task to receive specified monetary rewards or other incentives. Innocentive®, TopCoder®, and MechanicalTurk® are examples of presently available platforms.
Currently, however, there is no system or process available for automatically creating tasks that are to be broadcast to a crowd for crowdsourcing. Instead, such tasks typically must be manually assembled. At present, scripts or API mechanisms may be used to generate multiple instances of a task that has already been manually created. However, creation of the core tasks is generally not addressed by the current state of the art. Thus, popular and successful crowdsourcing platforms require task requesters to manually enter their requirements and post their tasks.
Embodiments of the invention provide a method, system and computer program product for automatically generating tasks for crowdsourcing, wherein each task is characterized by one of multiple levels of complexity. Also, some embodiments may be adapted to mine the vast amounts of data which are presently available online, for use in generating crowdsourcing tasks. One exemplary embodiment of the invention is directed to a method for automatically generating a set of tasks for use in connection with a crowdsourcing procedure, in order to achieve a prespecified goal. The method comprises the step of constructing a first template which pertains to the prespecified goal, provides a specified context and is associated with one or more data repositories. The method further comprises using the first template to generate one or more first level tasks, wherein a given one of the first level tasks is generated by selecting a single object of a specified type from a repository containing a plurality of objects, and then generating a specification for performing a single operation on the single object encapsulated by the task. The given first level task is submitted to the crowdsourcing procedure for execution, in order to complete the given first level task.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or computer program product. Accordingly, aspects of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects of the present invention may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
Aspects of the present invention are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
Referring to
As an input to enable creation or generation of respective tasks, Task Generator 102 must receive a task template such as task template 108a or 108b. A task template specifies a high level description of a method with operations to perform over one or more repositories to generate a task. Users who wish to use crowdsourcing first define a template for the task and then identify one or more repositories or other data sources that contain data objects, such as repositories 110a and 110b, from which tasks are to be generated using the procedure defined in the template. A repository may consist of a static data source such as a spreadsheet, a file or a database. A repository may also be a dynamic data source such as a distributed file system or even the World Wide Web, which may be searched for a specific type of data using the method specified in the template. Using the method and directives in the template and the specified repositories, Task Generator 102 then repeatedly performs actions over the repositories to automatically produce or define specific tasks. The task template may further specify a set of rules and/or preconditions to be used when processing the data in the repositories. Task Generator 102 is additionally supported by a context specific taxonomy such as contextual taxonomy schema 112a and 112b. These further define the context for respective tasks which must be carried out in order to achieve the specified goal.
In embodiments of the invention, a first level task, also referred to as a pattern I type of task, is characterized as being created from well defined atomic objects which are contained in a repository. A first level task may be defined and generated by selecting a single atomic data object from the repository. Such a task may then be completed by performing a single, well defined operation on the information supplied by or contained in the selected single object. First level tasks are generally characterized further by well defined choices, explicit patterns and rules and well defined data sources, and can be readily carried out using patterns and rules.
Examples of first level tasks include transcription, classification, topic specific content generation, data collection, image tagging, feedback, usability tests, and proofreading. It will be seen that tasks of this sort may be readily generated by repeatedly applying a template to each of multiple objects contained in a repository. Generation of first level tasks can be automated in a straightforward manner.
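By way of a non-limiting illustration, generation of first level tasks from a template and a repository might be sketched as follows. The names TaskTemplate, FirstLevelTask and generate_first_level_tasks, the dictionary layout of repository objects, and the precondition hook are assumptions introduced only for this sketch and are not taken from the disclosure.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class TaskTemplate:
    """Hypothetical template: a goal, an object type and one operation."""
    goal: str
    object_type: str
    operation: str
    # Optional precondition deciding whether an object yields a task.
    precondition: Callable[[dict], bool] = lambda obj: True


@dataclass(frozen=True)
class FirstLevelTask:
    """A single operation specified for a single atomic object."""
    goal: str
    operation: str
    target: dict


def generate_first_level_tasks(template: TaskTemplate,
                               repository: Iterable[dict]) -> List[FirstLevelTask]:
    """Apply the template once to every matching object in the repository."""
    tasks = []
    for obj in repository:
        if obj.get("type") != template.object_type:
            continue  # object is not of the type named by the template
        if not template.precondition(obj):
            continue  # template rule excludes this object
        tasks.append(FirstLevelTask(template.goal, template.operation, obj))
    return tasks


if __name__ == "__main__":
    repo = [
        {"type": "image", "id": "img-1"},
        {"type": "image", "id": "img-2"},
        {"type": "audio", "id": "aud-1"},
    ]
    template = TaskTemplate(goal="catalogue media",
                            object_type="image", operation="tag")
    for task in generate_first_level_tasks(template, repo):
        print(task.operation, task.target["id"])
```

In this sketch each generated task encapsulates exactly one object and one operation, which corresponds to the pattern I characterization given above.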
A second level task, also referred to as a pattern II type of task, is generated by combining two or more data objects which are taken either from the same repository or from different repositories. Thus, in order to generate a pattern II type of task, it is necessary to select two or more objects from the same or different repositories and perform a set of operations on these objects. The associated templates specify the method for selecting data objects and the operations to carry out. In some cases, a mash-up of data may be required for defining a task of this type. That is, a second level task requires an operation that selectively blends or combines information which is provided by each of the two or more objects. Examples of a mash-up or blending operation include, without limitation, filtering, sorting, merging and correlating the provided information, as shown by function 114 of Task Generator 102.
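As a hedged illustration of such a mash-up, the following sketch pairs objects from two repositories that share a common attribute, applying filtering, correlating, merging and sorting in turn. The function name generate_second_level_tasks, the key parameter, and the dictionary representation of objects and tasks are hypothetical and are used only to make the example concrete.

```python
from typing import Dict, List


def generate_second_level_tasks(left_repo: List[Dict], right_repo: List[Dict],
                                key: str, operation: str) -> List[Dict]:
    """Combine objects from two repositories into pattern II style tasks."""
    # Filter: keep only objects that carry the correlation key.
    left = [o for o in left_repo if key in o]
    right = [o for o in right_repo if key in o]

    # Correlate: index one repository by the key value.
    by_key: Dict[object, List[Dict]] = {}
    for obj in right:
        by_key.setdefault(obj[key], []).append(obj)

    tasks = []
    for l_obj in left:
        for r_obj in by_key.get(l_obj[key], []):
            # Merge: each task encapsulates both objects and the operation.
            tasks.append({"operation": operation, "objects": (l_obj, r_obj)})

    # Sort: order tasks by the shared key value (stable for equal keys).
    tasks.sort(key=lambda t: str(t["objects"][0][key]))
    return tasks


if __name__ == "__main__":
    applications = [{"name": "payroll", "country": "DE"},
                    {"name": "crm", "country": "US"}]
    regulations = [{"rule": "reg-de-1", "country": "DE"},
                   {"rule": "reg-us-1", "country": "US"}]
    for task in generate_second_level_tasks(applications, regulations,
                                            "country", "map"):
        print(task["objects"][0]["name"], task["objects"][1]["rule"])
```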
Second level tasks are further characterized as having well-defined task operations and well-defined choices. However, some parts or portions of a second level task pattern may not be explicitly defined, or may require further evaluation. Thus, further search and discovery may be required. Moreover, some parts of the task pattern may need results from certain first level tasks or from dynamic rules. In like manner with first level tasks, second level tasks can be iterated to perform the same task a number of times, on different objects contained in one or multiple repositories. The decomposition of second level tasks into subtasks, the orchestration of subtasks, and the merging and processing of intermediate results, which may lead to generation of additional subtasks, are handled by Task Generator 102. The subtasks may be crowdsourced or processed via machine automation. The entire orchestration for each main task is performed using a workflow for that task. The task generator generates the workflow using the associated template. Some portions of the workflow may be evaluated dynamically based on the intermediate results of the subtasks. For example, the workflow may consist of branching or additional search and subtask generation based on the aggregation of results from completed subtasks.
As described above, Task Generator 102 is provided with the capability to recognize the comparative complexity of the tasks which it generates. Using the criteria specified in the templates and the rules, it recognizes the method to use for task evaluation. Tasks that meet the automation criteria are handled by routing to machine processing by invoking appropriate functions in Automatable Tasks Function 106. Tasks that meet the crowdsourcing criteria are dispatched to a crowdsourcing platform that is described in a subsequent section. The remaining tasks that do not meet either criteria are flagged as exception tasks and are sent back for further evaluation by experts of the system. In one embodiment, by way of example, the criteria could be that each task defined to be a first level task would be routed to Automatable Tasks Function 106 for automated processing, and each second level task would be routed for crowdsourcing. However, it is not intended to limit the invention thereto. In another embodiment, there could be criteria that would route some first level tasks to the crowdsourcing platform, and/or would route some second level tasks for automated processing.
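The three-way routing described above might be sketched as follows. The Route enumeration, the route_task function and the two predicate parameters are assumptions made for illustration; in particular, checking the automation criteria before the crowdsourcing criteria is a choice of this sketch and is not stated in the disclosure.

```python
from enum import Enum
from typing import Callable, Dict


class Route(Enum):
    AUTOMATION = "automation"        # Automatable Tasks Function 106
    CROWDSOURCING = "crowdsourcing"  # crowdsourcing platform
    EXCEPTION = "exception"          # returned for expert evaluation


def route_task(task: Dict,
               meets_automation: Callable[[Dict], bool],
               meets_crowdsourcing: Callable[[Dict], bool]) -> Route:
    """Pick a destination for a task using template-supplied criteria."""
    if meets_automation(task):
        return Route.AUTOMATION
    if meets_crowdsourcing(task):
        return Route.CROWDSOURCING
    # Neither set of criteria is met: flag the task as an exception.
    return Route.EXCEPTION


if __name__ == "__main__":
    # Criteria of the exemplary embodiment: first level tasks are
    # automated and second level tasks are crowdsourced.
    automate = lambda t: t.get("level") == 1
    crowdsource = lambda t: t.get("level") == 2
    for level in (1, 2, 3):
        print(level, route_task({"level": level}, automate, crowdsource).value)
```

Because the criteria are passed in as functions, a different embodiment could route some first level tasks to the crowd, or some second level tasks to automation, without changing the routing logic itself.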
When a task is routed to the crowdsourcing platform, Task Generator 102 also specifies the qualifications of the task evaluator and, when possible, choices to be provided to the task evaluator. Tasks designated for evaluation using crowdsourcing are routed to Crowdsourced Tasks Manager 104, which interfaces with the crowdsourcing platform described subsequently. Inputs to Crowdsourced Tasks Manager 104 include a Task Description/Operations input, which sets forth respective properties of a task, and a Choices to be Made input. When a particular task is selected for crowdsourcing, Crowdsourced Tasks Manager 104 specifies a Target Crowd as an output, to which the particular task is to be directed. The Target Crowd could, for example, be the general public or a specified community of crowdsourcing providers. The Target Crowd information would usefully be sent to a crowdsourcing platform along with the particular task. When subtasks are processed as part of a larger workflow associated with a task, Crowdsourced Tasks Manager 104 also acts as an aggregator of the intermediate results from crowdsourcing and feeds those back to the task orchestrator in Task Generator 102. Note that both Task Generator 102 and Crowdsourced Tasks Manager 104 are multi-task enabled. In other words, these components are capable of simultaneous marshalling of multiple tasks generated from the same template.
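The disclosure does not specify how intermediate results are aggregated. Purely as one possible illustration, the sketch below uses a simple majority vote over the answers returned for each subtask; the majority rule, the function name aggregate_results, and the (subtask identifier, answer) pair format are all assumptions of this example.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def aggregate_results(submissions: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Collapse (subtask_id, answer) pairs into one answer per subtask."""
    votes = defaultdict(Counter)
    for subtask_id, answer in submissions:
        votes[subtask_id][answer] += 1
    # For each subtask, keep the answer that received the most votes.
    return {sid: counts.most_common(1)[0][0] for sid, counts in votes.items()}


if __name__ == "__main__":
    crowd_answers = [
        ("subtask-1", "compatible"),
        ("subtask-1", "incompatible"),
        ("subtask-1", "compatible"),
        ("subtask-2", "incompatible"),
    ]
    print(aggregate_results(crowd_answers))
```

The aggregated dictionary is the kind of intermediate result that could be fed back to the task orchestrator to decide on branching or further subtask generation.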
First level tasks and second level tasks as described above, and their use in achieving a prespecified goal, are usefully illustrated by an example. The example pertains to an effort to transform the workload of a legacy application to a cloud computing environment for processing. A legacy application is typically an older application, such as a financial or accounting application or other product, which resides on a server at some location. There is increasing interest in transforming the workloads of such applications, so that they can be migrated to providers of cloud computing services for processing.
Legacy applications of the above type usually have a number of software and hardware components, which each have specific characteristics such as versions, releases, patch levels, and component dependencies. However, the resources available in a cloud computing environment are generally not customizable to the needs of different cloud user entities. Instead, such resources are limited to standardized offerings. As a common practice, providers of cloud computing services will have catalogues or portfolios of the cloud instances which they support.
In view of this situation in cloud computing, it is usually necessary to make modifications or adjustments to components of a legacy application, in order to fit the application to available standardized offerings in the cloud environment. For example, a specific function may have to be configured for a legacy application, in order to run the application on a standard cloud based server, rather than on the original application server. Moreover, the use of standard cloud based offerings, together with required modifications made to legacy application components, must not significantly diminish the capability to process legacy application workloads.
It will be appreciated that determining a small or acceptable number of standard cloud based offerings, which can be used to process substantially all of the application workloads, is likely to require a great deal of effort. For example, it will typically be necessary to access substantial amounts of information from databases that pertain to both application components and to the cloud based offerings, and to then analyze and compare such information and make decisions.
In order to automate the above effort, an embodiment of the invention can use a mechanism such as Task Generator 102 to generate the following set of tasks:
Task 1 could be carried out by successively accessing each component of a legacy application from a repository, and then applying a prespecified set of rules or conditions to the accessed component, in order to determine whether or not such component is a candidate for migration for cloud computing. Accordingly, Task 1 could be a first level task, as defined above. In a useful embodiment, based on criteria specified in the template with which Task 1 is associated, it would be determined that Task 1 is to be carried out by an automation process. Task 1 would, therefore, be routed by invoking Automatable Tasks Function 106.
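A minimal sketch of Task 1 as an automated first level task is given below. The two rules shown, the attribute names os and needs_special_hardware, and the list of supported operating systems are hypothetical placeholders; the actual prespecified rules would be supplied by the template.

```python
from typing import Callable, Dict, List

Rule = Callable[[Dict], bool]


def select_candidates(components: List[Dict], rules: List[Rule]) -> List[Dict]:
    """Return the components that satisfy every prespecified rule."""
    return [c for c in components if all(rule(c) for rule in rules)]


# Hypothetical rules, for illustration only.
def runs_on_supported_os(component: Dict) -> bool:
    return component.get("os") in {"linux", "windows"}


def has_no_special_hardware(component: Dict) -> bool:
    return not component.get("needs_special_hardware", False)


if __name__ == "__main__":
    repository = [
        {"name": "web-frontend", "os": "linux"},
        {"name": "batch-ledger", "os": "mainframe"},
        {"name": "license-server", "os": "windows",
         "needs_special_hardware": True},
    ]
    rules = [runs_on_supported_os, has_no_special_hardware]
    for component in select_candidates(repository, rules):
        print(component["name"])
```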
Task 2 seeks to determine the environment in which the existing legacy application is running. Accordingly, it determines information for candidate components such as O/S version, servers used, release and patch level information. Task 3 is provided to determine the constraints on each candidate component, with respect to dependencies on other components. Task 3 is focused more specifically on characteristics such as versions, releases, and supported protocols and platforms of these other components. It is anticipated that all the determinations required for Tasks 2 and 3 could be carried out by a series of first level tasks, each directed to accessing specific information regarding a particular one of the candidate components selected from a component repository. Moreover, it is anticipated that respective actions required for Tasks 2 and 3 could each be performed using automation. Thus, criteria specified in the template would route both Tasks 2 and 3 by invoking Automatable Tasks Function 106, in like manner with Task 1.
Task 4 pertains to determining whether the constraints determined by Task 3 can be relaxed sufficiently to allow migration to a given one of the standard cloud based offerings. Task 4 is further concerned with assessing the capacity to process legacy workloads using the given offering, and migration costs and efforts associated therewith. It is considered that Task 4 is a fairly complex second level task, as defined above. For example, Task 4 can require comparing information for each of multiple candidate components selected from one repository, with information for each of multiple given standard offerings selected from another repository.
As a second level task, Task 4 can be broken into subtasks as described above, wherein the subtasks are orchestrated by Task Generator 102. Task Generator 102 also merges and processes intermediate results, which may give rise to additional subtasks. Moreover, criteria specified by the template associated with Task 4 may route one or more of the subtasks for automated machine processing, by means of Automatable Tasks Function 106. At the same time, the criteria could direct other subtasks to a crowdsourcing platform. Orchestration of Task 4 is performed using a workflow, which is generated by Task Generator 102. Portions of this workflow can be evaluated dynamically, using intermediate results of the subtasks.
Task 5 incorporates steps of Task 4, and could thus be performed in a manner similar thereto.
As stated above, tasks of the third level are the most complex type. Third level tasks use multiple objects that are dissimilar from one another, and mash-up is required for task creation. Third level tasks involve meta-patterns, or a pattern of patterns that requires other tasks to be completed first to determine the rules for generating the pattern to be followed for getting to the solution. Examples of third level tasks include, without limitation, problem determination, troubleshooting types of problems, design optimization problems and template generation for first and second level tasks.
Referring further to
Referring to
Referring further to
As described above, a task template is a required input for task generation. Accordingly, the initial step 202 in the flow chart is to define a task template, which requires specifying a goal or set of goals as stated above. Information box 220 corresponding to step 202 discloses a conjunction of goals, which include (1) verifying a level of application compliance with certain compliance specifications; (2) verifying application hosting; and (3) verifying application support. These goals are respectively indicated by notations contained in information box 220, which are in conventional AI form. Thus, information box 220 discloses that the template sets forth three atomic tasks.
Step 204 identifies each of the input repositories that contain information needed for the respective atomic tasks. Information box 222 corresponding to step 204 discloses that for the task of verifying the level of application compliance, the repositories of interest would be the application repository and the compliance repository. The application repository contains a number of different applications as objects and the compliance repository contains pertinent compliance regulations for the applications.
While not shown in information box 222 of
Step 206 defines objects of interest. For the task of verifying the level of application compliance, information box 224 indicates that objects of interest would include a specific application selected from the application repository, and a specific compliance item selected from the compliance repository.
Step 208 defines object specific operations that are to be performed on the objects of step 206. As an exemplary operation shown by information box 226, each application of step 206 could be mapped to certain specified compliance regulations. Results of the mapping operations could also be provided.
Step 210 requires defining outcomes and choices. As an example indicated by information box 228, it is recognized that pertinent compliance regulations could be different for different countries or other geographic entities. Accordingly, step 208 would be contextualized with specific geographic information.
In constructing a task in accordance with the procedure of steps 202-210, it is useful to think of the task as starting initially as an empty pattern. Data is then fed into the pattern from different repositories, in order to customize it, during successive steps of the procedure.
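The notion of a task that begins as an empty pattern and is filled in during steps 202-210 might be sketched as follows. The TaskPattern class, its field names and the is_complete check are assumptions introduced for this example; the sample values mirror the compliance example above, except for the country code, which is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TaskPattern:
    """Hypothetical container that starts empty and is filled step by step."""
    goals: List[str] = field(default_factory=list)          # step 202
    repositories: List[str] = field(default_factory=list)   # step 204
    objects: List[str] = field(default_factory=list)        # step 206
    operations: List[str] = field(default_factory=list)     # step 208
    context: Dict[str, str] = field(default_factory=dict)   # step 210

    def is_complete(self) -> bool:
        """True once goals, repositories, objects and operations are set."""
        return all([self.goals, self.repositories,
                    self.objects, self.operations])


if __name__ == "__main__":
    pattern = TaskPattern()  # the empty pattern
    pattern.goals.append("verify application compliance")
    pattern.repositories += ["application repository", "compliance repository"]
    pattern.objects += ["application", "compliance item"]
    pattern.operations.append("map application to compliance regulations")
    pattern.context["country"] = "DE"  # hypothetical geographic context
    print(pattern.is_complete())
```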
At step 212, Task Generator 102 of
Step 214 is provided to identify exceptions. As an example, an exception could occur if it is recognized that a new task is required for which there is no previously generated template. In this event the procedure would go to step 218, to add a new template. A further exception could occur if it was recognized that human involvement was needed in the task construction process, as referred to above. The process would then go to step 216 to obtain expert skills.
Referring to
As an important feature, the embodiment of the invention disclosed by
It is to be emphasized that the sources set forth in list 310 are provided as examples only, and are not intended to limit the scope of the invention in any way.
It is anticipated that a system such as the embodiment of
Referring further to
Referring to
In the depicted example, the data processing system of
In the depicted example, local area network (LAN) adapter 412 connects to SB/ICH 404. Audio adapter 416, keyboard and mouse adapter 420, modem 422, read only memory (ROM) 424, disk 426, CD-ROM 430, universal serial bus (USB) ports and other communication ports 432, and PCI/PCIe devices 434 connect to SB/ICH 404 through bus 438 and bus 440. PCI/PCIe devices 434 may include, for example, Ethernet adapters, add-in cards, and PC cards for notebook computers. PCI uses a card bus controller, while PCIe does not. ROM 424 may be, for example, a flash basic input/output system (BIOS).
Disk 426 and CD-ROM 430 connect to SB/ICH 404 through bus 440. Disk 426 and CD-ROM 430 may use, for example, an integrated drive electronics (IDE) or serial advanced technology attachment (SATA) interface. Super I/O (SIO) device 436 may be connected to SB/ICH 404.
An operating system runs on processing unit 406 and coordinates and provides control of various components within the data processing system of
As a server, the data processing system of
Instructions for the operating system, the object-oriented programming system, and applications or programs are located on storage devices, such as disk 426, and may be loaded into main memory 408 for execution by processing unit 406. The processes for embodiments of the present invention are performed by processing unit 406 using computer usable program code, which may be located in a memory such as, for example, main memory 408, ROM 424, or in one or more peripheral devices, such as, for example, disk 426 and CD-ROM 430.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The embodiment was chosen and described in order to best explain the principles of the invention and the practical application, and to enable others of ordinary skill in the art to understand the invention for various embodiments with various modifications as are suited to the particular use contemplated.