This disclosure relates to the field of Internet technology, and in particular to a data processing method and apparatus, a device, a storage medium, and a program product.
At present, the annotation of target objects in images mainly includes manual annotation alone, machine annotation alone, and artificial intelligence aided annotation. The manual annotation alone means that there is no model assistance in annotation, and the annotation depends on the recognition of target objects by the annotator. The machine annotation alone means that there is no manual intervention in annotation, and a prediction result of an artificial intelligence model is used as the annotation result. The artificial intelligence aided annotation means that in annotation, the artificial intelligence model predicts an image and generates a prediction result, and the annotator annotates the target object in the image in cooperation with the prediction result.
In the related artificial intelligence aided annotation, the annotator is often only a user of the artificial intelligence model and does not participate in updating the artificial intelligence model, which results in a failure to update the model in time and ultimately affects the accuracy of aided annotation. In addition, the related artificial intelligence aided annotation methods lack a step of re-checking existing annotation results, which results in a failure to update the existing annotation results. Existing annotation results with low accuracy, if present, would continue to be used in subsequent training or use.
This disclosure provides a data processing method and apparatus, a device, a storage medium, and a program product, which help improve the recognition capability of an image recognition model and improve the accuracy of an annotation result.
In one aspect, an embodiment of this disclosure provides a data processing method performed in a computer device, including:
In another aspect, an embodiment of this disclosure provides a data processing apparatus, including a memory operable to store computer-readable instructions and a processor circuitry operable to read the computer-readable instructions. When executing the computer-readable instructions, the processor circuitry is configured to:
In another aspect, an embodiment of this disclosure provides a non-transitory machine-readable medium having instructions stored thereon. When being executed, the instructions are configured to cause a machine to:
In order to illustrate the technical solutions in the embodiments of this disclosure or in the related art more clearly, the following briefly introduces the accompanying drawings for describing the embodiments or the related art. Apparently, the accompanying drawings in the following description show merely some embodiments of this disclosure, and a person of ordinary skill in the art may still derive other drawings from the accompanying drawings without creative efforts.
In conjunction with the drawings in the embodiments of this disclosure, the technical solutions in the embodiments of this disclosure will be clearly and fully described below. Apparently, the embodiments described are only some, but not all embodiments of this disclosure. Based on the embodiments of this disclosure, all other embodiments obtained by a person of ordinary skill in the art without inventive effort shall fall within the protection scope of this disclosure.
In order to facilitate understanding, some terms are first briefly explained as follows.
Artificial intelligence (AI) is a theory, method, technology, and application system that utilizes a digital computer or digital computer-controlled machine to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results. In other words, AI is a comprehensive technology in computer science and attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. AI is to study the design principles and implementation methods of various intelligent machines, and to enable the machines to have the functions of perception, reasoning, and decision-making.
The AI technology is a comprehensive discipline, and relates to a wide range of fields including both hardware-level technologies and software-level technologies. The basic AI technologies generally include technologies such as sensors, dedicated AI chips, cloud computing, distributed storage, big data processing technology, operation/interaction systems, and electromechanical integration. AI software technologies mainly include computer vision, speech processing, natural language processing, machine learning/deep learning, automatic driving, and intelligent transportation.
Computer vision (CV) is a science that studies how to make machines “see”. More specifically, it refers to replacing human eyes with cameras and computers for machine vision, such as object recognition and measurement, and further performing graphics processing, so as to generate, by computer processing, images that are more suitable for observation of human eyes or for transmission to the instrument for detection. As a scientific discipline, CV studies related theories and technologies and attempts to establish an AI system that can obtain information from images or multidimensional data. CV technology generally includes image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, 3D object reconstruction, 3D technology, virtual reality, augmented reality, simultaneous localization and mapping, automatic driving, intelligent transportation, as well as common biometric feature recognition technologies such as face recognition and fingerprint recognition. In an embodiment of this disclosure, CV technology may be used for recognizing a target object (e.g., a human, dog, cat, bird, etc.) in an image and delineate and annotate the target object.
Machine learning (ML) is a multi-field interdisciplinary subject, involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. ML specializes in studying how a computer simulates or implements human learning behaviors to acquire new knowledge or skills and reorganize an existing knowledge structure, so as to keep improving its performance. ML is the core of AI and the fundamental way to impart intelligence to computers, and it is applied in various fields of AI. ML and deep learning generally include technologies such as an artificial neural network, a belief network, reinforcement learning, transfer learning, inductive learning, and learning from demonstrations. In an embodiment of this disclosure, both the initial image recognition model and the updated image recognition model are AI models based on machine learning technology, which can be used for image recognition.
Reference may be made to
The annotation terminal cluster may include annotation terminals corresponding to one or more annotation users. The traffic server 100 may be a device that acquires an initial candidate annotation result and an updated candidate annotation result (same as the candidate annotation results described below) provided by the annotation terminals. The first check terminal may be a check terminal which checks at least two candidate annotation results. The second check terminal may be a check terminal that checks target candidate annotation results.
There may be communication connections in the annotation terminal cluster, for example, a communication connection between the annotation terminal 100a and the annotation terminal 100b, and a communication connection between the annotation terminal 100a and the annotation terminal 100c. Meanwhile, there may be a communication connection between any annotation terminal in the annotation terminal cluster and the traffic server 100, for example, a communication connection between the annotation terminal 100a and the traffic server 100. There may be a communication connection between any annotation terminal in the annotation terminal cluster described above and the check terminal (including the first check terminal 200a and the second check terminal 200b) described above, for example, a communication connection between the annotation terminal 100a and the first check terminal 200a, a communication connection between the annotation terminal 100b and the first check terminal 200a, and a communication connection between the annotation terminal 100b and the second check terminal 200b.
There may be a communication connection between the first check terminal 200a and the second check terminal 200b. There may be a communication connection between any check terminal (including the first check terminal 200a and the second check terminal 200b) and the traffic server 100, for example, a communication connection between the first check terminal 200a and the traffic server 100.
The communication connection described above is not limited to a particular connection mode; it may be a direct or indirect connection via wired communication, a direct or indirect connection via wireless communication, or a connection via other modes, which is not limited in this disclosure.
It is to be understood that an application client may be installed on each of the annotation terminals in the annotation terminal cluster shown in
It is to be understood that in the detailed description of this disclosure, data related to user information (e.g., initial standard annotation results in this disclosure) and the like needs to be approved or agreed upon by the user when the embodiments of this disclosure are applied to a specific product or technology, and the collection, use, and processing of related data need to comply with relevant laws, regulations, and standards of relevant countries and regions.
In order to facilitate subsequent understanding and description, the embodiment of this disclosure may select one of the annotation terminals in the annotation terminal cluster shown in
Further, upon receiving the initial candidate annotation result transmitted by the annotation terminal 100a, the traffic server 100 may obtain initial standard annotation results based on the initial candidate annotation result. The original images include a first original image and a second original image, and the initial standard annotation results include a first initial standard annotation result of the first original image and a second initial standard annotation result of the second original image. The initial aided annotation results include a first initial aided annotation result of the first original image. Further, the traffic server 100 adjusts, based on the first initial standard annotation result and the first initial aided annotation result, model parameters in the initial image recognition model so as to generate an updated image recognition model, thereby updating the initial image recognition model. Further, the traffic server 100 predicts, based on the updated image recognition model, an updated aided annotation result of the second original image, and acquires an updated standard annotation result obtained by adjusting the second initial standard annotation result based on the updated aided annotation result, thereby updating the annotation result of the annotated second original image (i.e., the second initial standard annotation result). Subsequently, the traffic server 100 determines, in response to determining that the updated image recognition model satisfies a model convergence condition based on the updated aided annotation result and the updated standard annotation result, the updated image recognition model as a target image recognition model, the target image recognition model being used for generating a target aided annotation result of the target image. The functions of the first check terminal 200a and the second check terminal 200b are described in step S103 in an embodiment corresponding to
Optionally, if the initial image recognition model described above is stored locally in the annotation terminal 100a, the annotation terminal 100a may acquire the initial aided annotation results of the original images via the local initial image recognition model, and then generate the initial standard annotation results based on the initial aided annotation results. Likewise, if the updated image recognition model described above is stored locally in the annotation terminal 100a, the annotation terminal 100a may acquire the updated aided annotation result of the second original image via the local updated image recognition model, and then generate the updated standard annotation result based on the updated aided annotation result, with the remaining processes being the same as those described above, which are therefore not detailed herein.
It is to be understood that since training the initial image recognition model and the updated image recognition model involves a lot of off-line calculations, both the initial image recognition model and the updated image recognition model local to the annotation terminal 100a may be transmitted to the annotation terminal 100a after being trained by the traffic server 100.
The traffic server 100, the annotation terminal 100a, the annotation terminal 100b, . . . , and the annotation terminal 100c, the first check terminal 200a, and the second check terminal 200b described above can all be block chain nodes in a block chain network. The data described throughout the text (such as the initial image recognition model, the original data, and the initial standard annotation results) may be stored in a manner that a block chain node generates a block based on the data and adds the block to the block chain for storage.
Block chain is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms, and it is mainly used for sorting data in a time sequence and encrypting it into a ledger, such that the data cannot be tampered with or forged, and at the same time the data can be verified, stored, and updated. Block chain is essentially a decentralized database; each node in the database stores an identical block chain, and a block chain network may distinguish nodes into core nodes, data nodes, and light nodes, which together form the block chain nodes. The core nodes are responsible for the consensus of the whole block chain network, i.e., the core nodes are consensus nodes in the block chain network. The flow of writing transaction data in the block chain network into the ledger may be as follows: a data node or a light node in the block chain network acquires the transaction data and transmits the transaction data in the block chain network (i.e., the nodes transmit it in a relay manner) until a consensus node receives the transaction data; the consensus node then packages the transaction data into a block and performs consensus on the block, and after the consensus is completed, the transaction data is written into the ledger. Here, the original data and the initial standard annotation results are used for exemplifying the transaction data: after reaching consensus on the transaction data, the traffic server 100 (a block chain node) generates a block based on the transaction data and stores the block into the block chain network. With regard to reading the transaction data (i.e., the original data and the initial standard annotation results), a block chain node in the block chain network may acquire the block containing the transaction data, and further acquire the transaction data from the block.
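The block-generation and ledger-writing flow described above can be sketched minimally in code. Everything below is an illustrative assumption: the block fields, the SHA-256 hash chaining, and the function names are stand-ins, not the disclosure's actual chain format or consensus protocol.

```python
import hashlib
import json
import time

def make_block(transaction_data, prev_hash):
    """Package transaction data (e.g., original images' identifiers and their
    initial standard annotation results) into a block, as a consensus node
    would after consensus is completed."""
    block = {
        "timestamp": time.time(),
        "data": transaction_data,
        "prev_hash": prev_hash,
    }
    # Hash the block contents so later tampering is detectable.
    block["hash"] = hashlib.sha256(
        json.dumps(block, sort_keys=True).encode()).hexdigest()
    return block

def append_to_ledger(chain, transaction_data):
    """Append a new block whose prev_hash links it to the chain's tail."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    chain.append(make_block(transaction_data, prev_hash))
    return chain
```

Because each block stores the hash of its predecessor, altering any stored annotation result would break every subsequent link, which is what makes the ledger tamper-evident.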
It is to be understood that the method provided by the embodiment of this disclosure may be performed by a computer device, including but not limited to an annotation terminal or a traffic server. The traffic server above may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server that provides basic cloud computing services, such as cloud databases, cloud services, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN, big data, and artificial intelligence platforms. The annotation terminals include, but are not limited to mobile phones, computers, intelligent speech interaction devices, intelligent appliances, vehicle-mounted terminals, etc. The annotation terminal and the traffic server may be directly or indirectly connected by wired or wireless means, which is not limited in the embodiments of this disclosure.
Further, reference may be made to
Step S101: Predict, based on an initial image recognition model, initial aided annotation results of original images, and acquire initial standard annotation results determined by correcting the initial aided annotation results; the original images include a first original image and a second original image; the initial standard annotation results include a first initial standard annotation result of the first original image and a second initial standard annotation result of the second original image; and the initial aided annotation results include a first initial aided annotation result of the first original image.
In one embodiment, step S101 includes the following operations: acquiring the original images; the original images including a target object; inputting the original images into the initial image recognition model, and acquiring image features of the original images in the initial image recognition model; determining, based on the image features, an initial region recognition feature of the target object and an initial object recognition feature of the target object; generating, based on the initial region recognition feature, an initial aided annotation region for the target object, and generating, based on the initial object recognition feature, an initial aided object label for the initial aided annotation region; and determining the initial aided annotation region and the initial aided object label as the initial aided annotation results.
The initial image recognition model refers to an AI model used for recognizing the target object in the original images, and the embodiment of this disclosure does not limit the model type of the initial image recognition model. The initial image recognition model may be determined according to practical application scenes, including but not limited to convolutional neural networks (CNN), fully convolutional networks (FCN), and residual networks (ResNet).
The embodiment of this disclosure does not limit the quantity of the original images, provided there are at least two, nor the image type of the original images, which may be any image type. The embodiment of this disclosure does not limit the object type of the target object, which may be any object type, such as a human, a bicycle, a table, or a medical endoscope object, and which may be set according to practical application scenes. In addition, the embodiment of this disclosure does not limit the quantity of the target object. For example, when the target object is a human, there may be no target object in the original images, or there may be at least one target object in the original images. It is to be understood that the target object may include one or more types of objects, e.g., the target object may include a bicycle, or a bicycle and a human. In one embodiment, the original images are medical images and the target object is a medical detection target, i.e., the target in the medical images.
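The operations of step S101 (feature extraction, region generation, label generation) can be sketched as follows. This is a toy stand-in, not the disclosure's model: `ToyRecognitionModel` and all of its method names are hypothetical substitutes for a real CNN/FCN/ResNet, and the "region" it produces is simply the image's bounding rectangle rather than a learned polygon.

```python
from dataclasses import dataclass

@dataclass
class AidedAnnotationResult:
    region: list  # polygon vertices [(x, y), ...] delineating the target object
    label: str    # aided object label predicted for the region

class ToyRecognitionModel:
    """Hypothetical stand-in for the initial image recognition model."""

    def extract_features(self, image):
        # A real model would compute convolutional image features;
        # here the raw pixel grid is passed through unchanged.
        return image

    def predict_region(self, features):
        # A real model would segment or regress a region from the features;
        # here, a fixed rectangle covering the whole image.
        h, w = len(features), len(features[0])
        return [(0, 0), (w - 1, 0), (w - 1, h - 1), (0, h - 1)]

    def predict_label(self, features, region):
        # A real model would classify the object inside the region.
        return "human"

def predict_aided_results(model, images):
    """Predict an aided annotation result (region + label) per original image."""
    results = []
    for image in images:
        features = model.extract_features(image)
        region = model.predict_region(features)
        label = model.predict_label(features, region)
        results.append(AidedAnnotationResult(region, label))
    return results
```

The same flow applies unchanged when the updated image recognition model replaces the initial one in step S103; only the model's parameters differ.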
In order to facilitate understanding, reference is further made to
Referring back to
Further, the traffic server 30a transmits the initial aided annotation image 30e carrying the first initial aided annotation result 301e to the annotation terminal 30f, and the annotation object 301f may correct the first initial aided annotation result 301e, for example, by viewing the original image 301b and the initial aided annotation image 30e through the annotation application software installed on the annotation terminal 30f. The annotation object 301f may first confirm whether the original image 301b contains a human, and if so, view the initial aided annotation region in the first initial aided annotation result 301e. If the initial aided annotation region is approved, the annotation terminal 30f may determine the first initial aided annotation result 301e as the initial candidate annotation result (because there is only one target object, the initial aided object label defaults to the object label of the target object). If the initial aided annotation region is not approved by the annotation object 301f, the location and shape of the target object are annotated by a polygon. In annotation, the annotation object 301f is required to stay as close to the edge of the target object as possible, so that the target object is fully contained in the region; the annotated region may be referred to as a region of interest (ROI). Optionally, the annotation object 301f modifies the initial aided annotation region to obtain the initial candidate annotation result. As shown in
Further, the annotation terminal 30f returns the initial candidate annotation image 30g carrying the initial candidate annotation result 301g to the traffic server 30a. The embodiment of this disclosure does not limit the quantity of annotation objects for independent annotations, and there may be one or more annotation objects. In this step, the generation process of the second initial standard annotation result is exemplified with one annotation object (e.g., the annotation object 301f in
Referring back to
The image database 30b may be a database dedicated to storing images for the traffic server 30a, and the image database 30b described above may be regarded as an electronic file cabinet, i.e., a place for storing electronic files (in this disclosure, these may include original images, initial aided annotation results, initial standard annotation results, etc.). The traffic server 30a may perform operations such as addition, query, update, and deletion on the original images, the initial aided annotation results, and the initial standard annotation results in the files. A so-called "database" is a collection of data that is stored together in a manner that can be shared by multiple users, has as little redundancy as possible, and is independent of the applications.
Step S102: Adjust, based on the first initial standard annotation result and the first initial aided annotation result, model parameters in the initial image recognition model so as to generate an updated image recognition model.
The embodiment of this disclosure may be applied to various scenes such as cloud technology, AI, intelligent traffic, and aided driving. In recent years, with the breakthrough of new-generation AI technologies represented by deep learning, revolutionary progress has been made in the field of automatic recognition of medical images. AI oriented to medical images may aid in real-time detection and classification of lesions, which is expected to help clinicians improve the quality of examination and reduce the missed diagnosis of lesions.
An excellent image recognition model relies on a large amount of representative, high-quality annotated data, and the quality of data annotation determines the stability and accuracy of an algorithm model. However, different modal data and different disease lesions show apparent differences and complexity among individuals, so it is necessary to continuously update the existing image recognition model, and further to update the annotated data. Based on this, an embodiment of this disclosure provides an AI aided annotation method based on bidirectional quality control, which is intended to improve the accuracy and efficiency of annotation.
Reference is further made to
The traffic server determines an initial region error between the initial aided annotation region 401a and the initial standard annotation region 402a, and determines an initial object error between the initial aided object label 401b and the initial standard object label 402b. Further, weighted summation is performed on the initial region error and the initial object error to obtain a first annotation result error. The traffic server adjusts the model parameters in the initial image recognition model 30c based on the first annotation result error so as to generate an updated image recognition model 40d.
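The weighted summation of the initial region error and initial object error might look like the sketch below. The choices here are illustrative assumptions, since the disclosure does not fix the error functions or weights: 1 - IoU as the region error, a 0/1 label mismatch as the object error, equal weights of 0.5, and regions simplified to axis-aligned boxes `(x1, y1, x2, y2)`.

```python
def region_error(aided_region, standard_region):
    # Illustrative region error: 1 - IoU of the two regions,
    # with regions simplified to axis-aligned boxes (x1, y1, x2, y2).
    ax1, ay1, ax2, ay2 = aided_region
    bx1, by1, bx2, by2 = standard_region
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return (1.0 - inter / union) if union else 0.0

def object_error(aided_label, standard_label):
    # Illustrative object error: 0/1 label mismatch.
    return 0.0 if aided_label == standard_label else 1.0

def first_annotation_result_error(aided, standard,
                                  region_weight=0.5, object_weight=0.5):
    # Weighted summation of the initial region error and the initial
    # object error, yielding the first annotation result error that
    # drives the adjustment of the model parameters.
    return (region_weight * region_error(aided["region"], standard["region"])
            + object_weight * object_error(aided["label"], standard["label"]))
```

In a real training loop this scalar would serve as the loss whose gradient adjusts the model parameters of the initial image recognition model.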
The embodiment of this disclosure does not limit the update condition of the initial image recognition model 30c, which may be that the traffic server responds to a model update instruction for the initial image recognition model 30c. For this scene, reference may be made to the description of step S202 in an embodiment corresponding to
In summary, the embodiment of this disclosure may determine the first initial standard annotation result and the first initial aided annotation result based on requirements of the annotation object, and thus may perform a personalized update on the initial image recognition model. In other words, the embodiment of this disclosure may specifically train the model for personalized needs (e.g., for specific recognition of medical images by physicians) to improve the accuracy of object recognition in personalized scenes. For example, the target object includes a plurality of object types, which may include a first target object (such as a malignant tumor) and a second target object (such as a benign tumor), and the initial image recognition model's prediction accuracy for the first target object is lower than its prediction accuracy for the second target object. Therefore, the initial standard annotation result including the first target object may be used as a first initial standard annotation result, and the initial aided annotation result including the first target object may be used as a first initial aided annotation result. In this case, model parameters in the initial image recognition model are adjusted based on the first initial standard annotation result and the first initial aided annotation result described above, so as to generate an updated image recognition model for the first target object, thereby improving the prediction accuracy for the first target object without changing the prediction accuracy for the second target object. Thus, the embodiment of this disclosure may improve the accuracy of hospital testing in medical scenes.
Step S103: Predict, based on the updated image recognition model, an updated aided annotation result of the second original image, and acquire an updated standard annotation result of the second original image; the updated standard annotation result is obtained by adjusting the second initial standard annotation result based on the updated aided annotation result.
Specifically, the updated aided annotation result is transmitted to at least two annotation terminals. In this way, the at least two annotation terminals separately adjust the second initial standard annotation result based on the updated aided annotation result so as to obtain candidate annotation results of the second original image. The candidate annotation results returned by the at least two annotation terminals are acquired. At least two candidate annotation results separately include candidate annotation regions for annotating the target object in the second original image. Region quantities corresponding to the candidate annotation regions included separately in the at least two candidate annotation results are determined. Initial check annotation results for the at least two candidate annotation results are determined based on at least two region quantities. The updated standard annotation result is acquired based on the initial check annotation results.
The specific process of determining initial check annotation results for the at least two candidate annotation results based on at least two region quantities may include: comparing the at least two region quantities; the at least two region quantities including a region quantity Ba; a being a positive integer, and a being less than or equal to a result quantity of the at least two candidate annotation results; determining, when there is a region quantity different from the region quantity Ba in the remaining region quantities, the at least two candidate annotation results separately as the initial check annotation results; the remaining region quantities including region quantities other than the region quantity Ba among the at least two region quantities; acquiring, when the remaining region quantities are all the same as the region quantity Ba, candidate annotation regions separately included in every two candidate annotation results in the at least two candidate annotation results; and determining coincidence degrees between the candidate annotation regions separately included in every two candidate annotation results, and determining the initial check annotation results based on the coincidence degrees. In one embodiment, the coincidence degree of two candidate annotation regions is, for example, a coincidence degree between two pieces of location information of the two candidate annotation regions, and the coincidence degree between the two pieces of location information is, for example, a ratio of an intersection to a union of the two pieces of location information. Here, the location information of the candidate annotation region may characterize a location of the candidate annotation region in an image in which it is located.
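The region-quantity comparison and pairwise coincidence-degree check above can be sketched as follows. Regions are simplified to axis-aligned boxes `(x1, y1, x2, y2)` even though the disclosure allows polygons; the index-based pairing of regions between two candidate results and the `0.8` coincidence degree threshold are illustrative assumptions.

```python
from itertools import combinations

def coincidence_degree(region_a, region_b):
    # Ratio of intersection to union of the two regions' location
    # information (boxes here for simplicity; the disclosure allows
    # polygonal regions).
    ax1, ay1, ax2, ay2 = region_a
    bx1, by1, bx2, by2 = region_b
    iw = max(0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union else 0.0

def needs_full_check(candidate_results, threshold=0.8):
    """candidate_results: one list of candidate annotation regions per
    annotation terminal. Returns True when all candidate annotation results
    must become the initial check annotation results."""
    quantities = [len(regions) for regions in candidate_results]
    if len(set(quantities)) > 1:
        # Region quantities disagree across terminals.
        return True
    for regions_a, regions_b in combinations(candidate_results, 2):
        # Pair regions by index -- an assumption; a real system would first
        # match corresponding regions between the two results.
        for reg_a, reg_b in zip(regions_a, regions_b):
            if coincidence_degree(reg_a, reg_b) < threshold:
                return True
    return False
```

When `needs_full_check` returns False, the flow proceeds to the label-group comparison described next.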
The at least two candidate annotation results further separately include candidate object labels for annotating the included candidate annotation regions. The specific process of determining the initial check annotation results based on the coincidence degrees may include: determining, when at least one of the coincidence degrees is less than a coincidence degree threshold, the at least two candidate annotation results separately as the initial check annotation results; dividing, when each of the coincidence degrees is equal to or greater than the coincidence degree threshold, same candidate object labels in the at least two candidate annotation results into a same object label group so as to obtain n object label groups; n being a positive integer; and determining the initial check annotation results based on the n object label groups.
The specific process of determining the initial check annotation results based on the n object label groups may include: counting object label quantities of the candidate object labels separately included in the n object label groups, and acquiring a maximum object label quantity from the object label quantities separately corresponding to the n object label groups; determining quantity ratios between the maximum object label quantity and the object label quantities corresponding to the at least two candidate annotation results; comparing the quantity ratios with a quantity ratio threshold, and determining, when the quantity ratios are less than the quantity ratio threshold, the at least two candidate annotation results separately as the initial check annotation results; determining, when the quantity ratios are equal to or greater than the quantity ratio threshold, an object label group corresponding to the maximum object label quantity as a target object label group; and acquiring target candidate annotation results from candidate annotation results associated with the target object label group, and determining the target candidate annotation results as the initial check annotation results.
The specific process of acquiring the updated standard annotation result based on the initial check annotation results may include: transmitting, when the initial check annotation results are the at least two candidate annotation results, the initial check annotation results to a first check terminal, such that the first check terminal determines check annotation results to be transmitted to a second check terminal based on the at least two candidate annotation results; the second check terminal being configured to return the updated standard annotation result based on the check annotation results; and transmitting, when the initial check annotation results are the target candidate annotation results, the initial check annotation results to the second check terminal, such that the second check terminal returns the updated standard annotation result based on the target candidate annotation results.
The description of predicting, based on the updated image recognition model, an updated aided annotation result of the second original image may refer to the description of predicting, based on an initial image recognition model, initial aided annotation results of original images in step S101 above. The data processing processes of the two are the same, only differing in that the updated image recognition model is a model obtained after the update of the initial image recognition model, and therefore the description is not repeated here.
The process of the traffic server acquiring the updated standard annotation result is substantially the same as the process of acquiring the initial standard annotation results, and therefore the process of an annotation terminal adjusting the second initial standard annotation result based on the updated aided annotation result to obtain the updated standard annotation result is not detailed herein, and reference may be made to the description in step S101 above.
Optionally, in order to ensure data annotation quality and reduce the differences among individual annotation objects, the annotation process may include independent annotations of a plurality of annotation objects. Therefore, the traffic server may transmit the updated aided annotation result to annotation terminals corresponding to at least two annotation objects, such that the annotation terminals corresponding to the at least two annotation objects separately adjust the second initial standard annotation result based on the updated aided annotation result so as to obtain candidate annotation results of the second original image.
Reference may be made to
As shown in
Referring back to
It is to be understood that there is no difference in the three updated candidate annotation images in
Based on the coordinates described above, the traffic server 502d acquires the location information L501c of the candidate annotation result 501c and the location information L501b of the candidate annotation result 501b in the updated candidate annotation image 501a, and acquires the location information L502c of the candidate annotation result 502c and the location information L502b of the candidate annotation result 502b in the updated candidate annotation image 502a. The traffic server 502d determines a location information intersection L501c∩502c of the location information L501c of the candidate annotation result 501c with the location information L502c of the candidate annotation result 502c, and determines a location information union L501c∪502c of the location information L501c with the location information L502c. The traffic server 502d determines a location information intersection L501b∩502c of the location information L501b of the candidate annotation result 501b with the location information L502c of the candidate annotation result 502c, and determines a location information union L501b∪502c of the location information L501b with the location information L502c. The traffic server 502d determines a location information intersection L501c∩502b of the location information L501c of the candidate annotation result 501c with the location information L502b of the candidate annotation result 502b, and determines a location information union L501c∪502b of the location information L501c with the location information L502b. The traffic server 502d determines a location information intersection L501b∩502b of the location information L501b of the candidate annotation result 501b with the location information L502b of the candidate annotation result 502b, and determines a location information union L501b∪502b of the location information L501b with the location information L502b.
For example, a first coincidence degree of the candidate annotation result 501c in the updated candidate annotation image 501a (same as the coincidence degree of the candidate annotation region included in the candidate annotation result 501c) is determined below, and the determination of a first coincidence degree of the candidate annotation result 501b in the updated candidate annotation image 501a may refer to the following process.
The traffic server 502d may determine a candidate coincidence degree C(501c,502c) between the candidate annotation result 501c and the candidate annotation result 502c according to Formula (1).
In the formula, ROI501c may represent a candidate annotation region of the candidate annotation result 501c and may be determined by the location information L501c, ROI502c may represent a candidate annotation region of the candidate annotation result 502c and may be determined by the location information L502c, ROI501c ∩ROI502c may represent an intersection area of the candidate annotation region of the candidate annotation result 501c with the candidate annotation region of the candidate annotation result 502c and may be determined by the location information intersection L501c∩502c, and ROI501c ∪ROI502c may represent a union area of the candidate annotation region of the candidate annotation result 501c with the candidate annotation region of the candidate annotation result 502c and may be determined by the location information union L501c∪502c.
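Formula (1) is in effect an intersection-over-union ratio between the two candidate annotation regions. The following is a minimal sketch, assuming axis-aligned rectangular regions given as (x1, y1, x2, y2) coordinates; the coordinate format and function name are illustrative assumptions rather than part of the disclosed method.

```python
def coincidence_degree(box_a, box_b):
    """IoU-style coincidence degree of two axis-aligned boxes.

    Each box is (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
    Returns intersection area divided by union area, in [0, 1].
    """
    # Corners of the intersection rectangle (if any).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0
```

Non-overlapping regions yield a coincidence degree of 0, identical regions yield 1, which matches the comparison against the coincidence degree threshold described above.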
The traffic server 502d may determine a candidate coincidence degree C(501c,502b) between the candidate annotation result 501c and the candidate annotation result 502b according to Formula (2).
In the formula, ROI502b may represent a candidate annotation region of the candidate annotation result 502b and may be determined by the location information L502b, ROI501c ∩ROI502b may represent an intersection area of the candidate annotation region of the candidate annotation result 501c with the candidate annotation region of the candidate annotation result 502b and may be determined by the location information intersection L501c∩502b, and ROI501c ∪ROI502b may represent a union area of the candidate annotation region of the candidate annotation result 501c and the candidate annotation region of the candidate annotation result 502b and may be determined by the location information union L501c∪502b.
The traffic server 502d compares the candidate coincidence degree C(501c,502c) with the candidate coincidence degree C(501c,502b). For the updated candidate annotation image 501a and the updated candidate annotation image 502a, it is obvious that there is no intersection area between the candidate annotation result 501c and the candidate annotation result 502b, and therefore the first coincidence degree of the candidate annotation result 501c is the candidate coincidence degree C(501c,502c).
For example, a second coincidence degree of the candidate annotation result 502b in the updated candidate annotation image 502a is determined below, and the determination of a second coincidence degree of the candidate annotation result 502c in the updated candidate annotation image 502a may refer to the following process.
The traffic server 502d may determine a candidate coincidence degree C(501b,502b) between the candidate annotation result 502b and the candidate annotation result 501b according to Formula (3).
In the formula, ROI501b may represent a candidate annotation region of the candidate annotation result 501b and may be determined by the location information L501b, ROI501b ∩ROI502b may represent an intersection area of the candidate annotation region of the candidate annotation result 501b with the candidate annotation region of the candidate annotation result 502b and may be determined by the location information intersection L501b∩502b, and ROI501b ∪ROI502b may represent a union area of the candidate annotation region of the candidate annotation result 501b and the candidate annotation region of the candidate annotation result 502b and may be determined by the location information union L501b∪502b.
The traffic server 502d compares the candidate coincidence degree C(501b,502b) with the candidate coincidence degree C(501c,502b). For the updated candidate annotation image 501a and the updated candidate annotation image 502a, it is obvious that there is no intersection area between the candidate annotation result 501c and the candidate annotation result 502b, and therefore the second coincidence degree of the candidate annotation result 502b is the candidate coincidence degree C(501b,502b).
The traffic server 502d determines the first coincidence degree of each candidate annotation region (including the candidate annotation result 501c and the candidate annotation result 501b) in the updated candidate annotation image 501a and the second coincidence degree of each candidate annotation region (including the candidate annotation result 502c and the candidate annotation result 502b) in the updated candidate annotation image 502a as coincidence degrees between the candidate annotation regions separately included in the updated candidate annotation image 501a and the updated candidate annotation image 502a.
Referring back to
The traffic server 502d compares the coincidence degrees described above with a coincidence degree threshold, and if at least one of the coincidence degrees is less than the coincidence degree threshold, the at least two candidate annotation results (i.e., the candidate annotation results separately included in the three updated candidate annotation images) are separately determined as the initial check annotation results. If each of the coincidence degrees described above is greater than or equal to the coincidence degree threshold, candidate object labels (including candidate object labels separately included in the candidate annotation result 501c and the candidate annotation result 501b) are acquired from the candidate annotation results included in the updated candidate annotation image 501a, candidate object labels (including candidate object labels separately included in the candidate annotation result 502c and the candidate annotation result 502b) are acquired from the candidate annotation results included in the updated candidate annotation image 502a, and candidate object labels (including candidate object labels separately included in the candidate annotation result 503c and the candidate annotation result 503b) are acquired from the candidate annotation results included in the updated candidate annotation image 503a. The traffic server 502d groups the same candidate object labels in the candidate object labels separately included in the three updated candidate annotation images into a same object label group so as to obtain n object label groups. The object label quantities of the candidate object labels separately included in the n object label groups are counted. A maximum object label quantity is acquired from the object label quantities separately corresponding to the n object label groups. 
Quantity ratios between the maximum object label quantity and the object label quantities corresponding to the at least two candidate annotation results are determined. The quantity ratios are compared with a quantity ratio threshold, and when the quantity ratios are less than the quantity ratio threshold, the candidate annotation results separately corresponding to the three updated candidate annotation images are all determined as the initial check annotation results. When the quantity ratios are equal to or greater than the quantity ratio threshold, an object label group corresponding to the maximum object label quantity is determined as a target object label group. Target candidate annotation results are acquired from candidate annotation results associated with the target object label group, and the target candidate annotation results are determined as the initial check annotation results.
After determining the initial check annotation results, the traffic server needs to transmit the initial check annotation results to a check terminal (including a first check terminal and a second check terminal), such that the check terminal confirms the results and returns an updated standard annotation result. If the initial check annotation results are at least two candidate annotation results, the traffic server transmits the initial check annotation results to the first check terminal (same as the first check terminal 200a described above in
After the first check terminal acquires at least two candidate annotation results, an arbitration object corresponding thereto may view the second original image and the at least two candidate annotation results. If the arbitration object confirms that none of the at least two candidate annotation results is desirable, region annotation and object annotation may be performed on the second original image. The process of the arbitration object annotating the second original image is the same as the process of the annotation object annotating the first original image, and therefore, reference may be made to the annotation described in step S101 above. Subsequently, the arbitration object may transmit the re-annotated check annotation result thereof as an arbitration result to the second check terminal (same as the second check terminal 200b described above in
If the arbitration object approves one of the at least two candidate annotation results, the approved candidate annotation result may be directly transmitted to the second check terminal as the arbitration result, such that the check object checks the arbitration result.
When the initial check annotation results are the target candidate annotation results, the initial check annotation results are transmitted to the second check terminal, and the second check terminal described above has a check function throughout image processing. After the second check terminal acquires the target candidate annotation results, the check object may check the image via the second check terminal.
The check object may store the target candidate annotation results or the arbitration result transmitted by the first check terminal, if approved, in an image database (same as the image database 30b in
In summary, in this step, quality control may be performed on the second original image with an existing annotation result through the updated initial image recognition model (i.e., the updated image recognition model), such that the existing annotation result may be dynamically updated, thereby improving the accuracy of target recognition.
Step S104: Determine, in response to determining that the updated image recognition model satisfies a model convergence condition based on the updated aided annotation result and the updated standard annotation result, the updated image recognition model as a target image recognition model; the target image recognition model is used for generating an annotation result of a target image.
Specifically, the second original image includes a target object. The updated aided annotation result includes an updated aided annotation region for the target object and an updated aided object label for the updated aided annotation region. The updated standard annotation result includes an updated standard annotation region for the target object and an updated standard object label for the updated standard annotation region. An updated region loss value between the updated aided annotation region and the updated standard annotation region is determined. An updated object loss value between the updated aided object label and the updated standard object label is determined. Weighted summation is performed on the updated region loss value and the updated object loss value to obtain an updated loss value of the updated image recognition model. When the updated loss value is greater than or equal to an updated loss value threshold, it is determined that the updated image recognition model does not satisfy the model convergence condition, and model parameters in the updated image recognition model continue to be adjusted. When the updated loss value is less than the updated loss value threshold, it is determined that the updated image recognition model satisfies the model convergence condition, and the updated image recognition model is determined as the target image recognition model.
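The weighted summation and the convergence check described above may be sketched as follows. The equal weights and the function names are assumptions for illustration; the disclosure does not fix specific weight values.

```python
def updated_loss(region_loss, object_loss, w_region=0.5, w_object=0.5):
    """Weighted sum of the updated region loss value and the
    updated object loss value (weights are illustrative)."""
    return w_region * region_loss + w_object * object_loss

def satisfies_convergence(loss, loss_threshold):
    """The updated image recognition model satisfies the model
    convergence condition when its loss falls below the threshold."""
    return loss < loss_threshold
```

When the convergence condition is not satisfied, the model parameters in the updated image recognition model continue to be adjusted, as described above.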
The original images further include a third original image. The initial standard annotation results further include a third initial standard annotation result of the third original image. The initial aided annotation results further include a third initial aided annotation result of the third original image. The specific process of continuing to adjust model parameters in the updated image recognition model may include: determining an adjusted loss value based on the third initial standard annotation result and the third initial aided annotation result; performing weighted summation on the adjusted loss value and the updated loss value to obtain a target loss value; and adjusting the model parameters in the updated image recognition model based on the target loss value.
The embodiment of this disclosure does not limit the quantities separately corresponding to the first original image, the second original image, and the third original image, which may be any number and may be set according to practical application scenarios. It is to be understood that the first original image and the second original image are different from each other, and the second original image and the third original image are different from each other. Optionally, if the updated loss value is less than the updated loss value threshold, and the annotation object transmits, via the annotation terminal, an instruction to continue updating the model, the traffic server may continue the update processing on the updated image recognition model, following the same process as the one used when the updated loss value is equal to or greater than the updated loss value threshold, which is therefore not detailed herein.
The traffic server may determine the third original image based on the updated loss value, and the specific process for determination may be as follows. The target object may include at least two target objects, and the at least two target objects may include a first target object. It is to be understood that the updated loss value may be obtained by averaging a first updated loss value for the first target object and remaining updated loss values for the remaining target objects. The remaining target objects include target objects other than the first target object among the at least two target objects. Therefore, the traffic server may determine a first loss value ratio of the first updated loss value to the updated loss value, and acquire, based on the first loss value ratio and the training sample quantity (equal to the quantity of the third original images), original images including the first target object and original images including the remaining target objects from the original images, and determine the two types of original images described above as the third original images. For example, if the training sample quantity is equal to 200 and the first loss value ratio is 0.8, the traffic server may randomly extract 160 images including the first target object from the original images, randomly extract the remaining 40 images including the remaining target objects from the original images, and determine the extracted 160 images and the remaining 40 images as the third original images.
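The proportional extraction in the example above may be sketched as follows. The pool representation, function name, and rounding choice are assumptions made for illustration; only the ratio-based split follows the text above.

```python
import random

def select_third_images(first_obj_images, other_obj_images,
                        first_loss_ratio, sample_quantity, seed=None):
    """Split the training sample quantity by the first loss value ratio.

    first_obj_images / other_obj_images: pools of original images
    including the first target object / the remaining target objects.
    Returns the third original images used for continued adjustment.
    """
    rng = random.Random(seed)
    # Portion of samples devoted to the first target object.
    n_first = round(sample_quantity * first_loss_ratio)
    n_rest = sample_quantity - n_first
    return (rng.sample(first_obj_images, n_first)
            + rng.sample(other_obj_images, n_rest))
```

With a training sample quantity of 200 and a first loss value ratio of 0.8, the sketch extracts 160 images including the first target object and 40 images including the remaining target objects, matching the example above.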
In an embodiment of this disclosure, a computer device may use a first initial standard annotation result and a first initial aided annotation result as a training sample set to update an initial image recognition model, i.e., adjusting model parameters to obtain an updated image recognition model. It is to be understood that the process can not only realize the model update, but also determine the orientation of the model update based on the training sample set. Further, an updated aided annotation result of the second original image is predicted based on the updated image recognition model, and an updated standard annotation result obtained by adjusting the second initial standard annotation result based on the updated aided annotation result is acquired, thereby updating the second initial standard annotation result. Further, when the updated image recognition model is determined as the target image recognition model, a target aided annotation result of the target image is generated by using the target image recognition model. It may be seen from the above that the embodiment of this disclosure can not only update the initial image recognition model based on the training sample set, so as to improve the recognition capability of the updated image recognition model, but also update the second initial standard annotation result through the updated image recognition model, so as to improve the accuracy of the updated standard annotation result. Therefore, this disclosure enables bidirectional update of the image recognition model and the annotation result.
Reference may be made to
Step S201: Predict, based on an initial image recognition model, initial aided annotation results of original images, and acquire initial standard annotation results determined by correcting the initial aided annotation results; the original images include a first original image and a second original image; the initial standard annotation results include a first initial standard annotation result of the first original image and a second initial standard annotation result of the second original image; and the initial aided annotation results include a first initial aided annotation result of the first original image.
The specific implementation of step S201 may refer to step S101 in the embodiment corresponding to
Step S202: Determine, in response to a model update instruction, the first original image as a sample image, the first initial standard annotation result as a sample label of the sample image, and the first initial aided annotation result as a sample prediction result of the sample image.
Step S203: Determine, based on the sample label and the sample prediction result, an overall loss value of the initial image recognition model.
Step S204: Adjust, based on the overall loss value, model parameters in the initial image recognition model, and determine, when the adjusted initial image recognition model satisfies a model convergence condition, the adjusted initial image recognition model as an updated image recognition model.
In conjunction with the description of steps S202 to S204, the traffic server currently uses the initial image recognition model to predict original images in the image database, and generates initial aided annotation results corresponding to the original images, and acquires initial standard annotation results determined based on the initial aided annotation results. In the embodiment of this disclosure, the determination of an average annotation result error between the initial standard annotation results and the initial aided annotation results is not detailed, and reference may be made to the description of steps S302 to S304 in the embodiment corresponding to
In this case, an initial loss value generated based on the average annotation result error between the initial standard annotation results and the initial aided annotation results is less than an initial loss value threshold, and when a model update instruction is acquired, the traffic server responds to the model update instruction. Optionally, the model update instruction carries training sample information, and the training sample information may include at least two object labels and training sample quantities separately corresponding to the at least two object labels. For example, the at least two object labels include a first object label and a second object label, and the model update instruction carries a first training sample quantity for the first object label and a second training sample quantity for the second object label. Then the traffic server may acquire, from the initial standard annotation results, an initial standard annotation result of which the annotation result quantity is equal to the first training sample quantity and which includes the first object label, and determine the acquired initial standard annotation result as a first initial standard annotation result. The traffic server acquires, from the initial aided annotation results, an initial aided annotation result corresponding to the first initial standard annotation result as a first initial aided annotation result. 
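The selection of training samples by object label and training sample quantity described above may be sketched as follows. The dict-based result format and the function name are illustrative assumptions, not part of the disclosed method.

```python
def select_training_samples(standard_results, aided_results,
                            object_label, sample_quantity):
    """Pick sample_quantity initial standard annotation results that
    include object_label, with the aided results at the same indices.

    standard_results / aided_results: parallel lists; each standard
    result is assumed to be a dict carrying a 'labels' list.
    Returns (first initial standard results, first initial aided results).
    """
    picked_std, picked_aid = [], []
    for std, aid in zip(standard_results, aided_results):
        if object_label in std['labels']:
            picked_std.append(std)
            picked_aid.append(aid)
            if len(picked_std) == sample_quantity:
                break
    return picked_std, picked_aid
```

The same selection would be repeated per object label carried in the training sample information, e.g. once for the first object label and once for the second object label.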
Further, the traffic server determines the first initial standard annotation result as a sample label of the sample image, determines the first initial aided annotation result as a sample prediction result of the sample image, determines an error between the sample label and the sample prediction result, determines the error as an overall loss value of the initial image recognition model, adjusts model parameters in the initial image recognition model by using the overall loss value, and determines, when the adjusted initial image recognition model satisfies a model convergence condition, the adjusted initial image recognition model as the updated image recognition model.
It may be seen from the above that the embodiment of this disclosure can not only update the initial image recognition model, but also allow the update orientation to be determined by the traffic object, thereby improving both the update efficiency and the prediction accuracy of the model.
Step S205: Predict, based on the updated image recognition model, an updated aided annotation result of the second original image, and acquire an updated standard annotation result; the updated standard annotation result is obtained by adjusting the second initial standard annotation result based on the updated aided annotation result.
Step S206: Determine, in response to determining that the updated image recognition model satisfies a model convergence condition based on the updated aided annotation result and the updated standard annotation result, the updated image recognition model as a target image recognition model; the target image recognition model is used for generating a target aided annotation result of a target image.
The specific implementation of steps S205-S206 may refer to steps S103-S104 in the embodiment corresponding to
In an embodiment of this disclosure, a computer device may use a first initial standard annotation result and a first initial aided annotation result as a training sample set to update an initial image recognition model, i.e., adjusting model parameters to obtain an updated image recognition model. It is to be understood that the process can not only realize the model update, but also determine the orientation of the model update based on the training sample set. Further, an updated aided annotation result of the second original image is predicted based on the updated image recognition model, and an updated standard annotation result obtained by adjusting the second initial standard annotation result based on the updated aided annotation result is acquired, thereby updating the second initial standard annotation result. Further, when the updated image recognition model is determined as the target image recognition model, a target aided annotation result of the target image is generated by using the target image recognition model. It may be seen from the above that the embodiment of this disclosure can not only update the initial image recognition model based on the training sample set, so as to improve the recognition capability of the updated image recognition model, but also update the second initial standard annotation result through the updated image recognition model, so as to improve the accuracy of the updated standard annotation result. Therefore, this disclosure enables bidirectional update of the image recognition model and the annotation result.
Reference may be made to
Step S301: Predict, based on an initial image recognition model, initial aided annotation results of original images, and acquire initial standard annotation results determined based on the initial aided annotation results; the original images include a first original image and a second original image; the initial standard annotation results include a first initial standard annotation result of the first original image and a second initial standard annotation result of the second original image; and the initial aided annotation results include a first initial aided annotation result of the first original image.
The specific implementation of step S301 may refer to step S101 in the embodiment corresponding to
Step S302: Determine a first annotation result error between the first initial aided annotation result and the first initial standard annotation result.
Step S303: Determine a second annotation result error between a second initial aided annotation result and the second initial standard annotation result.
Step S304: Determine an average annotation result error between the first annotation result error and the second annotation result error.
Step S305: Determine an initial loss value of the initial image recognition model based on the average annotation result error.
Step S306: When the initial loss value is greater than or equal to an initial loss value threshold, adjust, based on the first initial standard annotation result and the first initial aided annotation result, model parameters in the initial image recognition model so as to generate an updated image recognition model.
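Steps S302 to S306 may be summarized in a short sketch. The function names are assumptions made for illustration; the averaging and the threshold comparison follow the steps above.

```python
def initial_loss(first_error, second_error):
    """Initial loss value from the average annotation result error
    between the first and second annotation result errors."""
    return (first_error + second_error) / 2

def needs_update(loss, initial_loss_threshold):
    """Model parameters are adjusted when the initial loss value is
    greater than or equal to the initial loss value threshold."""
    return loss >= initial_loss_threshold
```

When `needs_update` returns true, the model parameters in the initial image recognition model are adjusted based on the first initial standard annotation result and the first initial aided annotation result, generating the updated image recognition model.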
Step S307: Predict, based on the updated image recognition model, an updated aided annotation result of the second original image, and acquire an updated standard annotation result; the updated standard annotation result is obtained by adjusting the second initial standard annotation result based on the updated aided annotation result.
Step S308: Determine, in response to determining that the updated image recognition model satisfies a model convergence condition based on the updated aided annotation result and the updated standard annotation result, the updated image recognition model as a target image recognition model; the target image recognition model is used for generating a target aided annotation result of a target image.
The specific implementation of steps S307-S308 may refer to steps S103-S104 in the embodiment corresponding to
In conjunction with
In an embodiment of this disclosure, a computer device may use a first initial standard annotation result and a first initial aided annotation result as a training sample set to update an initial image recognition model, i.e., adjusting model parameters to obtain an updated image recognition model. It is to be understood that the process can not only realize the model update, but also determine the orientation of the model update based on the training sample set. Further, an updated aided annotation result of the second original image is predicted based on the updated image recognition model, and an updated standard annotation result obtained by adjusting the second initial standard annotation result based on the updated aided annotation result is acquired, thereby updating the second initial standard annotation result. Further, when the updated image recognition model is determined as the target image recognition model, a target aided annotation result of the target image is generated by using the target image recognition model. It may be seen from the above that the embodiment of this disclosure can not only update the initial image recognition model based on the training sample set, so as to improve the recognition capability of the updated image recognition model, but also update the second initial standard annotation result through the updated image recognition model, so as to improve the accuracy of the updated standard annotation result. Therefore, this disclosure enables bidirectional update of the image recognition model and the annotation result.
Further, reference may be made to
The term “module” (and other similar terms such as unit, submodule, etc.) refers to computing software, firmware, hardware, and/or various combinations thereof. At a minimum, however, modules are not to be interpreted as software that is not implemented on hardware, firmware, or recorded on a non-transitory processor-readable recordable storage medium. Indeed, “module” is to be interpreted to include at least some physical, non-transitory hardware such as a part of a processor, circuitry, or computer. Two different modules can share the same physical hardware (e.g., two different modules can use the same processor and network interface). The modules described herein can be combined, integrated, separated, and/or duplicated to support various applications. Also, a function described herein as being performed at a particular module can be performed at one or more other modules and/or by one or more other devices instead of or in addition to the function performed at the particular module. Further, the modules can be implemented across multiple devices and/or other components local or remote to one another. Additionally, the modules can be moved from one device and added to another device, and/or can be included in both devices. The modules can be implemented in software stored in memory or a non-transitory computer-readable medium. The software stored in the memory or medium can run on a processor or circuitry (e.g., ASIC, PLA, DSP, FPGA, or any other integrated circuit) capable of executing computer instructions or computer code. The modules can also be implemented in hardware using processors or circuitry on the same or different integrated circuits.
The first acquisition module 11 is configured to predict, based on an initial image recognition model, initial aided annotation results of original images, and acquire initial standard annotation results determined by correcting the initial aided annotation results; the original images include a first original image and a second original image; the initial standard annotation results include a first initial standard annotation result of the first original image and a second initial standard annotation result of the second original image; and the initial aided annotation results include a first initial aided annotation result of the first original image.
The model update module 12 is configured to adjust, based on the first initial standard annotation result and the first initial aided annotation result, model parameters in the initial image recognition model so as to generate an updated image recognition model.
The second acquisition module 13 is configured to predict, based on the updated image recognition model, an updated aided annotation result of the second original image, and acquire an updated standard annotation result of the second original image; the updated standard annotation result is obtained by adjusting the second initial standard annotation result based on the updated aided annotation result.
The first determination module 14 is configured to determine, in response to determining that the updated image recognition model satisfies a model convergence condition based on the updated aided annotation result and the updated standard annotation result, the updated image recognition model as a target image recognition model; the target image recognition model is used for generating an annotation result of a target image.
The specific functional implementations of the first acquisition module 11, the model update module 12, the second acquisition module 13, and the first determination module 14 may refer to steps S101-S104 in the embodiment corresponding to
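Taken together, modules 11-14 form a closed annotate-train-recheck loop. The following Python sketch is purely illustrative: every function name and callable signature (`run_pipeline`, `correct`, `update_model`, `adjust`, `is_converged`) is an assumption introduced here, not part of this disclosure.

```python
# Hypothetical sketch of the four-module flow: predict -> correct -> retrain -> re-check.
def run_pipeline(model, first_image, second_image,
                 correct, update_model, adjust, is_converged):
    # First acquisition (module 11): the initial model predicts aided results,
    # which an annotator corrects into initial standard results.
    first_aided = model(first_image)
    second_aided = model(second_image)
    first_standard = correct(first_aided)
    second_standard = correct(second_aided)
    # Model update (module 12): train on the first image's standard/aided pair.
    updated_model = update_model(model, first_standard, first_aided)
    # Second acquisition (module 13): re-predict the second image, then adjust
    # its existing standard result in light of the new prediction.
    updated_aided = updated_model(second_image)
    updated_standard = adjust(second_standard, updated_aided)
    # First determination (module 14): accept the model as the target model on
    # convergence.
    if is_converged(updated_aided, updated_standard):
        return updated_model, updated_standard
    return None, updated_standard
```

The callables are deliberately left abstract: in the disclosure they correspond to the model's forward pass, the annotators' corrections, and the training step, respectively.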
Referring back to
The second determination module 15 is configured to determine, in response to a model update instruction, the first original image as a sample image, the first initial standard annotation result as a sample label of the sample image, and the first initial aided annotation result as a sample prediction result of the sample image.
Then the model update module 12 includes: a first determination unit 121 and a second determination unit 122.
The first determination unit 121 is configured to determine, based on the sample label and the sample prediction result, an overall loss value of the initial image recognition model.
The second determination unit 122 is configured to adjust, based on the overall loss value, model parameters in the initial image recognition model, and determine, when the adjusted initial image recognition model satisfies a model convergence condition, the adjusted initial image recognition model as an updated image recognition model.
The specific functional implementations of the second determination module 15, the first determination unit 121, and the second determination unit 122 may refer to steps S202-S204 in the embodiment corresponding to
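As a minimal, hedged illustration of what the first and second determination units 121-122 do, the toy loop below fits a single scalar weight so that the sample prediction approaches the sample label, stopping once the loss satisfies a convergence threshold. The scalar model, learning rate, and threshold are all assumptions, not the disclosure's actual model.

```python
# Toy illustration (not the disclosure's actual model): adjust one scalar
# weight until the overall loss value satisfies a convergence condition.
def update_model_parameter(weight, sample_input, sample_label,
                           lr=0.1, threshold=1e-4, max_steps=1000):
    loss = float("inf")
    for _ in range(max_steps):
        prediction = weight * sample_input           # sample prediction result
        loss = (prediction - sample_label) ** 2      # overall loss value
        if loss < threshold:                         # model convergence condition
            break
        grad = 2 * (prediction - sample_label) * sample_input
        weight -= lr * grad                          # adjust model parameters
    return weight, loss
```

The same shape generalizes to a real network: the squared error becomes the overall loss of unit 121, and the gradient step becomes the parameter adjustment of unit 122.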
Referring back to
The data processing apparatus 1 may further include: a third determination module 16 and a step performing module 17.
The third determination module 16 is configured to determine a first annotation result error between the first initial aided annotation result and the first initial standard annotation result.
The third determination module 16 is further configured to determine a second annotation result error between the second initial aided annotation result and the second initial standard annotation result.
The third determination module 16 is further configured to determine an average annotation result error between the first annotation result error and the second annotation result error.
The third determination module 16 is further configured to determine an initial loss value of the initial image recognition model based on the average annotation result error.
The step performing module 17 is configured to perform, when the initial loss value is greater than or equal to an initial loss value threshold, the step of adjusting, based on the first initial standard annotation result and the first initial aided annotation result, model parameters in the initial image recognition model so as to generate an updated image recognition model.
The specific functional implementations of the third determination module 16 and the step performing module 17 may refer to steps S302-S306 in the embodiment corresponding to
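The third determination module's averaging step and the step performing module's threshold gate can be sketched as follows; equating the initial loss value with the average annotation result error is an illustrative simplification, not a mapping the text specifies.

```python
# Sketch of the initial-loss gate (module 16 feeding module 17).
def should_update_model(first_error, second_error, initial_loss_threshold):
    # Average the per-image annotation result errors.
    average_error = (first_error + second_error) / 2
    # Illustrative simplification: the initial loss value is the average error.
    initial_loss = average_error
    # Parameter adjustment runs only while the loss stays at or above the
    # initial loss value threshold.
    return initial_loss >= initial_loss_threshold
```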
Referring back to
The third determination module 16 may include: a third determination unit 161 and a first weighting unit 162.
The third determination unit 161 is configured to determine an initial region error between the first annotation region and the second annotation region.
The third determination unit 161 is further configured to determine an initial object error between the first object label and the second object label.
The first weighting unit 162 is configured to perform weighted summation on the initial region error and the initial object error to obtain the first annotation result error.
The specific functional implementations of the third determination unit 161 and the first weighting unit 162 may refer to step S302 in the embodiment corresponding to
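One plausible concrete reading of units 161-162, assuming rectangular annotation regions, an IoU-style region measure, and equal weights — none of which is mandated by the text:

```python
def box_iou(a, b):
    # Boxes as (x1, y1, x2, y2); IoU is one plausible region-agreement measure.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union else 0.0

def annotation_result_error(aided_box, std_box, aided_label, std_label,
                            w_region=0.5, w_object=0.5):
    region_error = 1.0 - box_iou(aided_box, std_box)          # initial region error
    object_error = 0.0 if aided_label == std_label else 1.0   # initial object error
    # Weighted summation (unit 162); the weights here are illustrative.
    return w_region * region_error + w_object * object_error
```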
Referring back to
The first determination module 14 may include: a fourth determination unit 141, a second weighting unit 142, a fifth determination unit 143, and a sixth determination unit 144.
The fourth determination unit 141 is configured to determine an updated region loss value between the updated aided annotation region and the updated standard annotation region.
The fourth determination unit 141 is further configured to determine an updated object loss value between the updated aided object label and the updated standard object label.
The second weighting unit 142 is configured to perform weighted summation on the updated region loss value and the updated object loss value to obtain an updated loss value of the updated image recognition model.
The fifth determination unit 143 is configured to determine, when the updated loss value is greater than or equal to an updated loss value threshold, that the updated image recognition model does not satisfy the model convergence condition, and continue to adjust model parameters in the updated image recognition model.
The sixth determination unit 144 is configured to determine, when the updated loss value is less than the updated loss value threshold, that the updated image recognition model satisfies the model convergence condition, and determine the updated image recognition model as the target image recognition model.
The specific functional implementations of the fourth determination unit 141, the second weighting unit 142, the fifth determination unit 143, and the sixth determination unit 144 may refer to step S104 in the embodiment corresponding to
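Units 142-144 amount to a weighted loss plus a threshold test. A sketch, with illustrative equal weights (the disclosure does not fix the weight values):

```python
def convergence_decision(updated_region_loss, updated_object_loss, loss_threshold,
                         region_weight=0.5, object_weight=0.5):
    # Weighted summation of the region and object loss values (unit 142).
    updated_loss = region_weight * updated_region_loss + object_weight * updated_object_loss
    if updated_loss >= loss_threshold:
        # Fifth determination unit's branch: keep adjusting parameters.
        return "continue_adjusting", updated_loss
    # Sixth determination unit's branch: the model becomes the target model.
    return "target_model", updated_loss
```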
Referring back to
The fifth determination unit 143 may include: a first determination sub-unit 1431 and a model adjustment sub-unit 1432.
The first determination sub-unit 1431 is configured to determine an adjusted loss value based on the third initial standard annotation result and the third initial aided annotation result.
The first determination sub-unit 1431 is further configured to perform weighted summation on the adjusted loss value and the updated loss value to obtain a target loss value.
The model adjustment sub-unit 1432 is configured to adjust the model parameters in the updated image recognition model based on the target loss value.
The specific functional implementations of the first determination sub-unit 1431 and the model adjustment sub-unit 1432 may refer to step S104 in the embodiment corresponding to
Referring back to
The aid transmission unit 131 is configured to transmit the updated aided annotation result to annotation terminals corresponding to at least two annotation objects, such that the annotation terminals corresponding to the at least two annotation objects separately adjust the second initial standard annotation result based on the updated aided annotation result so as to obtain candidate annotation results of the second original image.
The first acquisition unit 132 is configured to acquire candidate annotation results returned by the annotation terminals separately corresponding to the at least two annotation objects; at least two candidate annotation results separately include candidate annotation regions for annotating the target object in the second original image.
The seventh determination unit 133 is configured to determine region quantities corresponding to the candidate annotation regions included separately in the at least two candidate annotation results.
The eighth determination unit 134 is configured to determine initial check annotation results for the at least two candidate annotation results based on at least two region quantities.
The second acquisition unit 135 is configured to acquire the updated standard annotation result based on the initial check annotation results.
The specific functional implementations of the aid transmission unit 131, the first acquisition unit 132, the seventh determination unit 133, the eighth determination unit 134, and the second acquisition unit 135 may refer to step S103 in the embodiment corresponding to
Referring back to
The quantity comparison sub-unit 1341 is configured to compare the at least two region quantities; the at least two region quantities include a region quantity B_a; a is a positive integer, and a is less than or equal to a result quantity of the at least two candidate annotation results.
The second determination sub-unit 1342 is configured to determine, when there is a region quantity different from the region quantity B_a in the remaining region quantities, the at least two candidate annotation results separately as the initial check annotation results; the remaining region quantities include region quantities other than the region quantity B_a among the at least two region quantities.
The region acquisition sub-unit 1343 is configured to acquire, when the remaining region quantities are all the same as the region quantity B_a, candidate annotation regions separately included in every two candidate annotation results in the at least two candidate annotation results.
The third determination sub-unit 1344 is configured to determine coincidence degrees between the candidate annotation regions separately included in every two candidate annotation results, and determine the initial check annotation results based on the coincidence degrees.
The specific functional implementations of the quantity comparison sub-unit 1341, the second determination sub-unit 1342, the region acquisition sub-unit 1343, and the third determination sub-unit 1344 may refer to step S104 in the embodiment corresponding to
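The quantity comparison and coincidence computation of sub-units 1341-1344 might look like the following; positional pairing of regions across candidate results, and the generic `coincidence` callable, are assumptions made for illustration:

```python
def determine_initial_check(candidates, coincidence):
    # candidates: one list of annotation regions per annotation object (annotator).
    # coincidence: scores agreement between two regions (e.g. an IoU-style measure).
    quantities = [len(regions) for regions in candidates]
    # Differing region quantities: every candidate annotation result becomes an
    # initial check annotation result (sub-unit 1342).
    if len(set(quantities)) > 1:
        return "all_candidates", candidates
    # Equal quantities: compute coincidence degrees between the regions of every
    # two candidate results (sub-units 1343-1344).
    degrees = [coincidence(ra, rb)
               for i in range(len(candidates))
               for j in range(i + 1, len(candidates))
               for ra, rb in zip(candidates[i], candidates[j])]
    return "compare_degrees", degrees
```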
Referring back to
The third determination sub-unit 1344 may include: a first check sub-unit 13441, a label division sub-unit 13442, and a second check sub-unit 13443.
The first check sub-unit 13441 is configured to determine, when at least one of the coincidence degrees is less than a coincidence degree threshold, the at least two candidate annotation results separately as the initial check annotation results.
The label division sub-unit 13442 is configured to group, when each of the coincidence degrees is equal to or greater than the coincidence degree threshold, same candidate object labels in the at least two candidate annotation results into a same object label group so as to obtain n object label groups; n is a positive integer.
The second check sub-unit 13443 is configured to determine the initial check annotation results based on the n object label groups.
The specific functional implementations of the first check sub-unit 13441, the label division sub-unit 13442, and the second check sub-unit 13443 may refer to step S103 in the embodiment corresponding to
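A hedged sketch of sub-units 13441-13442: a single low coincidence degree routes every candidate result to checking; otherwise identical candidate object labels are gathered into the same object label group. Representing each candidate result as its list of labels is an illustrative choice.

```python
def group_object_labels(candidate_results, coincidence_degrees, threshold):
    # Any coincidence degree below the threshold: all candidate results become
    # initial check annotation results (sub-unit 13441).
    if any(d < threshold for d in coincidence_degrees):
        return "all_candidates", candidate_results
    # Otherwise group identical candidate object labels (sub-unit 13442),
    # recording which candidate results carry each label; the dict holds the
    # resulting n object label groups.
    groups = {}
    for result_index, labels in enumerate(candidate_results):
        for label in labels:
            groups.setdefault(label, []).append(result_index)
    return "label_groups", groups
```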
Referring back to
The second check sub-unit 13443 is further specifically configured to determine quantity ratios between the maximum object label quantity and the object label quantities corresponding to the at least two candidate annotation results.
The second check sub-unit 13443 is further specifically configured to compare the quantity ratios with a quantity ratio threshold, and determine, when the quantity ratios are less than the quantity ratio threshold, the at least two candidate annotation results separately as the initial check annotation results.
The second check sub-unit 13443 is further specifically configured to determine, when the quantity ratios are equal to or greater than the quantity ratio threshold, an object label group corresponding to the maximum object label quantity as a target object label group.
The second check sub-unit 13443 is further specifically configured to acquire target candidate annotation results from candidate annotation results associated with the target object label group, and determine the target candidate annotation results as the initial check annotation results.
The specific functional implementation of the second check sub-unit 13443 may refer to step S103 in the embodiment corresponding to
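The text leaves the exact quantity-ratio computation open; one plausible reading, comparing the largest object label group against the total number of candidate results, is sketched below.

```python
def select_target_candidates(label_groups, ratio_threshold):
    # label_groups: object label group -> indices of the candidate results
    # carrying that label (as built by the grouping step).
    sizes = {label: len(members) for label, members in label_groups.items()}
    max_label = max(sizes, key=sizes.get)        # maximum object label quantity
    total = len({i for members in label_groups.values() for i in members})
    quantity_ratio = sizes[max_label] / total    # one plausible ratio definition
    if quantity_ratio < ratio_threshold:
        # Insufficient majority: every candidate result goes to checking.
        return "all_candidates", None
    # Sufficient majority: the dominant label group yields the target
    # candidate annotation results.
    return "target_candidates", sorted(label_groups[max_label])
```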
Referring back to
The first transmission sub-unit 1351 is configured to transmit, when the initial check annotation results are the at least two candidate annotation results, the initial check annotation results to a first check terminal, such that the first check terminal determines check annotation results to be transmitted to a second check terminal based on the at least two candidate annotation results. The second check terminal is configured to return the updated standard annotation result based on the check annotation results.
The second transmission sub-unit 1352 is configured to transmit, when the initial check annotation results are the target candidate annotation results, the initial check annotation results to the second check terminal, such that the second check terminal returns the updated standard annotation result based on the target candidate annotation results.
The specific functional implementations of the first transmission sub-unit 1351 and the second transmission sub-unit 1352 may refer to step S103 in the embodiment corresponding to
Referring back to
The third acquisition unit 111 is configured to acquire original images; the original images include a target object.
The fourth acquisition unit 112 is configured to input the original images into the initial image recognition model, and acquire image features of the original images in the initial image recognition model.
The ninth determination unit 113 is configured to determine, based on the image features, an initial region recognition feature of the target object and an initial object recognition feature of the target object.
The result generation unit 114 is configured to generate, based on the initial region recognition feature, an initial aided annotation region for the target object, and generate, based on the initial object recognition feature, an initial aided object label for the initial aided annotation region.
The result generation unit 114 is further configured to determine the initial aided annotation region and the initial aided object label as the initial aided annotation results.
The specific functional implementations of the third acquisition unit 111, the fourth acquisition unit 112, the ninth determination unit 113, and the result generation unit 114 may refer to step S101 in the embodiment corresponding to
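As a toy stand-in for units 112-114 (the real model's feature extractor and recognition heads are not specified in this text), a shared scalar "feature" feeds a region head and an object-label head:

```python
def extract_image_feature(image):
    # Stand-in image feature (unit 112): mean pixel intensity of a 2-D image
    # given as nested lists. A real model would use a learned feature extractor.
    values = [v for row in image for v in row]
    return sum(values) / len(values)

def recognize(image, label_names):
    feature = extract_image_feature(image)
    # Region head (unit 113/114): derive a toy bounding box from the feature.
    region = (0, 0, int(feature) + 1, int(feature) + 1)   # initial aided annotation region
    # Object head: pick a label from the same shared feature.
    label = label_names[int(feature) % len(label_names)]  # initial aided object label
    return {"region": region, "label": label}             # initial aided annotation result
```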
In an embodiment of this disclosure, a computer device may use a first initial standard annotation result and a first initial aided annotation result as a training sample set to update an initial image recognition model, that is, to adjust model parameters so as to obtain an updated image recognition model. It is to be understood that this process not only realizes the model update, but also lets the training sample set determine the direction of the update. Further, an updated aided annotation result of the second original image is predicted based on the updated image recognition model, and an updated standard annotation result, obtained by adjusting the second initial standard annotation result based on the updated aided annotation result, is acquired, thereby updating the second initial standard annotation result. Further, when the updated image recognition model is determined as the target image recognition model, a target aided annotation result of the target image is generated by using the target image recognition model. As can be seen from the above, the embodiment of this disclosure can not only update the initial image recognition model based on the training sample set, so as to improve the recognition capability of the updated image recognition model, but also update the second initial standard annotation result through the updated image recognition model, so as to improve the accuracy of the updated standard annotation result. Therefore, this disclosure enables a bidirectional update of the image recognition model and the annotation result.
Further, reference may be made to
In the computer device 1000 shown in
Predict, based on an initial image recognition model, initial aided annotation results of original images, and acquire initial standard annotation results determined based on the initial aided annotation results; the original images include a first original image and a second original image; the initial standard annotation results include a first initial standard annotation result of the first original image and a second initial standard annotation result of the second original image; and the initial aided annotation results include a first initial aided annotation result of the first original image.
Adjust, based on the first initial standard annotation result and the first initial aided annotation result, model parameters in the initial image recognition model so as to generate an updated image recognition model.
Predict, based on the updated image recognition model, an updated aided annotation result of the second original image, and acquire an updated standard annotation result; the updated standard annotation result is obtained by adjusting the second initial standard annotation result based on the updated aided annotation result.
Determine, in response to determining that the updated image recognition model satisfies a model convergence condition based on the updated aided annotation result and the updated standard annotation result, the updated image recognition model as a target image recognition model; the target image recognition model is used for generating a target aided annotation result of a target image.
It is to be understood that the computer device 1000 as described in the embodiments of this disclosure may carry out the description of the data processing method as described in the embodiments corresponding to
An embodiment of this disclosure further provides a computer-readable storage medium storing a computer program including program instructions which, when executed by a processor, implement the data processing method provided by the steps of
The computer-readable storage medium above may be provided in the data processing apparatus of any of the preceding embodiments, or may be an internal storage unit of the computer device above, such as a hard disk or an internal memory of the computer device. The computer-readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the computer device. Further, the computer-readable storage medium may include both the internal storage unit and the external storage device of the computer device. The computer-readable storage medium is used for storing the computer program and other programs and data required by the computer device, and may also be used for temporarily storing data that has been or will be output.
An embodiment of this disclosure further provides a computer program product or computer program including computer instructions stored in a computer-readable storage medium. A processor of a computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to carry out the description of the data processing method as described in the embodiments corresponding to
The terms “first”, “second”, and the like in the description of embodiments, claims, and drawings of this disclosure are used for distinguishing between different objects and not necessarily for describing a specific order. Furthermore, the term “include” and any variations thereof are intended to encompass a non-exclusive inclusion. For example, a process, method, apparatus, product, or device that includes a list of steps or units is not limited to the listed steps or units, but may optionally include additional steps or units that are not listed or that are inherent to such process, method, apparatus, product, or device.
Those skilled in the art may recognize that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination thereof. To clearly illustrate the interchangeability of hardware and software, the components and steps of the examples have been described above generally in terms of their functions. Whether these functions are performed in hardware or software depends upon the particular application and the design constraints imposed on the technical solution. Skilled artisans may implement the described functions in various ways for each particular application, but such an implementation is not to be interpreted as departing from the scope of this disclosure.
The method, and the apparatus related thereto, provided by embodiments of this disclosure are described with reference to the method flowcharts and/or structural diagrams provided by those embodiments. Specifically, each flow and/or block in the method flowcharts and/or structural diagrams, and each combination of flows and/or blocks in the flowcharts and/or block diagrams, may be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing devices to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing devices, create means for implementing the functions specified in a flow or flows of the flowcharts and/or a block or blocks of the structural diagrams. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing devices to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in a flow or flows of the flowcharts and/or a block or blocks of the structural diagrams. These computer program instructions may also be loaded onto a computer or other programmable data processing devices, such that a series of operational steps are performed on the computer or other programmable devices to produce a computer-implemented process, whereby the instructions which execute on the computer or other programmable devices provide steps for implementing the functions specified in a flow or flows of the flowcharts and/or a block or blocks of the structural diagrams.
The above disclosures are merely preferred embodiments of this disclosure, and are of course not intended to limit the scope of claims of this disclosure, and therefore, equivalent changes made according to the claims of this disclosure are still within the scope of this disclosure.
Number | Date | Country | Kind |
---|---|---|---|
202111521261.4 | Dec 2021 | CN | national |
This application is a continuation application of PCT Patent Application No. PCT/CN2022/137442, filed on Dec. 8, 2022, which claims priority to Chinese Patent Application No. 202111521261.4, entitled “DATA PROCESSING METHOD, DEVICE, AND COMPUTER-READABLE STORAGE MEDIUM”, filed with the China National Intellectual Property Administration on Dec. 13, 2021, wherein the contents of the above-referenced applications are incorporated herein by reference in their entirety.
Number | Date | Country | |
---|---|---|---|
Parent | PCT/CN2022/137442 | Dec 2022 | US |
Child | 18368680 | US |