The present invention relates to the field of digital imagery, and more specifically relates to a method for searching the internet for instances of the usage of images, especially including those images whose quality has been degraded, either purposefully or incidentally.
As the internet continues to increase in breadth and size, it has become more difficult to notice when one's intellectual property is displayed online without authorization. Unauthorized parties may opt to display one's likeness, logo, or similarly protected images without consent, and it is impossible to take the appropriate countermeasures until one is aware of the infringement.
This is further complicated when a party displays a degraded, altered, partially corrupted, or similarly incomplete depiction of the image. Presently, image search platforms are ill-equipped to detect such degraded images and include them in result sets. However, some forms of machine learning via a convolutional neural network have been crafted for the specific purpose of image analytics. If such methods were trained for the detection of degraded images, image search platforms would be more effective in the detection of such images, authorized or otherwise, on the internet.
Thus, there is a need for a new method by which degraded or otherwise compromised images which are in use online may be matched and identified. Such a method is preferably configured to include all degraded images in the search result set despite any imperfections in and alterations to the image file itself, as well as to the depiction of the image on the internet.
The present invention is a method for performing an image search which enables the identification and classification of degraded images as pertinent results in the result set. The method employs machine learning, namely a convolutional neural network. Preferably the ResNet50 model is used as a training base, but any ResNet model may be used. The first and fourth layers of the network are retained and used as the feature set for the search, which is preferably executed via well-known AI libraries. The first and fourth layers are employed for the classification and prediction of objects which may no longer be fully depicted within the image due to the aforementioned degradation.
The following brief and detailed descriptions of the drawings are provided to explain possible embodiments of the present invention but are not provided to limit the scope of the present invention as expressed in this summary section.
The accompanying drawings, which are incorporated herein and form a part of the specification, illustrate the present invention and, together with the description, further serve to explain the principles of the invention and to enable a person skilled in the pertinent art to make and use the invention.
The present invention will be better understood with reference to the appended drawing sheets, wherein:
The present specification discloses one or more embodiments that incorporate the features of the invention. The disclosed embodiment(s) merely exemplify the invention. The scope of the invention is not limited to the disclosed embodiment(s).
References in the specification to “one embodiment,” “an embodiment,” “an example embodiment,” etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, when a particular feature, structure, or characteristic is described in connection with an embodiment, it is submitted that it is within the knowledge of one skilled in the art to effect such feature, structure, or characteristic in connection with other embodiments whether or not explicitly described.
The present invention is a method of performing internet-based image searches which includes degraded, corrupted, or otherwise incomplete images in its result set. The method employs a convolutional neural network (CNN) to analyze digital imagery against a base sample image. The first and fourth layers of the CNN are reserved as the feature set, which is used for the classification and prediction of objects of the subject image. The method of the present invention preferably employs ResNet50 (a well-known pretrained model) as the CNN of choice, as it is known to have been trained on over one million images sourced from the ImageNet database. It is preferable to use well-known pretrained models and AI libraries, as they can provide far better results on degraded images.
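By way of a non-limiting illustration, the combination of two layers into a single feature set might be sketched as follows. The specification does not state how the layer outputs are reduced to a vector, so the global average pooling, the L2 normalization, and the names `global_average_pool` and `build_descriptor` are assumptions made for this sketch only. The nested lists stand in for activation maps that would in practice be produced by the CNN.

```python
import math
from typing import List

# One channel of a layer's output: rows of activation values.
FeatureMap = List[List[float]]


def global_average_pool(channels: List[FeatureMap]) -> List[float]:
    """Reduce each channel's 2-D activation map to its mean value."""
    pooled = []
    for channel in channels:
        values = [value for row in channel for value in row]
        pooled.append(sum(values) / len(values))
    return pooled


def build_descriptor(layer1: List[FeatureMap], layer4: List[FeatureMap]) -> List[float]:
    """Concatenate the pooled outputs of two layers and scale to unit length.

    Assumption: pooling plus concatenation is only one plausible way to turn
    the retained layers into a fixed-length feature vector.
    """
    combined = global_average_pool(layer1) + global_average_pool(layer4)
    norm = math.sqrt(sum(value * value for value in combined))
    if norm == 0.0:
        return combined  # an all-zero vector cannot be normalized
    return [value / norm for value in combined]


# Hypothetical activations: one channel from layer 1, two channels from layer 4.
descriptor = build_descriptor(
    [[[1.0, 3.0]]],
    [[[0.0, 0.0]], [[4.0, 4.0]]],
)
```

Scaling the descriptor to unit length is a convenience for later comparison, since the similarity of two unit vectors then depends only on their direction.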
The retained feature set, as derived from the first and fourth layers, is stored in an Approximate Nearest Neighbor Index (ANN-Index) to quickly determine distance for predicted objects against the subject image. Similarly, the ANN-Index facilitates the execution of cosine similarity detection. The ANN-Index is preferable as it can execute a K-Nearest Neighbor (KNN) vector search rapidly while achieving efficacious results. Other indexes can be used.
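As a minimal sketch of the comparison such an index performs, the following computes cosine similarity and returns the K most similar stored vectors. It is an exact, brute-force search rather than an approximate one, and is shown only to make the ranking explicit; the function names and the sample vectors are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors.

    Returns 0.0 when either vector is all zeros, since the angle is undefined.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def k_nearest(
    query: Sequence[float], index: Dict[str, Sequence[float]], k: int
) -> List[Tuple[str, float]]:
    """Return the k stored vectors most similar to the query, best first.

    Brute force: every stored vector is scored. An ANN index would avoid
    scoring all of them, trading exactness for speed.
    """
    scored = [(image_id, cosine_similarity(query, vec)) for image_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical feature vectors keyed by image identifier.
index = {
    "logo_original": [0.9, 0.1, 0.4],
    "logo_blurred": [0.8, 0.2, 0.5],
    "unrelated": [0.0, 1.0, 0.0],
}
results = k_nearest([0.9, 0.1, 0.4], index, k=2)
```

In this sample, the blurred variant ranks directly behind the original because its vector points in nearly the same direction, which is the property relied upon when matching degraded images.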
It should be noted that the method of the present invention employs BERT and MultiFiT models to provide text classification and to apply a bag-of-words methodology. As MultiFiT is trained on at least 100 documents within the target language, it is optimal for the detection of text components of an image, and for the prediction of any and all degraded text components of the image. BERT is preferably used to cross-check results originating from MultiFiT analysis. For result clustering, the method of the present invention utilizes Lingo3G, a multilingual text clustering engine. With Lingo3G on the text side, and ResNet50 on the image side, the method of the present invention uses transfer learning to increase the training ability of the system over time.
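The bag-of-words representation referred to above can be illustrated, under simplifying assumptions, as a count of tokens that ignores word order. The tokenization rule and the helper names below are hypothetical and are not drawn from BERT, MultiFiT, or Lingo3G.

```python
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Count lowercase word tokens, discarding order and punctuation.

    \\w matches Unicode word characters, so non-English text is tokenized too.
    """
    return Counter(re.findall(r"\w+", text.lower()))


def shared_token_count(a: Counter, b: Counter) -> int:
    """Number of token occurrences two bags have in common."""
    # Counter intersection keeps the minimum count of each shared token.
    return sum((a & b).values())


# Hypothetical text recovered from two images.
first = bag_of_words("ACME Coffee Roasters")
second = bag_of_words("acme coffee")
overlap = shared_token_count(first, second)
```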
The procedure of use of the method of the present invention, as coordinated and executed by at least one computer as depicted in
As a further reporting function, once the images have been found within the trademark database, the goods and services associated with the particular marks in which the images have been found are cross-referenced. A report is then made available which allows the user to see those goods and services which do, and those which do not, have a reference to the image searched. A further reporting capability of cross-referencing any of the data in the trademarks found to the images is also available.
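One way to picture this report is as a set comparison, sketched below. The sketch assumes a catalog of goods and services against which coverage is measured; that catalog, the function name `goods_services_report`, and the sample data are hypothetical and are not specified by the method.

```python
from typing import Dict, List, Set


def goods_services_report(
    catalog: Set[str], matched_marks: Dict[str, Set[str]]
) -> Dict[str, List[str]]:
    """Split a catalog of goods/services by whether any matched mark lists them.

    matched_marks maps each mark in which the image was found to the goods
    and services registered for that mark.
    """
    # Union of goods/services across every mark in which the image was found.
    covered: Set[str] = set().union(*matched_marks.values())
    return {
        "with_image": sorted(catalog & covered),
        "without_image": sorted(catalog - covered),
    }


# Hypothetical data: two marks in which the searched image was found.
catalog = {"clothing", "coffee", "software", "toys"}
matched_marks = {
    "mark-001": {"coffee", "clothing"},
    "mark-002": {"coffee"},
}
report = goods_services_report(catalog, matched_marks)
```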
As a further embodiment, on the training side, the features of the image are first extracted and the above-mentioned training is completed. Text regions are then located using the known algorithm EAST (an Efficient and Accurate Scene Text detection pipeline): the bounding box of each text region is extracted from the algorithm's output, and optical character recognition (OCR) is used to convert the content of the bounding box to text. The text is then applied to the classification records for the image. Later, during the search, the extracted text is used as a secondary classification.
It should be noted that the above method allows the search tool either to continue to search the full set based on the images or to create a subset based solely on the words found within the images, making the search faster and more efficient.
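The choice between searching the full set and searching a word-based subset might be sketched as follows. The record layout (an image identifier mapped to the words extracted from that image) and the function name `select_candidates` are assumptions made for illustration.

```python
from typing import Dict, Iterable, List, Set


def select_candidates(records: Dict[str, Set[str]], query_words: Iterable[str]) -> List[str]:
    """Return the image identifiers that the image search should examine.

    With no query words the full set is returned; otherwise only images whose
    extracted text shares at least one word with the query are kept.
    """
    wanted = {word.lower() for word in query_words}
    if not wanted:
        return sorted(records)
    return sorted(
        image_id
        for image_id, words in records.items()
        if wanted & {word.lower() for word in words}
    )


# Hypothetical classification records: image id -> words found in the image.
records = {
    "img-1": {"ACME", "Coffee"},
    "img-2": {"Globex"},
    "img-3": set(),
}
subset = select_candidates(records, ["acme"])
full_set = select_candidates(records, [])
```

Only the images remaining in the subset would then need to be compared by image features, which is where the saving in search time would arise.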
All of the above methods may be applied to video by iterating over what are well known as the key frames (the frames marking the starting and ending points of a smooth transition).
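A minimal sketch of such an iteration is given below. It assumes that the frames have already been decoded and flagged as key frames by some upstream step; the `Frame` type, its fields, and the matching predicate are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence


@dataclass
class Frame:
    index: int                  # position of the frame within the video
    is_key_frame: bool          # assumed to be set by an upstream decoder
    features: Sequence[float]   # feature vector extracted for this frame


def matching_key_frames(
    frames: Iterable[Frame], is_match: Callable[[Sequence[float]], bool]
) -> List[int]:
    """Return the indexes of key frames whose features satisfy the predicate.

    Non-key frames are skipped entirely, so the image search runs only on
    the frames that bound each transition.
    """
    return [f.index for f in frames if f.is_key_frame and is_match(f.features)]


# Hypothetical video: frames 0 and 2 are key frames, frame 1 is not.
frames = [
    Frame(0, True, [1.0]),
    Frame(1, False, [1.0]),
    Frame(2, True, [0.0]),
]
hits = matching_key_frames(frames, lambda features: features[0] > 0.5)
```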
The system's technology stack includes: Solr 8.x (SolrCloud configuration with distributed ZooKeeper), Dropwizard 3.x, Vert.x 3.x, PostgreSQL 11.x, Hazelcast Distributed Memory Grid, and Flask (AI model serving). Although the system runs on Amazon AWS servers, it is not specifically tied to AWS or its infrastructure. However, hosting on cloud servers allows the system to scale as demand grows.
Having illustrated the present invention, it should be understood that various adjustments and versions might be implemented without venturing away from the essence of the present invention. Further, it should be understood that the present invention is not solely limited to the invention as described in the embodiments above, but further comprises any and all embodiments within the scope of this application.
The foregoing descriptions of specific embodiments of the present invention have been presented for purposes of illustration and description. They are not intended to be exhaustive or to limit the present invention to the precise forms disclosed, and many modifications and variations are possible in light of the above teaching. The exemplary embodiment was chosen and described in order to best explain the principles of the present invention and its practical application, to thereby enable others skilled in the art to best utilize the present invention and various embodiments with various modifications as are suited to the particular use contemplated.
This application is a non-provisional application of provisional patent application No. 62/960,579, filed on Jan. 13, 2020, and priority is claimed thereto.
Number | Date | Country
---|---|---
62960579 | Jan 2020 | US