The present disclosure relates generally to analyzing documents containing images, and more particularly to generating a plurality of files based on images containing multiple documents.
The Value-Added Tax (VAT) or sales tax is a broadly based consumption tax assessed on the value added to goods and services. A particular VAT applies to most goods and services that are bought or sold within a given country or state. When a person travels abroad and makes a purchase that requires paying a VAT (or any applicable tax), that person may be entitled to a subsequent refund of the VAT for the purchase. Other taxes applied to purchases may similarly be refunded under particular circumstances. Further, sellers may offer rebates for purchases of products sold in certain locations and under particular circumstances. Such refunds of the purchase price may be reclaimed by following procedures established by the refunding entity.
The laws and regulations of many countries grant foreign travelers the right to reimbursement or a refund of certain taxes such as, e.g., VATs paid for goods and/or services abroad. Because such laws and regulations differ from one country to another, determining the actual VAT refunds that one is entitled to receive often requires that the seeker of the refund possess a vast amount of knowledge in the area of tax laws abroad. Moreover, travelers may seek refunds for VATs when they are not entitled to such refunds, thereby spending time and effort on a fruitless endeavor. Further, availability of the VAT refund may vary based on the type of purchase made and the presence of a qualified VAT receipt.
One procedure to request a refund is to physically approach a customs official at an airport, fill out a form, and file the original receipts respective of the expenses incurred during the visit. This procedure should be performed prior to checking in or boarding to the next destination. Additionally, particularly with respect to goods purchased abroad, the procedure to request a refund may require that the payer show the unused goods to a customs official to verify that the goods being exported match the goods that the payer paid VATs on.
As travelers are not familiar with specific laws and regulations for claiming a refund, the travelers may submit a claim for a refund even though they are not eligible. This procedure further unnecessarily wastes time if the traveler ultimately learns that he or she is not entitled to a refund. It would therefore be advantageous to provide a solution that would overcome the deficiencies of the prior art by providing an effective way to handle VAT refunds electronically and, preferably, over the Internet.
Furthermore, due to the hassles associated with claiming refunds and, in particular, VAT refunds, customers may not be motivated to seek such refunds. Particularly with respect to potentially large refunds, properly managed refunding platforms may be crucial for saving money. As an example, a VAT refunding platform may be important to large enterprises requiring their employees to travel for business purposes. Due to the massive amount of invoices generated by a typical enterprise, many of which may be eligible for VAT refunds, enterprises may be prone to errors during collection and verification of invoices.
Additionally, the large number of invoices generated by a typical enterprise ultimately results in creation of a multitude of files corresponding to the invoices. Existing solutions typically require that each invoice be contained in a separate file and, consequently, require individual scanning or other capturing of each invoice. Such manual individual scanning wastes time and resources, and ultimately subjects the process to more potential for human error.
It would therefore be advantageous to provide a solution that would overcome the deficiencies of the prior art.
A summary of several example embodiments of the disclosure follows. This summary is provided for the convenience of the reader to provide a basic understanding of such embodiments and does not wholly define the breadth of the disclosure. This summary is not an extensive overview of all contemplated embodiments, and is intended to neither identify key or critical elements of all embodiments nor to delineate the scope of any or all aspects. Its sole purpose is to present some concepts of one or more embodiments in a simplified form as a prelude to the more detailed description that is presented later. For convenience, the term “some embodiments” may be used herein to refer to a single embodiment or multiple embodiments of the disclosure.
Some exemplary embodiments disclosed herein include a method for extracting document images from images featuring multiple documents. The method comprises receiving a multiple-document image including a plurality of document images, wherein each document image is associated with a document; extracting a plurality of visual identifiers from the multiple-document image, wherein each visual identifier is associated with one of the plurality of document images; analyzing the plurality of visual identifiers to identify each document image; determining, based on the analysis, an image area of each document image; and extracting each document image based on its image area.
Some exemplary embodiments disclosed herein also include a system for extracting document images from images featuring multiple documents. The system comprises a processing system; and a memory, the memory containing instructions that, when executed by the processing system, configure the system to: receive a multiple-document image including a plurality of document images, wherein each document image is associated with a document; extract a plurality of visual identifiers from the multiple-document image, wherein each visual identifier is associated with one of the plurality of document images; analyze the plurality of visual identifiers to identify each document image; determine, based on the analysis, an image area of each document image; and extract each document image based on its image area.
The subject matter disclosed herein is particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other objects, features, and advantages of the disclosed embodiments will be apparent from the following detailed description taken in conjunction with the accompanying drawings.
It is important to note that the embodiments disclosed herein are only examples of the many advantageous uses of the innovative teachings herein. In general, statements made in the specification of the present application do not necessarily limit any of the various claimed embodiments. Moreover, some statements may apply to some inventive features but not to others. In general, unless otherwise indicated, singular elements may be in plural and vice versa with no loss of generality. In the drawings, like numerals refer to like parts through several views.
The user device 150 and each enterprise device 160 may be, but is not limited to, a personal computer (PC), a notebook computer, a cellular phone, a smartphone, a tablet device, a wearable computing device, a scanner, and so on. The user device 150 may include or be communicatively connected to an image sensor 155 utilized to capture images. An enterprise operating an enterprise device 160 may be, but is not limited to, a hotel, a shop, a service provider, and so on.
In an embodiment, the user device 150 captures an image (e.g., via the image sensor 155) containing multiple invoices and/or other documents. Each invoice typically includes a proof of payment for a potentially refundable purchase. The documents may be in an unorganized form, i.e., the invoices do not need to be arranged, oriented, or otherwise organized in a particular manner so long as information (e.g., words, symbols, numbers, characters, shapes, matrices, labels, barcodes, and so on) in each document is visible in the multiple-invoice image.
The user device 150 sends the captured multiple-invoice image to the server 120. The server 120 is configured to extract visual identifiers from the multiple-invoice image. The visual identifiers may include, but are not limited to, a document identification number (e.g., an invoice number), a code (e.g., a QR code, a bar code, etc.), a transaction number, a name of a business, an address of a business, an identification number of a business, a total price, a currency, a method of payment (e.g., cash, check, credit card, debit card, digital currency, etc.), a date, a type of product, a price per product, and so on.
To this end, the server 120 may include or may be communicatively connected to a recognition unit (RU) 125. The recognition unit 125 is configured to execute machine imaging processes. The recognition unit 125 is further configured to enable recognition of the visual identifiers shown in the multiple-invoice image by using one or more computer vision techniques such as, but not limited to, image recognition, pattern recognition, signal processing, character recognition, and the like. The recognition unit 125 may include, but is not limited to, an optical character recognition unit, an image recognition unit, and a combination thereof.
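As a non-limiting illustration, the recognition of visual identifiers may be sketched as pattern matching over text tokens that a character recognition step has already produced. The sketch below is an assumption-laden example rather than the disclosed implementation: the `VisualIdentifier` type, the `extract_visual_identifiers` function, the token format `(text, x, y)`, and the three regular expressions are all hypothetical, and real invoices vary far more than these patterns allow.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class VisualIdentifier:
    kind: str   # e.g. "invoice_number", "date", "total_price"
    value: str  # the recognized text
    x: int      # horizontal position of the text in the multiple-invoice image
    y: int      # vertical position of the text in the multiple-invoice image


# Hypothetical patterns chosen only for this example.
PATTERNS = {
    "invoice_number": re.compile(r"^INV-\d+$"),
    "date": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
    "total_price": re.compile(r"^(?:EUR|USD|GBP)\d+\.\d{2}$"),
}


def extract_visual_identifiers(tokens):
    """Classify OCR tokens, given as (text, x, y) tuples, into visual identifiers."""
    found = []
    for text, x, y in tokens:
        for kind, pattern in PATTERNS.items():
            if pattern.match(text):
                found.append(VisualIdentifier(kind, text, x, y))
                break  # a token is assigned at most one kind
    return found


# Tokens as a character recognition step might report them (made-up data).
ocr_tokens = [
    ("INV-1001", 40, 30), ("2015-02-04", 40, 60), ("EUR12.50", 40, 90),
    ("Thank", 40, 120), ("INV-2002", 400, 35), ("USD99.00", 400, 95),
]
identifiers = extract_visual_identifiers(ocr_tokens)
```

Keeping the position of each token alongside its value is a design choice of the sketch: the positions are what later allow identifiers to be grouped per invoice image.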
The server 120 is configured to analyze the extracted visual identifiers to identify invoice images illustrated in the multiple-invoice image. The identification may be based on a threshold of visual identifiers required for determining an invoice based on the visual identifiers. The visual identifier threshold may represent the minimum visual identifiers needed for identifying an invoice in the multiple-invoice image. The threshold may include, but is not limited to, a minimum number of visual identifiers, a particular visual identifier, a particular combination of visual identifiers, and so on. For example, a threshold requirement for identifying an invoice in an image based on visual identifiers may include a total price, a merchant identifier, and a type of product. In that example, each identified invoice in the multiple-invoice image will include a total price, a merchant identifier, and a type of product.
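The threshold test described above may be sketched, under stated assumptions, as a check on the set of identifier kinds found for a candidate invoice. The function name `meets_threshold`, its parameters, and the kind labels are hypothetical and serve only to illustrate a minimum count combined with a required combination.

```python
def meets_threshold(kinds, min_count=0, required_kinds=frozenset()):
    """Return True when a candidate's identifier kinds satisfy the threshold.

    kinds          -- identifier kinds found for one candidate invoice
    min_count      -- minimum number of distinct kinds (assumed parameter)
    required_kinds -- kinds that must all be present (assumed parameter)
    """
    kinds = set(kinds)
    return len(kinds) >= min_count and set(required_kinds) <= kinds


# The example threshold from the text: total price, merchant identifier, type of product.
REQUIRED = {"total_price", "merchant_identifier", "product_type"}

candidate_a = {"total_price", "merchant_identifier", "product_type", "date"}
candidate_b = {"total_price", "date"}

a_is_invoice = meets_threshold(candidate_a, required_kinds=REQUIRED)
b_is_invoice = meets_threshold(candidate_b, required_kinds=REQUIRED)
```

In this sketch, `candidate_a` is identified as an invoice and `candidate_b` is not, because the latter lacks a merchant identifier and a type of product.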
In an embodiment, the server 120 may be configured to determine whether any visual identifiers are required and, if so, to retrieve the required visual identifiers. The required visual identifiers may be predefined such that each invoice that is identified will be suitable for further processing. For example, for a value added tax (VAT) reclaim, a location of a purchase may be a required visual identifier. Thus, if an invoice does not include a visual identifier indicating a location of the transaction, the location associated with the invoice may be retrieved.
The required visual identifiers may be retrieved from, e.g., the enterprise device 160 and/or the web sources 170. The web sources 170 may be, but are not limited to, databases in which data regarding reclaim information is stored. Such databases may include, for example, VAT information exchange systems (VIESs), tax authority databases, rebate sharing systems, and so on. Each web source 170 may be operated by an entity such as, but not limited to, a tax authority, a VAT refund agency, and the like.
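Retrieval of missing required identifiers may be sketched as follows. This is an illustrative example only: `fill_required_identifiers`, `lookup_from_merchant_table`, and the `MERCHANT_LOCATIONS` table are hypothetical stand-ins for a query to an enterprise device or a web source, and the merchant identifier and location values are made up.

```python
def fill_required_identifiers(invoice, required, lookup):
    """Return (completed_invoice, still_missing) after consulting `lookup`.

    invoice  -- dict mapping identifier kind to value, as found in the image
    required -- identifier kinds needed for further processing
    lookup   -- callable(invoice, kind) returning a value or None
    """
    completed = dict(invoice)  # leave the caller's dict untouched
    still_missing = []
    for kind in required:
        if kind in completed:
            continue
        value = lookup(completed, kind)
        if value is None:
            still_missing.append(kind)
        else:
            completed[kind] = value
    return completed, still_missing


# Hypothetical external data keyed by merchant identifier.
MERCHANT_LOCATIONS = {"DE123456789": "Berlin, DE"}


def lookup_from_merchant_table(invoice, kind):
    """Stand-in for an external source; only knows purchase locations."""
    if kind == "location":
        return MERCHANT_LOCATIONS.get(invoice.get("merchant_identifier"))
    return None


found_in_image = {"merchant_identifier": "DE123456789", "total_price": "EUR12.50"}
completed, missing = fill_required_identifiers(
    found_in_image, ["total_price", "location"], lookup_from_merchant_table
)
```

Returning the kinds that could not be retrieved lets a caller decide whether an invoice is still suitable for further processing.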
The server 120 is configured to determine an image area associated with each of the identified invoice images based on the analysis. Each image area includes the visual identifiers of its respective invoice image and indicates the boundaries of the invoice image within the multiple-invoice image. The determination may include, but is not limited to, identifying a center of an invoice, identifying boundaries of each invoice image, and so on. In an embodiment, identifying the boundaries of each invoice image may be based on clean areas in the multiple-invoice image, i.e., portions of the captured image where no text appears may be identified as boundaries of the invoice images. Each image area may be a particular shape defined by its boundaries such as, for example, rectangular (i.e., a typical invoice contains text within a rectangular area). The image areas may be the same or different shapes defined by their respective boundaries.
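Boundary identification based on clean areas may be sketched on a simplified model in which the image is a grid of cells, each marked 1 where text appears and 0 where it does not. The sketch assumes that the invoices lie side by side and are separated by clean vertical strips; the function name `find_image_areas` and the `min_gap` parameter are hypothetical, and a practical implementation would also need to handle stacked or rotated documents.

```python
def find_image_areas(grid, min_gap=2):
    """Return rectangular areas as (left, top, right, bottom), bounds inclusive.

    grid    -- list of rows; each cell is 1 where text appears, else 0
    min_gap -- number of consecutive clean columns treated as a boundary
    """
    height, width = len(grid), len(grid[0])
    col_has_text = [any(grid[r][c] for r in range(height)) for c in range(width)]

    # Group text columns into runs separated by at least min_gap clean columns.
    runs, start, last = [], None, None
    for c, has_text in enumerate(col_has_text):
        if not has_text:
            continue
        if start is None:
            start = c
        elif c - last - 1 >= min_gap:
            runs.append((start, last))
            start = c
        last = c
    if start is not None:
        runs.append((start, last))

    # Within each run, the first and last rows containing text bound the area.
    areas = []
    for left, right in runs:
        rows = [r for r in range(height) if any(grid[r][left:right + 1])]
        areas.append((left, rows[0], right, rows[-1]))
    return areas


grid = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 1, 1, 0],
    [0, 1, 0, 0, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 1, 0],
]
areas = find_image_areas(grid)
```

For the example grid, the four clean columns between the two text regions act as the boundary, so two rectangular image areas are returned.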
The server 120 is configured to extract each identified invoice image based on its respective determined image area. The extraction may include creating a file for each invoice image. The extraction may further include, but is not limited to, cutting, copying, or cropping each identified invoice image. Extraction via cutting may include removing each invoice image from the captured image and generating a new file for each removed invoice image such that, after extraction, the multiple-invoice image does not feature any invoices. Extraction via copying may include generating a new file for each invoice image including a copy of the invoice image such that, after extraction, the multiple-invoice image still contains all copied invoice images. Extraction by cropping includes generating a file containing a copy of the multiple-invoice image for each identified invoice image and shrinking each file based on its respective invoice image such that each file contains only the respective cropped invoice image.
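The three extraction variants may be contrasted in a short sketch in which an image is modeled as a list of pixel rows and an image area as an inclusive `(left, top, right, bottom)` rectangle. The function names `copy_area`, `cut_area`, and `crop_to_area`, as well as the `blank` fill value, are assumptions made for illustration; writing the result to a file is omitted.

```python
import copy


def copy_area(image, area):
    """Copying: return the area's pixels; the source image is left unchanged."""
    left, top, right, bottom = area
    return [row[left:right + 1] for row in image[top:bottom + 1]]


def cut_area(image, area, blank=0):
    """Cutting: return the area's pixels and blank them out in the source image."""
    piece = copy_area(image, area)
    left, top, right, bottom = area
    for r in range(top, bottom + 1):
        for c in range(left, right + 1):
            image[r][c] = blank
    return piece


def crop_to_area(image, area):
    """Cropping: duplicate the whole image, then shrink the duplicate to the area."""
    duplicate = copy.deepcopy(image)
    left, top, right, bottom = area
    del duplicate[bottom + 1:]   # drop rows below the area
    del duplicate[:top]          # drop rows above the area
    for row in duplicate:
        del row[right + 1:]      # drop columns right of the area
        del row[:left]           # drop columns left of the area
    return duplicate


image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
area = (1, 0, 2, 1)

copied = copy_area(image, area)      # image still contains the invoice
cropped = crop_to_area(image, area)  # image still contains the invoice
removed = cut_area(image, area)      # image no longer contains the invoice
```

All three variants yield the same pixels for the extracted invoice image; they differ in whether the source image is modified and in whether an intermediate full-size duplicate is created.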
The server 120 may be configured to store each extracted invoice image separately in, e.g., the database 180. The invoice images stored in the database 180 may be subsequently accessed for processing (e.g., VAT reclaim processing). In an embodiment, the server 120 may be further configured to automatically submit a VAT reclaim for any or all of the extracted invoice images. The VAT reclaim may be submitted to a refund agency via, e.g., one of the web sources 170 as described further in U.S. patent application Ser. No. 14/836,230, assigned to the common assignee, which is hereby incorporated by reference for all that it contains.
The server 120 typically includes a processing system 122 coupled to a memory 124.
The processing system 122 may comprise or be a component of a processor (not shown) or an array of processors coupled to the memory 124. The memory 124 contains instructions that can be executed by the processing system 122. The instructions, when executed by the processing system 122, cause the processing system 122 to perform the various functions described herein. The one or more processors may be implemented with any combination of general-purpose microprocessors, multi-core processors, microcontrollers, digital signal processors (DSPs), field programmable gate arrays (FPGAs), programmable logic devices (PLDs), controllers, state machines, gated logic, discrete hardware components, dedicated hardware finite state machines, or any other suitable entities that can perform calculations or other manipulations of information.
The processing system 122 may also include machine-readable media for storing software. Software shall be construed broadly to mean any type of instructions, whether referred to as software, firmware, middleware, microcode, hardware description language, or otherwise. Instructions may include code (e.g., in source code format, binary code format, executable code format, or any other suitable format of code). The instructions, when executed by the one or more processors, cause the processing system to perform the various functions described herein.
It should be noted that the embodiments described herein above with respect to
In S210, an image featuring multiple invoices is received. The invoices in the multiple-invoice image may be unorganized such that they are not suitable for immediate processing.
An exemplary and non-limiting multiple-invoice image may be seen in
In S220, visual identifiers are extracted from the multiple-invoice image. Each visual identifier indicates information related to an invoice in the multiple-invoice image. The visual identifiers may include, but are not limited to, a document identification number (e.g., an invoice number), a code (e.g., a QR code, a bar code, etc.), a transaction number, a name of a business, an address of a business, an identification number of a business, a total price, a currency, a method of payment (e.g., cash, check, credit card, debit card, digital currency, etc.), a date, a type of product, a price per product, and so on. Threshold visual identifier requirements (e.g., a number or particular group of visual identifiers) may be identified based on a type of entity for which the multiple-invoice image was captured.
In S230, the extracted visual identifiers are analyzed. The analysis may yield identification of metadata associated with the multiple-invoice image. Such metadata may include, but is not limited to, a number of invoice images in the multiple-invoice image, pointer data indicating an invoice image available via one or more storage units (e.g., the enterprise device 160 or the web sources 170), a purchaser of a transaction, and so on.
In S240, an image area of an invoice image featured in the multiple-invoice image is determined based on the analysis. In an embodiment, the determination may include identifying a boundary of each invoice illustrated in the multiple-invoice image. The image area of an invoice may be defined as the area contained within the boundary of the invoice.
Exemplary determined image areas may be seen in
In S250, the invoice image is extracted from the multiple-invoice image respective of its image area. The extraction may include generating a new file for the invoice image, and may further include cutting, cropping, and/or copying the invoice image in the captured image. Exemplary methods for extracting invoice images from a multiple-invoice image are described further herein below with respect to
Extracting invoice images from a multiple-invoice image via cutting may be seen in
In optional S260, the extracted invoice image may be stored as a file in, for example, a database (e.g., the database 180). Stored invoice images may be subsequently processed further. For example, stored invoice images may be analyzed for value added tax (VAT) reclaim eligibility and/or sent to a refund agency.
In S270, it is determined whether additional invoice images are to be extracted from the multiple-invoice image and, if so, execution continues with S210; otherwise, execution terminates.
Extraction of an additional invoice image from a multiple-invoice image may be seen in
In S310A, an invoice image featured in a multiple-invoice image is identified based on its image area. In S320A, the identified invoice image is cut from the multiple-invoice image. The cut image is removed from the captured image such that it is no longer featured in the multiple-invoice image. In S330A, a new file including the cut invoice image is generated. In S340A, the generated file may be stored in, e.g., a database.
In S310B, an invoice image featured in a multiple-invoice image is identified based on its image area. In S320B, a file including the multiple-invoice image is generated. In S330B, the new file is cropped respective of the identified invoice image. The cropping may include shrinking the size of the generated file such that the cropped file only includes the invoice image. In S340B, the cropped new file may be stored in, e.g., a database.
In S310C, an invoice image featured in a multiple-invoice image is identified based on its image area. In S320C, the identified invoice image is copied from the multiple-invoice image. In S330C, a file including the copied invoice image is generated. In S340C, the generated file may be stored in, e.g., a database.
It should be noted that the embodiments described herein above are discussed with respect to an image featuring multiple invoices merely for simplicity purposes and without limitations on the disclosed embodiments. Images featuring other documents may be utilized without departing from the scope of the disclosure. It should be further noted that visual identifiers other than those related to VAT reclaims may be utilized to identify documents captured within images according to the disclosed embodiments. It should be further noted that the analyzed image may be either captured and sent (e.g., to the server 120) for invoice image extraction, or may be retrieved from a database, without departing from the scope of the disclosure.
The various embodiments disclosed herein can be implemented as hardware, firmware, software, or any combination thereof. Moreover, the software is preferably implemented as an application program tangibly embodied on a program storage unit or computer readable medium consisting of parts, or of certain devices and/or a combination of devices. The application program may be uploaded to, and executed by, a machine comprising any suitable architecture. Preferably, the machine is implemented on a computer platform having hardware such as one or more central processing units (“CPUs”), a memory, and input/output interfaces. The computer platform may also include an operating system and microinstruction code. The various processes and functions described herein may be either part of the microinstruction code or part of the application program, or any combination thereof, which may be executed by a CPU, whether or not such a computer or processor is explicitly shown. In addition, various other peripheral units may be connected to the computer platform such as an additional data storage unit and a printing unit. Furthermore, a non-transitory computer readable medium is any computer readable medium except for a transitory propagating signal.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the principles of the disclosed embodiment and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Moreover, all statements herein reciting principles, aspects, and embodiments of the disclosed embodiments, as well as specific examples thereof, are intended to encompass both structural and functional equivalents thereof. Additionally, it is intended that such equivalents include both currently known equivalents as well as equivalents developed in the future, i.e., any elements developed that perform the same function, regardless of structure.
This application claims the benefit of U.S. Provisional Application No. 62/111,690 filed on Feb. 4, 2015, the contents of which are herein incorporated by reference.
Number | Date | Country |
---|---|---|
20160225101 A1 | Aug 2016 | US |

Number | Date | Country |
---|---|---|
62111690 | Feb 2015 | US |