This invention relates to warehouse inventory management devices, systems and methods.
Regions or activities in a warehouse can generally be classified into a few zones, which are described below in the order in which inventory typically flows through the warehouse.
A first zone is the incoming/receiving zone. A typical warehouse has a receiving area that includes several receiving docks for trucks to pull up and unload their pallets. These pallets are usually scanned, entered into the system, inspected and validated against the accompanying paperwork, and then moved (in whole or after splitting them up into cases, boxes or cartons) to their storage locations within the warehouse. All the steps in this process are currently conducted manually and are thus rather labor intensive. Further, once the incoming pallets or boxes are put away in their respective locations on the racks and shelving, quality control personnel are usually dispatched to verify that the items were indeed put away in the appropriate locations.
A second zone of the warehouse is the storage area, also referred to as the reserve or racking area. In this section of a warehouse, pallets containing cases or boxes are placed on the shelves and stored until they need to be picked and shipped out of the warehouse. Another frequently occurring model is that these cases or boxes are opened up and sub-units are picked from them to fulfill smaller orders, which are then separately packaged and shipped out of the warehouse. In this reserve section, the predominant activities are therefore the put-away of pallets to specific locations on the racks and the picking of items from these pallets or cases, which leaves the inventory locations with partial inventory. A typical warehouse also maintains several quality control personnel whose daily job is to monitor whether the right cases are in the right locations and whether there is the right count of inventory in the partially opened pallets or cases.
A third zone of a warehouse is a packing area. Here, the picked items from the storage area are consolidated and packed into boxes that are meant to be shipped to customers. Once again, quality control personnel are assigned to make sure that each box contains the right order and that the contents of each box correctly reflect the shipping label or bill of lading that would accompany the box.
A final zone of a warehouse is the shipping area. In this section, the individual packing boxes that are intended for a common destination (such as a retail store, a hospital, another business, or even a consumer's home) are consolidated onto a pallet or packing box and shrink wrapped. In some cases, the packing boxes are shipped directly to a destination location. The appropriate shipping labels are applied to the outside of the pallet or box, and the entire pallet or box is loaded onto the truck through a shipping dock door. In this area too, quality control personnel are delegated to inspect and verify that the pallets or boxes have the full complement of constituent boxes; that they have the correct labels; that they are not damaged from handling; that there are customs papers if needed; and that they are loaded onto the truck properly. Until the moment the pallet is loaded onto the truck, the warehouse owns the inventory and has liability for it.
Accordingly, given such a flow in a warehouse, if one contemplates a warehouse with 40,000-70,000 pallets or boxes and a corresponding number of positions on racks and shelves, it can become very expensive to have quality control personnel track and verify the daily activity and the various events that occur in a warehouse. Warehouses sometimes see activity that exceeds 2000-3000 pallets or boxes coming in and leaving each day, and to control costs, only a small fraction of the inventory activity is verified (or audited) by the quality control personnel.
A misplaced box or pallet can prove to be very expensive: when the time comes to pick the box, or to pick items from it, and it cannot easily be found in the location it is supposed to be in, the search can cost hours of expensive manual labor. Further, this could result in shipment delays, which in turn could incur penalties from the customer or the manufacturer/shipper.
Similarly, if the wrong boxes are packaged up for shipment, or the wrong shipment labels are applied or the wrong quantities are picked, this results in shipment errors, which in turn result in reverse logistics related costs as well as loss of customer goodwill.
Further, even if the boxes and pallets are in appropriate locations in the storage areas of the racks in the warehouse, certain types of inventory require that they be stored within specific temperature and humidity ranges. In warehouses where the racks can reach up to 30 feet high, it is difficult to monitor and maintain compliance with these requirements without incurring excessive costs of frequently having a human make these measurements by driving forklifts through each of the aisles.
Accordingly, there is a need in the art for technology that addresses at least some of these problems.
The present invention provides in one embodiment a method of tracking and digitization for warehouse inventory management. A warehouse with inventory locations stores inventory. The warehouse has unique markers throughout the warehouse for tracking location. Examples of the unique markers are warehouse markers on a wall, on a floor, on a bin, on a rack, placed overhead over the inventory locations, identifying an aisle, on light fixtures, or on pillars. These markers may be naturally occurring features that are already part of the warehouse, or specially placed in the warehouse to aid location information, or a combination thereof. The inventory has unique inventory information features for identifying inventory. Examples of the unique inventory information features are manufacturer logos, Stock Keeping Unit (SKU) numbers, Barcodes, Identification Numbers, Part numbers, box colors, or pallet colors.
A vehicle (such as a forklift truck, a pallet jack, an order picker, or a cart) capable of transporting the inventory and sometimes operated by a human operator (i.e. not an automatic vehicle or robot) moves throughout the warehouse and manipulates the inventory (referred to as the manipulation) or supports the manipulation of the inventory by the human operator. A plurality of cameras is mounted on the vehicle. The plurality of cameras is selected from the group consisting of one or more forward-facing cameras with respect to the vehicle, one or more top-down-facing cameras with respect to the vehicle, one or more diagonal-downward-facing cameras with respect to the vehicle, one or more upward-facing cameras, one or more back-facing cameras, and one or more side-facing cameras with respect to the vehicle.
The manipulation is defined as one or more of the steps of moving the inventory with the vehicle or by the operator from an entry of the inventory into the warehouse, storing the inventory with the vehicle at the inventory locations, and picking up the inventory with the vehicle from the inventory locations, up to a departure of the inventory out of the warehouse.
During the movement of the vehicle, images are captured of the unique markers in the warehouse by at least one of the plurality of cameras mounted on the vehicle. Vehicle location information of the vehicle is determined while the vehicle is moving throughout the warehouse by processing the captured images of the unique markers captured by at least one of the plurality of cameras mounted on the vehicle. The process for determining vehicle location information of the vehicle does not have or involve RFID tags or bar codes and furthermore the process for determining vehicle location information of the vehicle does not use RFID sensors for reading the RFID tags or bar code readers for reading the bar codes. In one embodiment, the process for determining vehicle location information of the vehicle only starts when the vehicle is moving.
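By way of illustration only, the following is a minimal sketch of how vehicle location might be resolved from camera detections of the unique markers, assuming a pre-surveyed map from marker identifiers to floor coordinates. The marker identifiers, the coordinates, and the detect_markers stub are hypothetical and are not part of the disclosed system.

```python
# Sketch: resolving a coarse vehicle location from warehouse markers seen by
# the on-vehicle cameras, assuming a pre-surveyed marker map (hypothetical).
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

@dataclass
class MarkerDetection:
    marker_id: str    # e.g. "AISLE-07-PILLAR-3" (hypothetical identifier)
    confidence: float

# Hypothetical pre-surveyed map from marker identifier to floor coordinates (meters).
MARKER_MAP: Dict[str, Tuple[float, float]] = {
    "AISLE-07-PILLAR-3": (21.5, 4.0),
    "AISLE-07-LIGHT-12": (25.0, 4.0),
}

def detect_markers(frame) -> List[MarkerDetection]:
    """Placeholder for the camera-based marker detector described above."""
    raise NotImplementedError

def estimate_vehicle_position(detections: List[MarkerDetection]) -> Optional[Tuple[float, float]]:
    """Coarse fix: average the surveyed positions of all recognized markers."""
    known = [MARKER_MAP[d.marker_id] for d in detections if d.marker_id in MARKER_MAP]
    if not known:
        return None
    xs, ys = zip(*known)
    return (sum(xs) / len(xs), sum(ys) / len(ys))
```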
Images of the inventory are captured with at least one of the plurality of cameras on the vehicle during the manipulation of the inventory. At least one of the captured images is digitized, and unique inventory information features are extracted from the captured images of the inventory during the manipulation. The unique inventory information features uniquely identify the inventory. In one embodiment, the capturing of images of the inventory only starts when the human operator is about to manipulate the inventory.
A unique inventory location of the inventory is determined at the moment of the manipulation by synchronizing the extracted unique inventory information features and the determined vehicle location information of the vehicle. In one embodiment, the vehicle is further outfitted with position and inertial sensors to capture position and movement information of the vehicle and the inventory. The position and movement information could then assist in the determining of the unique inventory location of the inventory. A warehouse inventory management system is maintained with the determined inventory location during the manipulation.
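As an illustrative sketch of this synchronization step, one might match the timestamp of the extracted inventory information feature to the vehicle location fix nearest in time. The data structures and field names below are assumptions for illustration only.

```python
# Sketch: associating an extracted inventory identifier with the vehicle's
# location at the moment of manipulation by matching timestamps.
from bisect import bisect_left
from typing import List, Tuple

def nearest_location_fix(fixes: List[Tuple[float, Tuple[float, float]]],
                         event_time: float) -> Tuple[float, float]:
    """fixes: time-sorted list of (timestamp, (x, y)) vehicle location fixes."""
    times = [t for t, _ in fixes]
    i = bisect_left(times, event_time)
    if i == 0:
        return fixes[0][1]
    if i == len(fixes):
        return fixes[-1][1]
    before, after = fixes[i - 1], fixes[i]
    return before[1] if event_time - before[0] <= after[0] - event_time else after[1]

# Usage: the identifier read from the pick image at t = 102.4 s is recorded at
# the vehicle location fix closest in time to the manipulation (illustrative values).
fixes = [(100.0, (21.5, 4.0)), (103.0, (22.1, 4.0))]
sku, pick_time = "SKU-48213", 102.4
inventory_location = {"sku": sku, "location": nearest_location_fix(fixes, pick_time)}
```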
In one aspect, more than one vehicle could be used in the method, each of which is responsible for specific aspects of the manipulation tasks/steps, or each of which works in parallel with the others and is responsible for all aspects of the manipulation tasks/steps.
In one aspect, the method relies essentially on (e.g. consists essentially of) using cameras for determining a unique inventory location of the inventory.
Aspects of the method require computer hardware systems and software algorithms to execute the method steps on these computer hardware systems. Aspects of the method require computer vision algorithms, neural computing engines and/or neural network analysis methods to process the acquired images and/or sensor data. Aspects of the method require database systems stored on computer systems or in the Cloud to maintain and make accessible the inventory information to users of the warehouse inventory management system.
In a further embodiment, the present invention is an apparatus, system or method to use a combination of human-operated vehicles, drones, sensors and cameras placed at various locations in a warehouse to track every event that occurs in the warehouse in a real-time, comprehensive and autonomous manner. By capturing every such event, a warehouse manager is then able to generate a ‘source of truth’ of the exact state of the warehouse at any given instant—including locations of items, the state of the items, damage, changes in temperature, events such as picks and puts of the inventory, etc.
In still another embodiment, the invention describes an apparatus to mount a series of cameras, sensors, embedded electronics and other image processing capabilities to enable a real-time tracking of any changes in the inventory in the warehouse, and to maintain accurate records of such inventory.
In still another embodiment, the invention includes updating the inventory in the warehouse management system when the inventory is picked from the unique inventory location or put away to the unique inventory location.
In still another embodiment, the invention includes verifying that a correct number of inventory items has been picked from the unique inventory location or put away to the unique inventory location.
In still another embodiment, the invention includes building a digital map of the unique inventory locations of the inventory in the warehouse.
In still another embodiment, the invention includes using software to obscure faces to maintain privacy.
In still another embodiment, the invention includes using face recognition software to recognize faces for security in the warehouse.
In still another embodiment, the invention includes using face recognition software to ensure that only certified vehicle operators are operating the vehicles.
In still another embodiment, the invention includes utilizing vehicle location information throughout a day or time window to improve productivity and efficiency. In one example the method includes tracking labor and equipment productivity. Based on the tags that are mounted on the various shelves in the warehouse and the sensors and cameras that are mounted on the vehicles, one can track the location of each vehicle (e.g. forklift) at any given time.
In still another embodiment, the invention includes handling Multi-Deep Shelving. In many instances, the boxes in the warehouses are not large enough to occupy the entire depth of a rack, which could be as much as 5 feet. Accordingly, warehouses stack boxes in a multi-deep manner: the boxes are stacked one in front of the other.
Embodiments of the invention have the capability to greatly increase the visibility of the events at a warehouse, provide a comprehensive cataloging of every single event, compare that event against the expected event, and report any discrepancies immediately so that they can be fixed before causing costly mistakes. Further, they reduce the need for costly quality control personnel in the warehouse. Simply put, embodiments of this invention greatly enhance the accuracy of inventory, at a vastly reduced cost.
In an indoor environment, GPS cannot be used to track the location of the forklifts or vehicles in the warehouse because most warehouses have metal constructions and present a “GPS denied” environment. Hence one must resort to vision, lidar, or inertial sensors, or a combination of such sensors, to accurately track location.
Embodiments of this invention are more effective than placing fixed cameras or sensors in the warehouse. Fixed cameras need to be placed at very close proximities to each other to detect the movement of forklifts to any degree of precision. Given the large sizes of warehouses, such fixed cameras make the solution excessively expensive and commercially non-viable. Further, fixed cameras require power and other infrastructure routing to many thousands of locations in the warehouses, including ceilings, racks, and pillars, which makes the solution even more expensive to maintain. A large number of cameras also significantly increases the data transmission and data processing bandwidth requirements, which further decreases the attractiveness of this solution.
In a general overall scope or pipeline of the invention for inventory management in a warehouse, a drone scans the aisles and captures information from pallets and boxes that are stored on the racks. Further to the overall scope are additional capabilities, such as those described below.
In the receiving and shipping locations of the warehouse, embodiments of the invention describe an archway near the receiving and shipping dock doors of the warehouse. This archway (also known as the QC Gate) has vertical and horizontal beams on which are mounted a series of cameras and sensors. Whenever these sensors sense that a forklift truck is entering or leaving the warehouse with pallets, they immediately turn on the cameras and sensors, which capture the information from the incoming or outgoing pallets. This information is processed by the Computer Vision and Image Processing software to stitch together all the information and extract information such as shipment labels, box dimensions, damage to the boxes, or any other information deemed critical by the warehouse manager. This information is then compared against the Warehouse Management System (WMS) to determine if there are any discrepancies between the incoming or outgoing bills of lading and the actual shipment. More details on the method of image processing of the QC Gate are provided in the PIPELINE section infra.
Another area in the warehouse that needs to be tracked is the packing area. In a typical warehouse, the picked items are packaged into boxes that are then consolidated into larger pallets or boxes for shipment. However, the warehouse needs to conduct Quality Control checks on each box to ensure that the box contains the appropriate items, the correct numbers of each item, the correct SKU, no damage to the item, etc. The QC Station according to an embodiment of this invention involves the steps described in the QC Station pipeline below.
The entire QC Station is especially valuable and relevant in many reverse logistics warehouses, where the warehouse is responsible for repairing and sending back items to individual locations. An example could be a phone repair or a laptop repair facility: the warehouse operator is required to refurbish and pack thousands or millions of shipments with the appropriate phone, the charging unit, the earpiece set, the manual, etc. To ensure that each shipment indeed contains the required items, the QC Station can be deployed.
Now a key component to the embodiments in this invention, and one contributing to the overall scope for inventory management in a warehouse, is the PickTrack, which is Event Tracking during item picking in the warehouse. In one embodiment, this involves the individual racks and shelves in the aisles of the warehouse.
This invention also includes attaching special cameras and sensors to vehicles (e.g. forklifts, which are used as an example, although the invention is not limited to forklifts) and picking equipment/inventory in the warehouse. These cameras and sensors are positioned strategically around the forklift so that they can capture the location of the forklift at any given instant and also the motion of the warehouse worker who is performing the picking action. This embodiment works in the manner described in the PickTrack pipeline below.
This same scheme of using the cameras on the forklifts can also be used for multiple other event tracking functions within the warehouse operations. Several such applications and use cases are listed below:
There are also other aspects of the embodiments that become important in the context of deploying it across the entire warehouse in the manner described above.
Scan and inspect the outbound or inbound shipment pallet for the following:
The setup on the platform is demonstrated pictorially in the accompanying figures.
Each beam of the gate has multiple cameras mounted on it, which record the pallet as it moves through the gate.
The workflow of this pipeline is shown in the accompanying figures.
A machine learning network is applied to detect and produce masks around boxes, text, and damage. The masks of text regions are then used to crop the original image, and the crops are given as input to the text recognizer network. Since the orientation of the text is not known, the cropped images are also flipped vertically to cover cases where boxes are placed upside down. Even partial boxes and text regions are detected. Once text is recognized, boxes and text are associated by checking overlap using a metric called intersection over union (IoU).
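The following sketch illustrates the IoU-based association of recognized text with detected boxes, using axis-aligned bounding boxes. The detection and text-recognition networks themselves are assumed to be supplied upstream, and the threshold value is illustrative.

```python
# Sketch: associating recognized text regions with detected boxes by
# intersection over union (IoU) on axis-aligned bounding boxes.
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)

def iou(a: Box, b: Box) -> float:
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def associate(boxes: List[Box], texts: List[Tuple[Box, str]], thresh: float = 0.1):
    """Assign each recognized text string to the box it overlaps most."""
    pairs = []
    for t_box, t_str in texts:
        best = max(range(len(boxes)), key=lambda i: iou(boxes[i], t_box), default=None)
        if best is not None and iou(boxes[best], t_box) >= thresh:
            pairs.append((best, t_str))
    return pairs
```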
When the incoming forklift is identified, camera recording starts a few seconds before the forklift actually crosses the gate. Similarly, recording is stopped a few seconds after the forklift has passed the gate. This results in the recording of a few extra frames with no relevant data, which should be excluded from further analysis. To do this, the existence of a box in each frame is identified through the previously detected output masks. Then, for each frame-set across cameras, the statistical mode is applied to identify whether the particular frame-set is relevant. The largest contiguous block of relevant frame-sets is chosen for further processing.
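A minimal sketch of this frame-relevance filtering is given below, assuming a per-frame, per-camera flag indicating whether a box mask was detected. The majority vote uses the statistical mode across cameras, and the largest contiguous run of relevant frame-sets is retained.

```python
# Sketch: trimming irrelevant frames recorded before/after the pallet passes
# the gate, by voting per frame-set and keeping the longest relevant run.
from statistics import mode
from typing import List

def relevant_range(box_present: List[List[bool]]) -> range:
    """box_present[frame_index][camera_index] is True if a box mask was found."""
    relevant = [mode(cams) for cams in box_present]   # majority vote per frame-set
    best_start, best_len, start = 0, 0, None
    for i, flag in enumerate(relevant + [False]):     # sentinel closes a trailing run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start > best_len:
                best_start, best_len = start, i - start
            start = None
    return range(best_start, best_start + best_len)
```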
Stitching is performed in two ways: intra-camera and inter-camera. The frames from each camera are used to perform intra-camera stitching. To perform stitching, pair-wise images are taken and features are extracted. After feature extraction, the features are matched to obtain correspondences. Feature matching is evaluated through metrics to filter out weak matches. Strong matches are then carried forward to compute the homography matrix transformation between the images. The same transformation derived from the images is then applied to the box and text coordinates as well.
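The following sketch illustrates pairwise stitching using OpenCV. ORB features and RANSAC-based homography estimation stand in for whichever feature detector and match-filtering metrics are used in practice, and the match-count thresholds are assumptions.

```python
# Sketch: pairwise intra-camera stitching via feature matching and homography.
import cv2
import numpy as np

def pairwise_homography(img_a, img_b, min_matches: int = 10):
    orb = cv2.ORB_create(2000)
    kp_a, des_a = orb.detectAndCompute(img_a, None)
    kp_b, des_b = orb.detectAndCompute(img_b, None)
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
    matches = sorted(matcher.match(des_a, des_b), key=lambda m: m.distance)
    strong = matches[: max(min_matches, len(matches) // 2)]  # drop weak matches
    if len(strong) < min_matches:
        return None
    src = np.float32([kp_a[m.queryIdx].pt for m in strong]).reshape(-1, 1, 2)
    dst = np.float32([kp_b[m.trainIdx].pt for m in strong]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H  # the same H is then applied to box and text coordinates

def warp_points(points, H):
    """Apply the image homography to box/text corner coordinates."""
    pts = np.float32(points).reshape(-1, 1, 2)
    return cv2.perspectiveTransform(pts, H).reshape(-1, 2)
```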
Inter-camera stitching is performed using the stitched images from each camera as input. Known positional information of the cameras is used to determine the overlap direction and give a more accurate mask for feature detection. For example, when images from Camera1 and Camera2 are stitched, features are extracted from the bottom half of Camera1 and the top half of Camera2. This helps to avoid picking up features from non-overlapping regions. This task is performed for all intra-camera stitched images to obtain a full stitched image of the pallet. The homography matrices computed for the color images are then also used to compute stitched images for the object masks (see FIG. 8). Individual masks having overlap in the stitched images are merged together using a threshold value of IoU.
The stitched object mask images are used to analyze the boxes for dimensions, damage, text, and units. All the individual object masks are brought to the same stitched canvas, and the same homography transformations are applied to the object masks. Some boxes are captured only partially in each frame; the transformed boxes are merged based on the overlap metric IoU. After merging all the masks, the consolidated output is evaluated to find boxes and their corresponding text and damage extent. This process of stitching and consolidation is performed on the data from all 3 sides of the gate. Since distance sensors are also placed with each camera, there is enough information to form a 3D model of the pallet using the output from all 3 cameras. Using the distance values from the sensors, a mapping between pixels on the stitched image and physical spacing is obtained, which yields the physical dimensions of the objects.
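As an illustrative sketch of the pixel-to-physical mapping, one possible approximation uses the pinhole camera model together with the per-camera distance reading; the focal length in pixels is a calibration value assumed for illustration, as the exact mapping is not specified here.

```python
# Sketch: converting a pixel extent on the stitched image into a physical
# dimension using the distance sensor reading and a pinhole-camera
# approximation (the focal length in pixels is an assumed calibration value).
def pixels_to_meters(pixel_length: float,
                     distance_to_object_m: float,
                     focal_length_px: float) -> float:
    """Physical size is roughly pixel size * distance / focal length."""
    return pixel_length * distance_to_object_m / focal_length_px

# Example: a box edge spanning 820 px, seen from 1.5 m with a 1400 px focal
# length, is roughly 0.88 m long (illustrative numbers only).
box_edge_m = pixels_to_meters(820, 1.5, 1400)
```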
The algorithm output is compared with data from the warehouse database. The aim is to identify any discrepancy and report it to the operator. The discrepancies covered by the analysis include an incorrect number of boxes, incorrect sizes, incorrect tags, and damaged items. Once discrepancies are identified, one can notify the operator with links to the original images for manual verification.
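A minimal sketch of the discrepancy check against the warehouse database is given below. The record fields (expected_boxes, expected_tags) are illustrative and do not represent an actual WMS schema.

```python
# Sketch: comparing the vision output against a WMS record for one shipment
# and collecting human-readable discrepancies for the operator.
from typing import Dict, List, Set

def find_discrepancies(wms: Dict, observed_tags: Set[str],
                       observed_box_count: int,
                       damaged_tags: Set[str]) -> List[str]:
    issues = []
    if observed_box_count != wms["expected_boxes"]:
        issues.append(f"box count {observed_box_count} != expected {wms['expected_boxes']}")
    missing = set(wms["expected_tags"]) - observed_tags
    stray = observed_tags - set(wms["expected_tags"])
    issues += [f"missing tag {t}" for t in sorted(missing)]
    issues += [f"unexpected tag {t}" for t in sorted(stray)]
    issues += [f"damage detected on {t}" for t in sorted(damaged_tags)]
    return issues  # each issue is reported with a link to the original image
```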
Count and verify the items picked from or placed into a box or inventory location through visual imagery of the activity performed. The scene is captured from multiple cameras to cover the activity from different perspectives. The aim is to identify and subsequently verify the number of items involved in the transaction and to flag any potential discrepancies.
The setup on the platform is shown in the accompanying figures.
The cameras are mounted on the vehicle at multiple locations to capture the activities from different viewpoints. If the items are occluded in one of the viewpoints, images from the other cameras can be used to fill in the information. This helps mitigate the issue of potential occlusion, as no constraint is placed on user behavior. The recording is triggered when the vehicle stops at a certain location, or when a certain action is detected. The text and bar-code information at the location, as well as on the box, is captured to triangulate the position in the warehouse. The video recording stops when the vehicle starts moving again. The video recording covers all the activities that the operator performs at the location to pick or place items. Some example actions are shown in the accompanying figures.
The first step is to identify the parts of the video (video segments) where different activities such as unboxing, picking, and placing are performed. These activities can take place multiple times in a video. A pre-defined window of small time duration (a few frames) is taken and slid across the video to identify the action in each window. An Activity Recognition network can be used to perform this task. This is done on the frames from all cameras. For each camera, the window is slid across all frames and an activity is identified for each frame (the output of activity recognition on the window centered around that frame). Contiguous blocks of each activity are then detected by taking the statistical mode across cameras.
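The following sketch illustrates the sliding-window labeling and the cross-camera vote. The classify_window callable is a placeholder for the Activity Recognition network.

```python
# Sketch: sliding a short window over each camera's frames, labeling each
# frame with the activity of its centered window, and voting across cameras.
from statistics import mode
from typing import Callable, List, Sequence

def per_frame_activities(frames: Sequence, window: int,
                         classify_window: Callable[[Sequence], str]) -> List[str]:
    """Label each frame with the activity of the window centered on it."""
    half = window // 2
    labels = []
    for i in range(len(frames)):
        lo, hi = max(0, i - half), min(len(frames), i + half + 1)
        labels.append(classify_window(frames[lo:hi]))
    return labels

def fuse_cameras(per_camera_labels: List[List[str]]) -> List[str]:
    """Majority vote across cameras for every frame index."""
    return [mode(frame_labels) for frame_labels in zip(*per_camera_labels)]
```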
Once the timestamps of each picking and placing activity are determined, the segmented videos are analyzed to determine the number of items involved in the activity. This is done by multiple approaches: object tracking, fine action recognition, and change detection. Each of these approaches is explained below.
The item count from each of these approaches is computed along with a confidence score. An intelligent confidence-based voting system is then used to compute the final number of objects.
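As an illustrative sketch, the confidence-based vote might accumulate each approach's confidence onto its proposed count and pick the count with the highest total; the weighting scheme here is an assumption.

```python
# Sketch: fusing item counts from the three approaches (object tracking,
# fine action recognition, change detection) with confidence-weighted voting.
from collections import defaultdict
from typing import List, Tuple

def vote_count(estimates: List[Tuple[int, float]]) -> int:
    """estimates: (item_count, confidence in [0, 1]) from each approach."""
    scores = defaultdict(float)
    for count, confidence in estimates:
        scores[count] += confidence
    return max(scores, key=scores.get)

# Example: tracking says 3 (0.9), action recognition says 3 (0.6),
# change detection says 2 (0.7) -> the fused count is 3.
final_count = vote_count([(3, 0.9), (3, 0.6), (2, 0.7)])
```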
Analysis of each segment of the whole video gives the number of objects picked or placed. Finally, the counts from all segments are combined to obtain the total number of items exchanged in the complete transaction. The final number of items remaining in the box is computed by the equation below:
Items_final = Items_initial − Σ N_picked + Σ N_placed
The algorithm output is compared with the warehouse database. The aim is to identify any discrepancy and report it to the inventory clerk. The discrepancies covered by the analysis include an incorrect number of items picked or placed. Once discrepancies are identified, the clerk can be notified with links to the corresponding videos for manual verification.
Verify the pick-list generated from the WMS against visual imagery of the tote placed on a platform. The tote can be captured from multiple cameras to cover the full view. The aim is to identify and subsequently verify the items against the pre-generated pick-list and to flag any potential discrepancies.
The setup on the platform is demonstrated pictorially in the accompanying figures.
The fonts on the boxes are small in physical dimensions, and the camera is limited in terms of field-of-view and resolution per inch. This leads one to use a multi-camera setup to capture the tote. The cameras are arranged in a grid fashion, with each camera having an overlapping field-of-view with its neighbor. This allows the captures from all cameras to be registered (stitched) on a single canvas to obtain a consolidated output. In addition, in the event there are multiple QC Stations in the warehouse, it may be necessary to identify the location and identity of each specific QC Station in the warehouse. For this purpose, the embedded sensor module can also contain other sensors and cameras to detect the specific location of the QC Station in the warehouse. The setup is explained graphically in the accompanying figures.
A machine learning network is applied to detect boxes and text regions. The output is given in a format of center coordinates, dimensions, and angle from horizontal. The bounding boxes of text regions are then used to crop the original image, and the crops are given as input to a text recognizer network. The cropped images are also flipped vertically to cover cases where boxes are placed upside down. Even partial boxes and text regions are detected. Once text is recognized, boxes and text are associated by checking overlap. This inference process is shown in the accompanying figures.
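The following sketch illustrates converting a detection given as center, dimensions, and angle into an axis-aligned crop for the text recognizer, using OpenCV's rotated-rectangle helpers. The vertical flip follows the description above; the helper itself is illustrative.

```python
# Sketch: turning a (center, dimensions, angle) detection into corner points
# and an axis-aligned crop of the text region for the recognizer network.
import cv2
import numpy as np

def crop_text_region(image, center, size, angle_deg):
    corners = cv2.boxPoints((center, size, angle_deg))   # 4 x 2 corner points
    x, y, w, h = cv2.boundingRect(np.int32(corners))     # axis-aligned envelope
    x, y = max(x, 0), max(y, 0)
    crop = image[y:y + h, x:x + w]
    return crop, cv2.flip(crop, 0)  # also return the vertically flipped crop
```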
The frames are stitched in anti-clockwise order. To perform stitching, pair-wise images are taken and features are extracted. Known positional information of the cameras is used to determine the overlap direction and give a more accurate mask for feature detection. For example, when frames from Camera1 and Camera2 are stitched, features are extracted from the bottom half of Camera1 and the top half of Camera2. This helps avoid picking up features from non-overlapping regions. After feature extraction, the features are matched to obtain correspondences. Feature matching is evaluated through metrics to filter out weak matches. Strong matches are then carried forward to compute the homography matrix transformation between the images. The same transformation derived from the images is then applied to the box and text coordinates as well. This stitching process is shown in the accompanying figures.
All the individual frames are brought to the same stitched canvas. The same transformations are also applied to the box and text detections. Some boxes are captured only partially in each frame; the transformed boxes are merged based on the overlap metric called intersection over union (IoU). After merging all the boxes, the consolidated output is evaluated to find boxes and their corresponding tags. The consolidation process is depicted in the accompanying figures.
The algorithm output is compared with the generated pick-list. The aim is to identify any discrepancy and report it to the operator. The discrepancies covered by the analysis include missing boxes, incorrect boxes, and stray boxes. Once discrepancies are identified, one can notify the operator with links to the original images for manual verification. The discrepancy analysis is illustrated in the accompanying figures.
This application claims priority from U.S. Provisional Patent Application 63/030543 filed May 27, 2020, which is incorporated herein by reference.