The disclosure relates to manufacturing execution systems (MES) which are used in manufacturing to track and document the transformation of raw materials to finished goods. More particularly, this disclosure relates to systems and methods for intelligence augmentation of a human in a human-in-the-loop flexible manufacturing system in non-automated and semi-automated complex manufacturing.
Whilst there has been considerable development from the MES perspective to allow for seamless integration of manufacturing and business processes, there has been little work on the seamless integration of the human into those processes. Current approaches require the human to control or interact with the interfaces, such as reading from screens, inputting values, and selecting from menus. Thus, the system and the human are independent.
Depending on the complexity of the system, information can be entered digitally or through a physical hardcopy which is then stored. In low-complexity systems where a physical hardcopy is stored, the MES does not receive information back from production. No confirmation is received that a work order or individual unit has been completed and, as such, this is an open-loop system.
As part of a product's manufacture, a work instruction is also needed. These are often large, detailed documents, and the volume of information they contain can lower usability and make them difficult for human users to consult as a reference. This work instruction can be stored digitally or in hard copy, again depending on the complexity of the manufacturing system.
As such, it is not only difficult for users to follow such a work instruction, it is also difficult to ensure that a work instruction has been followed. Currently, this is done through a quality inspection of the product, to ensure the manufacture is to standard. However, this does not ensure that the correct procedure was followed to achieve that standard. In addition, post-production sampling is not sufficiently responsive in a fast-changing environment to deliver Zero Defect manufacturing. Real-time quality checks during assembly must be performed, making it possible to detect mismatches before any product loss occurs.
The current view is that the skilled human is seen as a crucial resource for their capacity to manage the complexity of flexible manufacturing. Experiential knowledge, informal expertise, and tacit knowledge are essential even in routine assembly. So whilst humans induce quality errors, what currently saves the human from displacement in complex manufacture is that unique contribution they bring to successful execution of the manufacturing process.
There is a problem in manufacturing digitalization that huge quantities of data are being created and stored, which then have to be processed, cleaned and analysed, a problem commonly referred to as the ‘curse of dimensionality’. Traditionally, data can be stored in two types of queues: ‘First In, First Out’ and ‘Last In, First Out’, more commonly known as FIFO and LIFO respectively. An issue arises when a signal is received by a system using these data queuing techniques. The system may only capture data either before the event occurs, or after, but not both. This results in an inability to fully capture and contextualise all of the data relating to an event or events. The approach in other systems is to record everything, which can then be searched through if there is a problem, or alternatively to post-process the video record of the day's work by cutting it into time slices or applying AI to identify trends, all of which is prohibitively time consuming and requires significant processing capability.
European patent publication number EP 3 611 676, assigned to Boeing Co, discloses automated supervision and inspection of an assembly process at an assembly site. The Boeing patent publication's primary function is the production of quality reports, supervision and inspection based on very large data sets (three-dimensional global mapping). The Boeing system processes huge volumes of data using deep neural networks, generating three-dimensional point clouds of the assembly site and production to which surface patches are fitted. The data processing requirements are exceptionally large. The data required to configure such a system suits the production to which it is applied, i.e. assembly of plane wings and other large-scale production of large-scale items. Deep neural networks such as this cannot satisfy real-time processing requirements without significant run-time memory cost to cope with the thick model architecture. The Boeing system is not suitable for flexible manufacturing where the product changes rapidly, down to batch-size manufacturing.
There is therefore a need to provide an improved system and method for use in complex manufacturing execution systems (MES) to overcome at least one of the above mentioned problems.
In accordance with a first aspect of the invention there is provided, as set out in the appended claims, a production process execution system which comprises:
The provision of contextualised production instructions and assistance in real time through intelligence augmentation of humans/users according to the invention reduces the opportunity for error and the need for intensive management of quality in manufacturing. The present invention provides two primary functions: (a) the provision of intelligence augmentation to the human through support, instruction and feedback, so that the skilled operator in complex manufacturing is integrated into the system, moving away from a supervisory system towards a collaborative system, and (b) the production of a digital record of the manufacture. The invention can generate a digital record of each production step in an image or video format with contextual data overlaid on the frames. This digital record has a unique configurability in terms of quality in frames-per-second, length of the video, and whether only critical parts of the process should be recorded, so as to minimize the amount of data to be processed and stored.
In at least one embodiment, the cognition layer makes use of neural networks to identify one or more representations of the production process, from one or more sensors, to provide feedback.
The monitoring module can comprise multiple neural network vision-systems working in parallel, checking individual regions of interest which allow for a multi-dimensional representation of the object of interest, so as to achieve multi-factor confirmation. This permits thin neural networks, which allow faster training and inference and which can be installed on a common computer.
It will be appreciated that the thin neural networks trained initially can be further augmented by capturing knowledge from the video frames used in the multi-factor confirmation that have passed the confidence index; frames that capture new, better or alternative execution of the process are collated and fed back into the neural network; the neural network is retrained. This allows for the creation of a model which is able to account for nuances or tacit knowledge such as how an object was held, positioning, background, lumens of the environment, etc., in the scene in which it tests for the presence of the trained model.
In at least one embodiment of the present invention, the feedback comprises a system wide signal confirmation of a step complete state.
In at least one embodiment of the invention, the control layer receives instructions on production requirements, human-authentication, location and context-specific information and provides context-specific instructions to the user.
In at least one embodiment of the invention, the control layer takes a record of the production process and the associated production data at each stage in production process as received from the recording modules.
In at least one embodiment of the invention the system uses data transfer protocols to receive production instructions from external production lifecycle manufacturing systems and manufacturing execution systems.
In at least one embodiment of the invention the system operates as a standalone system. It will be appreciated that the system does not integrate with production lifecycle manufacturing systems and manufacturing execution systems, with process data being stored within the independent system.
In at least one embodiment of the invention the system uses data transfer protocols to send production process completion information to those external systems automatically without the need for user input.
In at least one embodiment of the invention, the system manages multi-user concurrent access across a range of manufacturing use-cases.
In at least one embodiment of the invention the system uses cloud-based services, fog-computing, edge-computing, networked-computing, or local-computing.
In at least one embodiment of the invention, the cognition layer uses multi-threading to achieve real-time performance.
In at least one embodiment of the invention a series of class objects represent the production process, link the trained models, the video, aural and image assets, and any camera configurations to the corresponding sub-step.
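By way of non-limiting illustration, such a series of class objects might be sketched in Python as follows; the class and attribute names (ProcessStep, model_paths, and so on) are assumptions made for illustration only and do not appear in the disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class CameraConfig:
    """Illustrative camera configuration for one sub-step."""
    camera_id: int
    region_of_interest: Tuple[int, int, int, int]  # (x, y, width, height) within the frame

@dataclass
class ProcessStep:
    """Links a production sub-step to its trained models and media assets."""
    step_number: int
    description: str
    model_paths: List[str] = field(default_factory=list)       # trained (thin) neural network files
    video_asset: str = ""                                       # instructional video
    audio_asset: str = ""                                       # aural instruction
    image_asset: str = ""                                       # reference image
    cameras: List[CameraConfig] = field(default_factory=list)   # camera configurations for this sub-step
    confidence_threshold: float = 0.9                            # confidence index threshold (cit)

@dataclass
class ProductionProcess:
    """A production process represented as an ordered series of sub-steps."""
    process_id: str
    steps: List[ProcessStep] = field(default_factory=list)
```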
In at least one embodiment of the invention, the Cognition Layer makes use of neural networks to identify one or more representations of the production process, from one or more sensors, to produce a system wide signal confirming the step complete state.
In at least one embodiment of the invention the monitoring modules comprise sensors which monitor manufacturing variables including temperature, pressure and tolerances.
In at least one embodiment of the invention the manufacturing variables are captured and overlaid on frames or images then stored as a permanent record of production.
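Purely as an illustrative sketch, overlaying captured variables on a frame before it is written to the permanent record could be done with OpenCV as below; the variable names, colours and layout are assumptions, not part of the disclosure.

```python
import cv2

def overlay_variables(frame, variables):
    """Overlay manufacturing variables (e.g. temperature, pressure) on a video frame."""
    y = 30
    for name, value in variables.items():
        cv2.putText(frame, f"{name}: {value}", (10, y),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.7, (0, 255, 0), 2)
        y += 25
    return frame

# Example usage (hypothetical values):
# frame = overlay_variables(frame, {"temperature": "21.4 C", "pressure": "1.02 bar", "unit": "A-0042"})
# cv2.imwrite("record_step_12.png", frame)
```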
In at least one embodiment of the invention the monitoring or digital record module includes a video monitor which can capture one or more video frames of the user undertaking a task during the production process and store the video frames with a unique job identifier.
In at least one embodiment of the invention, the production process execution system can flag uniquely identified objects to be checked for compliance to quality.
In at least one embodiment of the invention, the monitoring module comprises an artificial intelligence vision system.
In at least one embodiment of the invention, a flag can be turned on when the artificial intelligence vision system determines that the user is not working in line with the implemented steps.
In at least one embodiment of the invention, the implementation modules load just-in-time context-specific intelligence, product line information and code executed for each stage of the manufacturing process.
In at least one embodiment of the invention, the implementation modules display context-specific instructions as to the next part of the production process to the human, through a multiplicity of signals.
In at least one embodiment of the invention the monitoring modules transfer multiple live video stream feeds to the control layer.
In at least one embodiment of the invention the cognition layer recognizes within those video stream feeds the desired state of the object, by comparison to image-based neural networks.
In at least one embodiment of the invention the cognition layer generates a state-achieved signal once a desired step complete state has been achieved.
In at least one embodiment of the invention confirming feedback is provided to the user through a user interface on receipt of the step-complete signal.
In at least one embodiment of the invention the next steps to be executed are presented to the user on receipt of the step-complete signal.
In at least one embodiment of the invention, the production process can be split into sub-stages, so as to allow multiple users to operate concurrently for load balancing.
In at least one embodiment of the invention, receipt of the step-complete signal causes a video or digital record to be taken of a predetermined time before the signal and a predetermined time after the signal.
In at least one embodiment of the invention, the predetermined time before the signal and the predetermined time after the signal have the same length. Thus a record of what the user did immediately before, at, and immediately after the critical assembly is kept; in this way unnecessary frames of the scene are not stored.
In at least one embodiment of the invention the video record is of a configurable quality in frames-per-second, so as to minimize the amount of data to be processed and stored. In this way, anywhere from a single frame up to a configured maximum number of frames can be stored as a record. Different frame rates can be set for different parts of the process, thus optimising storage.
In at least one embodiment of the invention, specified critical parts of the process, and those critical to quality can be captured, thus only the relevant parts of the process, as flagged, are stored.
In at least one embodiment of the invention, on receipt of the step complete signal, a light is activated to confirm to the human that the step has been completed satisfactorily.
In at least one embodiment of the invention, on receipt of the step-complete signal, a relay is activated to produce a sound to confirm to the human that the step has been completed satisfactorily.
In at least one embodiment of the invention, the monitoring modules comprise a silent monitor or silent partner which monitors the user during the execution of their work, and provides assistance to the user if the cognition layer determines that the user needs assistance.
Advantageously, the user receives the information at the right time, in the right place and within the right context; thus they receive a context-specific feed of instructions.
In at least one embodiment of the invention, the cognition layer comprises an image-based neural network as a vision system to classify and locate objects.
In at least one embodiment of the invention, the monitoring module comprises multiple neural network vision-systems working in parallel, checking individual regions of interest which allow for a multi-dimensional representation of the object of interest, so as to achieve multi-factor confirmation. In this way signaling to the user is achieved confirming that the assembly is correct from all angles. In so doing, better quality checking is performed, as checks are made from many angles, and checking operates faster than it would from a single angle which may be obscured from the camera.
In at least one embodiment of the invention, the monitoring module comprises neural networks which identify the location of an object and establish a datum such that future regions of interest are then established from this datum for the location of future models, reducing the impact of changes to equipment set-ups and potential moving of the vision system.
In at least one embodiment of the invention, a manual override to the pointed region of interest can be activated, whereby the camera or the objects can be moved, by human or machine, so that a visible frame of reference, such as may be displayed with a bounding box, is aligned with an appropriate image that the neural net is looking for.
In at least one embodiment of the invention, the cognition layer comprises a neural network image training set, preferably a thin neural network. In one embodiment the cognition layer comprises multiple thin neural networks working in parallel, focused on a region of interest, configured to perform image recognition in real time.
In at least one embodiment of the invention, the cognition layer operates by taking a single image given by the human and performing a number of alterations to that image in order to create a very large data set which captures variances within the data and allows for generalisation within the data; the data set is then given to the neural network to complete the training of a model.
In at least one embodiment of the invention, said images are made transparent, cropped, rotated, and have different backgrounds, lights and shadows applied.
Advantageously, the process steps can be applied in any order.
In at least one embodiment of the invention, the cognition layer captures knowledge from the video frames used in the multi-factor confirmation to retrain the neural network so that new and better, or alternative, processes can be captured.
In at least one embodiment of the invention the neural networks can be retrained with images taken from previous approved states.
In at least one embodiment of the invention the images will have been approved previously via a confidence index threshold (cit), thus images will have a confidence index somewhere between cit and 1.
In accordance with a second aspect of the invention there is provided a method for creating a video digital record of an event, the method comprising the steps of:
In at least one embodiment, the method is used to create a video record in the production process execution system in accordance with the first aspect of the invention.
In at least one embodiment, the queue length is configurable by a user.
In at least one embodiment, a frame rate for the video frames is configurable by a user.
In at least one embodiment, the queue length and/or the frame rate can be adjusted based on the event which is being monitored.
In at least one embodiment, the video digital record is created by having a video camera viewing an area of interest.
In at least one embodiment, the video camera captures live video frames constantly.
In at least one embodiment, the video frames are passed through the queue-based data structure on a first in, first out basis.
In at least one embodiment, the notification that an event of interest has occurred is provided by an AI-driven vision system.
In at least one embodiment, the notification that an event of interest has occurred is provided by an IoT sensor.
In at least one embodiment, the video frames to be stored from the queue-based data structure are selected centre-forward and centre-backward, on a first in, first out basis, from the notification point that an event of interest has occurred, essentially forming a Centre Forward Centre Backward (CFCB) queue.
In at least one embodiment, the method of the second aspect of the invention is used to create a video record in the production process execution system of the first aspect of the invention.
In accordance with a third aspect of the invention there is provided a method for training a model to recognise features in an image, the method comprising the steps of:
In at least one embodiment, the method is used to create a neural network image training set in accordance with the first aspect of the invention.
In at least one embodiment, the alterations are carried out in series such that subsequent alterations are performed on previously altered images.
In at least one embodiment, the alterations are carried out on a previously altered image.
In at least one embodiment, the alterations are carried out on an immediately previously altered image.
In at least one embodiment the invention is used for the training of image based neural networks which work in conjunction with vision systems.
In at least one embodiment, alterations comprise a plurality of alterations.
In at least one embodiment, alterations comprise the removal of the background of the image by making it transparent.
Advantageously, this allows the picture to focus on the item and remove reliance on the background of the scene.
In at least one embodiment, the image with transparent background is cropped and re-scaled x number of times.
Advantageously, this enables a model to be trained to account for the scale of the item it is looking for, meaning the item can be detected at a greater range of distances from the camera. Also, the camera can identify the item even if not all of the item is within the frame, i.e. with a portion of the item out of frame.
In at least one embodiment, the cropped image is rotated at a number of increments of degrees between 0° and 360°.
In at least one embodiment, the increment of degrees, y, is variable, ranging from 1 to 360 degrees, to allow the vision system to correctly identify the object from many different angles, meaning that the trained model is rotation invariant.
In at least one embodiment, a number of background images, b, are added in order to represent the item being viewed in different scenes.
In at least one embodiment, the background scenes represent the likely scenes in which the item may be viewed; for example a person may be positioned on a road, at a desk or in a landscape, and a manufacturing component might be positioned on a workbench with a black, stainless steel, white or grey background.
In at least one embodiment, a number of light effects, e, are added to the images.
In at least one embodiment, these include contrast or pixel intensity. This allows for the creation of a model which is able to account for changes in the lumens of the environment in which it is testing for the presence of the trained model.
In at least one embodiment, the images are saved and added to a collective data set, which can be used to train a machine learning model.
In at least one embodiment a number of variant images can be generated and added to the data set, allowing for the creation of a more robust model.
In at least one embodiment, the dataset is large with respect to the amount of data in the original image.
In at least one embodiment, the method of the third aspect of the invention is used to train a model to recognize features in an image in the production process execution system of the first aspect of the invention.
In accordance with a fourth aspect of the invention there is provided a method for creating a digital record of a manufacturing process, the method comprising: monitoring the manufacturing process
In at least one embodiment, the video record is saved to a database along with a number of key process variables.
In at least one embodiment, the method is used to create a video record of the production process in accordance with the first aspect of the invention.
In at least one embodiment, the method is used to create a record in accordance with the first aspect of the invention.
In at least one embodiment the key variables comprise one or more of an operator ID, a product type, a unit number, the lot to which that unit belongs and a timestamp.
In at least one embodiment, data relevant to the manufacturing process is recorded.
In at least one embodiment the data record comprises manufacturing variables, such as temperature, pressure or tolerances.
In at least one embodiment, the data provides for quality control, traceability and accountability in real time, and encompasses visual and textual data.
In at least one embodiment, the monitor is an industrial vision system.
In at least one embodiment the method monitors a complex manufacturing process through the use of an Industrial Vision System.
In at least one embodiment the method uses trained Neural Networks to identify when a manufacturing step or steps are complete.
In at least one embodiment, upon completion of a step, a video record is captured.
In at least one embodiment the video record is saved to a database along with a number of key process variables such as the operator ID, the Product type and the Unit Number, the Lot to which that unit belongs and a timestamp.
This data can also include important manufacturing variables, such as temperature, pressure or tolerances.
In at least one embodiment, the method of the fourth aspect of the invention is used to create a digital record of a manufacturing process in the production process execution system of the first aspect of the invention.
In one embodiment there is provided a production process execution system which provides cognitive augmentation to enable a human-in-the-loop manufacturing closed system for complex non-automated and semi-automated production processes, which comprises:
In one embodiment the digital record module includes a video monitor which can capture video frames of the human undertaking a task during the production process; wherein the captured video frames are passed through a queue-based data structure which is configurable to define a queue length and frame rate for the video frames; wherein frames not of interest to the digital record are removed from the queue; wherein a notification that an event of interest has occurred is received; wherein a predetermined number of frames in the queue-based structure which were recorded before and after the time at which the event of interest occurred are retrieved, thus the minimum number of critical frames is retained; wherein data pertaining to the digital record is overlaid on the video; wherein the predetermined number of frames are retained for inspection or analysis.
In one embodiment a flag is turned on when the system determines that the user is not working in line with the implemented steps; wherein the flag is stored with the digital record, identifies this assembly for quality checks, and a quality report is generated using the digital records highlighted for quality control attention.
In one embodiment the monitoring modules comprise a silent monitor or partner which provides intelligence augmentation to an experienced human; wherein the cognition module silently monitors the human during the execution of their work, flags work not being executed as per process for quality checking, alerts the human, prompts the human as to the correct process and provides the necessary instruction if the cognition layer determines that the human needs assistance; and provides the human with the opportunity to correct the assembly.
In one embodiment specified steps in the assembly which are flagged as Critical to Quality can be treated differently to other steps; wherein a configuration setting is made available to turn on specific settings such as flag and recording settings; the Critical to Quality flag is communicated to the human so that attention is targeted to those steps; wherein these steps can be recorded at higher frame rates and greater resolutions than other steps, for fine-grained, accurate reporting; digital/video records of assembly of only Critical to Quality steps can be captured in the digital/video record, thus only the relevant parts of the process are stored; wherein the number of digital/video records is reduced for storage, processing and analysis so that the record is close to its intrinsic dimension.
In one embodiment the monitoring module comprises one or more neural network vision-systems working in parallel; wherein focus for each vision system is on an individual small region of interest at different fixed reference lines; activates a thin neural network vision system for each camera; allocates each neural network inference to a unique processing thread for improved parallel processing performance; evaluates images fed through the vision system to the relevant active thin neural network model as to whether the confidence index threshold for that thin neural network is passed; performs a logical conjunction; and confirms the step is complete, thereby providing multi-factor confirmation.
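A minimal sketch of this multi-factor confirmation follows, assuming Python, with each thin network exposed as a callable that returns a confidence index and with frames held as NumPy-style arrays keyed by camera index; the function names and data layout are illustrative assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def check_roi(model, frame, roi, threshold):
    """Run one thin network on its region of interest; True if its confidence passes the threshold."""
    x, y, w, h = roi
    crop = frame[y:y + h, x:x + w]
    confidence = model(crop)           # hypothetical: model returns a confidence index in [0, 1]
    return confidence >= threshold

def step_complete(models_rois, frames, threshold=0.9):
    """Multi-factor confirmation: a logical conjunction over all region-of-interest checks,
    with each inference dispatched to its own worker thread."""
    with ThreadPoolExecutor(max_workers=len(models_rois)) as pool:
        results = list(pool.map(
            lambda mr: check_roi(mr[0], frames[mr[2]], mr[1], threshold),
            models_rois))              # each entry: (model, roi, camera_index)
    return all(results)                # step is complete only if every viewpoint passes
```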
In one embodiment the monitoring module comprises neural networks which identify the location of an object and establish a datum such that future regions of interest are then established from this datum for the location of future steps, thereby reducing the impact of changes to equipment set-ups and potential movement of the vision system camera's field of view; enables the region of interest to be repositioned dynamically, allowing tracking within the field of view; enables a focused, very small region of interest for analysis; reduces the size of data to be analysed; and enables a fast thin neural network for the vision system to be established.
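The datum-based placement of regions of interest could look like the following sketch; the offsets, sizes and function names are illustrative assumptions rather than the actual implementation.

```python
def establish_datum(detection_box):
    """Take the bounding box of the located object (x, y, w, h) and use its top-left corner as the datum."""
    x, y, _, _ = detection_box
    return (x, y)

def roi_from_datum(datum, offset, size):
    """Compute a region of interest relative to the datum rather than to absolute frame coordinates,
    so that small shifts of the fixture or camera move all regions of interest together."""
    dx, dy = datum
    ox, oy = offset
    w, h = size
    return (dx + ox, dy + oy, w, h)

# Example: a region of interest 120 px right and 40 px below the datum (hypothetical offsets)
# datum = establish_datum((350, 200, 80, 60))
# roi = roi_from_datum(datum, offset=(120, 40), size=(64, 64))
```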
In one embodiment, in calibration mode the region of interest can be overridden/configured manually; this identifies the field of view of the camera, identifies the required field of view, and permits alignment of the camera to the correct field of view.
In one embodiment the cognition layer captures knowledge from the video frames used in the multi-factor confirmation that have passed the confidence index; frames that capture new, better or alternative execution of the process are collated and fed back into the neural network; the neural network is retrained.
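A minimal sketch of collating such confirmed frames for later retraining is given below, assuming OpenCV for writing images and that each confirmation yields the frame together with its confidence index; the file layout and naming scheme are illustrative assumptions.

```python
import os
import cv2

def collate_for_retraining(frame, confidence, cit, step_id, out_dir="retraining_set"):
    """Keep frames whose confidence index lies between the threshold (cit) and 1
    as new labelled training data; the step identity acts as the label."""
    if cit <= confidence <= 1.0:
        os.makedirs(out_dir, exist_ok=True)
        filename = os.path.join(out_dir, f"step{step_id}_conf{confidence:.3f}.png")
        cv2.imwrite(filename, frame)
        return filename
    return None
```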
The invention will be more clearly understood from the following description of an embodiment thereof, given by way of example only, with reference to the accompanying drawings, in which:
Features of the present invention will be set out in general terms and also with reference to
The present invention provides a system which unobtrusively monitors the user at work and uses machine learning to identify if that user needs support and if so, offers that support. The system is configured to read the user's preferences and to provide feedback in real time to that user, thus enabling new interactions and co-operation between users, the work environment and products in a quality driven lean and agile production system. Thus, the user is within the loop, supported by technology and artificially intelligent driven assistance systems.
The present invention provides a user-cyber-physical interface to the MES systems, thus enabling a closed-loop MES. It actively collects production information to be returned to MES, without any input from users either digitally, or via a hardcopy which would need to be stored.
In at least one example, the system has an AI Driven Vision System, dynamic user instructions and detailed quality records that work to close the loop between the open MES system and users, while improving the integrity of production.
In at least one example of the present invention, the AI Module uses a StepBuilder object to create the process and steps from the information in the database. The AI Module iterates through the steps of the process while looping each step until it is complete. Each step monitored by the AI Module has multiple classifiers, such as the AI model or models to be used, the vision sensor or sensors to be used and the relative region of interest (ROI) for the corresponding models.
The AI-driven vision system uses one or more models, captured by one or more sensor streams to determine whether the confidence threshold has been successfully crossed, thus assessing that the step has been completed. This provides a signal throughout the system to confirm that a user has completed a step of the manufacturing process correctly, thus confirming that the work instruction for the product has been followed correctly.
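The step-iteration logic described above might be sketched as follows, reusing the illustrative ProcessStep structure sketched earlier and passing all collaborators (classifier loading, frame capture, the multi-factor check and signalling) in as callables; these names are illustrative assumptions and not the actual StepBuilder implementation.

```python
import time

def run_process(process, load_step_classifiers, grab_frames, step_complete,
                signal_step_complete, poll_interval=0.1):
    """Iterate through the steps of a process, looping on each step until the
    AI vision check reports it complete, then emit the step-complete signal."""
    for step in process.steps:
        classifiers = load_step_classifiers(step)       # models, sensors and ROIs for this step
        while True:
            frames = grab_frames(step.cameras)           # latest frame from each configured camera
            if step_complete(classifiers, frames, step.confidence_threshold):
                signal_step_complete(step)               # system-wide step-complete signal
                break
            time.sleep(poll_interval)
```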
The AI Driven Vision System layer is a primary part of the cognition layer. Multiple thin neural networks are used to observe the object from a multiplicity of viewpoints. These thin neural networks comprise a fraction of the number of layers normally found in a deep neural network. Being thin neural networks, they can be built relatively quickly when compared with a deep neural network, which fits with the scalability requirements across multiple product lines and the responsiveness required for a batch-of-one. Training data for the thin neural networks can be automatically populated using a data augmentation algorithm. The thin neural networks can be further augmented with a labelled dataset of data that previously passed the confidence index threshold, thus meeting the requirements of supervised learning AI.
The step-complete signal is used to capture a video of a production step, capturing frames before and after the signal was received using an intelligent queueing system. This means only the information most critical to production is captured rather than the whole process. Videos can be saved at varying frame rates and resolutions, ensuring more critical steps are captured in higher detail. Thus, minimisation of the quantity of data to be captured is achieved. This is achieved by temporarily pushing one or more video frames of the implemented steps into a data queue; on receipt of a signal that assembly is complete, saving a specified, configurable number of frames in a digital/video record; overlaying contextual data on the video frames; and saving these as at least one assembly record.
Upon receiving the step completion signal the intelligent queuing system retains a number of video frames in the queue immediately prior to the signal being received. An equal number of video frames are added after the signal has been received. The queue now holds a number of video frames before a signal was received in the first half of the queue, and an equal number of video frames after the signal was received in the second half of the queue, thus providing a before and after record.
This video is then stored with information typically input by a user, for example, the date, time, the user identifier, the raw materials used, the batch produced, and so forth. Similarly, process parameters which are typically undocumented can be acquired from varying sensor streams and saved with the video record, ensuring production standards are upheld.
A user can then search a database to find the production information for a specific product. The user can search by date, time, the user ID, the unit ID, the work order, confidence index, process variables such as temperature, pressure etc, steps flagged for a quality check or a combination of the above. This returns videos and/or other data documenting the creation of the product. Videos can be watched or downloaded by the user. Data can be exported for data analysis.
The system is capable of tracking the steps which the user has carried out through the AI vision system and it is capable of offering clear, concise instructions visually through video or written instructions, or aurally through spoken instructions.
The User Instruction Module loads associated instructional text, video and audio files from a media folder/server that relates to a process and selects components to use according to user preference, as well as selecting languages for audio and text according to user preference.
By using the confirmation signal from the AI module, it determines when to progress and display instructions through a dynamically loading webpage that requires no user interaction to progress or change. When a user completes a step, this is confirmed to them via visual signalling from a traffic light, as well as by a confirmation noise which acts upon hearing as a pre-attentive attribute. The pre-attentive signals to the human provide confirmatory positive feedback that does not require conscious effort thus not interfering with the speed and flow of production.
The above elements of the system are controlled by the Cognition layer, which monitors and loads the intelligence and the product line information, running the code for each stage of the process. This cognition layer makes use of multi-threading so as to perform the analysis in real time.
The system also performs quality checking and compliance in real-time. So, in addition to the documentation record, a video record of the part being produced is also available to the MES. This rich data source permits engineers to review where any issues occurred, or conversely to have proof that a part was correctly manufactured as it left the production floor.
The creation of the video record, and the ability to track that products are complete at a unit level allows the system to give feedback to the Manufacturing Execution System from which the work orders are being generated. This enables the return of manufacturing information, as well as flags to notify that a unit has been completed. As such, this closes the feedback loop to the manufacturing execution system, creating a digital record with all information relating to the unit's production.
This invention addresses several issues in manufacturing environments; the first of these is the issue of traceability. Due to their nature, these environments often lack the uniformity on which industrial traceability relies. Parts are often in unpredictable orientations and, due to the high levels of human interaction, processes are rarely carried out in the same precise locations, making it incredibly difficult to track a unit throughout its production process. Another issue is providing effective operator support.
This solution vastly increases product traceability as it captures the production process of each individual part and contextualises the data immediately. In conjunction with robust product tracking, this greatly increases the level of product traceability which can be achieved in manufacturing environments.
The second issue this invention addresses is digitalising manufacturing environments in order to establish an effective digital record of both the process and the product. This is an incredibly difficult task to overcome, as non-automated manufacturing environments generally lack the same level of system integration as their automated counterparts. As such, it becomes incredibly difficult to provide cognitive support to the user or a human that allows human-in-the-loop operation. Rather than attempting to monitor the human for reporting to others, the invention assists the human by providing intelligence to the human, empowering them to use it to find production instructions, notifying them of errors in real time with the opportunity to fix them, and providing confirmatory support during the execution of their work, so that ownership is with the human.
The final issue which this solution addresses is data collection in manufacturing environments. Due to the low level of sensor and actuator integration, a large quantity of data is recorded in hard copies which must then be stored for the product's life cycle. This produces a vast amount of paperwork for a single lot, as there is a high quantity of related paperwork for each individual unit. As such lot sizes must remain quite large to ensure that this paperwork is manageable, and production is efficient. Issues also arise when a unit is re-called as the entire associated lot must be returned. This is a particularly prominent issue in the Medical Device Industry, most problematic with Grade III Intravenous devices as they are often implanted and in operation.
The power of this solution is its ability to collect and contextualise the data related to a particular unit. Process data can then be tied to the unit and automatically exported to a database, reducing the necessity for hard copies and in turn the necessity for lot-related paperwork. This allows non-automated manufacturing environments to reduce their lot size, enabling them to better approach the Industry 4.0 standard of Lot Size 1.
In addition, the digitalization of manufacturing is generating huge volumes of data that are difficult to store, process and analyze. Due to the minimization techniques designed for creating the digital record, and the use of thin neural networks, the data produced herein is close to its intrinsic dimension.
The artificial intelligence vision system triggers the capturing of a video 37 of each step of the manufacturing process, with which the manufacturing information is stored 41. This manufacturing information, with flags confirming the completion of each step of the manufacturing process for each individual product in the work order, is returned automatically to the Manufacturing Execution System and other external production systems without the need for user input, thus closing the loop to the Manufacturing Execution System.
In
The “Process Recognition Module” 55 verifies a work order by accepting the input of the work order/process job number and retrieving the appropriate instructions.
It performs checks to ensure that the requested job can be performed at that workstation/location. The “Video Feedback Module” 57 loads the instructions to complete the next step in the production process, where each production process is dissected into a discrete number of steps with go/no-go conditions identified at the end of each step. The video instructions continue to play in a loop until a step-complete-signal is received, at which point the next video instruction is loaded, thereby providing real-time instructions on each step, and confirmation that the step is complete by progression. Thus, the user receives the right instructions at the right time, in the right place, within the right context.
The “Traffic Light Feedback Module” 59 displays a subtle green light for positive feedback where the step-complete-signal has been received. Amber lights can be displayed when a tolerance is within range. Red light feedback indicates that the step has been completed incorrectly and must be undone or discarded. The “Aural Feedback Module” 61 sends aural instructions to the user as to the work to be done in the next step. These aural instructions are available in the user's chosen language. Aural feedback is also provided to the user as an audible click and acts as a pre-attentive attribute, capturing the attention of the user to ensure they receive feedback on their actions. The “Textual Feedback Module” 63 displays text instructions on the monitor as to the work to be done in the next step. These textual instructions are available in the user's chosen language. The “Silent Monitor” or silent partner 69 monitors the steps being completed and provides no instructions or feedback, for use by an experienced user.
However, if the cognition layer notices that the step is not being completed correctly, then it can intervene and prompt the user as to the correct step, or with an instruction that the completed step does not meet the satisfying conditions, providing the user with the opportunity to correct that step at that point in time. The “Video Record Module” 65 records a video of the step-complete, so that a live record is taken of the production at each stage in the production process. The video record is taken for a specified number of seconds before and after the step-complete-signal is received, thus the record illustrates what happened just before and just after the critical moment, thus goal-posting the step. The frame-length and the quality in frames-per-second are configurable, thus one frame (i.e. one image) or multiple seconds could be taken. Other data relevant to that step being completed are also captured with the video.
Whether to record a particular step, and whether that step is critical to quality (CTQ) can be specified thus only the relevant parts of the process, as flagged, are stored thus reducing the storage requirements.
The “Traceability Validation Module” 67 provides data analytics, search and retrieval functionality, where approved users can search for data records by process, by work order, by user, by timestamp, by location and other searchable parameters. The “User Interface Layer” 71 provides the services to the users; it receives live data from the front end, which is dynamically loaded, requiring zero input from the user for progression from one page or service to another. The “Control Layer” 73 is the engine of the system that monitors state and signals, loading appropriate modules and functionality. The “Cognition Layer” 75 is the intelligence layer. It performs the artificial intelligence computations using thin and augmented neural networks as image vision systems. The “Operating Systems/Hardware Layer” 77 captures the physical architecture of the system. The storage layer, comprising a video file server 79 and database servers 81, captures the notion that storage is persistent.
A key area where this solution is particularly powerful is non-automated production environments. A highly intricate manufacturing process can be observed with high levels of data collected. The data can then be stored with and/or overlaid on the video file for immediate data contextualisation. The overlaying of data on the video reduces the high-dimensionality problem associated with big data in the digitalisation of manufacturing so that the record is close to its intrinsic dimension.
This contextualised data is directly tied to a unit and lot in the manufacturing environment. As such, when a unit is recalled, the unit number can be extracted in the field and relayed to the manufacturing company. The manufacturing company can then search through the database of video files, extract the files relevant to that particular unit and perform an in-house quality check. Similarly, the units made directly before and after this unit can then be inspected to ensure that no issue arose which may compromise the entire lot of units.
While the video file allows for the immediate contextualisation of the data relating to the unit's manufacture, it also serves the purpose of creating a digital record which documents that unit's manufacture. This digital record can be stored for the product's lifecycle and accessed by the manufacturing company as issues arise in the field.
Points of key importance throughout the unit's manufacture can then be highlighted as critical to quality, ensuring these are saved in far greater detail, at higher framerates and greater resolutions. As such, the data that is saved is guaranteed to be critical to the unit's quality and ensures that the data which a manufacturing company is most concerned with is most accessible.
In
The front-end application is made up of HTML pages that contain design elements driven from a template in which a base standard of design is implemented and reused throughout the application to maintain consistency and familiarity. These templates are filled and delivered by many different views in which the views contain responses used to populate the templates.
Each view is a function of the system that receives a web request and returns a web response, supported by the URLs component which contains the information to link the pathways within the system and provide locations of functionality. The views are used to perform operations that fetch low-level objects from the database, modify said objects as needed, render forms for input and build HTML, among other operations.
The objects fetched from within the views are modelled by the model's component. These models are made up of attributes that relate to the data stored within the database 95 where each attribute represents a database field. A set of attributes makes a model. A model may represent a profile of a user, the user itself, a neural network model, a step within the instruction module or a record from the record module. Some models contain within them reference to files and folders stored as part of the instruction module by way of media files 99 such as video, audio and text. These media files are allocated storage and retrieved by the media component.
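The disclosure does not name the web framework, but the structure described (views, models, templates, a URL component, an admin component and settings) matches a Django-style application; purely as an illustrative assumption, two such models might look as follows, with each attribute mapping to a database field.

```python
from django.db import models

class Step(models.Model):
    """Illustrative model: one instruction-module step."""
    process_id = models.CharField(max_length=64)
    step_number = models.IntegerField()
    instruction_text = models.TextField(blank=True)
    instruction_video = models.FileField(upload_to="media/instructions/", blank=True)
    instruction_audio = models.FileField(upload_to="media/audio/", blank=True)
    neural_network_file = models.CharField(max_length=255, blank=True)

class Record(models.Model):
    """Illustrative model: one digital record produced by the record module."""
    step = models.ForeignKey(Step, on_delete=models.CASCADE)
    operator_id = models.CharField(max_length=64)
    unit_number = models.CharField(max_length=64)
    lot_number = models.CharField(max_length=64)
    timestamp = models.DateTimeField(auto_now_add=True)
    video_file = models.FileField(upload_to="media/records/")
```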
An admin component 101 acts as an overseer that can modify the components on a lower level to a standard user. A group of settings 103 are applied that dictate the framework configurations which include the base directory, the allowed hosts of the application, the installed applications, the middleware, the templates directory backend 105 and options as well as the database interface, password validators, overall language, time zone and the static content and media URLs, their directories and roots. Information from the models 97 and database driven components are serialised in a JSON format to be broadcast out via a Representational State Transfer (REST) communication architecture.
The communication architecture talks to the backend component of the system which retrieves information about who is using the system and how they are using it, for example what preferences the user has in terms of what high level components they wish to be able to use. The traffic light module 109 uses this information to understand when to function or not. The record system 111 uses this information to build the record, both within the video file itself as overlaid text and the record as a whole. The state component is driven by the AI module 113 which dictates the step complete signal status 115. This in turn informs the traffic light module 109 and the record module 111 on when to perform their respective timing actions. The AI module 113 retrieves information from the database on how to build its models and in turn classify those models. The RFID module 117 is used to inform the front end views 119 on who has logged on to the system and their corresponding level of authorisation.
At step 157, the video traceability module captures a video of the process to be stored with manufacturing information 163, 165. The traffic light module offers visual and audible feedback to the user to confirm the step is complete 167, 169. The User Instruction Module loads in the aural, written and visual user instructions in the appropriate languages based on User Preferences 171, 173.
A video camera is pointed to an area of interest and captures live video frames constantly. These video frames are passed through a queue-based data structure on a roll-in, roll-out system, commonly referred to in computing as first in, first out. These frames are discarded by popping them off the top of the queue.
The process listens for a signal to indicate an event of interest has occurred, such as may be generated by an AI-driven vision system or an IoT sensor. This event of interest is recorded. On receipt of the signal, a number of video frames in the queue immediately prior to the signal are retained, with unnecessary frames being popped off the queue. An equal number of video frames are added to the back of the queue after the signal has been received. The queue now holds X number of video frames before a signal was received in the first half of the queue, and an equal number of video frames after the signal was received in the second half of the queue, thus providing a before and after record.
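A minimal sketch of such a Centre Forward Centre Backward queue in Python follows; the class and method names are illustrative assumptions, and a production implementation would additionally handle configurable frame rates and concurrent capture.

```python
from collections import deque

class CFCBRecorder:
    """Centre Forward Centre Backward queue: keeps the N frames captured immediately
    before an event signal, then the N frames captured immediately after it."""

    def __init__(self, half_length):
        self.half_length = half_length
        self.before = deque(maxlen=half_length)  # rolling FIFO; old frames fall off the front
        self.after = []
        self.triggered = False

    def signal(self):
        """Call when the event-of-interest notification (AI vision system or IoT sensor) arrives."""
        self.triggered = True

    def push(self, frame):
        """Feed every live frame here; returns the finished record once it is complete."""
        if not self.triggered:
            self.before.append(frame)            # frames not of interest are discarded automatically
            return None
        self.after.append(frame)
        if len(self.after) == self.half_length:
            record = list(self.before) + self.after
            self.triggered = False
            self.after = []
            return record                        # 2 * half_length frames centred on the event
        return None
```

Every live frame is pushed into the recorder; once signal() has been called, the before-and-after record is returned as soon as the second half of the queue has filled.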
As shown in
Traditionally, data can be stored in two types of queues: ‘First in, First out’ and ‘Last In, First out’. These are more commonly known as FIFO and LIFO respectively. An issue arises when a signal is received by a system using these data queuing techniques. The system may only capture data either before the event occurs, or after, but not both. This results in an inability to fully capture and contextualise all of the data relating to an event or events.
This solution vastly increases data contextualisation as it captures data both before and after an event occurs; this gives a greater, more holistic view of an event. As such, this data queueing method solves an issue created by LIFO and FIFO queues, as they are unable to fully capture an event.
This solution also allows for a reduction in the quantity of data necessary to capture when contextualising a process. The alternative approach to using the FIFO or LIFO queues was to record everything. So in a manufacturing context a video record may store a full record of what happened over a working day at a particular location. This causes significant issues in storage requirements and the effort required to post-process or post-review that video record. In this invention, a much shorter video record is taken, which is recorded on receipt of a signal. Typically, this signal will be generated by a machine learning algorithm (e.g. a vision system based on a neural net indicates that the operator has performed their task correctly) or a sensor (e.g. a spike in temperature, breakdown, fire etc.). On receipt of this signal a video record is taken that captures x number of seconds before the signal, and an equal number of seconds after the signal. Thus a great reduction can be achieved in the quantity of data needed to be processed and stored.
The record module mainly provides functionality through the class CameraController 215, where camera operations are performed, and the DatabaseInterface class 217, where database commands and interactions are performed.
The AI 219 is supported by a StepBuilder class 221 that assembles the neural network pre-processed models for use and classification and a Database interface class that retrieves information about a given process.
The TrafficLight module 227 connects via USB to inform decisions on when to operate through the front end. The AI, Record and TrafficLight modules are deeply embedded in the use and configuration of the step completed signal to function. The RFID module has a reader.
The work instruction for the product to be manufactured is relayed to the user 257 based on their preferences. As the user carries out the manufacturing process 259 the unique identifier for the unit is extracted if available. The user will then work to complete the product. The system monitors the work order until all units in the work order are completed and as such the work order is deemed complete 261. During this manufacture the system extracts the manufacturing information as well as videos of the manufacturing process when each step of the process is completed for each unit in the work order. The system monitors for as long as a user is manufacturing a work order. For periods of inactivity, the user can be logged out 263. In this event, or in the event the user logs out, the following user to log into the system will continue from the step at which the prior user logged out. The system can as such track multiple users completing single units.
When a single user logs on to the system 275, they can work on all of the steps to complete the product. When a second user logs on 277, they are automatically assigned steps of the manufacturing process to complete based on how the steps of the manufacturing process were divided when the process was established for a second user. If the process was not established to allow for a second user, they will not be logged on until the first user logs off. The same applies for the Nth user, so long as the process has been established to be worked on by that number of users, the process will automatically assign the steps of the manufacturing process to that number of users, based on how it was originally established. If the process has not been established to allow the Nth user to work on the product, they will not be logged on to the system.
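One way to express the pre-established division of steps among N users is sketched below; the data structure is an illustrative assumption, since the disclosure only states that the division is fixed when the process is established.

```python
def assign_steps(process_divisions, active_users):
    """Look up the pre-established division of steps for the current number of active users.
    `process_divisions` maps a user count to a list of step lists, one per user slot
    (illustrative structure). Returns None if the process was not set up for that many users."""
    division = process_divisions.get(len(active_users))
    if division is None:
        return None                     # the Nth user is not logged on
    return {user: steps for user, steps in zip(active_users, division)}

# Example (hypothetical): a process established for up to two users
# divisions = {1: [[1, 2, 3, 4]], 2: [[1, 2], [3, 4]]}
# assign_steps(divisions, ["operator_A", "operator_B"])
#   -> {"operator_A": [1, 2], "operator_B": [3, 4]}
```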
This will allow the vision system to correctly identify the object from many different angles, meaning that the trained model is now rotation invariant. A number of background images are then added in order to represent the object being viewed in different scenes. These background scenes represent the likely scenes in which the object may be viewed—for example a person may be positioned on a road, at a desk or in a landscape. An array of rotated background images 311 is created. A manufacturing component might be positioned on a workbench with a black, stainless steel, white or grey background. A number of light effects 313 are then added to the images and an array created 315. These may include contrast or pixel intensity. This allows for the creation of a model which is able to account for changes in the lumens of the environment in which it is testing for the presence of the trained model. These images are then saved 317 and added to a collective data set, which can be used to train a machine learning model. Any number of variant images can be generated and added to the data set, allowing for the creation of a more robust model.
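The augmentation pipeline described above is sketched below using Pillow, assuming the source image already has a transparent background; the paste position, output resolution, naming scheme and the example counts in the final comment are illustrative assumptions.

```python
import os
from PIL import Image, ImageEnhance

def augment(item_path, background_paths, scales, angles, brightness_levels, out_dir="dataset"):
    """Generate a training set from a single item image (assumed to be an RGBA PNG with a
    transparent background). Every combination of scale, rotation, background and light
    level produces one individually named training image."""
    os.makedirs(out_dir, exist_ok=True)
    item = Image.open(item_path).convert("RGBA")
    count = 0
    for s in scales:
        scaled = item.resize((int(item.width * s), int(item.height * s)))
        for a in angles:
            rotated = scaled.rotate(a, expand=True)
            for bg_path in background_paths:
                bg = Image.open(bg_path).convert("RGBA").resize((640, 480))
                composed = bg.copy()
                composed.paste(rotated, (50, 50), rotated)   # alpha-composite the item onto the scene
                for b in brightness_levels:
                    out = ImageEnhance.Brightness(composed.convert("RGB")).enhance(b)
                    out.save(f"{out_dir}/img_s{s}_r{a}_{count}.png")
                    count += 1
    return count

# With, say, 10 scales x 72 rotations (5 degree steps) x 20 backgrounds x 5 light levels,
# a single image would yield 72,000 training images (hypothetical values for illustration).
```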
This feature of the present invention is further illustrated by the following worked example.
Given the following values:—
In this case, one image became a training data set of 72,000 individually named images.
The method of
The neural networks must be trained with a wide variety of data in order to create a robust, reliable and accurate model. Issues arise in the collection of data to train such a model. The data can be acquired instantaneously, which results in a smaller data set. Smaller data sets tend to lack reliability, robustness or accuracy, and are therefore much more likely to incorrectly identify the item in a vision system. A smaller dataset requires that the item almost exactly matches one of the images used in the training set—same background, same lighting levels, and placed in exactly the same position with the camera at the same distance from the item. These conditions are difficult to manage and unlikely to occur in practice.
Alternatively, the data set can be collected with a focus on variety, capturing a greater variance in data. However, in order to capture both a large quantity of data and a large variance in said data, a great deal of human time is required to create each of those scenes and capture and label the images. The time and work requirement necessary to capture such data often leaves the training of models as an unrealistic option, particularly if the model is being captured in an environment which varies wildly.
This method allows the user to automatically create a data set from a single image as well as accounting for data variance, dramatically reducing the time required to collect data for model training while prioritising accuracy and reliability.
In
To begin with, the system makes use of thin-layer neural nets which require minimum training time, for scalability and to cope with the training and communicating requirements of the increased customisation of a batch of one. A fully trained machine learning layer should have a deeper neural net and in that way can identify nuances in the production process; for example a user might hold the piece differently, or have different sized/coloured hands, or indeed have an alternative approach or workaround to completing the work. When the AI is running, each image is passed through a neural net, and a confidence index calculated. Once the desired confidence index threshold is reached, the AI assumes that the image it is currently looking at satisfies the condition of completing the current step. There will be nuances in each of these images, for example the angle of the object, or plays on light and shadow, or how the object is held or rotated. These nuances capture tacit knowledge and the image taken at that moment in time can be used to retrain the neural net so that new and better, or alternative, images can be captured, and a wider generalised deep neural network can be trained.
This system, through its use of multiple lightweight thin neural networks based on small regions of interest, inference assigned to unique processing threads, and the data minimization techniques used in capturing digital records, quality records, configurable frame rates and frame quality, and critical-to-quality selections, has the net result that it is lightweight in terms of configuration, set-up, storage and processing, and can be configured for a new process within a fraction of the time required for other production systems.
The embodiments in the invention described with reference to the drawings comprise a computer apparatus and/or processes performed in a computer apparatus. However, the invention also extends to computer programs, particularly computer programs stored on or in a carrier adapted to bring the invention into practice. The program may be in the form of source code, object code, or a code intermediate source and object code, such as in partially compiled form or in any other form suitable for use in the implementation of the method according to the invention. The carrier may comprise a storage medium such as ROM, e.g. CD ROM, or magnetic recording medium, e.g. a memory stick or hard disk. The carrier may be an electrical or optical signal which may be transmitted via an electrical or an optical cable or by radio or other means.
It will be appreciated that in the context of the present invention that the term ‘user’ and ‘human’ are used interchangeably throughout the description and highlight that the invention is used in conjunction with a user or human operator in a non-automated method, system or process.
In the specification the terms “comprise, comprises, comprised and comprising” or any variation thereof and the terms “include, includes, included and including” or any variation thereof are considered to be totally interchangeable and they should all be afforded the widest possible interpretation and vice versa.
The invention is not limited to the embodiments hereinbefore described but may be varied in both construction and detail.
Number | Date | Country | Kind |
---|---|---|---|
20195889.9 | Sep 2020 | EP | regional |
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2021/075253 | 9/14/2021 | WO |