Embodiments of the present invention(s) are generally related to generating and running computer vision pipelines for processing images and/or video, and in particular to generating computer vision pipelines for processing images and/or video utilizing image transformation blocks that transform images and/or video and prediction blocks that detect objects or classes of objects in images and/or video.
Computer vision generally refers to utilizing computing devices to analyze images and/or video so as to obtain high-level understandings of the images and/or video. Applications of computer vision include detecting product defects on assembly lines, monitoring personal protective equipment (PPE) compliance at construction sites, and monitoring and detecting gas and fluid leaks.
Computer vision models may be used to process the images and/or video. Software engineers and data scientists may use software, such as Jupyter notebooks from Project Jupyter, to implement computer vision models.
In some aspects, the techniques described herein relate to a non-transitory computer-readable medium including executable instructions, the executable instructions being executable by one or more processors to perform a method, the method including: receiving a request to create a computer vision pipeline for processing images and/or video; receiving an input source for images and/or video for the computer vision pipeline; displaying multiple blocks for selection, the multiple blocks including multiple image transformation blocks, an image transformation block transforming images and/or video to produce transformed images and/or video, and multiple prediction blocks, a prediction block detecting objects or classes of objects in images and/or video to produce detected objects or classes of objects; receiving a selection of an image transformation block and configuration information for the image transformation block; adding the image transformation block to the computer vision pipeline; receiving a selection of a prediction block and configuration information for the prediction block; adding the prediction block to the computer vision pipeline; receiving an output destination for the computer vision pipeline; receiving a request to activate the computer vision pipeline; activating the computer vision pipeline; receiving input images and/or video from the input source; transforming, using the image transformation block, the input images and/or video to produce transformed images and/or video; detecting, using the prediction block, objects or classes of objects in the transformed images and/or video to produce detected objects or classes of objects; and transmitting the detected objects or classes of objects to the output destination.
In some aspects, the techniques described herein relate to a non-transitory computer-readable medium, the method further including displaying a user interface element into which a block may be dragged and dropped to select the block, and wherein receiving the selection of the image transformation block and configuration information for the image transformation block includes receiving a dragging and dropping of the image transformation block into the user interface element, and receiving the selection of the prediction block and configuration information for the prediction block includes receiving a dragging and dropping of the prediction block into the user interface element.
In some aspects, the techniques described herein relate to a non-transitory computer-readable medium, the method further including: receiving a computer vision pipeline type, the computer vision pipeline type being one of a batch type or a streaming type; if the computer vision pipeline type is the batch type, receiving a processing schedule, wherein receiving input images and/or video from the input source includes receiving input images and/or video from the input source according to the processing schedule; and if the computer vision pipeline type is a streaming type, wherein receiving input images and/or video from the input source includes continually receiving input images and/or video from the input source.
In some aspects, the techniques described herein relate to a non-transitory computer-readable medium wherein the multiple image transformation blocks include an image crop block that crops images and/or video and an image resize block that resizes images and/or video.
In some aspects, the techniques described herein relate to a non-transitory computer-readable medium wherein the multiple prediction blocks include a tiled model block that splits images and/or video into multiple images for detection of objects or classes of objects, detects objects or classes of objects in the multiple images, recombines the multiple images to produce transformed images and/or video, and produces the transformed images and/or video and detections of objects or classes of objects.
In some aspects, the techniques described herein relate to a non-transitory computer-readable medium wherein the multiple prediction blocks include a label detection block that detects objects in images and/or video and produces one or more label descriptions for the detected objects, a landmark detection block that detects landmarks in images and/or video and produces a detected landmark description, and a logo detection block that detects logos in images and/or video and returns a detected logo description.
In some aspects, the techniques described herein relate to a non-transitory computer-readable medium, the method further including displaying a summary of the computer vision pipeline, the summary including the input source, the image transformation block, the prediction block, an order of the image transformation block and the prediction block in the computer vision pipeline, and the output destination.
In some aspects, the techniques described herein relate to a system including at least one processor and memory containing instructions, the instructions being executable by the at least one processor to: receive a request to create a computer vision pipeline for processing images and/or video; receive an input source for images and/or video for the computer vision pipeline; display multiple blocks for selection, the multiple blocks including multiple image transformation blocks, an image transformation block transforming images and/or video to produce transformed images and/or video, and multiple prediction blocks, a prediction block detecting objects or classes of objects in images and/or video to produce detected objects or classes of objects; receive a selection of an image transformation block and configuration information for the image transformation block; add the image transformation block to the computer vision pipeline; receive a selection of a prediction block and configuration information for the prediction block; add the prediction block to the computer vision pipeline; receive an output destination for the computer vision pipeline; receive a request to activate the computer vision pipeline; activate the computer vision pipeline; receive input images and/or video from the input source; transform, using the image transformation block, the input images and/or video to produce transformed images and/or video; detect, using the prediction block, objects or classes of objects in the transformed images and/or video to produce detected objects or classes of objects; and transmit the detected objects or classes of objects to the output destination.
In some aspects, the techniques described herein relate to a system, the instructions being further executable by the at least one processor to display a user interface element into which a block may be dragged and dropped to select the block, and wherein the instructions being executable by the at least one processor to receive the selection of the image transformation block and configuration information for the image transformation block include instructions being executable by the at least one processor to receive a dragging and dropping of the image transformation block into the user interface element, and wherein the instructions being executable by the at least one processor to receive the selection of the prediction block and configuration information for the prediction block include instructions being executable by the at least one processor to receive a dragging and dropping of the prediction block into the user interface element.
In some aspects, the techniques described herein relate to a system, the instructions being further executable by the at least one processor to: receive a computer vision pipeline type, the computer vision pipeline type being one of a batch type or a streaming type; if the computer vision pipeline type is the batch type, receive a processing schedule, wherein the instructions being executable by the at least one processor to receive input images and/or video from the input source includes instructions being executable by the at least one processor to receive input images and/or video from the input source according to the processing schedule; and if the computer vision pipeline type is a streaming type, wherein the instructions being executable by the at least one processor to receive input images and/or video from the input source includes instructions being executable by the at least one processor to continually receive input images and/or video from the input source.
In some aspects, the techniques described herein relate to a system wherein the multiple image transformation blocks include an image crop block that crops images and/or video and an image resize block that resizes images and/or video.
In some aspects, the techniques described herein relate to a system wherein the multiple prediction blocks include a tiled model block that splits images and/or video into multiple images for detection of objects or classes of objects, detects objects or classes of objects in the multiple images, recombines the multiple images to produce transformed images and/or video, and produces the transformed images and/or video and detections of objects or classes of objects.
In some aspects, the techniques described herein relate to a system wherein the multiple prediction blocks include a label detection block that detects objects in images and/or video and produces one or more label descriptions for the detected objects, a landmark detection block that detects landmarks in images and/or video and produces a detected landmark description, and a logo detection block that detects logos in images and/or video and returns a detected logo description.
In some aspects, the techniques described herein relate to a system, the instructions being further executable by the at least one processor to display a summary of the computer vision pipeline, the summary including the input source, the image transformation block, the prediction block, an order of the image transformation block and the prediction block in the computer vision pipeline, and the output destination.
In some aspects, the techniques described herein relate to a method including: receiving a request to create a computer vision pipeline for processing images and/or video; receiving an input source for images and/or video for the computer vision pipeline; displaying multiple blocks for selection, the multiple blocks including multiple image transformation blocks, an image transformation block transforming images and/or video to produce transformed images and/or video, and multiple prediction blocks, a prediction block detecting objects or classes of objects in images and/or video to produce detected objects or classes of objects; receiving a selection of an image transformation block and configuration information for the image transformation block; adding the image transformation block to the computer vision pipeline; receiving a selection of a prediction block and configuration information for the prediction block; adding the prediction block to the computer vision pipeline; receiving an output destination for the computer vision pipeline; receiving a request to activate the computer vision pipeline; activating the computer vision pipeline; receiving input images and/or video from the input source; transforming, using the image transformation block, the input images and/or video to produce transformed images and/or video; detecting, using the prediction block, objects or classes of objects in the transformed images and/or video to produce detected objects or classes of objects; and transmitting the detected objects or classes of objects to the output destination.
In some aspects, the techniques described herein relate to a method, further including displaying a user interface element into which a block may be dragged and dropped to select the block, and wherein receiving the selection of the image transformation block and configuration information for the image transformation block includes receiving a dragging and dropping of the image transformation block into the user interface element, and receiving the selection of the prediction block and configuration information for the prediction block includes receiving a dragging and dropping of the prediction block into the user interface element.
In some aspects, the techniques described herein relate to a method, further including: receiving a computer vision pipeline type, the computer vision pipeline type being one of a batch type or a streaming type; if the computer vision pipeline type is the batch type, receiving a processing schedule, wherein receiving input images and/or video from the input source includes receiving input images and/or video from the input source according to the processing schedule; and if the computer vision pipeline type is a streaming type, wherein receiving input images and/or video from the input source includes continually receiving input images and/or video from the input source.
In some aspects, the techniques described herein relate to a method wherein the multiple image transformation blocks include an image crop block that crops images and/or video and an image resize block that resizes images and/or video.
In some aspects, the techniques described herein relate to a method wherein the multiple prediction blocks include a tiled model block that splits images and/or video into multiple images for detection of objects or classes of objects, detects objects or classes of objects in the multiple images, recombines the multiple images to produce transformed images and/or video, and produces the transformed images and/or video and detections of objects or classes of objects.
In some aspects, the techniques described herein relate to a method wherein the multiple prediction blocks include a label detection block that detects objects in images and/or video and produces one or more label descriptions for the detected objects, a landmark detection block that detects landmarks in images and/or video and produces a detected landmark description, and a logo detection block that detects logos in images and/or video and returns a detected logo description.
Throughout the drawings, like reference numerals will be understood to refer to like parts, components, and structures.
In order to implement computer vision models using software such as Jupyter notebooks, software engineers and data scientists typically must write software in order to load training data, train computer vision models using the training data, perform inference using the trained computer vision models, and save the results of the inference. Such an approach has several disadvantages. First, it may require a certain level of software development experience. Second, it may require extensive configurations to the computer vision models in order to achieve the desired objectives. Third, such an approach is not very scalable to multiple organizations or even to multiple users within an organization. Further, while data engineers are highly specialized (as they need to be given the nature of the manual coding and construction of the analysis system), they rarely have the subject matter expertise required to ensure that incorrect assumptions are avoided, that specific features are identified, and that the output is well crafted to fit the business need.
In some embodiments, the computer vision pipeline systems described herein allow users to implement computer vision solutions without having to write software (e.g., as a “no code” or “low code” system). The computer vision pipeline systems allow users with no or little software development experience to create, run, and modify computer vision pipelines for processing images and/or video. In various embodiments, the computer vision pipeline systems provide interfaces that allow users to select blocks quickly and easily for computer vision pipelines, configure the blocks, and activate the computer vision pipelines.
In some embodiments, a computer vision pipeline system may include a graphical user interface with interactive visual elements. An interactive visual element may represent a function or combination of functions that have been previously coded. In some embodiments, the interactive visual element may be represented as a “block” or any other representation. Although the term “block” is used herein, it will be appreciated that the visual element representing one or more functions may be any shape (or combination of different shapes), media, animation, and/or the like.
In some embodiments, the blocks for computer vision pipelines include image transformation blocks that transform images and/or video. Image transformation blocks may be used at the beginning of a computer vision pipeline to transform the input images and/or video. The blocks also include prediction blocks that detect objects and/or classes of objects in images and/or video. Prediction blocks add inference metadata to images and/or video in computer vision pipelines.
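As a minimal illustrative sketch of this two-kind block abstraction (the class, method, and field names below are hypothetical assumptions for purposes of explanation, not the system's actual API), image transformation blocks may be thought of as mapping frames to frames, while prediction blocks map frames to inference metadata:

```python
# Illustrative sketch only; class and method names are hypothetical,
# not the actual API of the computer vision pipeline system.
from abc import ABC, abstractmethod
from dataclasses import dataclass

import numpy as np


@dataclass
class Detection:
    """Inference metadata for one detected object or class of objects."""
    label: str         # e.g., "person" or "whitefly"
    confidence: float  # model score in [0, 1]
    box: tuple         # (x_min, y_min, x_max, y_max) in pixel coordinates


class ImageTransformationBlock(ABC):
    """Transforms images and/or video frames to produce transformed frames."""

    @abstractmethod
    def transform(self, frame: np.ndarray) -> np.ndarray:
        ...


class PredictionBlock(ABC):
    """Detects objects or classes of objects in images and/or video frames."""

    @abstractmethod
    def predict(self, frame: np.ndarray) -> list[Detection]:
        ...
```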
In addition to the above advantages, the computer vision pipeline systems allow users to quickly and easily specify input sources for images and/or video and output destinations for inference data and/or processed images and/or video. Furthermore, the computer vision pipeline systems allow users to specify whether the computer vision pipeline is to process images and/or video continually or on a scheduled basis, and if the latter, a processing schedule.
In part due to inherent ease of use, distributed processing, ability to customize, and/or other aspects, the computer vision pipeline systems are scalable.
Moreover, it will be appreciated that the “low code” or “no code” systems discussed herein allow different users of different experience (i.e., not just those who are dedicated data engineers or those with extensive coding experience) to create computer vision pipelines through the computer vision interface. As a result, users with subject matter expertise may be enabled to leverage their understanding of the business, problem, and solution to create computer vision applications, platforms, and/or processes that are crafted to the business's particular needs.
Further, by utilizing previously coded “blocks” (i.e., functionality), different businesses and users may leverage the systems and flexibility without individually coding different processes. As a result, considerable time and computer resources are saved by avoiding debugging of new code, testing, updating, justification, and documentation which are typically required for all new code for important systems. Other advantages of the computer vision pipeline systems will be apparent.
The computer vision pipeline system 104 may provide interfaces for creating, activating, running, and managing computer vision pipelines. The computer vision pipeline system 104 may provide such interfaces to the user system 106 so as to allow a user of the user system 106 to, among other things, request that the computer vision pipeline system 104 create a computer vision pipeline, select blocks for the computer vision pipeline, and request that the computer vision pipeline system 104 activate and run the computer vision pipeline. The computer vision pipeline system 104 receives input images and/or video from the input source system 108 and processes the input images and/or video. The computer vision pipeline system 104 may process the input images and/or video by transforming the input images and/or video, and/or by detecting objects and/or classes of objects in the input images and/or video. The computer vision pipeline system 104 transmits detected objects and/or classes of objects, and optionally, the transformed images and/or video to the output destination system 110.
The user system 106 may display interfaces to a user that the user may utilize to request creation of a computer vision pipeline, select blocks for the computer vision pipeline, and request activation and running of the computer vision pipeline. The user system 106 may also display interfaces that the user may utilize to specify input sources and output destinations for the computer vision pipeline, configure blocks for the computer vision pipeline, and manage the computer vision pipeline.
The input source system 108 may be or include any system that may provide images and/or video. The input source system 108 may be or include cloud storage providers such as Google Cloud Storage and Amazon S3, messaging services such as Google Pub/Sub, and local data storage. The input source system 108 may be or include cameras or sensors (for example, security cameras) that continually stream images and/or video. The input source system 108 may be or include systems that provide satellite and/or airborne images and/or video. The input source system 108 transmits images and/or video to the computer vision pipeline system 104.
The output destination system 110 may be or include cloud storage providers such as Google Cloud Storage and Amazon S3, messaging services such as Google Pub/Sub, and local data storage. The output destination system 110 may receive detected objects and/or classes of objects, and optionally, processed images and/or video, from the computer vision pipeline system 104.
In some embodiments, the communication network 112 may represent one or more computer networks (for example, LAN, WAN, and/or the like). The communication network 112 may provide communication between any of the computer vision pipeline system 104, the user system 106, the input source system 108, and the output destination system 110. In some implementations, the communication network 112 comprises computer devices, routers, cables, buses, and/or other network topologies. In some embodiments, the communication network 112 may be wired and/or wireless. In various embodiments, the communication network 112 may comprise the Internet and/or one or more networks that may be public, private, IP-based, non-IP based, and so forth.
The communication module 202 may send and/or receive requests and/or data between the computer vision pipeline system 104 and any of the user system 106, the input source system 108, and the output destination system 110. The communication module 202 may receive requests and/or data from the user system 106, the input source system 108, and the output destination system 110. The communication module 202 may also send requests and/or data to the user system 106, the input source system 108, and the output destination system 110.
The user interface module 204 may receive requests to create, modify, and run computer vision pipelines. The display module 206 may display and/or provide for display interfaces for users to interact with to create, activate, run, and modify computer vision pipelines. The computer vision pipeline creation module 208 may add blocks that users have selected to computer vision pipelines.
The computer vision pipeline activation module 210 may activate computer vision pipelines. The image transformation module 212 may cause an image transformation block, if a computer vision pipeline contains an image transformation block, to transform images and/or video and to produce transformed images and/or video. The object and class detection module 214 may cause a prediction block, if a computer vision pipeline contains a prediction block, to detect objects or classes of objects in images and/or video and to produce detected objects or classes of objects.
The computer vision pipeline running module 216 may run a computer vision pipeline. The computer vision pipeline running module 216 may translate a computer vision pipeline into a directed acyclic graph (DAG), such as an Apache Beam-based DAG, to run the computer vision pipeline.
The data storage 220 may include data stored, accessed, and/or modified by any of the modules of the computer vision pipeline system 104, such as the computer vision pipelines created by users, images and/or video, and inference data. The data storage 220 may include any number of data storage structures such as tables, databases, lists, and/or the like.
A module may be hardware, software, firmware, or any combination. For example, each module may include functions performed by dedicated hardware (for example, an Application-Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA), or the like), software, instructions maintained in ROM, and/or any combination. Software may be executed by one or more processors.
The interface 500 includes a computer vision pipeline name section 512 where the user may specify a name of the computer vision pipeline, a computer vision pipeline type section 514 where the user may specify a computer vision pipeline type, and a computer vision pipeline input type section 516 where the user may specify a computer vision pipeline input type. The computer vision pipeline type section 514 allows the user to select either a batch type for the computer vision pipeline or a streaming type for the computer vision pipeline. If the user selects the batch type for the computer vision pipeline type, the computer vision pipeline system 104 will receive and process input images and/or video from the input source system 108 according to a processing schedule or on an on-demand basis. If the user selects the streaming type for the computer vision pipeline type in the computer vision pipeline type section 514, the computer vision pipeline system 104 will continually receive and process images and/or video from the input source system 108, that is, on a continual basis. If the user selects an image input for the computer vision pipeline input type in the computer vision pipeline input type section 516, the computer vision pipeline system 104 will receive and process images from the input source system 108. If the user selects a video input for the computer vision pipeline input type in the computer vision pipeline input type section 516, the computer vision pipeline system 104 will receive and process video from the input source system 108. In some embodiments, the computer vision pipeline input type section 516 includes a combined image and video input option. In such embodiments, the computer vision pipeline system 104 may receive and process both images and/or video from the input source system 108. The user can select a button 518 labeled “Continue to Inputs” or the inputs text label 504 to continue to an interface where the user may specify the input source system 108.
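As a hedged sketch of the information collected through these sections (the field names and values are illustrative assumptions, not the system's actual configuration schema), a batch pipeline configured through the interface 500 might reduce to something like:

```python
# Hypothetical configuration assembled from interface 500; field names
# are illustrative only and do not reflect the system's actual schema.
pipeline_config = {
    "name": "ppe-compliance-monitor",  # pipeline name section 512
    "type": "batch",                   # pipeline type section 514: "batch" or "streaming"
    "input_type": "image",             # input type section 516: "image", "video", or both
    "schedule": "0 * * * *",           # hourly, cron-style; batch pipelines only
}
```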
One advantage of cropping images using the image crop block 704 is that certain images may be too large to fit into the memory of a GPU (graphics processing unit). Typically, such images may be resized in order to fit into GPU memory. However, resizing an image results in a loss of image resolution, which may make it more difficult to detect objects or classes of objects in the image. The image crop block 704 may center crop or fixed crop images and/or video, which may permit the cropped images and/or video to fit into GPU memory.
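A minimal sketch of the center-crop operation (assuming frames as NumPy arrays in height-width-channel order; this is not the image crop block's actual implementation) illustrates how cropping, unlike resizing, preserves full resolution within the retained region:

```python
import numpy as np


def center_crop(frame: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Crop the central out_h x out_w region of a frame. The pixels that
    are kept retain full resolution, unlike resizing, which discards detail."""
    h, w = frame.shape[:2]
    top = max((h - out_h) // 2, 0)
    left = max((w - out_w) // 2, 0)
    return frame[top:top + out_h, left:left + out_w]
```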
In addition to multiple image transformation blocks, the block library section 703 includes multiple prediction blocks.
The interface 910 has the same elements as the interface 900.
The computer vision pipeline system 104 may include blocks other than the blocks described herein.
In various embodiments, the computer vision pipeline running module 216 may utilize an application programming interface (API) to run the computer vision pipeline. The API may support multiple image and video data types, multiple pipeline types such as batch or streaming (or edge in some embodiments), multiple input sources such as cloud storage buckets and messaging services, and multiple output destinations such as cloud storage buckets and messaging services. The computer vision pipeline running module 216 may translate computer vision pipelines into a DAG, such as an Apache Beam-based DAG, to run the computer vision pipeline. The computer vision pipeline running module 216 may create a uniform resource locator (URL) for each image and/or video frame as a signed URL and pass the signed URL to a queue. The computer vision pipeline running module 216 may then distribute signed URLs out of the queue to multiple digital devices working in parallel. The multiple digital devices may transform images and/or video and detect objects or classes of objects in parallel and write detections into JSON files. The multiple digital devices may then transmit the output to the output destination system 110.
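Since the passage above mentions an Apache Beam-based DAG, the following is a minimal sketch of such a translation, reusing the hypothetical block interfaces sketched earlier (the fetch helper and block objects are illustrative stand-ins, and the queue and multi-device distribution are omitted; this is not the system's actual implementation):

```python
# Minimal Apache Beam sketch of a translated pipeline; the helper and
# block objects are hypothetical stand-ins, not the actual implementation.
import json
import urllib.request
from dataclasses import asdict

import apache_beam as beam
import cv2  # assumption: OpenCV is available for image decoding
import numpy as np


def fetch_image(signed_url: str) -> np.ndarray:
    """Download and decode one image (or video frame) from its signed URL."""
    with urllib.request.urlopen(signed_url) as resp:
        data = np.frombuffer(resp.read(), dtype=np.uint8)
    return cv2.imdecode(data, cv2.IMREAD_COLOR)


def run(signed_urls, crop_block, predict_block, output_prefix):
    with beam.Pipeline() as p:
        (
            p
            | "ReadURLs" >> beam.Create(signed_urls)          # signed URLs from the queue
            | "Fetch" >> beam.Map(fetch_image)
            | "Transform" >> beam.Map(crop_block.transform)   # image transformation block
            | "Detect" >> beam.Map(predict_block.predict)     # prediction block
            | "ToJSON" >> beam.Map(lambda dets: json.dumps([asdict(d) for d in dets]))
            | "Write" >> beam.io.WriteToText(output_prefix, file_name_suffix=".json")
        )
```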
At step 330, the display module 206 displays a summary of the computer vision pipeline.
An example use case of the computer vision pipeline systems and associated methods described herein is as follows. Animal hides, which may be steer hides or cow hides, each have a tattoo (or similar marking, such as a brand) that includes numbers and letters. A camera may capture images and/or video of the animal hides. A user may request that the computer vision pipeline system 104 create a computer vision pipeline to process the captured images and/or video. The user may specify an input source of the images and/or video. The user may select a single model block 708 as a first block of the computer vision pipeline to detect the tattoo and identify a bounding box around the tattoo in the images and/or video. The user may select an image crop block 704 as a second block of the computer vision pipeline to crop the images and/or video to the bounding box (or slightly larger than the bounding box). The user may select the optical character recognition block 715 as a third block of the computer vision pipeline to perform OCR on the cropped images and/or video to detect characters in the cropped images and/or video. The computer vision pipeline system 104 may then produce JSON files with the detected characters and transmit the JSON files, and optionally, the cropped images and/or video, to the output destination system 110. Human labelers may then manually verify the detected characters in the images and/or video. The user could use the computer vision pipeline system 104 in the described fashion to generate a ground truth dataset for further training of machine learning models and/or for other purposes. Additionally or alternatively, the user may select a single model block 708 that detects characters as a first block of the computer vision pipeline and an image crop block 704 that crops around the detected characters as a second block of the computer vision pipeline.
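A hedged sketch of this three-block composition, layered on the illustrative block interfaces sketched earlier (the block objects, the crop margin, and the NumPy frame layout are assumptions, not the system's actual components):

```python
# Hypothetical sketch of the hide-tattoo pipeline; frames are assumed to
# be NumPy arrays in height-width-channel order.
import json


def process_hide(frame, tattoo_detector, ocr_block, margin: int = 10):
    """Detect the tattoo (single model block 708), crop to slightly more
    than its bounding box (image crop block 704), then detect characters
    in the crop (OCR block 715)."""
    det = tattoo_detector.predict(frame)[0]
    x0, y0, x1, y1 = det.box
    cropped = frame[max(y0 - margin, 0):y1 + margin,
                    max(x0 - margin, 0):x1 + margin]
    characters = ocr_block.predict(cropped)
    return json.dumps({"characters": [c.label for c in characters]})
```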
Another example use case of the computer vision pipeline systems and associated methods described herein is as follows. Live animals may need to be tracked as they move, for example, in a pen, a field, or along a track. Similarly, animal carcasses may need to be tracked as they are moved, for example, by a conveyor belt, a hook, or other instrument. A camera may capture video of the live animals or animal carcasses. A user may request that the computer vision pipeline system 104 create a computer vision pipeline to process the captured video. The user may specify an input source of the video. The user may select a single model block 708 as a first block of the computer vision pipeline to detect the live animals or animal carcasses. The user may select the object tracking block 714 as a second block to track the live animals or animal carcasses across video frames. The computer vision pipeline system 104 may then produce JSON files with the tracked live animals or animal carcasses.
Another example use case of the computer vision pipeline systems and associated methods described herein is to detect persons in images and/or video, such as persons within stores, malls, and other facilities. Such detection may be performed so as to obtain occupancy counts. The general object detection block 712 may be utilized to detect persons because the general object detection block 712 includes a person class.
Another example use case of the computer vision pipeline systems and associated methods described herein is to detect insects on vegetation, such as whiteflies on leaves, in images and/or video. The tiled model block 710 may be utilized to split a high-resolution image of vegetation into multiple images, detect the insects in each image, and then recombine the images and aggregate the detections.
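A minimal sketch of this split-detect-recombine pattern (assuming a detector that returns boxes in tile coordinates; tile overlap and duplicate suppression are omitted, and this is not the tiled model block's actual implementation):

```python
import numpy as np


def tiled_detect(frame: np.ndarray, detector, tile: int = 512):
    """Split a high-resolution frame into tiles, detect objects in each
    tile, and translate each box back into full-frame coordinates."""
    h, w = frame.shape[:2]
    detections = []
    for top in range(0, h, tile):
        for left in range(0, w, tile):
            patch = frame[top:top + tile, left:left + tile]
            for det in detector.predict(patch):
                x0, y0, x1, y1 = det.box
                det.box = (x0 + left, y0 + top, x1 + left, y1 + top)
                detections.append(det)
    return detections  # aggregated detections for the full frame
```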
Another example use case of the computer vision pipeline systems and associated methods described herein is to detect license plates, such as United States license plates. A two-stage block may be utilized to detect license plates in images and/or video. The first stage may crop a region of interest (ROI) in an image using a detection model. The second stage may be run on the cropped image. For example, a two-stage license plate detection block may first crop a ROI in an image using a license plate detection model, and may then run a model that detects license plate characters on the cropped image. The second stage may be modeled as a character detection problem with subsequent algorithmic reconstruction, or as a recurrent neural network (RNN) based model.
Another example use case of the computer vision pipeline systems and associated methods described herein is to quantify plant growth for different plant species. There may be a standardized plant growing environment with a fixed camera perspective. An image crop block 704 may be utilized to crop the image to the ROI. A single model block 708 may then be run to detect plants in the cropped image so as to be able to quantify growth of the plants.
Another example use case of the computer vision pipeline systems and associated methods described herein is to detect chess positions. A single model block 708 could detect a chess board in an image, then an image crop block 704 could crop the image to just the chess board. An image resize block 706 could resize the image to reduce the image resolution. The resized image could then be used for training or inference.
Another example use case of the computer vision pipeline systems and associated methods described herein is to detect food. A two-stage model may be utilized to first detect a plate or bowl and then detect individual food items. It will be appreciated that other example use cases are within the scope of this disclosure.
System bus 1712 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
The digital device 1700 typically includes a variety of computer system readable media, such as computer system readable storage media. Such media may be any available media that is accessible by any of the systems described herein, and includes both volatile and nonvolatile media, and removable and non-removable media.
In some embodiments, the at least one processor 1702 is configured to execute executable instructions (for example, programs). In some embodiments, the at least one processor 1702 comprises circuitry or any processor capable of processing the executable instructions.
In some embodiments, RAM 1704 stores programs and/or data. In various embodiments, working data is stored within RAM 1704. The data within RAM 1704 may be cleared or ultimately transferred to storage 1710, such as prior to reset and/or powering down the digital device 1700.
In some embodiments, the digital device 1700 is coupled to a network, such as the communication network 112, via communication interface 1706. Still yet, the user system 106, the computer vision pipeline system 104, the output destination system 110, and/or the input source system 108 can communicate with one or more networks such as a local area network (LAN), a general wide area network (WAN), and/or a public network (for example, the Internet).
In some embodiments, input/output device 1708 is any device that inputs data (for example, mouse, keyboard, stylus, sensors, etc.) or outputs data (for example, speaker, display, virtual reality headset).
In some embodiments, storage 1710 can include computer system readable media in the form of non-volatile memory, such as read only memory (ROM), programmable read only memory (PROM), solid-state drives (SSD), flash memory, and/or cache memory. Storage 1710 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, storage 1710 can be provided for reading from and writing to a non-removable, non-volatile magnetic media. The storage 1710 may include a non-transitory computer-readable medium, or multiple non-transitory computer-readable media, which stores programs or applications for performing functions such as those described herein.
Programs/utilities, having a set (at least one) of program modules, such as the computer vision pipeline system 104, may be stored in storage 1710 by way of example, and not limitation, as well as an operating system, one or more application programs, other program modules, and program data. Each of the operating system, one or more application programs, other program modules, and program data or some combination thereof, may include an implementation of a networking environment. Program modules generally carry out the functions and/or methodologies of embodiments of the invention as described herein.
It should be understood that although not shown, other hardware and/or software components could be used in conjunction with the digital device 1700. Examples include, but are not limited to, microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data archival storage systems.
Exemplary embodiments are described herein in detail with reference to the accompanying drawings. However, the present disclosure can be implemented in various manners, and thus should not be construed to be limited to the embodiments disclosed herein. On the contrary, those embodiments are provided for a thorough and complete understanding of the present disclosure, and to completely convey the scope of the present disclosure.
It will be appreciated that aspects of one or more embodiments may be embodied as a system, method, or computer program product. Accordingly, aspects may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, micro-code, etc.) or an embodiment combining software and hardware aspects that may all generally be referred to herein as a “circuit,” “module” or “system.” Furthermore, aspects may take the form of a computer program product embodied in one or more computer readable medium(s) having computer readable program code embodied thereon.
Any combination of one or more computer readable medium(s) may be utilized. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a solid state drive (SSD), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain or store a program or data for use by or in connection with an instruction execution system, apparatus, or device.
A transitory computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, Smalltalk, C++, Python, or the like and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The computer program code may execute entirely on any of the systems described herein or on any combination of the systems described herein.
Aspects of the present invention are described below with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer readable medium that can direct a computer, other programmable data processing apparatus, or other devices to function in a particular manner, such that the instructions stored in the computer readable medium produce an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide processes for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
While specific examples are described above for illustrative purposes, various equivalent modifications are possible. For example, while processes or blocks are presented in a given order, alternative implementations may perform routines having steps, or employ systems having blocks, in a different order, and some processes or blocks may be deleted, moved, added, subdivided, combined, and/or modified to provide alternative or sub-combinations. Each of these processes or blocks may be implemented in a variety of different ways. Also, while processes or blocks are at times shown as being performed in series, these processes or blocks may instead be performed or implemented concurrently or in parallel or may be performed at different times. Further, any specific numbers noted herein are only examples: alternative implementations may employ differing values or ranges.
Throughout this specification, plural instances may implement components, operations, or structures described as a single instance. Structures and functionality presented as separate components in example configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements fall within the scope of the subject matter herein. Furthermore, any specific numbers noted herein are only examples: alternative implementations may employ differing values or ranges.
Components may be described or illustrated as contained within or connected with other components. Such descriptions or illustrations are examples only, and other configurations may achieve the same or similar functionality. Components may be described or illustrated as “coupled”, “couplable”, “operably coupled”, “communicably coupled” and the like to other components. Such description or illustration should be understood as indicating that such components may cooperate or interact with each other, and may be in direct or indirect physical, electrical, or communicative contact with each other.
Components may be described or illustrated as “configured to”, “adapted to”, “operative to”, “configurable to”, “adaptable to”, “operable to” and the like. Such description or illustration should be understood to encompass components both in an active state and in an inactive or standby state unless required otherwise by context.
It may be apparent that various modifications may be made, and other embodiments may be used without departing from the broader scope of the discussion herein. Therefore, these and other variations upon the example embodiments are intended to be covered by the disclosure herein.
This application claims priority to U.S. Provisional Patent Application No. 63/269,540, filed on Mar. 17, 2022 and entitled “SYSTEMS AND METHODS FOR IMPROVED IMAGE PIPELINE FOR MACHINE LEARNING APPLICATIONS,” which is incorporated in its entirety herein by reference.