This application claims priority to Japanese Patent Application No. 2021-190909 filed on Nov. 25, 2021, incorporated herein by reference in its entirety.
The present disclosure relates to an image processing system and an image processing method.
A user who likes to drive may wish to capture an image of his or her vehicle in motion. The user can post (upload) the captured image to, for example, a social networking service (hereinafter referred to as "SNS") so that many people can view the image. However, it is difficult for the user to image the appearance of his or her traveling vehicle while driving the vehicle himself or herself. In view of this, there has been proposed a service for imaging the appearance of a traveling vehicle. For example, Japanese Unexamined Patent Application Publication No. 2021-48449 (JP 2021-48449 A) discloses a vehicle imaging system.
The vehicle imaging system disclosed in JP 2021-48449 A identifies a vehicle based on information for identifying the vehicle, images the identified vehicle, and transmits image data of the imaged vehicle to a communication device. That is, in the vehicle imaging system disclosed in JP 2021-48449 A, imaging is possible only after the vehicle has been identified. Therefore, even if there is a photogenic moment (or period) that meets the user's need before the vehicle is identified, the vehicle imaging system disclosed in JP 2021-48449 A cannot capture an image at such a moment.
The present disclosure has been made to solve the problem described above, and an object of the present disclosure is to enable image capturing at a moment that meets the user's need.
An image processing system according to a first aspect of the present disclosure includes at least one memory configured to store video data captured by a camera, and a processor configured to perform image processing on the video data stored in the memory. The processor is configured to select a preregistered target vehicle from among vehicles included in the video data captured by the camera. The processor is configured to clip, in the video data stored in the memory, a plurality of frames from the video data before the preregistered target vehicle is selected, and generate an image including the target vehicle by using the clipped frames.
In the image processing system according to the first aspect, the processor may be configured to clip all frames from entry of the target vehicle into an imageable range of the camera to exit of the target vehicle from the imageable range.
In the image processing system according to the first aspect, the processor may be configured to clip, in addition to all the frames, a frame before the entry of the target vehicle into the imageable range and a frame after the exit of the target vehicle from the imageable range.
According to such configurations, the processor can clip not only the frames including the target vehicle from the video data after the selection of the target vehicle, but also the frames included in the video data before the selection of the target vehicle. The processor preferably clips all the frames including the target vehicle, and more preferably clips the frames before and after all the frames. According to such configurations, an image including the target vehicle (viewing image described later) can be generated from a series of scenes from the time before the entry of the target vehicle into the imageable range of the camera to the time after the exit of the target vehicle from the imageable range.
In the image processing system according to the first aspect, the memory may include a ring buffer. The ring buffer may include a storage area configured to store up to a predetermined amount of newly captured video data, and may be configured to automatically delete, from the storage area, old video data that exceeds the predetermined amount.
In the image processing system according to the first aspect, the processor may be configured to select the target vehicle based on license codes of license plates of the vehicles included in the video data.
In the image processing system according to the first aspect, the memory may be configured to store a license code recognition model. The license code recognition model may be a trained model configured to receive an input of a video including a license code of a license plate, and output the license code in the video. The processor may be configured to recognize the license codes from the video data captured by the camera by using the license code recognition model.
According to such configurations, the target vehicle can be selected with high accuracy by recognizing the license code.
In the image processing system according to the first aspect, the processor may be configured to select the target vehicle based on pieces of identification information of communication devices mounted on the vehicles.
In the image processing system according to the first aspect, the memory may be configured to store a vehicle extraction model. The vehicle extraction model may be a trained model configured to receive an input of a video including a vehicle, and output the vehicle in the video. The processor may be configured to extract a plurality of vehicles including the target vehicle from the video data captured by the camera by using the vehicle extraction model.
According to such configurations, the vehicles including the target vehicle can be extracted with high accuracy.
In the image processing system according to the first aspect, the processor may be configured to extract a feature amount of the target vehicle. The processor may be configured to identify a vehicle having the feature amount from among the vehicles included in the video data, and clip a frame including the identified vehicle and a frame including the target vehicle.
In the image processing system according to the first aspect, the memory may be configured to store a target vehicle identification model. The target vehicle identification model may be a trained model configured to receive an input of a video from which a vehicle is extracted, and output the vehicle in the video. The processor may be configured to identify the vehicle having the feature amount from the video data captured by the camera based on the target vehicle identification model.
According to such configurations, the vehicle having the same feature amount as that of the target vehicle can be identified with high accuracy from the video data.
An image processing method according to a second aspect of the present disclosure includes causing a memory to store video data showing vehicles imaged by a camera, selecting a preregistered target vehicle from among the vehicles included in the video data captured by the camera, clipping, in the video data stored in the memory, a plurality of frames from the video data before the preregistered target vehicle is selected, and generating an image including the target vehicle by using the clipped frames.
According to the present disclosure, it is possible to capture an image at a moment that meets the user's need.
Features, advantages, and technical and industrial significance of exemplary embodiments of the disclosure will be described below with reference to the accompanying drawings, in which like signs denote like elements.
Hereinafter, embodiments of the present disclosure will be described in detail with reference to the drawings. In the drawings, the same or corresponding portions are denoted by the same reference signs and the description thereof will not be repeated.
The imaging system 1 is installed, for example, near a road and images a vehicle 9.
The server 2 is, for example, an in-house server of a business operator that provides a vehicle imaging service. The server 2 may be a cloud server provided by a cloud server management company. The server 2 generates an image to be viewed by a user (hereinafter referred to also as "viewing image") from a video received from the imaging system 1, and provides the generated viewing image to the user. The viewing image is generally a still image, but may be a video for a specified period (for example, a short period of about several seconds). In many cases, the user is the driver of the vehicle 9, but the user is not particularly limited thereto.
The processor 11 controls the overall operation of the imaging system 1. The memory 12 stores programs (operating system and application programs) to be executed by the processor 11, and data (maps, tables, mathematical expressions, parameters, etc.) to be used in the programs. The memory 12 temporarily stores a video captured by the imaging system 1.
The camera 13 captures a video of the vehicle 9. The camera 13 is preferably a high-sensitivity camera with a polarizing lens.
The processor 21 executes various arithmetic processes in the server 2. The memory 22 stores programs to be executed by the processor 21, and data to be used in the programs. The memory 22 stores data to be used for image processing by the server 2, and data subjected to the image processing by the server 2. The input device 23 receives an input from an administrator of the server 2. The input device 23 is typically a keyboard and a mouse. The display 24 displays various types of information. The communication IF 25 is an interface for communicating with the imaging system 1.
An image processing system 900 according to a comparative example will be described to facilitate understanding of the features of the image processing system 100 according to the present embodiment.
At a time t1, the front end of the vehicle 9 enters the imageable range. At this time, the license plate is still outside the imageable range. At a time t2, the license plate enters the imageable range and is imaged. It takes some processing time for the processor to recognize the number and determine whether the vehicle 9 is a vehicle to be imaged (hereinafter referred to also as "target vehicle"). The vehicle 9 continues traveling during this period. At a time t3, the vehicle 9 is identified as the target vehicle. The period from the time t3 to a time t4 when the vehicle 9 exits the imageable range is the imageable period for the vehicle 9 in the comparative example.
In this case, even if there is a photogenic moment (or period) that meets the user's need before the time t3 when the vehicle 9 is identified, an image cannot be captured at such a moment. It is not even possible to capture a stream of scenes from the entry of the vehicle 9 into the imageable range to the exit of the vehicle 9 from the imageable range.
In the present embodiment, the video before the time t3 when the vehicle 9 is identified is also stored in the memory 12. In the present embodiment, the stream of scenes from the entry of the vehicle 9 into the imageable range to the exit of the vehicle 9 from the imageable range is imaged, and then a part or all of the stream is clipped. This makes it possible to capture an image at a moment that meets the user's need.
The vehicle 9 is not limited to the four-wheel vehicle shown in the drawings.
The imaging unit 31 captures a video of the vehicle 9 and outputs the captured video to the video buffer 331. The imaging unit 31 corresponds to the camera 13.
The communication unit 32 performs bidirectional communication with a communication unit 42 (described later) of the server 2 via the network NW. The communication unit 32 receives the number of the target vehicle from the server 2 and transmits the number of each vehicle imaged by the imaging unit 31 to the server 2. The communication unit 32 transmits a video (more specifically, a video clip including the target vehicle) to the server 2. The communication unit 32 corresponds to the communication IF 14.
The video buffer 331 temporarily stores the video captured by the imaging unit 31. The video buffer 331 is typically a ring buffer (circular buffer), and has an annular storage area in which the beginning and the end of a one-dimensional array are logically connected to each other. A newly captured video is stored in the video buffer 331 up to a predetermined amount (which may be defined as a predetermined number of frames or a predetermined period) that the storage area can hold. An old video that exceeds the predetermined amount is automatically deleted from the video buffer 331. The video buffer 331 outputs the video to the vehicle extraction unit 332 and the video clipping unit 337.
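For illustration only, the following is a minimal Python sketch of such a ring buffer; the Frame type, the field names, and the 30 fps / 60-second capacity are assumptions made for the example, not part of the present disclosure.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Frame:
    timestamp: float  # capture time, also usable as the frame identifier
    pixels: bytes     # encoded image data (placeholder)

class VideoRingBuffer:
    """Fixed-capacity buffer: appending a new frame silently evicts the oldest."""

    def __init__(self, capacity_frames: int):
        # A deque with maxlen drops the oldest item automatically, which
        # mirrors the "old video is automatically deleted" behavior above.
        self._frames: deque[Frame] = deque(maxlen=capacity_frames)

    def push(self, frame: Frame) -> None:
        self._frames.append(frame)

    def frames_between(self, start: float, end: float) -> list[Frame]:
        # The clipping step later pulls out a time window like this.
        return [f for f in self._frames if start <= f.timestamp <= end]

# Example: a 30 fps camera with a 60-second retention window.
buffer = VideoRingBuffer(capacity_frames=30 * 60)
```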
The vehicle extraction unit 332 extracts a vehicle (not only the target vehicle but any vehicle) from the video. This process is referred to also as "vehicle extraction process". For example, a trained model generated by a machine learning technology such as deep learning can be used for the vehicle extraction process. In this example, the vehicle extraction unit 332 is implemented by a "vehicle extraction model". The vehicle extraction model will be described later.
The number recognition unit 333 recognizes a license plate number in the part from which the vehicle is extracted by the vehicle extraction unit 332 (a frame including the vehicle). This process is referred to also as "number recognition process". A trained model generated by a machine learning technology such as deep learning can be used also for the number recognition process. In this example, the number recognition unit 333 is implemented by a "number recognition model". The number recognition model will be described later.
The target vehicle selection unit 335 selects, as the target vehicle, a vehicle whose number matches the number of the target vehicle (received from the server 2) from among the vehicles associated with the numbers by the matching process. The target vehicle selection unit 335 outputs the vehicle selected as the target vehicle to the feature amount extraction unit 336 and the video clipping unit 337.
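As a rough sketch of the matching and selection steps above, the following Python fragment filters per-frame detections by the target number. The Detection fields and the normalization of the number strings are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    frame_ts: float       # frame identifier (time stamp)
    vehicle_crop: object  # vehicle region output by the vehicle extraction model
    plate_number: str     # number output by the number recognition model

def select_target(detections: list[Detection], target_number: str) -> list[Detection]:
    """Keep only detections whose recognized number matches the target
    vehicle's number received from the server."""
    def norm(s: str) -> str:
        return s.replace(" ", "").replace("-", "").upper()
    return [d for d in detections if norm(d.plate_number) == norm(target_number)]
```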
The video clipping unit 337 clips a part including the target vehicle from the video stored in the video buffer 331. The video clipping unit 337 preferably clips all the frames including the target vehicle selected by the target vehicle selection unit 335. More preferably, the video clipping unit 337 clips, in addition to all the frames including the target vehicle, a predetermined number of frames before those frames (frames before the entry of the target vehicle into the imageable range of the imaging unit 31) and a predetermined number of frames after those frames (frames after the exit of the target vehicle from the imageable range of the imaging unit 31). That is, the video clipping unit 337 preferably clips a video stream from the time before the entry of the target vehicle into the imageable range of the imaging unit 31 to the time after the exit of the target vehicle from the imageable range. The video clipping unit 337 outputs the clipped video to the communication unit 32. As a result, the video showing the traveling target vehicle over the entire imageable range of the imaging unit 31 is transmitted to the server 2.
The video clipping unit 337 may clip the video by using the feature amount extracted by the feature amount extraction unit 336. For example, the video clipping unit 337 may change the length of the video clipping period depending on the traveling speed of the target vehicle. In other words, the video clipping unit 337 may variably set the numbers (the "predetermined numbers") of frames before and after all the frames including the target vehicle. The video clipping unit 337 can increase the video clipping period as the traveling speed of the target vehicle decreases. As a result, it is possible to more reliably clip the video of the traveling target vehicle over the entire imageable range.
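A minimal sketch of this speed-dependent clipping follows, assuming frames are stored as (timestamp, data) pairs; the base margin, the reference speed, and the cap are invented constants for the example.

```python
def clip_target_video(frames, target_timestamps, speed_mps,
                      base_margin_s=1.0, ref_speed_mps=15.0):
    """Clip every buffered frame from shortly before the target's first
    appearance to shortly after its last one.

    frames: list of (timestamp, frame_data) pairs from the ring buffer
    target_timestamps: timestamps of frames showing the target vehicle
    speed_mps: traveling speed of the target vehicle in m/s
    """
    first, last = min(target_timestamps), max(target_timestamps)
    # A slower vehicle dwells longer in view, so widen the margin as the
    # speed drops; the 5x cap keeps the clip bounded.
    margin = base_margin_s * min(ref_speed_mps / max(speed_mps, 1.0), 5.0)
    return [(ts, f) for ts, f in frames if first - margin <= ts <= last + margin]
```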
The server 2 includes a storage unit 41, the communication unit 42, and an arithmetic process unit 43. The storage unit 41 includes an image storage unit 411 and a registration information storage unit 412. The arithmetic process unit 43 includes a vehicle extraction unit 431, a target vehicle identification unit 432, an image processing unit 433, an album creation unit 434, a web service management unit 435, and an imaging system management unit 436.
The image storage unit 411 stores a viewing image obtained as a result of an arithmetic process by the server 2. More specifically, the image storage unit 411 stores images before and after processing by the image processing unit 433, and an album created by the album creation unit 434.
The registration information storage unit 412 stores registration information related to the vehicle imaging service. The registration information includes personal information of a user who applied for the provision of the vehicle imaging service, and vehicle information of the user. The personal information of the user includes, for example, information on an identification number (ID), a name, a date of birth, an address, a telephone number, and an e-mail address of the user. The vehicle information of the user includes information on a license plate number of the vehicle. The vehicle information may include, for example, information on a vehicle model, a model year, a body shape (sedan, wagon, van, etc.), and a body color.
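One possible layout of such a registration record is sketched below; every field name is illustrative rather than prescribed by the disclosure.

```python
from dataclasses import dataclass

@dataclass
class Registration:
    user_id: str
    name: str
    date_of_birth: str
    address: str
    phone: str
    email: str
    plate_number: str        # license plate number of the registered vehicle
    # Optional appearance information:
    vehicle_model: str = ""
    model_year: int = 0
    body_shape: str = ""     # e.g. "sedan", "wagon", "van"
    body_color: str = ""
```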
The communication unit 42 performs bidirectional communication with the communication unit 32 of the imaging system 1 via the network NW. The communication unit 42 transmits the number of the target vehicle to the imaging system 1 and receives the number of each vehicle imaged by the imaging system 1. The communication unit 42 receives a video including the target vehicle and a feature amount (traveling condition and appearance) of the target vehicle from the imaging system 1. The communication unit 42 corresponds to the communication IF 25.
The vehicle extraction unit 431 extracts a vehicle (not only the target vehicle but any vehicle) from the video. In this process, a vehicle extraction model can be used similarly to the vehicle extraction process by the vehicle extraction unit 332 of the imaging system 1. The vehicle extraction unit 431 outputs the video from which the vehicles have been extracted (frames including a vehicle) to the target vehicle identification unit 432.
The target vehicle identification unit 432 identifies the target vehicle from among the vehicles extracted by the vehicle extraction unit 431 based on the feature amount of the target vehicle (the traveling condition such as a traveling speed and an acceleration, and the appearance such as a body shape and a body color). This process is referred to also as "target vehicle identification process". A trained model generated by a machine learning technology such as deep learning can be used also for the target vehicle identification process. In this example, the target vehicle identification unit 432 is implemented by a "target vehicle identification model". The target vehicle identification model will be described later.
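The trained identification model itself is not reproduced here; as a stand-in, the following sketch matches candidates to the target by cosine similarity over a numeric feature vector. The feature encoding and the 0.9 threshold are assumptions for illustration.

```python
import math
from dataclasses import dataclass

@dataclass
class ExtractedVehicle:
    frame_ts: float
    features: list[float]  # e.g. speed, acceleration, body-shape/color codes

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def identify_target(candidates: list[ExtractedVehicle],
                    target_features: list[float],
                    threshold: float = 0.9):
    """Return the candidate whose feature vector best matches the target's,
    or None if no candidate clears the threshold."""
    if not candidates:
        return None
    best = max(candidates, key=lambda c: cosine(c.features, target_features))
    return best if cosine(best.features, target_features) >= threshold else None
```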
The image processing unit 433 processes the viewing image. For example, the image processing unit 433 selects the most photogenic image (a so-called best shot) from among the plurality of images. Then, the image processing unit 433 performs various types of image correction (trimming, color correction, distortion correction, etc.) on the selected viewing image. The image processing unit 433 outputs the processed viewing image to the album creation unit 434.
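As one hedged example of best-shot selection, the sketch below ranks frames by a sharpness heuristic (variance of the Laplacian, computed with OpenCV); the actual selection criterion used by the image processing unit 433 is not specified in the source.

```python
import cv2
import numpy as np

def sharpness(image: np.ndarray) -> float:
    # Variance of the Laplacian: a common single-number focus measure.
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    return cv2.Laplacian(gray, cv2.CV_64F).var()

def best_shot(frames: list[np.ndarray]) -> np.ndarray:
    """Pick the sharpest frame as a simple stand-in for 'most photogenic';
    a real system could also weight vehicle size, position, or pose."""
    return max(frames, key=sharpness)
```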
The album creation unit 434 creates an album by using the processed viewing image. A known image analysis technology (for example, a technology for automatically creating a photo book, a slide show, or the like from images captured by a smartphone) can be used for creating the album. The album creation unit 434 outputs the album to the web service management unit 435.
The web service management unit 435 provides a web service (for example, an application program that can be linked to an SNS) using the album created by the album creation unit 434. The web service management unit 435 may be implemented on a server different from the server 2.
The imaging system management unit 436 manages (monitors and diagnoses) the imaging system 1. In the event of some abnormality (camera failure, communication failure, etc.) in the imaging system 1 under management, the imaging system management unit 436 notifies the administrator of the server 2 about the abnormality. As a result, the administrator can take measures such as inspection or repair of the imaging system 1. The imaging system management unit 436 may be implemented as a separate server similarly to the web service management unit 435.
A large amount of teaching data is prepared in advance by a developer. The teaching data includes example data and correct answer data. The example data is image data including a vehicle to be extracted. The correct answer data includes an extraction result associated with the example data. Specifically, the correct answer data is image data including the vehicle extracted from the example data.
A learning system 61 trains the estimation model 51 by using the example data and the correct answer data. The learning system 61 includes an input unit 611, an extraction unit 612, and a learning unit 613.
The input unit 611 receives a large amount of example data (image data) prepared by the developer, and outputs the data to the extraction unit 612.
By inputting the example data from the input unit 611 into the estimation model 51, the extraction unit 612 extracts a vehicle included in the example data for each piece of example data. The extraction unit 612 outputs the extraction result (output from the estimation model 51) to the learning unit 613.
The learning unit 613 trains the estimation model 51 based on the vehicle extraction result from the example data that is received from the extraction unit 612 and the correct answer data associated with the example data. Specifically, the learning unit 613 adjusts the parameters 512 (for example, the weighting coefficient) so that the vehicle extraction result obtained by the extraction unit 612 approaches the correct answer data.
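A generic supervised training step of this kind could look like the following PyTorch sketch; the loss function, optimizer, and hyperparameters are placeholders, since the source specifies none of them.

```python
import torch
from torch import nn

def train(model: nn.Module, loader, epochs: int = 10, lr: float = 1e-4) -> nn.Module:
    """Adjust the model parameters so the extraction result approaches
    the correct-answer data, as the learning unit 613 does."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.MSELoss()  # placeholder; a detection-style loss would be typical
    model.train()
    for _ in range(epochs):
        for example, correct_answer in loader:  # teaching-data pairs
            optimizer.zero_grad()
            prediction = model(example)          # extraction result
            loss = criterion(prediction, correct_answer)
            loss.backward()                      # gradient of the mismatch
            optimizer.step()                     # update the parameters
    return model
```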
The estimation model 51 is trained as described above, and the trained estimation model 51 is stored in the vehicle extraction unit 332 (and the vehicle extraction unit 431) as a vehicle extraction model 71. The vehicle extraction model 71 receives an input of a video, and outputs a video from which a vehicle is extracted. The vehicle extraction model 71 outputs, for each frame of the video, the extracted vehicle in association with an identifier of the frame to the matching process unit 334. The frame identifier is, for example, a time stamp (time information of the frame).
The trained estimation model 52 is stored in the number recognition unit 333 as a number recognition model 72. The number recognition model 72 receives an input of a video from which a vehicle is extracted by the vehicle extraction unit 332, and outputs coordinates and a number of a license plate. The number recognition model 72 outputs, for each frame of the video, the recognized coordinates and number of the license plate in association with an identifier of the frame to the matching process unit 334.
The trained estimation model 53 is stored in the target vehicle identification unit 432 as a target vehicle identification model 73. The target vehicle identification model 73 receives an input of a video from which a vehicle is extracted by the vehicle extraction unit 431 and a feature amount (traveling condition and appearance) of the target vehicle, and outputs a video including the identified target vehicle. The target vehicle identification model 73 outputs, for each frame of the video, the identified target vehicle in association with an identifier of the frame to the image processing unit 433.
The vehicle extraction process is not limited to a process using machine learning. A known image recognition technology (image recognition model or algorithm) that does not use machine learning can be applied to the vehicle extraction process. The same applies to the number recognition process and the target vehicle identification process.
In S11, the imaging system 1 extracts a vehicle by executing the vehicle extraction process described above. In S12, the imaging system 1 recognizes the number of the extracted vehicle by executing the number recognition process, and transmits the recognized number to the server 2.
When the number is received from the imaging system 1, the server 2 refers to registration information to determine whether the received number is a registered number (that is, the vehicle imaged by the imaging system 1 is a vehicle of a user who applied for the provision of the vehicle imaging service (target vehicle)). When the received number is the registered number (the number of the target vehicle), the server 2 transmits the number of the target vehicle and requests the imaging system 1 to transmit a video including the target vehicle (S21).
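A minimal sketch of this check-and-request step (S21), assuming the registration information is available as a set of plate numbers and that `request_video` stands in for the reply over the communication interface:

```python
def handle_reported_number(number: str, registered_numbers: set[str],
                           request_video) -> bool:
    """If the reported number belongs to a registered (target) vehicle,
    send the number back and request the video including that vehicle."""
    if number in registered_numbers:
        request_video(target_number=number)
        return True
    return False
```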
In S13, the imaging system 1 executes the matching process between each vehicle and each number in the video. Then, the imaging system 1 selects, as the target vehicle, a vehicle associated with the same number as the number of the target vehicle from among the vehicles associated with the numbers (S14). The imaging system 1 extracts a feature amount (traveling condition and appearance) of the target vehicle, and transmits the extracted feature amount to the server 2 (S15).
In S16, the imaging system 1 clips, from the video temporarily stored in the memory 12 (video buffer 331), a part including the target vehicle from a time before the number recognition (before the selection of the target vehicle). The clipping method has already been described in detail above and will not be repeated here. The imaging system 1 then transmits the clipped video to the server 2.
In S22, the server 2 extracts vehicles from the received video by executing the vehicle extraction process described above.
In S23, the server 2 identifies the target vehicle from among the vehicles extracted in S22 based on the feature amount (traveling condition and appearance) of the target vehicle (the target vehicle identification process described above).
It is not essential to use both the traveling condition and the appearance of the target vehicle, and only one of them may be used. The information on the traveling condition and/or the appearance of the target vehicle corresponds to “target vehicle information” according to the present disclosure. The information on the appearance of the target vehicle is not limited to the vehicle information obtained by the analysis performed by the imaging system 1 (feature amount extraction unit 336), but may be vehicle information prestored in the registration information storage unit 412.
In S24, the server 2 selects an optimum viewing image (best shot) from the video (plurality of viewing images) including the target vehicle. The server 2 performs image correction on the optimum viewing image. Then, the server 2 creates an album by using the corrected viewing image (S25). The user can view the created album and post a desired image in the album to the SNS.
As described above, in the first embodiment, the imaging system 1 selects the target vehicle by recognizing the license plate number. Then, the imaging system 1 clips all the frames including the target vehicle (including the frames before the selection of the target vehicle) and transmits the frames to the server 2. More preferably, the imaging system 1 additionally clips the frames before and after all the frames including the target vehicle and transmits those frames to the server 2. As a result, the server 2 collects a stream of scenes from the time before the entry of the target vehicle into the imageable range of the camera 13 to the time after the exit of the target vehicle from the imageable range. Therefore, the server 2 can select the optimum frame from the stream of scenes and generate the viewing image. According to the first embodiment, it is possible to capture an image at a moment that meets the user's need.
In the first embodiment, description has been given of the configuration in which the target vehicle is identified by using the license plate number. The method for identifying the target vehicle is not limited to this method. In a second embodiment, the target vehicle is identified by using a wireless communication identification number.
The long-range wireless module 151 is, for example, a communication module compliant with 4G or 5G similarly to the communication IF 14. The long-range wireless module 151 is used for long-range communication between the imaging system 1A and the server 2.
The short-range wireless module 152 is a communication module compliant with short-range communication standards such as Wi-Fi (registered trademark) or Bluetooth (registered trademark). The short-range wireless module 152 communicates with a short-range wireless module 95 provided in the vehicle 9 and with a user terminal 96 (smartphone, tablet terminal, etc.) of the user of the vehicle 9.
The short-range wireless module 95 of the vehicle 9 and the user terminal 96 have identification numbers (referred to also as “device addresses”) unique to the respective wireless devices compliant with the short-range communication standards. The short-range wireless module 152 of the imaging system 1A can acquire the identification number of the short-range wireless module 95 and/or the identification number of the user terminal 96.
The short-range wireless module 95 and the user terminal 96 are hereinafter referred to also as “wireless devices” comprehensively. The identification number of the wireless device is referred to also as “wireless device ID”. The wireless device ID of the target vehicle is acquired in advance from the user (for example, when applying for the vehicle imaging service) and stored in the registration information storage unit 412 (see
The short-range communication unit 81 performs short-range communication with the wireless device mounted on the vehicle 9. The short-range communication unit 81 corresponds to the short-range wireless module 152.
The wireless device ID acquisition unit 841 acquires the identification number (wireless device ID) of the short-range wireless module 95 and/or the identification number of the user terminal 96. The wireless device ID acquisition unit 841 outputs the acquired wireless device ID to the matching process unit 844.
The imaging unit 82, the video buffer 842, and the vehicle extraction unit 843 are equivalent to the imaging unit 31, the video buffer 331, and the vehicle extraction unit 332 of the first embodiment, respectively.
The matching process unit 844 associates the vehicle extracted by the vehicle extraction unit 843 with the wireless device ID acquired by the wireless device ID acquisition unit 841 (matching process). More specifically, the matching process unit 844 associates, at a timing when the vehicle including the wireless device has approached, the wireless device ID acquired from the wireless device with the vehicle extracted by the vehicle extraction unit 843. As the vehicle 9 approaches the imaging system 1A, the strength of the short-range wireless signal increases. Therefore, the matching process unit 844 may associate the vehicle with the wireless device ID based on the signal strength of the short-range wireless communication in addition to the wireless device ID. The matching process unit 844 outputs a result of the matching process (the vehicle associated with the wireless device ID) to the target vehicle selection unit 845.
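A rough sketch of one way to implement this signal-strength-assisted matching follows; the data shapes and the peak-RSSI heuristic are assumptions for illustration.

```python
def match_devices_to_vehicles(rssi_log, closest_approach):
    """Associate each wireless device ID with the extracted vehicle whose
    closest approach to the camera is nearest in time to that device's
    RSSI peak.

    rssi_log: dict of device_id -> list of (timestamp, rssi_dbm) samples
    closest_approach: dict of vehicle_id -> timestamp of closest approach
    """
    matches = {}
    for device_id, samples in rssi_log.items():
        peak_ts = max(samples, key=lambda s: s[1])[0]  # time of strongest signal
        vehicle_id = min(closest_approach,
                         key=lambda v: abs(closest_approach[v] - peak_ts))
        matches[device_id] = vehicle_id
    return matches
```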
The target vehicle selection unit 845 selects, as the target vehicle, a vehicle whose wireless device ID matches the wireless device ID of the target vehicle (received from the server 2) from among the vehicles associated with the wireless device IDs by the matching process. The target vehicle selection unit 845 outputs the vehicle selected as the target vehicle to the feature amount extraction unit 846 and the video clipping unit 847.
The feature amount extraction unit 846, the video clipping unit 847, and the long-range communication unit 83 are equivalent to the feature amount extraction unit 336, the video clipping unit 337, and the communication unit 32 of the first embodiment, respectively.
As described above, in the second embodiment, the imaging system 1A selects the target vehicle by using the identification number (wireless device ID) of the short-range wireless module 95 mounted on the vehicle 9 and/or the identification number of the user terminal 96. Then, the imaging system 1A clips all the frames including the target vehicle (including the frames before the selection of the target vehicle) and transmits the frames to the server 2. More preferably, the imaging system 1A additionally clips the frames before and after all the frames including the target vehicle and transmits those frames to the server 2. As a result, the server 2 collects a stream of scenes from the time before the entry of the target vehicle into the imageable range of the camera 13 to the time after the exit of the target vehicle from the imageable range. Therefore, the server 2 can select the optimum frame from the stream of scenes and generate the viewing image. According to the second embodiment, it is possible to capture an image at a moment that meets the user's need.
In the first and second embodiments, description has been given of the example in which the imaging system 1 or 1A and the server 2 share the execution of the image processing. Therefore, both the processor 11 of the imaging system 1 or 1A and the processor 21 of the server 2 correspond to a “processor” according to the present disclosure. The imaging system 1 or 1A may execute all the image processing and transmit the image-processed data (viewing image) to the server 2. Therefore, the server 2 is not an essential component for the image processing according to the present disclosure. In this case, the processor 11 of the imaging system 1 or 1A corresponds to the “processor” according to the present disclosure.
The embodiments disclosed herein should be considered to be illustrative and not restrictive in all respects. The scope of the present disclosure is shown by the claims rather than by the above description of the embodiments, and is intended to include all modifications within the meaning and scope equivalent to the claims.