VIDEO RETRIEVAL METHOD AND APPARATUS, DEVICE, AND STORAGE MEDIUM

Information

  • Patent Application
    20230297617
  • Publication Number
    20230297617
  • Date Filed
    April 19, 2023
  • Date Published
    September 21, 2023
  • CPC
    • G06F16/7857
    • G06F16/7328
    • G06V10/44
    • G06V10/54
    • G06V10/761
  • International Classifications
    • G06F16/783
    • G06F16/732
    • G06V10/44
    • G06V10/54
    • G06V10/74
Abstract
This application provides a video retrieval method performed by a computer device. The method includes: performing feature extraction on an image feature of a query video to obtain a first quantization feature; obtaining, based on the first quantization feature, a second candidate video with a high category similarity to the query video; and finally taking a second candidate video with a high content similarity to the query video as a target video. During training, the quantization control parameters are adjusted according to the texture feature loss value corresponding to each training sample, so that the target quantization processing sub-model learns the ranking ability of the target texture feature sub-model and the ranking results of the two sub-models tend to be consistent. An end-to-end model architecture enables the target quantization processing sub-model to obtain the corresponding quantization feature based on the image feature.
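The two-stage retrieval summarized above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the quantization feature is assumed to be a binary code compared by Hamming distance, the texture feature is assumed to be a float vector compared by Euclidean distance, and the names `Candidate`, `hamming`, `retrieve`, `quant_threshold`, and `texture_threshold` are hypothetical. The sub-models that produce the features are out of scope here, so the features are supplied as plain values.

```python
from dataclasses import dataclass
from math import dist
from typing import List, Sequence


@dataclass
class Candidate:
    video_id: str
    quant_code: Sequence[int]   # assumed: binary quantization feature (0/1 bits)
    texture: Sequence[float]    # assumed: float texture feature vector


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of positions at which two equal-length codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def retrieve(query_code: Sequence[int], query_texture: Sequence[float],
             first_candidates: List[Candidate],
             quant_threshold: int, texture_threshold: float) -> List[str]:
    # Stage 1 (category similarity): keep first candidates whose
    # quantization feature distance is lower than the threshold.
    second_candidates = [c for c in first_candidates
                         if hamming(query_code, c.quant_code) < quant_threshold]
    # Stage 2 (content similarity): among those, keep candidates whose
    # texture feature distance is lower than the threshold.
    return [c.video_id for c in second_candidates
            if dist(query_texture, c.texture) < texture_threshold]


# Hypothetical toy library; real features would come from the trained model.
library = [
    Candidate("a", (1, 0, 1, 1), (0.1, 0.2)),
    Candidate("b", (1, 0, 1, 0), (0.9, 0.9)),
    Candidate("c", (0, 1, 0, 0), (0.1, 0.2)),
]

print(retrieve((1, 0, 1, 1), (0.10, 0.25), library,
               quant_threshold=2, texture_threshold=0.5))  # ['a']
```

In this toy run, candidate "c" is dropped in the first stage because its code differs from the query in all four bits, and candidate "b" is dropped in the second stage because its texture feature is too far from the query's.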
Claims
  • 1. A video retrieval method, performed by a computer device, the method comprising: performing feature extraction on a query video using a target image processing sub-model of a trained target video retrieval model to obtain a corresponding image feature; performing feature extraction on the image feature using a target quantization processing sub-model of the target video retrieval model to obtain a corresponding first quantization feature; identifying, from first candidate videos, at least one second candidate video whose associated category similarity to the query video meets a set category similarity requirement based on the first quantization feature; and identifying, among the at least one second candidate video, a target video whose associated content similarity to the query video meets a set content similarity requirement.
  • 2. The method according to claim 1, wherein the target video retrieval model further comprises a target texture processing sub-model; and the identifying, among the at least one second candidate video, a target video whose associated content similarity to the query video meets a set content similarity requirement comprises: performing feature extraction on the image feature using the target texture processing sub-model to obtain a corresponding first texture feature; determining a texture feature distance between the first texture feature and a second texture feature of one second candidate video; determining that a content similarity between the query video and the one second candidate video meets the set content similarity requirement when the texture feature distance is lower than a preset texture feature distance threshold value; and identifying the one second candidate video as the target video.
  • 3. The method according to claim 1, wherein the identifying, among the at least one second candidate video, a target video whose associated content similarity to the query video meets a set content similarity requirement comprises: determining a ratio between a total matching duration and a comparison duration as a content repetition degree between the query video and one second candidate video; determining that a content similarity between the query video and the one second candidate video meets the set content similarity requirement when the content repetition degree exceeds a set content repetition degree threshold value; and identifying the one second candidate video as the target video.
  • 4. The method according to claim 3, wherein the total matching duration is determined as a matching duration between the at least one second candidate video and the query video, and the comparison duration is determined as a duration value of a shorter video duration in the query video and the one second candidate video.
  • 5. The method according to claim 1, wherein the identifying, among the at least one second candidate video, a target video whose associated content similarity to the query video meets a set content similarity requirement comprises: determining a number of same quantization features between the query video and one second candidate video; determining a ratio between the number of same quantization features and a comparison duration as a content repetition degree between the query video and one second candidate video; determining that a content similarity between the query video and the one second candidate video meets the set content similarity requirement when the content repetition degree exceeds a set content repetition degree threshold value; and identifying the one second candidate video as the target video.
  • 6. The method according to claim 5, wherein the comparison duration is a duration value of a shorter video duration in the query video and the one second candidate video.
  • 7. The method according to claim 1, wherein the identifying, from first candidate videos, at least one second candidate video whose associated category similarity to the query video meets a set category similarity requirement based on the first quantization feature comprises: determining a quantization feature distance between the first quantization feature and a second quantization feature of each of the first candidate videos; and determining a first candidate video with a quantization feature distance lower than a preset quantization feature distance threshold value as a second candidate video, each second quantization feature characterizing a video category to which a corresponding at least one first candidate video belongs.
  • 8. A computer device comprising a processor and a memory, the memory storing program codes that, when executed by the processor, cause the computer device to perform a video retrieval method including: performing feature extraction on a query video using a target image processing sub-model of a trained target video retrieval model to obtain a corresponding image feature; performing feature extraction on the image feature using a target quantization processing sub-model of the target video retrieval model to obtain a corresponding first quantization feature; identifying, from first candidate videos, at least one second candidate video whose associated category similarity to the query video meets a set category similarity requirement based on the first quantization feature; and identifying, among the at least one second candidate video, a target video whose associated content similarity to the query video meets a set content similarity requirement.
  • 9. The computer device according to claim 8, wherein the target video retrieval model further comprises a target texture processing sub-model; and the identifying, among the at least one second candidate video, a target video whose associated content similarity to the query video meets a set content similarity requirement comprises: performing feature extraction on the image feature using the target texture processing sub-model to obtain a corresponding first texture feature; determining a texture feature distance between the first texture feature and a second texture feature of one second candidate video; determining that a content similarity between the query video and the one second candidate video meets the set content similarity requirement when the texture feature distance is lower than a preset texture feature distance threshold value; and identifying the one second candidate video as the target video.
  • 10. The computer device according to claim 8, wherein the identifying, among the at least one second candidate video, a target video whose associated content similarity to the query video meets a set content similarity requirement comprises: determining a ratio between a total matching duration and a comparison duration as a content repetition degree between the query video and one second candidate video; determining that a content similarity between the query video and the one second candidate video meets the set content similarity requirement when the content repetition degree exceeds a set content repetition degree threshold value; and identifying the one second candidate video as the target video.
  • 11. The computer device according to claim 10, wherein the total matching duration is determined as a matching duration between the at least one second candidate video and the query video, and the comparison duration is determined as a duration value of a shorter video duration in the query video and the one second candidate video.
  • 12. The computer device according to claim 8, wherein the identifying, among the at least one second candidate video, a target video whose associated content similarity to the query video meets a set content similarity requirement comprises: determining a number of same quantization features between the query video and one second candidate video; determining a ratio between the number of same quantization features and a comparison duration as a content repetition degree between the query video and one second candidate video; determining that a content similarity between the query video and the one second candidate video meets the set content similarity requirement when the content repetition degree exceeds a set content repetition degree threshold value; and identifying the one second candidate video as the target video.
  • 13. The computer device according to claim 12, wherein the comparison duration is a duration value of a shorter video duration in the query video and the one second candidate video.
  • 14. The computer device according to claim 8, wherein the identifying, from first candidate videos, at least one second candidate video whose associated category similarity to the query video meets a set category similarity requirement based on the first quantization feature comprises: determining a quantization feature distance between the first quantization feature and a second quantization feature of each of the first candidate videos; and determining a first candidate video with a quantization feature distance lower than a preset quantization feature distance threshold value as a second candidate video, each second quantization feature characterizing a video category to which a corresponding at least one first candidate video belongs.
  • 15. A non-transitory computer readable storage medium storing program codes that, when executed by a processor of a computer device, cause the computer device to perform a video retrieval method including: performing feature extraction on a query video using a target image processing sub-model of a trained target video retrieval model to obtain a corresponding image feature; performing feature extraction on the image feature using a target quantization processing sub-model of the target video retrieval model to obtain a corresponding first quantization feature; identifying, from first candidate videos, at least one second candidate video whose associated category similarity to the query video meets a set category similarity requirement based on the first quantization feature; and identifying, among the at least one second candidate video, a target video whose associated content similarity to the query video meets a set content similarity requirement.
  • 16. The non-transitory computer readable storage medium according to claim 15, wherein the target video retrieval model further comprises a target texture processing sub-model; and the identifying, among the at least one second candidate video, a target video whose associated content similarity to the query video meets a set content similarity requirement comprises: performing feature extraction on the image feature using the target texture processing sub-model to obtain a corresponding first texture feature; determining a texture feature distance between the first texture feature and a second texture feature of one second candidate video; determining that a content similarity between the query video and the one second candidate video meets the set content similarity requirement when the texture feature distance is lower than a preset texture feature distance threshold value; and identifying the one second candidate video as the target video.
  • 17. The non-transitory computer readable storage medium according to claim 15, wherein the identifying, among the at least one second candidate video, a target video whose associated content similarity to the query video meets a set content similarity requirement comprises: determining a ratio between a total matching duration and a comparison duration as a content repetition degree between the query video and one second candidate video; determining that a content similarity between the query video and the one second candidate video meets the set content similarity requirement when the content repetition degree exceeds a set content repetition degree threshold value; and identifying the one second candidate video as the target video.
  • 18. The non-transitory computer readable storage medium according to claim 17, wherein the total matching duration is determined as a matching duration between the at least one second candidate video and the query video, and the comparison duration is determined as a duration value of a shorter video duration in the query video and the one second candidate video.
  • 19. The non-transitory computer readable storage medium according to claim 15, wherein the identifying, among the at least one second candidate video, a target video whose associated content similarity to the query video meets a set content similarity requirement comprises: determining a number of same quantization features between the query video and one second candidate video; determining a ratio between the number of same quantization features and a comparison duration as a content repetition degree between the query video and one second candidate video; determining that a content similarity between the query video and the one second candidate video meets the set content similarity requirement when the content repetition degree exceeds a set content repetition degree threshold value; and identifying the one second candidate video as the target video.
  • 20. The non-transitory computer readable storage medium according to claim 19, wherein the comparison duration is a duration value of a shorter video duration in the query video and the one second candidate video.
  • 21. The non-transitory computer readable storage medium according to claim 15, wherein the identifying, from first candidate videos, at least one second candidate video whose associated category similarity to the query video meets a set category similarity requirement based on the first quantization feature comprises: determining a quantization feature distance between the first quantization feature and a second quantization feature of each of the first candidate videos; and determining a first candidate video with a quantization feature distance lower than a preset quantization feature distance threshold value as a second candidate video, each second quantization feature characterizing a video category to which a corresponding at least one first candidate video belongs.
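The two content repetition degree computations recited in the claims above (matching duration over comparison duration, and number of same quantization features over comparison duration) can be illustrated with a short sketch. It is an illustration under assumptions rather than the claimed implementation: durations are assumed to be in seconds, each video is assumed to carry a sequence of quantization codes, and "same quantization features" is interpreted here as the multiset overlap of those sequences. All function and parameter names are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def comparison_duration(query_s: float, candidate_s: float) -> float:
    """Duration of the shorter of the two videos (assumed unit: seconds)."""
    shorter = min(query_s, candidate_s)
    if shorter <= 0:
        raise ValueError("video durations must be positive")
    return shorter


def repetition_by_duration(total_matching_s: float,
                           query_s: float, candidate_s: float) -> float:
    # Ratio between the total matching duration and the comparison duration.
    return total_matching_s / comparison_duration(query_s, candidate_s)


def repetition_by_codes(query_codes: Sequence[Hashable],
                        candidate_codes: Sequence[Hashable],
                        query_s: float, candidate_s: float) -> float:
    # Assumption: "same quantization features" are counted as the multiset
    # intersection of the two code sequences.
    shared = sum((Counter(query_codes) & Counter(candidate_codes)).values())
    return shared / comparison_duration(query_s, candidate_s)


def meets_requirement(degree: float, threshold: float) -> bool:
    # The requirement is met when the degree exceeds the threshold.
    return degree > threshold


# 30 s of matching content, videos of 60 s and 40 s: 30 / 40.
print(repetition_by_duration(30.0, 60.0, 40.0))  # 0.75
print(meets_requirement(0.75, 0.5))              # True
```

Dividing by the shorter duration means a short clip fully contained in a long video scores a degree of 1.0 under the duration-based variant, whereas dividing by the longer duration would understate the overlap.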
Priority Claims (1)
Number Date Country Kind
202110973390.0 Aug 2021 CN national
Continuations (1)
Relation  Number             Date      Country
Parent    PCT/CN2022/105871  Jul 2022  WO
Child     18136538                     US