Claims
- 1. A teleconferencing system comprising:
a plurality of sound-collection means; at least one speaker-imaging means; an image-display means; and an imaging control means, which, based on voice direction information of a speaker obtained from said sound-collection means, changes an imaging direction of said speaker-imaging means imaging the speaker, wherein said imaging control means is configured so as to perform control so as to direct said imaging direction of said speaker-imaging means toward a direction of said speaker predicted by said sound-collection means, and wherein said imaging control means is configured so that movement pixels are extracted from the captured image, and a distribution of the movement pixels is determined, so as to identify a position of said speaker within said image region, and so that, based on said position information of said speaker, further control is performed of said speaker-imaging means so that said speaker is displayed at a prescribed position within an image region.
- 2. A teleconferencing system comprising:
a plurality of sound-collection means; at least one speaker-imaging means; an image-display means; and an imaging control means, which, based on voice direction information of a speaker obtained from said sound-collection means, changes an imaging direction of said speaker-imaging means imaging the speaker, wherein said imaging control means is configured so as to perform control so as to direct said imaging direction of said speaker-imaging means toward a direction of said speaker predicted by said sound-collection means, and wherein said imaging control means is configured so that movement pixels are extracted from the captured image, and a distribution of the movement pixels is determined, so as to identify said position of the speaker within said image, and so that, based on said position information of said speaker, further control is performed of said speaker-imaging means so that the size of said speaker at said prescribed position in said image region is adjusted.
- 3. A teleconferencing system according to claim 2, wherein said imaging control means operates a zoom mechanism of said speaker-imaging means.
- 4. A teleconferencing system according to claim 1, wherein voice direction information of said speaker is used to predict a position of said speaker, making use of a phase difference between speaker voices input to said plurality of sound-collection means.
- 5. A teleconferencing system according to claim 2, wherein voice direction information of said speaker is used to predict a position of said speaker, making use of a phase difference between speaker voices input to said plurality of sound-collection means.
- 6. A teleconferencing system according to claim 1, wherein voice direction information of said speaker is used to predict a position of said speaker, a direction of one of said sound-collection means indicating a maximum value of voice level among the voice levels of said speaker input to each one of said plurality of said sound-collection means being taken as a direction of said speaker.
- 7. A teleconferencing system according to claim 2, wherein voice direction information of said speaker is used to predict a position of said speaker, a direction of one of said sound-collection means indicating a maximum value of voice level among the voice levels of said speaker input to each one of said plurality of said sound-collection means being taken as a direction of said speaker.
- 8. A teleconferencing system according to claim 6, wherein one of said plurality of sound-collection means is assigned to each said speaker.
- 9. A teleconferencing system according to claim 7, wherein one of said plurality of sound-collection means is assigned to each said speaker.
- 10. A teleconferencing system according to claim 1, wherein the number of sound-collection means is different from the number of persons participating in said conference.
- 11. A teleconferencing system according to claim 2, wherein the number of sound-collection means is different from the number of persons participating in said conference.
- 12. A teleconferencing system according to claim 1, wherein the imaging direction of said speaker-imaging means capturing an image of said speaker can be freely changed.
- 13. A teleconferencing system according to claim 2, wherein the imaging direction of said speaker-imaging means capturing an image of said speaker can be freely changed.
- 14. A teleconferencing system according to claim 1, wherein a method of extracting movement pixels from a captured image from said speaker-imaging means in which a speaker exists and determining the movement pixel distribution thereof is that of determining an image differencing between a plurality of adjacent frames of said captured image, and determining a movement pixel distribution formed by movement pixels from the differential image, so as to identify the position of said speaker within said displayed image.
- 15. A teleconferencing system according to claim 2, wherein a method of extracting movement pixels from a captured image from said speaker-imaging means in which a speaker exists and determining the movement pixel distribution thereof is that of determining an image differencing between a plurality of adjacent frames of said captured image, and determining a movement pixel distribution formed by movement pixels from the differential image, so as to identify the position of said speaker within said displayed image.
- 16. A teleconferencing system according to claim 14, wherein a final position at which said speaker is to be displayed within a displayed image is approximately the center of said displayed image.
- 17. A teleconferencing system according to claim 15, wherein a final position at which said speaker is to be displayed within a displayed image is approximately the center of said displayed image.
- 18. A teleconferencing system according to claim 14, wherein said movement pixel distribution is determined by a histogram of a number of pixels with regard to at least one of the horizontal direction and the vertical direction in said displayed image, and a position of said speaker in said image is determined from said histogram.
- 19. A teleconferencing system according to claim 15, wherein said movement pixel distribution is determined by a histogram of a number of pixels with regard to at least one of the horizontal direction and the vertical direction in said displayed image, and a position of said speaker in said image is determined from said histogram.
- 20. A teleconferencing system according to claim 1, wherein in a case in which, based on voice direction information obtained from said sound-collection means, an imaging direction of said speaker-imaging means is directed toward a predicted direction of said speaker, if there is no speaker in said displayed image, a zoom function of said speaker-imaging means is caused to operate, so as to expand an imaging range of said speaker-imaging means, and the existence or non-existence of a speaker is detected again.
- 21. A teleconferencing system according to claim 2, wherein in a case in which, based on voice direction information obtained from said sound-collection means, an imaging direction of said speaker-imaging means is directed toward a predicted direction of said speaker, if there is no speaker in said displayed image, a zoom function of said speaker-imaging means is caused to operate, so as to expand an imaging range of said speaker-imaging means, and the existence or non-existence of a speaker is detected again.
- 22. An imaging means control apparatus in a teleconferencing system comprising:
a plurality of sound-collection means; a speaker-imaging means, which captures an image of a speaker; a speaker direction detection means, which predicts a direction of said speaker based on information from said sound-collection means; a first imaging control means, which changes a facing direction of said speaker-imaging means, based on information of said speaker direction detection means; an image display means, which displays an image captured by said speaker-imaging means that is caused to face to a prescribed direction in response to a control signal from said first imaging control means; a movement pixel detection means, which detects movement pixels from said captured image; a movement distribution measurement means, which measures a movement distribution from movement pixels detected by said movement pixel detection means; a speaker position establishing means, which determines said position of a speaker within said image, based on the measurement results of said movement distribution measurement means; and a second imaging control means, which further controls a facing direction of said speaker-imaging means, based on information of said speaker position establishing means.
- 23. An imaging means control apparatus in a teleconferencing system comprising:
a plurality of sound-collection means; a speaker-imaging means, which captures an image of a speaker; a speaker direction detection means, which predicts a direction of said speaker based on information from said sound-collection means; a first imaging control means, which changes a facing direction of said speaker-imaging means, based on information of said speaker direction detection means; an image display means, which displays an image captured by said speaker-imaging means that is caused to face to a prescribed direction in response to a control signal from said first imaging control means; a movement pixel detection means, which detects movement pixels from said captured image; a movement distribution measurement means, which measures a movement distribution from movement pixels detected by said movement pixel detection means; a speaker position establishing means, which determines said position of a speaker within said image, based on the measurement results of said movement distribution measurement means; and a second imaging control means, which further controls said speaker-imaging means to change a size of said speaker on said image, based on information of said speaker position establishing means.
- 24. A control apparatus according to claim 22, wherein said first imaging control means and said second imaging control means are one and the same imaging control means.
- 25. A control apparatus according to claim 23, wherein said first imaging control means and said second imaging control means are one and the same imaging control means.
- 26. A control apparatus according to claim 22, wherein said speaker direction detection means predicts a direction of a speaker based on either a phase difference or a voice level of speaker voices input to said plurality of sound-collection means.
- 27. A control apparatus according to claim 23, wherein said speaker direction detection means predicts a direction of a speaker based on either a phase difference or a voice level of speaker voices input to said plurality of sound-collection means.
- 28. A control apparatus according to claim 22, wherein said movement pixel detection means creates an image differencing from different captured image frames.
- 29. A control apparatus according to claim 23, wherein said movement pixel detection means creates an image differencing from different captured image frames.
- 30. A control apparatus according to claim 28, wherein said differential image is created by causing storage of successively captured display images for each frame in an appropriate storage means, and determining a differential image either between different selected display image frames from said stored display image frames, or between one display image frame stored in said storage means and a currently obtained display image frame.
- 31. A control apparatus according to claim 29, wherein said differential image is created by causing storage of successively captured display images for each frame in an appropriate storage means, and determining a differential image either between different selected display image frames from said stored display image frames, or between one display image frame stored in said storage means and a currently obtained display image frame.
- 32. A control apparatus according to claim 22, wherein said movement distribution measurement means creates a histogram with regard to pixel information for at least one direction of the horizontal direction and the vertical direction, from said extracted movement pixel information.
- 33. A control apparatus according to claim 23, wherein said movement distribution measurement means creates a histogram with regard to pixel information for at least one direction of the horizontal direction and the vertical direction, from said extracted movement pixel information.
- 34. A control apparatus according to claim 22, wherein said speaker position establishing means verifies a position of a speaker on a display image from said histogram formed by said distribution measurement means.
- 35. A control apparatus according to claim 23, wherein said speaker position establishing means verifies a position of a speaker on a display image from said histogram formed by said distribution measurement means.
- 36. A control apparatus according to claim 22, wherein said speaker position establishing means, in a case in which it is not possible to verify a position at which a speaker exists, expands the region currently imaged by said speaker-imaging means, and from the results executes a verification of said speaker position on said captured image.
- 37. A control apparatus according to claim 23, wherein said speaker position establishing means, in a case in which it is not possible to verify a position at which a speaker exists, expands the region currently imaged by said speaker-imaging means, and from the results executes a verification of said speaker position on said captured image.
- 38. A control apparatus according to claim 23, wherein control by said second imaging control means is performed so as to operate a zoom mechanism of said speaker-imaging means.
- 39. An imaging means control method in a teleconferencing system comprising a plurality of sound-collection means, at least one speaker imaging means, and an imaging control means, which changes an imaging direction of said speaker-imaging means so as to capture an image of the speaker, based on voice direction information with regard to the speaker obtained from said sound-collection means, said method comprising:
a first step of predicting a direction of a speaker from information collected as voice sound by each of said sound-collection means; a second step of causing an imaging direction axis of said speaker-imaging means to be faced to said predicted direction of said speaker by a first imaging control means driving said speaker-imaging means, based on speaker direction information predicted in said first step; a third step of displaying an image captured by said speaker-imaging means on a display means; a fourth step of extracting movement pixel information from said displayed image information; a fifth step of calculating a movement distribution from said extracted movement pixel information; a sixth step of determining a position of said speaker within said captured image, from said movement distribution information; and a seventh step of a second imaging control means causing said speaker-imaging means to move, based on position information of the speaker within a displayed image, so that said image of said speaker is moved to a prescribed position within said captured image.
- 40. An imaging means control method in a teleconferencing system comprising a plurality of sound-collection means, at least one speaker imaging means, and an imaging control means, which changes an imaging direction of said speaker-imaging means so as to capture an image of the speaker, based on voice direction information with regard to the speaker obtained from said sound-collection means, said method comprising:
a first step of predicting a direction of a speaker from information collected as voice sound by each of said sound-collection means; a second step of causing an imaging direction axis of said speaker-imaging means to be faced to said predicted direction of said speaker by a first imaging control means driving said speaker-imaging means, based on speaker direction information predicted in said first step; a third step of displaying an image captured by said speaker-imaging means on a display means; a fourth step of extracting movement pixel information from said displayed image information; a fifth step of calculating a movement distribution from said extracted movement pixel information; a sixth step of determining a position of said speaker within said captured image, from said movement distribution information; and a seventh step of a second imaging control means adjusting a zoom mechanism of said speaker-imaging means so as to adjust a size of said speaker in said captured image based on position information of the speaker within a displayed image.
- 41. A recording medium onto which is stored a program for execution by a computer of an imaging means control method in a teleconferencing system comprising a plurality of sound-collection means, at least one speaker imaging means, and an imaging control means, which changes an imaging direction of said speaker-imaging means so as to capture an image of the speaker, based on voice direction information with regard to the speaker obtained from said sound-collection means, said method comprising:
a first step of predicting a direction of a speaker from information collected as voice sound by each of said sound-collection means; a second step of causing an imaging direction axis of said speaker-imaging means to be pointed at said predicted position of said speaker by a first imaging control means driving said speaker-imaging means, based on speaker direction information predicted in said first step; a third step of storing an image captured by said speaker-imaging means; a fourth step of extracting movement pixel information from said stored image; a fifth step of calculating a movement distribution from said extracted movement pixel information; a sixth step of determining a position of said speaker within said captured image, from said movement distribution information; and a seventh step of a second imaging control means causing said speaker-imaging means to move, based on position information of the speaker within a displayed image, so that said image of said speaker is moved to a prescribed position within said captured image.
- 42. A recording medium onto which is stored a program for execution by a computer of an imaging means control method in a teleconferencing system comprising a plurality of sound-collection means, at least one speaker imaging means, and an imaging control means, which changes an imaging direction of said speaker-imaging means so as to capture an image of the speaker, based on voice direction information with regard to the speaker obtained from said sound-collection means, said method comprising:
a first step of predicting a direction of a speaker from information collected as voice sound by each of said sound-collection means; a second step of causing an imaging direction axis of said speaker-imaging means to be pointed at said predicted position of said speaker by a first imaging control means driving said speaker-imaging means, based on speaker direction information predicted in said first step; a third step of storing an image captured by said speaker-imaging means; a fourth step of extracting movement pixel information from said stored image; a fifth step of calculating a movement distribution from said extracted movement pixel information; a sixth step of determining a position of said speaker within said captured image, from said movement distribution information; and a seventh step of a second imaging control means adjusting a zoom mechanism of said speaker-imaging means so as to adjust a size of said speaker in said captured image.
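For illustration only, the image-processing steps recited in claims 14 to 19 (inter-frame differencing, a movement pixel histogram in the horizontal and vertical directions, and a speaker position relative to the approximate image center) can be sketched in Python as follows. This is a minimal sketch under stated assumptions and not the claimed implementation: the function names, the grayscale list-of-rows frame format, the difference threshold of 20, and the use of the histogram's weighted mean as the speaker position are all hypothetical choices that the claims do not specify.

```python
from typing import List, Optional, Tuple

# Assumption: a frame is a grayscale image given as rows of intensities (0-255).
Frame = List[List[int]]


def movement_pixels(prev: Frame, curr: Frame, threshold: int = 20) -> List[List[int]]:
    """Return a binary mask: 1 where the absolute inter-frame difference exceeds threshold."""
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev, curr)
    ]


def histograms(mask: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Count movement pixels per column (horizontal) and per row (vertical)."""
    per_column = [sum(col) for col in zip(*mask)]
    per_row = [sum(row) for row in mask]
    return per_column, per_row


def weighted_center(hist: List[int]) -> Optional[float]:
    """Weighted mean index of a histogram, or None if it contains no movement pixels."""
    total = sum(hist)
    if total == 0:
        return None
    return sum(i * n for i, n in enumerate(hist)) / total


def speaker_offset(prev: Frame, curr: Frame, threshold: int = 20) -> Optional[Tuple[float, float]]:
    """Offset (dx, dy) in pixels of the estimated speaker position from the image center.

    Returns None when no movement pixels are found, which a caller could treat
    as "no speaker detected" (for example, to widen the imaging range and retry).
    """
    per_column, per_row = histograms(movement_pixels(prev, curr, threshold))
    cx, cy = weighted_center(per_column), weighted_center(per_row)
    if cx is None or cy is None:
        return None
    center_x = (len(per_column) - 1) / 2
    center_y = (len(per_row) - 1) / 2
    return cx - center_x, cy - center_y
```

A controller could pan and tilt the camera in the direction that drives the returned offset toward zero, which corresponds to displaying the speaker at approximately the center of the image. How pixel offsets map to camera angles depends on the lens and zoom setting and is left out of this sketch.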
Priority Claims (1)
Number | Date | Country | Kind
2000-157354 | May 2000 | JP |
SPECIFICATION
[0001] Teleconferencing system, camera controller for a teleconferencing system, and camera control method for a teleconferencing system.