Field of the Invention
The invention relates generally to musical instruments and, in particular, to techniques suitable for use in portable device hosted implementations of musical instruments for capture and rendering of musical performances with game-play features.
Related Art
The proliferation of mobile, indeed social, music technology presents opportunities for increasingly sophisticated, yet widely deployable, tools for musical composition and performance. See generally, L. Gaye, L. E. Holmquist, F. Behrendt, and A. Tanaka, “Mobile music technology: Report on an emerging community” in Proceedings of the International Conference on New Interfaces for Musical Expression, pages 22-25, Paris, France (2006); see also, G. Wang, G. Essl, and H. Penttinen, “Do Mobile Phones Dream of Electric Orchestras?” in Proceedings of the International Computer Music Conference, Belfast (2008). Indeed, applications such as the Smule Ocarina™, Leaf Trombone®, I Am T-Pain™, AutoRap®, Sing! Karaoke™, Guitar! By Smule®, and Magic Piano® apps available from Smule, Inc. have shown that advanced digital acoustic techniques may be delivered on iPhone®, iPad®, iPod Touch® and other iOS® or Android devices in ways that provide users and listeners alike with compelling musical experiences.
As mobile music technology matures and as new social networking and monetization paradigms emerge, improved techniques and solutions are desired that build on well understood musical interaction paradigms but unlock new opportunities for musical composition, performance and collaboration amongst a new generation of artists using a new generation of audiovisual capable devices and compute platforms.
Despite practical limitations imposed by mobile device platforms and applications, truly captivating musical instruments may be synthesized in ways that allow musically expressive performances to be captured and rendered in real-time. In some cases, synthetic musical instruments can provide a game, grading or instructional mode in which one or more qualities of a user's performance are assessed relative to a musical score. By providing a range of modes (from score-assisted to fully user-expressive) and, in some cases, by adapting to the level of a given user musician's skill, user interactions with synthetic musical instruments can be made more engaging and may capture user interest over generally longer periods of time. Indeed, as economics of application software markets (at least those for portable handheld device type software popularized by Apple's iTunes Store for Apps and the Google Play! Android marketplace) transition from initial purchase price revenue models to longer term and recurring monetization strategies, such as through in-app purchases and/or subscriptions, user and group affinity characterization and social networking ties, importance of long term user engagement with an application or suite is increasing.
To those ends, techniques have been developed to tailor and adapt the user musician experience and to maintain long term engagement with apps and app suites. Some of those techniques can be realized in synthetic musical instrument implementations in which captured dynamics of user gestures (such as finger contact forces applied to a multi-touch sensitive display or surface and/or the temporal extent of sustained contact thereon) convey to the digital synthesis expressive aspects of a user's performance. Performance adaptive tempos may also be supported in some cases. Responsiveness of the digital synthesis to captured dynamics may be based on a self-reported level of musical skill or on that observable by the synthetic musical instrument or related computational facilities. In this way, amateur and expert users can be provided with very different, but appropriately engaging, user experiences. Similarly, a given user's experience can be adapted as the user's skill level improves.
More specifically, aspects of these performance- and/or skill-adaptive techniques have been concretely realized in synthetic musical instrument applications that are responsive to force touch dynamics of user digit (finger or thumb) contacts. These and other realizations will be understood in the context of specific implementations and teaching examples that follow, including those that pertain to synthetic piano- or keyboard-type musical instrument application software suitable for execution on portable handheld computing devices of the type popularized by iOS and Android smartphones and pad/tablet devices. In some exemplary implementations, visual cues presented on a multi-touch sensitive display provide the user with temporally sequenced note and chord selections throughout a performance in accordance with the musical score. Note soundings are indicated by user gestures captured at the multi-touch sensitive display and may include measures of contact forces applied to the multi-touch sensitive display. In some cases, forces may be quantified by force responsive elements of the display itself. In some cases, auxiliary measurements, such as using an onboard accelerometer, may be employed. One or more measures of correspondence between actual note soundings (including finger or thumb contact dynamics) and a score-coded temporal sequence of notes (including onset, velocity and/or sustain attributes thereof) and chord selections are used to grade the user's performance.
In general, both visual cuing and note sounding gestures may be particular to the synthetic musical instrument implemented. For example, in a piano configuration or embodiment reminiscent of that popularized by Smule, Inc. in its Magic Piano application for iPad, iPhone and Android devices, user digit contacts (i.e., finger and/or thumb contacts, referred to hereinafter simply as “finger contacts”) at laterally displaced positions on the multi-touch sensitive display constitute gestures indicative of key strikes, and a digital synthesis of a piano is used to render an audible performance in correspondence with captured user gestures. By using a multi-touch sensitive display that is responsive to the level of pressure or force applied (such as in Force Touch or 3D Touch enabled displays), multiple dimensions of user-expressed note sounding gestures may be captured and conveyed to the digital synthesis of piano key strikes. A piano roll style set of visual cues provides the user with note and chord selections. In some cases, desired note velocity and/or sustained after-touch expressions of a performance may also be visually cued. While the visual cues are driven by a musical score and revealed/advanced at a current performance tempo, it is the user's gestures that actually drive the audible performance rendering. Given this decoupling, the user's performance (as captured and audibly rendered) often diverges (in note velocity, note onset, sustained after-touch and/or tempo) from a precise score-coded version and/or visual cues based thereon. These divergences, particularly divergences from musically scored note velocity, sustained after-touch and tempo, can be expressive or simply errant.
As will be appreciated, pleasing musical performances are generally not contingent upon performing to a strict set of note velocities, sustains and/or tempo(s) as musically scored. Rather, variations in tempo are commonly used as intentional musical artifacts by performers, speeding up or slowing down phrases or individual notes to add emphasis. The same is true of sounded (or voiced) note velocity, note sustain, and after-touch key pressure. Indeed, these modulations in tempo (onset and sustain) as well as note velocity and/or after-touch (or post-onset key pressure) are often described as “expressiveness.” Techniques described herein aim to allow users to be expressive while remaining, generally speaking, rhythmically and otherwise consistent with the musical score.
Accordingly, techniques have been developed to support user expressivity by mapping force-quantified (or force-estimated) aspects of user expressed note sounding gestures captured at a multi-touch sensitive display to musically expressive parameters such as sounded note velocity or after-touch key pressure. Such musically expressive parameters are, in turn, supplied to the digital synthesis in a Magic Piano type instrument. By conveying these newly captured (or capturable) parameters of note sounding gestural expression to the digital synthesis, it is possible for a synthetic music instrument implemented on a portable computing device to support greater levels of user expression.
In addition, selective override or modulation of existing expressive parameters represented in a MIDI-encoded score file may be provided. In some cases or embodiments, the degree of override or modulation of one or more score coded parameters such as sounded note velocity, sustain or after-touch key pressure may be a design choice or mode setting. In other cases or embodiments, the degree of override or modulation may be exposed by a user interface of the synthetic music instrument. For example, an expression slider (or rotary knob, or even a set of buttons) may be provided as a user interface element to allow the user to decide how much expressive control they want. In this way, a no (or low) expression setting may provide an operating mode in which notes are visually cued based on score-coded note “on” timing, but actually sounded based on user gestures, though with note velocity (loudness/timbre) and duration (note-off time) determined by the score. On the other hand, a full expression setting may place note velocity and note sustain under control of the user based on captured dynamics of finger contact (e.g., measured or estimated striking contact force and after-touch pressure). Between the extremes, intermediate levels of expression may be selected such that half way would allow user expressed finger contact dynamics and score coded dynamics to contribute equally, a quarter way would be mostly score, while three quarters would be mostly user expression. Linear or other “parameter cross-fade” curves can be used, depending on the function being controlled.
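By way of illustration only, a linear parameter cross-fade of the kind described above might be sketched as follows. The function name `blend_velocity`, the normalized `expression` setting in the range 0.0 to 1.0, and the clamping to MIDI note-on velocities 1 through 127 are assumptions made for this sketch and are not taken from any particular implementation.

```python
def blend_velocity(score_velocity: int, user_velocity: int, expression: float) -> int:
    """Linearly cross-fade between score-coded and user-expressed velocity.

    expression = 0.0 selects the score-coded velocity only;
    expression = 1.0 selects the user-expressed velocity only.
    (Hypothetical helper; the parameter names are illustrative.)
    """
    # Clamp the slider setting to the valid cross-fade range.
    e = min(max(expression, 0.0), 1.0)
    blended = (1.0 - e) * score_velocity + e * user_velocity
    # Keep the result within the MIDI note-on velocity range 1..127.
    return max(1, min(127, round(blended)))


if __name__ == "__main__":
    for setting in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(setting, blend_velocity(80, 40, setting))
```

At a three-quarter setting, the result is weighted mostly toward the user-expressed value, consistent with the intermediate levels described above. A non-linear cross-fade curve could be substituted by reshaping `e` before the blend.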
Typically, visual cues are supplied for score-coded notes in a manner that suggests to a user a timing (start time) for expressing a note sounding gesture. In some cases or embodiments, visual cues may be selected to also suggest to a user note velocity (loudness/timbre), duration or sustain (note-off time), and/or after-touch pressure. In such cases, correspondence of a user's actual performance with cued note velocity, sustain or after-touch may be measured as part of performance evaluation, grading or gameplay.
Techniques have likewise been developed to optionally and adaptively adjust a current value of target tempo against which timings of successive note or chord soundings are evaluated or graded. Tempo adaptation is based on actual tempo of the user's performance and, in some embodiments, includes filtering or averaging over multiple successive note soundings with suitable non-linearities (e.g., dead band(s), rate of change limiters or caps, hysteresis, etc.) in the computational implementation. In some cases, changes to the extent and parameterization of filtering or averaging windows may themselves be coded in correspondence with the musical score. In any case, by repeatedly recalculating the current value for target tempo throughout the course of the user's performance, both the pace of successive visual cues and the temporal baseline against which successive user note/chord soundings are evaluated or graded may be varied.
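One possible form of such tempo adaptation is sketched below, purely as an example. The class name `TempoTracker`, the moving-average window of four soundings, the 3% dead band and the 10% per-update rate cap are all assumed values chosen for illustration; hysteresis is omitted for brevity.

```python
from collections import deque


class TempoTracker:
    """Adapts a target tempo toward the tempo observed in recent note soundings.

    Hypothetical sketch: window size, dead band and rate cap are illustrative.
    """

    def __init__(self, initial_bpm: float, window: int = 4,
                 dead_band: float = 0.03, max_step: float = 0.10) -> None:
        self.target_bpm = float(initial_bpm)
        self.recent = deque(maxlen=window)  # most recent observed tempos
        self.dead_band = dead_band          # relative deviation that is ignored
        self.max_step = max_step            # cap on relative change per update

    def update(self, observed_bpm: float) -> float:
        """Fold one observed tempo into the average and return the new target."""
        self.recent.append(float(observed_bpm))
        mean_bpm = sum(self.recent) / len(self.recent)
        relative = (mean_bpm - self.target_bpm) / self.target_bpm
        # Dead band: small deviations leave the target tempo unchanged.
        if abs(relative) <= self.dead_band:
            return self.target_bpm
        # Rate limiter: bound how far the target may move in a single update.
        step = max(-self.max_step, min(self.max_step, relative))
        self.target_bpm *= 1.0 + step
        return self.target_bpm
```

In this sketch, a single errant sounding moves the target tempo by at most the rate cap, while a consistent acceleration across the averaging window is followed over several updates.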
In this way, expressive accelerations and decelerations of performance tempo are tolerated in the performance evaluation/grading and are adapted-to in the supply of successive note/chord sounding visual cues. Discrimination between expressive and merely errant/random divergences from a current target tempo may be based on consistency of the tempo over a filtering or averaging window. In some cases, expressive accelerations and decelerations of performance tempo may not only be tolerated, but may themselves contribute as a quality metric to the evaluation or grading of a user's performance.
In general, audible rendering includes synthesis of tones, overtones, harmonics, perturbations and amplitudes and other performance characteristics based on the captured gesture stream. In some cases, rendering of the performance includes audible rendering by converting to acoustic energy a signal synthesized from the gesture stream encoding (e.g., by driving a speaker). In some cases, the audible rendering is on the very device on which the musical performance is captured. In some cases, the gesture stream encoding is conveyed to a remote device whereupon audible rendering converts a synthesized signal to acoustic energy.
Thus, in some embodiments, a synthetic musical instrument (such as a synthetic piano, guitar or trombone) allows the human user to control a parameterized synthesis or, in some cases, an actual expressive model of a vibrating string or resonating column of air, using multi-sensor interactions (key strikes, fingers on strings or at frets, strumming, covering of holes, etc.) via a multi-touch sensitive display. The user is actually causing the sound and controlling the parameters affecting pitch, quality, etc.
In some embodiments, a storybook mode provides lesson plans which teach the user to play the synthetic instrument and exercise. User performances may be graded (or scored) as part of a game (or social-competitive application framework), and/or as a proficiency measure for advancement from one stage of a lesson plan to the next. In general, better performance lets the player (or pupil) advance faster. High scores both encourage the pupil (user) and allow the system to know how quickly to advance the user to the next level and, in some cases, along which game or instructive pathway. In each case, the user is playing a real/virtual model of an instrument, and their gestures actually control the sound, timing, etc.
Often, both the device on which a performance is captured and that on which the corresponding gesture stream encoding is rendered are portable, even handheld devices, such as pads, mobile phones, personal digital assistants, smart phones, media players, or book readers. In some cases, rendering is to a conventional audio encoding such as AAC, MP3, etc. In some cases, rendering to an audio encoding format is performed on a computational system with substantial processing and storage facilities, such as a server on which appropriate CODECs may operate and from which content may thereafter be served. Often, the same gesture stream encoding of a performance may (i) support local audible rendering on the capture device, (ii) be transmitted for audible rendering on one or more remote devices that execute a digital synthesis of the musical instrument and/or (iii) be rendered to an audio encoding format to support conventional streaming or download.
In some embodiments in accordance with the present invention(s), a method includes using a portable computing device as a synthetic musical instrument, presenting a user of the synthetic musical instrument with visual cues on a multi-touch sensitive display of the portable computing device, capturing note sounding gestures indicated by the user based on finger contacts with the multi-touch sensitive display, and audibly rendering a performance on the portable computing device in real-time correspondence with the captured note sounding gestures, including the finger contact dynamics thereof. The presented visual cues are indicative of temporally sequenced note selections in accord with a musical score. Individual ones of the captured note sounding gestures are characterized, at least in part, based on position and dynamics of finger contact with the multi-touch sensitive display.
In some cases or embodiments, the finger contact dynamics include a characterization of finger contact force applied to the multi-touch sensitive display, and the characterization of finger contact force is used as at least a contributing indicator for velocity with which a corresponding note is sounded in the audibly rendered performance. In some embodiments, for member notes of a chord sounded in the audibly rendered performance, the method further includes applying a generally uniform velocity based on the characterization of at least one corresponding finger contact force. In some embodiments, for member notes of a chord sounded in the audibly rendered performance, the method further includes applying individual velocities based, at least in part, on characterizations of respective finger contact forces.
In some cases or embodiments, the finger contact force is characterized at the portable computing device based on sensitivity of the multi-touch sensitive display itself to a range of applied force magnitudes. In some cases or embodiments, the characterization of finger contact force includes a remapping from a multi-touch sensitive display contact force data domain to a mapped range of note velocities for the synthetic musical instrument. In some cases or embodiments, the synthetic musical instrument includes a piano or keyboard, and the remapping is in accord with a normalized half-sigmoidal-type mapping function.
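For purposes of illustration, one remapping of this general type may be sketched as follows. The sketch assumes that contact force has been normalized against a maximum reportable force, uses the positive half of a hyperbolic tangent as the sigmoid, and normalizes it so that maximum force maps to maximum velocity. The function name, the `steepness` value and the 1 to 127 velocity range are assumptions of the sketch rather than features of any particular embodiment.

```python
import math


def force_to_velocity(force: float, max_force: float = 1.0,
                      steepness: float = 3.0,
                      v_min: int = 1, v_max: int = 127) -> int:
    """Map a contact force reading to a note velocity on a half-sigmoid curve.

    Hypothetical helper: tanh is used here as one example of a sigmoid;
    only its positive half (x >= 0) is used, rescaled to end exactly at 1.
    """
    # Normalize and clamp the force reading to the range 0..1.
    x = min(max(force / max_force, 0.0), 1.0)
    # Normalized half-sigmoid: rises quickly for light touches, then flattens.
    shaped = math.tanh(steepness * x) / math.tanh(steepness)
    return round(v_min + shaped * (v_max - v_min))


if __name__ == "__main__":
    for f in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f, force_to_velocity(f))
```

A curve of this shape spreads light and moderate contact forces across much of the velocity range, which may suit displays on which users rarely reach the maximum reportable force.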
In some cases or embodiments, the finger contact force is characterized at the portable computing device based on accelerometer data associable with the finger contact. In some cases or embodiments, the finger contact dynamics further include both onset and release of a finger contact. A temporal extent of the finger contact, from onset to release, is used as at least a contributing indicator for sustaining of a corresponding note sounded in the audibly rendered performance. In some cases or embodiments, the finger contact dynamics further include after-touch dynamics used as at least a contributing indicator for vibrato or bend of a corresponding note sounded in the audibly rendered performance.
In some embodiments wherein the musical score encodes a temporal sequencing of note selections together with corresponding dynamics, the method further includes: (1) for at least a subset of the captured note sounding gestures, computing effective note sounding dynamics based, for a given note sounding gesture, on both: the score-coded dynamics for the corresponding note selection and user-expressed dynamics of finger contact with the multi-touch sensitive display, and (2) audibly rendering the performance on the portable computing device in real-time correspondence with the captured note sounding gestures based on the computed effective note sounding dynamics.
In some embodiments, the method further includes computing the effective note sounding dynamics as a function that includes a weighted sum of the score-coded and user-expressed dynamics. In some cases or embodiments, the weighted sum includes an approximately 25% contribution in accord with score-coded note velocities and an approximately 75% contribution in accord with user-expressed note sounding velocity characterized based on finger contact forces applied to the multi-touch sensitive display. In some embodiments, the method further includes varying comparative contributions of score-coded dynamics and user-expressed dynamics to the computed effective note sounding dynamics based on a user interface control.
In some cases or embodiments, the user interface control is provided, at least in part, using a slider, knob or selector visually presented on the multi-touch sensitive display. The user interface control provides either or both of: a predetermined set of values for the comparative contributions and an effectively continuous variation of the comparative contributions. In some embodiments, the method further includes dynamically varying the comparative contributions.
In some embodiments, the method further includes dynamically varying (based on the musical score) during a course of the performance comparative contributions of score-coded dynamics and user-expressed dynamics to the computed effective note sounding dynamics. In some embodiments, the method further includes computing the effective note sounding dynamics as a function that modulates score-coded note velocities based on characterization of user-expressed finger contact forces applied to the multi-touch sensitive display in connection with the particular note sounding gestures.
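As a further example, and in contrast to a weighted sum, a modulating function may scale the score-coded velocity up or down according to the applied force. The sketch below assumes a force value normalized to the range 0.0 to 1.0 and a hypothetical `depth` parameter that bounds the scaling; neither the function name nor the scaling law is drawn from a particular embodiment.

```python
def modulate_velocity(score_velocity: int, force: float, depth: float = 0.5) -> int:
    """Scale a score-coded velocity by a factor driven by normalized contact force.

    Hypothetical helper: a mid-range force (0.5) leaves the score-coded value
    unchanged; lighter or firmer contacts scale it by up to +/- depth.
    """
    f = min(max(force, 0.0), 1.0)
    # Scaling factor ranges from (1 - depth) at f = 0 to (1 + depth) at f = 1.
    factor = 1.0 + depth * (2.0 * f - 1.0)
    # Keep the result within the MIDI note-on velocity range 1..127.
    return max(1, min(127, round(score_velocity * factor)))
```

Because the user's force acts as a multiplier, the relative dynamics coded in the score (for example, an accented note among softer ones) are preserved while the overall intensity follows the user's touch.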
In some cases or embodiments, the presentation of visual cues is in correspondence with a target tempo. The method further includes repeatedly recalculating a current value for the target tempo throughout the performance by the user and thereby varying, at least partially in correspondence with an actual performance tempo indicated by the captured note sounding gestures, a pace at which visual cues for successive ones of the note selections arrive at a sounding zone of the multi-touch sensitive display. In some cases or embodiments, the repeatedly recalculating includes, for at least a subset of successive note sounding gestures: computing a distance from an expected sounding of the corresponding visually cued note selection and updating the current value for the target tempo based on a function of the computed distance.
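A simple instance of such a distance-based update is sketched below by way of example. The sketch assumes that the distance is measured in seconds between the expected and actual sounding times, that a fraction (`gain`) of that distance is added to the current beat period, and that the result is bounded by hypothetical minimum and maximum tempos. All names and constants are illustrative.

```python
def update_target_tempo(target_bpm: float, expected_time: float, actual_time: float,
                        gain: float = 0.5,
                        min_bpm: float = 40.0, max_bpm: float = 240.0) -> float:
    """Return an updated target tempo based on how early or late a note sounded.

    Hypothetical helper: times are in seconds; a positive distance (late
    sounding) lengthens the beat period and therefore lowers the tempo.
    """
    distance = actual_time - expected_time
    beat_period = 60.0 / target_bpm
    # Move the beat period by a fraction of the timing distance.
    new_period = beat_period + gain * distance
    # Bound the period so that the resulting tempo stays at or below max_bpm.
    new_period = max(new_period, 60.0 / max_bpm)
    # Bound the resulting tempo at or above min_bpm.
    return max(min_bpm, 60.0 / new_period)
```

The updated value may then be used both to pace the arrival of subsequent visual cues at the sounding zone and as the baseline for evaluating the next sounding.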
In some embodiments, the method further includes determining correspondence of respective captured note sounding gestures with the visual cues and grading the user's performance based on the determined correspondences. In some embodiments, the method further includes presenting the user with visual cues indicative of score-coded note velocities, wherein the determined correspondences include correspondence of score-coded note velocities with note velocities actually expressed by the user's note sounding gestures. In some cases or embodiments, the determined correspondences include one or more of: (i) a measure of temporal correspondence of a particular note sounding gesture with arrival of a visual cue in a sounding zone, (ii) a measure of note selection correspondence of the particular note sounding gesture with the visual cue, and (iii) a measure of correspondence of finger contact dynamics for the particular note sounding gesture with visually cued note velocity.
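The three measures of correspondence enumerated above may be illustrated with the following sketch. The `Note` record, the 0.25 second timing window and the linear scoring of each measure on a 0.0 to 1.0 scale are assumptions adopted for the example; an actual grading scheme may weight or combine the measures differently.

```python
from dataclasses import dataclass


@dataclass
class Note:
    """A cued or sounded note (hypothetical record for this sketch)."""
    pitch: int      # MIDI note number
    time: float     # seconds; cue arrival in the sounding zone, or gesture onset
    velocity: int   # 1..127, score-coded or user-expressed


def grade_note(cue: Note, played: Note, timing_window: float = 0.25) -> dict:
    """Score one sounded note against its visual cue on three measures."""
    # (i) Temporal correspondence: 1.0 when exactly on time, 0.0 at or
    # beyond the edge of the timing window.
    timing = max(0.0, 1.0 - abs(played.time - cue.time) / timing_window)
    # (ii) Note selection correspondence: the cued pitch was sounded or not.
    selection = 1.0 if played.pitch == cue.pitch else 0.0
    # (iii) Dynamics correspondence: closeness of expressed to cued velocity,
    # scaled by the widest possible velocity difference (127 - 1).
    dynamics = 1.0 - abs(played.velocity - cue.velocity) / 126.0
    return {"timing": timing, "selection": selection, "dynamics": dynamics}
```

Per-note results of this kind could be accumulated across a performance to produce an overall grade or game score.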
In some cases or embodiments, the presented visual cues traverse at least a portion of the multi-touch sensitive display toward a sounding zone. In some cases or embodiments, the synthetic musical instrument is a piano or keyboard, and the visual cues travel across the multi-touch sensitive display and represent, in one dimension of the multi-touch sensitive display, desired key contacts in accordance with notes of the score and, in a second dimension generally orthogonal to the first, temporal sequencing of the desired key contacts. In some cases or embodiments, the synthetic musical instrument is a string instrument, and the visual cues code, in one dimension of the multi-touch sensitive display, desired contact with corresponding ones of the strings in accordance with the score and, in a second dimension generally orthogonal to the first, temporal sequencing of the desired contacts paced in accord with the current value of the target tempo. In some cases or embodiments, the captured note sounding gestures are indicative of both string excitation and pitch selection for the excited string.
In some embodiments, the method further includes presenting on the multi-touch sensitive display a lesson plan of exercises, wherein the captured note selection gestures correspond to performance by the user of a particular one of the exercises, and advancing the user to a next exercise of the lesson plan based on a grading of the user's performance of the particular exercise.
In some cases or embodiments, the portable computing device includes a communications interface and the method further includes transmitting an encoded stream of the note sounding gestures via the communications interface for rendering of the performance on a remote device.
In some cases or embodiments, the audible rendering includes: modeling acoustic response for one of a piano, a guitar, a violin, a viola, a cello, a double bass, organ(s) and an accordion, and driving the modeled acoustic response with inputs corresponding to the captured note sounding gestures and, for at least some of the captured note sounding gestures, a combination of score-coded and user-expressed dynamics. In some cases or embodiments, the portable computing device is selected from the group of: a compute pad; a personal digital assistant or book reader; and a mobile phone or media player.
In some embodiments, the method further includes geocoding the transmitted gesture stream and displaying a geographic origin for, and in correspondence with audible rendering of, another user's performance encoded as another stream of note sounding gestures received via the communications interface directly or indirectly from a remote device.
In some embodiments in accordance with the present invention, a method includes using a portable computing device as a synthetic musical instrument; presenting a user of the synthetic musical instrument with visual cues on a multi-touch sensitive display of the portable computing device, the presented visual cues indicative of temporally sequenced note selections in accord with a musical score, wherein the musical score further encodes dynamics for at least some of the note selections; capturing note sounding gestures indicated by the user based on finger contacts with the multi-touch sensitive display, wherein individual ones of the captured note sounding gestures are characterized, at least in part, based on position and dynamics of finger contact with the multi-touch sensitive display; for at least a subset of the captured note sounding gestures, computing effective note sounding dynamics based, for a given note sounding gesture, on both the score-coded dynamics for the corresponding note selection and user-expressed dynamics of finger contact with the multi-touch sensitive display; and audibly rendering the performance on the portable computing device in real-time correspondence with the captured note sounding gestures based on the computed effective note sounding dynamics.
In some embodiments, the method further includes computing the effective note sounding dynamics as a function that includes a weighted sum of the score-coded and user-expressed dynamics. In some cases or embodiments, the weighted sum includes an approximately 25% contribution in accord with score-coded note velocities and an approximately 75% contribution in accord with user-expressed note sounding velocity characterized based on finger contact forces applied to the multi-touch sensitive display.
In some embodiments, the method further includes varying comparative contributions of score-coded dynamics and user-expressed dynamics to the computed effective note sounding dynamics based on a user interface control. In some cases or embodiments, the user interface control is provided, at least in part, using a slider, knob or selector visually presented on the multi-touch sensitive display. The user interface control provides either or both of: a predetermined set of values for the comparative contributions and an effectively continuous variation of the comparative contributions. In some embodiments, the method further includes dynamically varying the comparative contributions.
In some embodiments, the method further includes dynamically varying (based on the musical score) during a course of the performance comparative contributions of score-coded dynamics and user-expressed dynamics to the computed effective note sounding dynamics.
In some embodiments, the method further includes computing the effective note sounding dynamics as a function that modulates score-coded note velocities based on characterization of user-expressed finger contact forces applied to the multi-touch sensitive display in connection with the particular note sounding gestures.
In some cases or embodiments, the finger contact dynamics include a characterization of finger contact force applied to the multi-touch sensitive display, and the characterization of finger contact force is used as at least a contributing indicator for velocity with which a corresponding note is sounded in the audibly rendered performance. In some embodiments, for member notes of a chord sounded in the audibly rendered performance, the method further includes applying a generally uniform velocity based on the characterization of at least one corresponding finger contact force. In some embodiments, for member notes of a chord sounded in the audibly rendered performance, the method further includes applying individual velocities based, at least in part, on characterizations of respective finger contact forces.
In some cases or embodiments, the finger contact force is characterized at the portable computing device based on sensitivity of the multi-touch sensitive display itself to a range of applied force magnitudes. In some cases or embodiments, the characterization of finger contact force includes a remapping from a multi-touch sensitive display contact force data domain to a mapped range of note velocities for the synthetic musical instrument. In some cases or embodiments, the synthetic musical instrument includes a piano or keyboard and the remapping is in accord with a normalized half-sigmoidal-type mapping function. In some cases or embodiments, the remapping is in accord with a cosine, exponential, log, raised cosine or arctangent-type mapping function.
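The alternative mapping functions named above may each be normalized so that zero force maps to 0.0 and maximum force maps to 1.0. The following sketch shows one possible normalization of each; the specific formulas, the shape constant `k` and the names used are assumptions of the sketch and are given for illustration only.

```python
import math


def _clamp01(x: float) -> float:
    return min(max(x, 0.0), 1.0)


def cosine_curve(x: float) -> float:
    """Quarter-cycle cosine: gentle start, steeper toward full force."""
    return 1.0 - math.cos(0.5 * math.pi * _clamp01(x))


def raised_cosine_curve(x: float) -> float:
    """Half-cycle raised cosine: S-shaped, flat at both extremes."""
    return 0.5 * (1.0 - math.cos(math.pi * _clamp01(x)))


def exponential_curve(x: float, k: float = 4.0) -> float:
    """Exponential: compresses light touches, expands firm ones."""
    return math.expm1(k * _clamp01(x)) / math.expm1(k)


def log_curve(x: float, k: float = 4.0) -> float:
    """Logarithmic: expands light touches, compresses firm ones."""
    return math.log1p(k * _clamp01(x)) / math.log1p(k)


def arctangent_curve(x: float, k: float = 4.0) -> float:
    """Arctangent: fast initial rise that flattens toward full force."""
    return math.atan(k * _clamp01(x)) / math.atan(k)


# Hypothetical registry allowing a curve to be selected per instrument.
CURVES = {
    "cosine": cosine_curve,
    "raised_cosine": raised_cosine_curve,
    "exponential": exponential_curve,
    "log": log_curve,
    "arctangent": arctangent_curve,
}


def to_velocity(normalized_force: float, curve: str = "log") -> int:
    """Map normalized force (0..1) to a velocity in 1..127 via the named curve."""
    return round(1 + 126 * CURVES[curve](normalized_force))
```

Selection among such curves may depend on the synthetic musical instrument and on the force response of the particular display.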
In some cases or embodiments, the finger contact force is characterized at the portable computing device based on accelerometer data associable with the finger contact. In some embodiments, the method further includes determining correspondence of respective captured note sounding gestures with the visual cues and grading the user's performance based on the determined correspondences. In some embodiments, the method further includes presenting the user with visual cues indicative of score-coded note velocities, wherein the determined correspondences include correspondence of score-coded note velocities with note velocities actually expressed by the user's note sounding gestures.
In some cases or embodiments, the determined correspondences include one or more of: (i) a measure of temporal correspondence of a particular note sounding gesture with arrival of a visual cue in a sounding zone, (ii) a measure of note selection correspondence of the particular note sounding gesture with the visual cue, and (iii) a measure of correspondence of finger contact dynamics for the particular note sounding gesture with visually cued note velocity.
In some cases or embodiments, the presented visual cues traverse at least a portion of the multi-touch sensitive display toward a sounding zone. In some cases or embodiments, the synthetic musical instrument is a piano or keyboard. The visual cues travel across the multi-touch sensitive display and represent, in one dimension of the multi-touch sensitive display, desired key contacts in accordance with notes of the score and, in a second dimension generally orthogonal to the first, temporal sequencing of the desired key contacts. In some cases or embodiments, the synthetic musical instrument is a string instrument, and the visual cues code, in one dimension of the multi-touch sensitive display, desired contact with corresponding ones of the strings in accordance with the score and, in a second dimension generally orthogonal to the first, temporal sequencing of the desired contacts paced in accord with the current value of the target tempo. In some cases or embodiments, the captured note sounding gestures are indicative of both string excitation and pitch selection for the excited string.
In some embodiments, the method further includes presenting on the multi-touch sensitive display a lesson plan of exercises, wherein the captured note selection gestures correspond to performance by the user of a particular one of the exercises; and advancing the user to a next exercise of the lesson plan based on a grading of the user's performance of the particular exercise.
In some cases or embodiments, the portable computing device includes a communications interface, and the method further includes transmitting an encoded stream of the note sounding gestures via the communications interface for rendering of the performance on a remote device.
In some cases or embodiments, the audible rendering includes: modeling acoustic response for one of a piano, a guitar, a violin, a viola, a cello and a double bass; and driving the modeled acoustic response with inputs corresponding to the captured note sounding gestures and, for at least some of the captured note sounding gestures, a combination of score-coded and user-expressed dynamics.
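One way such a combination of score-coded and user-expressed dynamics might be computed is a weighted blend of two velocities, sketched below. The linear force-to-velocity mapping, the `user_weight` parameter and the function names are assumptions made for illustration; the described embodiments do not prescribe a particular formula.

```python
def velocity_from_force(force, max_force=1.0):
    """Map a measured or estimated finger contact force onto a MIDI-style
    velocity in 1..127 (linear mapping is an assumption)."""
    normalized = max(0.0, min(1.0, force / max_force))
    return 1 + round(normalized * 126)


def effective_velocity(score_velocity, user_velocity, user_weight=0.5):
    """Blend score-coded and user-expressed velocities.

    user_weight = 0.0 yields purely score-coded dynamics;
    user_weight = 1.0 yields purely user-expressed dynamics.
    """
    if not 0.0 <= user_weight <= 1.0:
        raise ValueError("user_weight must lie in [0.0, 1.0]")
    blended = (1.0 - user_weight) * score_velocity + user_weight * user_velocity
    return max(1, min(127, round(blended)))
```

For example, `effective_velocity(64, velocity_from_force(0.5))` leaves a score-coded velocity of 64 unchanged when the user strikes at half of the maximum registered force.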
In some embodiments, the method further includes geocoding the transmitted gesture stream; and displaying a geographic origin for, and in correspondence with audible rendering of, another user's performance encoded as another stream of note sounding gestures received via the communications interface directly or indirectly from a remote device.
In some embodiments in accordance with the present invention, an apparatus includes a portable computing device and machine readable code executable thereon. The portable computing device has a multi-touch display interface. The machine readable code is executable on the portable computing device to implement the synthetic musical instrument, the machine readable code including instructions executable to present a user of the synthetic musical instrument with visual cues on a multi-touch sensitive display of the portable computing device, the presented visual cues indicative of temporally sequenced note selections in accord with a musical score, wherein the musical score further encodes dynamics for at least some of the note selections. The machine readable code is further executable to (i) capture note sounding gestures indicated by the user based on finger contacts with the multi-touch sensitive display, wherein individual ones of the captured note sounding gestures are characterized, at least in part, based on position and dynamics of finger contact with the multi-touch sensitive display and (ii) for at least a subset of the captured note sounding gestures, to compute effective note sounding dynamics based, for a given note sounding gesture, on both the score-coded dynamics for the corresponding note selection and user-expressed dynamics of finger contact with the multi-touch sensitive display.
In some embodiments, the apparatus further includes machine readable code executable on the portable computing device to audibly render the performance on the portable computing device in real-time correspondence with the captured note sounding gestures based on the computed effective note sounding dynamics.
In some cases or embodiments, the apparatus is embodied as one or more of a compute pad, a handheld mobile device, a mobile phone, a personal digital assistant, a smart phone, a media player and a book reader.
In some embodiments in accordance with the present invention, a computer program product is encoded in media and includes instructions executable to implement a synthetic musical instrument on a portable computing device having a multi-touch display interface. The computer program product encodes and includes: instructions executable on the portable computing device to present a user of the synthetic musical instrument with visual cues on the multi-touch sensitive display of the portable computing device, the presented visual cues indicative of temporally sequenced note selections in accord with a musical score, wherein the musical score further encodes dynamics for at least some of the note selections; and instructions executable on the portable computing device to (i) capture note sounding gestures indicated by the user based on finger contacts with the multi-touch sensitive display, wherein individual ones of the captured note sounding gestures are characterized, at least in part, based on position and dynamics of finger contact with the multi-touch sensitive display and (ii) for at least a subset of the captured note sounding gestures, to compute effective note sounding dynamics based, for a given note sounding gesture, on both the score-coded dynamics for the corresponding note selection and user-expressed dynamics of finger contact with the multi-touch sensitive display.
In some embodiments, the computer program product further encodes and includes instructions executable on the portable computing device to audibly render the performance on the portable computing device in real-time correspondence with the captured note sounding gestures based on the computed effective note sounding dynamics. In some cases or embodiments, the media are readable by the portable computing device or readable incident to a computer program product conveying transmission to the portable computing device.
These and other embodiments in accordance with the present invention(s) will be understood with reference to the description herein as well as the drawings and appended claims which follow.
The present invention is illustrated by way of example and not limitation with reference to the accompanying figures, in which like references generally indicate similar elements or features.
Skilled artisans will appreciate that elements or features in the figures are illustrated for simplicity and clarity and have not necessarily been drawn to scale. For example, the dimensions or prominence of some of the illustrated elements or features may be exaggerated relative to other elements or features in an effort to help to improve understanding of embodiments of the present invention.
Many aspects of the design and operation of a synthetic musical instrument with touch dynamics and/or expressiveness control will be understood based on the description herein of certain exemplary piano- or keyboard-type implementations and teaching examples. Nonetheless, it will be understood and appreciated based on the present disclosure that variations and adaptations for other instruments are contemplated. Portable computing device implementations and deployments typical of social music applications for iOS® and Android® devices are emphasized for purposes of concreteness. Score or tablature user interface conventions popularized in the Magic Piano®, Magic Fiddle™, Magic Guitar™, Leaf Trombone: World Stage™ and Ocarina 2 applications (available from Smule Inc.) are likewise emphasized.
While these synthetic keyboard-type, string and even wind instruments and application software implementations provide a concrete and helpful descriptive framework in which to describe aspects of the invented techniques, it will be understood that Applicant's techniques and innovations are not necessarily limited to such instrument types or to the particular user interface designs or conventions (including e.g., musical score presentations, note sounding gestures, visual cuing, sounding zone depictions, etc.) implemented therein. Indeed, persons of ordinary skill in the art having benefit of the present disclosure will appreciate a wide range of variations and adaptations as well as the broad range of applications and implementations consistent with the examples now more completely described.
Exemplary Synthetic Piano-Type Application
Just as early and late sounding of cued notes are potentially expressive, so too can be finger contact dynamics in embodiments of a synthetic musical instrument implemented on a portable computing device capable of registering variations in finger contact forces applied to a multi-touch sensitive display. More specifically, measured or estimated magnitudes of finger contact forces applied in the course of the key strike gestures described above are captured as user expression of keyed note velocity and/or after-touch key pressure. Persons of skill in the art having benefit of the present disclosure will appreciate that, in certain embodiments, visual cuing symbologies such as that illustrated in
In general, the audible rendering can include synthesis of tones, overtones, harmonics, perturbations and amplitudes and other performance characteristics based on the captured gesture stream. In some cases, rendering of the performance includes audible rendering by converting to acoustic energy a signal synthesized from the gesture stream encoding (e.g., by driving a speaker). In some cases, the audible rendering is on the very device on which the musical performance is captured. In some cases, the gesture stream encoding is conveyed to a remote device whereupon audible rendering converts a synthesized signal to acoustic energy.
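A gesture stream encoding of the kind conveyed to a remote device might, purely for illustration, be serialized as shown below. The JSON layout, the field names (`t`, `pitch`, `vel`) and the `version` tag are hypothetical; the described embodiments do not specify a wire format.

```python
import json


def encode_gesture_stream(gestures):
    """Serialize a list of gesture records to bytes for transmission.

    Each record is assumed to be a dict such as
    {"t": 0.50, "pitch": 60, "vel": 82}  (hypothetical field names).
    """
    message = {"version": 1, "gestures": gestures}
    return json.dumps(message, separators=(",", ":")).encode("utf-8")


def decode_gesture_stream(payload):
    """Recover the gesture records on the receiving device, where they
    can drive the same synthesis used for local rendering."""
    message = json.loads(payload.decode("utf-8"))
    return message["gestures"]
```

Because the stream carries gestures rather than rendered audio, the receiving device synthesizes the performance itself, consistent with the remote rendering case described above.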
The digital synthesis (554) of a piano (or other keyboard-type percussion instrument) allows the user musician to control an actual expressive model using multi-sensor interactions (e.g., finger strikes at laterally coded note positions on screen, perhaps with sustenance or damping gestures expressed by particular finger travel or via an orientation- or accelerometer-type sensor 517) as inputs. In a portable computing device 501 embodiment that provides a force or pressure sensitive multi-touch sensitive display or which is configured to generate similar accelerometer-based data, key strike forces are captured as an additional component of user expression. Note that digital synthesis (554) is, at least for full synthesis modes, driven by the user musician's note sounding gestures, rather than by mere tap triggered release of the next score coded note. In this way, the user is actually causing the sound and controlling the timing, velocity, sustain, decay, pitch, quality and other characteristics of notes (including chords) sounded. A variety of computational techniques may be employed and will be appreciated by persons of ordinary skill in the art. For example, exemplary techniques include wavetable or FM synthesis.
Wavetable or FM synthesis is generally a computationally efficient and attractive digital synthesis implementation for piano-type musical instruments such as those described and used herein as primary teaching examples. However, and particularly for adaptations of the present techniques to syntheses of certain types of multi-string instruments (e.g., unfretted multi-string instruments such as violins, violas, cellos and double basses), physical modeling may provide a livelier, more expressive synthesis that is responsive (in ways similar to physical analogs) to the continuous and expressively variable excitation of constituent strings. For a discussion of digital synthesis techniques that may be suitable in other synthetic instruments, see generally, commonly-owned U.S. Pat. No. 8,772,621, which is incorporated by reference herein.
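For concreteness, a minimal two-operator FM synthesis sketch is shown below, in which note velocity scales both loudness and modulation index so that harder key strikes sound brighter. The operator ratio, maximum index, decay constant and function name are illustrative assumptions and do not represent the synthesis (554) actually employed.

```python
import math


def fm_tone(frequency, velocity, duration=0.5, sample_rate=44100,
            ratio=2.0, max_index=4.0):
    """Generate one FM-synthesized note as a list of float samples in [-1, 1].

    frequency: carrier frequency in Hz
    velocity:  effective note velocity, 1..127
    ratio:     modulator-to-carrier frequency ratio (assumed value)
    max_index: modulation index at full velocity (assumed value)
    """
    amplitude = velocity / 127.0
    index = max_index * amplitude          # brighter timbre for harder strikes
    samples = []
    for i in range(int(duration * sample_rate)):
        t = i / sample_rate
        envelope = math.exp(-3.0 * t)      # simple exponential decay
        modulator = index * math.sin(2.0 * math.pi * frequency * ratio * t)
        carrier = math.sin(2.0 * math.pi * frequency * t + modulator)
        samples.append(amplitude * envelope * carrier)
    return samples
```

The resulting sample list could be written to an audio buffer by whatever output facility the host platform provides; that step is omitted here.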
Referring again to
In general, musical scores in storage 556 may be included with a distribution of the synthetic musical instrument or may be demand retrieved by a user via a communications interface as an “in-app” purchase. Generally, scores may be encoded in accord with any suitable coding scheme such as in accord with well-known musical instrument digital interface (MIDI) or open sound control (OSC) type standards, file/message formats and protocols (e.g., standard MIDI [.mid or .smf] formats; extensible music file (XMF) formats; extensible MIDI [.xmi] formats; RIFF-based MIDI [.rmi] formats; extended RMID formats, etc.). Formats may be augmented or annotated to indicate operative windows for adaptive tempo management and/or musical phrase boundaries or key notes.
Performance Grading, Evaluation or Scoring
Specifically,
In some embodiments and game-play modes, note soundings by a user-musician are “scored” or credited to a grade, if the selections, timings, velocities, and/or after-touch key pressures expressed in the form of captured note sounding gestures correspond to visually-cued aspects of the musical score. Thus, grading of a user's expressed performance (653) will be understood as follows:
In this manner, songs that are longer and have more notes will yield potentially higher scores or at least the opportunity therefor. The music itself becomes a difficulty metric for the performance: some songs will be easier (and contain fewer notes, simpler sequences and pacings, etc.), while others will be harder (and may contain more notes, more difficult note/chord sequences, varied note velocities, after-touch key pressures, paces, etc.). Users can compete for top scores on a song-by-song basis, so the variations in difficulty across songs are not a concern.
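The relationship between song length and attainable score can be illustrated with a simple additive scheme, sketched below under stated assumptions: each note earns up to a fixed number of points, scaled by timing and dynamics correspondence and gated by correct note selection. The point value, the equal weighting and the function names are hypothetical.

```python
def note_points(timing, selection, dynamics, max_points=100):
    """Points credited for one note sounding (hypothetical scheme).

    timing, selection and dynamics are correspondence measures in [0, 1];
    a wrong note (selection == 0.0) earns nothing.
    """
    return round(max_points * selection * (timing + dynamics) / 2.0)


def song_score(note_measures):
    """Sum per-note points over a performance.

    note_measures is an iterable of (timing, selection, dynamics) tuples.
    """
    return sum(note_points(t, s, d) for t, s, d in note_measures)


def max_song_score(note_count, max_points=100):
    """Highest attainable score grows linearly with the number of notes."""
    return note_count * max_points
```

Under this scheme a 200-note song offers twice the maximum score of a 100-note song, which is why comparison of scores is most meaningful on a song-by-song basis.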
Expressiveness
A flexible performance grading system will generally allow users to create expressive musical performances. As will be appreciated by many a musician, successful and pleasing musical performances are generally not contingent upon performing to precisely-specified note velocities or to an absolute and strict single tempo. Instead, variations in expressed note velocities and tempo are commonly (and desirably) used as intentional musical artifacts by performers, emphasizing and deemphasizing certain notes, chords or members of a chord, embellishing with note sustains or variations in after-touch key pressures, speeding up or slowing down phrases, etc. to add emphasis. These modulations in tempo (onsets and sustains) as well as note velocity and/or after-touch (or post-onset key pressure) can all contribute to “expressiveness.” Accordingly, in synthetic piano implementations described herein, we aim to allow users to be expressive while remaining, generally speaking, rhythmically and otherwise consistent with the musical score.
While the invention(s) is (are) described with reference to various embodiments, it will be understood that these embodiments are illustrative and that the scope of the invention(s) is not limited to them. Many variations, modifications, additions, and improvements are possible. For example, while a synthetic piano implementation has been used as an illustrative example, variations on the techniques described herein for other synthetic musical instruments such as string instruments (e.g., guitars, violins, etc.) and wind instruments (e.g., trombones) will be appreciated. Furthermore, while certain illustrative processing techniques have been described in the context of certain illustrative applications, persons of ordinary skill in the art will recognize that it is straightforward to modify the described techniques to accommodate other suitable signal processing techniques and effects.
Embodiments in accordance with the present invention may take the form of, and/or be provided as, a computer program product encoded in a machine-readable medium as instruction sequences and other functional constructs of software, which may in turn be executed in a computational system (such as an iPhone handheld, mobile device or portable computing device) to perform methods described herein. In general, a machine readable medium can include tangible articles that encode information in a form (e.g., as applications, source or object code, functionally descriptive information, etc.) readable by a machine (e.g., a computer, computational facilities of a mobile device or portable computing device, etc.) as well as tangible storage incident to transmission of the information. A machine-readable medium may include, but is not limited to, magnetic storage medium (e.g., disks and/or tape storage); optical storage medium (e.g., CD-ROM, DVD, etc.); magneto-optical storage medium; read only memory (ROM); random access memory (RAM); erasable programmable memory (e.g., EPROM and EEPROM); flash memory; or other types of medium suitable for storing electronic instructions, operation sequences, functionally descriptive information encodings, etc.
In general, plural instances may be provided for components, operations or structures described herein as a single instance. Boundaries between various components, operations and data stores are somewhat arbitrary, and particular operations are illustrated in the context of specific illustrative configurations. Other allocations of functionality are envisioned and may fall within the scope of the invention(s). In general, structures and functionality presented as separate components in the exemplary configurations may be implemented as a combined structure or component. Similarly, structures and functionality presented as a single component may be implemented as separate components. These and other variations, modifications, additions, and improvements may fall within the scope of the invention(s).
The present application claims priority of U.S. Provisional Application No. 62/222,824, filed Sep. 24, 2015. The present application is also a continuation-in-part of U.S. patent application Ser. No. 14/797,695, filed on Jul. 13, 2015, which is a continuation of U.S. patent application Ser. No. 13/664,939, filed Oct. 31, 2012, now U.S. Pat. No. 9,082,380, which in turn claims priority of U.S. Provisional Application No. 61/553,781, filed Oct. 31, 2011. Each of the foregoing applications is incorporated herein by reference.
Number | Name | Date | Kind |
---|---|---|---|
8222507 | Salazar | Jul 2012 | B1 |
8772621 | Wang et al. | Jul 2014 | B2 |
9082380 | Hamilton | Jul 2015 | B1 |
9620095 | Hamilton | Apr 2017 | B1 |
20110134061 | Lim | Jun 2011 | A1 |
20120174736 | Wang | Jul 2012 | A1 |
20120186416 | Souppa et al. | Jul 2012 | A1 |
20130180385 | Hamilton | Jul 2013 | A1 |
20140047970 | Yoshikawa et al. | Feb 2014 | A1 |
20140349761 | Kruge | Nov 2014 | A1 |
Number | Date | Country |
---|---|---|
10-0294603 | Nov 2001 | KR |
10-2005-0117808 | Dec 2005 | KR |
Entry |
---|
Gaye, L. et al., “Mobile music technology: Report on an emerging community”, Proceedings of the International Conference on New Interfaces for Musical Expression, pp. 22-25, Paris, France, 2006. |
G. Wang et al., “MoPhO: Do Mobile Phones Dream of Electric Orchestras?” In Proceedings of the International Computer Music Conference, Belfast, Aug. 2008. |
Jason Snell, “Best 3D Touch Apps for the iPhone 6s and 6s Plus”, Nov. 6, 2015 (retrieved Sep. 26, 2016), Tom's Guide, pp. 1-15, http://www.tomsguide.com/. |
PCT International Search Report /Written Opinion of International Search Authority for counterpart application, mailed Jan. 10, 2017, of PCT/US2016/053731. |
Number | Date | Country |
---|---|---|
20170011724 A1 | Jan 2017 | US |
Number | Date | Country |
---|---|---|
61553781 | Oct 2011 | US | |
62222824 | Sep 2015 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 13664939 | Oct 2012 | US |
Child | 14797695 | | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 14797695 | Jul 2015 | US |
Child | 15275807 | | US |