The invention relates generally to computer software, and more specifically, to generating musical compositions from image pixels in a computing environment.
Personal media libraries are on the rise. First, low cost still cameras and video cameras allow users to capture media in almost any venue. Further, the advent of inexpensive and vast storage resources removes limitations of selectively capturing and saving media. Additionally, nearly infinite stores of media are available for download on the Internet from social networking, e-mail and search engine results. Once a computing or media device is obtained, the cost of building personal media collections is negligible.
For example, pre-teenagers often own smart telephones (tablets and other devices) with the ability to take photos on a whim, and thousands of photos can be stored within the small devices. Many users own several other media capturing devices, such as tablet computing devices, laptops with integrated cameras, and digital still cameras, in addition to media downloaded from web sites on the Internet and e-mailed amongst friends. All in all, users have large personal media libraries at their disposal on portable computing devices.
One way to share media is through social networking such as Facebook or Instagram. Additionally, media can be printed out as photographs or burned onto a DVD. Multiple photographs can be combined into a standard-shaped collage. Editing programs allow digital media to be processed by combining, cropping, correcting, and the like.
However, none of these techniques allow users to enjoy captured media by generating creative media using captured media as components. In particular, there is no technique to generate a musical composition from pixels of an image.
Methods, computer-readable mediums, and systems for generating a musical composition from image pixels are described. Also, an image can be generated from music.
To do so, pixel values are mapped to musical elements, which together make up the musical compositions. Additionally, images are formed from pixels generated from musical compositions. More generally, a computer system creatively generates media using captured media as a source. The system also generates collage images in which individual images are pixels for the collage image. Collages are further generated from text.
Additionally, the algorithms described herein can be executed on low-power devices such as mobile and handheld devices (e.g., smart cell phones, laptops, tablets, etc.). The techniques can thus process large graphical files locally and with no or limited off-loading to a remote server. The results can be shared from the low-power device.
In the following drawings, like reference numbers are used to refer to like elements. Although the following figures depict various examples of the invention, the invention is not limited to the examples depicted in the figures.
Methods, computer-readable mediums, and systems for generating a musical composition from pixels of an image are described. Also, an image can be generated, pixel by pixel, from a musical composition.
More generally, a computer system creatively generates media using captured media as a source. Thus, music generation from image pixels, described below, is just one aspect of the system capabilities. Music generation capabilities can be implemented independently or in coordination with other media generation capabilities. In other embodiments, a collage of individual images appears itself to be an image. The individual images are pixels for the collage image. Also, a collage can be generated from text such as ASCII characters. The following description is intended to be exemplary embodiments for the purpose of illustration and not to limit additional embodiments that would be apparent to one of ordinary skill in the art, in view of the description.
Musical compositions, as referred to herein, include any type of music that can be played back on a computing device, including songs, solo instrumentals, band instrumentals, random sounds, and the like. More specifically, a musical composition is made up of a string of musical elements derived from image pixels. Each pixel of a media source can represent, for example, one or more chords, one or more instruments, an adjustment to volume, or the like. Musical compositions can also be generated from video, collages, or other media sources.
I. System to Generate Music from an Image or Collage of Images
The image database 110 can be any resource for images used to generate music. For example, and without limitation, the image database 110 can comprise an online Picasa account, or search results from Google Images. In some implementations, the image database 110 stores musical compositions associated with stored images.
The images can be photographs, online pictures, or animations, for example. The images can be independent, or a portion of a collage or video stream. Examples of media file formats include MP3, JPG, GIF, PDF and custom camera formats. In one case, the image is scanned into a digital format with a larger file having more pixelation than a smaller file.
The media generation engine 120 generates musical compositions for images. In one embodiment, the media generation engine 120 provides a user interface for users to configure musical elements and how they relate to image pixels. Instruments or groups of instruments can be selected (e.g., string quartet, symphony, jazz band, etc.). An instrument can be mapped to a group of pixel values such as mapping a piano to blue colors, and piano chords to different shades of blue. Likewise, a chord can be mapped to a color, and instruments to play the chord to different shades of the color. In another embodiment, music elements can have a default or random assignment. Examples of pixel values for color are hex codes (e.g., #33CC33) and RGB vector values (e.g., 51, 204, 51 corresponds to hex code #33CC33). Instrument sounds can be pre-recorded snippets, or computer generated.
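The assignment of instruments and chords to pixel values described above can be sketched as follows. This is only an illustrative sketch: the instrument list, the chord list, the 60-degree hue bands, and the function names `hex_to_rgb` and `pixel_to_element` are assumptions made for the example and are not taken from the description.

```python
import colorsys

def hex_to_rgb(code: str) -> tuple:
    """Convert a hex color code such as '#33CC33' to an (R, G, B) tuple."""
    code = code.lstrip("#")
    return tuple(int(code[i:i + 2], 16) for i in (0, 2, 4))

# Hypothetical assignment: one instrument per 60-degree hue band.
# Index 4 covers the band centered on blue (240 degrees).
INSTRUMENTS = ["guitar", "trumpet", "flute", "violin", "piano", "cello"]

# Hypothetical assignment: the shade (lightness) of a color picks the chord.
CHORDS = ["C", "F", "G", "Am"]

def pixel_to_element(rgb):
    """Map one RGB pixel to an (instrument, chord) pair."""
    r, g, b = (channel / 255 for channel in rgb)
    hue, lightness, _saturation = colorsys.rgb_to_hls(r, g, b)
    # Shift by 30 degrees so each band is centered on a primary or secondary hue.
    band = int(((hue * 360 + 30) % 360) // 60)
    shade = min(int(lightness * len(CHORDS)), len(CHORDS) - 1)
    return INSTRUMENTS[band], CHORDS[shade]

if __name__ == "__main__":
    print(hex_to_rgb("#33CC33"))
    print(pixel_to_element((0, 0, 255)))   # a saturated blue
    print(pixel_to_element((0, 0, 100)))   # a darker shade of blue
```

With these assumed tables, every shade of blue selects the same instrument while darker and lighter shades select different chords, which mirrors the piano example in the paragraph above.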
A play speed, play direction and type of instruments are also adjustable by a content creator through one embodiment of the media generation engine 120. These are merely examples of many design specific settings that are available. The play speed allows a tempo to be sped up or slowed down, affecting a mood of the playback. The play direction dictates how pixels are read from the subject image source.
The media generation engine 120 generates a musical composition corresponding to the image. To do so, in one touch embodiment, a template is associated with the image to set music element mappings, genre, instrument types, mood, and the like. In a customized embodiment, a sequential order for reading individual pixel values from the source image is determined. The order can be default or selected by the user. Many variations are possible, such as row by row, column by column, every other pixel, spiral from middle to outside perimeter, or any appropriate pattern.
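The reading orders mentioned above (row by row, column by column, every other pixel, and a spiral from the middle outward) can be sketched as coordinate generators. The function names, the nested-list image representation, and the `reverse` flag standing in for a play direction are assumptions made for illustration.

```python
def row_by_row(height, width):
    """Left to right along each row, top row first."""
    return [(r, c) for r in range(height) for c in range(width)]

def column_by_column(height, width):
    """Top to bottom along each column, left column first."""
    return [(r, c) for c in range(width) for r in range(height)]

def every_other_pixel(height, width):
    """Row-by-row order, keeping every second pixel."""
    return row_by_row(height, width)[::2]

def spiral_from_middle(height, width):
    """Clockwise square spiral starting at the middle pixel."""
    coords = []
    total = height * width
    if total == 0:
        return coords
    row, col = height // 2, width // 2
    coords.append((row, col))
    d_row, d_col = 0, 1            # start by heading right
    step = 1
    while len(coords) < total:
        for _leg in range(2):      # two legs are walked per step length
            for _move in range(step):
                row, col = row + d_row, col + d_col
                # Positions outside the image are walked over but not emitted.
                if 0 <= row < height and 0 <= col < width:
                    coords.append((row, col))
            d_row, d_col = d_col, -d_row   # turn clockwise
        step += 1
    return coords

def read_pixels(image, order=row_by_row, reverse=False):
    """Return pixel values from a nested-list image in the chosen order."""
    height = len(image)
    width = len(image[0]) if height else 0
    coords = order(height, width)
    if reverse:
        coords = coords[::-1]
    return [image[r][c] for r, c in coords]

if __name__ == "__main__":
    demo = [["a", "b"], ["c", "d"]]
    print(read_pixels(demo, column_by_column))
    print(spiral_from_middle(3, 3))
```

Keeping the order as a separate function means a default order and a user-selected order go through the same `read_pixels` call.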
The media generation engine 120 receives individual pixels in the sequential order. Each pixel is formed from digital information about color, brightness, intensity, and other characteristics. Therefore, pixel values can be extracted from an image file. Alternatively, pixel values can be estimated from a display of the image on a screen, without the actual image file.
Next, the media generation engine 120 can match musical elements to individual pixel values based on assignments. In one embodiment, a histogram of a source image is mapped to musical elements wherein the frequency of a certain pixel value can indicate volume for instruments played for that color. Also, a musical composition can be generated from the histogram by mapping the aggregate number of pixels of a certain color to musical elements. A resulting musical composition can be stored for real-time playback, for sharing, or for playback at a later time. Certain global characteristics can be set by the user or by default, such as volume and playback speed.
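The histogram embodiment can be illustrated with a short sketch in which the count of each color sets a relative volume. Scaling volumes to the range 0.0 to 1.0 against the most frequent color is an assumption made here; the description does not specify a scale, and the function names are hypothetical.

```python
from collections import Counter

def color_histogram(pixels):
    """Count how many times each pixel value occurs in the source image."""
    return Counter(pixels)

def histogram_to_volumes(histogram):
    """Scale each color's count to a volume between 0.0 and 1.0.

    Assumption: the most frequent color plays at full volume and the
    other colors are scaled in proportion to their counts.
    """
    peak = max(histogram.values(), default=0)
    if peak == 0:
        return {}
    return {color: count / peak for color, count in histogram.items()}

if __name__ == "__main__":
    blue, red = (0, 0, 255), (255, 0, 0)
    pixels = [blue] * 4 + [red]
    print(histogram_to_volumes(color_histogram(pixels)))
```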
In a different embodiment, the media generation engine 120 analyses pixels of a source image and selects an existing musical composition from an audio collection that most closely matches the analysis results. For example, a histogram of pixels can indicate a calm mood with lots of mid-range instruments, so a jazz song with an emphasis on piano tones is selected.
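One simple way to picture this selection is a nearest-match lookup on a single image feature. The sketch below is an assumption-heavy illustration: it uses mean brightness as the only analysis result, and the `LIBRARY` entries and their target values are invented for the example.

```python
def mean_brightness(pixels):
    """Average brightness of RGB pixels, scaled to the range 0.0 to 1.0."""
    return sum(sum(pixel) / 3 for pixel in pixels) / len(pixels) / 255

# Hypothetical audio collection: each track is tagged with the image
# brightness it is assumed to suit best.
LIBRARY = {
    "dark ambient": 0.1,
    "calm jazz piano": 0.4,
    "upbeat rock": 0.8,
}

def select_track(pixels, library=LIBRARY):
    """Pick the track whose target brightness is closest to the image's."""
    brightness = mean_brightness(pixels)
    return min(library, key=lambda name: abs(library[name] - brightness))

if __name__ == "__main__":
    mid_grey_image = [(100, 100, 100)] * 16
    print(select_track(mid_grey_image))
```

A fuller analysis could compare whole histograms rather than one number; the nearest-match structure would stay the same.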
The media generation engine 120 can be implemented in a single server, across many servers, as a software as a service, hardware, software, or a combination of hardware and software. Example components of the media generation engine 120 are set forth in more detail below with respect to
The user device 130 can provide a user interface for the music generation architecture. In one case, an app is downloaded and installed. The app connects with backend resources on the Internet (e.g., the image database 110 and the media generation engine 120). In one case, the entire system 100 to generate music is operated locally from the user device 130. In another case, a network browser interfaces with the image database 110 and/or the media generation engine 120. In an embodiment, the user device 130 includes audio speakers and a display for audio and optionally video playback. Further, a local media application plays back digital media files.
The user device 130 can comprise a smart phone, a tablet, a phablet, a PC, a touch screen device, or any appropriate computing device, such as the generic computing device discussed below in association with
The network 199 can be one or more of: the Internet, a wide area network, a local area network, any data network, an IEEE 802.11 protocol Wi-Fi network, a 3G or 4G cellular network, a Bluetooth network, or any other appropriate communication network. Each of the image database 110, the media generation engine 120, and the user device 130 is connected to the network 199 for communications. In some embodiments of the system 100A, the image database 110 and the media generation engine 120 execute in the same physical device. The components can be located on a common LAN, or spread across a WAN in a cloud-computing configuration.
In one embodiment, users create a gallery with multiple canvases of generated music compositions or generated images, videos or collages, along with source media. Individual canvases stored in the gallery can be marked for public viewing or private viewing. Images can be uploaded to a gallery, and a template for musical generation can be associated with each image. The template can be from another canvas, or from a library of themed templates (e.g., rock, symphony, or electronic dance music).
In one case, the gallery is part of a larger online community or social networking service that shares canvases for browsing by members or friends (e.g., Pinterest or Facebook). Viewers may be allowed to edit or contribute to generated media, and add a lyrics overlay to musical compositions. Canvases can also be sold or offered for download from galleries.
The media cache 122 stores source images used for musical compositions. The musical element module 134 defines assignments between pixels and musical elements used as a base for musical compositions. The musical composition module 136 analyzes an image to determine pixel values for matching to musical elements as assigned. The media player 138 reads the musical composition to produce audio and optionally video for playback. Audio playback can be concurrent with display of a source image. Also, audio generated from a slide show as a whole can be played back with individual images from the slide show. A template associated with an image or musical composition can be changed during playback to modify the generated media.
The computing device 900, of the present embodiment, includes a memory 910, a processor 920, a storage drive 930, and an I/O port 940. Each of the components is coupled for electronic communication via a bus 999. Communication can be digital and/or analog, and use any suitable protocol.
The memory 910 further comprises network applications 912 and an operating system 914. The network applications 912 can include components of the media generation engine 120 or an app on the user device 130. Other network applications 912 can include a web browser, a mobile application, an application that uses networking, a remote application executing locally, a network protocol application, a network management application, a network routing application, or the like.
The operating system 914 can be one of the Microsoft Windows® family of operating systems (e.g., Windows 95, 98, Me, Windows NT, Windows 2000, Windows XP, Windows XP x64 Edition, Windows Vista, Windows CE, Windows Mobile, Windows 7 or Windows 8), Linux, HP-UX, UNIX, Sun OS, Solaris, Mac OS X, Alpha OS, AIX, IRIX32, or IRIX64. Other operating systems may be used. Microsoft Windows is a trademark of Microsoft Corporation.
The processor 920 can be a network processor (e.g., optimized for IEEE 802.11), a general purpose processor, an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA), a reduced instruction set computing (RISC) processor, an integrated circuit, or the like. Qualcomm Atheros, Broadcom Corporation, and Marvell Semiconductors manufacture processors that are optimized for IEEE 802.11 devices. The processor 920 can be single core, multiple core, or include more than one processing element. The processor 920 can be disposed on silicon or any other suitable material. The processor 920 can receive and execute instructions and data stored in the memory 910 or the storage drive 930.
The storage drive 930 can be any non-volatile type of storage such as a magnetic disc, EEPROM (electrically erasable programmable read-only memory), Flash, or the like. The storage drive 930 stores code and data for applications.
The I/O port 940 further comprises a user interface 942 and a network interface 944. The user interface 942 can output to a display device and receive input from, for example, a keyboard. The network interface 944 (e.g. RF antennae) connects to a medium such as Ethernet or Wi-Fi for data input and output.
Many of the functionalities described herein can be implemented with computer software, computer hardware, or a combination.
Computer software products (e.g., non-transitory computer products storing source code) may be written in any of various suitable programming languages, such as C, C++, C#, Oracle® Java, JavaScript, PHP, Python, Perl, Ruby, AJAX, and Adobe® Flash®. The computer software product may be an independent application with data input and data display modules. Alternatively, the computer software products may be classes that are instantiated as distributed objects. The computer software products may also be component software such as Java Beans (from Sun Microsystems) or Enterprise Java Beans (EJB from Sun Microsystems).
Furthermore, the computer that is running the previously mentioned computer software may be connected to a network and may interface with other computers using this network. The network may be on an intranet or the Internet, among others. The network may be a wired network (e.g., using copper), telephone network, packet network, an optical network (e.g., using optical fiber), or a wireless network, or any combination of these. For example, data and other information may be passed between the computer and components (or steps) of a system of the invention using a wireless network using a protocol such as Wi-Fi (IEEE standards 802.11, 802.11a, 802.11b, 802.11e, 802.11g, 802.11i, 802.11n, and 802.11ac, just to name a few examples). For example, signals from a computer may be transferred, at least in part, wirelessly to components or other computers.
In an embodiment, with a Web browser executing on a computer workstation system, a user accesses a system on the World Wide Web (WWW) through a network such as the Internet. The Web browser is used to download web pages or other content in various formats including HTML, XML, text, PDF, and postscript, and may be used to upload information to other parts of the system. The Web browser may use uniform resource locators (URLs) to identify resources on the Web and hypertext transfer protocol (HTTP) in transferring files on the Web.
II. Method for Generating Music from an Image or Collage
When performing the reverse method 200C, one difference is that musical instruments and chords are layered (i.e., played in combination at the same time). In one embodiment, musical compositions are pixelated by taking samples at a certain frequency. An individual sample can then be converted to an image pixel including all instruments and chords playing at that snapshot moment. In another embodiment, individual instruments are observed over time and pixelated for conversion to pixels. Many other embodiments are possible. In other instances, other musical characteristics such as tempo and volume can be converted to other pixel or image characteristics such as brightness.
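The sampling embodiment can be sketched by representing a composition as a list of timed notes and converting each snapshot into one pixel. Everything specific in this sketch is an assumption for illustration: the `Note` fields, the 0 to 127 pitch range, and the choice to encode average pitch as red, the number of layered notes as green, and the loudest volume as blue.

```python
from dataclasses import dataclass

@dataclass
class Note:
    start: float    # seconds
    end: float      # seconds
    pitch: int      # assumed range 0..127
    volume: float   # assumed range 0.0..1.0

def sample_to_pixel(notes, t):
    """Convert everything sounding at time t into one (R, G, B) pixel."""
    active = [n for n in notes if n.start <= t < n.end]
    if not active:
        return (0, 0, 0)                         # silence becomes black
    avg_pitch = sum(n.pitch for n in active) / len(active)
    loudest = max(n.volume for n in active)
    red = round(avg_pitch / 127 * 255)           # pitch of the layered notes
    green = min(len(active), 8) * 255 // 8       # how many notes are layered
    blue = round(loudest * 255)                  # volume as brightness
    return (red, green, blue)

def composition_to_pixels(notes, duration, sample_rate):
    """Take sample_rate snapshots per second and convert each to a pixel."""
    count = int(duration * sample_rate)
    return [sample_to_pixel(notes, i / sample_rate) for i in range(count)]

if __name__ == "__main__":
    notes = [Note(0.0, 1.0, 127, 1.0), Note(0.5, 1.0, 127, 0.5)]
    print(composition_to_pixels(notes, duration=1.5, sample_rate=2))
```

The resulting list of pixels could then be laid out into rows using any of the reading orders discussed earlier, applied in reverse.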
With respect to the loop 304 of
The method 300A continued in
In the loop 313 of
The collage canvas can be saved locally or on a server in any appropriate format (e.g., PNG, JPG, PDF or GIF). Next, the collage canvas can be shared, printed, or otherwise distributed. Individual images can be annotated, including separate instances of the same image.
As discussed above, a master image is loaded (step 330) for optimizing and processing (step 331). The musical settings are retrieved (step 335). A play list array is generated and traversed until all pixels are read (loop 336). Each pixel can represent, for example, a chord or set of chords, one or more instruments, a song, or the like. Pixels can also represent speed, volume, and other characteristics of music. Palette images of a collage can also be used in the same described manner as pixels to generate music. In one embodiment, a database maps pixel characteristics to music characteristics. Finally, audio is composed or recorded (step 337) for playback, storing or sharing (step 338).
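The play list generation and traversal can be pictured with the following sketch. The settings keys (`tempo_bpm`, `color_to_chord`, `default_chord`) and the shape of each play list entry are hypothetical; an in-memory dictionary stands in for the database that maps pixel characteristics to music characteristics.

```python
def build_playlist(pixels, settings):
    """Traverse the pixels in order and emit one timed entry per pixel."""
    beat = 60.0 / settings["tempo_bpm"]          # seconds per pixel
    mapping = settings["color_to_chord"]
    playlist = []
    for index, pixel in enumerate(pixels):
        # Pixels without an assignment fall back to the default element.
        chord = mapping.get(pixel, settings["default_chord"])
        playlist.append({
            "start": index * beat,
            "duration": beat,
            "chord": chord,
        })
    return playlist

if __name__ == "__main__":
    settings = {
        "tempo_bpm": 120,
        "color_to_chord": {(255, 0, 0): "C", (0, 0, 255): "G"},
        "default_chord": "rest",
    }
    for entry in build_playlist([(255, 0, 0), (0, 0, 255), (1, 2, 3)], settings):
        print(entry)
```

Raising or lowering `tempo_bpm` changes only the timing of the entries, which corresponds to the adjustable play speed.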
III. Methods for Generating Collages
A master image can be printed as a poster or other analog image, or be shared digitally. In one embodiment, a digital master image has controls such as zoom, pan and rotate. For example, the master image can be shared on Facebook as a single image. The master image and palette images can be selected from an online Facebook gallery or locally from a smart phone physical memory.
A selected image can be edited to optimize for its intended use. An internal or external application applies various types of processing to the image. For example, the image can be cropped, zoomed in or out, adjusted for color, tint, hue, saturation, brightness/contrast, and the like.
If the user is satisfied with the image, the main image is committed for the collage (step 502). Otherwise, the image is closed and the process repeats until a satisfactory image is selected (loop 501).
Selected images may be required to meet certain uniform constraints with respect to size, shape, color, or other characteristics. Edits can be applied automatically or manually to meet the constraints. At any point, additional images can be added or deleted from the palette (altogether loop 511).
A master image (text image or non-text image) is read (step 540), optimized (step 541), processed (step 542) and sized as discussed above. Font typefaces are loaded using the preconfigured message (step 543). A message text is loaded (step 544) and characters initialized (step 545). The source image size is updated (step 546) and then an empty collage canvas is generated (step 547) and filled with, for example, ASCII characters (loop 548). The collage is stored (step 549) and converted to an appropriate format (step 550) for sharing (step 551).
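Filling a canvas with ASCII characters can be illustrated by a brightness-to-character lookup. This is one plausible reading rather than the described procedure: the character ramp `RAMP`, the brightness formula, and the function names are assumptions, and the font and message handling of steps 543 to 545 is omitted.

```python
# Hypothetical ramp of ASCII characters ordered from dark to light.
RAMP = "@#*+-. "

def brightness(pixel):
    """Average of the R, G and B channels, in the range 0..255."""
    return sum(pixel) / 3

def ascii_collage(image):
    """Build a text canvas with one ASCII character per master-image pixel."""
    lines = []
    for row in image:
        chars = [RAMP[int(brightness(p) / 256 * len(RAMP))] for p in row]
        lines.append("".join(chars))
    return "\n".join(lines)

if __name__ == "__main__":
    black, grey, white = (0, 0, 0), (128, 128, 128), (255, 255, 255)
    print(ascii_collage([[black, white], [grey, black]]))
```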
This description of the invention has been presented for the purposes of illustration and description. It is not intended to be exhaustive or to limit the invention to the precise form described, and many modifications and variations are possible in light of the teaching above. The embodiments were chosen and described in order to best explain the principles of the invention and its practical applications. This description will enable others skilled in the art to best utilize and practice the invention in various embodiments and with various modifications as are suited to a particular use. The scope of the invention is defined by the following claims.
This application claims the benefit of priority under 35 U.S.C. 119(e) to U.S. Application No. 62/032,486, filed Aug. 1, 2014, entitled COLLAGE-BASED REPRESENTATION OF IMAGES AND MUSIC, by Rajinder SINGH, and to U.S. Application No. 62/053,181, filed Sep. 21, 2014, entitled COLLAGE-BASED REPRESENTATION OF IMAGES AND MUSIC, by Rajinder SINGH, the contents of both being hereby incorporated by reference in their entirety.
Number | Date | Country | |
---|---|---|---|
20160035330 A1 | Feb 2016 | US |
Number | Date | Country | |
---|---|---|---|
62032486 | Aug 2014 | US | |
62053181 | Sep 2014 | US |