The present invention relates, in general, to the field of mobile electronic devices. More particularly, the present invention relates to mobile electronic devices utilizing reconfigurable processing techniques to enable higher speed applications with lowered power consumption for, inter alia, increased battery life. The techniques disclosed herein are also applicable to implantable medical devices and other portable electronic systems especially those applications wherein minimization of power consumption and increased computational power is desired.
Today's mobile devices are very limited in computational capability due to the desire for long battery life, small physical size and light weight. As a result, the software applications that can be performed by such a device are inherently limited. This has led to a “reach back” model of computation involving the “cloud” computing model. Unfortunately, as more and more streaming activities such as Netflix™ come online, the ability to access bandwidth to the cloud will also become very limited. For example, it has been reported that Netflix already consumes 30% of all internet bandwidth between 6 and 9 PM. It is therefore apparent that it would not require much additional usage by cloud providers or other streaming media services in order to render the mobile reach back model effectively non-functional.
In order to address this situation, mobile device manufacturers are attempting to incorporate a very low power consumption microprocessor as the primary processor along with a higher power consumption and slightly higher capability applications processor. While this does provide a somewhat improved mobile processing capability, the gains are relatively minimal.
A more efficacious solution is to incorporate reconfigurable processing capability into future mobile devices. Reconfigurable processors have been shown to consume as little as 1% of the power of an equivalently performing microprocessor solution, allowing the performance of the previously mentioned applications processor to be exceeded by a factor of 100 while consuming the same amount of power. In this regard, the IMPLICIT+EXPLICIT™ Architecture available from SRC Computers LLC, assignee of the present invention, supports just such a configuration and allows for the programming of the mobile device to still be performed using standard high level programming languages.
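The arithmetic behind the factor of 100 can be sketched as follows. This is only an illustration: it assumes that performance scales linearly with the available power budget, a simplification that is not stated above, and the function name and parameter are hypothetical.

```python
def equal_power_speedup(power_percent: float) -> float:
    """Speedup at equal power, assuming performance scales linearly with power.

    power_percent: power drawn by the reconfigurable solution, expressed as a
    percentage of the power drawn by a microprocessor of equal performance.
    """
    if power_percent <= 0:
        raise ValueError("power_percent must be positive")
    return 100.0 / power_percent


# 1% of the power at equal performance corresponds to 100x the performance
# at equal power, under the linear-scaling assumption named above.
print(equal_power_speedup(1.0))
```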
One method of implementation using this architecture would be that of Package-on-Package (PoP) assembly. This commonly used mobile device assembly technique stacks one ball grid array device on top of another, creating a footprint no larger than the bottom component. In mobile devices it is common to stack memory on top of the microprocessor. Since SRC Computers LLC's patented architecture interconnects a reconfigurable device and a microprocessor through a shared memory, it can be seen that extending the PoP assembly to also include a reconfigurable layer would allow the SRC MAP® architecture to be implemented in a fashion usable in a mobile device.
Other possible assemblies for implementation of the reconfigurable computing architecture disclosed herein include, for example, stacked die connected by means of through-silicon vias (TSVs), 2.5 D assemblies utilizing fine-pitch interposers and other known multiple integrated circuit die packaging techniques. Further, the reconfigurable computing architecture of the present invention may also be implemented in a configuration wherein the reconfigurable logic and microprocessor are formed on a single integrated circuit die.
Mobile devices having the significantly higher computational capability reconfigurable processing provides would also have several concomitant cost and performance benefits as well as open new applications domains not currently contemplated with these devices. First, web sites would no longer have to maintain both a standard version and a mobile version providing immediate savings of many millions of dollars annually. Further, with a more computationally capable mobile device more complex data compression/decompression techniques could be employed allowing much more data to be sent to the mobile device utilizing the same amount of bandwidth as used today.
Secondly, with the ability to perform the significantly more secure encryption algorithms that reconfigurable computing enables, the mobile device could then become the user's primary repository of secure data. Physical credit cards could be eliminated and instead replaced by 2D bar codes on the device display, thus greatly reducing credit card fraud caused by giving the individual performing the transaction access to a physical card. The mobile device could also be used in a highly secure wireless mode, for example upon entering a store, to allow sensors located there to know of the buyer's presence, previously desired products, current purchasing limits and the like.
Such enhanced mobile encryption capability would also allow the user to retain sensitive data such as medical records on their person which could prove to be very beneficial in the case of a medical emergency or accident while traveling. In another application, electronic car keys could be replaced by encrypted codes loaded into the mobile device.
Thirdly, with the mobile device becoming the principal audio and video media device for many users, significantly higher processing capability would greatly improve these applications as well. For instance one could have the ability to remove the motion blur common in cell phone photos due to limited flash range and slow shutter speeds. Moreover, many of the basic image processing functions performed today such as “red eye” elimination could now be performed on full motion video as well. Audio compression techniques currently in use could also be greatly enhanced without the need to pre-process the audio in non-real time. Still further, the reconfigurable processing techniques disclosed herein are likewise applicable to mobile gaming applications allowing for the provision of improved overall game performance and optimal performance at different points in the game.
It is noted that applications in current Android™ (trademark of Google, Inc.) devices are written in Java™ (trademark of Oracle Corporation). This allows the applications to be portable between all Android devices without requiring that they utilize the same processor. This is accomplished because each processor executes code that implements a virtual Java processor which, in turn, executes the application. The result is portability but at the cost of about four times lower performance than the equivalent C code. This is because the instruction processor has been constructed in software as opposed to hardware. Utilizing the reconfigurable processing techniques disclosed herein, the Java code could instead be instantiated in reconfigurable logic such as an FPGA, resulting in the elimination of the current processor emulation slowdown. The result is Java portability with hardware execution speed.
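The source of the slowdown described above can be illustrated with a minimal sketch of an instruction processor constructed in software. The opcodes below are hypothetical and are not actual Java Virtual Machine instructions, and the step count is only a stand-in for dispatch overhead rather than a measurement of the performance figure cited above.

```python
from typing import List, Tuple

# Hypothetical stack-machine byte code; these opcodes are illustrative only
# and are not actual Java Virtual Machine instructions.
PUSH, ADD, MUL = "PUSH", "ADD", "MUL"
Instruction = Tuple[str, int]


def interpret(program: List[Instruction]) -> Tuple[int, int]:
    """Run the program in software; return (result, dispatch_steps)."""
    stack: List[int] = []
    steps = 0
    for opcode, operand in program:
        steps += 1  # each instruction costs one fetch/decode/dispatch in software
        if opcode == PUSH:
            stack.append(operand)
        elif opcode == ADD:
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif opcode == MUL:
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return stack.pop(), steps


def direct(a: int, b: int, c: int) -> int:
    """The same computation expressed directly, with no dispatch loop."""
    return (a + b) * c


# (2 + 3) * 4 expressed as byte code for the software instruction processor.
program = [(PUSH, 2), (PUSH, 3), (ADD, 0), (PUSH, 4), (MUL, 0)]
result, steps = interpret(program)
```

Both paths compute the same value, but the interpreted path spends a dispatch step on every instruction. That per-instruction work is the kind of overhead that a hardware instantiation of the instruction processor is intended to remove.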
Disclosed herein is a mobile device incorporating the reconfigurable processing technique of the present invention that instantiates a Java Virtual Machine in reconfigurable logic to eliminate the performance degradation observed when implementing the Java Virtual Machine on a microprocessor. In conjunction with the present invention, a compiler is disclosed which takes applications and generates code suitable for being run on a mobile device comprising reconfigurable processors. The compiler disclosed herein is further operable to take Java applications in particular, either altering their code or accepting them in the form of byte code, such that they can be run on a mobile device comprising reconfigurable processors.
Particularly disclosed herein is a mobile device incorporating reconfigurable computing. In a particular embodiment of the present invention the reconfigurable processing capability of the mobile device enables greater computational capability for accessing web sites and allowing for the use of complex data compression and data encryption techniques. The reconfigurable processing technique for mobile devices of the present invention further enables a mobile device to contain secure user medical or other personal information while also providing for potential use as an automotive ignition or other access key.
Other possible applications of the reconfigurable processing technique of the present invention include enabling a mobile device to provide enhanced computational capability to allow for improved audio and video quality through enhanced image processing techniques. Such improved on-board image processing can then provide real-time video to the mobile device including high definition video. In a particular implementation of the reconfigurable processing technique of the present invention, implicit and explicit logic can be utilized in the form of a dense logic device and direct execution logic coupled to a shared memory in accordance with SRC Computers' IMPLICIT+EXPLICIT™ Architecture.
The aforementioned and other features and objects of the present invention and the manner of attaining them will become more apparent and the invention itself will be best understood by reference to the following description of a preferred embodiment taken in conjunction with the accompanying drawings, wherein:
With reference now to
The microprocessor device 102 is further coupled to a system interconnect to which may be attached individual or separately packaged logic devices such as graphics rendering logic 106, encryption/decryption logic 108, interface logic for various input/output (I/O) devices 110, image processing logic 112, audio/video compression/decompression logic 114, secondary microprocessor (μP) logic 116 and the like depending on the mobile device 100 function and features. Each of these separate devices individually, and collectively, places demands on the mobile device 100 power supply and, as such, may have to be of diminished functionality in order not to deplete any on-board battery power too quickly.
With reference additionally now to
As illustrated, the mobile device 200 may also include a user viewable display 208, a speaker 210, a keypad and/or touchscreen 212 for input of data or commands to the mobile device 200, an on-board battery 214 or other power supply and an antenna 216 for transmission or reception of signals external to the mobile device 200. The mobile device 200 may also include a microphone 216 for use in conjunction with the transmission or reception of external signals, for example, in conjunction with a cellular phone feature and/or the input of voice commands to the mobile device 200 itself with the mobile device incorporating a voice recognition function. Briefly, the mobile device 200 may have all of the features of a conventional smart phone, personal digital assistant or any other mobile device.
Representative embodiments for possible implementations of the reconfigurable logic 202, shared memory 204 and microprocessor logic 206 and programming techniques therefor are disclosed in one or more of the following United States patents issued to SRC Computers LLC, assignee of the present invention, the disclosures of which are herein specifically incorporated by this reference in their entirety: U.S. Pat. No. 6,026,459; U.S. Pat. No. 6,076,152; U.S. Pat. No. 6,247,110; U.S. Pat. No. 6,295,598; U.S. Pat. No. 6,339,819; U.S. Pat. No. 6,356,983; U.S. Pat. No. 6,434,687; U.S. Pat. No. 6,594,736; U.S. Pat. No. 6,836,823; U.S. Pat. No. 6,941,539; U.S. Pat. No. 6,961,841; U.S. Pat. No. 6,964,029; U.S. Pat. No. 6,983,456; U.S. Pat. No. 6,996,656; U.S. Pat. No. 7,003,593; U.S. Pat. No. 7,124,211; U.S. Pat. No. 7,134,120; U.S. Pat. No. 7,149,867; U.S. Pat. No. 7,155,602; U.S. Pat. No. 7,155,708; U.S. Pat. No. 7,167,976; U.S. Pat. No. 7,197,575; U.S. Pat. No. 7,225,324; U.S. Pat. No. 7,237,091; U.S. Pat. No. 7,299,458; U.S. Pat. No. 7,373,440; U.S. Pat. No. 7,406,573; U.S. Pat. No. 7,421,524; U.S. Pat. No. 7,424,552; U.S. Pat. No. 7,565,461; U.S. Pat. No. 7,620,800; U.S. Pat. No. 7,680,968; U.S. Pat. No. 7,703,085; and U.S. Pat. No. 7,890,686.
With reference now to
The system 300 comprises, in pertinent part, a unified executable 302 produced through SRC Computers' Carte™ programming environment 304 which allows for application source files being input in, for example, the Fortran or C programming languages. An implicit device 306 and explicit device 308 are programmed through the Carte programming environment, which will be more fully described hereinafter and both are coupled to provide access to a common memory 310. In this regard, the implicit device 306 corresponds to the microprocessor logic 206 (
In this architecture, the implicit and explicit processors 306, 308 are peers with respect to their ability to access system memory contents in the form of common memory 310. In this fashion, overhead associated with having both types of processors working together on the same program is minimized. This allows the SRC Computers' Carte programming tools to utilize whichever processor type is best for a given portion of the overall application without concern for control handoff penalties.
The implicit devices 306 may also be referred to as Dense Logic Devices (DLDs) and encompass a family of components that includes microprocessors, digital signal processors, Graphics Processor Units (GPUs), as well as some Application Specific Integrated Circuits (ASICs). These processing elements are all implicitly controlled and typically are made up of fixed logic that is not altered by the user. These devices execute software-directed instructions on a step-by-step basis in fixed logic having predetermined interconnections and functionality.
On the other hand, the explicit devices 308 may also be referred to as Direct Execution Logic (DEL) and comprise a family of components that is explicitly controlled and is typically reconfigurable. This includes Field Programmable Gate Arrays (FPGAs), Field Programmable Object Arrays (FPOAs) and Complex Programmable Logic Devices (CPLDs). This set of elements enables a program to establish an optimized interconnection among the selected functional units in order to implement a desired computational, pre-fetch and/or data access functionality for maximizing the parallelism inherent in the particular code.
Both the implicit device 306 (DLD) and explicit device 308 (DEL) processing elements are interconnected as peers to a shared system memory (e.g. common memory 310) in one fashion or another and it is not required that interconnects support cache coherency since data sharing can be implemented in an explicit fashion.
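Explicit data sharing between two peers of a common memory can be sketched as follows. This is a software model only, with Python threads standing in for the two device types; the class and function names are hypothetical and do not correspond to any SRC Computers interface.

```python
import threading
from typing import List


class CommonMemory:
    """Hypothetical model of a memory shared by two peer devices."""

    def __init__(self, size: int) -> None:
        self.data: List[int] = [0] * size
        self.input_ready = threading.Event()   # set explicitly by the producer
        self.output_ready = threading.Event()  # set explicitly by the consumer


def implicit_device(mem: CommonMemory, values: List[int]) -> None:
    """Microprocessor-like peer: writes operands, then signals explicitly."""
    mem.data[: len(values)] = values
    mem.input_ready.set()


def explicit_device(mem: CommonMemory, count: int) -> None:
    """Reconfigurable-logic-like peer: waits for the signal, then squares in place."""
    mem.input_ready.wait()
    for i in range(count):
        mem.data[i] = mem.data[i] * mem.data[i]
    mem.output_ready.set()


def run(values: List[int]) -> List[int]:
    """Hand data from one peer to the other through the shared memory."""
    mem = CommonMemory(len(values))
    worker = threading.Thread(target=explicit_device, args=(mem, len(values)))
    worker.start()
    implicit_device(mem, values)
    mem.output_ready.wait()
    worker.join()
    return mem.data


print(run([1, 2, 3, 4]))
```

In this model each peer touches the shared buffer only after the other has explicitly signaled that it is finished, so correctness does not depend on any coherency mechanism between the two.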
The DEL computing of the explicit device 308 uses dynamic logic, which conforms to the application rather than forcing the application into a fixed microprocessor architecture where one size must fit all. This delivers the most efficient circuitry for any particular code in terms of the precision of the functional units and the parallelism that can be found in the code. The result is a dynamic application specific processor that can evolve along with a given code and/or can be reprogrammed in a fraction of a second to handle different codes. DEL computing provides users the performance of a special purpose computer and the economy of a general-purpose machine.
The Carte Programming Environment makes this integration possible by enabling the programmer to utilize ANSI standard languages such as Fortran or C high-level languages to specify their application on both the implicit and explicit devices 306, 308. The output from compilation in the Carte Programming Environment is a single, unified executable for the target heterogeneous computer system such as mobile device 200 (
In some currently available heterogeneous computer systems, a low bandwidth and high latency input/output bus separates the FPGA device from the CPU. The SRC IMPLICIT+EXPLICIT Architecture removes this limitation by enabling the DLD and DEL processors to operate as peers with respect to the system memory. This means only system memory bandwidth and latency limits these devices, which greatly improves overall application performance on the system. The unified programming environment using standard languages and the implicit and explicit devices 306, 308 limited only by system memory 310 characteristics of the IMPLICIT+EXPLICIT Architecture, provides the user with an easy-to-use high-performance application platform unmatched by any system available today.
The IMPLICIT+EXPLICIT Architecture allows users to execute existing code, or easily recompile and develop new codes to take advantage of the power of the reconfigurable DEL processors in the system. This hardware and software architecture fully integrates microprocessor technology and reconfigurable DEL processors to deliver orders of magnitude increases in performance and reductions in power consumption. The SRC Carte Programming Environment eliminates the historic problems that programmers faced in getting microprocessor portions of code to work with reconfigurable processor portions.
With reference additionally now to
In this case, the Carte compiler 410 receives the source files 402, uses the hardware version of the Carte macro libraries 412 and invokes the FPGA place and route tools 414 in order to generate an FPGA bit stream. This bit stream is included in the object file output 416 by the Carte compiler 410. All object files 408 and 416 are linked at step 418 with the hardware macro library symbols 420 being resolved, using the Carte libraries. In this way, the FPGA programming bit stream and the runtime code 424 are embedded within the single unified application executable 422. It is also possible for programmers to incorporate their own Verilog or VHDL IP into these libraries. This allows them to instantiate the IP by using a simple function call.
The programming software comprises two major elements: standard third party software and the SRC Carte Programming Environment. The mobile device 200 (
The Carte Programming Environment takes applications written in standard ANSI Fortran and/or C and seamlessly integrates the computational capability of the reconfigurable logic 202 and microprocessor logic 206 (
Although the Carte Programming Environment is comprised of several components, the major software component is the SRC MAP processor compiler, which is currently available as a MAP/Fortran compiler or a MAP/C compiler. The MAP compiler creates the direct execution logic for the MAP FPGAs. The compilation system extracts maximum parallelism from the code and generates pipelined hardware logic instantiated in the FPGAs. The compiler generates all the required interface code to manage the movement of data to and from the MAP processor, and to coordinate microprocessor execution with the logic running in the MAP processor. The libraries fully support integer, single and double precision floating point data types.
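The benefit of pipelined hardware logic for loops can be sketched with a simple cycle count. The model below assumes an ideal pipeline that accepts one new loop iteration per clock cycle with no stalls; that assumption, along with the function names and the example numbers, is illustrative and is not taken from the MAP compiler.

```python
def sequential_cycles(iterations: int, depth: int) -> int:
    """Cycles when each iteration must finish all stages before the next starts."""
    return iterations * depth


def pipelined_cycles(iterations: int, depth: int) -> int:
    """Cycles for an ideal pipeline: fill once, then one result per cycle."""
    if iterations == 0:
        return 0
    return depth + iterations - 1


n, depth = 1000, 10
print(sequential_cycles(n, depth))  # 10000
print(pipelined_cycles(n, depth))   # 1009
```

For long loops the pipelined count approaches one cycle per iteration, which is why extracting loop-level parallelism matters more than the latency of any single iteration.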
All of the required interface and management code is contained in the Carte runtime libraries. The SNAP™ (trademark of SRC Computers LLC) driver and the associated libraries are provided with the Carte Programming Environment, allowing the application developer to easily design and implement their algorithms in a fully integrated manner. The Carte Programming Environment also provides users with the ability to emulate and simulate compiled code in “debug mode”. Debug mode compilation allows the user to compile and test all of their code on the CPU without invoking the FPGA place and route tools. Loop performance information is also provided in debug mode, which enables accurate MAP processor code performance estimation before FPGA place and route.
With reference additionally now to
Representative implementations and the process for producing possible embodiments of the stacked die 500 are disclosed in one or more of the following United States patents issued to Arbor Company LLP, the disclosures of which are herein specifically incorporated by this reference in their entirety: U.S. Pat. No. 6,627,985; U.S. Pat. No. 6,781,226; U.S. Pat. No. 7,126,214; U.S. Pat. No. 7,282,951 and RE42,035.
With reference additionally now to
The package-on-package 600 comprises a series of high density ball grid array (BGA) contacts 602 for coupling the PoP 600 to a circuit board. The contacts 602 are affixed to a laminate substrate 604 which supports either a single or multiple integrated circuit die element(s) 606 such as the microprocessor logic 206 of
The PoP 600 further comprises a number of lower density BGA contacts 612 which are affixed to another laminate substrate 614 which also supports one or more integrated circuit die element(s) 616. In this regard, the die element(s) 616 may comprise, for example, the memory 204 and reconfigurable logic 202 of the mobile device of
With reference additionally now to
The 2.5 D 700 configuration comprises a number of BGA solder ball contacts 702 and a package substrate 704 having a number of interconnections therethrough to another number of smaller solder bumps 706. The solder bumps 706 support an interposer 708 and are electrically coupled through TSVs (not shown) to a number of high-bandwidth, low-latency interconnections 710 formed in the interposer 708. The interconnections 710 are, in turn, coupled to a number of microbumps 712 which provide electrical connection to various integrated circuit die 714. In this regard, any of the integrated circuit die 714 can comprise the reconfigurable logic 202, memory 204 and/or microprocessor logic 206 of the mobile device 200 of
With reference additionally now to
With reference additionally now to
While there have been described above the principles of the present invention in conjunction with specific apparatus, device configurations and programming environments, it is to be clearly understood that the foregoing description is made only by way of example and not as a limitation to the scope of the invention. Particularly, it is recognized that the teachings of the foregoing disclosure will suggest other modifications to those persons skilled in the relevant art. Such modifications may involve other features which are already known per se and which may be used instead of or in addition to features already described herein. Although claims have been formulated in this application to particular combinations of features, it should be understood that the scope of the disclosure herein also includes any novel feature or any novel combination of features disclosed either explicitly or implicitly or any generalization or modification thereof which would be apparent to persons skilled in the relevant art, whether or not such relates to the same invention as presently claimed in any claim and whether or not it mitigates any or all of the same technical problems as confronted by the present invention. The applicants hereby reserve the right to formulate new claims to such features and/or combinations of such features during the prosecution of the present application or of any further application derived therefrom.
As used herein, the terms “comprises”, “comprising”, or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a recitation of certain elements does not necessarily include only those elements but may include other elements not expressly recited or inherent to such process, method, article or apparatus. None of the description in the present application should be read as implying that any particular element, step, or function is an essential element which must be included in the claim scope and THE SCOPE OF THE PATENTED SUBJECT MATTER IS DEFINED ONLY BY THE CLAIMS AS ALLOWED. Moreover, none of the appended claims are intended to invoke paragraph six of 35 U.S.C. Sect. 112 unless the exact phrase “means for” is employed and is followed by a participle.
The present invention is related to, and claims priority from, U.S. Provisional Patent Application Ser. No. 61/576,846 filed Dec. 16, 2011, the disclosure of which, inclusive of all patents and patent applications cited therein, is herein specifically incorporated by this reference in its entirety.
Number | Date | Country
---|---|---
61576846 | Dec 2011 | US