The present invention relates to Very Large Scale Integrated (VLSI) circuit design technology generally and, more particularly, to a method for composing memory on programmable platform devices to meet varied memory requirements with a fixed set of resources.
Programmable platform architectures for Very Large Scale Integrated (VLSI) circuit designs provide a fixed set of resources for implementing different custom logic designs applied to the platform. Embedded memory is one such resource. The embedded memory requirements of different custom logic designs to be applied to the same programmable platform device can be quite different.
In conventional solutions, standard size embedded memory blocks are provided by the programmable platform device. The blocks are combined to create a desired memory width and depth. The conventional solutions suffer from a lack of flexibility. The designer of the circuit to be fabricated on the programmable platform device has very little flexibility in the customized use of the embedded arrays. The chip designer can only use the resources provided in the restricted mode that has been implemented by the platform designer. A situation can occur where the chip designer does not have the resources to use a memory in an organization best suited to the application.
Conventional solutions also waste die real estate. Combining embedded memory arrays of a preset size can lead to wasted die area. For example, creating a 256×50 array by combining two available 256×40 arrays wastes 75% of the second array. Conventional solutions can also result in late timing information feedback. The effect of the interconnection delay on the timing of the random access memory is not discovered until full chip timing tests can be made, which is usually late in the design process. When working to minimize the time to design a custom logic chip, the earlier in the process that accurate design constraints can be provided to the designer, the simpler (and quicker) relevant design tradeoffs between choices can be made. When accurate information is available only later in the process, significant rework can be necessary, essentially restarting the design with new constraint information, thus negating the progress made under the inaccurate assumptions.
It would be desirable to provide an embedded memory solution that may fulfill the memory size and performance specifications of different designs using a fixed set of resources.
The present invention concerns a method for composing memory on a programmable platform device generally comprising the steps of (A) accepting information about a programmable platform device comprising one or more diffused memory regions and one or more gate array regions; (B) accepting predetermined design information for one or more memories; and (C) composing one or more memory building blocks (i) in the one or more diffused memory regions, (ii) in the one or more gate array regions or (iii) in both the diffused memory and the gate array regions based upon the predetermined design information and the information about the programmable platform device.
The objects, features and advantages of the present invention include providing a method for composing memory on programmable platform devices to meet varied memory criteria with a fixed set of resources that may (i) provide the ability to compose memories from a combination of fixed block diffused memory and gate array memory resources, (ii) provide the ability to include physical ram and logic information in the memory composition process, (iii) provide an automated tool to perform the memory composition methodology, (iv) provide for high flexibility to allow a much wider and richer set of memory combinations to be available to the chip designer, (v) provide for higher density over conventional methods through the intelligent composition of integrated circuit memory resources that reduces wasted silicon, (vi) allow for performance feedback by providing an early view of memory timing performance based on the integrated circuit physical information; (vii) reduce costly redesign late in the design cycle, and/or (viii) provide automated generation of RTL views.
These and other objects, features and advantages of the present invention will be apparent from the following detailed description and the appended claims and drawings in which:
Referring to
The resource selector 102 may be configured, in one example, to compare (i) information about device resources (e.g., the block 110), (ii) information about current availability of the device resources (e.g., the block 112) and (iii) physical layout data (e.g., the block 114) of the programmable platform device with memory specifications of a customer (e.g., the block 116). The resource selector 102 generally determines which resources of the programmable platform device are devoted to composing a particular memory or memories specified by the customer. The resource selector 102 generally passes the resource selection (or allotment) information to the memory composer 104. The memory composer 104 generally generates (or manages the configuration of) various memory shells to satisfy the customer specifications.
The comparison and allocation of resources is generally performed based on a combination of specifications that may be applied to optimize the design. These specifications may, for example, include: User preferences (e.g., the user may chose to instruct the tool as to which resource gets allocated to which memory requirement); Change minimization (e.g., the allocation may be made in a way to minimize change to a nearly complete design); Timing Considerations (e.g., the allocation may be made to most closely match timing requirements to the speed of available resources); Location or Routing Congestions (e.g., the allocation may be made to minimize distance from memory to associated logic, that may improve timing and reduce routing congestion); Density (e.g., the allocation may be made to generate and/or optimize memory density, that may minimize the required chip silicon area).
The device resources 110 generally include composable memory elements comprising, for example, (i) non-pipelined diffused memory, (ii) pipelined diffused memory, (iii) non-pipelined gate array based memory, and/or (iv) pipelined gate array based memory. The non-pipelined diffused memory may be implemented, in one example, as bit cell based diffused memory. The pipelined diffused memory may be implemented, in one example, as diffused memory with stages of flip-flops, registers and/or latches on the memory inputs and/or outputs. The flip-flops, registers and/or latches may be configured for functional or timing purposes. The non-pipelined gate array based memory may be implemented, in one example, as memory built upon sea of gate array elements (e.g., A-cells) on the programmable platform device. The pipelined gate array based memory may be implemented, in one example, as gate array based memory with stages of flip-flops, registers and/or latches on the memory inputs and/or outputs. The flip-flops, registers and/or latches may be configured for functional or timing purposes.
The memory composer 104 may be configured to accept a plurality of inputs from the resource selector 102. For example, the plurality of inputs from the resource selector 102 may comprise customer logic specification inputs, chip resources allotted, and/or physical placement data. However, other inputs may be implemented accordingly to meet the design criteria of a particular implementation. The customer logic specification inputs may comprise, for example, memory performance specifications (e.g., cycle time, access time, etc.), memory dimensions (e.g., array width and depth), number and type of ports in the memory (e.g., single port, dual port, etc.), memory latency, and/or memory power consumption specifications. The chip resources allotment inputs may comprise, for example, type of resource allotted (e.g., diffused memory or gate array memory), the amount of resources allotted, etc. The physical placement data inputs may comprise, in one example, the physical placement of resources and the physical placement of logic accessing the memory. However, other resource selection information (or inputs) may be implemented accordingly to meet the design criteria of a particular implementation.
The memory composer 104 may be configured to provide a plurality of outputs 116. The outputs 116 may comprise, in one example, an RTL view of the generated memory (or memories), synthesis scripts for the generated memory and associated wrappers, static timing scripts for the generated memory and the associated wrappers, and/or a memory built-in self test (BIST) test wrapper (e.g., logic vision compatible). The memory composer 104 may provide basic error checking feedback. For example, the memory composer 104 may be configured to provide information regarding mismatches between resources and customer specifications (e.g., the block 118). For example, the memory composer 104 may be configured to detect and indicate problems such as the timing of a random access memory (RAM) in combination with the interconnection delay and the delay inserted by the wrapper elements being insufficient to meet the customer specifications. The memory composer 104 generally provides an early view of memory timing performance based on the physical information of the chip. By providing the early view of the timing performance, the present invention may reduce or eliminate costly redesign later in the design cycle.
The memory composer 104 generally provides a number of memory composition features. For example, the memory composition features may comprise gross memory solution checking, a number of single port memory compositions and a number of multi-port memory compositions. Gross memory solution checking may comprise analysis of, in one example, customer performance specification versus composed memory performance. Such an analysis may include, for example, a calculation of an interconnect delay from physical placement information and/or additional delay inserted by the wrapper elements (e.g., test and functional wrapper).
Referring to
The memory composer 104 may be configured to generate a number of multi-port memory compositions. For example, the memory composer 104 may provide a two port memory from (i) a double wide combination of single port memories, (ii) a double clocked combination of single port memories, (iii) a single diffused dual port memory (e.g.,
Referring to
Referring to
Referring to
In one example, a 512×80 memory 139 may be composed from two 256×80 memories 140a and 140b. Each of the memories 140a and 140b may include a memory test wrapper (e.g., BIST collar). A wrapper for the memory may comprise logic (e.g., logic gates 142, 144 and 146) for generating an enable signal for each of the memories based on the high address bit. If the composed memory is to be pipelined, the wrapper may include pipeline flip-flops 148a and 148b.
Referring to
Referring to
Referring to
Referring back to
Referring to
The process 100 may further comprise a design qualifier stage 120. The design qualifier 120 may be configured to determine whether the outputs of the memory composer 104 meet the specifications of the customer. When the outputs of the memory composer 104 do not meet the specifications of the customer (e.g., based on predetermined criteria of the customer), the design qualifier may pass information to the resource selector that may result in a new allotment of the available resources.
Referring to
In general, the A-cells may be, in one example, building blocks for logic and/or storage elements. For example, one way of designing a chip that performs logic and storage functions may be to lay down numerous A-cells row after row, column after column. A large area of the chip may be devoted to nothing but A-cells. The A-cells may be personalized (or configured) in subsequent production steps (e.g., by depositing metal layers) to provide particular logic functions. The logic functions may be further wired together (e.g., a gate array design).
The device 190 may comprise one or more hard macros 198. The hard macros 198 may include diffused patterns of a circuit design that is customized and optimized for a particular function. The hard macros generally act much like an ASIC design. For example, a high speed interface may be routed into the hard macro. The hard macro may be configured to perform signal processing to correctly receive the interface and correct for any errors that may be received at the interface, according to the levels of the interface protocol. In general, hard macros may be implemented to provide a number of functions on the device 190. For example, the hard macros 198 may comprise phase locked loops (PLLs), instances of processors, memories, input/output PHY level macros, etc.
Referring to
When the building blocks have been generated, the process may continue by generating RTL code for any pipeline stages, inputs, outputs, multiplexers and/or test structures associated with the types of memories in the customer specifications (e.g., the block 210). The process 200 may perform basic error checking on the compositions (e.g., the block 212). If the compositions do not meet the specifications (e.g., the NO path from the block 212), the process may provide mismatch information (e.g., the block 214). When all of the memories specified have been composed and meet the specifications, the process 200 may present a number of outputs (e.g., the block 216).
In general, the present invention provides a process and architecture to facilitate composing memory building blocks that may be assembled (e.g., customized with one or more metal routing layers) during circuit fabrication to satisfy varied memory specifications based on a fixed set of resources. Using a fixed set of resources for many different designs is generally advantageous. From the point of view of inventory control of the uncustomized slices, the present invention may provide lowered costs and reduced slice design time. From the point of view of the designer the present invention may provide a wider range of platform choices. From the point of view of the platform provider, the present invention may provide a wider addressed market.
Incorporating test automation and debugging access into the automated path may have an advantage of providing right-by-construction test wrappers with very low designer investment. The present invention may provide regular test structures that may allow test program generation to occur outside of the critical path (e.g., the test program may be produced in parallel with the production of the mask sets and silicon, rather than having to be completed before the expensive mask sets are produced).
While the invention has been particularly shown and described with reference to the preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made without departing from the spirit and scope of the invention.
Number | Name | Date | Kind |
---|---|---|---|
4656592 | Spaanenburg et al. | Apr 1987 | A |
5406525 | Nicholes | Apr 1995 | A |
5818728 | Yoeli et al. | Oct 1998 | A |
5818729 | Wang et al. | Oct 1998 | A |
5912850 | Wood et al. | Jun 1999 | A |
6459136 | Amarilio et al. | Oct 2002 | B1 |
6510081 | Blyth et al. | Jan 2003 | B2 |
6529040 | Carberry et al. | Mar 2003 | B1 |
6552410 | Eaton et al. | Apr 2003 | B1 |
20040027856 | Lee et al. | Feb 2004 | A1 |
Number | Date | Country |
---|---|---|
01202397 | Jul 2001 | JP |
02202886 | Jul 2002 | JP |
Number | Date | Country | |
---|---|---|---|
20040111690 A1 | Jun 2004 | US |