This application claims priority of Taiwanese Application No. 097140968, filed on Oct. 24, 2008.
1. Field of the Invention
The invention relates to a digital signal processing method and system, more particularly to a stream processing method and system.
2. Description of the Related Art
A conventional multi-core processor having a micro architecture suffers from data transmission congestion caused by poor stream data access efficiency, which adversely affects its performance.
Therefore, an object of the present invention is to provide a stream processing method and system that can effectively enhance stream data access efficiency in a multi-core stream processing system.
According to one aspect of the present invention, a stream processing system comprises:
a previous-stage module storing a plurality of stream elements each including a group of stream data, each of the stream elements being configured with a specific index value;
a post-stage module; and
a stream fetching module coupled to the previous-stage module and the post-stage module, the stream fetching module being operable to fetch from the previous-stage module the stream elements in sequence such that the index values of the fetched stream elements correspond to a sequence of predetermined index values associated with a desired stream fetching pattern, and providing in sequence the stream elements fetched thereby to the post-stage module.
According to another aspect of the present invention, a stream processing method comprises the steps of:
a) configuring each of a plurality of stream elements stored in a previous-stage module with a specific index value, each of the stream elements including a group of stream data;
b) fetching from the previous-stage module the stream elements in sequence such that the index values of the fetched stream elements correspond to a sequence of predetermined index values associated with a desired stream fetching pattern; and
c) providing in sequence the stream elements fetched in step b) to a post-stage module.
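Steps a) to c) can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function names `configure_indices` and `fetch_by_pattern`, the choice of consecutive integers as index values, and the sample pattern are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Iterator, List

Element = List[int]  # one stream element: a group of stream data


def configure_indices(elements: List[Element]) -> Dict[int, Element]:
    """Step a): configure each stream element with a specific index value.

    Assumption: the index values are simply 0, 1, 2, ... in storage order.
    """
    return {index: element for index, element in enumerate(elements)}


def fetch_by_pattern(indexed: Dict[int, Element],
                     pattern: Iterable[int]) -> Iterator[Element]:
    """Steps b) and c): fetch elements so that their index values follow the
    predetermined sequence, and hand them on in that same order."""
    for index in pattern:
        yield indexed[index]


# Four stream elements held by a hypothetical previous-stage module.
previous_stage = [[10, 11], [20, 21], [30, 31], [40, 41]]
indexed = configure_indices(previous_stage)

# Hypothetical fetching pattern: even index values first, then odd ones.
pattern = [0, 2, 1, 3]
post_stage_input = list(fetch_by_pattern(indexed, pattern))
```

In this sketch the fetching pattern is plain data, so changing the fetch order only requires supplying a different index sequence.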
Other features and advantages of the present invention will become apparent in the following detailed description of the preferred embodiment with reference to the accompanying drawings, of which:
Referring to
The memory 12 is a frame buffer memory for storing frame data in this embodiment. The frame data can be divided into a plurality of stream elements each including a group of stream data. It is noted that the stream elements may overlap each other. Furthermore, if the stream processing system is applied in a video coding system, each stream element can be a macroblock. If the stream processing system is applied in an audio coding system, each stream element can be data of a filter window. If the stream processing system is applied for graphic processing, each stream element can be data of a corresponding one of the vertices of a graphic polygon object.
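The division of frame data into stream elements, including overlapping ones, might look like the following Python sketch. The function `divide_frame`, its `block` and `stride` parameters, and the 4x4 sample frame are hypothetical; the text does not specify how the division is performed.

```python
from typing import List

Frame = List[List[int]]  # rows of samples


def divide_frame(frame: Frame, block: int, stride: int) -> List[Frame]:
    """Split a 2-D frame into block x block stream elements.

    Assumption: elements are taken on a regular grid. A stride smaller than
    the block size makes neighbouring elements overlap.
    """
    height, width = len(frame), len(frame[0])
    elements = []
    for top in range(0, height - block + 1, stride):
        for left in range(0, width - block + 1, stride):
            elements.append(
                [row[left:left + block] for row in frame[top:top + block]]
            )
    return elements


# A 4x4 frame whose samples are numbered 0..15 in raster order.
frame = [[r * 4 + c for c in range(4)] for r in range(4)]

tiles = divide_frame(frame, block=2, stride=2)        # non-overlapping
overlapping = divide_frame(frame, block=2, stride=1)  # neighbours share samples
```

With a stride of 2 the frame yields four disjoint 2x2 elements, similar in spirit to macroblocks; with a stride of 1 it yields nine elements that share samples with their neighbours.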
The first one of the stream fetching modules 13 is operable to fetch from the memory 12 a predetermined number of the stream elements, each configured with a specific index value, in sequence such that the index values of the fetched stream elements correspond to a sequence of predetermined index values associated with a desired stream fetching pattern, and provides in sequence the stream elements fetched thereby to the first one of the stream processing modules 11. It is noted that the predetermined number of the stream elements comes from a part of the frame data. In other embodiments, the predetermined number of the stream elements is the whole of the frame data. It is noted that the desired stream fetching pattern is programmable, as shown in
Similarly, the Pth one of the stream fetching modules 13 fetches from the (P−1)th one of the stream processing modules 11 a plurality of stream elements in sequence, and provides in sequence the stream elements fetched thereby to the Pth one of the stream processing modules 11, where 2≦P≦N. The (N+1)th one of the stream fetching modules 13 fetches from the Nth one of the stream processing modules 11 a plurality of stream elements in sequence, and stores the stream elements fetched thereby in the memory 12.
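The coupling of the N+1 stream fetching modules to the memory and the N stream processing modules can be summarized in a short Python sketch. The function `fetcher_links` and the string labels are illustrative assumptions; only the topology itself comes from the description.

```python
from typing import List, Tuple


def fetcher_links(n: int) -> List[Tuple[str, str]]:
    """Return (source, destination) for stream fetching modules 1..N+1.

    The first module reads from the memory, the Pth module (2 <= P <= N)
    reads from processing module P-1, and the (N+1)th module writes back
    to the memory.
    """
    links = []
    for p in range(1, n + 2):
        source = "memory" if p == 1 else f"processing[{p - 1}]"
        destination = "memory" if p == n + 1 else f"processing[{p}]"
        links.append((source, destination))
    return links


links = fetcher_links(3)  # N = 3 processing modules, hence 4 fetching modules
```

For N = 3 this gives four links, starting and ending at the memory, which matches the series arrangement described above.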
Each stream fetching module 13 includes an address generation unit 131 (see
Referring to
For the Pth one of the stream processing modules 11, where 1≦P≦N, the input register 14 is coupled to the Pth one of the stream fetching modules 13 for receiving and storing the stream elements provided thereby, and the output register 17 is coupled to the (P+1)th one of the stream fetching modules 13.
For each stream processing module 11, the constant register 15 stores constant reference values. The temporary register 16 stores dynamic data during stream processing. The cache memory 18 is coupled to the temporary register 16, and stores required data during stream processing to minimize the number of data accesses. The stream processing unit 10 is coupled to the input register 14, the output register 17, the constant register 15 and the temporary register 16. The stream processing unit 10 is operable to perform in sequence processing of the stream elements stored in the input register 14 using the constant reference values stored in the constant register 15, the dynamic data stored in the temporary register 16, and the required data stored in the cache memory 18 to obtain a processing result, and stores the processing result in the output register 17. It is noted that, for the Pth one of the stream processing modules 11, the processing result stored in the output register 17 includes the plurality of the stream elements fetched by the (P+1)th one of the stream fetching modules 13.
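The roles of the registers and the cache memory can be modeled with a toy Python class. This is a software analogy under stated assumptions, not the hardware design: the class `StreamProcessingModule`, the kernel signature, and the sample kernel `scale_and_count` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class StreamProcessingModule:
    """Toy model of one module; field names mirror the parts named in the text."""
    kernel: Callable[..., int]  # stands in for the stream processing unit
    constant_register: Dict[str, int] = field(default_factory=dict)
    input_register: List[List[int]] = field(default_factory=list)
    temporary_register: Dict[str, int] = field(default_factory=dict)
    cache_memory: Dict[int, int] = field(default_factory=dict)
    output_register: List[List[int]] = field(default_factory=list)

    def run(self) -> None:
        """Process the stored stream elements in sequence."""
        for element in self.input_register:
            self.output_register.append([
                self.kernel(value, self.constant_register,
                            self.temporary_register, self.cache_memory)
                for value in element
            ])


def scale_and_count(value, constants, temporary, cache):
    """Hypothetical kernel: scale by a constant, reuse cached results,
    and keep a running sample count as dynamic data."""
    temporary["count"] = temporary.get("count", 0) + 1
    if value not in cache:
        cache[value] = value * constants["gain"]
    return cache[value]


module = StreamProcessingModule(kernel=scale_and_count,
                                constant_register={"gain": 3})
module.input_register = [[1, 2], [2, 5]]
module.run()
```

Here the repeated sample value 2 is computed once and then served from the cache, which loosely mirrors the stated purpose of reducing data accesses.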
In step S1, the frame data stored in the memory 12 is divided to obtain the predetermined number of the stream elements.
In step S2, each of the stream elements obtained in step S1 is configured with the specific index value.
In step S3, each of the stream fetching modules 13 fetches from a corresponding previous-stage module the stream elements in sequence such that the index values of the fetched stream elements correspond to the sequence of the predetermined index values associated with the desired stream fetching pattern. For the first one of the stream fetching modules 13, the corresponding previous-stage module is the memory 12. For the Pth one of the stream fetching modules 13, where 2≦P≦N+1, the corresponding previous-stage module is the (P−1)th one of the stream processing modules 11.
In step S4, each of the stream fetching modules 13 provides in sequence the stream elements fetched in step S3 to a corresponding post-stage module. For the Pth one of the stream fetching modules 13, where 1≦P≦N, the corresponding post-stage module is the Pth one of the stream processing modules 11. For the (N+1)th one of the stream fetching modules 13, the post-stage module is the memory 12.
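Steps S1 to S4 can be traced end to end with a small Python sketch over one-dimensional frame data. Several details are assumptions rather than part of the description: the function `run_pipeline`, the use of one fetching pattern per stream fetching module, and the re-indexing of each stage's results by output position.

```python
from typing import Callable, List, Sequence

Element = List[int]


def run_pipeline(frame: List[int], size: int,
                 patterns: Sequence[Sequence[int]],
                 stages: Sequence[Callable[[Element], Element]]) -> List[Element]:
    """Run N stages with N+1 fetching patterns and return what is stored back."""
    if len(patterns) != len(stages) + 1:
        raise ValueError("need one fetching pattern per fetching module (N + 1)")
    # S1: divide the frame data into stream elements.
    elements = [frame[i:i + size] for i in range(0, len(frame), size)]
    # S2: configure each element with an index value (assumed: its position).
    source = dict(enumerate(elements))
    for pattern, stage in zip(patterns, stages):
        # S3: fetch from the previous-stage module in pattern order.
        fetched = [source[i] for i in pattern]
        # S4: provide the elements in sequence to the post-stage module.
        results = [stage(element) for element in fetched]
        # Assumption: results are re-indexed by their output position.
        source = dict(enumerate(results))
    # The (N+1)th fetching module stores the final results in the memory.
    return [source[i] for i in patterns[-1]]


def double(element: Element) -> Element:
    return [2 * value for value in element]


def total(element: Element) -> Element:
    return [sum(element)]


stored = run_pipeline(
    frame=[1, 2, 3, 4, 5, 6],
    size=2,
    patterns=[[2, 0, 1], [0, 1, 2], [1, 2, 0]],
    stages=[double, total],
)
```

In this run the first pattern reorders the three elements before doubling, the second passes them through in order to the summing stage, and the third rotates the results before they are written back.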
In sum, the stream processing system of the present invention can be configured as a macro pipeline architecture. Each stream processing module 11 is regarded as a pipeline stage and is coupled in series to a pipeline previous-stage and a pipeline post-stage via the corresponding pair of the stream fetching modules 13, respectively. Furthermore, due to the configuration of the index values for the stream elements and the utilization of the programmable stream fetching pattern, stream data access efficiency of the stream processing system of the present invention can be effectively enhanced.
While the present invention has been described in connection with what is considered the most practical and preferred embodiment, it is understood that this invention is not limited to the disclosed embodiment but is intended to cover various arrangements included within the spirit and scope of the broadest interpretation so as to encompass all such modifications and equivalent arrangements.
Number | Date | Country | Kind |
---|---|---|---|
097140968 | Oct 2008 | TW | national |