The present invention relates generally to communication using broadband passive optical networks (PONs).
As the demand from users for bandwidth is rapidly increasing, optical transmission systems, where subscriber traffic is transmitted using optical networks, are installed to serve this demand. These networks are typically referred to as fiber-to-the-curb (FTTC), fiber-to-the-building (FTTB), fiber-to-the-premise (FTTP), or fiber-to-the-home (FTTH). Each such network provides access from a central office (CO) to a building, or a home, via optical fibers installed near or up to the subscribers' locations. As the transmission capacity of such an optical cable is much greater than the bandwidth actually required by each subscriber, a passive optical network (PON), shared between a plurality of subscribers through a splitter, was developed.
An exemplary diagram of a typical PON 100 is schematically shown in
Traffic processing by an ONU 120 is typically performed by a packet processor that is required to serve a plurality of PON applications of different PON types (e.g., BPON, EPON and GPON) and to process multiple data streams at a high rate. In addition, the packet processor should be capable of performing standard networking tasks such as bridge learning, ATM queuing and shaping, reassembling of packets, and so on.
It would be advantageous to provide a packet processor for PON applications which is capable of efficiently performing the above-mentioned tasks.
It is an object of the invention to provide a packet processor for PON applications which is capable of efficiently performing the above-mentioned tasks.
This object is realized in accordance with a first aspect of the invention by a passive optical network (PON) packet processor for processing PON traffic, said PON packet processor comprising:
a core processor for executing threads related to the processing of said PON traffic;
a plurality of hardware (HW) accelerators coupled to the core processor for accelerating the processing of said PON traffic; and
a memory unit coupled to the core processor for maintaining program and traffic data.
According to another aspect of the invention there is provided a method for effective selection of PON traffic processing related threads for use by a scheduler operative in a passive optical network (PON) packet processor, said method comprising:
receiving requests for invoking said threads from a plurality of request generators;
selecting, based on a priority policy, a thread to be executed; and
sending an identification (ID) number of the selected thread to a context manager.
In order to understand the invention and to see how it may be carried out in practice, an embodiment will now be described, by way of non-limiting example only, with reference to the accompanying drawings, in which:
The packet processor 200 includes a core processor 210, a plurality of hardware (HW) accelerators 220-1 through 220-7, and a memory 230. The core processor 210 may be, for example, a RISC machine that is designed to execute processing tasks with minimal latency. For this purpose, all arithmetic and logic operations as well as source and destination variables are register based. The only operations that require access to the memory 230 are load and store operations. Furthermore, the core processor 210 is designed with separate channels dedicated respectively to program, data, and context accesses. Specifically, the memory units included in the memory 230 are high speed synchronous memories that are used for program, data and context. The program memory 230-1 is, for example, a read only memory that holds the tasks' instructions. The program memory 230-1 is accessible by an external microprocessor 250. The data memory 230-2 is a read/write memory that keeps data of the various tasks. The context memory 230-3 holds instances of registers used by the core processor 210. When switching contexts, the previous context is saved in the context memory 230-3 and a new context is fetched. The context memory 230-3 is also accessible by the external microprocessor 250. The context switching mechanism is controlled by a context manager 240 and will be described in greater detail below.
The HW accelerators 220 are dedicated hardware processing components designed to increase the performance of the packet processor 200 by speeding up time-consuming tasks. These dedicated processing components include at least a lookup table 220-1, a cyclical redundancy checking (CRC) accelerator 220-2, a scheduler 220-3, a register file 220-4, a direct memory access (DMA) accelerator 220-5, an internal bus interface 220-6, and a timer 220-7. The lookup table 220-1 includes MAC addresses used for accessing both PON and Ethernet MAC adapters. Specifically, the lookup table 220-1 includes all learnt destination and source MAC addresses. Entries are added to the lookup table 220-1 by a learning process and removed from the lookup table 220-1 by an aging process. The learning process is triggered if a designated source address was not found in the lookup table 220-1. In addition, the lookup table 220-1 may be used in forwarding packets, filtering packets having unknown MAC addresses, and assigning Virtual LAN (VLAN) tags.
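The learning and aging behavior of the lookup table 220-1 can be pictured with a small software model. This is only a sketch under stated assumptions: the class name `MacTable`, the `max_age` parameter, the port labels, the explicit timestamps, and the choice to refresh the timestamp of an already known address are all hypothetical; the lookup table 220-1 is a hardware component whose internal organization is not specified here.

```python
from typing import Dict, Optional, Tuple


class MacTable:
    """Toy model of a learning/aging MAC lookup table (all names hypothetical)."""

    def __init__(self, max_age: float) -> None:
        self.max_age = max_age  # assumed aging period, in seconds
        # MAC address -> (port, time the entry was last refreshed)
        self.entries: Dict[str, Tuple[str, float]] = {}

    def learn(self, src_mac: str, port: str, now: float) -> bool:
        """Record a source address; return True if it was not found before."""
        is_new = src_mac not in self.entries
        # Assumption: a known address has its timestamp refreshed.
        self.entries[src_mac] = (port, now)
        return is_new

    def lookup(self, dst_mac: str) -> Optional[str]:
        """Return the port of a known destination, or None if it is unknown."""
        entry = self.entries.get(dst_mac)
        return entry[0] if entry is not None else None

    def age(self, now: float) -> int:
        """Remove entries not refreshed within max_age; return the count removed."""
        stale = [mac for mac, (_, seen) in self.entries.items()
                 if now - seen > self.max_age]
        for mac in stale:
            del self.entries[mac]
        return len(stale)
```

A caller would invoke `learn` for the source address of each received frame, `lookup` for its destination address, and `age` from a periodic timer; a `None` result from `lookup` corresponds to the unknown-address case that may be filtered.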
The CRC accelerator 220-2 enables fast CRC calculation for data received through the PON. The CRC accelerator 220-2 operates off line on data stored in the data memory 230-2. Specifically, the CRC accelerator 220-2 includes at least five CRC engines (not shown) that are tailored to calculate the CRC of PON traffic. Typically, each CRC engine is capable of computing a different CRC polynomial. The CRC engines may be, but are not limited to, a 32-CRC engine for computing the CRC over Ethernet and AAL5 frames, a 10-CRC engine for calculating the CRC over OAM cells, a 5-CRC engine utilized for AAL2 frames, and a 16-BIP engine for computing the parity bits of ATM cells. The core processor 210 instructs the CRC accelerator 220-2 which CRC engine or engines are required for the computation. The DMA accelerator 220-5 is responsible for data transfer between the memory 230 and an external memory 260. The register file 220-4 includes all configuration and input/output (I/O) space registers. Configuration registers can be read and written by the external microprocessor 250, while the I/O registers are for the internal use of the core processor 210. The scheduler 220-3 is coupled to different request generators 320-1 to 320-N (shown in
The core processor 210 supports the execution of multiple threads, each of which runs a PON related task. For example, threads run over the core processor 210 include, but are not limited to, a PON RX thread for receiving traffic from an OLT, a PON TX thread for transmitting traffic to an OLT, an Ethernet RX thread for processing frames received from a subscriber device connected to an ONU, an Ethernet TX thread for constructing frames to be sent to a subscriber device, or any other user-defined threads. The execution of some of these threads is described in greater detail below. To ensure optimized performance while executing PON related tasks, the scheduler 220-3 is designed with dedicated mechanisms. These mechanisms comprise a priority-based selection of threads, zero-latency context switching, and the enablement of asynchronous and synchronous requests.
The request generators 320 generate and send to the scheduler 220-3 two types of requests for invoking a thread: asynchronous and synchronous. Asynchronous requests are arbitrarily generated by any peripheral unit 270, the timer 220-7, or by another thread as a result of a thread's activity. For example, a thread may trigger itself either directly or indirectly through a DMA command. A synchronous request is generated by the DMA accelerator 220-5. As can be understood from the above discussion, the DMA accelerator 220-5 may be considered a synchronous request generator, whereas the peripheral units 270 and the timer 220-7 are asynchronous request generators.
An asynchronous request includes an enable bit having a value controlled by the core processor 210. When the enable bit is set to a low logic value (i.e., disabled), the scheduler 220-3 does not accept the request. This ensures that certain threads complete their execution. For example, a PON transmit task executes a DMA command, and thus can be invoked only at the end of this command. In this case, all the asynchronous requests are disabled and not served. However, the disabled requests are saved and served once the enable bit is set to a high value. Another indication that allows controlling requests for invoking threads is a mask indication. This indication masks an active thread, i.e., none of the request generators can call the masked thread.
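The interplay of the enable bit and the mask indication can be illustrated with a short model. This is a sketch only: the class `RequestLine` and its method names are hypothetical, and the choice to drop (rather than hold) a request raised while the thread is masked is an assumption, since the text states only that a masked thread cannot be called.

```python
class RequestLine:
    """Toy model of one asynchronous request line (all names hypothetical)."""

    def __init__(self) -> None:
        self.enabled = True    # enable bit, controlled by the core processor
        self.masked = False    # mask indication for the target thread
        self.pending = False   # a request was raised and has not been served

    def raise_request(self) -> None:
        """Called by a request generator (peripheral, timer, or another thread)."""
        # Assumption: a request aimed at a masked thread is dropped.
        if not self.masked:
            self.pending = True

    def ready(self) -> bool:
        """True when the scheduler may accept the request."""
        # A disabled request stays saved in `pending` but is not accepted.
        return self.pending and self.enabled

    def serve(self) -> None:
        """Called by the scheduler once the thread has been invoked."""
        self.pending = False
```

In this model a request raised while `enabled` is false remains pending and becomes ready as soon as the bit is set high again, which mirrors the save-and-serve-later behavior described above.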
Each thread may have one or more pending requests which are kept together with their status by the scheduler 220-3. In an embodiment of the present invention, the decision as to which request to serve is made by a priority-based selection algorithm. Specifically, during initialization each thread is configured with a priority, e.g., the priority may be the ID number of the thread. Furthermore, all threads are divided into a predefined number of priority groups. Generally, a higher priority request is always served before a lower priority request. However, the scheduler 220-3 may select a request of lower priority, if the lower priority request raises an urgent flag. The priority mechanism is designed to allow the PON packet processor 200 to process data with minimum latency, maximum bandwidth utilization, and minimum data loss. For example, the PON TX thread is assigned a higher priority in order to avoid situations of transmitting an IDLE cell when a T-CONT is granted. As another example, the Ethernet RX thread is set with a higher priority in order to avoid loss of frames. As yet another example, the ability to temporarily raise the priority of a specific thread (using the urgent flag) allows burst data to be handled efficiently. The priority policy of the scheduler 220-3 is designed to serve burst native communication protocols, e.g., a PON protocol. The PON packet processor 200 and the scheduler 220-3 are optimized to support maximum receive and transmit burst size. A person skilled in the art will note that other selection algorithms may be used by the scheduler 220-3. These algorithms include, but are not limited to, round-robin, weighted round-robin, and the like.
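One way to picture the priority-based selection with an urgent flag is the following sketch. It rests on assumptions not stated in the text: the names `Request` and `select_thread` are hypothetical, a larger `priority` value is taken to mean higher priority, and an urgent request is assumed to outrank every non-urgent request. Priority groups are not modeled.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Request:
    """A pending request as kept by the scheduler (hypothetical layout)."""
    thread_id: int
    priority: int          # assumption: larger value means higher priority
    urgent: bool = False   # urgent flag raised by the requester


def select_thread(pending: Iterable[Request]) -> Optional[int]:
    """Return the ID of the thread to run next, or None if nothing is pending."""
    # Urgent requests win first; ties are broken by configured priority.
    best = max(pending, key=lambda r: (r.urgent, r.priority), default=None)
    return best.thread_id if best is not None else None
```

The returned ID is what would be handed to the context manager. Swapping the `key` function is enough to experiment with other policies such as round-robin.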
The ID number of the selected thread is sent to the context manager 310 prior to any context switching. The scheduler 220-3 may update its selection as long as context switching has not occurred. The context manager 240 is responsible for three functions: a) restoring the registers of the next thread from the context memory 230-3 to a shadow register file; b) performing the context switch; and c) saving the registers of the prior thread to the context memory 230-3.
To allow zero-latency when switching context, a new thread is selected and its registers are fetched during the execution of a current thread. The context manager periodically triggers a new thread selection and starts fetching its registers. The next thread to be executed is the latest chosen thread whose registers were completely fetched.
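The prefetch-then-switch sequence can be sketched in software as follows. This is an illustrative model under assumptions: the class `ContextManager`, its method names, and the representation of a register set as a Python list are hypothetical, and a real implementation would fetch registers over several cycles in hardware rather than in one call.

```python
from typing import Dict, List, Optional


class ContextManager:
    """Toy model of shadow-register context switching (names hypothetical)."""

    def __init__(self, context_memory: Dict[int, List[int]], first_thread: int) -> None:
        self.context_memory = context_memory          # thread ID -> saved registers
        self.active_id = first_thread
        self.active_regs: List[int] = list(context_memory[first_thread])
        self.shadow_id: Optional[int] = None          # thread held in the shadow file
        self.shadow_regs: Optional[List[int]] = None

    def prefetch(self, next_id: int) -> None:
        """Restore the next thread's registers into the shadow file.

        Runs while the current thread keeps executing; a later call simply
        replaces an earlier selection that has not been switched to yet.
        """
        self.shadow_regs = list(self.context_memory[next_id])
        self.shadow_id = next_id

    def switch(self) -> bool:
        """Swap in the fully fetched shadow context and save the prior one."""
        if self.shadow_id is None or self.shadow_regs is None:
            return False  # nothing completely fetched yet
        prev_id, prev_regs = self.active_id, self.active_regs
        self.active_id, self.active_regs = self.shadow_id, self.shadow_regs
        self.shadow_id, self.shadow_regs = None, None
        self.context_memory[prev_id] = prev_regs      # save the prior thread
        return True
```

Because the registers of the next thread are already in the shadow file when `switch` is called, the switch itself involves no memory fetch, which is the point of the zero-latency scheme.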
The execution of the PON RX thread is initiated by a PON MAC adapter, which sends an asynchronous request to the scheduler 220-3. At S510, an acknowledge message is sent to the PON adapter. At S515, the flow context according to the flow-ID is retrieved from the context memory 230-3. At S520, a validity check is performed in order to determine the status of the flow, and if the flow is invalid then, at S530, the flow is discarded and execution terminates; otherwise, execution proceeds to S535. The validity check may also be performed by hardware filters embedded in the PON MAC adapter. At S535, the CRC accelerator 220-2 is instructed to perform a CRC check on the data saved in the data memory 230-2. At S540, another check is made to determine if the incoming data chunk is the last data chunk of a packet, and if so execution continues with S565; otherwise, execution proceeds to S545. At S545, an incoming data chunk is saved in the external memory 260. This is performed using a DMA command and by means of the DMA accelerator 220-5. During the execution of the DMA command all incoming requests are masked. At S550, the result of the CRC calculation is read and the residue is stored.
At S565, the reassembled packet is retrieved. This is performed using a DMA command and by means of the DMA accelerator 220-5. During the execution of the DMA command all asynchronous requests are disabled, i.e., the core processor 210 waits for the completion of the data transfer. At S570, the result of the CRC calculation is obtained from the CRC accelerator 220-2 and, at S575, the calculated CRC value is compared to the CRC of the reassembled packet. If the comparison indicates a mismatch, then at S580 the packet is discarded and execution terminates; otherwise, at S585 the packet along with its descriptor is written to an output buffer. Packets in the output buffer are ready to be forwarded to the Ethernet MAC adapter. Once the PON packet processor 200 writes the reassembled packet in the output buffer, it is ready to receive a new packet from the PON MAC adapter.
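The chunk-by-chunk reassembly with a stored CRC residue can be approximated in software as shown below. This is a sketch, not the described hardware flow: the class `Reassembler` is hypothetical, Python's `zlib.crc32` stands in for the 32-CRC engine, a `bytearray` stands in for the external memory 260, and the packet is assumed to end with a 4-byte little-endian CRC-32 trailer in the manner of an Ethernet frame check sequence. Under that assumption the comparison of S575 reduces to checking that the running CRC over payload plus trailer equals the fixed value `0x2144DF1C`.

```python
import zlib
from typing import Optional

# Value of zlib.crc32 over any message followed by its own little-endian CRC-32.
CRC32_GOOD_RESIDUE = 0x2144DF1C


class Reassembler:
    """Toy model of per-flow packet reassembly with a running CRC (hypothetical)."""

    def __init__(self) -> None:
        self.buffer = bytearray()  # stands in for the external memory 260
        self.residue = 0           # CRC state stored between chunks

    def on_chunk(self, chunk: bytes, last: bool) -> Optional[bytes]:
        """Store a chunk; on the last one return the packet, or None if discarded."""
        self.buffer += chunk                                 # save the chunk
        self.residue = zlib.crc32(chunk, self.residue)       # update stored residue
        if not last:
            return None
        packet, residue = bytes(self.buffer), self.residue   # retrieve the packet
        self.buffer, self.residue = bytearray(), 0           # ready for a new packet
        # Accept only if the CRC over payload plus trailer leaves the good residue.
        return packet if residue == CRC32_GOOD_RESIDUE else None
```

Keeping only the running residue between chunks means the CRC never has to be recomputed over the whole packet once the last chunk arrives.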
This application is a continuation-in-part application of U.S. patent application Ser. No. 11/238,022 filed on Sep. 29, 2005, whose contents are incorporated herein by reference.
Number | Name | Date | Kind |
---|---|---|---|
5404463 | McGarvey | Apr 1995 | A |
5930262 | Sierens et al. | Jul 1999 | A |
6229788 | Graves et al. | May 2001 | B1 |
6330584 | Joffe et al. | Dec 2001 | B1 |
6385366 | Lin | May 2002 | B1 |
6519255 | Graves | Feb 2003 | B1 |
6546014 | Kramer et al. | Apr 2003 | B1 |
6934781 | Stone et al. | Aug 2005 | B2 |
7171604 | Sydir et al. | Jan 2007 | B2 |
7372854 | Kang et al. | May 2008 | B2 |
7385995 | Stiscia et al. | Jun 2008 | B2 |
7428385 | Lee et al. | Sep 2008 | B2 |
7567567 | Muller et al. | Jul 2009 | B2 |
7739479 | Bordes et al. | Jun 2010 | B2 |
20030058505 | Arol et al. | Mar 2003 | A1 |
20030137975 | Song et al. | Jul 2003 | A1 |
20040136712 | Stiscia et al. | Jul 2004 | A1 |
20040202470 | Lim et al. | Oct 2004 | A1 |
20040208631 | Song et al. | Oct 2004 | A1 |
20040218534 | Song et al. | Nov 2004 | A1 |
20040264961 | Nam et al. | Dec 2004 | A1 |
20050165985 | Vangal et al. | Jul 2005 | A1 |
20050276283 | Gyselings et al. | Dec 2005 | A1 |
20060080477 | Seigneret et al. | Apr 2006 | A1 |
20060179172 | Ayinala et al. | Aug 2006 | A1 |
20060256804 | Kawarabata et al. | Nov 2006 | A1 |
20070019956 | Sorin et al. | Jan 2007 | A1 |
Entry |
---|
A. Kamil, Notes for CS162, Spring 2004, Discussions 2, 3 and 4, 2004. |
MPC8260 Power QUICC II, User's Manual, Motorola Inc. pp. 4-1 to 4-45, (1999). |
Publication Number | Date | Country |
---|---|---|
20070074218 A1 | Mar 2007 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 11238022 | Sep 2005 | US |
Child | 11349917 | | US |