The present disclosure relates generally to a producer, a method performed by the producer, a consumer and a method performed by the consumer. The present disclosure further relates to a system and a method performed by the system. More particularly, the present disclosure relates to inputting and outputting data items into and out from a ring buffer.
Queues are commonly used in operating systems and telecommunication software to handle scheduling of tasks, jobs or items like communication buffers. In systems where jobs or items may originate from more than one source, e.g. a producer, the queue needs to handle simultaneous producers inputting items to the queue. Likewise, where there are multiple destinations, e.g. consumers, outputting items from the queue, the queue implementation needs to ensure that each item is consumed by exactly one consumer.
A queue may be implemented in various ways. The two most common are a linked list and a ring buffer.
To handle the situation with multiple concurrently executing producers and/or consumers, the implementation needs to ensure that the queue remains consistent through simultaneous operation. Existing queue solutions are either based on using locks or spin locks to avoid simultaneous operation from different producers and consumers, or based on using multiple sets of head-tail pointer pairs for lock-less operation.
The queues described above do not have any filters. In other words, all consumers are equal so that items originating from any producer may be consumed by any consumer. Typical use is a work queue where one or more producers may queue jobs, normally a function pointer and a pointer to some data. These jobs are then pulled by any one from the set of worker threads, e.g. consumers, that service the work queue.
The existing solutions either require the use of locks to serialize access to the queue, or utilize multiple head/tail pointer pairs and a retry in the case where multiple producers and multiple consumers try to operate the queue simultaneously.
Modern Central Processing Units (CPUs) provide support for a CPU to suspend execution and wait for an update of a certain memory location. Once that memory location is written to, either by some other CPU or by some peripheral device, e.g. a Network Interface Card (NIC), the CPU resumes execution. The main reason to suspend execution during the wait time is to reduce power consumption. On AMD CPUs this is available to user space execution through the instructions MONITORX, to set up the memory position to monitor, and MWAITX, to suspend execution.
To make efficient use of the feature that allows a CPU to suspend execution, there is a need for a way to implement a queue where the consumers do not wait for, e.g. monitor, the same memory position to be updated. In the case of the linked list and the ring buffer, as described earlier, the consumer would need to monitor either the head pointer of the linked list or the output pointer or output index of the ring buffer. This way, all the consumers would resume execution when the queue is updated, and they would then need to arbitrate access to the item to pull from the queue, using locks or some lockless scheme. This is inefficient both from a power perspective and from an execution time perspective.
Therefore, there is a need to at least mitigate or solve this issue.
An object is to obviate at least one of the above disadvantages and to improve inputting and outputting of data items in and from a ring buffer.
According to a first aspect, the object is achieved by a method performed by a producer for inputting data items in a ring buffer. The ring buffer comprises a number of elements. The ring buffer is associated with an input counter. The producer performs an atomic operation comprising obtaining a sampled value of the input counter and incrementing the input counter. The producer determines an input position in the ring buffer to be the sampled value of the input counter modulo a size of the ring buffer. The producer determines if a status of the element located at the input position indicates that the element is free or occupied. If the status indicates free, then the producer inputs the data item in the element located at the input position. After the data item has been inputted, the producer sets the status of the element to occupied. Else, if the status indicates occupied, then the producer repeats the step of determining if status of the element located at the input position is free or occupied until the status indicates free.
According to a second aspect, the object is achieved by a method performed by a producer for inputting multiple data items into a ring buffer. The ring buffer comprises a number of elements. The ring buffer is associated with an input counter. The producer performs an atomic operation comprising obtaining a sampled value of the input counter and incrementing the input counter with a number of data items to insert. The producer sets an index to zero prior to inputting a first data item into the ring buffer. The producer determines an input position in the ring buffer to be the sampled value of the input counter plus the index modulo a size of the ring buffer. The producer determines if a status of the element located at the input position indicates that the element is free or occupied. If the status indicates free, the producer inputs a data item in the element located at the input position. After the data item has been inputted, the producer sets the status of the element to occupied. Else, if the status indicates occupied, then the producer repeats the step of determining if the status of the element located at the input position is free or occupied until the status indicates free. The producer increments the index by one after each data item has been inputted into the ring buffer. After each data item has been inputted into the ring buffer, the producer compares the index with the number of data items to be inputted. If the index is lower than the number of data items to be inputted, the producer handles the next data item. If the index is not lower than the number of data items to be inputted, the producer determines that all data items have been inputted into the ring buffer.
According to a third aspect, the object is achieved by a method performed by a consumer for outputting data items from a ring buffer. The ring buffer comprises a number of elements. The ring buffer is associated with an output counter. The consumer performs an atomic operation comprising to obtain a sampled value of the output counter and to increment the output counter. The consumer determines an output position in the ring buffer to be the sampled value of the output counter modulo a size of the ring buffer. The output position is unique for the consumer. The consumer monitors a status of the element at the output position. The status indicates that the element is free or occupied. If the status indicates free, the consumer continues to monitor the status of the element until the status indicates occupied. If the status indicates occupied, the consumer determines that the element has been validly read at the output position. After it has been determined that the element has been validly read, the consumer sets the status of the element to free. The consumer outputs the data item from the element which has been validly read and located at the output position.
According to a fourth aspect, the object is achieved by a producer for inputting data items in a ring buffer. The ring buffer comprises a number of elements. The ring buffer is associated with an input counter. The producer is adapted to perform an atomic operation comprising to obtain a sampled value of the input counter and to increment the value of the input counter. The producer is adapted to determine an input position in the ring buffer to be the sampled value of the input counter modulo a size of the ring buffer. The producer is adapted to determine if a status of the element located at the input position indicates that the element is free or occupied. The producer is adapted to, if the status indicates free, input the data item in the element located at the input position. The producer is adapted to, after the data item has been inputted, set the status of the element to occupied. The producer is adapted to, if the status indicates occupied, repeat the determining if the status of the element located at the input position is free or occupied until the status indicates free.
According to a fifth aspect, the object is achieved by a producer for inputting multiple data items into a ring buffer. The ring buffer comprises a number of elements. The ring buffer is associated with an input counter. The producer is adapted to perform an atomic operation comprising obtaining a sampled value of the input counter and incrementing the input counter with a number of data items to insert. The producer is adapted to set an index to zero prior to inputting a first data item into the ring buffer. The producer is adapted to determine an input position in the ring buffer to be the sampled value of the input counter plus the index modulo a size of the ring buffer. The producer is adapted to determine if a status of the element located at the input position indicates that the element is free or occupied. The producer is adapted to, if the status indicates free, input a data item in the element located at the input position. The producer is adapted to, after the data item has been inputted, set the status of the element to occupied. The producer is adapted to, if the status indicates occupied, repeat the step of determining if the status of the element located at the input position is free or occupied until the status indicates free. The producer is adapted to increment the index by one after each data item has been inputted into the ring buffer. The producer is adapted to, after each data item has been inputted into the ring buffer, compare the index with the number of data items to be inputted. The producer is adapted to, if the index is lower than the number of data items to be inputted, handle the next data item. The producer is adapted to, if the index is not lower than the number of data items to be inputted, determine that all data items have been inputted into the ring buffer.
According to a sixth aspect, the object is achieved by a consumer for outputting data items from a ring buffer. The ring buffer comprises a number of elements. The ring buffer is associated with an output counter. The consumer is adapted to perform an atomic operation comprising to obtain a sampled value of the output counter and to increment the output counter. The consumer is adapted to determine an output position in the ring buffer to be the sampled value of the output counter modulo a size of the ring buffer. The output position is unique for the consumer. The consumer is adapted to monitor a status of the element at the output position. The status indicates that the element is free or occupied. The consumer is adapted to, if the status indicates free, continue to monitor the status of the element until the status indicates occupied. The consumer is adapted to, if the status indicates occupied, determine that the element has been validly read at the output position. The consumer is adapted to, after it has been determined that the element has been validly read, set the status of the element to free. The consumer is adapted to output the data item from the element which has been validly read and located at the output position.
According to a seventh aspect, the object is achieved by a method performed by a system for inputting and outputting data items into and from a ring buffer. The system comprises at least one producer and at least one consumer. The ring buffer comprises a number of elements and the ring buffer is associated with an input counter and an output counter. The method comprises the steps of the first aspect or the second aspect performed by the at least one producer, and the steps of the third aspect performed by the at least one consumer.
According to an eighth aspect, the object is achieved by a system for inputting and outputting data items into and from a ring buffer. The system comprises at least one producer and at least one consumer. The at least one producer is adapted to carry out the method according to the first aspect or the second aspect. The at least one consumer is adapted to carry out the method according to the third aspect.
Since the consumer has a unique output position in the ring buffer to monitor, no other consumer will monitor and output data items from the same element in the ring buffer. Thus, inputting and outputting of data items in and from a ring buffer is improved.
The present disclosure herein affords many advantages, of which a non-exhaustive list of examples follows:
One advantage is that the consumer determines a position in the ring buffer to output a data item from that is unique, i.e. only one consumer will watch each position in the ring buffer. This way, the need to repeatedly arbitrate access by multiple consumers to the same position in the ring buffer is avoided, since it is handled by a single atomic operation where the consumer determines the position to watch.
Another advantage is that the elements comprised in the ring buffer may be dimensioned such that they do not share a cache line, which reduces the need for cache line updates, e.g. snooping. If elements share a cache line, an update in one element would cause invalidation of the cache line, and the local caches of the CPU cores where consumers are monitoring elements in that cache line would need to have their cache line updated.
A further advantage is that the elements comprised in the ring buffer may be dimensioned such that they fit the size of memory area that supports unique memory monitoring. Such monitoring is used to allow a CPU to go to a power saving state where its clock is gated and the CPU is blocked until a write operation to the monitored memory area occurs, either initiated from another CPU or an I/O device writing to that memory location. By having each consumer monitoring “its own” position, only the consumer that is monitoring this certain ring buffer position resumes execution when the position is updated, avoiding the power consumption caused by more consumers being resumed.
Another advantage of the present disclosure is that it is lockless, which increases efficiency both from a power perspective and from an execution time perspective, as compared to a method which uses locks. The CPUs comprising the consumers only need to wake up to handle items that match the position in the ring buffer they obtained. If all consumers monitored the front of the ring buffer, all would wake up on each insertion just to arbitrate which one consumer picks the item. With respect to the execution time, when a data item is inserted into the element at the position monitored by a consumer, that consumer immediately knows that the data item belongs to it and does not need to arbitrate that using locks or other means, so the inserted job can be deployed, consumed and handled with a shorter latency.
A further advantage of the present disclosure is that it enables producers to insert multiple data items into the ring by using only one single atomic operation with some store instructions to store the data.
Another advantage of the present disclosure is that the producers can input data items into multiple elements using only one instruction.
The present disclosure is not limited to the features and advantages mentioned above. A person skilled in the art will recognize additional features and advantages upon reading the following detailed description.
The present disclosure will now be described in more detail by way of example only in the following detailed description by reference to the appended drawings in which:
The drawings are not necessarily to scale, and the dimensions of certain features may have been exaggerated for the sake of clarity. Emphasis is instead placed upon illustrating the principle.
Each element 103 in the ring buffer 100 may be free or not free. In other words, a status of each element 103 may indicate that the element 103 is free or not free.
Each element 103 in the ring buffer 100 is adapted to store or comprise a data item. The data item comprised in an element 103 may be outputted from the element 103, when the status of the element 103 indicates a not free element. A data item may be inserted into the element 103 when the status of the element 103 indicates a free element. The data item may be referred to as an object, a data object. The data item may comprise data or it may comprise a pointer or reference to data, or it may comprise both data and a pointer to data, i.e. a pointer to other or more data. When the data item comprises a pointer or reference to data, then the pointer or reference is towards a location where the data is stored. The data item may comprise a pointer to a buffer with received network data. The data item may comprise a structure representing a task to perform, i.e. combining a function pointer and a data pointer, for example. The data item may comprise one or more pointers, and the pointers may be a data pointer or a function pointer, or both a data pointer and a function pointer. The data item may comprise a function pointer and a data pointer in a scenario when the ring buffer 100 is used to implement a work queue.
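As an illustration of a data item that combines a function pointer and a data pointer, a work-queue item might be sketched in C as follows. The names `work_fn`, `work_item`, `run_item` and `add_one` are hypothetical and chosen only for this example; they are not part of the disclosure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical task signature: a job receives a pointer to its data. */
typedef void (*work_fn)(void *data);

/* Hypothetical data item for a work queue:
 * a function pointer combined with a data pointer. */
struct work_item {
    work_fn fn;    /* function to execute */
    void   *data;  /* data the function operates on */
};

/* What a consumer might do with an item it has pulled from the queue. */
void run_item(const struct work_item *item)
{
    assert(item != NULL);
    if (item->fn != NULL)
        item->fn(item->data);
}

/* Example job, for illustration only: increments the int it points to. */
void add_one(void *data)
{
    *(int *)data += 1;
}
```

A consumer holding `struct work_item it = { add_one, &v };` would call `run_item(&it)` to perform the job on `v`.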
The elements 103 comprised in the ring buffer 100 may be equally sized. The data items that are inputted and outputted from the ring buffer 100 may all be of the same type or they may be of different types, where the type is either data or a pointer or reference to data.
As schematically illustrated in
The producer 105 is adapted to produce data items and to input the produced data items in the ring buffer 100, i.e. in elements 103 comprised in the ring buffer 100. The consumer 108 is adapted to consume data items that it outputs from the ring buffer 100, i.e. from elements 103 comprised in the ring buffer 100. Inputting data items may be referred to as inserting data items, adding data items, writing data items, populating data items, enqueuing data items etc. Outputting data items may be referred to as retrieving data items, reading data items, pulling data items, dequeuing data items etc. As mentioned above, the ring buffer 100 comprises a vector of elements 103. On reaching the last element 103 of the vector, the producer 105 and the consumer 108 loop back to the beginning, i.e. to the first element 103 of the vector. The consumer 108 tracks behind the producer 105 in the ring buffer 100.
The ring buffer 100 comprises a fixed number of equally sized elements 103. The number of elements 103 in the ring buffer 100 may be equal to or greater than the number of consumers 108 that may concurrently request items from the ring buffer 100. Each element 103 may have a status attribute which indicates whether the element 103 is free or not.
An input counter (icnt) is associated with the ring buffer 100. The input counter may be an atomic unsigned integer that only counts upwards and wraps at the arithmetic limit of the integer. The value of the input counter modulo the size, i.e. number of elements 103, of the ring buffer 100 serves as the position at which data items are inputted into the ring buffer 100. The term atomic refers to “atomically operated on”, i.e. the counter can be read, modified and written in one operation. For example, if one consumer 108 or producer 105 performs an atomic fetch-and-add procedure, only that consumer 108 or producer 105 retrieves the read value, and a competing consumer 108 or producer 105 reads the value after the update.
An output counter (ocnt) is associated with the ring buffer 100. The output counter (ocnt) may be an atomic unsigned integer that only counts upwards and wraps at the arithmetic limit of the integer. The value of the output counter modulo the size, i.e. number of elements 103, of the ring buffer 100 serves as the position or point at which data items are outputted from the ring buffer 100.
The present disclosure makes use of atomic operations, e.g. atomic_fetch_add or fetch_add, to operate the counters, e.g. the input counter and output counter. The counters will be described in more detail below.
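The way a counter is sampled and advanced in one atomic step can be sketched with the C11 `atomic_fetch_add` function, which returns the value the counter held before the addition. This is a minimal sketch; the struct `ring_counters`, the helpers `sample_and_advance` and `ring_position`, and the length `RING_SIZE` are assumptions made for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define RING_SIZE 8u  /* assumed number of elements in the ring buffer */

/* Hypothetical pair of counters associated with one ring buffer. */
struct ring_counters {
    atomic_uint icnt;  /* input counter, advanced by producers */
    atomic_uint ocnt;  /* output counter, advanced by consumers */
};

/* Atomically sample the counter and advance it by n.
 * The returned value is the counter value before the addition,
 * so no two callers obtain the same sample. */
unsigned sample_and_advance(atomic_uint *cnt, unsigned n)
{
    assert(cnt != NULL);
    return atomic_fetch_add(cnt, n);
}

/* Map a sampled counter value to a position in the ring buffer. */
unsigned ring_position(unsigned sampled)
{
    return sampled % RING_SIZE;
}
```

With the input counter at 6, a call with `n = 1` returns 6 and leaves the counter at 7; a following call with `n = 3` returns 7 and leaves the counter at 10, which maps to position 2.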
The elements 103 in the ring buffer 100 are located at a position. There may be input positions and output positions. The positions may be referred to as atomic pointers to entries in the ring buffer 100. The positions may be used in combination with a scheme where the consumers 108 pre-allocate positions in the ring buffer 100 combined with an indication in the element 103 itself whether it is free or not.
The consumer 108 determines a unique output position in the ring buffer 100 to output or pull a data item from. The output position is the sampled value (ocnt_s) of the output counter (ocnt) modulo the size of the ring buffer 100. Because the value is sampled and the counter incremented in one atomic operation, no two consumers 108 obtain the same sampled value. The output position is unique in that only one consumer 108 is allowed to monitor and output a data item from an element 103 at this position. This may be described as the consumer 108 having pre-allocated a position in the ring buffer 100 which it monitors and outputs data items from. This means that only one consumer 108 will monitor its unique output position in the ring buffer 100. This way, the need to repeatedly arbitrate access by multiple consumers 108 to the same position in the ring buffer 100 is avoided. This is obtained with one single atomic operation where the consumer 108 determines the position in the ring buffer 100 to watch. With this, multiple consumers 108 may output data items from the ring buffer 100 at the same time, each from its respective unique output position.
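The relation between the number of waiting consumers and the number of elements can be illustrated with a small sequential sketch. It lets a number of hypothetical consumers each sample the output counter once, without any of them completing, and checks whether the resulting positions are distinct. The function `positions_are_unique` and the length `RING_SIZE` are assumptions made for the example, and the loop stands in for consumers that would in practice run concurrently.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define RING_SIZE 4u  /* assumed number of elements in the ring buffer */

/* Each of `consumers` hypothetical consumers samples the output counter
 * once. Returns true if every consumer obtained its own position. */
bool positions_are_unique(unsigned consumers)
{
    atomic_uint ocnt;
    bool taken[RING_SIZE] = { false };

    atomic_init(&ocnt, 0);
    for (unsigned c = 0; c < consumers; c++) {
        /* Sample and increment in one atomic operation. */
        unsigned pos = atomic_fetch_add(&ocnt, 1) % RING_SIZE;
        if (taken[pos])
            return false;  /* a second waiting consumer hit this position */
        taken[pos] = true;
    }
    return true;
}
```

In this sketch, up to `RING_SIZE` simultaneously waiting consumers receive distinct positions, which is consistent with dimensioning the ring buffer to hold at least as many elements as there are concurrently requesting consumers.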
There are at least two reasons to avoid having the consumers 108 monitor the same position. One reason is that the consumer 108 may have a local cache memory, and when the monitored memory is updated, e.g. due to an insertion into the ring buffer 100, the local caches of all the consumers 108 monitoring that position need to be updated even though only one consumer 108 will output the item being inserted. This is avoided by aligning the memory position and size of each element 103 of the ring buffer 100 to the size of the cache line of the consumer 108, e.g. the processor of the consumer 108. The other reason is that the consumer 108 may support suspending execution while awaiting an update of a memory location, as described earlier. It is then desired that each consumer 108 monitors its own memory position or range. The present disclosure may benefit from this by aligning the memory position and size of each element 103 of the ring buffer 100 to the size which the consumer 108 may monitor to resume execution.
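One possible way to obtain such alignment in C11 is to align each element with `alignas`, so that the compiler pads every element to a whole number of granules. The sketch below assumes a granule of 64 bytes for both the cache line and the monitored range; the actual value depends on the processor. The names `MONITOR_GRANULE`, `struct element` and `element_stride` are hypothetical.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define MONITOR_GRANULE 64  /* assumed cache line / monitored range, bytes */

/* Hypothetical element: the status flag and the data item live together
 * in a granule that no other element shares. */
struct element {
    alignas(MONITOR_GRANULE) atomic_int status;  /* 0 = free, 1 = occupied */
    void *item;                                  /* the data item */
};

/* Compile-time checks that neighbouring elements never share a granule. */
static_assert(sizeof(struct element) % MONITOR_GRANULE == 0,
              "element must fill whole granules");
static_assert(alignof(struct element) == MONITOR_GRANULE,
              "element must start on a granule boundary");

/* Byte distance between two neighbouring elements of a ring. */
size_t element_stride(const struct element *ring)
{
    return (size_t)((const char *)&ring[1] - (const char *)&ring[0]);
}
```

The cost of this layout is memory: each element occupies a full granule even when the data item itself is only a pointer.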
Some examples of the inputting and outputting of data items in the ring buffer 100 will now be described.
The producer 105 determines that it has a data item to input into the ring buffer 100, i.e. it initiates a put(data) operation.
The producer 105 executes a fetch_add operation, e.g. an atomic fetch_add operation, on an input counter (icnt) to atomically get the sampled input counter (icnt_s) and forward the input counter (icnt) to its next value, possibly wrapping to zero. The phrase “wrapping to zero” means that when the input counter (icnt) has reached its maximum value, it gets the value 0 at the next incrementation. The sampled input counter (icnt_s) is the sampled value of the input counter (icnt) before it is incremented. The sampled value of the input counter may be described as the current value of the input counter.
Execution of the fetch_add operation may be described as the producer 105 reading the input counter (icnt). Forwarding the input counter (icnt) to its next value may be described as the producer 105 incrementing the input counter (icnt).
The input counter (icnt) may be any suitable counter. The input counter (icnt) may be described as a counter associated with the inputting of the data item and which counts upwards. The input counter may be an unsigned integer.
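The effect of the counter wrapping to zero on the position can be examined with a small sketch. It uses an 8-bit counter purely to keep the numbers small; this width, and the helpers `position_of` and `wrap_is_seamless`, are assumptions made for the example. The sketch suggests an observation that is not stated above: the sequence of positions continues without a jump across the wrap only when the ring length divides the counter range, which holds for lengths that are powers of two.

```c
#include <assert.h>
#include <stdint.h>

/* Position for a sampled counter value and a given ring length. */
unsigned position_of(uint8_t sampled, unsigned length)
{
    assert(length > 0);
    return sampled % length;
}

/* Returns 1 if the position obtained just after the counter wraps
 * directly follows the position obtained just before the wrap. */
int wrap_is_seamless(unsigned length)
{
    uint8_t before = UINT8_MAX;             /* last value before wrapping */
    uint8_t after  = (uint8_t)(before + 1); /* wraps to zero */

    return position_of(after, length)
           == (position_of(before, length) + 1) % length;
}
```

With a length of 8, the positions run 7 and then 0 across the wrap. With a length of 6, the value 255 maps to position 3 and the next value 0 maps to position 0, so positions 4 and 5 are skipped on that lap.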
The producer 105 determines the input position (pos) in the ring buffer 100. The input position is determined to be the sampled input counter (icnt_s) modulo (%) the length of the ring buffer 100. The length of the ring buffer 100 may be referred to as the size of the ring buffer 100 and is the same as the number of elements 103 comprised in the ring buffer 100. The input position (pos) may be referred to as an insertion point. The % sign seen in
The producer 105 checks if the status of the element 103 at the input position (pos) that was determined in step 202 is free or not. If the element 103 located at the position is not free, indicated with “no” in
The input position (pos) may point to an occupied element 103, for example in the case where the ring buffer 100 is full. The producer 105 then needs to wait for that position to become free before it can input its data item to that position. For a healthy and non-exhausted system, the ring buffer 100 may be dimensioned so that this never or rarely happens.
The producer 105 inputs the data item into the element 103 located at the input position (pos), which at this point has the status free. In other words, the producer 105 populates the input position (pos) with its data item.
The producer 105 sets the status of the element 103 at the input position (pos) to occupied, i.e. valid or not free. In other words, the status is changed from free to not free. After step 205, the inputting method may be considered as being completed.
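Steps 201 to 205 can be sketched in C11 as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the data item is a plain `int`, the ring has a fixed length `RING_SIZE`, waiting is done by busy polling, and the acquire and release memory orderings are a choice made for the example. The names `struct ring`, `ring_init` and `ring_put` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define RING_SIZE 4u  /* assumed number of elements */

enum { ELEM_FREE = 0, ELEM_OCCUPIED = 1 };

struct element {
    atomic_int status;  /* ELEM_FREE or ELEM_OCCUPIED */
    int item;           /* the data item; an int for illustration */
};

struct ring {
    atomic_uint icnt;   /* input counter */
    struct element elem[RING_SIZE];
};

void ring_init(struct ring *r)
{
    assert(r != NULL);
    atomic_init(&r->icnt, 0);
    for (unsigned i = 0; i < RING_SIZE; i++) {
        atomic_init(&r->elem[i].status, ELEM_FREE);
        r->elem[i].item = 0;
    }
}

/* Inputs one data item and returns the position it was stored at. */
unsigned ring_put(struct ring *r, int item)
{
    /* Step 201: sample the input counter and increment it atomically. */
    unsigned icnt_s = atomic_fetch_add(&r->icnt, 1);
    /* Step 202: the input position is the sample modulo the length. */
    unsigned pos = icnt_s % RING_SIZE;
    struct element *e = &r->elem[pos];

    /* Step 203: repeat the check until the element is free. */
    while (atomic_load_explicit(&e->status, memory_order_acquire)
           != ELEM_FREE)
        ;
    /* Step 204: input the data item. */
    e->item = item;
    /* Step 205: mark the element occupied, publishing the item. */
    atomic_store_explicit(&e->status, ELEM_OCCUPIED, memory_order_release);
    return pos;
}
```

Storing the status after the data item, with release ordering, is what lets a consumer that observes the status occupied also observe the item. Note that `ring_put` does not return while the element at its position stays occupied.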
In
This step corresponds to step 201 in
Step 301 may be described as the producer 105 reading the input counter (icnt) and incrementing the input counter (icnt) by the number of elements 103 to insert (n), for example using an atomic operation.
The input counter (icnt) may be any suitable counter. The input counter (icnt) may be described as a counter associated with the inputting of the data item and which counts upwards. The input counter (icnt) may be an unsigned integer.
The producer 105 sets an index to zero, i=0. The index is set to zero prior to inputting the first data item into the ring buffer 100, i.e. the first data item of the multiple data items to be inputted. In other words, the producer 105 initializes the index (i) for the input data to zero.
This step corresponds to step 202 in
The input position in step 303 is determined in a different way than in step 202 in that the index (i) is taken into account in step 303, in addition to the sampled input counter (icnt_s) and the length of the ring buffer 100. In step 202, only the sampled input counter (icnt_s) and the length of the ring buffer 100 are taken into account.
This step corresponds to step 203 in
The input position (pos) may point to an occupied element 103, for example in the case where the ring buffer 100 is full. The producer 105 then needs to wait for that input position (pos) to become free before it can input its data item to that input position (pos). For a healthy and non-exhausted system, the ring buffer 100 may be dimensioned so that this never or rarely happens.
This step corresponds to step 204 in
This step corresponds to step 205 in
The producer 105 increments the index i by one, i=i+1, after each data item has been inputted into the ring buffer 100.
The producer 105 checks if there are remaining data items to be put into the ring buffer 100, i.e. the producer 105 compares the index i with the number of data items to be inputted. In the comparison, the producer 105 checks if i<n, where n is the total number of data items that the producer 105 will input into the ring buffer 100. If there are remaining data items to be put into the ring buffer 100, i.e. if the index is lower than the number of data items, then the producer 105 loops back and performs step 303 again. If there are no remaining data items, i.e. all data items have been inputted into the ring buffer 100, then the method proceeds to step 309.
If the index i is not lower than the number of data items to be inputted, then the producer 105 determines that all data items have been inputted into the ring buffer 100 and the method is completed.
Steps 303-308 may be performed for one or multiple data items at a time. If the producer 105 determines that consecutive elements 103 are free in parallel, then the producer 105 may update them using one operation. The loop of steps 303-308 may handle multiple data items per lap.
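Steps 301 to 309 can be sketched in C11 as follows, handling one data item per lap of the loop. As before this is a sketch under assumptions: the data items are plain `int` values, waiting is done by busy polling, and the names `struct ring`, `ring_init` (with a hypothetical `start` parameter for the initial counter value) and `ring_put_many` are introduced only for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define RING_SIZE 8u  /* assumed number of elements */

enum { ELEM_FREE = 0, ELEM_OCCUPIED = 1 };

struct element {
    atomic_int status;  /* ELEM_FREE or ELEM_OCCUPIED */
    int item;           /* the data item; an int for illustration */
};

struct ring {
    atomic_uint icnt;   /* input counter */
    struct element elem[RING_SIZE];
};

/* `start` is the initial counter value, exposed to make wrapping visible. */
void ring_init(struct ring *r, unsigned start)
{
    atomic_init(&r->icnt, start);
    for (unsigned i = 0; i < RING_SIZE; i++) {
        atomic_init(&r->elem[i].status, ELEM_FREE);
        r->elem[i].item = 0;
    }
}

/* Inputs n data items using a single atomic operation on the counter. */
void ring_put_many(struct ring *r, const int *items, unsigned n)
{
    assert(r != NULL && items != NULL);
    /* Step 301: sample the counter and advance it by n in one operation,
     * which reserves n consecutive positions for this producer. */
    unsigned icnt_s = atomic_fetch_add(&r->icnt, n);

    /* Steps 302, 307 and 308: index from zero, incremented per item,
     * compared against n; leaving the loop corresponds to step 309. */
    for (unsigned i = 0; i < n; i++) {
        /* Step 303: position is (sample + index) modulo the length. */
        struct element *e = &r->elem[(icnt_s + i) % RING_SIZE];

        /* Step 304: wait until the element is free. */
        while (atomic_load_explicit(&e->status, memory_order_acquire)
               != ELEM_FREE)
            ;
        e->item = items[i];                        /* step 305 */
        atomic_store_explicit(&e->status, ELEM_OCCUPIED,
                              memory_order_release); /* step 306 */
    }
}
```

Starting from a counter value of 6 in a ring of 8 elements, four items land at positions 6, 7, 0 and 1, and the counter ends at 10.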
The consumer 108 uses a fetch_add operation, e.g. an atomic_fetch_add operation, on the output counter (ocnt) to atomically get the sampled output counter (ocnt_s) and increment the output counter (ocnt) to reflect the next position, possibly wrapping to zero. The sampled output counter (ocnt_s) is the sampled value of the output counter (ocnt) before it is incremented. The sampled value of the output counter may be described as the current value of the output counter (ocnt).
Step 401 may be described as the consumer 108 reading the output counter (ocnt) and incrementing it, for example using an atomic operation.
The output counter (ocnt) may be any suitable counter. The output counter (ocnt) may be described as a counter associated with the outputting of the data item and which counts upwards. The output counter (ocnt) may be an unsigned integer.
The consumer 108 determines the output position (pos) in the ring buffer 100 as the sampled output counter (ocnt_s) modulo (%) the length of the ring buffer 100. Thus, the consumer 108 calculates the point in the ring buffer 100 where to retrieve the data item. The length of the ring buffer 100 may be referred to as the size of the ring buffer 100 and is the same as the number of elements 103 comprised in the ring buffer 100. The % sign seen in
This step may be described as the consumer 108 pre-allocates the output position (pos) in the ring buffer 100 where it shall output a data item from. The output position (pos) may be used in combination with a status of the element 103, and this will be described later in more detail.
The consumer 108 starts to read or watch, i.e. monitor, the determined output position (pos) for a data item. Watching the determined output position (pos) may be referred to as polling. The determined output position (pos) is unique for the consumer 108, i.e. no other consumer 108 monitors or outputs data items from the same output position (pos).
The consumer 108 checks if the status of the element 103 at the output position (pos) that was determined in step 402 is free or not. If the element 103 is free, indicated with “yes” in
Step 404 may be described as if the status of the output position (pos) in the ring buffer 100 indicates a free position in the ring buffer 100, the consumer 108 repeats the read and continues to check the status of the output position (pos).
This step is performed if the consumer 108 determined in step 404 that the element 103 at the output position (pos) is not free, i.e. that a data item is comprised in the element 103 at the output position (pos). When the data item has been read from the output position (pos) in the ring buffer 100, the consumer 108 may mark the position in the ring buffer 100 as free. In other words, the status of the output position (pos) is changed from not free to free.
This step is performed if the consumer 108 determined in step 404 that the element 103 at the output position (pos) is not free, i.e. that a data item is comprised in the element 103 at the output position (pos). The consumer 108 outputs the data item at the output position (pos). The step may also be described as returning the data item to the consumer 108.
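Steps 401 to 406 can be sketched in C11 as follows. This is a minimal sketch under assumptions: the data item is a plain `int`, monitoring is done by busy polling rather than by suspending the processor, and the names `struct ring`, `ring_init` and `ring_get` are hypothetical. The sketch copies the data item out of the element before marking the element free, so that a producer cannot overwrite the item while it is being read.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define RING_SIZE 4u  /* assumed number of elements */

enum { ELEM_FREE = 0, ELEM_OCCUPIED = 1 };

struct element {
    atomic_int status;  /* ELEM_FREE or ELEM_OCCUPIED */
    int item;           /* the data item; an int for illustration */
};

struct ring {
    atomic_uint ocnt;   /* output counter */
    struct element elem[RING_SIZE];
};

void ring_init(struct ring *r)
{
    assert(r != NULL);
    atomic_init(&r->ocnt, 0);
    for (unsigned i = 0; i < RING_SIZE; i++) {
        atomic_init(&r->elem[i].status, ELEM_FREE);
        r->elem[i].item = 0;
    }
}

/* Outputs one data item; does not return until one is available. */
int ring_get(struct ring *r)
{
    /* Step 401: sample the output counter and increment it atomically. */
    unsigned ocnt_s = atomic_fetch_add(&r->ocnt, 1);
    /* Step 402: the output position, unique to this call. */
    unsigned pos = ocnt_s % RING_SIZE;
    struct element *e = &r->elem[pos];

    /* Steps 403 and 404: monitor the status until it is occupied. */
    while (atomic_load_explicit(&e->status, memory_order_acquire)
           != ELEM_OCCUPIED)
        ;
    int item = e->item;  /* read the data item from the element */
    /* Step 405: mark the element free again. */
    atomic_store_explicit(&e->status, ELEM_FREE, memory_order_release);
    /* Step 406: output the data item. */
    return item;
}
```

On a processor that supports suspending execution until a memory location is written, the polling loop is where the consumer would wait on the status of its own element instead of spinning.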
Table 1 below provides an overview of some terms used herein:
Inputting and outputting of data items into and from the ring buffer 100 will now be described using an example illustrated in
The ring buffer 100 in these
The consumer 108 may be adapted to act as a consumer 108 only, or it may be adapted to act as both a consumer 108 and a producer 105. For example, handling a data item may result in one or more data items being inputted into the ring buffer 100, in which case the consumer 108 also acts as a producer 105.
The processors 601 are adapted to communicate with each other using any suitable communication link, e.g. wired or wireless communication link. The processor 601 may be referred to as a CPU. The processor 601 may comprise internal memory units. The term processor 601 comprises any circuit and/or device suitably adapted to perform the functions described herein. The above term comprises general- or special-purpose programmable microprocessors, Digital Signal Processors (DSP), Application Specific Integrated Circuits (ASIC), Programmable Logic Arrays (PLA), Field Programmable Gate Arrays (FPGA), special purpose electronic circuits, etc., or a combination thereof.
The system 600 comprises a memory 603. The memory 603 comprises the ring buffer 100. The processors 601 are adapted to communicate with the memory 603, i.e. the memory 603 is shared amongst the processors 601 and consequently amongst the producers 105 and consumers 108. The communication may be to input a data item, to output a data item etc. The memory 603 may be for example a Random Access Memory (RAM), a storage medium, such as a Read-Only Memory (ROM) or other non-volatile memory, such as flash memory, or another device accessible via a suitable data interface. The memory 603 may be a stationary memory or a cloud memory. The memory 603 is a read and write memory, and it is accessible by the processor 601. The processor 601 performs coherent random access supporting atomic operations on the memory 603.
The system 600 comprises a computer program or computer program product which carries out the methods described herein. The computer program or computer program product may be stored in the memory 603.
The data processing system 600 exemplified in
The method for inputting data items in a ring buffer 100 will now be described with reference to the flowchart depicted in
This step corresponds to step 201 in
The atomic operation may be fetch_add.
The atomic operation further comprises incrementing the input counter (icnt).
The input counter may be incremented by a number which corresponds to a number of elements 103 in which a data item is to be inputted.
In other words, using an atomic operation, the producer 105 obtains a sampled value (icnt_s) of the input counter (icnt) and increments the input counter (icnt).
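A fetch_add operation returns the counter value as it was before the addition and performs the addition as one indivisible step, so concurrent producers 105 never obtain the same sampled value. Python has no built-in atomic fetch_add, so the sketch below emulates it with a lock; the class `AtomicCounter` and its methods are assumptions for illustration only.

```python
import threading


class AtomicCounter:
    """Lock-based stand-in for an atomic counter (an assumption, see above)."""

    def __init__(self, start: int = 0) -> None:
        self._value = start
        self._lock = threading.Lock()

    def fetch_add(self, n: int = 1) -> int:
        """Return the value before the increment and add n in one step."""
        with self._lock:
            sampled = self._value
            self._value += n
            return sampled

    def load(self) -> int:
        with self._lock:
            return self._value


icnt = AtomicCounter()
samples = []
samples_lock = threading.Lock()


def claim_one() -> None:
    icnt_s = icnt.fetch_add(1)  # sample and increment the input counter
    with samples_lock:
        samples.append(icnt_s)


threads = [threading.Thread(target=claim_one) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Incrementing by 3 reserves three consecutive counter values in one operation.
icnt_s_batch = icnt.fetch_add(3)
```

The eight threads collect the values 0 to 7 in some order, each exactly once, and the batch call samples 8 while advancing the counter to 11.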
This step corresponds to step 202 in
If the producer 105 should input one data item into the ring buffer 100, then the input position is determined to be the sampled value (icnt_s) of the input counter (icnt) modulo the size of the ring buffer 100.
If the producer 105 should input multiple data items into the ring buffer 100, then the input position for each data item is determined to be the sum of the sampled value (icnt_s) of the input counter (icnt) and an index (i), modulo the size of the ring buffer 100.
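The two cases can be illustrated with a short Python sketch. The helper name `input_positions` and the concrete numbers are assumptions chosen for the example.

```python
def input_positions(icnt_s: int, count: int, size: int) -> list:
    """Input positions for count data items, starting at sampled value icnt_s."""
    return [(icnt_s + i) % size for i in range(count)]


# One data item: the index is 0, so the position is simply icnt_s % size.
single = input_positions(icnt_s=5, count=1, size=4)
print(single)  # [1]

# Three data items: consecutive positions that wrap around the ring.
batch = input_positions(icnt_s=6, count=3, size=4)
print(batch)  # [2, 3, 0]
```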
This step corresponds to step 203 in
If the status indicates occupied, then the determining of the status is repeated until the status indicates free.
If the status indicates free, then the method proceeds to step 706 to input the data item.
This step corresponds to step 304 in
This step corresponds to step 302 in
This step corresponds to step 204 in
The data item may be inputted in the element 103 at step 706 if the status was determined to indicate free in step 703.
When multiple data items are to be inputted into the ring buffer 100 by the producer 105 and if the producer 105 determined in step 704 that multiple elements 103 at consecutive positions have a status which indicates that they are free, then the multiple data items may be inputted into the multiple elements 103 in parallel. Furthermore, the index (i) may be incremented in accordance with the multiple data items.
This step corresponds to step 205 in
This step corresponds to step 307 in
This step corresponds to step 308 in
The step is performed when multiple data items are to be inputted into the ring buffer 100 by the producer 105. If the index is lower than the number of data items to be inputted the producer 105 handles the next data item.
This step corresponds to step 309 in
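The producer-side steps for multiple data items (index set to zero, position determined, status checked, data item inputted, status set to occupied, index incremented and compared) can be sketched in Python as below. The sampled value is passed in directly so that the example stays single-threaded; `Element`, `FREE`, `OCCUPIED` and `produce_items` are hypothetical names, and the sketch is not the disclosed implementation.

```python
import time
from dataclasses import dataclass
from typing import Any, List, Sequence

FREE, OCCUPIED = 0, 1  # hypothetical encoding of the element status


@dataclass
class Element:
    status: int = FREE
    item: Any = None


def produce_items(ring: List[Element], icnt_s: int, items: Sequence[Any]) -> int:
    """Input items at consecutive positions starting from sampled value icnt_s."""
    size = len(ring)
    i = 0                                    # index set to zero before the first item
    while i < len(items):                    # compare index with the number of items
        element = ring[(icnt_s + i) % size]  # input position for data item i
        while element.status == OCCUPIED:    # repeat the check until the element is free
            time.sleep(0)
        element.item = items[i]              # input the data item
        element.status = OCCUPIED            # then set the status to occupied
        i += 1                               # increment the index by one
    return i                                 # all data items have been inputted


ring = [Element() for _ in range(4)]
written = produce_items(ring, icnt_s=3, items=["a", "b"])
```

With a ring of four elements and a sampled value of 3, the two data items land at indices 3 and 0.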
The method described above will now be described from the perspective of the consumer 108.
This step corresponds to step 401 in
The atomic operation may be a fetch_add operation.
The atomic operation further comprises incrementing the output counter (ocnt).
In other words, the consumer 108 obtains a sampled value (ocnt_s) of the output counter (ocnt) and increments the output counter (ocnt), using an atomic operation.
This step corresponds to step 402 in
This step corresponds to step 403 in
This step corresponds to steps 403 and 404 in
If the status monitored in step 804 indicates free, the consumer 108 continues monitoring the status of the element 103 until the status indicates occupied.
If the status monitored in step 804 indicates occupied, the consumer 108 determines that the element 103 has been validly read at the output position.
This step corresponds to step 405 in
This step corresponds to step 406 in
The method described above will now be described from the perspective of the system 600. A method performed by a system 600 for inputting and outputting data items into and from a ring buffer 100 is described below. The system 600 comprises at least one producer 105 and at least one consumer 108. The ring buffer 100 comprises a number of elements 103. The ring buffer 100 is associated with an input counter (icnt) and an output counter (ocnt). The method comprises at least one of the following steps, which may be performed in any suitable order other than that described below:
This step corresponds to step 201 in
This step corresponds to step 202 in
This step corresponds to step 203 in
This step corresponds to step 204 in
This step corresponds to step 205 in
This step corresponds to step 401 in
This step corresponds to step 402 in
This step corresponds to steps 403 and 404 in
The consumer 108 continues, if the status indicates free, to monitor the status of the element 103 until the status indicates occupied.
This step corresponds to step 806 in
This step corresponds to step 405 in
This step corresponds to step 406 in
The consumer 108 and producer 105 comprised in the system 600 may perform any of the steps as described above with reference to
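A system-level view with several producers 105 and consumers 108 sharing one ring can be sketched in Python as follows. This is an illustration under stated assumptions rather than the disclosed implementation: the lock-based `AtomicCounter` stands in for a hardware fetch_add, all names are hypothetical, and the run is deliberately sized (8 elements, 8 data items) so that no position is claimed twice on either side.

```python
import threading
import time

FREE, OCCUPIED = 0, 1  # hypothetical encoding of the element status
SIZE = 8               # number of elements in the ring


class AtomicCounter:
    """Lock-based stand-in for an atomic counter with fetch_add."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def fetch_add(self, n: int = 1) -> int:
        with self._lock:
            sampled = self._value
            self._value += n
            return sampled


status = [FREE] * SIZE
items = [None] * SIZE
icnt, ocnt = AtomicCounter(), AtomicCounter()
consumed, consumed_lock = [], threading.Lock()


def producer(name: str) -> None:
    for k in range(2):
        pos = icnt.fetch_add(1) % SIZE  # sample, increment, derive input position
        while status[pos] == OCCUPIED:  # wait until the element is free
            time.sleep(0)
        items[pos] = f"{name}-{k}"      # input the data item
        status[pos] = OCCUPIED          # set the status to occupied


def consumer() -> None:
    for _ in range(2):
        pos = ocnt.fetch_add(1) % SIZE  # sample, increment, derive output position
        while status[pos] == FREE:      # poll until a data item is present
            time.sleep(0)
        item = items[pos]               # read the data item
        status[pos] = FREE              # set the status to free
        with consumed_lock:
            consumed.append(item)       # output the data item


threads = [threading.Thread(target=producer, args=(f"p{n}",)) for n in range(4)]
threads += [threading.Thread(target=consumer) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

When the run completes, every data item produced appears exactly once in `consumed` and all elements are free again.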
To perform the method steps shown in
The present mechanism for inputting data items in a ring buffer 100 may be implemented through one or more processors, such as a processor 901 in the arrangement depicted in
The producer 105 may comprise a memory 903 comprising one or more memory units. The memory 903 is arranged to be used to store obtained information, store data, configurations, schedulings, and applications etc. to perform the methods herein when being executed in the producer 105.
The producer 105 may receive information from, e.g. the consumer 108, through a receiving port 905. The producer 105 may receive information from another structure in the system 600 through the receiving port 905. Since the receiving port 905 may be in communication with the processor 901, the receiving port 905 may then send the received information to the processor 901. The receiving port 905 may also be configured to receive other information.
The processor 901 in the producer 105 may be configured to transmit or send information to another structure in the system 600, through a sending port 908, which may be in communication with the processor 901, and the memory 903.
The producer 105 may comprise a performing module 909, a determining module 910, an obtaining module 913, an incrementing module 915, an inputting module 918, a setting module 920, and other module(s) 923 etc.
Those skilled in the art will also appreciate that the performing module 909, the determining module 910, the obtaining module 913, the incrementing module 915, the inputting module 918, the setting module 920, and the other module(s) 923 etc. described above may refer to a combination of analogue and digital circuits, and/or one or more processors configured with software and/or firmware, e.g., stored in memory, that, when executed by the one or more processors such as the processor 901, perform as described above. One or more of these processors, as well as the other digital hardware, may be comprised in a single ASIC, or several processors and various digital hardware may be distributed among several separate components, whether individually packaged or assembled into a System-on-a-Chip (SoC).
The different units 909-923 described above may be implemented as one or more applications running on one or more processors such as the processor 901.
The producer 105 is adapted to, e.g. by means of the performing module 909, perform an atomic operation comprising to obtain a sampled value (icnt_s) of the input counter (icnt) and to increment the input counter (icnt).
The producer 105 is adapted to, e.g. by means of the determining module 910, determine an input position in the ring buffer 100. When one data item is to be inputted into the ring buffer 100, then the input position is determined to be the sampled value (icnt_s) of the input counter (icnt) modulo a size of the ring buffer 100. When multiple data items are to be inputted into the ring buffer 100, then the input position is determined to be the sum of the sampled value (icnt_s) of the input counter (icnt) and the index (i), modulo a size of the ring buffer 100.
The producer 105 is adapted to, e.g. by means of the inputting module 918, input the data item in the element 103 located at the determined input position.
The producer 105 is adapted to, e.g. by means of the determining module 910, determine whether the status of the element 103 located at the determined position indicates that the element 103 is free or occupied. The data item is inputted in the element 103 if the status indicates free. If the status indicates occupied, the producer 105 is adapted to repeat the determining of whether the status of the element 103 located at the determined position is free or occupied, until the status indicates free.
The producer 105 is adapted to, e.g. by means of the setting module 920, after the data item has been inputted, set the status of the element 103 to occupied.
The input counter may be incremented by a number which corresponds to a number of elements 103 in which a data item is to be inputted.
There may be one or multiple data items to be inputted into the ring buffer 100.
There may be multiple data items to be inputted into the ring buffer 100 by the producer 105.
When multiple data items are to be inputted into the ring buffer 100, then the producer 105 is adapted to, e.g. by means of the setting module 920, set an index (i) to zero prior to inputting the first data item into the ring buffer 100.
When multiple data items are to be inputted into the ring buffer 100, then the producer 105 is adapted to, e.g. by means of the incrementing module 915, increment the index by one after each of the data items has been inputted into the ring buffer 100.
When multiple data items are to be inputted into the ring buffer 100, then the producer 105 is adapted to, e.g. by means of the determining module 910, after each data item has been inputted into the ring buffer 100, compare the index with the number of data items to be inputted.
When multiple data items are to be inputted into the ring buffer 100, then the producer 105 is adapted to, e.g. by means of the processor 901, if the index is lower than the number of data items to be inputted, handle the next data item.
When multiple data items are to be inputted into the ring buffer 100, then the producer 105 is adapted to, e.g. by means of the determining module 910, if the index is not lower, determine that all data items have been inputted into the ring buffer 100.
The producer 105 may be adapted to, e.g. by means of the determining module 910, determine that multiple elements 103 at consecutive positions have a status which indicates that they are free.
The multiple data items may be inputted into the multiple elements 103 in parallel. The index may be incremented in accordance with the multiple data items.
The input counter (icnt) may be an atomic unsigned integer that counts upwards.
The number of elements 103 comprised in the ring buffer 100 may be fixed and larger than one. The elements 103 comprised in the ring buffer 100 may be equally sized.
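The text does not say what happens when a fixed-width unsigned counter overflows. As an observation outside the original text, the Python sketch below emulates a hypothetical 8-bit counter and shows the positions obtained around the wrap point for two ring sizes; the function name `positions_across_wrap` and the 8-bit width are assumptions for illustration.

```python
def positions_across_wrap(size: int, bits: int = 8) -> list:
    """Ring positions for the four counter values around the wrap point."""
    mask = (1 << bits) - 1  # emulate a bits-wide unsigned counter
    start = mask - 1        # two values before the counter wraps to zero
    return [((start + k) & mask) % size for k in range(4)]


# Size 4 divides 2**8, so the positions stay consecutive across the wrap.
print(positions_across_wrap(4))  # [2, 3, 0, 1]

# Size 3 does not divide 2**8, so position 0 is produced twice in a row.
print(positions_across_wrap(3))  # [2, 0, 0, 1]
```

Under this emulation, a ring size that divides the counter range keeps the position sequence continuous when the counter wraps.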
Thus, the methods described herein for the producer 105 may be respectively implemented by means of a computer program 925 product, comprising instructions, i.e., software code portions, which, when executed on at least one processor 901, cause the at least one processor 901 to carry out the actions described herein, as performed by the producer 105. The computer program 925 product may be stored on a computer-readable storage medium 928. The computer-readable storage medium 928, having stored thereon the computer program 925, may comprise instructions which, when executed on at least one processor 901, cause the at least one processor 901 to carry out the actions described herein, as performed by the producer 105. The computer-readable storage medium 925 may be a non-transitory computer-readable storage medium, such as a CD ROM disc, or a memory stick. The computer program 928 product may be stored on a carrier containing the computer program 928 just described, wherein the carrier is one of an electronic signal, optical signal, radio signal, or the first computer-readable storage medium 925, as described above.
The producer 105 may comprise a communication interface configured to facilitate communications between the producer 105 and other nodes or devices, e.g., the consumer 108, or another structure. The interface may comprise a transceiver configured to transmit and receive signals over an interface in accordance with a suitable standard.
The producer 105 may comprise the following arrangement depicted in
Hence, the present disclosure also relates to the producer 105 operative to operate in the system 600. The producer 105 may comprise the processing circuitry 930 and the memory 903. The memory 903 comprises instructions executable by said processing circuitry 930. The producer 105 is operative to perform the actions described herein in relation to the producer 105, e.g., in
A computer program may comprise instructions which, when executed on at least one processor, e.g. the processor 901 or the processing circuitry 930, cause the at least one processor to carry out the method as described in any of
To perform the method steps shown in
The present disclosure associated with the consumer 108 may be implemented through one or more processors, such as a processor 1001 in the consumer 108 depicted in
The consumer 108 may comprise a memory 1003 comprising one or more memory units. The memory 1003 is arranged to be used to store obtained information, store data, configurations, schedulings, and applications etc. to perform the methods herein when being executed in the consumer 108.
The consumer 108 may receive information from, e.g., the producer 105, through a receiving port 1005. The consumer 108 may receive information from another structure in the system 600 through the receiving port 1005. Since the receiving port 1005 may be in communication with the processor 1001, the receiving port 1005 may then send the received information to the processor 1001. The receiving port 1005 may also be configured to receive other information.
The processor 1001 in the consumer 108 may be configured to transmit or send information to e.g., the producer 105, or another structure in the system 600, through a sending port 1008, which may be in communication with the processor 1001, and the memory 1003.
The consumer 108 may comprise a performing module 1009, an incrementing module 1010, a determining module 1013, a monitoring module 1015, a setting module 1016, an outputting module 1018, a reading module 1020, other module(s) 1023 etc.
Those skilled in the art will also appreciate that the performing module 1009, the incrementing module 1010, the determining module 1013, the monitoring module 1015, the setting module 1016, the outputting module 1018, the reading module 1020, other module(s) 1023 etc. described above may refer to a combination of analog and digital circuits, and/or one or more processors configured with software and/or firmware, e.g., stored in memory, that, when executed by the one or more processors such as the processor 1001, perform as described above. One or more of these processors, as well as the other digital hardware, may be comprised in a single ASIC, or several processors and various digital hardware may be distributed among several separate components, whether individually packaged or assembled into a SoC.
Also, the different units 1009-1023 described above may be implemented as one or more applications running on one or more processors such as the processor 1001.
The consumer 108 is adapted to, e.g. by means of the performing module 1009, perform an atomic operation comprising to obtain a sampled value (ocnt_s) of the output counter (ocnt) and to increment the output counter (ocnt).
The consumer 108 is adapted to, e.g. by means of the determining module 1013, determine an output position in the ring buffer 100 to be the sampled value (ocnt_s) of the output counter (ocnt) modulo a size of the ring buffer 100. The output position is unique for the consumer 108.
The consumer 108 is adapted to, e.g. by means of the monitoring module 1015, monitor a status of the element 103 at the output position, wherein the status indicates that the element 103 is free or occupied.
The consumer 108 is adapted to, e.g. by means of the monitoring module 1015, if the status indicates free, continue monitoring the status of the element 103 until the status indicates occupied.
The consumer 108 is adapted to, e.g. by means of the determining module 1013, if the status indicates occupied, determine that the element 103 has been validly read at the output position.
The consumer 108 is adapted to, e.g. by means of the setting module 1016, after it has been determined that the element 103 has been validly read, set the status of the element 103 to free.
The consumer 108 is adapted to, e.g. by means of the outputting module 1018, output the data item from the element 103 which has been validly read and located at the output position.
The consumer 108 may be adapted to, e.g. by means of the reading module 1020, read the element 103 located at the output position. The status may be comprised in the element 103.
The consumer 108 may be adapted to, e.g. by means of the determining module 1013, determine to output a data item from the ring buffer 100.
The output counter (ocnt) may be an atomic unsigned integer that counts upwards.
The number of elements 103 comprised in the ring buffer 100 may be equal to or greater than a number of consumers 108 that concurrently output data items from elements 103 in the ring buffer 100.
The number of elements 103 comprised in the ring buffer 100 may be fixed and larger than one. The elements 103 comprised in the ring buffer 100 may be equally sized.
Thus, the methods described herein for the consumer 108 may be respectively implemented by means of a computer program 1028 product, comprising instructions, i.e., software code portions, which, when executed on at least one processor 1001, cause the at least one processor 1001 to carry out the actions described herein, as performed by the consumer 108. The computer program 1028 product may be stored on a computer-readable storage medium 1025. The computer-readable storage medium 1025, having stored thereon the computer program 1028, may comprise instructions which, when executed on at least one processor 1001, cause the at least one processor 1001 to carry out the actions described herein, as performed by the consumer 108. The computer-readable storage medium 1025 may be a non-transitory computer-readable storage medium, such as a CD ROM disc, or a memory stick. The computer program 1028 product may be stored on a carrier containing the computer program 1028 just described, wherein the carrier is one of an electronic signal, optical signal, radio signal, or the second computer-readable storage medium 1025, as described above.
The consumer 108 may comprise a communication interface configured to facilitate communications between the consumer 108 and other nodes or devices, e.g., the producer 105, or another structure. The interface may, for example, comprise a transceiver configured to transmit and receive signals over an interface in accordance with a suitable standard.
The consumer 108 may comprise the following arrangement depicted in
The consumer 108 may be operative to operate in the system 600. The consumer 108 may comprise the processing circuitry 1030 and the memory 1003. The memory 1003 comprises instructions executable by the processing circuitry 1030. The consumer 108 is operative to perform the actions described herein in relation to the consumer 108, e.g., in
A computer program may comprise instructions which, when executed on at least one processor, e.g. the processor 1001 or the processing circuitry 1030, cause the at least one processor to carry out the method as described in any of
As mentioned earlier, the system 600 may comprise at least one producer 105 and at least one consumer 108 for inputting and outputting data items into and from a ring buffer 100. The at least one producer 105 is adapted to carry out the method as described in any of
The present disclosure may be implemented by hardwired circuitry or by software, or by hardwired circuitry in combination with software.
Generally, all terms used herein are to be interpreted according to their ordinary meaning in the relevant technical field, unless a different meaning is clearly given and/or is implied from the context in which it is used. All references to a/an/the element, apparatus, component, means, step, etc. are to be interpreted openly as referring to at least one instance of the element, apparatus, component, means, step, etc., unless explicitly stated otherwise. The steps of any methods disclosed herein do not have to be performed in the exact order disclosed, unless a step is explicitly described as following or preceding another step and/or where it is implicit that a step must follow or precede another step.
In general, the usage of “first”, “second”, “third”, “fourth”, and/or “fifth” herein may be understood to be an arbitrary way to denote different elements or entities, and may be understood to not confer a cumulative or chronological character to the nouns they modify, unless otherwise noted, based on context.
The present disclosure is not limited to the above. Various alternatives, modifications and equivalents may be used. Therefore, disclosure herein should not be taken as limiting the scope. A feature may be combined with one or more other features.
The term “at least one of A and B” should be understood to mean “only A, only B, or both A and B.”, where A and B are any parameter, number, indication used herein etc.
It should be emphasized that the term “comprises/comprising” when used in this specification is taken to specify the presence of stated features, integers, steps or components, but does not preclude the presence or addition of one or more other features, integers, steps, components or groups thereof. It should also be noted that the words “a” or “an” preceding an element do not exclude the presence of a plurality of such elements.
The term “configured to” used herein may also be referred to as “arranged to”, “adapted to”, “capable of” or “operative to”.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/EP2021/078843 | 10/18/2021 | WO |