This application is based upon and claims priority to Chinese Patent Application No. 202110437721.9, filed on Apr. 22, 2021, the entire contents of which are incorporated herein by reference.
The present invention relates to a communication technology and an integrated circuit technology.
Because of the characteristics of artificial intelligence algorithms, artificial intelligence chips must transmit large amounts of data internally, and a NoC (Network-on-Chip) is a common way to do so. In a NoC, transmitted data can be divided into different services according to its type, and the messages of all services share the bandwidth of the transmission network. Services fall into two types: delay-sensitive services, called high-priority services, and delay-insensitive services, called low-priority services. Because services are diverse, message lengths also differ. On a shared transmission node, a long low-priority message may block short high-priority messages, which increases the transmission delay of the high-priority messages.
The technical problem to be solved by the present invention is to provide a data transmission method and a data transmission circuit, which can properly solve the problem of delay of high-priority data.
A second technical problem to be solved by the present invention is to provide an artificial intelligence chip with a faster processing speed.
The technical solution adopted by the present invention to solve the technical problem is as follows.
The present invention provides a data transmission circuit, including a data sending module and a data receiving module, wherein
the data sending module includes the following parts:
a message identification unit, used for sending messages to corresponding encapsulation units according to a priority of message data to be sent;
a low-priority message encapsulation unit, used for slicing low-priority messages, encapsulating message slices respectively to form low-priority message slice packets, and then sending the low-priority message slice packets to a low-priority sending queue;
a high-priority message encapsulation unit, used for encapsulating high-priority messages to form high-priority message packets and then sending the high-priority message packets to a high-priority sending queue; and
a message sending unit, used for sending message packets in the high-priority sending queue and the low-priority sending queue, and preferentially processing the high-priority sending queue;
the data receiving module includes:
a message parsing and distributing unit, used for decapsulating received message packets, and sending the message packets to corresponding message processing units according to a priority of the messages;
a low-priority message receiving unit, used for receiving decapsulated low-priority message slices, and recombining and restoring the low-priority message slices to the low-priority messages; and
a high-priority message receiving unit, used for receiving decapsulated high-priority messages.
Further, the data sending module includes a priority labeling unit for labeling priority information of message packets.
A working method of the data transmission circuit of the present invention includes the following steps:
a, identifying, by a sender, a priority of a message to be sent; if the message to be sent is of a high priority, encapsulating the message, sending it to a high-priority sending queue, and proceeding to Step c; and if the message to be sent is of a low priority, proceeding to Step b;
b, slicing the low-priority message, encapsulating the slices one by one, sending the slices to a low-priority sending queue, and proceeding to Step c;
c, preferentially sending a message packet in the high-priority sending queue; and
d, classifying, by a receiver, a received message according to encapsulation information thereof, if the received message is of a high priority, sending the received message to the high-priority queue, and if the received message is of a low priority, sending the received message to the low-priority queue for recombination.
The present invention has the following beneficial effects: the blocking of high-priority messages by low-priority messages is significantly reduced, and the transmission speed of high-priority messages is ensured; when applied to an artificial intelligence chip, the present invention improves the key processing speed of the chip.
The main point of the present invention is to slice low-priority messages at a scheduling moment to reduce the blocking time of high-priority messages.
A data transmission method of the present invention includes the following steps:
a, identifying, by a sender, a priority of a message to be sent; if the message to be sent is of a high priority, encapsulating the message, sending it to a high-priority sending queue, and proceeding to Step c; and if the message to be sent is of a low priority, proceeding to Step b;
b, slicing the low-priority message, encapsulating the slices one by one, sending the slices to a low-priority sending queue, and proceeding to Step c;
c, preferentially sending a message packet in the high-priority sending queue; and
d, classifying, by a receiver, a received message according to encapsulation information thereof, if the received message is of a high priority, sending the received message to the high-priority queue, and if the received message is of a low priority, sending the received message to the low-priority queue for recombination.
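Steps a to c on the sender side can be sketched in software as follows. This is a behavioural illustration in Python, not the circuit itself. The names `Packet`, `Sender`, `submit`, `next_packet`, and the `last` flag are assumptions introduced for the example, and the 128-byte slice size is taken from the embodiment described later.

```python
from collections import deque
from dataclasses import dataclass

SLICE_BYTES = 128  # slice granularity, taken from the embodiment


@dataclass
class Packet:
    high_priority: bool  # priority label carried in the encapsulation header
    sn: int              # slice serial number (0 for high-priority packets)
    last: bool           # hypothetical flag: True on the final slice of a message
    payload: bytes


class Sender:
    def __init__(self):
        self.high_queue = deque()
        self.low_queue = deque()

    def submit(self, message: bytes, high_priority: bool) -> None:
        if high_priority:
            # Step a: a high-priority message is encapsulated whole.
            self.high_queue.append(Packet(True, 0, True, message))
            return
        # Step b: a low-priority message is sliced, then each slice is encapsulated.
        for sn, start in enumerate(range(0, len(message), SLICE_BYTES)):
            chunk = message[start:start + SLICE_BYTES]
            last = start + SLICE_BYTES >= len(message)
            self.low_queue.append(Packet(False, sn, last, chunk))

    def next_packet(self):
        # Step c: the high-priority queue is always served first.
        if self.high_queue:
            return self.high_queue.popleft()
        if self.low_queue:
            return self.low_queue.popleft()
        return None
```

In this model, a high-priority message submitted while a 300-byte low-priority message is in flight waits for at most one slice: the scheduler re-evaluates the queues at every slice boundary rather than at every message boundary.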
The present invention further provides a data transmission circuit, including a data sending module and a data receiving module, wherein the data sending module includes the following parts:
a message identification unit, used for sending messages to corresponding encapsulation units according to a priority of message data to be sent;
a low-priority message encapsulation unit, used for slicing low-priority messages, encapsulating message slices respectively, and then sending the message slices to a low-priority sending queue;
a high-priority message encapsulation unit, used for encapsulating high-priority messages and then sending the high-priority messages to a high-priority sending queue; and
a message sending unit, used for sending messages in the high-priority sending queue and the low-priority sending queue, and preferentially processing the high-priority sending queue;
the data receiving module includes:
a message parsing and distributing unit, used for decapsulating received messages, and sending the messages to corresponding message processing units according to a priority of the messages;
a low-priority message receiving unit, used for receiving decapsulated low-priority message slices, and recombining and restoring the slices; and
a high-priority message receiving unit, used for receiving decapsulated high-priority messages.
The present invention further provides an artificial intelligence chip with the above data transmission circuit.
In this embodiment, a relatively long low-priority message is sliced at a granularity of 128 bytes. In this scenario, a high-priority message is blocked for at most the transmission time of one 128-byte slice, namely 2.048 µs. Transmitting low-priority messages in slices can therefore greatly reduce the blocking of high-priority messages by low-priority messages.
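The bound can be written out explicitly. In the sketch below, the symbols S (slice size in bytes), L (length of an unsliced low-priority message in bytes), and R (link rate in bit/s) are introduced for illustration, and header overhead is ignored.

```latex
t_{\text{block}} \le \frac{8S}{R},
\qquad
\frac{t_{\text{unsliced}}}{t_{\text{sliced}}} = \frac{8L/R}{8S/R} = \frac{L}{S}
\quad (L > S)
```

With S = 128 bytes the numerator is 1024 bits. The link rate is not stated at this point in the text; the quoted 2.048 µs would correspond to R = 500 Mbit/s.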
Once slicing is adopted, both communicating parties must be able to identify the location of each slice in the original message. Relevant information (such as locations and serial numbers) therefore has to be marked in the data structure of the encapsulation header, so that a sliced message can be recombined at the receiver.
As shown in
On a data receiver, after messages are synchronized according to data formats of message data packet headers, the data packet headers are parsed and distributed to a high-priority queue and a low-priority queue respectively according to priority indications, and messages in the low-priority queue are recombined.
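The distribution and recombination described above can be sketched as follows. This is a behavioural model in Python under stated assumptions: slices of one message arrive in order on a single link, the serial number `sn` restarts at 0 for each message, and a hypothetical `last` flag marks the final slice. The names `Packet` and `Receiver` are illustrative only.

```python
from dataclasses import dataclass, field


@dataclass
class Packet:
    high_priority: bool  # priority indication from the packet header
    sn: int              # slice serial number within the original message
    last: bool           # hypothetical flag: True on the final slice
    payload: bytes


@dataclass
class Receiver:
    high_queue: list = field(default_factory=list)  # delivered high-priority messages
    low_queue: list = field(default_factory=list)   # recombined low-priority messages
    _pending: list = field(default_factory=list)    # slices of the message in progress

    def receive(self, pkt: Packet) -> None:
        if pkt.high_priority:
            # High-priority packets carry a whole message; deliver directly.
            self.high_queue.append(pkt.payload)
            return
        # Low-priority slices must arrive with consecutive serial numbers.
        if pkt.sn != len(self._pending):
            raise ValueError(f"unexpected slice serial number {pkt.sn}")
        self._pending.append(pkt.payload)
        if pkt.last:
            # Final slice: recombine and restore the original message.
            self.low_queue.append(b"".join(self._pending))
            self._pending = []
```

A high-priority packet that arrives between two slices of the same low-priority message does not disturb recombination, because it never enters the pending slice list.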
Data structures of high-priority message encapsulation headers (namely data packet headers) are shown in Table 1.
Data structures of low-priority message slice encapsulation headers (data packet headers) are shown in Table 2.
For a low-priority packet length of 9600, at transmission rates of 14 Mbps, 28 Mbps, 50 Mbps, and 100 Mbps, the difference between a non-slicing solution and a slicing solution in the delay with which a low-priority message blocks a high-priority message is shown in
At a typical transmission rate of 50 Mbps, in 1500-byte, 600-byte, 300-byte, and 64-byte scenarios, the difference between a non-slicing solution and a slicing solution in the delay with which a low-priority message blocks a high-priority message is shown in
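The two comparisons can be approximated with a short calculation. The sketch below assumes that the packet lengths are in bytes (the unit of 9600 is not stated), that the byte sizes in the second comparison are low-priority packet lengths, that the slice size is the 128 bytes of the embodiment, and that header overhead is negligible. The function names are hypothetical.

```python
SLICE_BYTES = 128  # slice granularity from the embodiment


def blocking_delay_us(blocking_bytes: int, rate_mbps: float) -> float:
    """Time in microseconds to serialise blocking_bytes at rate_mbps."""
    # bits divided by Mbit/s gives microseconds
    return blocking_bytes * 8 / rate_mbps


def compare(packet_bytes: int, rate_mbps: float):
    """Return (unsliced, sliced) worst-case blocking delay in microseconds."""
    unsliced = blocking_delay_us(packet_bytes, rate_mbps)
    # With slicing, at most one slice (or the whole short packet) is in flight.
    sliced = blocking_delay_us(min(packet_bytes, SLICE_BYTES), rate_mbps)
    return unsliced, sliced


def delay_table():
    """Rows of (packet_bytes, rate_mbps, unsliced_us, sliced_us) for both comparisons."""
    rows = [(9600, rate) + compare(9600, rate) for rate in (14, 28, 50, 100)]
    rows += [(size, 50) + compare(size, 50) for size in (1500, 600, 300, 64)]
    return rows
```

Under these assumptions, slicing gives no benefit for packets that are already no longer than one slice (the 64-byte case), and the gain grows in proportion to the packet length.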
This embodiment provides more specific technical details.
An overall implementation of a sending side is shown in
Slice processing is tracked by a slice_len_cnt accumulator. After the length information pkt_len of a packet is read from the low-priority packet length information FIFO, it is assigned to slice_len_cnt as the initial value; 128 is then subtracted in each cycle until the remaining length is less than 128, while the corresponding slice encapsulation header information is generated. The corresponding RTL implementation is as follows:
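The behaviour of the slice_len_cnt counter can also be modelled in software. The following Python sketch is a behavioural illustration only, not the RTL. The function name `slice_headers` and the tuple layout are assumptions, as is the handling of a remainder of exactly 128 bytes, which is treated here as a full final slice.

```python
SLICE_BYTES = 128  # amount subtracted from slice_len_cnt per modelled cycle


def slice_headers(pkt_len: int):
    """Yield (sn, slice_len, last) once per modelled cycle for one packet."""
    if pkt_len <= 0:
        raise ValueError("pkt_len must be positive")
    slice_len_cnt = pkt_len  # initial value loaded from the length FIFO
    sn = 0
    while slice_len_cnt > SLICE_BYTES:
        # A full slice: emit its header information, then count down.
        yield sn, SLICE_BYTES, False
        slice_len_cnt -= SLICE_BYTES
        sn += 1
    # The remainder (at most SLICE_BYTES) forms the final slice.
    yield sn, slice_len_cnt, True
```

For a 300-byte packet this yields two full slices followed by a 44-byte final slice.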
Slice header encapsulation: according to the attributes of a message, high-priority or low-priority slice header encapsulation is performed on it. Since an encapsulation header is data added on top of the original message, the transmission data must be spliced, which is done with a shift register. The RTL implementation thereof is as follows:
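The splicing step can be illustrated with a software model. The Python sketch below is an assumption-laden analogue of the shift register, not the RTL: `splice_header` and its parameters are hypothetical names, and the bus width is a free parameter because the text does not state it.

```python
from typing import List


def splice_header(header: bytes, words: List[bytes], width: int) -> List[bytes]:
    """Prepend header to a word stream and re-align the data to width-byte words."""
    out = []
    shift_reg = header  # bytes carried over and waiting to be output
    for word in words:
        shift_reg += word
        # Emit every complete output word; the residue stays in the register.
        while len(shift_reg) >= width:
            out.append(shift_reg[:width])
            shift_reg = shift_reg[width:]
    if shift_reg:
        out.append(shift_reg)  # final partial word
    return out
```

Because the header length is generally not a multiple of the bus width, each payload word ends up split across two output words, which is why a register holding the carried-over bytes is needed.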
An overall implementation of a receiving side is shown in
Synchronization processing is completed through a state machine, as shown in
The RTL implementation that generates the sync-header-correct signal sync_ok and the sync-header loss-of-synchronization signal sync_nok is as follows:
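As a rough illustration of how sync_ok and sync_nok might drive a synchronization state machine, consider the Python sketch below. It is not the state machine of this embodiment: the two states, the sync pattern `SYNC_WORD`, and both thresholds are assumptions chosen only to make the example concrete.

```python
SYNC_WORD = 0xA5      # hypothetical sync pattern expected in each packet header
LOCK_THRESHOLD = 2    # hypothetical: consecutive good headers needed to lock
LOSS_THRESHOLD = 3    # hypothetical: consecutive bad headers needed to lose lock


class SyncFsm:
    def __init__(self):
        self.state = "HUNT"  # not yet synchronized
        self.good = 0        # consecutive sync_ok count
        self.bad = 0         # consecutive sync_nok count

    def step(self, sync_field: int) -> str:
        """Process the sync field of one header and return the new state."""
        sync_ok = sync_field == SYNC_WORD
        sync_nok = not sync_ok
        if sync_ok:
            self.good += 1
            self.bad = 0
        if sync_nok:
            self.bad += 1
            self.good = 0
        if self.state == "HUNT" and self.good >= LOCK_THRESHOLD:
            self.state = "SYNC"
        elif self.state == "SYNC" and self.bad >= LOSS_THRESHOLD:
            self.state = "HUNT"
        return self.state
```

Requiring several consecutive matches or mismatches before changing state keeps a single corrupted header from dropping synchronization.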
Data packet header parsing of slices: parsing of slice header domain information pri, seg, sn, len is mainly completed, and RTL code implementation thereof is as follows:
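Parsing of the pri, seg, sn, and len fields can be illustrated in software. The Python sketch below is not the RTL. Tables 1 and 2 are not available here, so the 32-bit header size, the field order, and the field widths (1, 2, 13, and 16 bits) are all assumptions, as are the names `SliceHeader`, `pack_header`, and `parse_header`.

```python
from typing import NamedTuple

# Hypothetical field widths in bits, most significant field first (32 bits total).
PRI_BITS, SEG_BITS, SN_BITS, LEN_BITS = 1, 2, 13, 16
HEADER_BYTES = (PRI_BITS + SEG_BITS + SN_BITS + LEN_BITS) // 8


class SliceHeader(NamedTuple):
    pri: int  # priority indication
    seg: int  # segment indicator
    sn: int   # slice serial number
    len: int  # payload length in bytes


def _mask(bits: int) -> int:
    return (1 << bits) - 1


def pack_header(h: SliceHeader) -> bytes:
    """Pack the four fields into a big-endian header word."""
    word = h.pri & _mask(PRI_BITS)
    word = (word << SEG_BITS) | (h.seg & _mask(SEG_BITS))
    word = (word << SN_BITS) | (h.sn & _mask(SN_BITS))
    word = (word << LEN_BITS) | (h.len & _mask(LEN_BITS))
    return word.to_bytes(HEADER_BYTES, "big")


def parse_header(raw: bytes) -> SliceHeader:
    """Extract pri, seg, sn, and len from the first HEADER_BYTES of raw."""
    word = int.from_bytes(raw[:HEADER_BYTES], "big")
    length = word & _mask(LEN_BITS)
    word >>= LEN_BITS
    sn = word & _mask(SN_BITS)
    word >>= SN_BITS
    seg = word & _mask(SEG_BITS)
    word >>= SEG_BITS
    pri = word & _mask(PRI_BITS)
    return SliceHeader(pri, seg, sn, length)
```

Packing and parsing are exact inverses for in-range field values, so a round trip through `pack_header` and `parse_header` returns the original fields.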
Slice data packets are decapsulated to complete stripping of slice headers and reorganization of data, and RTL code implementation thereof is as follows:
The specification has fully explained the necessary technical content of the present invention, and those of ordinary skill in the art can fully implement it accordingly, and more detailed technical details will not be repeated.
Number | Date | Country | Kind |
---|---|---|---|
202110437721.9 | Apr 2021 | CN | national |