This application pertains to the field of communication technologies, and specifically relates to a data collection method and apparatus, a first device, and a second device.
Artificial intelligence (AI) has been widely used in various fields. AI modules are implemented in many forms, for example, neural networks, decision trees, support vector machines, and Bayesian classifiers.
In wireless communication systems, centralized training of AI modules requires collecting a large amount of data from remote ends to construct data sets. Because the wireless data from different remote ends overlaps significantly, it is not necessary to collect data from all users; collecting from only a subset can alleviate the transmission burden during data collection and avoid excessive duplicate (or similar) samples in the data sets.
Embodiments of this application provide a data collection method and apparatus, a first device, and a second device.
According to a first aspect, a data collection method is provided and includes:
According to a second aspect, a data collection apparatus is provided and includes:
According to a third aspect, a data collection method is provided and includes:
According to a fourth aspect, a data collection apparatus is provided and includes:
According to a fifth aspect, a first device is provided. The first device includes a processor and a memory, where a program or instructions capable of running on the processor are stored in the memory, and when the program or instructions are executed by the processor, the steps of the method according to the first aspect are implemented.
According to a sixth aspect, a first device is provided and includes a processor and a communication interface. The communication interface is configured to: send a first instruction to a second device, instructing the second device to collect and report training data for training a specific AI model; and receive training data reported by the second device. The processor is configured to construct a data set using the training data and train the specific AI model.
According to a seventh aspect, a second device is provided. The second device includes a processor and a memory, where a program or instructions capable of running on the processor are stored in the memory, and when the program or instructions are executed by the processor, the steps of the method according to the third aspect are implemented.
According to an eighth aspect, a second device is provided and includes a processor and a communication interface. The communication interface is configured to: receive a first instruction from a first device, where the first instruction is used to instruct the second device to collect and report training data for training a specific AI model. The processor is configured to collect training data and report the training data to the first device.
According to a ninth aspect, a data collection system is provided and includes a first device and a second device, where the first device may be configured to execute the steps of the data collection method according to the first aspect, and the second device may be configured to execute the steps of the data collection method according to the third aspect.
According to a tenth aspect, a readable storage medium is provided, where the readable storage medium has a program or instructions stored thereon, and when the program or the instructions are executed by a processor, the steps of the method according to the first aspect are implemented, or the steps of the method according to the third aspect are implemented.
According to an eleventh aspect, a chip is provided. The chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to run a program or instructions to implement the method according to the first aspect or the method according to the third aspect.
According to a twelfth aspect, a computer program/program product is provided. The computer program/program product is stored in a storage medium. The computer program/program product is executed by at least one processor to implement the steps of the data collection method according to the first aspect or the steps of the data collection method according to the third aspect.
The following clearly describes the technical solutions in the embodiments of this application with reference to the accompanying drawings in the embodiments of this application. Apparently, the described embodiments are only some rather than all of the embodiments of this application. All other embodiments obtained by persons of ordinary skill in the art based on the embodiments of this application fall within the protection scope of this application.
The terms “first”, “second”, and the like in the specification and claims of this application are used to distinguish between similar objects rather than to describe a specific order or sequence. It should be understood that the terms used in this way are interchangeable in appropriate circumstances such that the embodiments of this application can be implemented in other orders than the order illustrated or described herein. In addition, objects distinguished by “first” and “second” are generally of a same type, and the quantities of the objects are not limited. For example, there may be one or more first objects. In addition, in this specification and claims, “and/or” indicates at least one of the connected objects, and the character “/” generally indicates an “or” relationship between the contextually associated objects.
It should be noted that technologies described in the embodiments of this application are not limited to a long term evolution (LTE)/LTE-Advanced (LTE-A) system, and may also be applied to other wireless communication systems, for example, code division multiple access (CDMA), time division multiple access (TDMA), frequency division multiple access (FDMA), orthogonal frequency division multiple access (OFDMA), single-carrier frequency division multiple access (SC-FDMA), and other systems. The terms “system” and “network” in the embodiments of this application are often used interchangeably, and the technology described herein may be used in the above-mentioned systems and radio technologies as well as other systems and radio technologies. In the following descriptions, a new radio (NR) system is described for illustration purposes, and NR terms are used in most of the following descriptions, although these technologies may also be applied to other applications than the NR system application, for example, to the 6th generation (6G) communication system.
Typically, the AI algorithms selected and the models adopted vary depending on the type of solution. A main method for improving the performance of the 5th generation (5G) network using AI is to enhance or replace existing algorithms or processing modules with neural network-based algorithms and models. In specific scenarios, neural network-based algorithms and models can achieve better performance than deterministic algorithms. Commonly used neural networks include deep neural networks, convolutional neural networks, and recurrent neural networks. With the aid of existing AI tools, neural networks can be built, trained, and verified.
Replacing modules in existing systems with AI methods can effectively improve system performance. As shown in
The performance of AI training with different iteration counts is shown in
When AI is applied to a wireless communication system, many models run on the terminal side, and many tasks running on the base station side require training with data collected on the terminal side. Because the computational power of a terminal is limited, a feasible solution is to report data to the network side for centralized training. Since many terminals have identical or similar terminal types and service types and operate in identical or similar environments, the data of these terminals is highly similar. Identical data should be avoided as much as possible in training data sets, because identical samples provide limited benefit for model convergence and may lead to overfitting or poor generalization. Meta-learning is a learning method for improving model generalization: multiple tasks are constructed from the data sets, and multi-task learning is performed to obtain an optimal initialized model. In a new scenario, an initialized model obtained through meta-learning can be quickly fine-tuned to convergence, featuring high adaptability. When the multiple tasks are constructed, the data characteristics of different tasks are required to differ to some degree. Therefore, both a conventional learning scheme and a meta-learning scheme have requirements on the degree of overlap or similarity of data in data sets.
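To make the meta-learning idea above concrete, the following is a minimal, hypothetical sketch of a first-order (Reptile-style) meta-learning loop on a toy linear-regression problem. Nothing in it is prescribed by this application; the task construction, model, and update rule are illustrative assumptions only.

```python
# Hypothetical illustration (not from this application): a minimal
# Reptile-style meta-learning loop over tasks built from per-device data.
import numpy as np

rng = np.random.default_rng(0)

def make_task():
    """One task = a linear-regression problem with its own weights,
    standing in for data collected in one scenario or device group."""
    w_true = rng.normal(size=3)
    x = rng.normal(size=(32, 3))
    y = x @ w_true + 0.01 * rng.normal(size=32)
    return x, y

def sgd_steps(w, x, y, lr=0.05, steps=10):
    """Inner-loop fine-tuning: a few gradient steps on one task."""
    for _ in range(steps):
        grad = 2 * x.T @ (x @ w - y) / len(y)
        w = w - lr * grad
    return w

# Outer loop: pull the shared initialization toward each task's solution.
w_init = np.zeros(3)
meta_lr = 0.1
for _ in range(200):
    x, y = make_task()
    w_task = sgd_steps(w_init.copy(), x, y)
    w_init += meta_lr * (w_task - w_init)   # Reptile update

# In a new scenario, the initialization adapts after only a few steps.
x_new, y_new = make_task()
w_adapted = sgd_steps(w_init.copy(), x_new, y_new, steps=5)
```

The point of the sketch is the outer loop: the shared initialization is repeatedly pulled toward the solution of each task, so that in a new scenario only a few fine-tuning steps are needed for convergence.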
The following describes in detail the data collection method provided in the embodiments of this application through some embodiments and application scenarios thereof with reference to the accompanying drawings.
An embodiment of this application provides a data collection method. As shown in
Step 101: A first device sends a first instruction to a second device, instructing the second device to collect and report training data for training a specific AI model.
Step 102: The first device receives training data reported by the second device.
Step 103: The first device constructs a data set using the training data and trains the specific AI model.
In this embodiment of this application, the first device does not request all candidate second devices to collect and report training data. Instead, the first device first screens the candidate second devices to determine a second device required to collect and report the training data, and then sends the first instruction to that second device, instructing it to collect and report the training data. This can alleviate the transmission burden during data collection and avoid excessive duplicate samples in the data sets, thereby also reducing the burden of model training.
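As a purely illustrative sketch of this flow (the class, the screening condition, and the train function are assumptions, not signaling or APIs defined by this application), steps 101 to 103 with screening can be outlined as follows:

```python
# Hypothetical sketch of steps 101-103: screen candidates first, instruct
# only the selected second devices, then train on what they report.
class SecondDevice:
    def __init__(self, device_id, service_type, data):
        self.device_id, self.service_type, self.data = device_id, service_type, data

    def report_training_data(self):
        # Step 102: the device collects and reports its training data.
        return self.data

def collect_and_train(candidates, screening_condition, train):
    selected = [c for c in candidates if screening_condition(c)]  # screening
    reports = [c.report_training_data() for c in selected]        # steps 101-102
    return train(reports)                                         # step 103

# Example screening: keep only the first device of each service type
# (seen.add(...) returns None, so the lambda is True on first occurrence).
seen = set()
keep_first_of_type = lambda c: not (c.service_type in seen or seen.add(c.service_type))
devices = [SecondDevice("ue1", "video", [1, 2]), SecondDevice("ue2", "video", [1, 2]),
           SecondDevice("ue3", "voice", [3])]
print(collect_and_train(devices, keep_first_of_type, train=lambda r: sum(r, [])))
# -> [1, 2, 3]: ue2's duplicate data never leaves the device.
```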
In some embodiments, that a first device sends a first instruction to a second device includes:
In this embodiment, the candidate second devices are within a communication range of the first device, and the second devices reporting the training data are selected from the candidate second devices. All the candidate second devices may serve as the second devices, or some of the candidate second devices may be selected to serve as the second devices. Broadcasting means that the first instruction is sent to all the candidate second devices, while unicasting means that the first instruction is sent only to the selected second devices. Every second device that has received the unicast first instruction is required to collect and report the training data. A candidate second device that has received the broadcast first instruction needs to judge whether it satisfies the second screening condition; only the candidate second devices satisfying the second screening condition collect and report the training data.
In some embodiments, the first device sends the first instruction to the second devices through at least one of the following:
In some embodiments, before that a first device sends a first instruction to a second device, the method further includes:
In this embodiment, the candidate second devices can first report a small amount of training data (that is, the first training data) and/or the first parameter. The first device determines the sources of the data participating in the training based on that small amount of training data and/or the first parameter, and selects the second devices to collect and report training data, so that not all candidate second devices need to report training data.
In some embodiments, the first device receives only the first training data reported by the candidate second devices and determines the first parameter based on the first training data. The first device can deduce, perceive, detect, or infer the first parameter based on the first training data. The first device can screen the candidate second devices based on the first parameter to determine the second devices.
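A hypothetical illustration of this inference step follows; the choice of per-feature mean and spread as the distribution summary, and the tolerance-based similarity test, are assumptions rather than anything defined by this application.

```python
# Hypothetical sketch: the first device summarizes each candidate's small
# first-training-data sample into a data distribution parameter, then
# treats candidates with close distributions as duplicate data sources.
import numpy as np

def estimate_first_parameter(first_training_data: np.ndarray) -> dict:
    """Infer a judgment parameter (here: per-feature mean and spread)."""
    return {"mean": first_training_data.mean(axis=0),
            "std": first_training_data.std(axis=0)}

def are_similar(p1: dict, p2: dict, tol: float = 0.1) -> bool:
    """Two candidates look like duplicate sources if both summary
    statistics are within tolerance of each other."""
    return (float(np.linalg.norm(p1["mean"] - p2["mean"])) < tol and
            float(np.linalg.norm(p1["std"] - p2["std"])) < tol)
```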
In some embodiments, the first parameter includes at least one of the following:
In this embodiment, the candidate second devices can first report a small amount of training data (that is, the first training data) and/or the first parameter to the first device, where the first parameter may be a judgment parameter for the first screening condition. The first device determines, according to the first screening condition, the second devices required to collect and report training data. The second devices are selected from the candidate second devices. Specifically, there may be M candidate second devices, from which N second devices are determined to be required to collect and report training data, where N may be less than or equal to M.
In a specific example, the second devices required to collect and report training data can be determined based on data types of the candidate second devices: the candidate second devices are grouped based on their data types, and the data types of the candidate second devices in each group are the same or similar. When the second devices are selected, K1 candidate second devices are selected from each group of candidate second devices to serve as the second devices required to collect and report training data, where K1 is a positive integer. This ensures the diversity of the data participating in training and ensures that each group of candidate second devices contains second devices collecting and reporting training data.
In a specific example, the second devices required to collect and report training data can be determined based on service types of the candidate second devices: the candidate second devices are grouped based on their service types, and the service types of the candidate second devices in each group are the same or similar. When the second devices are selected, K2 candidate second devices are selected from each group of candidate second devices to serve as the second devices required to collect and report training data, where K2 is a positive integer. This ensures the diversity of the data participating in training and ensures that each group of candidate second devices contains second devices collecting and reporting training data.
In a specific example, the second devices required to collect and report training data can be determined based on data distribution parameters of the candidate second devices: the candidate second devices are grouped based on their data distribution parameters, and the data distribution parameters of the candidate second devices in each group are the same or similar. When the second devices are selected, K3 candidate second devices are selected from each group of candidate second devices to serve as the second devices required to collect and report training data, where K3 is a positive integer. This ensures the diversity of the data participating in training and ensures that each group of candidate second devices contains second devices collecting and reporting training data.
In a specific example, the second devices required to collect and report training data can be determined based on operation scenarios of the candidate second devices: the candidate second devices are grouped based on their operation scenarios, and the operation scenarios of the candidate second devices in each group are the same or similar. When the second devices are selected, A candidate second devices are selected from each group of candidate second devices to serve as the second devices required to collect and report training data, where A is a positive integer. This ensures the diversity of the data participating in training and ensures that each group of candidate second devices contains second devices collecting and reporting training data.
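The four grouping examples above share one selection mechanism. A minimal, hypothetical sketch of it follows; the grouping key, device identifiers, and deterministic pick of the first K members are illustrative assumptions.

```python
# Hypothetical sketch of group-then-sample selection: candidates are
# grouped by a key (data type, service type, data distribution, or
# operation scenario), and K devices are drawn from each group.
from collections import defaultdict
from collections.abc import Hashable

def select_per_group(candidates: dict[str, Hashable], k: int) -> list[str]:
    """candidates maps device id -> grouping key; returns selected ids."""
    groups: dict[Hashable, list[str]] = defaultdict(list)
    for device_id, key in candidates.items():
        groups[key].append(device_id)
    selected = []
    for members in groups.values():
        selected.extend(members[:k])   # pick K per group (here: the first K)
    return selected

# Example: group by service type, pick K=1 reporter per service type.
print(select_per_group({"ue1": "video", "ue2": "video", "ue3": "voice"}, k=1))
# -> ['ue1', 'ue3']: one reporter per group keeps the data set diverse.
```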
In a specific example, the second devices required to collect and report training data can be determined based on communication network access modes of the candidate second devices, and the candidate second devices are prioritized according to their communication network access modes. The communication network access modes include fixed networks, WiFi, and mobile networks, where the mobile networks include 2G, 3G, 4G, 5G, 6G, and the like. The selection priority of a fixed network is greater than or equal to that of WiFi, and the selection priority of WiFi is greater than or equal to that of a mobile network. Among the mobile networks, a later generation corresponds to a higher selection priority; for example, the selection priority of a 5G candidate second device is higher than that of a 4G candidate second device. B candidate second devices are selected from the candidate second devices in descending order of priority to serve as the second devices required to collect and report training data, where B is a positive integer.
In a specific example, the second devices required to collect and report training data can be determined based on channel quality of the candidate second devices, and the candidate second devices are prioritized according to their channel quality. A candidate second device with higher channel quality has a higher selection priority. C candidate second devices are selected from the candidate second devices in descending order of priority to serve as the second devices required to collect and report training data, where C is a positive integer. This ensures that second devices with good channel quality collect and report training data, thereby ensuring the training quality of the specific AI model.
In a specific example, the second devices required to collect and report training data can be determined based on data collection difficulties of the candidate second devices, and the candidate second devices are prioritized according to their data collection difficulties. A candidate second device with a lower data collection difficulty has a higher selection priority. D candidate second devices are selected from the candidate second devices in descending order of priority to serve as the second devices required to collect and report training data, where D is a positive integer. This reduces the overall difficulty of data collection.
In a specific example, the second devices required to collect and report training data can be determined based on power states of the candidate second devices, and the candidate second devices are prioritized according to their power states. A candidate second device with more remaining power has a higher selection priority, and a candidate second device that is being charged has the highest selection priority. E candidate second devices are selected from the candidate second devices in descending order of priority to serve as the second devices required to collect and report training data, where E is a positive integer. This ensures that the second devices collecting and reporting training data have sufficient power.
In a specific example, the second devices required to collect and report training data can be determined based on storage states of the candidate second devices, and the candidate second devices are prioritized according to their storage states. A candidate second device with larger available storage space has a higher selection priority. F candidate second devices are selected from the candidate second devices in descending order of priority to serve as the second devices required to collect and report training data, where F is a positive integer. This ensures that the second devices collecting and reporting training data have sufficient available storage space to store the training data.
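The priority-based examples above likewise share one mechanism: score each candidate from its first parameter, sort in descending order of priority, and take the top devices. A hypothetical sketch follows, in which the scoring scale and field names are assumptions.

```python
# Hypothetical sketch of priority-ordered selection: each candidate gets a
# score from its first parameter, candidates are sorted in descending
# order of priority, and the top B/C/D/E/F devices are taken.
ACCESS_PRIORITY = {"fixed": 3, "wifi": 2, "5g": 1.5, "4g": 1.0}  # assumed scale

def priority(candidate: dict) -> tuple:
    """Higher tuple sorts first: charging flag, access mode, channel
    quality, remaining power, free storage, ease of collection."""
    return (
        candidate.get("charging", False),
        ACCESS_PRIORITY.get(candidate["access"], 0),
        candidate["channel_quality"],
        candidate["power"],
        candidate["free_storage"],
        -candidate["collection_difficulty"],
    )

def select_top(candidates: dict[str, dict], count: int) -> list[str]:
    ranked = sorted(candidates, key=lambda cid: priority(candidates[cid]),
                    reverse=True)
    return ranked[:count]

cands = {
    "ue1": {"access": "4g", "channel_quality": 0.7, "power": 80,
            "free_storage": 4, "collection_difficulty": 2},
    "ue2": {"access": "5g", "channel_quality": 0.9, "power": 60,
            "free_storage": 8, "collection_difficulty": 1, "charging": True},
}
print(select_top(cands, count=1))   # -> ['ue2'] (charging, 5G, better channel)
```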
In some embodiments, the unicast first instruction includes at least one of the following:
In some embodiments, the broadcast first instruction includes at least one of the following:
The identity of the candidate second device collecting data and the identity of the candidate second device not collecting data constitute the second screening condition. The candidate second device can judge whether it satisfies the second screening condition according to its own identity.
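As a hypothetical illustration of this judgment (the instruction fields are assumptions, not signaling defined by this application), a candidate receiving the broadcast first instruction might decide as follows:

```python
# Hypothetical sketch: on receiving a broadcast first instruction, a
# candidate checks its own identity against the second screening condition.
def should_collect(my_id: str, instruction: dict) -> bool:
    """instruction may carry identities that must collect and identities
    that must not collect; only matching candidates collect and report."""
    collect_ids = instruction.get("collect_ids")        # allow-list, optional
    no_collect_ids = instruction.get("no_collect_ids")  # deny-list, optional
    if no_collect_ids and my_id in no_collect_ids:
        return False
    if collect_ids is not None:
        return my_id in collect_ids
    return True  # no identity constraint: every receiver collects
```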
In some embodiments, after the training a specific AI model, the method further includes:
In this embodiment, the first device constructs a training data set based on the received training data, trains the specific AI model, and sends the converged AI model and the hyper-parameter to the L inference devices. The inference devices are second devices required to perform performance verification and inference on the AI model. The inference devices can be selected from the candidate second devices or from second devices other than the candidate second devices.
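A hypothetical sketch of this training-and-distribution step follows; the helper names and the receive() call are assumptions, and the linear model is a toy stand-in for the specific AI model.

```python
# Hypothetical sketch of step 103 plus distribution: pool the reported
# data into a data set, train to convergence, then send the converged
# model and hyper-parameters to the inference devices.
import numpy as np

def build_dataset(reports):
    """reports: device id -> (x, y) arrays reported by the second devices."""
    xs, ys = zip(*reports.values())
    return np.concatenate(xs), np.concatenate(ys)

def train_until_converged(x, y, lr=0.05, max_iters=5000, tol=1e-8):
    """Gradient descent on a toy linear model until the gradient vanishes."""
    w = np.zeros(x.shape[1])
    for _ in range(max_iters):
        grad = 2 * x.T @ (x @ w - y) / len(y)
        w -= lr * grad
        if np.linalg.norm(grad) < tol:
            break
    return w

def distribute(model, hyper_params, inference_devices):
    """Send the converged model and hyper-parameters to L inference devices."""
    for device in inference_devices:
        device.receive(model, hyper_params)   # hypothetical device interface
```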
In some embodiments, the first device sends the trained AI model and the hyper-parameter to the inference devices through at least one of the following:
In some embodiments, the AI model is a meta-learning model, and the hyper-parameter may be determined by the first parameter.
In some embodiments, the hyper-parameter associated with the meta-learning model includes at least one of the following:
In this embodiment, the first device may be a network-side device, and the second device may be a terminal. Alternatively, the first device is a network-side device, and the second device is a network-side device, for example, a scenario in which a plurality of network-side devices gather training data to one network-side device for training. Alternatively, the first device is a terminal, and the second device is a terminal, for example, a scenario in which a plurality of terminals gather training data to one terminal for training.
In addition, the candidate second device may be a network-side device or a terminal. The inference device may be a network-side device or a terminal.
An embodiment of this application further provides a data collection method. As shown in
Step 201: A second device receives a first instruction from a first device, where the first instruction is used to instruct the second device to collect and report training data for training a specific AI model.
Step 202: The second device collects training data and reports the training data to the first device.
In this embodiment of this application, the first device does not request all candidate second devices to collect and report training data. Instead, the first device first screens the candidate second devices to determine a second device required to collect and report the training data, and then sends the first instruction to that second device, instructing it to collect and report the training data. This can alleviate the transmission burden during data collection and avoid excessive duplicate samples in the data sets, thereby also reducing the burden of model training.
In some embodiments, the second device reports the training data to the first device through at least one of the following:
In some embodiments, that a second device receives a first instruction from a first device includes:
In some embodiments, that the second device collects training data and reports the training data to the first device includes:
In this embodiment, the candidate second devices are within a communication range of the first device, and the second devices reporting the training data are selected from the candidate second devices. All the candidate second devices may serve as the second devices, or some of the candidate second devices may be selected to serve as the second devices. Broadcasting means that the first instruction is sent to all the candidate second devices, while unicasting means that the first instruction is sent only to the selected second devices. Every second device that has received the unicast first instruction is required to collect and report the training data. A candidate second device that has received the broadcast first instruction needs to judge whether it satisfies the second screening condition; only the candidate second devices satisfying the second screening condition collect and report the training data.
In some embodiments, before that a second device receives a first instruction from a first device, the method further includes:
In this embodiment, the candidate second devices can first report a small amount of training data (that is, the first training data) and/or the first parameter. The first device determines the sources of the data participating in the training based on that small amount of training data and/or the first parameter, and selects the second devices to collect and report training data, so that not all candidate second devices need to report training data.
In some embodiments, the second device reports the first training data and/or the first parameter to the first device through at least one of the following:
In some embodiments, the candidate second devices report only the first training data to the first device, where the first training data is used for determining the first parameter.
In some embodiments, the first device receives only the first training data reported by the candidate second devices and determines the first parameter based on the first training data. The first device can deduce, perceive, detect, or infer the first parameter based on the first training data. The first device can screen the candidate second devices based on the first parameter to determine the second devices.
In some embodiments, the first parameter includes at least one of the following:
In this embodiment, the candidate second devices can first report a small amount of training data (that is, the first training data) and/or the first parameter to the first device, where the first parameter may be a judgment parameter for the first screening condition. The first device determines, according to the first screening condition, the second devices required to collect and report training data. The second devices are selected from the candidate second devices. Specifically, there may be M candidate second devices, from which N second devices are determined to be required to collect and report training data, where N may be less than or equal to M.
In a specific example, the second devices required to collect and report training data can be determined based on data types of the candidate second devices: the candidate second devices are grouped based on their data types, and the data types of the candidate second devices in each group are the same or similar. When the second devices are selected, K1 candidate second devices are selected from each group of candidate second devices to serve as the second devices required to collect and report training data, where K1 is a positive integer. This ensures the diversity of the data participating in training and ensures that each group of candidate second devices contains second devices collecting and reporting training data.
In a specific example, the second devices required to collect and report training data can be determined based on service types of the candidate second devices: the candidate second devices are grouped based on their service types, and the service types of the candidate second devices in each group are the same or similar. When the second devices are selected, K2 candidate second devices are selected from each group of candidate second devices to serve as the second devices required to collect and report training data, where K2 is a positive integer. This ensures the diversity of the data participating in training and ensures that each group of candidate second devices contains second devices collecting and reporting training data.
In a specific example, the second devices required to collect and report training data can be determined based on data distribution parameters of the candidate second devices: the candidate second devices are grouped based on their data distribution parameters, and the data distribution parameters of the candidate second devices in each group are the same or similar. When the second devices are selected, K3 candidate second devices are selected from each group of candidate second devices to serve as the second devices required to collect and report training data, where K3 is a positive integer. This ensures the diversity of the data participating in training and ensures that each group of candidate second devices contains second devices collecting and reporting training data.
In a specific example, the second devices required to collect and report training data can be determined based on operation scenarios of the candidate second devices: the candidate second devices are grouped based on their operation scenarios, and the operation scenarios of the candidate second devices in each group are the same or similar. When the second devices are selected, A candidate second devices are selected from each group of candidate second devices to serve as the second devices required to collect and report training data, where A is a positive integer. This ensures the diversity of the data participating in training and ensures that each group of candidate second devices contains second devices collecting and reporting training data.
In a specific example, the second devices required to collect and report training data can be determined based on communication network access modes of the candidate second devices, and the candidate second devices are prioritized according to their communication network access modes. The communication network access modes include fixed networks, WiFi, and mobile networks, where the mobile networks include 2G, 3G, 4G, 5G, 6G, and the like. The selection priority of a fixed network is greater than or equal to that of WiFi, and the selection priority of WiFi is greater than or equal to that of a mobile network. Among the mobile networks, a later generation corresponds to a higher selection priority; for example, the selection priority of a 5G candidate second device is higher than that of a 4G candidate second device. B candidate second devices are selected from the candidate second devices in descending order of priority to serve as the second devices required to collect and report training data, where B is a positive integer.
In a specific example, the second devices required to collect and report training data can be determined based on channel quality of the candidate second devices, and the candidate second devices are prioritized according to their channel quality. A candidate second device with higher channel quality has a higher selection priority. C candidate second devices are selected from the candidate second devices in descending order of priority to serve as the second devices required to collect and report training data, where C is a positive integer. This ensures that second devices with good channel quality collect and report training data, thereby ensuring the training quality of the specific AI model.
In a specific example, the second devices required to collect and report training data can be determined based on data collection difficulties of the candidate second devices, and the candidate second devices are prioritized according to their data collection difficulties. A candidate second device with a lower data collection difficulty has a higher selection priority. D candidate second devices are selected from the candidate second devices in descending order of priority to serve as the second devices required to collect and report training data, where D is a positive integer. This reduces the overall difficulty of data collection.
In a specific example, the second devices required to collect and report training data can be determined based on power states of the candidate second devices, and the candidate second devices are prioritized according to their power states. A candidate second device with more remaining power has a higher selection priority, and a candidate second device that is being charged has the highest selection priority. E candidate second devices are selected from the candidate second devices in descending order of priority to serve as the second devices required to collect and report training data, where E is a positive integer. This ensures that the second devices collecting and reporting training data have sufficient power.
In a specific example, the second devices required to collect and report training data can be determined based on storage states of the candidate second devices, and the candidate second devices are prioritized according to their storage states. A candidate second device with larger available storage space has a higher selection priority. F candidate second devices are selected from the candidate second devices in descending order of priority to serve as the second devices required to collect and report training data, where F is a positive integer. This ensures that the second devices collecting and reporting training data have sufficient available storage space to store the training data.
In some embodiments, before the reporting the training data to the first device, the method further includes:
In some embodiments, the unicast first instruction includes at least one of the following:
In some embodiments, the broadcast first instruction includes at least one of the following:
The identity of the candidate second device collecting data and the identity of the candidate second device not collecting data constitute the second screening condition. The candidate second device can judge whether it satisfies the second screening condition according to its own identity.
In some embodiments, after that the second device collects training data and reports the training data to the first device, the method further includes:
In this embodiment, the first device constructs a training data set based on the received training data, trains the specific AI model, and sends the converged AI model and the hyper-parameter to the L inference devices. The inference devices are second devices required to perform performance verification and inference on the AI model. The inference devices can be selected from the candidate second devices or from second devices other than the candidate second devices.
In some embodiments, the AI model is a meta-learning model, and the hyper-parameter may be determined by the first parameter.
In some embodiments, the hyper-parameter includes at least one of the following:
In some embodiments, before that an inference device receives the trained AI model and a hyper-parameter sent by the first device, the method further includes:
In some embodiments, the AI model subjected to performance verification is an AI model sent by the first device or a model obtained through fine-tuning on the AI model sent by the first device.
In this embodiment, the inference device can perform performance verification directly on the AI model sent by the first device, or perform performance verification after fine-tuning the AI model sent by the first device. For meta-learning fine-tuning, the meta-learning-specific hyper-parameters corresponding to each inference device may differ. The meta-learning-specific hyper-parameters of each inference device can be determined based on the first parameter corresponding to that inference device (mainly the data collection difficulty, power state, storage state, and the like in the first parameter).
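A hypothetical sketch of the inference-device side follows; the hyper-parameter fields and the MSE-threshold form of the preset first condition are assumptions, not definitions from this application.

```python
# Hypothetical inference-device sketch: fine-tune the received meta-learned
# model with device-specific hyper-parameters, verify performance against
# a preset first condition, and only then use the model for inference.
import numpy as np

def fine_tune(w, x, y, hyper):
    """A few local gradient steps; the step count and rate come from the
    hyper-parameters chosen per device (e.g., from its first parameter)."""
    for _ in range(hyper["steps"]):
        grad = 2 * x.T @ (x @ w - y) / len(y)
        w = w - hyper["lr"] * grad
    return w

def verify_and_deploy(w, x_val, y_val, max_mse=0.1):
    """The preset first condition is assumed here to be an MSE threshold."""
    mse = float(np.mean((x_val @ w - y_val) ** 2))
    return mse <= max_mse   # True -> use the model for inference
```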
In this embodiment, the first device may be a network-side device, and the second device may be a terminal. Alternatively, the first device is a network-side device, and the second device is a network-side device, for example, a scenario in which a plurality of network-side devices gather training data to one network-side device for training. Alternatively, the first device is a terminal, and the second device is a terminal, for example, a scenario in which a plurality of terminals gather training data to one terminal for training.
In addition, the candidate second device may be a network-side device or a terminal. The inference device may be a network-side device or a terminal.
In the foregoing embodiments, the specific AI model may be a channel estimation model, a mobility prediction model, and the like. The technical solutions of the embodiments of this application can be applied to not only 6G networks but also 5G networks and 5.5G networks.
The data collection method provided in the embodiments of this application can be executed by a data collection apparatus. In the embodiments of this application, the data collection apparatus performing the data collection method is used as an example to describe the data collection apparatus provided in the embodiments of this application.
An embodiment of this application provides a data collection apparatus. The data collection apparatus includes:
In some embodiments, the sending module is specifically configured to: select N second devices from M candidate second devices according to a preset first screening condition, and unicast the first instruction to the N second devices, where M and N are positive integers, and N is less than or equal to M; or
In some embodiments, the receiving module is further configured to: receive first training data and/or a first parameter reported by the candidate second devices, where the first parameter may be a judgment parameter for the first screening condition.
In some embodiments, the receiving module is configured to receive only the first training data reported by the candidate second devices and determine the first parameter based on the first training data.
In some embodiments, the first parameter includes at least one of the following:
In some embodiments, the unicast first instruction includes at least one of the following:
In some embodiments, the broadcast first instruction includes at least one of the following:
In some embodiments, the sending module is further configured to send the trained AI model and a hyper-parameter to L inference devices, where L may be greater than, equal to, or less than M.
In some embodiments, the AI model is a meta-learning model, and the hyper-parameter is determined by the first parameter.
In some embodiments, the hyper-parameter includes at least one of the following:
In this embodiment, the first device may be a network-side device, and the second device may be a terminal. Alternatively, the first device is a network-side device, and the second device is a network-side device, for example, a scenario in which a plurality of network-side devices gather training data to one network-side device for training. Alternatively, the first device is a terminal, and the second device is a terminal, for example, a scenario in which a plurality of terminals gather training data to one terminal for training.
In addition, the candidate second device may be a network-side device or a terminal. The inference device may be a network-side device or a terminal. An embodiment of this application further provides a data collection apparatus. The data collection apparatus includes:
In some embodiments, the receiving module is configured to: receive the first instruction unicast by the first device, where the second device is a second device selected by the first device from candidate second devices according to a preset first screening condition; or
In some embodiments, the processing module is configured to: in a case that the second device has received the first instruction unicast by the first device, collect and report the training data through the second device; or
In some embodiments, the candidate second devices report first training data and/or a first parameter to the first device, where the first parameter may be a judgment parameter for the first screening condition.
In some embodiments, the candidate second devices report only the first training data to the first device, where the first training data is used for determining the first parameter.
In some embodiments, the first parameter includes at least one of the following:
In some embodiments, the processing module is further configured to send a first request to the first device, requesting collection and reporting of training data.
In some embodiments, the unicast first instruction includes at least one of the following:
In some embodiments, the broadcast first instruction includes at least one of the following:
In some embodiments, an inference device receives the trained AI model and a hyper-parameter sent by the first device.
In some embodiments, the AI model is a meta-learning model, and the hyper-parameter is determined by the first parameter.
In some embodiments, the hyper-parameter includes at least one of the following:
In some embodiments, the data collection apparatus further includes:
In some embodiments, the AI model subjected to performance verification is an AI model sent by the first device or a model obtained through fine-tuning on the AI model sent by the first device.
In this embodiment, the first device may be a network-side device, and the second device may be a terminal. Alternatively, the first device is a network-side device, and the second device is a network-side device, for example, a scenario in which a plurality of network-side devices gather training data to one network-side device for training. Alternatively, the first device is a terminal, and the second device is a terminal, for example, a scenario in which a plurality of terminals gather training data to one terminal for training.
In addition, the candidate second device may be a network-side device or a terminal. The inference device may be a network-side device or a terminal. The data collection apparatus in this embodiment of this application may be an electronic device, for example, an electronic device having an operating system, or may be a component of an electronic device, for example, an integrated circuit or a chip. The electronic device may be a terminal or a device other than a terminal. For example, the terminal may include but is not limited to the types of the terminal 11 listed above, and the other devices may be servers, network attached storage (NAS), or the like. This is not specifically limited in the embodiments of this application.
The data collection apparatus provided in this embodiment of this application can implement the processes implemented in the method embodiments of
Optionally, as shown in
An embodiment of this application further provides a first device. The first device includes a processor and a memory, where a program or instructions capable of running on the processor are stored in the memory, and when the program or instructions are executed by the processor, the steps of the foregoing data collection method are implemented.
An embodiment of this application further provides a first device. The first device includes a processor and a communication interface. The communication interface is configured to: send a first instruction to a second device, instructing the second device to collect and report training data for training a specific AI model; and receive training data reported by the second device. The processor is configured to construct a data set using the training data and train the specific AI model.
An embodiment of this application further provides a second device. The second device includes a processor and a memory, where a program or instructions capable of running on the processor are stored in the memory, and when the program or instructions are executed by the processor, the steps of the foregoing data collection method are implemented.
An embodiment of this application further provides a second device. The second device includes a processor and a communication interface. The communication interface is configured to: receive a first instruction from a first device, where the first instruction is used to instruct the second device to collect and report training data for training a specific AI model. The processor is configured to collect training data and report the training data to the first device.
The first device may be a network-side device or a terminal, and the second device may be a network-side device or a terminal.
In a case that the first device and/or the second device is a terminal, an embodiment of this application further provides a terminal including a processor and a communication interface. This terminal embodiment corresponds to the foregoing method embodiment on the terminal side. All processes and implementations in the foregoing method embodiment are applicable to this terminal embodiment, with the same technical effects achieved. Specifically,
The terminal 700 includes but is not limited to some or all of the following components: a radio frequency unit 701, a network module 702, an audio output unit 703, an input unit 704, a sensor 705, a display unit 706, a user input unit 707, an interface unit 708, a memory 709, a processor 710, and the like.
It can be understood by those skilled in the art that the terminal 700 may further include a power supply (for example, a battery) supplying power to the components. The power supply may be logically connected to the processor 710 via a power management system, so that functions such as charge management, discharge management, and power consumption management are implemented via the power management system. The structure of the terminal shown in
It should be understood that in this embodiment of this application, the input unit 704 may include a graphics processing unit (GPU) 7041 and a microphone 7042. The graphics processing unit 7041 processes image data of still pictures or videos that are obtained by an image capture apparatus (for example, a camera) in an image or video capture mode. The display unit 706 may include a display panel 7061. The display panel 7061 may be configured in a form of a liquid crystal display, an organic light-emitting diode display, or the like. The user input unit 707 includes at least one of a touch panel 7071 and other input devices 7072. The touch panel 7071 is also referred to as a touchscreen. The touch panel 7071 may include two parts: a touch detection apparatus and a touch controller. Specifically, the other input devices 7072 may include but are not limited to a physical keyboard, a function button (for example, volume control button or on/off button), a trackball, a mouse, and a joystick. Details are not described herein.
In this embodiment of this application, the radio frequency unit 701 receives downlink data from a network-side device and transfers the data to the processor 710 for processing; and the radio frequency unit 701 can additionally send uplink data to the network-side device. Generally, the radio frequency unit 701 includes but is not limited to an antenna, an amplifier, a transceiver, a coupler, a low noise amplifier, a duplexer, and the like.
The memory 709 may be configured to store software programs or instructions and various data. The memory 709 may include a first storage area for storing programs or instructions and a second storage area for storing data. The first storage area may store an operating system, an application program or instructions required by at least one function (for example, a sound playback function or an image playback function), and the like. Additionally, the memory 709 may include a volatile memory or a non-volatile memory, or the memory 709 may include both a volatile memory and a non-volatile memory. The non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), an electrically erasable programmable read-only memory (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), a static random access memory (SRAM), a dynamic random access memory (DRAM), a synchronous dynamic random access memory (SDRAM), a double data rate synchronous dynamic random access memory (DDR SDRAM), an enhanced synchronous dynamic random access memory (ESDRAM), a synchronous link dynamic random access memory (SLDRAM), or a direct Rambus random access memory (DRRAM). The memory 709 in this embodiment of this application includes but is not limited to these and any other applicable types of memories.
The processor 710 may include one or more processing units. Optionally, the processor 710 may integrate an application processor and a modem processor. The application processor primarily handles operations involving the operating system, user interface, application programs, and the like. The modem processor, for example a baseband processor, primarily handles radio communication signals. It can be understood that the modem processor may alternatively not be integrated in the processor 710.
In some embodiments, the first device is a terminal. The processor 710 is configured to: send a first instruction to a second device, instructing the second device to collect and report training data for training a specific AI model; receive training data reported by the second device; and construct a data set using the training data and train the specific AI model.
In some embodiments, the processor 710 is specifically configured to: select N second devices from M candidate second devices according to a preset first screening condition, and unicast the first instruction to the N second devices, where M and N are positive integers, and N is less than or equal to M; or
In some embodiments, the processor 710 is further configured to: receive first training data and/or a first parameter reported by the candidate second devices, where the first parameter may be a judgment parameter for the first screening condition.
In some embodiments, the processor 710 is configured to receive only the first training data reported by the candidate second devices and determine the first parameter based on the first training data.
In some embodiments, the first parameter includes at least one of the following:
In some embodiments, the unicast first instruction includes at least one of the following:
In some embodiments, the broadcast first instruction includes at least one of the following:
In some embodiments, the processor 710 is further configured to send the trained AI model and a hyper-parameter to L inference devices, where L may be greater than, equal to, or less than M.
In some embodiments, the AI model is a meta-learning model, and the hyper-parameter is determined by the first parameter.
In some embodiments, the hyper-parameter includes at least one of the following:
In this embodiment, the first device may be a network-side device, and the second device may be a terminal. Alternatively, the first device is a network-side device, and the second device is a network-side device, for example, a scenario in which a plurality of network-side devices gather training data to one network-side device for training. Alternatively, the first device is a terminal, and the second device is a terminal, for example, a scenario in which a plurality of terminals gather training data to one terminal for training.
In addition, the candidate second device may be a network-side device or a terminal. The inference device may be a network-side device or a terminal. In some embodiments, the second device is a terminal. The processor 710 is configured to: receive a first instruction from a first device, where the first instruction is used to instruct the second device to collect and report training data for training a specific AI model; and collect training data and report the training data to the first device.
In some embodiments, the processor 710 is configured to: receive the first instruction unicast by the first device, where the second device is a second device selected by the first device from candidate second devices according to a preset first screening condition; or
In some embodiments, the processor 710 is configured to: in a case that the second device has received the first instruction unicast by the first device, collect and report the training data through the second device; or
In some embodiments, the candidate second devices report first training data and/or a first parameter to the first device, where the first parameter may be a judgment parameter for the first screening condition.
In some embodiments, the candidate second devices report only the first training data to the first device, where the first training data is used for determining the first parameter.
In some embodiments, the first parameter includes at least one of the following:
In some embodiments, the processor 710 is further configured to send a first request to the first device, requesting collection and reporting of training data.
In some embodiments, the unicast first instruction includes at least one of the following:
In some embodiments, the broadcast first instruction includes at least one of the following:
In some embodiments, an inference device receives the trained AI model and a hyper-parameter sent by the first device.
In some embodiments, the AI model is a meta-learning model, and the hyper-parameter is determined by the first parameter.
In some embodiments, the hyper-parameter includes at least one of the following:
In some embodiments, the processor 710 is configured to: perform performance verification on the AI model; and in a case that a performance verification result satisfies a preset first condition, use the AI model for inference.
In some embodiments, the AI model subjected to performance verification is an AI model sent by the first device or a model obtained through fine-tuning on the AI model sent by the first device.
In a case that the first device and/or the second device is a network-side device, an embodiment of this application further provides a network-side device including a processor and a communication interface. This network-side device embodiment corresponds to the foregoing network-side device method embodiment. All processes and implementations in the foregoing method embodiment are applicable to this network-side device embodiment, with the same technical effects achieved.
Specifically, an embodiment of this application further provides a network-side device. As shown in
The method executed by the network-side device in the foregoing embodiments may be implemented on the baseband apparatus 83. The baseband apparatus 83 includes a baseband processor.
The baseband apparatus 83 may include, for example, at least one baseband board, where a plurality of chips are disposed on the baseband board. As shown in
The network-side device may further include a network interface 86, where the interface is, for example, a common public radio interface (CPRI).
Specifically, the network-side device 800 in this embodiment of this application further includes instructions or a program stored in the memory 85 and executable on the processor 84. The processor 84 invokes the instructions or program in the memory 85 to execute the foregoing data collection method, with the same technical effects achieved. To avoid repetition, details are not described herein again.
An embodiment of this application further provides a readable storage medium, where the readable storage medium has a program or instructions stored thereon, and when the program or instructions are executed by a processor, the processes of the foregoing data collection method embodiment are implemented, with the same technical effects achieved. To avoid repetition, details are not described herein again.
The processor is the processor in the terminal described in the foregoing embodiment. The readable storage medium includes a computer-readable storage medium, for example, a computer read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
An embodiment of this application further provides a chip. The chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to run a program or instructions to implement the foregoing data collection method embodiment, with the same technical effects achieved. To avoid repetition, details are not repeated herein.
It should be understood that the chip mentioned in this embodiment of this application may also be referred to as a system-level chip, a system chip, a chip system, a system-on-chip, or the like.
An embodiment of this application further provides a computer program/program product, where the computer program/program product is stored in a storage medium, and the computer program/program product is executed by at least one processor to implement the processes of the foregoing data collection method embodiment, with the same technical effects achieved. To avoid repetition, details are not described herein again.
An embodiment of this application further provides a data collection system including a first device and a second device, where the first device may be configured to execute the steps of the foregoing data collection method, and the second device may be configured to execute the steps of the foregoing data collection method.
It should be noted that in this specification, the terms “include”, “comprise”, or any of their variants are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements that are not expressly listed, or further includes elements inherent to such process, method, article, or apparatus. Without more constraints, an element preceded by “includes a . . . ” does not preclude the presence of other identical elements in the process, method, article, or apparatus that includes the element. In addition, it should be noted that the scope of the method and apparatus in the implementations of this application is not limited to functions being performed in the order shown or discussed, but may further include functions being performed at substantially the same time or in a reverse order, depending on the functions involved. For example, the described method may be performed in an order different from the order described, and steps may be added, omitted, or combined. In addition, features described with reference to some examples may be combined in other examples.
Based on the above description of the embodiments, persons skilled in the art can clearly understand that the method in the foregoing embodiments can be implemented through software on a necessary hardware platform or through hardware alone, although in many cases the former is the preferred implementation. Based on such an understanding, the technical solutions of this application essentially, or the part contributing to the prior art, may be implemented in the form of a computer software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc), and includes several instructions for instructing a terminal (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods described in the embodiments of this application.
The foregoing describes the embodiments of this application with reference to the accompanying drawings. However, this application is not limited to the foregoing specific embodiments. The foregoing specific embodiments are merely illustrative rather than restrictive. As instructed by this application, persons of ordinary skill in the art may develop many other manners without departing from principles of this application and the protection scope of the claims, and all such manners fall within the protection scope of this application.
This application is a continuation of International Application No. PCT/CN2022/138757 filed on Dec. 13, 2022, which claims priority to Chinese Patent Application No. 202111540035.0 filed on Dec. 15, 2021, which are incorporated herein by reference in their entireties.