Aspects pertain to wireless communications. Some aspects relate to wireless networks including 3GPP (Third Generation Partnership Project) networks, 3GPP LTE (Long Term Evolution) networks, 3GPP LTE-A (LTE Advanced) networks, unlicensed LTE networks (MulteFire, LTE-U), and fifth-generation (5G) networks including 5G new radio (NR) (or 5G-NR) networks such as 5G NR unlicensed spectrum (NR-U) networks, as well as other unlicensed networks including Wi-Fi, CBRS (OnGo), etc. Other aspects are directed to Open RAN (O-RAN) architectures and, more specifically, techniques for providing federated learning services for artificial intelligence (AI) and machine learning (ML) in non-real-time (Non-RT) radio access network (RAN) intelligent controllers (RICs) (Non-RT RICs) and near-real-time (Near-RT) RICs.
Mobile communications have evolved significantly from early voice systems to today's highly sophisticated integrated communication platforms. With the increase in different types of devices communicating with various network devices, usage of 3GPP LTE systems has increased. The penetration of mobile devices (user equipment or UEs) in modern society has continued to drive demand for a wide variety of networked devices in many disparate environments. Fifth-generation (5G) wireless systems are forthcoming and are expected to enable even greater speed, connectivity, and usability. Next generation 5G networks are expected to increase throughput, coverage, and robustness and reduce latency and operational and capital expenditures. 5G new radio (5G-NR) networks will continue to evolve based on 3GPP LTE-Advanced with additional potential new radio access technologies (RATs) to enrich people's lives with seamless wireless connectivity solutions delivering fast, rich content and services. As current cellular network frequencies become saturated, higher frequencies, such as millimeter wave (mmWave) frequencies, can be beneficial due to their high bandwidth.
Potential LTE operation in the unlicensed spectrum includes (and is not limited to) LTE operation in the unlicensed spectrum via dual connectivity (DC), or DC-based LAA, and a standalone LTE system in the unlicensed spectrum, according to which LTE-based technology operates solely in the unlicensed spectrum without requiring an “anchor” in the licensed spectrum; the latter is called MulteFire. MulteFire combines the performance benefits of LTE technology with the simplicity of Wi-Fi-like deployments.
Further enhanced operation of LTE and NR systems in the licensed, as well as unlicensed spectrum, is expected in future releases and 5G systems such as O-RAN systems. Such enhanced operations can include techniques for AI and ML for O-RAN networks.
In the figures, which are not necessarily drawn to scale, like numerals may describe similar components in different views. Like numerals having different letter suffixes may represent different instances of similar components. The figures illustrate generally, by way of example, but not by way of limitation, various aspects discussed in the present document.
The following description and the drawings sufficiently illustrate aspects to enable those skilled in the art to practice them. Other aspects may incorporate structural, logical, electrical, process, and other changes. Portions and features of some aspects may be included in, or substituted for, those of other aspects. Aspects outlined in the claims encompass all available equivalents of those claims.
A technical problem is how to collect and use data to train AI/ML models for use in an O-RAN architecture. Some embodiments address the technical problem with federated learning. O-RAN WG2 AI/ML models may be deployed differently in O-RAN. For example, an AI/ML model can be trained in a Non-RT RIC and deployed into a Near-RT RIC for inference. One Non-RT RIC may service multiple Near-RT RICs so that the Non-RT RIC (or OAM) may obtain an AI/ML model trained across a large number of Near-RT RICs (or CUs) without collecting all the raw data from the RAN. The local data stays within corresponding Near-RT RICs to reduce data transportation cost and to enhance privacy. Some embodiments use the A1 interface, e.g., as described herein and in [R03] and [R04], for communication in federated learning between the Non-RT RIC and Near-RT RICs in the O-RAN architecture. Some embodiments provide management services for AI/ML model federated learning between Non-RT RICs and Near-RT RICs.
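By way of illustration and not limitation, the following sketch shows the federated averaging idea underlying this arrangement: each Near-RT RIC trains the received global model on its local data, and the Non-RT RIC aggregates the returned local models without ever collecting the raw data. All function and variable names are illustrative, not part of any O-RAN specification.

```python
# Minimal federated-averaging (FedAvg) sketch; names are illustrative only.
import numpy as np

def local_update(global_weights, local_data, lr=0.01, epochs=1):
    """Near-RT RIC side: train the received global model on local data only."""
    w = global_weights.copy()
    for _ in range(epochs):
        for x, y in local_data:              # raw data never leaves this node
            grad = (w @ x - y) * x           # gradient of a squared-error loss
            w -= lr * grad                   # one SGD step
    return w

def federated_round(global_weights, client_datasets):
    """Non-RT RIC side: aggregate local models, weighted by data set size."""
    local_models = [local_update(global_weights, d) for d in client_datasets]
    sizes = np.array([len(d) for d in client_datasets], dtype=float)
    return np.average(local_models, axis=0, weights=sizes)
```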
The management portion/side of the architecture 200 includes the SMO Framework 202 containing the non-RT RIC 212, and may include the O-Cloud 206. The O-Cloud 206 is a cloud computing platform including a collection of physical infrastructure nodes to host the relevant O-RAN functions (e.g., the near-RT RIC 214, O-RAN Central Unit-Control Plane (O-CU-CP) 221, O-RAN Central Unit-User Plane (O-CU-UP) 222, and the O-RAN Distributed Unit (O-DU) 215), supporting software components (e.g., OSs, VMMs, container runtime engines, ML engines, etc.), and appropriate management and orchestration functions.
The radio portion/side of the logical architecture 200 includes the near-RT RIC 214, the O-DU 215, the O-RAN Radio Unit (O-RU) 216, the O-CU-CP 221, and the O-CU-UP 222 functions. The radio portion/side of the logical architecture 200 may also include the O-e/gNB 210.
The O-DU 215 is a logical node hosting Radio Link Control (RLC), media access control (MAC), and higher physical (PHY) layer entities/elements (High-PHY layers) based on a lower layer functional split. The O-RU 216 is a logical node hosting lower PHY layer entities/elements (Low-PHY layer) (e.g., FFT/iFFT, PRACH extraction, etc.) and RF processing elements based on a lower layer functional split. Virtualization of O-RU 216 is FFS. The O-CU-CP 221 is a logical node hosting the RRC and the control plane (CP) part of the PDCP protocol. The O-CU-UP 222 is a logical node hosting the user plane part of the PDCP protocol and the SDAP protocol.
An E2 interface terminates at a plurality of E2 nodes. The E2 nodes are logical nodes/entities that terminate the E2 interface. For NR/5G access, the E2 nodes include the O-CU-CP 221, O-CU-UP 222, O-DU 215, or any combination of elements as defined in Reference [R15]. For E-UTRA access the E2 nodes include the O-e/gNB 210. As shown in
The Open Fronthaul (OF) interface(s) is/are between the O-DU 215 and O-RU 216 functions (see References [R16] and [R17]). The OF interface(s) includes the Control User Synchronization (CUS) Plane and Management (M) Plane.
The F1-c interface connects the O-CU-CP 221 with the O-DU 215. As defined by 3GPP, the F1-c interface is between the gNB-CU-CP and gNB-DU nodes (see References [R07] and [R10]). However, for purposes of O-RAN, the F1-c interface is adopted between the O-CU-CP 221 and the O-DU 215 functions while reusing the principles and protocol stack defined by 3GPP and the definition of interoperability profile specifications.
The F1-u interface connects the O-CU-UP 222 with the O-DU 215. As defined by 3GPP, the F1-u interface is between the gNB-CU-UP and gNB-DU nodes (see References [R07] and [R10]). However, for purposes of O-RAN, the F1-u interface is adopted between the O-CU-UP 222 and the O-DU 215 functions while reusing the principles and protocol stack defined by 3GPP and the definition of interoperability profile specifications.
The NG-c interface is defined by 3GPP as an interface between the gNB-CU-CP and the AMF in the 5GC (see Reference [R06]). The NG-c is also referred to as the N2 interface (see Reference [R06]). The NG-u interface is defined by 3GPP as an interface between the gNB-CU-UP and the UPF in the 5GC (see Reference [R06]). The NG-u interface is referred to as the N3 interface (see Reference [R06]). In O-RAN, NG-c and NG-u protocol stacks defined by 3GPP are reused and may be adapted for O-RAN purposes.
The X2-c interface is defined in 3GPP for transmitting control plane information between eNBs or between eNB and en-gNB in EN-DC. The X2-u interface is defined in 3GPP for transmitting user plane information between eNBs or between eNB and en-gNB in EN-DC (see e.g., [O05], [O06]). In O-RAN, X2-c and X2-u protocol stacks defined by 3GPP are reused and may be adapted for O-RAN purposes.
The Xn-c interface is defined in 3GPP for transmitting control plane information between gNBs, ng-eNBs, or between an ng-eNB and gNB. The Xn-u interface is defined in 3GPP for transmitting user plane information between gNBs, ng-eNBs, or between an ng-eNB and gNB (see e.g., References [R06] and [R08]). In O-RAN, Xn-c and Xn-u protocol stacks defined by 3GPP are reused and may be adapted for O-RAN purposes.
The E1 interface is defined by 3GPP as being an interface between the gNB-CU-CP (e.g., gNB-CU-CP 3728) and gNB-CU-UP (see e.g., [O07], [O09]). In O-RAN, E1 protocol stacks defined by 3GPP are reused and adapted as being an interface between the O-CU-CP 221 and the O-CU-UP 222 functions.
The O-RAN Non-Real Time (RT) RAN Intelligent Controller (RIC) 212 is a logical function within the SMO framework 102, 202 that enables non-real-time control and optimization of RAN elements and resources; AI/machine learning (ML) workflow(s) including model training, inferences, and updates; and policy-based guidance of applications/features in the Near-RT RIC 214.
The O-RAN near-RT RIC 214 is a logical function that enables near-real-time control and optimization of RAN elements and resources via fine-grained data collection and actions over the E2 interface. The near-RT RIC 214 may include one or more AI/ML workflows including model training, inferences, and updates.
The non-RT RIC 212 can be an ML training host to host the training of one or more ML models. The ML data can be collected from one or more of the following: the Near-RT RIC 214, O-CU-CP 221, O-CU-UP 222, O-DU 215, O-RU 216, external enrichment source 110 of
In some implementations, the non-RT RIC 212 provides a query-able catalog for an ML designer/developer to publish/install trained ML models (e.g., executable software components). In these implementations, the non-RT RIC 212 may provide a discovery mechanism to determine whether a particular ML model can be executed in a target ML inference host, and what number and type of ML models can be executed in the target ML inference host. The Near-RT RIC 214 is a managed function (MF). For example, there may be three types of ML catalogs made discoverable by the non-RT RIC 212: a design-time catalog (e.g., residing outside the non-RT RIC 212 and hosted by some other ML platform(s)), a training/deployment-time catalog (e.g., residing inside the non-RT RIC 212), and a run-time catalog (e.g., residing inside the non-RT RIC 212). The non-RT RIC 212 supports necessary capabilities for ML model inference in support of ML-assisted solutions running in the non-RT RIC 212 or some other ML inference host. These capabilities enable executable software, such as VMs, containers, etc., to be installed. The non-RT RIC 212 may also include and/or operate one or more ML engines, which are packaged software executable libraries that provide methods, routines, data types, etc., used to run ML models. The non-RT RIC 212 may also implement policies to switch and activate ML model instances under different operating conditions.
The non-RT RIC 212 is able to access feedback data (e.g., FM, PM, and network KPI statistics) over the O1 interface on ML model performance and perform necessary evaluations. If the ML model fails during runtime, an alarm can be generated as feedback to the non-RT RIC 212. How well the ML model is performing in terms of prediction accuracy or other operating statistics can also be sent to the non-RT RIC 212 over O1. The non-RT RIC 212 can also scale ML model instances running in a target MF over the O1 interface by observing resource utilization in the MF. The environment where the ML model instance is running (e.g., the MF) monitors resource utilization of the running ML model. This can be done, for example, using an ORAN-SC component called ResourceMonitor in the near-RT RIC 214 and/or in the non-RT RIC 212, which continuously monitors resource utilization. If resources are low or fall below a certain threshold, the runtime environment in the near-RT RIC 214 and/or the non-RT RIC 212 provides a scaling mechanism to add more ML instances. The scaling mechanism may include a scaling factor such as a number, percentage, and/or other like data used to scale up/down the number of ML instances. ML model instances running in the target ML inference hosts may be automatically scaled by observing resource utilization in the MF. For example, the Kubernetes® (K8s) runtime environment typically provides an auto-scaling feature.
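As a hedged illustration of the threshold-based scaling decision described above, the following sketch computes a desired instance count from observed utilization; the interface of the ORAN-SC ResourceMonitor is not shown, and all thresholds, names, and the scaling factor here are assumptions.

```python
# Hypothetical threshold-based scaling decision for ML model instances.
# Thresholds, the scaling factor, and the function interface are assumptions.
def scale_decision(utilization: float, instances: int,
                   high: float = 0.8, low: float = 0.3,
                   scale_factor: float = 1.5) -> int:
    """Return the desired number of ML instances for the inference host."""
    if utilization > high:                   # free resources below threshold
        return max(instances + 1, int(instances * scale_factor))
    if utilization < low and instances > 1:  # over-provisioned; scale down
        return instances - 1
    return instances                         # utilization within bounds
```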
The A1 interface is between the non-RT RIC 212 (which is within the SMO 202) and the near-RT RIC 214. The A1 interface supports three types of services as defined in Reference [R14], including a Policy Management Service, an Enrichment Information Service, and an ML Model Management Service. A1 policies have the following characteristics compared to persistent configuration as defined in Reference [R14]: A1 policies are not critical to traffic; A1 policies have temporary validity; A1 policies may handle individual UEs or dynamically defined groups of UEs; A1 policies act within and take precedence over the configuration; and A1 policies are non-persistent, i.e., they do not survive a restart of the near-RT RIC.
The application layer protocol is based on a RESTful approach with JSON 402, 403 for data interchange 414. References [R03] and [R04] provide details regarding A1-P 316.
The method 600 begins at operation 604 with the Non-RT RIC 212 sending a global model download that includes data 602. The data 602 may include the global AI/ML model 616.
The method 600 continues at operation 606 with the Near-RT RIC 214 updating the local AI/ML model 616, which may initially be a copy of the global AI/ML model 616, or the local AI/ML model 616 may be updated based on the received global AI/ML model 616. The Near-RT RIC 214 receives the global AI/ML model 616 and trains the received global AI/ML model 616 using its locally available training data set. The trained global AI/ML model 616 is then regarded as, or termed, its local model.
The method 600 continues at operation 610, which includes data 610, with the Near-RT RIC 214 uploading the local AI/ML model 616 to the Non-RT RIC 212. The data 610 may include the local AI/ML model 616, a portion of the AI/ML model 616, or the gradients for model updates.
The method 600 continues at operation 612 with the non-RT RIC 212 updating the global AI/ML model 616 based on the received data 610, which includes the local AI/ML model 616 or the gradients. The method 600 may include operation 614 where the method 600 iterates until a termination criterion is met for the training, such as error thresholds or changes to the weights being below a threshold.
The method 600 is described with one Near-RT RIC 214, but there may be more than one Near-RT RIC 214 interacting with the Non-RT RIC 212. O-RAN uses the O1 interface for deployment of a trained and tested ML model from the Non-RT RIC 212 to the Near-RT RIC 214. However, the model updates in FL may not be full AI/ML model transmissions. Exchanges between FL clients, e.g., the Near-RT RIC 214, and the central server, e.g., the Non-RT RIC 212, can be portions, gradients, or compressed AI/ML models.
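As one non-limiting sketch of such reduced exchanges, a Near-RT RIC could upload a sparsified model delta instead of the full model; the top-k compression shown is merely one well-known option, and none of the names below come from the O-RAN specifications.

```python
# Illustrative sketch: upload a sparsified model delta rather than the full
# AI/ML model, reducing A1 traffic between Near-RT RIC and Non-RT RIC.
import numpy as np

def model_delta(global_w, local_w):
    """The gradient-style update a Near-RT RIC uploads instead of the model."""
    return local_w - global_w

def top_k_compress(delta, k):
    """Keep only the k largest-magnitude entries of the delta; zero the rest."""
    keep = np.argsort(np.abs(delta))[-k:]
    out = np.zeros_like(delta)
    out[keep] = delta[keep]
    return out
# The Non-RT RIC then applies the (averaged) deltas to the global model.
```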
A FL session object consists of a global model (/globalModel) 712 and a local model (/localModel) 714. Models are identified by their IDs, and a model resource object specifies the format of the model updates (model update or gradient update), the content of the model update, and the status of the model (/globalModelStatus) 716 or (/localModelStatus) 718. The status of a model indicates whether the model needs to be updated. Table 1 describes the
In some embodiments, JSON objects are used. In one embodiment, the following JSON objects are used in the service operation of the A1-ML service for FL. FLSessionObject: The FL session object is the JSON representation of a FL session. A FL session is identified by its unique session ID, which is assigned by the Non-RT RIC 212. A FL session links a global model and a local model for federated learning. GlobalModelObject: The global model object is the JSON representation of the global model in the FL. In one embodiment, the GlobalModelObject is defined as described in Table 2. The global model object acts as a notification to the Near-RT RIC 214 that the model file is ready. The model file may be transferred using FTP, FTPeS, SFTP, or another transfer protocol.
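By way of example only, the following shows one possible shape of these objects, rendered as Python dictionaries; every field name is hypothetical, as the normative definitions are those of Tables 1 and 2.

```python
# Hypothetical FLSessionObject and GlobalModelObject shapes; all field names
# are illustrative, not the normative Table 1/Table 2 definitions.
fl_session_object = {
    "flSessionId": "fl-session-001",         # assigned by the Non-RT RIC
    "globalModelId": "gm-42",                # links the global model...
    "localModelId": "lm-42",                 # ...to its local counterpart
}

global_model_object = {
    "modelId": "gm-42",
    "modelUpdateType": "GRADIENT",           # e.g., gradient vs. compressed
    "modelFileUrl": "sftp://near-rt-ric.example/models/gm-42.bin",
    "modelFileSize": 1048576,                # bytes
    "encoding": "protobuf",                  # drives the out-of-band transfer
}
```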
In one embodiment, the ModelUpdateType has the enumeration as described in Table 3.
The GlobalModelStatusObject is the status object of the global model and is the JSON representation indicating whether the model is timely updated. Table 4 is an enumeration of the GlobalModelStatusObject, in accordance with some embodiments.
In one embodiment the GMNotificationReasonType has the enumeration as described in Table 5.
The local model object, e.g., LocalModelObject, is the JSON representation of the local model in the FL. The local model object acts as a notification to the Non-RT RIC 212 that the model file is ready. The model file is transferred to the Non-RT RIC 212 using FTP, FTPeS, SFTP, or another transfer protocol. The local model object may have an enumeration as described in Table 6.
The local model status object, e.g., LocalModelStatusObject, of the local model may be a JSON representation indicating whether the model is timely updated. Table 7 is an enumeration of the local model status object.
In one embodiment, the local model (LM) notification reason type, e.g., LMNotificationReasonType, has an enumeration as described in Table 8.
The ML capabilities query queries the A1-ML producer 312 of the Near-RT RIC 214 for the capabilities of the A1-ML services of the Near-RT RIC 214. The Non-RT RIC 212 can query for all supported ML capabilities in the Near-RT RICs, or it can query a specific ML capability (e.g., support of FL). The A1-ML consumer 306 uses an HTTP GET request, in some embodiments, to solicit a GET response from the A1-ML producer 312.
The method 800 continues at operation 804 with the A1-ML producer 312 sending an HTTP response of “200 OK (array(mlCapID))” message with data 808. The data 808 is the message “200 OK (array(mlCapID))”, in accordance with some embodiments. For a query of all ML capabilities, the data 808 includes an array of the ML capability identifiers supported by the Near-RT RIC 214.
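A minimal sketch of this capability query follows, assuming a RESTful URI such as “/a1-ml/v1/mlCaps”; the exact URI structure and response schema are assumptions, not the specified A1-ML API.

```python
# Sketch of the ML capabilities query; URI layout and schema are assumptions.
import requests

def query_ml_capabilities(producer_base: str) -> list:
    """A1-ML consumer (Non-RT RIC) asks the producer (Near-RT RIC) for caps."""
    resp = requests.get(f"{producer_base}/a1-ml/v1/mlCaps", timeout=10)
    resp.raise_for_status()                  # expect "200 OK (array(mlCapID))"
    return resp.json()                       # e.g., ["FL", ...]
```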
The ML capabilities query queries the A1-ML producer 312 of the Near-RT RIC 214 for a specific ML capability of the A1-ML services of the Near-RT RIC 214. The A1-ML consumer 306 uses an HTTP GET request, in some embodiments, to solicit a GET response from the A1-ML producer 312.
The method 900 continues at operation 904 with the A1-ML producer 312 sending an HTTP response of “200 OK (array(mlCapObject))” message with data 910. The data 910 is the message “200 OK (array(mlCapObject))”, in accordance with some embodiments. For a query of a specific ML capability, the data 910 includes an ML capabilities object (mlCapObject) that identifies the requested capabilities. The data 910 of the HTTP response (operation 904) for a query of a single ML category contains the JSON resource object of the indicated ML capability.
The A1-ML consumer 306 of the Non-RT RIC 212 sends an HTTP PUT request (operation 1002) to the A1-ML producer 312 of the Near-RT RIC 214 to set up a FL session between a global model in the Non-RT RIC 212 and a local model in the Near-RT RIC 214. The PUT request message, e.g., data 1006, includes a FL session object.
The method 1000 continues at operation 1004 with the A1-ML producer 312 responding with an HTTP response code “201” if the creation is successful. Operation 1004 includes data 1010, which may include “201 Created (FLSessionObject)”. This method 1000 links the global and local models for FL.
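A sketch of this session creation, under the same assumed URI layout as the query sketch above, might look as follows; the notificationDestination query parameter reflects the callback URI discussed later.

```python
# Sketch of FL session creation (method 1000); URI and fields are assumptions.
import requests

def create_fl_session(producer_base: str, ml_cap_id: str, session: dict) -> dict:
    url = (f"{producer_base}/a1-ml/v1/mlCaps/{ml_cap_id}"
           f"/flSessions/{session['flSessionId']}")
    # The callback URI the Near-RT RIC will POST status notifications to.
    params = {"notificationDestination": "https://non-rt-ric.example/notify"}
    resp = requests.put(url, json=session, params=params, timeout=10)
    assert resp.status_code == 201           # "201 Created (FLSessionObject)"
    return resp.json()
```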
The method 1100 continues at operation 1104 with “200 OK”, which includes data 1110. The data 1110 may include the message “200 OK (GlobalModelObject).” The A1-ML producer 312 responds with HTTP response code “200” if the GlobalModelObject is successfully received.
The method 1100 continues at operation 1106 with data 1112, an <FTP or SFTP> model file transfer in which the A1-ML consumer 306 transfers the model to the A1-ML producer 312.
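Under the same assumptions, a sketch of this global model download step is shown below; the model file itself moves out of band (FTP/SFTP/FTPeS), as operation 1106 indicates.

```python
# Sketch of the global model download (method 1100): PUT the GlobalModelObject,
# then transfer the model file out of band per its metadata. URIs are assumed.
import requests

def push_global_model(producer_base: str, session_path: str, gm: dict) -> None:
    resp = requests.put(f"{producer_base}{session_path}/globalModel",
                        json=gm, timeout=10)
    assert resp.status_code == 200           # "200 OK (GlobalModelObject)"
    # The file then moves over FTP/SFTP/FTPeS to gm["modelFileUrl"]
    # (operation 1106); the transfer itself is outside the A1-ML API.
```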
The method 1200 begins at operation 1202 with the A1-ML consumer 306 sending an HTTP GET request targeting the resource URI “mlCaps/{mlCapId}/flSessions/{flSessionID}/globalModel/status”. This operation 1202 is used to query the status of the global model in a FL session. The GET request message body is empty.
The method 1200 continues at operation 1204 with the A1-ML producer 312 responding with HTTP response code “200” with the model status object for the global model in the message body. The data 1210 is the HTTP response with the model status.
The method 1300 continues at operation 1304 with the A1-ML producer 312 responding with data 1310. If the status object indicates that there is an available update for the local model in the Near-RT RIC 214, then the Non-RT RIC 212 can initiate a local model upload procedure, which is defined below. The GET request message body is empty, e.g., the data 1308 is only the GET request message. The response of the A1-ML producer 312 at operation 1304 may be HTTP response code “200” with the status object for the local model in the message body, e.g., data 1310.
The method 1400 continues at operation 1404 with the A1-ML producer 312 responding with an OK response with data 1410. The data 1410 is the HTTP response with code “200” and the local model object in the message body. The model object indicates the model update type (gradient or compressed model). The model object also contains information such as the file location (URL), file size, and encoding schemes, so that the Non-RT RIC 212 can upload/update the model from the Near-RT RIC 214. The method 1400 continues at operation 1406 including data 1412. The data 1412 is the model file being transferred using FTP, SFTP, or another file transfer protocol.
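A sketch of this Non-RT RIC-initiated upload, again under assumed URIs and field names, follows.

```python
# Sketch of the local model upload (method 1400): GET the LocalModelObject,
# then fetch the model file per its metadata. URIs and fields are assumed.
import requests

def fetch_local_model(producer_base: str, session_path: str) -> dict:
    resp = requests.get(f"{producer_base}{session_path}/localModel", timeout=10)
    resp.raise_for_status()                  # "200 OK (LocalModelObject)"
    lm = resp.json()
    # lm["modelUpdateType"] distinguishes gradient vs. compressed model;
    # lm["modelFileUrl"]/size/encoding drive the file transfer (operation 1406).
    return lm
```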
The method 1500 continues with operation 1504 with the A1-ML producer 312 responding with an HTTP response code “204” with an empty message body, if the procedure is successful. Operation 1504 includes data 1510, which is the response message to operation 1502.
The POST request from the Near-RT RIC 214 targets the resource URI “notificationDestination”, which is the URI query parameter given during the creation of the FL session. The message body of the POST request contains a status object for the global model. The method 1600 continues with the A1-ML consumer 306 responding at operation 1604 with an HTTP response code “204” with an empty message body. The data 1610 is the HTTP response.
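A sketch of this notification posting is shown below; the callback URI is the notificationDestination supplied at session creation, and the status payload shape is an assumption.

```python
# Sketch of the Near-RT RIC posting a GlobalModelStatusObject notification.
import requests

def notify_global_model_status(callback_uri: str, status: dict) -> None:
    resp = requests.post(callback_uri, json=status, timeout=10)
    assert resp.status_code == 204           # consumer replies with empty body
```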
The method 1800 begins at operation 1802 with querying the ML capability with data 1850. For example, this may be the same or similar as operation 802.
The method 1800 continues at operation 1804 with sending the support of FL with data 1852. For example, this may be the same or similar as operation 804. The method 1800 continues at operation 1806 with creating a FL session with data 1854. The non-RT RIC 212 creates the FL session as described herein.
The method 1800 continues at operation 1882 with the global model being downloaded. The global model may be downloaded based on the Non-RT RIC initiated 1884 download or the Near-RT RIC initiated 1886 download.
For the Non-RT RIC initiated 1884 download, the method 1800 begins at operation 1810 with data 1856 where, optionally, the Near-RT RIC 214 and/or the Non-RT RIC 212 query one another regarding the global model status. For example, operation 1202 is an example of querying the status of the global model.
The method 1800 continues at operation 1812 with data 1858 with the global model being downloaded. For example, method 1100 provides a method to download the global model.
The global model download includes two options: Non-RT RIC initiated 1884 and Near-RT RIC initiated 1886. The Near-RT RIC initiated 1886 option includes operations 1814, 1816, and 1818.
The method 1800 continues with the Near-RT RIC initiated 1886 download of the global model. The Near-RT RIC initiated 1886 download begins at operation 1814 with sending a message notifying the global model status, which is included in data 1860 (see, e.g., method 1600 for notifying global model status). The method 1800 continues at operation 1816 with sending a message to query the global model status, which is included in the data 1862 (see, e.g., method 1200 for querying global model status). The method 1800 continues at operation 1818 with downloading the global model, which is included in the data 1864 (see, e.g., method 1100 for downloading the global model).
The method 1800 includes a local model upload 1888, which may be Non-RT RIC initiated 1890 or Near-RT RIC initiated 1892. The Non-RT RIC initiated 1890 upload comprises the following two operations. The method 1800 begins, optionally, at operation 1820 with sending a message to query the local model status, which is included in the data 1866. The method 1800 continues at operation 1822 with sending a message for the local model upload, which is included in data 1868. Method 1300 provides a method for querying a local model status. Method 1400 provides a method for uploading or updating a local model.
The method 1800 includes the Near-RT RIC initiated 1892 local model upload. The Near-RT RIC initiated 1892 local model upload begins at operation 1824 with sending a message to notify the local model status, which is included in data 1870. The method 1800 continues, optionally, at operation 1826 with sending a query for the local model status, which is included in data 1872. Method 1700 provides a method for notifying local model status. Method 1400 provides a method for uploading/updating a local model. Method 1300 provides a method for querying a local model status.
The method 1800 continues at operation 1830 with sending a message to delete (or notify of deletion of) the FL session, which is included in data 1876. Method 1500 provides a method for deleting a FL session.
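Pulling these service operations together, the following hypothetical sketch walks one FL round through method 1800 using the helper sketches above; it illustrates the sequencing only and is not a normative procedure.

```python
# Hypothetical single FL round over A1-ML, sequencing the operations of
# method 1800 with the illustrative helpers sketched earlier.
def run_federated_round(producer_base, ml_cap_id, session, gm_object):
    caps = query_ml_capabilities(producer_base)           # ops 1802/1804
    if "FL" not in caps:
        return None
    create_fl_session(producer_base, ml_cap_id, session)  # op 1806
    path = f"/a1-ml/v1/mlCaps/{ml_cap_id}/flSessions/{session['flSessionId']}"
    push_global_model(producer_base, path, gm_object)     # global model download
    lm = fetch_local_model(producer_base, path)           # local model upload
    # Aggregate lm into the global model (cf. federated_round above) and
    # iterate until the termination criterion is met; then DELETE the session.
    return lm
```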
The methods described in conjunction with
The term “application” may refer to a complete and deployable package or environment to achieve a certain function in an operational environment. The term “AI/ML application” or the like may be an application that contains some AI/ML models and application-level descriptions.
The term “machine learning” or “ML” refers to the use of computer systems implementing algorithms and/or statistical models to perform specific task(s) without using explicit instructions, but instead relying on patterns and inferences. ML algorithms build or estimate mathematical model(s) (referred to as “ML models” or the like) based on sample data (referred to as “training data,” “model training information,” or the like) in order to make predictions or decisions without being explicitly programmed to perform such tasks. Generally, an ML algorithm is a computer program that learns from experience with respect to some task and some performance measure, and an ML model may be any object or data structure created after an ML algorithm is trained with one or more training datasets. After training, an ML model may be used to make predictions on new datasets. Although the term “ML algorithm” refers to different concepts than the term “ML model,” these terms as discussed herein may be used interchangeably for the purposes of the present disclosure.
The term “machine learning model,” “ML model,” or the like may also refer to ML methods and concepts used by an ML-assisted solution. An “ML-assisted solution” is a solution that addresses a specific use case using ML algorithms during operation. ML models include supervised learning (e.g., linear regression, k-nearest neighbor (KNN), decision tree algorithms, support vector machines, Bayesian algorithms, ensemble algorithms, etc.), unsupervised learning (e.g., K-means clustering, principal component analysis (PCA), etc.), reinforcement learning (e.g., Q-learning, multi-armed bandit learning, deep RL, etc.), neural networks, and the like. Depending on the implementation, a specific ML model could have many sub-models as components, and the ML model may train all sub-models together. Separately trained ML models can also be chained together in an ML pipeline during inference. An “ML pipeline” is a set of functionalities, functions, or functional entities specific for an ML-assisted solution; an ML pipeline may include one or several data sources in a data pipeline, a model training pipeline, a model evaluation pipeline, and an actor. The “actor” is an entity that hosts an ML-assisted solution using the output of the ML model inference. The term “ML training host” refers to an entity, such as a network function, that hosts the training of the model. The term “ML inference host” refers to an entity, such as a network function, that hosts the model during inference mode (which includes both the model execution as well as any online learning, if applicable). The ML host informs the actor about the output of the ML algorithm, and the actor takes a decision for an action (an “action” is performed by an actor as a result of the output of an ML-assisted solution). The term “model inference information” refers to information used as an input to the ML model for determining inference(s); the data used to train an ML model and the data used to determine inferences may overlap; however, “training data” and “inference data” refer to different concepts.
The following describes further examples. Example 1 includes where an A1-ML (A1 machine learning model management service) is used to support federated learning between Non-RT RICs and Near-RT RICs in an O-RAN architecture. In one embodiment, an A1-ML consumer (in Non-RT RIC) and an A1-ML producer (in Near-RT RIC) both have an HTTP server and client. A1-ML supports the following service operations: ML capability query; Federated learning session creation; Federated learning session deletion; Global model download/update; Local model upload/update; Global model status query; Local model status query; Global model status notification; and Local model status notification.
In Example 2, the subject matter of Example 1 optionally includes where Non-RT RIC queries Near-RT RIC's ML capability via A1-ML using HTTP GET method. ML capability is identified by the ML capability identifier.
In Example 3, the subject matter of Examples 1 and 2 optionally includes where a federated learning session includes a global model in Non-RT RIC and a local model in Near-RT RIC. It is identified by a unique session ID, which is assigned by Non-RT RIC.
In Example 4, the subject matter of Examples 1-3 optionally includes where a Non-RT RIC creates the federated learning session (FLSessionObject) via A1-ML using HTTP PUT method. A callback URI (notificationDestination) is provided to Near-RT RIC when a FL session is created. Near-RT RIC uses it for notification posting.
In Example 5, the subject matter of Examples 1-4 optionally includes where a ML model object (global or local) is identified by its model ID. In one embodiment, the object contains a field to indicate the model update type: a model gradient or a compressed model. The model object also includes model file related information, e.g., the location (path or URL) of the model file, the size of the model file, the encoding method of the model file, etc. A global model object optionally indicates an expiration timer for the following updates. If the timer expires, then Near-RT RIC sends a notification indicating that an update for the global model is missing.
In Example 6, the subject matter of Examples 1-5 optionally includes where Non-RT RIC sends the global model (GlobalModelObject) to Near-RT RIC via A1-ML using HTTP PUT method, which is followed by model file transfer using FTP, SFTP, or FTPeS, etc. Near-RT RIC updates the model accordingly.
In Example 7, the subject matter of Examples 1-6 optionally includes where the Non-RT RIC queries the status of the global model via A1-ML using HTTP GET method. Near-RT RIC replies with the status (GlobalModelStatusObject), showing the timestamp of the most recent model update. Non-RT RIC decides whether a global model download is needed or not.
In Example 8, the subject matter of Examples 1-7 optionally includes where the Near-RT RIC sends a notification (GlobalModelStatusObject) via A1-ML using HTTP POST method. In addition to the update timestamp, the notification includes the reason for sending this notification. In one embodiment, the type of the reason includes: global model not updated, model ID mismatch, etc. Non-RT RIC decides whether it should update the global model.
In Example 9, the subject matter of Examples 1-8 optionally includes where the Non-RT RIC queries the status of the local model via A1-ML using HTTP GET method. Near-RT RIC replies with the status (LocalModelStatusObject), showing the timestamp of the most recent local model update. Non-RT RIC decides whether a local model upload is needed.
In Example 10, the subject matter of Examples 1-9 optionally includes where the Non-RT RIC requests the update of the local model from Near-RT RIC via A1-ML using HTTP GET method. Near-RT RIC sends the local model object (LocalModelObject) in the response. The model object contains model file related information, e.g., file location, file size, file encoding method, etc. The model file is transferred over FTP, SFTP, or FTPeS, following the file information provided in the LocalModelObject. Non-RT RIC updates the model accordingly.
In Example 11, the subject matter of Examples 1-10 optionally includes where the Near-RT RIC sends a notification (LocalModelStatusObject) via A1-ML using HTTP POST method. In addition to the update timestamp, the notification includes the reason for sending this notification. In one embodiment, the type of the reason includes: a new update on the local model is available, the local model was terminated, model ID mismatch, etc. Non-RT RIC decides whether to upload the local model.
In Example 12, the subject matter of Examples 1-11 optionally include where the Non-RT RIC deletes the federated learning session via A1-ML using HTTP DELETE method. In some embodiments the models or learning may be referred to as artificial intelligence (AI)/ML models or AI/ML learning, respectively.
Although an aspect has been described with reference to specific exemplary aspects, it will be evident that various modifications and changes may be made to these aspects without departing from the broader scope of the present disclosure. Accordingly, the specification and drawings are to be regarded in an illustrative rather than a restrictive sense. This Detailed Description, therefore, is not to be taken in a limiting sense, and the scope of various aspects is defined only by the appended claims, along with the full range of equivalents to which such claims are entitled.