5G is quickly moving from an idealized future to a very real present. The first 5G-ready iPhone has already been released. As with all generation upgrades, 5G promises significant speed improvements over its predecessor. 4G LTE offers a peak download rate of 100 Mbit/s, with real-world averages of 25-50 Mbit/s. In stark contrast, 5G offers up to 1.8 Gbit/s, an improvement of almost 20x. 5G also has lower latency, with airtime as the primary contributor; Verizon has reported latencies under 30 ms in early deployments. There are additional gains in user mobility, energy efficiency, and the number of simultaneous connections.
These changes do not come for free, however. 5G deployment requires extensive changes to existing infrastructure to handle new technologies for higher frequencies, beamforming, edge computing, and more. Luckily, these infrastructure improvements also enable previously unrealizable applications. For instance, 5G has the potential to make augmented reality much more tenable.
The sheer amount of data generated by 5G systems enables data-driven control architectures, powered by machine learning, to make 5G even more powerful and efficient. In this article, we will discuss an architecture to apply machine learning to a 5G system. New architectures and algorithms can lead to huge service improvements and cost savings for 5G systems as they become more widespread in the coming years.
New 5G technologies
We will first mention a few of the new technologies coming with 5G that allow the data-driven architecture to become a reality. The primary drivers relate to mobile edge computing (MEC) and radio access networks (RANs).
MEC moves computing from centralized servers closer to the mobile data users in different areas. Normally, data is forwarded from the base station to a central server. The central server handles the data and sends a response back to the base station. Given that the central server can be halfway across the country from the base station, the roundtrip time for data is of the order of tens to hundreds of milliseconds, which limits how responsive the cellular network can be.
MEC brings decentralized computing to cellular networks. Instead of a single central server, compute devices are distributed around the country, with one or multiple per service area. This reduction in processing latency enables much more sophisticated algorithms than previously possible, especially real-time and region-specific algorithms.
The second key driver for data-driven 5G architectures is radio access network (RAN) improvements. The RAN is responsible for transferring data between user devices and the core network. 5G adds multiple frequency bands, beamforming, and massive MIMO to the RAN. These technologies make data delivery highly reconfigurable, but they present challenges in orchestration. This reconfigurability enables, for example, greatly improved service at crowded events like concerts.
In the first full set of 5G standards, the 3GPP specifies splitting the monolithic base station of previous generations into three separate units: centralized units, distributed units, and radio units (CUs, DUs, and RUs, respectively). The decentralized, flexible nature of the 5G RAN enables sophisticated control schemes based on data from the thousands of units in each service area.
Data-driven cellular architecture
One possible data-driven architecture to take advantage of the distributed RAN consists of the following:

- a cloud controller, which manages the RAN controllers for a given service area
- RAN controllers, which orchestrate the centralized and distributed units to handle user-device actions like RAN transfers and load balancing
- centralized and distributed units, which handle data-delivery operations
- radio units, which control the RF transceivers that put data over the air
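This hierarchy can be sketched as a minimal set of Python classes. The class and field names here are purely illustrative, not drawn from any 3GPP specification or real controller codebase:

```python
from dataclasses import dataclass, field

@dataclass
class RadioUnit:
    """RU: controls the RF transceivers that put data over the air."""
    ru_id: int

@dataclass
class DistributedUnit:
    """DU: handles low-level, time-critical data-delivery operations."""
    du_id: int
    radio_units: list = field(default_factory=list)

@dataclass
class CentralizedUnit:
    """CU: handles higher-layer data-delivery operations; may run in the MEC."""
    cu_id: int
    distributed_units: list = field(default_factory=list)

@dataclass
class RanController:
    """Orchestrates CUs and DUs for user-device actions (transfers, load balancing)."""
    region: str
    centralized_units: list = field(default_factory=list)

    def units(self):
        # Flatten all DUs this controller can orchestrate via its CUs.
        return [du for cu in self.centralized_units for du in cu.distributed_units]

@dataclass
class CloudController:
    """Manages the RAN controllers for a service area."""
    ran_controllers: list = field(default_factory=list)

# Toy topology: one RAN controller with one CU fanning out to two DUs.
cloud = CloudController([
    RanController("west", [
        CentralizedUnit(0, [DistributedUnit(0, [RadioUnit(0)]), DistributedUnit(1)]),
    ]),
])
```

The nesting mirrors the separation of concerns described below: each layer only talks to the layer directly beneath it.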
This architecture is presented and described in greater detail by Polese et al. In the paper, they propose an edge-controller-based architecture for cellular networks and evaluate its performance with real data from hundreds of base stations of a major U.S. operator. They provide insights on how to dynamically cluster and associate base stations and controllers, according to the global mobility patterns of the users.
Both the cloud and RAN controllers, and even the centralized units, can be deployed in the MEC. This distribution of components provides a separation of concerns between layers of the protocol stack, allowing the cloud and RAN controllers to make higher-level decisions without worrying about low-level operations like channel coding and beamforming.
For example, the RAN controller can aggregate data from all of its corresponding centralized and distributed units and run machine learning algorithms to optimize service in real time. The cloud controller can then aggregate data from multiple RAN controllers and determine which algorithms are performing the best. It can also create an estimate of user behavior and can monitor network congestion in different areas throughout the day.
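As a toy illustration of that aggregation step, here is a sketch in plain Python. The KPI names and the simple per-area mean are placeholders; a real RAN controller would consume standardized measurement reports rather than free-form dicts:

```python
from collections import defaultdict
from statistics import mean

def aggregate_kpis(reports):
    """Combine per-unit KPI reports into per-KPI averages for one service area.

    `reports` is a list of dicts like
    {"unit": "du-1", "throughput_mbps": 400.0, "latency_ms": 12.0}.
    """
    by_kpi = defaultdict(list)
    for report in reports:
        for kpi, value in report.items():
            if kpi != "unit":  # skip the identifier field
                by_kpi[kpi].append(value)
    return {kpi: mean(values) for kpi, values in by_kpi.items()}

reports = [
    {"unit": "du-1", "throughput_mbps": 400.0, "latency_ms": 12.0},
    {"unit": "du-2", "throughput_mbps": 300.0, "latency_ms": 18.0},
]
summary = aggregate_kpis(reports)
# summary == {"throughput_mbps": 350.0, "latency_ms": 15.0}
```

A cloud controller could then apply the same pattern one level up, aggregating the summaries from several RAN controllers.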
Polese et al. tested their architecture on real 4G-LTE data from a cellular provider in California. LTE architecture is fully distributed, but it does not have the aggregation and data sharing extant in 5G architectures. Their study found that, in comparison to the LTE architecture, a controller-based 5G architecture, as described above, greatly improves prediction accuracy by aggregating data from multiple sources in the cloud and RAN controllers. This access to information from a plethora of sources makes the architecture a great candidate for new data-driven strategies and machine learning. Algorithms can be run on the cloud and RAN controllers, which can disseminate decisions to their respective CUs, DUs and RUs.
In terms of machine learning algorithms, Polese et al. experimented with random forest, Bayesian ridge, and Gaussian process regressors. The authors used these algorithms to predict different key performance indicators. The authors also experimented with a cluster-based approach in contrast to a local-based one. The cluster-based approach tries to group controllers based on location or data. The authors found that data-based clustering was more effective. The data-based clusters must be updated periodically in response to network activity, which requires network overhead to coordinate between clusters, but the authors found that daily updates had comparable performance to 15-minute updates.
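A hedged sketch of this style of KPI regression with scikit-learn follows. The synthetic sine-wave "KPI," the three-lag feature window, and the hyperparameters are all assumptions made for illustration; they do not reproduce the paper's actual features, data, or tuning:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.linear_model import BayesianRidge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for a KPI time series (e.g. cell throughput over a day):
# a daily-periodic signal plus noise. Features are the previous 3 samples.
t = np.arange(1000)
kpi = np.sin(2 * np.pi * t / 96) + 0.1 * rng.standard_normal(1000)
X = np.column_stack([kpi[i : i - 3] for i in range(3)])  # lags t-3, t-2, t-1
y = kpi[3:]                                              # value to predict

# shuffle=False keeps the temporal split: train on the past, test on the future.
X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle=False)

models = {
    "random forest": RandomForestRegressor(n_estimators=50, random_state=0),
    "Bayesian ridge": BayesianRidge(),
    # alpha models the observation noise so the GP does not interpolate it.
    "Gaussian process": GaussianProcessRegressor(alpha=0.01),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    rmse = mean_squared_error(y_test, model.predict(X_test)) ** 0.5
    print(f"{name}: RMSE = {rmse:.3f}")
```

Extending this sketch to the cluster-based approach would mean pooling the lagged samples from several controllers in one cluster before fitting, rather than fitting each controller's model on its local data alone.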
The most successful algorithm, in terms of RMSE (root mean-squared error), was the cluster-based Gaussian process regressor, followed by cluster-based random forest and local-based Bayesian ridge. The cluster-based GPR outperformed all other algorithms for all time lags, from 1 to 10 minutes. Additionally, using a cluster-based approach brings a 53% reduction in RMSE when compared to a local-based approach, directly showing the potential improvement of the 5G architecture over LTE.
As 5G continues to roll out, new applications that take advantage of the unique properties of 5G networks will be needed to fully realize performance and efficiency gains. Using data-driven techniques like machine learning, RAN controllers can orchestrate how decentralized base stations provide service. The simple addition of data sharing to the fully decentralized LTE architecture can bring a 53% reduction in RMSE to regression algorithms. The ability to forecast load, throughput, and outage duration with this accuracy is highly valuable for managing the network efficiently.
Alex Saad-Falcon is a published research engineer at an internationally acclaimed research institute, where he leads internal and sponsored projects. Alex has his MS in electrical engineering from Georgia Tech and is pursuing a PhD in machine learning. He is a content writer for Do Supply Inc.