Network Slice Admission Using Reinforcement Learning and Information-Centric Networking

Information-Centric Networking (ICN) and Mobile Networks

Network Slicing

Network slicing enables the network operator or infrastructure provider (InP) to host different service providers (or tenants) on the same network by dynamically allocating resources to them [26]. In the context of 5G multi-tenant networks, the concept has been considered at different scales, abstraction levels and network segments [26]-[29].

Contributions

Klautau, “Pilot Projects for GSM Shared Networks in the Amazon”, presented at the ICT Workshop for. Klautau, “A Techno-Economic Study for Connecting the Amazon Using Community Cellular Networks,” presented at the ICT for Development Workshop (WTICp/D), Belém, Brazil: Brazilian Computer Society, 2017.

Summary of the Remaining Content

To introduce the concepts used throughout this thesis, this chapter starts by describing the two technologies integrated in the first part of the work: ICN and the Evolved Packet System (EPS). These concepts are introduced in the context of a simplified version of the slice admission problem studied in Section 4.

Information-Centric Networking

If the content is in the Content Store (CS), it is already held in this node's cache. When a data packet arrives and matches a request recorded in the Pending Interest Table (PIT), it is stored in the CS according to the caching strategy and forwarded to the interface from which the interest arrived, as recorded in the PIT.
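The packet handling described above can be sketched as a toy forwarder. This is a minimal illustration, not the forwarder used in the thesis: the class name `IcnNode`, the LRU caching strategy, the cache capacity and the return values are all assumptions made for the example.

```python
from collections import OrderedDict, defaultdict


class IcnNode:
    """Toy ICN forwarder with a Content Store (CS) and a Pending Interest Table (PIT)."""

    def __init__(self, cs_capacity=2):
        self.cs = OrderedDict()        # name -> data, in LRU order (assumed caching strategy)
        self.cs_capacity = cs_capacity
        self.pit = defaultdict(set)    # name -> interfaces waiting for that name

    def on_interest(self, name, in_face):
        """Return ('data', [face]) on a CS hit, ('forward', []) when a new PIT
        entry is created, or ('aggregated', []) when the name is already pending."""
        if name in self.cs:
            self.cs.move_to_end(name)  # refresh LRU position
            return "data", [in_face]
        already_pending = name in self.pit
        self.pit[name].add(in_face)
        return ("aggregated" if already_pending else "forward"), []

    def on_data(self, name, data):
        """Cache the data and return the interfaces it is forwarded to."""
        if name not in self.pit:
            return []                  # no matching interest: drop the packet
        faces = sorted(self.pit.pop(name))
        self.cs[name] = data
        self.cs.move_to_end(name)
        if len(self.cs) > self.cs_capacity:
            self.cs.popitem(last=False)  # evict the least recently used entry
        return faces
```

A data packet with no PIT entry is dropped in this sketch, while a matching one is cached and sent back on every interface recorded for that name.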

Fourth Generation (4G) Mobile Networks

User Plane

General Packet Radio Service Tunneling Protocol (GTP)

Control Plane
Attachment of User Equipment (UE) to the Evolved Packet System
Identifiers and Tunnels of the Subscriber Attached to the Network

Step 6: The message in step six completes the creation of the user plane tunnels in the EPC. The message sent to the eNodeB contains the F-TEID of the S-GW to which the eNodeB should forward the subscriber's user plane data. This step completes the user plane configuration in the EPS, and the user attachment is complete.

The MBMS-GW is responsible for sending the data stream, in the user plane, to the eNodeB.

Reinforcement Learning

Agent and Environment Interaction

The goal of the agent is determined by the reward signal, i.e. the agent should perform actions that increase its cumulative reward. In addition, tenants request new slices every 4 hours, and the agent is rated on the revenue generated by the end of the day, which constitutes one episode. For this interaction, the agent's decisions were made randomly, with equal probability of accepting or rejecting a slice.

The trajectory shows that the agent accepted the slice at t = 0 even though there was no slice in the system.
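A random-policy interaction of this kind can be sketched as follows. Only the 4-hour request interval, the one-day episode and the equal accept/reject probability come from the text; the capacity, the slice lifetime and the reward rule are hypothetical values chosen purely to make the example run.

```python
import random

ACTIONS = ("Accept", "Reject")
STEPS_PER_EPISODE = 6   # one request every 4 hours over a 24-hour episode
CAPACITY = 2            # assumed: more than 2 active slices is penalised
SLICE_LIFETIME = 3      # assumed: a slice stays for 3 decision steps


def run_random_episode(rng):
    """Return the trajectory [(t, state, action, reward), ...] of one episode."""
    active = []         # remaining lifetime of each active slice
    trajectory = []
    for t in range(STEPS_PER_EPISODE):
        state = len(active)                 # number of slices in the system
        action = rng.choice(ACTIONS)        # accept or reject with equal probability
        if action == "Accept":
            active.append(SLICE_LIFETIME)
        # Assumed reward: +1 per slice served within capacity, -1 on overload.
        reward = len(active) if len(active) <= CAPACITY else -1
        trajectory.append((t, state, action, reward))
        # Age every slice by one step and drop those that have expired.
        active = [life - 1 for life in active if life > 1]
    return trajectory


if __name__ == "__main__":
    for step in run_random_episode(random.Random(0)):
        print(step)
```

Each printed tuple is one step of the trajectory; the first state is always 0 because the system starts empty.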

Reinforcement Learning Components

The slice received at t = 5 is rejected, and a reward of R5 = −1 is received as the third slice leaves. For the slice admission problem defined here, the optimal policy π* obtained empirically is π*(Reject|s) = 1 for s > 2 and π*(Reject|s) = 0 otherwise, which implies π*(Accept|s) = 0 for s > 2 and π*(Accept|s) = 1 otherwise. A greedy policy is a common choice: it is defined from an action-value function and always selects the action with the highest estimated value in the current state.
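The two policies mentioned above can be written down directly. The threshold rule follows the text; the function names and the action-value numbers used in the usage note are made up for illustration and are not taken from the thesis tables.

```python
def optimal_policy(state):
    """Policy stated in the text: reject when s > 2, accept otherwise."""
    return "Reject" if state > 2 else "Accept"


def greedy_action(q_values, state):
    """Return the action with the highest estimated value in `state`.

    `q_values` maps state -> {action: estimated value}.
    """
    actions = q_values[state]
    return max(actions, key=actions.get)
```

With an illustrative table such as `{2: {"Accept": 0.4, "Reject": 0.1}, 3: {"Accept": -1.0, "Reject": 0.0}}`, the greedy policy accepts in state 2 and rejects in state 3, which agrees with the threshold rule for those two states.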

In Q-learning, when the action-value function is represented by an artificial neural network (ANN), the algorithm is known as deep Q-learning, or deep Q-network (DQN) [51].
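For contrast with the ANN-based variant, the standard tabular Q-learning update is sketched below. The step size `alpha`, the discount factor `gamma` and the dictionary layout are assumptions for the example; in DQN the table lookup would be replaced by a network prediction.

```python
def q_learning_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update in place and return the new value.

    `q` maps state -> {action: estimated value}.
    """
    best_next = max(q[next_state].values())      # value of the best next action
    td_target = reward + gamma * best_next       # bootstrapped target
    q[state][action] += alpha * (td_target - q[state][action])
    return q[state][action]
```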

First-Visit Monte Carlo Control Algorithm

We can then understand reinforcement learning algorithms as different ways of learning a policy or, in some cases, of learning the action-value function from which the best policy can be derived. This is reflected in the estimated action-value function, which now predicts a higher value for St = 2 and At = Reject, reversing the initial policy, which favored acceptance in that case. From these data, a new estimate of the action-value function is assembled and presented in Table 8.

The policy derived from the action value function in Table 9 is still the same as shown in Table 5 and, as we presented earlier, is not the optimal policy.
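The estimation step of first-visit Monte Carlo control can be sketched as below. This is a generic illustration under stated assumptions: the episode format, the function name and the default `gamma=1.0` are choices made for the example, and the values in the thesis tables are not reproduced. The control loop would alternate this estimation with acting greedily on the result.

```python
from collections import defaultdict


def first_visit_mc(episodes, gamma=1.0):
    """Estimate Q(s, a) as the average first-visit return over `episodes`.

    Each episode is a list of (state, action, reward) tuples, where `reward`
    is the reward received after taking `action` in `state`.
    """
    returns = defaultdict(list)
    for episode in episodes:
        g = 0.0
        first_return = {}
        # Walk backwards so g accumulates the discounted return from each step;
        # overwriting leaves the return of the earliest (first) visit.
        for state, action, reward in reversed(episode):
            g = reward + gamma * g
            first_return[(state, action)] = g
        for pair, value in first_return.items():
            returns[pair].append(value)
    return {pair: sum(vals) / len(vals) for pair, vals in returns.items()}
```

Averaging only the first-visit return per episode means a state-action pair that recurs within one episode contributes a single sample for that episode.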

Content Delivery in Current Mobile Networks

This chapter defines ICN in the radio access network (ICN-RAN) and discusses its advantages and associated challenges. Another consequence of the GTP tunnel structure is the unicast nature of information flow in the backhaul. With this architecture, the closest a potential service provider can typically place content to the user is in the core network.

Other solutions have been proposed in the literature to intercept GTP traffic and deploy services closer to users.

Related Work

Kim [14] discusses the integration of ICN into an EPS network in the context of UE communication. The simplest communication occurs in the device-to-device (D2D) case, where the producer publishes its content directly to the consumer (13)-(14). In the second case, when D2D is not possible, the consumer sends a request to the L-CRC (1), which forwards the interest to the A-CRC (2)-(3), which then routes the interest to the local network of producers (4)-(6).

An acknowledgment is sent to the L-CRC (6') to ensure that the producer is in the correct eNodeB.

Advantages of ICN with Focus on ICN-RAN

However, the authors claim that the ICN packet can be processed as close to the user as the eNodeB. Finally, the inclusion of the CS in packet processing implies that each node in the network is able to cache content, which can reduce the delay for content retrieval. Consequently, most of the caching gains are realized when ICN is adopted throughout the network.

Interest aggregation can also lead to reductions in backhaul traffic by avoiding sending requests for the same content across the network.
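The effect of interest aggregation on upstream traffic can be illustrated with a short sketch. It assumes that all listed interests arrive while earlier ones for the same name are still pending; the function name, content names and interface labels are hypothetical.

```python
def forward_with_aggregation(interests):
    """Process (name, face) interests that arrive while earlier ones are pending.

    Returns the list of names actually forwarded upstream and the resulting
    PIT (name -> set of requesting faces).
    """
    pit = {}
    forwarded = []
    for name, face in interests:
        if name in pit:
            pit[name].add(face)      # aggregate: remember the face, send nothing
        else:
            pit[name] = {face}
            forwarded.append(name)   # first request for this name goes upstream
    return forwarded, pit
```

For three interests in which two users ask for the same segment, only two interests cross the backhaul, while the PIT still records both users as waiting for the shared segment.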

Main Issues of ICN-RAN

Furthermore, the probability of receiving identical requests increases with the number of requests [72], which means that interest aggregation yields greater gains at higher-level nodes of the network, i.e. core elements. This restriction is not necessary in ICN, as content can be dynamically requested regardless of the interface. Another dimension of the ICN-RAN deployment is how to reuse the current IP network.

The need for reuse of current infrastructure is one of the assumptions that guided the ICN-RAN deployment and testbed presented in this work.

Previously Proposed ICN-RAN Deployments

Developed Testbed

Adopted ICN Edge Deployment

Its main benefit is that it retains some of the advantages of ICN, such as aggregation in the reverse link, while maintaining compatibility with 4G UEs. This means that the user still needs an anchor and an IP configuration, which are not needed in a flat* deployment. Current infrastructure can be reused in the backhaul and core network, and ICN can be deployed in dual-stack routers.

Nevertheless, the network benefits from multicast routing and caching in the backhaul and core network, just as in the flat* deployment.

Main Software Used in the Testbed

However, mobility is not fully available with the edge deployment because the core network still needs to configure the user's IP address. Two main tools were needed to develop this testbed: one implementing the EPS network and one implementing ICN. One of the main technologies that enabled the development of OpenAirInterface (OAI) was software-defined radio (SDR) [75].

Therefore, we can build a complete testbed for prototyping new mobile network technologies, such as the one proposed in this thesis.

Testbed Implementation Details

In the downlink flow, ICN packets sent by the ICN network are routed to the previously mapped PDCP entity, which forwards the packet to the user who requested the content. Depending on the edge placement, multiple aggregation routers may be present between the eNodeB and the S-GW. The equipment for the LTE network consists of a personal computer (i7-5930K processor), an SDR (Universal Software Radio Peripheral, USRP B210), an LTE dongle (E3272), a Motorola Moto G3 2015 smartphone (not shown in the picture) and the antennas.

Another virtual machine running on the PC connects to the dongle using a USB 2.0 cable to run the applications over the LTE network.

Proofs of Concepts With the Testbed

Latency for Accessing Content at the Network Edge

The edge deployment outperforms the overlay deployment on every metric calculated. This result indicates that ICN in the eNodeB is a good option when low latency is required. Note that this is a best-case scenario for overlay deployments: in a real network the backhaul is subject to congestion, which does not occur in this test.

Multimedia Delivery Over ICN-RAN

Specifically, in the first step only user 2 requests the video; in the second, both users request it; and in the third, only user 1 does. When a local breakout is used (Figure 27b), the backhaul traffic is not affected by how many users request the video, i.e. the video is not transmitted more than once over the backhaul. At step 3, after approximately 98 seconds, the eNodeB is no longer using the backhaul.

Furthermore, when dealing with popular content, the bottleneck of the system becomes the delivery of content at the edge of the network, as the content is not sent redundantly in the backhaul.

Real-Time ICN-RAN

The backhaul is not used again until user 1 requests data not previously viewed by user 2, after approximately 118 seconds. As a result, adding a new user directly increases backhaul usage, which approximately equals the combined traffic of the two users. An example is the video player developed in Section 3.7.2, which can only display low-resolution videos in real time.

We begin by describing some network slice management studies from the literature, followed by our approach to tackling the slice admission problem.

Slice Admission Related Work

In this section, we present the details of the system model considered for the proposed reinforcement learning agent for slice admission control. In the general management loop presented in Section 4.2, one of the decisions that can be optimized is slice admission. The overall modeling of experience acquisition in the network slicing system is shown in Figure 30.

85], i.e. the resource requirements for each slice are given by the time of day and the type of slice. Future work may include evaluating the policy agent in a more challenging system context where network and slice dynamics are increased, e.g. the proportion of tenants.