DETECTING NETWORK-UNFRIENDLY MOBILES WITH THE RANDOM NEURAL NETWORK

Mobile networks are universally used for personal communications, but also increasingly used in the Internet of Things and machine-to-machine applications in order to access and control critical services. However, they are particularly vulnerable to signaling storms, triggered by malfunctioning applications, malware or malicious behavior, which can cause disruption in the access to the infrastructure. Such storms differ from conventional denial of service attacks, since they overload the control plane rather than the data plane, rendering traditional detection techniques ineffective. Thus, in this paper we describe the manner in which storms happen and their causes, and propose a detection framework that utilizes traffic measurements and key performance indicators to identify misbehaving mobile devices in real time. The detection algorithm is based on the random neural network, which is a probabilistic computational model with efficient learning algorithms. Simulation results are provided to illustrate the effectiveness of the proposed scheme.


INTRODUCTION
In mobile communications, signaling refers to the message exchanges that occur between mobile devices and a network to set up, maintain and release connections. It provides basic functions such as mobility management, radio resource control (RRC), authentication, accounting, etc., which form the control plane of the network. Recently, the number of mobile devices and applications requiring constant access to the Internet has been growing exponentially, placing greater demands on the data and signaling infrastructures of service providers. While operators benefit from such growth in data and billing-related signaling, since it directly correlates to their increased revenues and can be handled effectively through capacity engineering, they are struggling with RRC-based signaling storms caused by malfunctioning applications, malware and malicious behavior. Such storms can cause disruption in the access to the infrastructure, through sudden overload in the signaling backbone of mobile networks [4,40] and potentially the wireless bandwidth of users, and may also deplete the battery power of mobile devices [21] and increase the energy consumption of base stations and core networks (CNs) [36].
Mobile networks are also increasingly used in the Internet of Things (IoT) and machine-to-machine (M2M) applications in order to access and control critical industrial and commercial environments. As such they are a key part of our critical infrastructure, which can be compromised by these signaling storm effects. On the other hand, IoT and M2M applications could also be responsible for degrading the performance of mobile networks, because the large number of devices to be supported may act in the synchronized manner that is characteristic of signaling storms.
These and other challenges [5] require the development of new technical solutions to make mobile networks more resilient and reliable. This is particularly the case with signaling storms which are difficult to detect using traditional denial of service (DoS) defense mechanisms [3,35,47,51,52], since they overload the control plane while leaving the data plane mostly unaffected.
Thus, this paper introduces a signaling storm detection system based on the random neural network (RNN) introduced by Gelenbe in [23]. The RNN is a probabilistic computational model which was inspired by the spiking behavior of neurons, and which has a well-developed mathematical theory [23,24,31] and efficient learning algorithms for recurrent networks [25,33,38]. The RNN has been successfully applied to several problems in engineering and information sciences, including pattern recognition [9,10,30,34], classification [32], image/video processing and compression [15,16,17,29,37], DoS attack detection [35,51,52], and others that can be found in numerous reviews on the subject [26,27,61].

Contributions of the Paper
In this paper, we develop a supervised RNN-based approach for detecting mobile devices that generate excessive RRC signaling, without directly monitoring the control plane itself. In contrast to signaling-based techniques [19,28], which can be more effective but costly, the present approach intercepts packets at the edge of the mobile network using standard monitoring technologies. This offers several advantages: there is no need to decode the lower radio-related layers, the intercepted traffic is not subject to network encryption, and fewer nodes need to be monitored [60]. Moreover, the algorithm relies mainly on timestamps and packet header information to classify users, and does not require knowledge of the application generating a packet nor its service type, thus eliminating the need to use a commercial deep packet inspection tool which may result in considerable overhead. It also interacts with existing network management systems to reduce computational overhead, storage requirements and false alarm rate. The use of a supervised learning RNN is motivated by its capability of classifying known patterns such as signaling storms, whose characteristics and root causes are well understood [4,5,21,40], and also by its previous success in detecting traditional DoS attacks in the Internet [35,52].
The rest of this paper is organized as follows. We discuss the characteristics and causes of signaling storms, and review related work in Section 2. Section 3 provides a brief summary of the RNN model as applied to our problem of distinguishing between normal and misbehaving mobile devices. The core of the detection technique is presented in Section 4, including the classification process, the choice of input features, and the parameters that can influence the performance of the algorithm. In Section 5, we evaluate our detection mechanism using data generated by a detailed discrete-event mobile network simulator [39,40]; we describe the user and attack models, and present experimental results. Finally, we summarize our findings in Section 6.

The RRC Protocol
In mobile networks, the RRC protocol is used to manage resources in the radio access network (RAN). It performs functions such as setup, configuration, maintenance and release of radio bearers between the user equipment (UE) (i.e., mobile device) and the network, and carries non-access stratum signaling to the CN for mobility, session and identity management. In order to perform these functions, the RRC protocol associates to each UE a state machine in which different states have different amounts of radio resources and power consumption levels. State promotions occur when a UE sends or receives traffic, while state demotions are triggered by inactivity timers. The state machine is designed to allow efficient use of available spectrum and battery power of UEs, by freeing up resources when they are not being used, but the cost in terms of signaling load is paid during state transitions.
Typically, there are at least two RRC states: idle and connected. In the idle mode, the UE does not have a signaling connection with the network, consumes a negligible amount of energy, and its location is not known precisely by the network. Thus, traffic destined for a UE in idle mode will require paging in order to locate the UE at the cell level. In the connected mode, the UE has a signaling connection, its location is known at the level of a single cell, and it can communicate at a data rate which depends on traffic load, quality of service requirements, mobility, etc. There can be multiple sub-states within the connected mode, depending on the mobile technology employed and the specific implementation of each network operator. Figure 1 shows two possible implementations of the RRC state machines in 3G/UMTS and 4G/LTE systems, and the typical number of signaling messages exchanged within the RAN for each transition. One can observe that promotions from idle to connected are quite expensive in terms of signaling, thus motivating the introduction of sub-states in the connected mode. In UMTS, there are usually three sub-states: a low-energy cell PCH state which allows the UE to stay in the connected mode without being able to transfer data, a low bandwidth cell FACH state, and a high bandwidth cell DCH state. In LTE, the UE has the ability to go into short and long discontinuous reception (DRX) states while in the connected mode, where it sleeps most of the time and periodically wakes up to check if there is data to be transferred, with longer sleep periods in long DRX than in short DRX.
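To make the cost of state transitions concrete, the following minimal Python sketch models only the two basic states (idle and connected) with a single inactivity timer, and counts how many promotions a packet trace would cause. The class name, the timer value and the per-transition message counts are hypothetical placeholders chosen for illustration; they are not the values shown in Figure 1, and sub-states such as cell FACH or DRX are ignored.

```python
from dataclasses import dataclass


@dataclass
class TwoStateRrc:
    """Simplified idle/connected RRC machine with one inactivity timer."""
    inactivity_timer: float  # seconds of silence before demotion (assumed)
    promotion_cost: int      # messages per idle -> connected (assumed)
    demotion_cost: int       # messages per connected -> idle (assumed)

    def signaling_load(self, packet_times):
        """Return (promotions, signaling messages) for sorted packet times."""
        promotions = 0
        last = None
        for t in packet_times:
            # The UE is idle before its first packet and after any gap
            # longer than the inactivity timer, so this packet promotes it.
            if last is None or t - last > self.inactivity_timer:
                promotions += 1
            last = t
        # Each promotion is eventually followed by exactly one demotion.
        messages = promotions * (self.promotion_cost + self.demotion_cost)
        return promotions, messages


rrc = TwoStateRrc(inactivity_timer=10.0, promotion_cost=20, demotion_cost=5)
chatty = rrc.signaling_load([0, 11, 22, 33])  # gaps just above the timer
bulk = rrc.signaling_load([0, 1, 2, 3])       # same packet count, one burst
```

With these assumed costs, the four widely spaced packets cause four promotions while the four closely spaced packets cause only one, which is why signaling load depends on the timing of transmissions rather than on their volume.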

Causes of Signaling Storms
The vulnerability of mobile networks to deliberate signaling DoS attacks is not new, and much work has been done to identify intrinsic characteristics of the networks' normal operations that could be exploited, for example, paging [58], service requests [62] and RRC [45,57]. However, such threats remained largely unsubstantiated, due to (i) the lack of financial incentives for cyber-criminals to bring down the infrastructure that they use to launch profitable attacks, and (ii) the difficulty of spreading malware onto a sizeable number of devices before the proliferation of application marketplaces. This situation changed with the advent of smart devices and applications, and many operators have since experienced unintentional signaling storms that have the same effect as a DoS attack. Such storms occur when a large number of mobile devices make successive connection requests that time-out because of inactivity, triggering repeated RRC signaling to allocate and de-allocate radio channels and other resources in the network.
Poorly designed mobile applications are one of the most common triggers of signaling overloads [7] that lead to performance degradation and even network outages [18,22]. Such applications constantly poll the network even when users are inactive in order to enable continuous connectivity [50], user behavior measurement and advertisements [14]. A common issue with those "chatty", signaling-intensive applications is that developers are not familiar with the control plane of mobile networks, which prompted the mobile industry to promote best practices for developing network-friendly applications [8,20,41,43]. Similar problems have been reported with M2M systems which periodically transmit small amounts of data [44,59], motivating the development of new standards for M2M communications [1].
However, industry guidelines for developers are not sufficient, since well-designed applications could also trigger a storm, when an unexpected event occurs in the Internet. Examples of such events include outages in cloud services [12,55] and VoIP peer-to-peer networks [13], where a large number of mobile devices attempt to recover connectivity to the application servers, generating significantly more keep-alive messages [6] and an unexpectedly high signaling load in the process.
In addition, signaling storms may occur as a by-product of malicious activity that is not intended to cause a signaling DoS incident. For example, unwanted traffic in the Internet [56] (e.g., port scans, spam campaigns, etc.) can create a storm upon reaching a mobile network, which is possible because many operators [54,63] allow mobiles to be probed from the Internet, by either assigning them public IP addresses, allowing IP spoofing, or permitting device-to-device probing within the network. Large-scale mobile malware infections may also trigger a storm, if the malware exhibits frequent communications as in premium SMS diallers, spammers and adware, which are among the top encountered threats on smart devices [48]. This is confirmed by a recent analysis of mobile subscribers' traffic in China [46] which indicated a positive correlation between the frequency of signaling-intensive traffic and malicious activities such as private data upload and billing fraud. Finally, signaling storms may follow and hence prolong network outages from cyber-attacks, due to the large number of user devices that will attempt to reconnect after the service is restored [19].

Prior Work on Storm Detection and Mitigation
Online detection of deliberate signaling attacks was first studied in [45], where connection inter-setup times for each mobile are estimated from IP metrics in order to detect the intention of a remote host to launch an attack. A general framework for anomaly detection was presented in [13] based on time-series analysis and change detection algorithms. While the goal of [13,45] is to identify large-scale events by aggregating and analyzing statistics from all hosts and mobile users, respectively, our approach aims to identify users that are contributing to a problem, namely signaling overload, rather than detect the problem itself. The work in [42] considered the detection of mobile-initiated signaling attacks via a supervised learning approach, which monitors transmissions that trigger a radio access bearer setup procedure, and extracts from the corresponding packets features relating to destination IP and port numbers, packet size and response-request ratio. We utilize similar attributes in our approach, but we do not assume knowledge of the effect that a packet has on the control plane (i.e., whether it has triggered a connection setup procedure), thus simplifying the deployment of our solution in operational networks. In a previous work, [28], we developed a technique which directly monitors the control plane of each active mobile device; it counts the number of successive signaling transitions that do not utilize allocated bandwidth, and temporarily blocks devices that exceed a certain threshold to avoid overloading the network. Although such a signaling-based approach can be more effective in detecting and mitigating storms, it requires changes to network equipment and/or protocols [19,49] which can be slow and costly to implement.
A number of commercial solutions have also started to appear in response to recent incidents of signaling storms, which can be classified into three groups. First, anomaly detection and mitigation systems [19] such as [28] and the one presented in this paper. Second, air interface optimization solutions which aim to increase the number of simultaneously connected devices in the access network. Such solutions are constantly evolving with new standards, specifications and proprietary admission/congestion control and scheduling algorithms added all the time; our approach operates on top of and is complementary to such air interface technologies. Third, dedicated signaling infrastructures to handle the expected growth in CN signaling due to policies, charging, mobility management and other new services offered for the first time in LTE networks. However, it is expected that routing, congestion management and load balancing in the CN will be less of an issue, with the trend towards network functions virtualization that will enable dynamic resource scaling as required by network load.

THE RNN
The RNN is a biologically inspired computational model, introduced by Gelenbe [23], in which neurons exchange signals in the form of spikes of unit amplitude. In the RNN, positive and negative signals represent excitation and inhibition respectively, and are accumulated in neurons. Positive signals are canceled by negative signals, and neurons may fire if their potential is positive. A signal may leave neuron $i$ for neuron $j$ as a positive signal with probability $p^+_{ij}$, as a negative signal with probability $p^-_{ij}$, or may depart from the network with probability
$$d_i = 1 - \sum_j \left[ p^+_{ij} + p^-_{ij} \right].$$
Thus, when neuron $i$ is excited, it fires excitatory and inhibitory signals to neuron $j$ with rates:
$$w^+_{ij} = r_i p^+_{ij} \ge 0, \qquad w^-_{ij} = r_i p^-_{ij} \ge 0,$$
where $r_i$ is the firing rate of neuron $i$. The steady-state probability that neuron $i$ is excited is given by:
$$q_i = \frac{\Lambda_i + \sum_j q_j w^+_{ji}}{r_i + \lambda_i + \sum_j q_j w^-_{ji}},$$
where $\Lambda_i$ and $\lambda_i$ denote the rates of exogenous excitatory and inhibitory signal inputs into neuron $i$, respectively. A gradient descent supervised learning algorithm for the recurrent RNN has been developed in [25]. For an RNN with $n$ neurons, the learning algorithm estimates the $n \times n$ weight matrices $W^+ = \{w^+_{ij}\}$ and $W^- = \{w^-_{ij}\}$ from a training set comprising $K$ input-output pairs $(X, Y)$. The set of successive inputs to the algorithm is $X = (x^{(1)}, \ldots, x^{(K)})$, where $x^{(k)} = (\Lambda^{(k)}, \lambda^{(k)})$ are the pairs of exogenous excitatory and inhibitory signals entering each neuron from outside the network:
$$\Lambda^{(k)} = (\Lambda^{(k)}_1, \ldots, \Lambda^{(k)}_n), \qquad \lambda^{(k)} = (\lambda^{(k)}_1, \ldots, \lambda^{(k)}_n).$$
The successive desired outputs are $Y = (y^{(1)}, \ldots, y^{(K)})$, where the $k$th vector $y^{(k)} = (y^{(k)}_1, \ldots, y^{(k)}_n)$ holds the desired output value of each neuron. The update procedure requires a matrix inversion operation for each neuron pair $(i, j)$ and input $k$, which can be done in time complexity $O(n^3)$, or $O(mn^2)$ if an $m$-step relaxation method is used, and $O(n^2)$ for feed-forward networks.
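As an illustration of how the steady-state excitation probabilities can be evaluated numerically, the sketch below iterates the standard RNN relation, in which the probability $q_i$ equals the total excitatory arrival rate at neuron $i$ divided by the sum of its firing rate $r_i$ and its total inhibitory arrival rate. This is a plain fixed-point iteration written for this example, not the learning algorithm of [25] nor the implementation of [2]; the function name, the clamp of each probability to 1, the stopping tolerance and the toy weights in the usage lines are all assumptions, and convergence is not guaranteed for arbitrary weights.

```python
def rnn_steady_state(w_plus, w_minus, r, Lambda, lam, max_iter=1000, tol=1e-12):
    """Iterate q_i = N_i / D_i until the values stop changing.

    w_plus[j][i], w_minus[j][i]: excitatory/inhibitory rates from j to i.
    r[i]: firing rate of neuron i (must be positive).
    Lambda[i], lam[i]: exogenous excitatory/inhibitory arrival rates.
    """
    n = len(r)
    q = [0.0] * n
    for _ in range(max_iter):
        new_q = []
        for i in range(n):
            num = Lambda[i] + sum(q[j] * w_plus[j][i] for j in range(n))
            den = r[i] + lam[i] + sum(q[j] * w_minus[j][i] for j in range(n))
            # Clamp to 1 so that q_i remains a valid probability.
            new_q.append(min(1.0, num / den))
        if max(abs(a - b) for a, b in zip(new_q, q)) < tol:
            return new_q
        q = new_q
    return q


# Toy example: neuron 0 receives exogenous input and excites neuron 1.
q = rnn_steady_state(
    w_plus=[[0.0, 1.0], [0.0, 0.0]],
    w_minus=[[0.0, 0.0], [0.0, 0.0]],
    r=[1.0, 1.0],
    Lambda=[0.5, 0.0],
    lam=[0.0, 0.0],
)
```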
Figure 2 shows the basic architecture of the packet-switched domain of mobile networks, which consists of the following elements: the UEs; the RAN which is connected to the CN through a backhaul network; the mobile gateway which allows packets to be exchanged with external networks such as the Internet; and the operation and support system (OSS) which provides network management functions such as monitoring, configuration, service provisioning, etc. Also shown in Figure 2 is the positioning of the proposed detection system within the mobile network, which intercepts packets directed to/from the network gateway; in 3GPP standards, the user data transported over this interface are encapsulated in GTP-U (a simple IP-based tunneling protocol) packets. The detector also utilizes information from the OSS to reduce search space and optimize performance, and periodically produces a list of anomalous users to the OSS for root cause analysis and mitigation. The detection algorithm consists of three data processing stages: (i) user filtering and parameter selection based on network configuration settings and key performance indicators (KPIs) related to signaling load on various network components, (ii) feature generation and (iii) user classification with a trained RNN model. For reasons that should become apparent, we describe these different data processing steps in a logical order rather than the order in which they happen during run time.

Online RNN Classification
The RNN-based algorithm monitors the activity of a set of mobile devices, specified by the data filter, and calculates expressive features that describe various characteristics of the users' behavior. Time is divided into slots, each of duration Δ seconds, in which summary statistics of several quantities related to the IP traffic of each user are collected. The algorithm stores the most recent w sets of measurements, and uses them to compute the current values of the input features, that is, the features for time slot τ are computed from measurements obtained for time slots τ, τ − 1, . . . , τ − (w − 1) so that the observation window of the algorithm is W = wΔ. Let $z^{(\tau)}$ denote a measured or calculated quantity for time slot τ, then the $i$th input feature $x^{(\tau)}_i$ is obtained by applying a statistical function $\phi_i$ of the following form:
$$x^{(\tau)}_i = \phi_i\left(z^{(\tau)}, z^{(\tau-1)}, \ldots, z^{(\tau-w+1)}\right).$$
Hence, by employing different operators $\phi_i$ on different statistics z stored over the observation window of w slots, it is possible to capture both instantaneous (i.e., sudden) and long-term changes in the traffic profile of a user. In our experiments, we have applied a number of simple statistical functions including:
• The mean and standard deviation of z across the entire window.
• An exponential moving average filter in which the current feature is computed as $x^{(\tau)}_i = \alpha z^{(\tau)} + (1 - \alpha) x^{(\tau-1)}_i$, where α is some constant 0 < α < 1 typically close to 1, with higher values discounting older observations faster.
• Shannon entropy which measures the uncertainty or unpredictability in the data; it is defined as $x^{(\tau)}_i = -\sum_{t=\tau-w+1}^{\tau} p_{z^{(t)}} \log p_{z^{(t)}}$, where $p_{z^{(t)}}$ is the probability of observing data item $z^{(t)}$ within the window, which can be estimated from the histogram of the data. Entropy is typically interpreted as the minimum number of bits required to encode the classification of a data item, thus a small entropy indicates deterministic behavior which is often associated with signaling anomalies [22,55].
• Anomaly score based on how close the measured quantities are to a range of values considered to be suspicious.
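The first three statistical functions listed above can be sketched in a few lines of Python. The function names are hypothetical, and several details are assumptions made for this example: the standard deviation is the population form, the entropy is computed in bits over the distinct bins of a fixed-width histogram (the standard Shannon form), and the bin width is a free parameter that the text does not specify.

```python
import math
from collections import Counter


def window_mean_std(z):
    """Mean and (population) standard deviation of the window values."""
    m = sum(z) / len(z)
    var = sum((v - m) ** 2 for v in z) / len(z)
    return m, math.sqrt(var)


def ema(prev_feature, z_now, alpha):
    """Exponential moving average; alpha near 1 favors the newest slot."""
    return alpha * z_now + (1 - alpha) * prev_feature


def window_entropy(z, bin_width):
    """Shannon entropy (bits) of the window, estimated from a histogram."""
    bins = Counter(math.floor(v / bin_width) for v in z)
    total = len(z)
    return -sum((c / total) * math.log2(c / total) for c in bins.values())


# A perfectly regular window has zero entropy; a spread-out one does not.
regular = window_entropy([6.5, 6.5, 6.5, 6.5], bin_width=1.0)
varied = window_entropy([0.5, 1.5, 2.5, 3.5], bin_width=1.0)
```

In this sketch a window of four identical values yields an entropy of zero, while four values falling in four different bins yield two bits, matching the intuition that low entropy signals deterministic, machine-like behavior.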
Once the input features for a slot have been computed, they are fused using a trained feed-forward RNN architecture such as the one presented in Figure 3 to yield the final decision. The input neurons receive the features computed for the current time slot as exogenous excitatory signals, while all exogenous inhibitory signals are set to zero. The output neurons correspond to the probabilities of the input pattern belonging to either of two traffic classes (i.e., attack or normal). The final decision about the traffic observed in the time slot is determined by the ratio of the two output nodes, which is $q_{14}/q_{15}$ in the figure: it is classified as attack if the ratio is greater than 1 and normal otherwise. We have used an implementation of the RNN provided in [2].
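The decision step can be illustrated with a small layer-by-layer evaluation of a feed-forward RNN followed by the ratio test. This is only a sketch: it is not the implementation of [2], the function names are hypothetical, the first output is assumed to be the attack neuron, and the tiny two-input network in the usage lines uses made-up weights rather than trained ones. Comparing the two outputs directly is equivalent to testing whether their ratio exceeds 1, and avoids a division by zero.

```python
def forward(features, r_in, layers):
    """Propagate excitation probabilities through a feed-forward RNN.

    features: exogenous excitatory rates of the input neurons
              (exogenous inhibitory rates are zero).
    r_in:     firing rates of the input neurons.
    layers:   list of (w_plus, w_minus, r), where w[j][i] is the rate
              from neuron j of the previous layer to neuron i of this one.
    """
    q = [min(1.0, x / r) for x, r in zip(features, r_in)]
    for w_plus, w_minus, r in layers:
        nxt = []
        for i in range(len(r)):
            excite = sum(qj * w_plus[j][i] for j, qj in enumerate(q))
            inhibit = sum(qj * w_minus[j][i] for j, qj in enumerate(q))
            nxt.append(min(1.0, excite / (r[i] + inhibit)))
        q = nxt
    return q


def classify(features, r_in, layers):
    """Label a slot as 'attack' when q_attack / q_normal > 1."""
    q_attack, q_normal = forward(features, r_in, layers)
    return "attack" if q_attack > q_normal else "normal"


# Hypothetical, untrained weights: each input excites one output
# and inhibits the other.
toy_layers = [([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]], [1.0, 1.0])]
label = classify([0.8, 0.2], r_in=[1.0, 1.0], layers=toy_layers)
```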

Feature Selection
Selecting highly informative features for any classification problem is one of the most important parts of the solution. The features that we wish to use should capture the RRC signaling dynamics of users, be easy to measure or calculate without high computational or storage cost, and reflect both the instantaneous behavior and the long term trend of the traffic. For each mobile under observation, we extract information related to inter-arrival times, lengths and destination IP addresses of packets, which have been suggested previously [13,42,56] as good indicators of signaling misbehavior. The specific features that we have used are described below.

Inter-arrival Time.
RRC signaling occurs whenever the UE sends or receives packets following an inactivity period that exceeds an RRC timer. Thus, the volume of traffic exchanged by a UE does not map directly into signaling load, which is more influenced by the frequency of intermittent transmissions. To capture this coupling between the data and RRC signaling planes, we define a burst as a collection of packets whose inter-arrival times are less than δ seconds, where δ is smaller than the RRC timers, which are typically on the order of a few seconds. Thus, for a sequence of packets whose arrival instants are $\{t_1, t_2, \ldots\}$, we group all packets up to the $n$th arrival into a single burst, where $n = \inf\{i : t_{i+1} - t_i > \delta\}$, and then proceed in a similar manner starting from the $(n + 1)$th packet arrival. Note that a burst may not necessarily generate signaling, even if it arrives after the time-out, due to potential network delays that may modify inter-arrival times of packets. However, packets within a single burst are likely not to trigger any control plane messages, while inter-arrival times of bursts will be correlated with the actual signaling load generated by the UE. In this manner, we remove any bias regarding the volume of traffic sent or received by the UE, and put more emphasis on the frequency of potentially resource-inefficient communications.
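The burst definition translates directly into a single pass over the sorted arrival times. In the sketch below the function names are hypothetical, and the inter-burst time is measured between the first packets of consecutive bursts, which is one possible reading of "inter-arrival times of bursts".

```python
def group_bursts(arrivals, delta):
    """Split sorted arrival times into bursts separated by gaps > delta."""
    bursts = []
    for t in arrivals:
        if bursts and t - bursts[-1][-1] <= delta:
            bursts[-1].append(t)   # gap small enough: same burst
        else:
            bursts.append([t])     # first packet, or gap exceeds delta
    return bursts


def inter_burst_times(bursts):
    """Time between the first packets of consecutive bursts (assumption)."""
    return [b[0] - a[0] for a, b in zip(bursts, bursts[1:])]


bursts = group_bursts([0, 1, 2, 10, 11, 30], delta=3)
gaps = inter_burst_times(bursts)
```

For the example trace with δ = 3 s, the six packets collapse into three bursts, so the feature computation sees three potentially signaling-relevant events rather than six packets.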
The features based on the times between bursts are then calculated as follows. The algorithm stores the mean and standard deviation of the inter-burst times in each slot; then, using the most recent w values, it computes (i) the entropy of these average values, (ii) a moving average of the standard deviations and (iii) a moving average of an anomaly score for the average values, computed based on the RRC timer T in the high bandwidth state. In particular, the anomaly score $a(z^{(t)})$ of the average inter-burst time in slot t is set to zero when $z^{(t)} < T$, reflecting the fact that such closely spaced bursts may not have generated many RRC transitions; it is high when $z^{(t)}$ is slightly larger than T, indicating potentially resource-inefficient bursts; and it drops quickly when $z^{(t)}$ is a few seconds larger than T. We obtain this effect using, for example, a Pareto or gamma density function that asserts that $z^{(t)}$ must be greater than T, but not too much greater, which can be controlled by adjusting the parameters of the density function.
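One way to obtain an anomaly score with this shape is a gamma density shifted to start at the timer value T. The sketch below is an assumption-laden illustration: the function name and the shape and scale values are hypothetical and were not taken from the experiments; with a shape of 2 the score is zero at T, peaks one scale unit above T, and then decays.

```python
import math


def anomaly_score(z, T, shape=2.0, scale=1.0):
    """Gamma density evaluated at (z - T); zero for z below the timer T.

    shape and scale are illustrative values, not tuned parameters.
    """
    if z < T:
        return 0.0
    x = z - T
    norm = math.gamma(shape) * scale ** shape
    return x ** (shape - 1) * math.exp(-x / scale) / norm


T = 6.0                            # example timer value in seconds
below = anomaly_score(5.0, T)      # spacing shorter than the timer
near = anomaly_score(7.0, T)       # slightly above the timer
far = anomaly_score(12.0, T)       # several seconds above the timer
```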

Packet Size.
The packet size distribution for a normal device can be markedly different from that of a device that runs a misbehaving application. For example, when signaling storms occur due to unexpected events in the Internet such as cloud outages [13,55], the client application attempts to reconnect to its servers more frequently, causing significant increase in the number of TCP SYN packets sent by the user. This in turn changes the randomness associated with the size of packets, and can be used to identify misbehaving mobiles in the event of a storm. Our algorithm computes the average size of packets sent by a UE within each slot, and evaluates a feature based on the entropy of the most recent w measurements.

Burst Rate.
Another obvious characteristic of signaling storms is the sudden sustained rate acceleration of potentially harmful bursts generated by a misbehaving user. Moving average of the burst rate per slot and entropy of the rates across the observation window are used as features in order to capture, respectively, the frequent and repetitive nature of nuisance transmissions. Furthermore, a misbehaving application may change the traffic profile of a user in terms of the ratio of received and sent bursts (known as response ratio), as in the case of the outage induced storm described above where many SYN packets will not generate acknowledgments. Hence, we also use as a feature the mean of the response ratios within the window of w slots.

Destination Address.
The number of destination IP addresses for a normally functioning mobile device can be very different from that of an attacker [42], whether the attack originates from the mobile network due to a misbehaving application, or from the Internet as in the case of unwanted traffic reaching the mobile network [56]. In the former, the number of destination IP addresses will be very small compared to the frequency of bursts, while in the latter this number is high. Thus, we calculate the percentage of unique destination IP addresses contacted within each time slot, and use the average of the most recent w values as a feature.
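A minimal sketch of this feature is given below. The text does not state what the percentage is taken relative to; the example assumes one destination address is recorded per burst and divides the number of distinct addresses by the number of bursts in the slot. The function names and the addresses are illustrative only.

```python
def unique_destination_pct(dest_ips):
    """Percentage of distinct destinations among a slot's bursts.

    dest_ips holds one destination address per burst (assumption).
    """
    if not dest_ips:
        return 0.0
    return 100.0 * len(set(dest_ips)) / len(dest_ips)


def destination_feature(slots):
    """Average of the per-slot percentages over the observation window."""
    values = [unique_destination_pct(s) for s in slots]
    return sum(values) / len(values)


# A device that keeps contacting one server scores low in every slot.
repetitive = destination_feature([["10.0.0.1"] * 4, ["10.0.0.1"] * 4])
# A device that probes a new address with each burst scores high.
scanning = destination_feature([["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]])
```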

User Filtering and Parameter Selection
Information about the "health" of network servers is typically available to mobile operators in the form of KPIs, which can be fed to the algorithm to determine the users that should be monitored (e.g., those attached to signaling overloaded parts of a network). Also, using KPIs the detector can be switched off when signaling loads are below a certain threshold, effectively eliminating the need to continuously analyze users' traffic. In the following, we summarize the parameters of the RNN algorithm and discuss how they should be selected adaptively, based on both KPIs and RRC configurations, and also how the choice of each parameter influences the performance of the detector:
• Slot size Δ: This defines the resolution of the algorithm and the frequency at which classification decisions are made. It should be long enough for the measured statistical information to be significant, but not so long that the algorithm reacts slowly to attacks. In our experiments we set Δ = 1 min.
• Window size W = wΔ: This determines the amount of historical information to be included in a classification decision. The choice of the window size presents a trade-off between speed of detection and false alarm rate, since a small window makes the algorithm more sensitive to sudden changes in the traffic profile of a user, which in turn increases both detection and false alarm rates. This trade-off can be optimized by adjusting W according to the level of congestion in the control plane, with shorter windows for higher signaling loads to enable the algorithm to quickly identify misbehaving UEs. The value of w used in our experimental results is 5, but we also experimented with other values which confirmed the aforementioned observations.
• Maximum packet inter-arrival time within a burst δ: This should be selected based on the RRC timers, so that potentially resource-inefficient transmissions can be tracked. In our simulations of a UMTS network, the timers in cell DCH and cell FACH states are set to, respectively, $T_1 = 6$ s and $T_2 = 12$ s based on [53]. We have evaluated different values of δ in $0.5 \min(T_1, T_2) < \delta < \min(T_1, T_2)$, which all led to similar detection performance, but training time tends to drop as δ is increased within this range. This is because the time between malicious transmissions in our training dataset is slightly greater than $\min(T_1, T_2)$, so that the difference between attack and normal traffic becomes more pronounced in the features as δ approaches this value, leading to shorter training times. However, while large values of δ will reduce computational overhead, they may result in the loss of valuable information due to traffic aggregation.
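The parameter choices discussed above can be collected in a small configuration object, as in the following sketch. The slot size, window length and timer values are those reported in the text; the default burst gap of 4 s is merely one value inside the evaluated range, and the load threshold and shortened window used by the adaptation helper are hypothetical, since the text describes the adaptation only qualitatively.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class DetectorParams:
    slot_seconds: float = 60.0   # slot size (1 min in the experiments)
    window_slots: int = 5        # w used in the experiments
    burst_gap: float = 4.0       # delta; illustrative value in the range
    t_dch: float = 6.0           # cell DCH timer T1, seconds
    t_fach: float = 12.0         # cell FACH timer T2, seconds

    @property
    def window_seconds(self):
        """Observation window W = w * slot size."""
        return self.window_slots * self.slot_seconds

    def burst_gap_in_range(self):
        """Check 0.5 * min(T1, T2) < delta < min(T1, T2)."""
        t_min = min(self.t_dch, self.t_fach)
        return 0.5 * t_min < self.burst_gap < t_min


def adapt_window(params, signaling_load, high_load=0.8, short_window=3):
    """Use a shorter window under heavy load (thresholds are assumptions)."""
    if signaling_load >= high_load:
        return replace(params, window_slots=short_window)
    return params


params = DetectorParams()
congested = adapt_window(params, signaling_load=0.9)
```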

EXPERIMENTS AND RESULTS
In this section, we evaluate the performance of our detection technique using the mobile network simulator developed in [39,40]. We first present the traffic models that characterize the normal user behavior, and two attack models that represent both malicious and misbehaving UEs. Then we discuss the results of applying the algorithm on the dataset produced by the simulator.
Since the impact of signaling storms on mobile networks has been analyzed extensively in [4,21,40], the objective of the present simulation setup is to evaluate the performance of our detection algorithm, and therefore only a small scenario has been considered. In particular, we simulated 200 3G/UMTS UEs in an area of 2 × 2 km² which is covered by 7 base stations connected to a single radio network controller (RNC). The CN consists of the SGSN and the GGSN (the mobile gateway), which is connected to 37 Internet hosts acting as application servers, five of which serve instant messaging (IM) and two of which are contacted by the attacking UEs.

Model of the User
The user model consists of three popular mobile services that are active simultaneously in order to create realistic traffic profiles. The model can also support a diurnal pattern for UE behavior, where the UE is active for a certain duration every 24 h, and is inactive the rest of the time during which the user does not generate or respond to traffic. This pattern represents the day/night cycle of users, and can be varied from one user to another based on a random distribution.

Web Browsing.
The interactive web browsing behavior is based on the self-similar traffic model described in [40] and assumes a Zipf-like distribution for web server popularity, which has been widely used in the literature since it was first suggested in [11].

Instant Messaging.
IM applications are characterized by frequent, small data transmissions and a long tail distribution representing messages with media rich contents such as videos and photos. The IM application model consists of two distinct but related parts: message generator and responder. Each UE generates messages to chosen destinations, and also responds to received messages with a given probability. The message generator works based on sessions and waves. A session represents the duration that the user is actively generating messages, and consists of one or more waves where the messages are actually sent. At each wave, the user generates and sends one or more messages, the number and length of which are configurable with random distributions, to a single destination (mobile user) chosen at random. The time between waves within a session, the session duration and the time between user sessions are all given by random distributions. On the other hand, the UE responds to each received message with a given probability, and this response behavior is independent of message generation, and can occur both inside and outside of the user's IM sessions.
The final destination of a message can be another mobile in the same network (explicitly simulated) or a mobile in another network; mobiles in different networks are represented by one or more servers in the simulation, which act on behalf of these users. Regardless of its final destination, each message passes through an Internet chat server, which forwards the message to its final destination, that is, another mobile user. We simulate multiple chat servers representing popular chat applications and services such as WhatsApp, Skype, etc., and currently assume that each message belongs to a chat application that is chosen uniformly at random from the available applications. The simulation model supports more generic message-to-application assignment based on other random distributions.

Short Message Service. The SMS application operates in the same manner as the IM application, but differs in that a single intermediate server within the mobile network, namely the SMSC server, handles all SMS messages for that network. SMS messages also differ from IM messages in their types, which can be in-network mobile, out-network mobile, premium, etc. In-network mobiles are naturally represented by the UEs explicitly simulated, while all other destinations are represented by servers outside the simulated mobile network, with one or more servers representing each class. Therefore, the type of a sent or received SMS can be inferred from its source and destination addresses (numbers). The type of the SMS message that the UE generates is chosen at random based on the parameters of the SMS application. Note that while SMS traffic affects the signaling behavior of users, it is not monitored by our detection system.

Attack Model
We consider two types of cell DCH attacks which overload the control plane by causing superfluous promotions to the high bandwidth cell DCH state. The first attack is aggressive in the sense that a malicious device knows when an RRC state transition occurs, and launches the next attack once a demotion from the high to the low bandwidth state is detected. To perform the attack, we assume that the attacker has inferred the values of the RRC timers, and is monitoring the user's activity in order to estimate when a transition occurs so as to trigger a new one immediately afterwards. However, there could be an error between the actual transition time and the estimated one, which we represent by an exponentially distributed random variable with mean 2 s. When the attacker "thinks" that a transition has occurred, it sends high data rate traffic to one of its Internet servers in order to cause a promotion to the high bandwidth state. This model is used mainly for training the RNN.
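The timing of the aggressive attacker can be sketched as follows. The sketch assumes that the estimation error always delays the attack relative to the true demotion instant; the text does not state the sign of the error, so this is an assumption, and the function name and demotion times are hypothetical.

```python
import random

def estimated_attack_times(rng, demotion_times, mean_error=2.0):
    """Attack instants: true demotion time plus an exponential error (mean in seconds)."""
    # Assumption: the attacker's estimate lags the actual transition.
    return [t + rng.expovariate(1.0 / mean_error) for t in demotion_times]

rng = random.Random(42)
demotions = [10.0 * k for k in range(1, 20001)]  # hypothetical demotion instants (s)
attacks = estimated_attack_times(rng, demotions)
errors = [a - d for a, d in zip(attacks, demotions)]
observed_mean = sum(errors) / len(errors)  # should be close to 2 s
```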
The second attack type is based on a poorly designed mobile application or operating system that sends periodic messages whenever the user is inactive, with the transmission period set to be slightly larger than the cell DCH timer in order to increase the chances of triggering state transitions. This behavior represents, for example, the case where a pull mechanism is used to fetch updates periodically, and the update period happens to "synchronize" with the RRC timer. However, unlike aggressive attackers, this behavior does not guarantee the generation of signaling traffic for each data transfer, since (i) it only starts when local user activity stops but there can be downlink traffic that may have restarted the timer at the signaling server; and (ii) the data volume may not be large enough to trigger a promotion to cell DCH state. In both cases, the periodic transmissions may become completely out of sync with the RRC state machine, therefore not generating significant signaling traffic.
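To see why a period slightly larger than the cell DCH timer is harmful, consider the following simplified sketch of the inactivity-timer logic. It assumes a two-state view (low bandwidth versus cell DCH), that every transmission is large enough to require cell DCH, and a hypothetical timer value of 6 s; none of these values come from the simulator.

```python
def count_promotions(send_times, dch_timer):
    """Count promotions to the high bandwidth state for the given send times."""
    promotions = 0
    last_activity = None  # time of the most recent transmission
    for t in send_times:
        # The UE is in the low bandwidth state at the start, or whenever the
        # inactivity timer has expired since the last transmission.
        if last_activity is None or t - last_activity >= dch_timer:
            promotions += 1
        last_activity = t  # any transmission restarts the inactivity timer
    return promotions

DCH_TIMER = 6.0  # hypothetical inactivity timer (s)
slightly_longer = [6.5 * k for k in range(10)]  # period just above the timer
shorter = [5.0 * k for k in range(10)]          # period below the timer

promotions_longer = count_promotions(slightly_longer, DCH_TIMER)
promotions_shorter = count_promotions(shorter, DCH_TIMER)
```

Under these assumptions, every one of the ten transmissions with the longer period triggers a promotion, whereas the shorter period keeps the UE in cell DCH after the first promotion.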
The two distinct attack models allow us to represent both malicious and benign behaviors that may lead to a storm. The first is clearly distinguishable and separable from the behavior of a normal user, in terms of both temporal and traffic volume characteristics. The second attack model, on the other hand, captures the signaling behavior of legitimate applications and operating systems, which is much closer to "attack" than to "normal" behavior but is difficult to detect from user plane dynamics. Thus we use this model to test the performance of our algorithm.

Results
The RNN algorithm provides, at the end of each time slot, the probabilities that the input features belong to attack and to normal behavior, and the final decision about the traffic is then determined by the ratio of the two output nodes: it is classified as attack if the ratio is greater than 1 and as normal otherwise. Figure 4 shows the classifier output (top) and the actual RRC state transitions (bottom) of a misbehaving UE as captured during a simulation run. It can be observed that when the malfunctioning application is active, the number of state transitions increases significantly, with most transitions occurring between the cell FACH and cell DCH states in this attack scenario. This alternating behavior causes excessive signaling load in the mobile network while generating a predominantly normal traffic volume, rendering traditional DoS defense techniques ineffective. However, our detection mechanism is able to track the RRC state transitions of the UE very accurately, and to identify quickly when excessive signaling is being generated, despite the fact that it does not directly monitor these transitions but rather infers them from the features that we have described. One can also observe that the classifier's output sometimes drops close to 1 during an attack epoch, which is attributed to other normal applications generating traffic at those time instants, thus reducing the severity of the attack. As mentioned earlier, the detection speed and tolerance to signaling misbehavior can be adjusted by modifying the size of the observation window, which in this scenario is set to 5 min. Figure 5 shows results when there is no attack, where the number of state transitions in a given period is small and is due to normal traffic generated and received by the UE. In this case, as one would expect, the classifier does not generate any alarms regarding the signaling behavior of the UE.
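The decision rule on the two output nodes can be written as a short sketch. The names q_attack and q_normal for the two RNN outputs are hypothetical, and the handling of a zero denominator is an assumption added only to keep the example well defined.

```python
def classify(q_attack, q_normal):
    """Label a time slot from the two output-node values of the classifier."""
    if q_normal <= 0.0:
        # Assumption: with no evidence for "normal", any positive attack
        # output is treated as an attack to avoid dividing by zero.
        return "attack" if q_attack > 0.0 else "normal"
    ratio = q_attack / q_normal
    return "attack" if ratio > 1.0 else "normal"

label_a = classify(0.8, 0.3)  # ratio above 1
label_b = classify(0.2, 0.6)  # ratio below 1
label_c = classify(0.5, 0.5)  # ratio exactly 1 is not an attack
```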
Next we examine in Figure 6 how our algorithm performs when presented with a normal user that generates moderately more state transitions than the average normal user in the simulations. Interestingly enough, the classifier outputs a single alarm (out of 360 samples) when the corresponding state transitions are indeed excessive. Since the detection algorithm is supposed to be active only when there is a signaling overload condition, such classification decisions may not always be considered as false alarms, as the goal would be to identify users that are causing congestion, regardless of whether they are attacking deliberately or not.
Finally, Figure 7 illustrates the accuracy of our classifier, namely the proportion of correct decisions (both true positives and true negatives) out of all test samples. The figure shows results for 50 UEs, where each data point represents the average of 360 classification decisions taken during the simulation experiment, which lasted for 6 hours (note that the resolution of the detector is Δ = 1 min). The evaluation is based on a strict criterion whereby, for each UE, we assume that if it generates at least 1 attack packet within a time slot, then the corresponding output of the classifier should be greater than 1; otherwise a false classification decision is declared. The results indicate an accuracy between 88% and 98%, with an average of 93% over the 50 test cases. This fluctuation can be attributed to the fact that our algorithm does not classify an attack as such until a few time slots have passed (depending on the number of slots w within the window), and therefore misbehaving UEs with many silent periods will produce higher false positives; fortunately, these less aggressive UEs will generate lower signaling load. In fact, the fraction of normal instances that have been mistakenly classified as attack (false positive rate) is zero for all but one case, while the fraction of attack instances that have been correctly identified (true positive rate) is on average 90%, which can be improved, at the cost of higher false positives, by reducing the window size W.
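The strict evaluation criterion and the resulting metrics can be sketched as follows. The per-slot classifier outputs and attack packet counts below are made-up values used only to exercise the code; they are not simulation results.

```python
def evaluate(outputs, attack_packets):
    """Compute accuracy, true positive rate and false positive rate per UE."""
    tp = fp = tn = fn = 0
    for ratio, n_pkts in zip(outputs, attack_packets):
        predicted_attack = ratio > 1.0   # classifier decision for the slot
        actual_attack = n_pkts >= 1      # strict criterion: at least 1 attack packet
        if predicted_attack and actual_attack:
            tp += 1
        elif predicted_attack:
            fp += 1
        elif actual_attack:
            fn += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else None,
        "tpr": tp / (tp + fn) if (tp + fn) else None,
        "fpr": fp / (fp + tn) if (fp + tn) else None,
    }

# Number of decisions per UE: 6 hours at a resolution of 1 min.
n_slots = (6 * 60) // 1

# Made-up example: five slots, three of which contain attack packets.
result = evaluate(outputs=[0.4, 1.8, 2.5, 0.9, 0.3],
                  attack_packets=[0, 3, 1, 2, 0])
```

In this toy example one attack slot is missed, so the accuracy is 4/5 and the true positive rate is 2/3, while the false positive rate is zero.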

CONCLUSIONS
This paper proposed an online approach for detecting mobile devices that contribute to signaling overload, based on the RNN [23,25]. The method relies on the analysis of data packets traversing the mobile CN, which can be performed using standard traffic monitoring tools, in order to infer the signaling dynamics of users. This has two advantages: lower control plane layers need not be decoded or decrypted, and fewer nodes need to be monitored. In the algorithm, summary statistics about the behavior of each observed mobile device are collected and stored in a moving window at fixed time intervals (slots), from which a number of features are calculated to capture both sudden and long term changes in the user's signaling behavior. The features for the most recent time slot are subsequently fused using a trained RNN to produce the final classification decision. Using a discrete-event mobile network simulator, we have shown that our technique achieves a high detection rate (a true positive rate of 90% on average) with almost zero false alarms. The proposed approach is also flexible, providing a number of parameters to optimize the trade-off between detection speed, accuracy and overhead, based on signaling overload conditions in the network.