Scalable Traffic Classification in Internet of Things (IoT) for Network Anomaly Detection

TECHNICAL REPORT

Grantee	Victoria University of Wellington
Project Title	Scalable Traffic Classification in Internet of Things (IoT) for Network Anomaly Detection
Amount Awarded	USD34,000
Dates covered by this report:	2018-12-31 to 2018-12-31
Report submission date	2019-09-30
Economies where project was implemented	Taiwan, New Zealand
Project leader name	Winston Seah
Project Team	Adrian Pekar, Bryan Ng, Alvin Valera
Partner organization	National Chiao Tung University, Taiwan; University of Cauca, Colombia

Project Summary

This project focuses on accurate traffic classification in the Internet of Things (IoT). The IoT comprises large numbers of heterogeneous simple devices running single applications, often with little to no security features, making them easily compromised and used as tools in cyberattacks. As we become more connected and reliant on the Internet, any disruption in connectivity due to network anomalies can have adverse consequences, ranging from loss of productivity and revenue to destruction of critical infrastructure and loss of life. In the last decade, cyberattacks have increased at an alarming rate, even judging by reported incidents alone. We need to be able to classify new traffic types coming from IoT devices accurately and promptly, so that anomalous traffic can be identified and dealt with quickly. Although payload-based (PB) techniques can reach high accuracy, they suffer from several limitations. These limitations are expected to be addressed by statistical-based (SB) techniques, which rely on flow features and classify traffic using machine learning algorithms (MLAs). SB classification assumes that specific flow-level features such as flow duration, inter-arrival time, transmitted bytes, packet length and packet size can distinguish different types of traffic flows. We studied how unsupervised machine learning can be applied to network anomaly detection in the dynamic IoT environment, where previously unencountered traffic types and patterns regularly emerge and need to be identified and classified. This project involves the study and selection of appropriate MLAs (to be implemented as a proof-of-concept prototype) and identification of those flow features which have the highest impact on traffic classification accuracy. This project contributes to making safer the cyber-physical systems that are an integral component of the IoT.

Project factsheet information
Background and justification
Project Implementation
Project Evaluation
Gender Equality and Inclusion
Project Communication Strategy
Recommendations and Use of Findings
Bibliography

Background and Justification

As we become more connected and reliant on the Internet, any disruption in connectivity due to network anomalies can have adverse consequences, ranging from loss of productivity and revenue to destruction of critical infrastructure and loss of life. In the last decade, cyberattacks have increased at an alarming rate, even judging by reported incidents alone. In 2016, Distributed Denial-of-Service (DDoS) attacks generated more than 1 Tbps of traffic and shut down several popular online services (Frost & Sullivan, Jan 2017). The more worrying issue is that these DDoS attacks are originating from Internet of Things (IoT) devices (Rodriguez, 2016). The IoT comprises large numbers of heterogeneous simple devices running single applications, often with little to no security features, making them easily compromised and used as vectors in DDoS attacks. This is exactly what happened in the September 20, 2016, DDoS attack against a popular information security blog called “Krebs On Security” (Krebs, 2016). This poses an entirely new challenge to network anomaly detection and response.

Firewalls are the main line of defence against cyberattacks, filtering traffic based on pre-specified policies and identifying traffic patterns that match known threats. With the diverse applications used in today’s networks generating voluminous heterogeneous traffic data, organizations are also turning to big data and analytics for anomaly detection (Borthick, 2015). However, new threats can only be detected after the damage is already done, making this approach more useful for forensics than for real-time response. Firewalls are increasingly expected to perform accurate network anomaly detection, that is, to find patterns in traffic data that do not conform to normal behaviour and may indicate potential threats (Frost & Sullivan, Mar 2017). Network anomalies can be non-malicious but still have the potential to disrupt network operations. For example, a misconfigured network device can cause broadcast storms, or a temporary link failure can cause traffic to be re-routed, leading to transient congestion. Network tomography, a popular technique for inferring internal network characteristics from data obtained at the endpoints, is increasingly being used for such purposes (Mardani and Giannakis, 2016).

To deal with these new challenges, we need more intelligent and ideally self-learning approaches. In this respect, we intend to study how unsupervised machine learning can be applied to network anomaly detection in the dynamic IoT environment, where previously unencountered traffic types and patterns regularly emerge and need to be identified and classified. This proposal fits within the Digital Futures theme of Victoria University of Wellington, which is one of the eight distinctive, multi-disciplinary themes that span the university. We work closely with other academic institutions (viz. National Chiao Tung University, Taiwan and University of Cauca, Colombia) and with industry to ensure that our research outputs are shared and benefit the community. In line with current practice, software prototypes are open-sourced so that interested parties can utilize and customize them to meet their requirements. In return, they are expected to reciprocate by sharing their outcomes. Training of postgraduate students will be an important element of this proposal. Throughout the project, students pursuing postgraduate degrees will be heavily involved in the research as well as the implementation of the proof-of-concept prototypes.

Project Implementation

This project proposed to achieve three objectives:

To be able to learn to classify new traffic types as they emerge and evolve; this is achieved through unsupervised machine learning that does not require labelled data sets for training;
To perform flow-based classification using flow features without having to inspect the contents of the packets; this approach can scale to the immense volume of envisaged IoT traffic;
Through the previous two objectives, to be able to detect network anomalies early and allow network providers / administrators to take remedial actions before the anomalies develop into full-blown problems.

Objective 1 relates to a technique that is continuously increasing in popularity for traffic classification (TC), namely, machine learning (ML). It involves the coupling of flow statistics with learning-based approaches to classify network traffic (Zhang et al. 2015). Statistical classification uses an underlying probability model to calculate the probability of a case belonging to a class. Most of the state-of-the-art ML-based TC approaches claim high accuracy and performance (Kim et al. 2008; Williams et al. 2006; Auld et al. 2007). However, there are generally two issues:

They achieve these results in a static, offline environment;
They need to be trained on labelled data sets, i.e. data sets that have been pre-processed to identify the types of network anomalies to detect.

To address the first issue, we assessed the effectiveness of ML approaches in real network scenarios. We evaluated five statistical classifiers on a physical network testbed and observed that they did not classify traffic as accurately as when tested using offline datasets. It is important to note that the poor classification performance concerns the classifiers’ ability to detect malicious flows; in contrast, each classifier was successful in identifying non-malicious flows. Looking beyond the performance of the classifiers, more attention needs to be paid to the networking environments where ML is deployed, because not enough thought is given to how ML algorithms are affected by the network environment they operate in. The results of this study were presented at the IM 2019 conference.

The second issue is more difficult to address but more critical, as new traffic types are constantly introduced into the network by IoT devices. This task was undertaken by a new PhD student, Murugaraj Odiathevar (Muru, for short). To identify new traffic types, we proposed a two-stage dynamic filter to rapidly identify new and unknown network anomalies. We refer to this as a Hybrid Online Offline System for Network Anomaly Detection, involving the following two stages:

Stage 1: Incoming data to the network is passed through a large offline reference data set that contains information on known malware parameters. A ML algorithm determines if the incoming data shows similar characteristics to known malware and provides an initial response. This is the first filter.
Stage 2: The incoming data is then passed through a smaller online dataset (a dynamic repository of new signals and indicators of malware). If the online dataset also indicates malware characteristics, the incoming data is prevented from entering the network. If the online dataset identifies new characteristics of malware, these parameters are then updated in the offline dataset.
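As an illustration only, the two stages above can be sketched as a toy filter. This is not the HyOnOffSys implementation: the class name HybridFilter, the radius cut-off, the Euclidean-distance matching (standing in for the ML algorithm) and the four outcome labels are all assumptions made for this sketch.

```python
from math import dist


class HybridFilter:
    """Toy two-stage filter. Distance matching is a stand-in for the ML model."""

    def __init__(self, offline_refs, online_refs, radius=1.0):
        # Large reference set describing known malware (hypothetical feature vectors).
        self.offline_refs = [tuple(r) for r in offline_refs]
        # Small, frequently updated set of new indicators (hypothetical).
        self.online_refs = [tuple(r) for r in online_refs]
        # Assumed similarity cut-off; not taken from the report.
        self.radius = radius

    def _similar(self, refs, x):
        return any(dist(x, r) <= self.radius for r in refs)

    def check(self, x):
        stage1 = self._similar(self.offline_refs, x)  # first filter (offline set)
        stage2 = self._similar(self.online_refs, x)   # second filter (online set)
        if stage1 and stage2:
            return "block"
        if stage2:
            # Only the online set matched: treat as new malware characteristics
            # and feed them back into the offline reference set.
            self.offline_refs.append(tuple(x))
            return "block-and-learn"
        # Assumed handling when only the offline stage raises a flag.
        return "flag" if stage1 else "allow"


f = HybridFilter(offline_refs=[(0.0, 0.0)], online_refs=[(0.0, 0.5), (5.0, 5.0)])
print(f.check((0.1, 0.1)), f.check((5.0, 5.2)), f.check((9.0, 9.0)))
```

In this sketch the feedback path is a simple append; how the real system updates its offline dataset is not specified here.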

The idea was presented at the ICCCN 2019 conference and has also been filed for IP protection with the Australian Patent Office.

Objective 2 aims to develop more efficient and scalable TC approaches. The first step in TC is to measure and evaluate various characteristics of the traffic. This provides options to examine and classify the network traffic from various aspects. Classic classifiers such as port-based classification or Deep Packet Inspection (DPI), also known as payload-based classification, rely on capturing and analysing individual packets. Port-based classification uses fewer resources than DPI, but has become inaccurate due to the increasing number of applications using dynamic ports, e.g. P2P applications like BitTorrent (Nguyen and Armitage, 2008). DPI, while not affected by the issue of dynamic ports, has other drawbacks. Protocol encapsulation, encrypted transmission and privacy concerns make the deployment of DPI highly limited or inaccurate (Dyer et al. 2013). In both cases, collecting information about traffic on a per-packet basis in current networks is a resource-demanding process. Scaling up tools to function in heavily trafficked networks involves making heavy trade-offs between efficiency and accuracy. The most resource-intensive part of the classification process is capturing information from each packet in the system. The resource overheads could be dramatically reduced by looking at groups, or flows, of packets rather than isolated ones.
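To make the notion of a flow concrete, the following minimal sketch groups packets by the conventional 5-tuple and derives a few of the flow-level features mentioned in this report (duration, inter-arrival time, transmitted bytes, packet size). It is not the flowRecorder code: the function name flow_features, the packet tuple layout and the choice of unidirectional flows are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def flow_features(packets):
    """Group packets into unidirectional flows and compute per-flow features.

    packets: iterable of (timestamp, src, dst, sport, dport, proto, length).
    This layout is a hypothetical input format, not a capture-file parser.
    """
    flows = defaultdict(list)
    for ts, src, dst, sport, dport, proto, length in packets:
        flows[(src, dst, sport, dport, proto)].append((ts, length))

    features = {}
    for key, pkts in flows.items():
        pkts.sort()  # order by timestamp
        times = [t for t, _ in pkts]
        sizes = [n for _, n in pkts]
        gaps = [b - a for a, b in zip(times, times[1:])]
        features[key] = {
            "packets": len(pkts),
            "bytes": sum(sizes),
            "duration": times[-1] - times[0],
            "mean_iat": mean(gaps) if gaps else 0.0,  # mean inter-arrival time
            "mean_size": mean(sizes),
        }
    return features


sample = [
    (0.0, "10.0.0.1", "10.0.0.2", 1234, 80, "tcp", 100),
    (0.5, "10.0.0.1", "10.0.0.2", 1234, 80, "tcp", 300),
    (1.5, "10.0.0.1", "10.0.0.2", 1234, 80, "tcp", 200),
    (0.2, "10.0.0.3", "10.0.0.4", 5353, 53, "udp", 60),
]
for flow_key, stats in flow_features(sample).items():
    print(flow_key, stats)
```

Only one small record per flow is kept rather than per-packet state, which is the source of the resource savings described above.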

We first created a network traffic flow feature measurement tool, called flowRecorder, to extract flow information from raw network data captures. This tool is available from GitHub - https://github.com/drnpkr/flowRecorder. We then addressed the problem of heavy-hitter traffic identification, which is regarded as a prevalent network anomaly. Heavy-hitter (HH) detection has two major processes: flow measurement, in which packets are organised into flows, and flow marking, which assigns the ‘HH’/‘non-HH’ labels to the individual flows. Label assignment is based on a threshold. If a flow exceeds the threshold, it is marked as an HH; otherwise, it is assigned a non-HH label. While this detection is relatively simple, in practice several challenges can arise, such as the relatively high overhead caused by the measurement and the granularity of traffic flow semantics needed to achieve high accuracy (Mogul et al. 2010).

Another challenge is threshold estimation. Existing HH detection systems use either a static (Curtis et al. 2011) or an adaptive (Liu et al. 2017) threshold for classifying the flows. There is an ongoing discussion regarding the accuracy and efficacy of the various thresholds that existing approaches utilise. The threshold selection strategy of a given approach is usually conditioned by traffic and performance characteristics that are very specific to its network. To date, however, there is no generally accepted and widely recognised uniform threshold for HH detection, and no systematic approach has been taken to determine such a threshold. We analysed four different datasets, specifically UNIV1, UNIV2, CAIDA2016, and CAIDA2018, to specify a common threshold for HH detection that performs well in a number of network and traffic conditions. Our analysis followed the six-step Knowledge Discovery Process (KDP) methodology developed by Cios et al. (2007). We employed Silhouette analysis (Wang et al. 2017) to determine the optimal number of clusters and used K-means for clustering. Based on the obtained results, there is no threshold that clearly separates flows into HHs and non-HHs. The flow sizes have a diverse character that leads to more than two natural clusters.

We stress that threshold selection should include a detailed analysis of the network and its traffic. A threshold that performs optimally in one network may underperform in another (Benson et al. 2011). TCP and UDP HH flows should also be classified using different threshold values. Furthermore, classification accuracy can be optimised by combining more than one threshold (Awduche et al. 2002). Our approach to identifying HHs is based on per-flow packet size distribution (PSD) and template matching (TM). PSD is able to capture the behaviour and dynamics of network traffic, while TM (Gorcin and Arslan 2012) is used for pattern recognition. The figure below shows the system architecture of our approach.
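The combination of per-flow PSD and template matching can be illustrated with a minimal sketch. It is not the HHdTM implementation: the bin width, the 1500-byte maximum size, histogram intersection as the similarity measure and the synthetic templates are all assumptions made here. Only the use of the first 14 packets follows the figure reported in the evaluation table of this report.

```python
def psd(sizes, bin_width=100, max_size=1500, first_n=14):
    """Normalised packet size histogram over the first `first_n` packets.

    bin_width and max_size are assumed values chosen for illustration.
    """
    sizes = sizes[:first_n]
    bins = [0.0] * (max_size // bin_width)
    for s in sizes:
        # Clamp so that full-size packets land in the last bin.
        bins[min(s, max_size - 1) // bin_width] += 1
    total = len(sizes)
    return [b / total for b in bins] if total else bins


def similarity(p, q):
    """Histogram intersection: 1.0 for identical PSDs, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(p, q))


def classify(sizes, templates):
    """Return the label of the template whose PSD best matches the flow."""
    candidate = psd(sizes)
    return max(templates, key=lambda label: similarity(candidate, templates[label]))


# Hypothetical templates; real ones would be learned from labelled flows.
templates = {
    "HH": psd([1500] * 14),
    "non-HH": psd([60] * 10 + [576] * 4),
}
print(classify([1460] * 12 + [60] * 2, templates))
print(classify([60] * 14, templates))
```

No size or rate threshold appears in the decision: the label comes entirely from which template the early-packet distribution resembles most.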

As network traffic continues to grow with the emergence of the IoT, information visualisation is gaining popularity as a useful tool for TC by turning all forms of potentially obscure data into an image-based representation. In this respect, we apply the angular histogram visualisation technique (Geng et al. 2011) on network traffic flow measurement data in order to derive an information-rich overview of large data sets for improving the interpretation and understanding of network traffic flow measurement data.

To the best of our knowledge, no such work currently exists in the domain of network and traffic management.

Objective 3 focuses on initiating contact with end users of this project’s outcomes to understand their specific needs and to customize the techniques and tools that we have developed accordingly. In this respect, we have already established contact with our university’s IT Services (ITS). Based on our discussions, they are keen to explore how our Hybrid Online Offline Detection system can detect new anomalous traffic that has been let through by their firewall. While the system is designed for detecting network anomalies, it is a generic design that can be customised for other applications. For example, we are working with our Taiwan collaborators to apply it to anomaly detection based on host behaviours (logs collected from all hosts to be protected).

Lack of interest in pursuing postgraduate studies is a prevalent problem here in Wellington, New Zealand, as the IT industry provides abundant job opportunities. Consequently, the majority of the postgraduate students in our school are from overseas. To overcome this problem, as soon as we signed the contract for this project with ISIF Asia, we advertised the opportunities offered by this project through various channels, e.g. relevant mailing lists, our university’s scholarships website, the NZ scholarships website for international students, and our research group’s website, among others. We also offered smaller projects to undergraduate students within our faculty. There were many overseas applicants interested in the offered scholarship (regardless of the research topic), but none were qualified. Alejandra Duque Torres contacted us expressing interest in working on HH detection using ML techniques as an intern in our research group. Alejandra is a Masters student from University of Cauca, Colombia, and eventually spent 9 months with us. We supported her expenses during her stay in NZ with the funds from this project.

Project Activities	Input	Output	Timeline	Status w.r.t. project
ML-based approach for classifying new unknown network anomalies	PhD student funded by Victoria Doctoral Scholarship	Hybrid Online Offline System (HyOnOffSys) for Network Anomaly Detection; 1 conference paper; 1 patent filed.	Jul 2018 – Dec 2018	Work for this project has been completed
Flow-based traffic classification – flow measurement tool to extract flow information from raw network data	Postdoctoral Fellow (10% time); funded by another grant	flowRecorder - network traffic flow feature measurement tool; proof-of-concept available via Github	Jan 2018 – Mar 2018	Completed
Flow-based traffic classification – network traffic visualization	4th year student in Bachelor of Engineering programme	Angular Histogram based Visualization (AHV) of network traffic flow; 1 conference paper	Mar 2018 – Nov 2018	Completed
Flow-based traffic classification – HH detection	Intern from University of Cauca, CO	HH detection using template matching (HHdTM); 1 conference paper (best paper award candidate) and 2 journal papers under review.	Jul 2018 – Dec 2018 (funded by ISIF Asia) Jan 2019 – Mar 2019 (funded by another grant)	Completed
Application of network anomaly detection tools to real networks	Tools from this project: HyOnOffSys; HHdTM; AHV;	Contacted university’s IT Services to apply these tools for early detection of network anomalies, especially those that have not been detected by the firewall.	Oct 2018 – Dec 2018	Completed engagement stage;
	Tools from this project: HyOnOffSys; HHdTM; AHV;		Post project	Customizing tools for end users’ needs.

Project Evaluation

Overall, this project has been a success and of great value to Victoria University of Wellington (VUW), in particular the School of Engineering and Computer Science (ECS). This project has also helped us establish a valuable new linkage with University of Cauca, Colombia, which was not originally planned.

Despite the initial delays in recruiting the team members, we have been able to achieve all our objectives and, in some cases, exceed them. In terms of manpower training, we managed to train three students instead of the one originally planned. The research outcomes in terms of publications have also exceeded our original plan.

Significant contributions to the project outcomes came from the two female members, namely Alejandra Duque Torres (intern from University of Cauca) and Mona Ruan (ECS final-year undergraduate student). This is a major achievement in terms of gender equality, as STEM is known to attract fewer female than male students. From the perspective of ethnic and national diversity, this project again performed well (as noted in the Gender Equality and Inclusion section).

The work completed by this project has only scratched the surface of the problem and there is opportunity for further development. The VUW PhD student Muru is working on two parallel strands: (i) extending the Online Offline Model from a centralized (cloud-based) architecture to a distributed (Edge-based) architecture, and (ii) customising features for different applications. The Masters student Alejandra Torres is planning to continue to PhD studies at Victoria University of Wellington after completing her Masters candidature at University of Cauca, Colombia. The undergraduate student, Mona Ruan, who worked on the project as part of her Bachelor of Engineering (BE) degree in ECS/VUW, has joined a local company that utilizes the skills she gained from participating in the project.

The outputs of this project will be shared with the community through academic publications and the sharing of source code; the software is being released gradually via GitHub. At the same time, the knowledge and skills developed will be channelled back into the larger research programme.

As noted in the implementation section, recruiting the required skilled manpower for the project was a key issue. More time should be set aside for manpower recruitment, possibly starting as soon as the project is awarded, even before contracting is completed. Waiting for the contracting to be completed was a conservative approach but cost the project valuable time. Fortunately, we were able to allocate some manpower from other ongoing projects, e.g. Dr Adrian Pekar, to work on parts of the project at the early stages.

As a whole, this project fits within the Digital Futures theme of Victoria University of Wellington, which is one of the eight distinctive, multi-disciplinary themes that span the university. This project is also part of a larger research effort led by the Project Leader, and the outcomes have helped in our skills development.

Indicators	Baseline	Project activities related to indicator	Outputs and outcomes	Status
Evaluate the accuracy and applicability of Machine Learning (ML) methods in live networks	Published work mostly reports results from ML methods tested in offline (non-live) scenarios for detecting malicious traffic	Studied the performance of various ML methods in a live network testbed.	Results showed that ML methods applied in live networks identify non-malicious traffic well but detect malicious flows less accurately than in offline tests.	Completed in Mar 2018 and submitted paper to IM2019; paper was accepted and presented at conference.
Ability of ML techniques to adapt to changing network conditions and detect new unknown network anomalies.	Current ML methods need to be re-trained with new data of new unknown anomalies to be able to detect them.	Designed a new approach that can dynamically learn to detect new unknown anomalies while adapting to changing normal conditions.	Our approach achieves over 95% accuracy on known anomalies and over 60% detection rate on most of the unknown anomalies.	Work done during project (Jul-Dec 2018) described in paper that was presented at ICCCN2019; more research being carried out.
Heavy-hitter traffic detection remains a challenge with changing network traffic conditions.	Current HH detection relies on thresholds, yet there is no consistent way to determine thresholds.	Design HH detection that does not rely on thresholds and generalizes over different network conditions.	A method that uses template matching on per-flow packet size distribution to identify HHs, achieving 96% accuracy using only the first 14 packets of a flow.	Completed. Paper submitted to LCN2019 has been selected as a Best Paper Award candidate. Two more papers submitted to journals.
For flow-based classification, flow features need to be extracted from raw network traffic data.	Flow feature extraction done in ad hoc manner with tools customized for specific needs.	Developed tool to extract key fields in headers to identify flows, and compute other key info like duration, packet size distributions, etc.	flowRecorder tool that is able to extract key information on packet size distribution of flows.	Available from GitHub.

Gender Equality and Inclusion

The key participants in this project come from diverse cultural and ethnic backgrounds, including New Zealand Pakeha (European ethnicity), NZ Asian, Singaporean and Malaysian Chinese, Filipino, Hungarian/Slovakian, and South American (Colombia). The two students who worked on this project are female students who contributed significantly to the outcomes. The key challenge in NZ is the lack of good manpower, especially in the IT industry. The NZ candidate who was originally selected to work on this project decided to join the industry instead of pursuing postgraduate studies. We were fortunate enough to find suitable replacements, viz. Mona B.H. Ruan (NZ Asian) and Alejandra Duque Torres (Colombia), both female, to work on the project.

Project Communication Strategy

The project is also listed on the Wireless Networks Research Group’s website. See: https://ecs.victoria.ac.nz/Groups/WiNe/WirelessNetworksResearchGroup#Projects.

This being a research project, our key communication and dissemination strategy is to present our work at conferences and publish in relevant technical journals. In this respect, we have the following:

Project Leader Winston Seah presented at the APRICOT 2019 conference on Tuesday at 16:30 to 18:00 (UTC +09:00) during the session on Tools. The title of the presentation is “Clustering-based Analysis for Heavy-Hitter Flow Detection” which discusses the research done during the first half of this project where we identified the issues and challenges of heavy-hitter flow detection. Link to Tools session: https://2019.apricot.net/program/schedule/#/day/9/tools.
A paper on the use of visualization for traffic classification has been presented at the 33rd International Conference on Advanced Information Networking and Applications (AINA-2019), Kunibiki Messe, Matsue, Japan, March 27-29, 2019. This paper “Angular Histogram-Based Visualisation of Network Traffic Flow Measurement Data” is based on the Bachelor of Engineering degree Honours (4th) Year project of Mona B.H. Ruan who studied the use of network visualization techniques for traffic classification. Link: https://doi.org/10.1007/978-3-030-15032-7_30.
Another paper "Traffic Classification with Machine Learning in a Live Network" describing an experimental evaluation of machine learning algorithms for traffic classification has been presented at the 16th IFIP/IEEE Symposium on Integrated Network and Service Management (IM 2019), Washington, DC, USA, April 8-12, 2019, in the Experience Session. Link: https://ieeexplore.ieee.org/document/8717890.
A paper on the concept of the “Hybrid Online Offline System for Network Anomaly Detection” was presented at the 28th International Conference on Computer Communication and Networks (ICCCN), Valencia, Spain, 29 July-1 August 2019. Link: https://doi.org/10.1109/ICCCN.2019.8847011.
We released a network traffic flow feature measurement tool, called flowRecorder, which extracts flow information from raw network data captures. This tool is available from GitHub - https://github.com/drnpkr/flowRecorder.

The Masters research of Alejandra Duque Torres, an intern from the University of Cauca, Colombia, has produced significant outcomes. Her research topic is “An Adaptive Threshold to Identify Elephants and Mice Flows in SDN”, which addresses a prevalent network anomaly: heavy-hitter or “elephant” flows. Her research has resulted in the following papers, which are in different stages of submission and publication:

Heavy-Hitter Flow Identification in Data Centre Networks using Template Matching, one of three Best Paper Award candidates, to be presented at the 44th IEEE Conference on Local Computer Networks (LCN), October 14-17, 2019, Osnabrück, Germany. See LCN 2019 technical program, LCN Plenary 1 - Best Paper Candidates, link: https://www.ieeelcn.org/Program_technical.html.
Knowledge Discovery: Can it shed new light on a fundamental question in heavy-hitter detection? - submitted to the IEEE Transactions on Knowledge and Data Engineering, 13 June 2019.
Per Flow Packet Size Distribution for Heavy Hitter Flow Detection - submitted to IEEE Network Magazine, 10 August 2019.

Recommendations and Use of Findings

There is a lot of interest (bordering on hype) in the use of machine learning (ML) in various applications, including network traffic classification (TC), especially for threat detection and management. However, due to the computational complexity of ML algorithms, achieving reliable TC with ML remains poorly studied, especially in live networks. Most, if not all, studies are based on offline scenarios using collected network traffic traces, which do not represent the real challenges of applying the same ML techniques in a real network environment, where network traffic can be lost due to congestion or, for that matter, due to the very network anomaly being monitored.

The first step would be dealing with the challenges of a real network, where there is a need to balance/tradeoff the response time (how fast an anomaly can be detected) and the network traffic data acquisition process (how much traffic data to collect for processing). There are a lot of lessons to be learnt and experience gained from setting up a live network testbed and conducting experiments on as realistic a testbed as possible.

As new traffic types and anomalies appear all the time, it is a challenge to know what is considered normal and what is anomalous. The TC system must be able to detect and learn new traffic profiles without having to be manually re-trained all the time. In this respect, sharing of network traffic datasets is critical. It would be beneficial for projects to share their data for other projects to use and, in return, to receive the results obtained from their shared datasets.

From an administrative perspective, the report format is somewhat onerous for the amount of funding provided, at least in the New Zealand context, where a yearly report of less detail is required for much larger funding amounts. It would be better to have different reporting styles/formats to suit the different project types. The reporting system should also have a function to download/print a copy of the report in PDF format, showing all the details, as the current system limits the view of the contents to the boxes.

Bibliography

T. Auld, A. W. Moore, and S. F. Gull, “Bayesian Neural Networks for Internet Traffic Classification,” IEEE Transactions on Neural Networks, vol. 18, no. 1, pp. 223–239, Jan 2007.
D. Awduche, A. Chiu, A. Elwalid, I. Widjaja, and X. Xiao, “Overview and Principles of Internet Traffic Engineering,” in Proc. 21st IEEE Int. Conf. on Computer Communications Workshops, ser. NOMEN’02, 2002, pp. 357–362.
T. Benson, A. Anand, A. Akella, and M. Zhang, “MicroTE: Fine grained traffic engineering for data centers,” Proceedings of the Seventh Conference on Emerging Networking Experiments and Technologies (CoNEXT), pp. 1–8, 2011.
S. Borthick, “Countering Cyber Attacks with Big Data and Analytics,” Big Data & Analytics (BDA), vol. 3, no. 6, Frost & Sullivan, June 2015.
Frost & Sullivan, “Cyber Security Predictions for 2017—an Asia-Pacific Perspective,” March 2017.
K. J. Cios, W. Pedrycz, R. W. Swiniarski, and L. A. Kurgan, Data Mining: A Knowledge Discovery Approach. Berlin, Heidelberg: Springer-Verlag, 2007.
R. Curtis, W. Kim, and P. Yalagandula, “Mahout: Low-overhead datacenter traffic management using end-host-based elephant detection,” in Proc. 30th IEEE Int. Conf. on Computer Communications, ser. INFOCOM’11, Apr. 2011, pp. 1629–1637.
K. P. Dyer, S. E. Coull, T. Ristenpart, and T. Shrimpton, “Protocol misidentification made easy with format-transforming encryption,” in Proceedings of the 2013 ACM SIGSAC Conference on Computer and Communications Security, CCS ’13, (New York, NY, USA), pp. 61–72, ACM, 2013.
Frost & Sullivan, “The Global Network Firewall Market – The Expanding Role of Firewall Sustains Market Growth,” Market Engineering, K140-74, January 2017.
Gartner, “Gartner Says 8.4 Billion Connected "Things" Will Be in Use in 2017, Up 31 Percent From 2016,” February 7, 2017. Online: http://www.gartner.com/newsroom/id/3598917
Z. Geng, Z.S. Peng, R. Laramee, J.C. Roberts and R. Walker, “Angular histograms: Frequency-based visualizations for large, high dimensional data,” IEEE Transactions on Visualization and Computer Graphics, vol. 17, no. 12, pp. 2572–2580, 2011.
A. Gorcin and H. Arslan, “Template matching for signal identification in cognitive radio systems,” MILCOM 2012 - 2012 IEEE Military Communications Conference, Orlando, FL, USA, 2012, pp. 1-6. Online: https://doi.org/10.1109/MILCOM.2012.6415770
Matt Hayes, Bryan Ng, Adrian Pekar and Winston K.G. Seah, “Scalable Architecture for SDN Traffic Classification”, IEEE Systems Journal, 18 April 2017.
B. Krebs, “KrebsOnSecurity Hit With Record DDoS,” 21 September 2016, online: http://krebsonsecurity.com/2016/09/krebsonsecurity-hit-with-record-ddos/.
H. Kim, K. Claffy, M. Fomenkov, D. Barman, M. Faloutsos, and K. Lee, “Internet Traffic Classification Demystified: Myths, Caveats, and the Best Practices,” in Proceedings of the 2008 ACM CoNEXT Conference, Madrid, Spain, 9-12 December 2008, pp. 11:1–11:12.
Z. Liu, D. Gao, Y. Liu, H. Zhang and C. H. Foh, “An adaptive approach for elephant flow detection with the rapidly changing traffic in data center network,” Int. J. of Network Management, vol. 27, no. 6, e1987, Jul. 2017.
M. Mardani and G. B. Giannakis, “Estimating traffic and anomaly maps via network tomography,” IEEE/ACM Transactions on Networking, vol. 24, no. 3, pp. 1533-1547, June 2016.
J. C. Mogul, J. Tourrilhes, P. Yalagandula, P. Sharma, A. R. Curtis and S. Banerjee, “Devoflow: Cost-effective flow management for high performance enterprise networks,” in Proc. 9th ACM SIGCOMM Workshop on HotNets, ser. HotNets’10, Monterey, California: ACM, 2010, pp. 1–6.
T. T. T. Nguyen and G. Armitage, “A survey of techniques for internet traffic classification using machine learning,” IEEE Communications Surveys & Tutorials, vol. 10, pp. 56–76, Fourth Quarter 2008.
C. Rodriguez, “IoT Risk Becomes Real: DDoS Emerges as Primary Threat Vector for IoT,” Stratecast Perspectives & Insight for Executives (SPIE), vol. 16, no. 41, Frost & Sullivan, 11 November 2016.
F. Wang, H.-H. Franco-Penya, and Kelleher, “An analysis of the application of simplified silhouette to the evaluation of k-means clustering validity,” in 13th International Conference on Machine Learning and Data Mining MLDM, ser. MLDM’17, New York, USA, 2017.
N. Williams, S. Zander, and G. Armitage, “A Preliminary Performance Comparison of Five Machine Learning Algorithms for Practical IP Traffic Flow Classification,” SIGCOMM Comput. Commun. Rev., vol. 36, no. 5, pp. 5–16, Oct. 2006.
J. Zhang, X. Chen, Y. Xiang, W. Zhou, and J. Wu, “Robust Network Traffic Classification,” IEEE/ACM Transactions on Networking, vol. 23, no. 4, pp. 1257–1270, Aug. 2015.

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License