Contribution of the French MetroSec j t t t ffi li d t ti project to traffic
Transcription
Contribution of the French MetroSec j t t t ffi li d t ti project to traffic
Contribution of the French MetroSec project j t tto traffic t ffi anomalies li d detection t ti Philippe OWEZARSKI LAAS-CNRS LAAS CNRS Toulouse, France [email protected] With the contribution of Patrice Abry, Pierre Borgnat, Nicolas Larrieu, Antoine Scherrer, Silvia Farraposo FRnOG, Paris, France, 30 mai 2008 1 Outline 4 Traffic characteristics and IDS 4 A non Gaussian and long memory model for Internet traffic with anomalies 4 Model validation with traffic traces (with and without anomalies) 4 Anomalies/DDoS attacks detection 8With the non Gaussian and long g memory y model 8Using deltoids 4 Ongoing g g and Future work FRnOG, Paris, France, 30 mai 2008 2 Motivation 4 Traffic anomalies ((on a link)) 4 One or several occurrencies that change the way traffic is flowing in the network 4 Consequences 4 Performance P f decrease d 4 QoS degradation g FRnOG, Paris, France, 30 mai 2008 Existing work 4 Several p projects j on traffic anomalies detection arised in the past 4 They rely in general on simple statistics on traffic characteristics 4 But they lack by a bad knowledge on traffic characteristics ÆLimit d efficiency ÆLimited ffi i n FRnOG, Paris, France, 30 mai 2008 Known traffic characteristics 4 Non Gaussian, non Poisson statistics 4 Long Range Dependence (LRD), Strong correlations 4 Traffic can look different according to the granularity of observation 4 And… d g y variable ! …Traffic is highly FRnOG, Paris, France, 30 mai 2008 5 Profile based IDS issues Traffic ff p profiles f in IDS do not consider such variability False p positive rate is high Æ Impossible p to fix reliable thresholds Temporal evolution of the number of TCP/SYN packets A traffic profile cannot be based only on some averages (non Gaussian) Æ High level statistics are required FRnOG, Paris, France, 30 mai 2008 6 Marginal laws 4 Distributions of empirical p probabilities p Δ=4ms 4 4 Δ=32ms LBL-TCP-3 Δ=256ms Poisson model? Exponential law? Gaussian? What aggregation level to select? FRnOG, Paris, France, 30 mai 2008 7 Traffic Correlation (SRD and LRD) Hurst parameter parameter, H = 0.641 0 641 FRnOG, Paris, France, 30 mai 2008 8 What model for a non Gaussian and long memory process ? FRnOG, Paris, France, 30 mai 2008 9 Non Gaussian with LRD model J i t modelling Joint d lli of f 11stt and d 2nd orders d statistics t ti ti 4 Packet or aggregated gg g count process: p XΔ((k)) XΔ(k) = #pkt during [kΔ, (k+1)Δ] 4 Bytes aggregated count process: WΔ(k) WΔ(k) = #bytes during [kΔ, [kΔ (k+1)Δ] 1 1st. PDF PDFs of f marginals i l as gamma laws l Note: one fit for each Δ 2nd. Covariance (or spectrum) with LRD Covariance of a farima model FRnOG, Paris, France, 30 mai 2008 10 Gamma distributions Γα , β ( x ) 1 = β Γ (α ) ⎛ ⎜x ⎜β ⎝ ⎞ ⎟ ⎟ ⎠ α −1 ⎛ x ⎜ exp ⎜ − ⎝ β ⎞ ⎟ ⎟ ⎠ Shape parameter α : can model from Gaussian to exponential ; 1/ α ≈ distance to Gaussian Scale parameter β : multiplicative factor FRnOG, Paris, France, 30 mai 2008 11 Long memory from a farima model 4 Long g range g dependence p covariance is a non-summable power-law Æ spectrum fXΔ(ν): fXΔ(ν) ∼ C|ν|−γγ, |ν|Æ0 |ν|Æ0, with 0<γ<1 4 Farima F 1. 2 2. = fractionnaly f l integrated d ARMA Fractional integration with parameter dÆLRD with γ=2d Sh t range correlation Short l ti of f an ARMA(1, ARMA(1 1) Æparameters θ, φ − i 2πν 2 1− e −i 2πν − 2 d 2 ( ) = ε 1− e 2 − i πν 2 XΔ f ν FRnOG, Paris, France, 30 mai 2008 σ θ 1 − φe 12 Monitoring platform DAG ENST LIP6 QoSMOS Jussieu RIPE TTM INT ENST-B ENS Lyon Mont de Marsan EURECOM Pau IUT GTR LAAS FRnOG, Paris, France, 30 mai 2008 13 Traces for validation Data Date (start time) T (s) Network link # Pkts (106) IAT (ms) Repository PAUG 1989-08-29(11:25) 2620 LAN(100BaseT) 1 2.6 ita.ee.lbl.gov/index.html LBL-TCP-3 1994-01-20(14:10) 7200 WAN(100BaseT) 1.7 4 ita.ee.lbl.gov/index.html AUCK-IV 2001-04-02(13:00) 1080 0 WAN(OC3) 9 1.2 wand.cs.xaikato.ac.nz/wand/wits CAIDA 2002-08-14(10:00) 600 Backbone(OC48) 65 0.01 www.caida.org/analysis/workload /oc48/ UNC 2003-04-06(16:00) 3600 WAN(100BaseT) 4.6 0.8 www-dirt.cs.unc.edu/ts METROSEC-ref1 2004-12-09(18:30) 5000 LAN(100BaseT) 3.9 1.5 www.laas.fr/METROSEC METROSEC-ref2 2004-12-10(02:00) 9000 LAN(100BaseT) 2.1 4.3 www.laas.fr/METROSEC METROSEC METROSECDDoS 2004-12-09(20:00) 2004 12 09(20:00) 9000 LAN(100BaseT) 69 6.9 13 1.3 www laas fr/METROSEC www.laas.fr/METROSEC METROSEC-FC 2005-04-14(14:30) 1800 LAN(100BaseT) 3.7 0.48 www.laas.fr/METROSEC FRnOG, Paris, France, 30 mai 2008 14 Traffic traces with anomalies TAuckland traces – NLANR project p j and MétroSec traces Anomaly Flash crowd DD S DDoS Quantity 4 4 10 3 9 12 FRnOG, Paris, France, 30 mai 2008 Tool Server web Hping H i Iperf Trinoo TFN2K TFN2K Modifié Intensity 34% - 71% 28% - 99% 15% - 58% 7% - 87% 4% - 92% 1% - 4% Γα,β – farima (φ, d, θ) model validation 4 Parameters estimation: 8 1st order: Instead of the usual moment based technique which estimates μ and σ2, we use maximum likelihood based estimates for α and β. 8 2nd order: LRD (long memory) estimated with a multiy , characterized by y d, the long g memory y resolution analysis, parameter measured on an aggregation range Δ for which the log scale diagram is linear. From this wavelet base estimation of d, we perform a fractional derivation of XΔ. This removes the long memory from the process so that only the ARMA p is left. φ and θ are easy y to estimate with an component iterative procedure based on the Gauss-Newton algorithm. FRnOG, Paris, France, 30 mai 2008 16 Γα,β – farima (φ, d, θ) model validation 4 To assess the validity y of the model with actual traffic traces, we made a comparative p analysis y of : 8Actual traces time series 8Γα,β produced by ya α β – farima (φ, d,, θ) time series p numerical generator designed for this purpose FRnOG, Paris, France, 30 mai 2008 17 AUCK-IV: Γα,β – farima (φ, d, θ) fits marginals covariances Δ=10ms Δ=100ms Δ=400ms FRnOG, Paris, France, 30 mai 2008 j=1 corresponds to 10 ms 18 METROSEC-ref1: Γα,β – farima (φ, d, θ) fits marginals covariances Δ=10ms Δ=100ms Δ=400ms FRnOG, Paris, France, 30 mai 2008 j=1 corresponds to 10 ms 19 METROSEC-DDoS & FC: Γα,β marginals fits DDoS attack Flash crowd Δ=2ms Δ=32ms FRnOG, Paris, France, 30 mai 2008 20 Logscale diagrams for METROSEC-DDoS & FC DD S DDoS Fl h Crowd Flash C d During Aft After Before FRnOG, Paris, France, 30 mai 2008 21 Estimated α and β as a function of log2Δ During After Before DDoS Flash Crowd α β FRnOG, Paris, France, 30 mai 2008 22 DDoS impact on traffic = shape p parameter, p , 1/α q quantifies the gap with a Gaussian law 4 β = scale parameter Æ decreases during DDoS attack 4α ÎDDoS attack accelerates the convergence t towards ds a G Gaussian ussi n distribution dist ibuti n of f ttraces, c s and decreases the fluctuation scale around the avera average e traffic FRnOG, Paris, France, 30 mai 2008 23 Partial conclusion 4M Model for f r characterizing ara r z ng Internet n rn traffic which works with and without anomalies 4 Some parameters change differently in the presence of a legitimate (flash g (DDoS) ( ) anomaly y crowd)) or illegitimate ÎHow tto use such ÎH h model d l for f an efficient and robust p profile based IDS? FRnOG, Paris, France, 30 mai 2008 24 Detection principles 4 Select a reference window 4 Segment the trace into sliding windows of duration T 4 For a window at time I: 8Aggregated trace at scales Δ=2j, Δ=2j j=1 j=1,...,J J 8Estimation of parameters : αΔ(I), βΔ(I) 8Compute the distance to the reference, reference between I and R: D(I) 8Selection of a threshold λ: • if D(I) ≥ λ , ⇒ anomaly FRnOG, Paris, France, 30 mai 2008 25 Selection of the best distance 4Quadratic (Basseville 89) distance on parameters 1 J Dα ( I ) = ∑ (α 2 j ( I ) − α 2 j ( R)) 2 j j =1 1 J Dβ ( I ) = ∑ ( β 2 j ( I ) − β 2 j ( R)) 2 J j =1 4Divergence of Kullback-Leibler; p1 and p2 are 2 p.d.f. DK ( p1, p 2) = ∫ ( p1( x) − p 2( x)(ln p1( x) − ln p 2( x))dx giving g g a distance with one or two scales: K Δ(1D ) ( I ) = DK ( pΔ , I , pΔ , R ) K (2D) Δ ,Δ ' ( I ) = DK ( pΔ , Δ ' , I , pΔ , Δ ' , R ) FRnOG, Paris, France, 30 mai 2008 26 Ex. 1 : Denial of Service attack Dα(I) Dβ(I) FRnOG, Paris, France, 30 mai 2008 27 Ex. 2: Multiplicative increase of traffic Dα(I) Dβ(I) FRnOG, Paris, France, 30 mai 2008 28 Ex. 3: Comparison between distances KL 1D, j=4 KL 1D, j=7 Dα KL 2D, j=4,7 FRnOG, Paris, France, 30 mai 2008 29 Statistical performance: ROC curves 4 ROC curves: detection pprobabilityy accordingg to the fixed probability of false alarms 4 PD=f(P ( FA) or PD=f(λ), ( ), PFA=f(λ) ( ) FRnOG, Paris, France, 30 mai 2008 30 Conclusion on anomalies/attacks detection of the Γα,β α β – farima (φ, d,, θ) model change differently depending on the type yp of anomaly y 4 Kullback- Leibler distance allows a robust detection of attacks, even when they represent less than 1% of the traffic (and is not sensitive to an artificial increase of the amount of traffic) 4 Parameters ÎBUT: it is not possible with this method to identify anomaly constituting packets / flows FRnOG, Paris, France, 30 mai 2008 31 Objectives 4 Define an approach pp to 8Detect 8Classify y 8Identify traffic anomalies (One or more occurrences that change the normal flowing of data over a network) 4 Define f a signature g f for each traffic ff anomaly, based on “simple” parameters Æmust be easy to handled by network administrators Æmust permit the design of IPS FRnOG, Paris, France, 30 mai 2008 32 The NAD Algorithm … 4 Multi-scale concept 4 Tomography-based concept 4 Generic multi-criteria 8 Uses simple mathematical functions functions, as volume parameters, to detect anomalous flows • Number of packets per unit of time • Number N b of fb bytes per unit i of f time i • Number of new flows per unit of time 8 Uses IP features (addresses and ports) to identify dent fy the anomalies FRnOG, Paris, France, 30 mai 2008 33 The NAD Algorithm … (2) Multi-Scale 600sec 300sec 300sec ... 60sec 60sec 60sec 60sec 60sec 60sec 60sec 60sec 60sec 60sec 30sec 30sec 30sec 30sec 30sec 30sec 30sec 30sec 30sec 30sec 30sec 30sec 30sec 30sec 30sec 30sec 30sec 30sec 30sec 30sec FRnOG, Paris, France, 30 mai 2008 34 The NAD Algorithm … (3) Tomography omography FRnOG, Paris, France, 30 mai 2008 35 Formal Definition 4 To detect an anomaly y it must be responsible p for a significant variation in one of the p parameters Æ deltoid based method Let, X = {x1,x2,...,xn}, xi = #{packet | byte | flows} and packet Δ = time _ granularit y X = {x1 , x 2 ,..., x n }, xi = {# packets |# bytes |# flows }/ Δ P = {p1 , p 2 ,..., p n −1 }, pi = xi +1 − xi ⎧ pi ≥ E ( p ) + k σ , select ⎨ ⎩ pi < E ( p ) + k σ , reject FRnOG, Paris, France, 30 mai 2008 36 Some Types of Anomalies Port ID Port Scan Other type Network Scan Src IP DDoS Dst IP Long flow The distribution of points in plots can give a clue about the type of anomaly! FRnOG, Paris, France, 30 mai 2008 Flash Crowd Dst IP 37 DDoS : N attackers Æ 1 target FRnOG, Paris, France, 30 mai 2008 38 DDoS: N attackers Æ 1 target (cnd) n sP : n dP IP Source IP Source/Port IP Source/Port IP Source/Port IP Source/Port IP Source/Port IP Source/Port n sP : 1 dP IP Source FRnOG, Paris, France, 30 mai 2008 39 FRnOG, Paris, France, 30 mai 2008 Destination Port ation/Port IP Destina IP Desttination IP Desttination Flash crowd IP De estination IP Destiination/Port Destination Port IP P Destination/Po ort Destination Porrt IP De estination IP Destination IP Destination n Network scan FRnOG, Paris, France, 30 mai 2008 IP De estination IP Desttination/Port Destin nation Port IP P Destination/Po ort Destination Port IP D Destination IP Destination IP Destination Ports scan FRnOG, Paris, France, 30 mai 2008 FRnOG, Paris, France, 30 mai 2008 Desstination Port IP De estination/Port IP Destination IP Destination Brute force attack DDoS example FRnOG, Paris, France, 30 mai 2008 Flash crowd example FRnOG, Paris, France, 30 mai 2008 Network scan example FRnOG, Paris, France, 30 mai 2008 Brute force attack example FRnOG, Paris, France, 30 mai 2008 NAD tool assessment FRnOG, Paris, France, 30 mai 2008 48 Comparison with other tools NADA Gamma-Farima PHAD FRnOG, Paris, France, 30 mai 2008 49 Conclusion 4 Experimental platform with monitoring and measurement capabilities 4 IDS assessment methodology (KDD is dead : RIP)) 4 Its related database of traces with anomalies l 4Unfortunately not publicly available : cf. CNIL 4 Original anomalies detection, classification and identification algorithms 4Which 4 h h proved d to be b efficient ff and d accurate 4Which raised many interests : FT, WIDE, … 4 Traffic T ffi generator t FRnOG, Paris, France, 30 mai 2008 50 More information http://www.laas.fr/METROSEC p FRnOG, Paris, France, 30 mai 2008 51