Habilitation à diriger des recherches
Resampling methods for periodic and almost periodic processes
Anna Dudek
Université Rennes 2
[email protected]

Main research areas:
1. stochastic processes with some periodic/almost periodic structure;
2. resampling methods.

Today:
- why periodic processes
- why bootstrap
- overview of bootstrap methods for periodic processes
- almost periodic case
- some open problems and current projects

Examples

Number of incoming packets in non-overlapping one-hour bins. The measurement was conducted at the border between the network of the University of Waikato and the internet provider. Time of observation is 20 workdays (480 hours).

[Figure: hourly packet counts over the 480 observed hours; vertical axis ticks from 1·10^6 to 4·10^6.]

Data from R. Nelson and B. Jones from Univ. of Waikato.
Volumes of energy traded hourly on the Nord Pool Spot Exchange (6 July - 31 August 2010, N = 984 records), weekends excluded.
Source: http://www.npspot.com

Mean monthly flow of the Fraser River (1913-1990).
Source: http://www.umass.edu/statdata/statdata/stat-time.html

Garden blower signal.
Source: http://www.reliableplant.com

Characteristics of interest: periodic time series

Definition
A time series $\{X_t, t \in \mathbb{Z}\}$ is strictly periodic with period $d$ if, for each $r \in \mathbb{N}$ and each $t, \tau_1, \dots, \tau_r \in \mathbb{Z}$,
$$\mathcal{L}(X_t, X_{t+\tau_1}, \dots, X_{t+\tau_r}) = \mathcal{L}(X_{t+d}, X_{t+\tau_1+d}, \dots, X_{t+\tau_r+d}),$$
where $\mathcal{L}(\cdot)$ denotes the joint distribution.

Definition
A time series $\{X_t, t \in \mathbb{Z}\}$ is weakly periodic of order $r$ with period $d$ if $E|X_t|^r < \infty$ and, for each $t, \tau_1, \dots, \tau_{r-1} \in \mathbb{Z}$,
$$E(X_t X_{t+\tau_1} \cdots X_{t+\tau_{r-1}}) = E(X_{t+d} X_{t+\tau_1+d} \cdots X_{t+\tau_{r-1}+d}).$$

Source: Synowiecki (2007)

First and second order characteristics:
- overall mean
- seasonal means
- seasonal variances
- autocovariance function
- Fourier coefficients of the mean and the autocovariance functions

Problems:
- the parameters of interest are usually multidimensional, which calls for asymptotic simultaneous confidence intervals;
- such intervals are difficult to obtain: there are issues with the asymptotic variance and with the calculation of quantiles.
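The first-order and second-order characteristics listed above have simple sample counterparts when the period $d$ is known. The following is a minimal sketch, assuming 0-based indexing (season $s$ collects $x_s, x_{s+d}, x_{s+2d}, \dots$) and a variance estimator with divisor equal to the number of observations in the season. The function names are hypothetical and are not taken from the talk.

```python
from statistics import fmean


def overall_mean(x):
    """Sample mean of the whole series."""
    return fmean(x)


def seasonal_means(x, d):
    """Sample mean of each of the d seasons (0-based season index)."""
    return [fmean(x[s::d]) for s in range(d)]


def seasonal_variances(x, d):
    """Sample variance of each season around its own seasonal mean.

    Assumption: the divisor is the number of observations in the season.
    """
    means = seasonal_means(x, d)
    return [
        fmean([(v - means[s]) ** 2 for v in x[s::d]])
        for s in range(d)
    ]
```

For a series with period 2 such as `[1, 10, 3, 12, 5, 14]`, `seasonal_means(x, 2)` separates the odd-position and even-position observations. Confidence intervals for these quantities are what the bootstrap methods in the rest of the talk are meant to deliver.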
Why bootstrap
- to construct pointwise confidence intervals
- to construct simultaneous confidence intervals
- for testing (e.g., detection of significant frequencies)

Which bootstrap method should be used?

Moving Block Bootstrap (MBB) algorithm - Künsch (1989), Liu and Singh (1992)
- $(X_1, \dots, X_n)$ - observed sample
- $b = b_n$ - block length
- $n = lb$, $l \in \mathbb{N}$
- Block $B_i$ is defined as $B_i = (X_i, \dots, X_{i+b-1})$, where $i = 1, \dots, n-b+1$.

[Figure: overlapping blocks $B_1, B_2, B_3, \dots, B_{n-b+1}$ marked along the time axis $1, \dots, n$.]

Moving Block Bootstrap algorithm
From the set $\{B_1, \dots, B_{n-b+1}\}$ select $l$ blocks randomly with replacement:
$$P(B_i^* = B_j) = \frac{1}{n-b+1}, \quad i = 1, \dots, l, \quad j = 1, \dots, n-b+1.$$
Joining the blocks we get $X^* = (B_1^*, \dots, B_l^*)$.

MBB - disadvantages
- the method is designed for stationary time series
- it destroys the periodic structure of the considered time series
- its application to periodic time series turned out to be limited to the overall mean case (Synowiecki (2007))

WE NEED METHODS DESIGNED FOR PERIODIC DATA!

$(X_1, \dots, X_n)$ - sample from the periodic (in distribution or moments) time series with period $d$

Seasonal Block Bootstrap (SBB) algorithm - Politis (2001)

[Figure: blocks $B_1$, $B_{d+1}$, $B_{2d+1}$ marked on the time axis $1, d, 2d, \dots, 5d$, and a bootstrap sample with $B_1^* = B_{d+1}$, $B_2^* = B_1$, $B_3^* = B_{d+1}$.]

SBB - disadvantages
- the minimal block length is equal to the period length
- the block length is always an integer multiple of the period length

Periodic Block Bootstrap (PBB) algorithm - Chan et al. (2004)

[Figure: the original sample split into blocks $A_1, B_1, C_1, A_2, B_2, C_2, A_3, B_3, C_3$, and a bootstrap sample with $A_1^* = A_2$, $B_1^* = B_1$, $C_1^* = C_2$, $A_2^* = A_3$, $B_2^* = B_2$, $C_2^* = C_2$, $A_3^* = A_1$, $B_3^* = B_3$, $C_3^* = C_3$.]

PBB - disadvantages
- it is designed for periodic time series with long periodicities, since the block length $b$ is assumed to be much smaller than the period $d$
- Leśkow and Synowiecki (2010) showed the consistency of the PBB procedure assuming that the period length $d \to \infty$ as the sample size $n \to \infty$

What did we need?
A new bootstrap procedure that is suitable for periodic time series with fixed-length periodicities of arbitrary size relative to the block size and the sample size.

Since 2011 we have introduced:
- Generalized Seasonal Block Bootstrap (Dudek et al. (2014a))
- Generalized Seasonal Tapered Block Bootstrap (Dudek et al. (2016))
- Extension of Moving Block Bootstrap (Dudek (2015))

Any second-order random process that is generated by the mixing (in the workings of a system) of randomness and periodicity will likely have the structure of PERIODIC CORRELATION.

Definition
A random sequence $\{X_t, t \in \mathbb{Z}\}$ with finite second moments is called periodically correlated (PC) or cyclostationary with period $d$ if it has periodic mean and covariance functions, i.e.
$$E(X_t) = E(X_{t+d}) \quad \text{and} \quad \mathrm{Cov}(X_t, X_s) = \mathrm{Cov}(X_{t+d}, X_{s+d})$$
for each $t, s \in \mathbb{Z}$. To avoid ambiguity, the period $d$ is taken as the smallest positive integer such that the above holds.

Source: Gladyshev (1961), Hurd and Miamee (2007). Applications: Gardner, Napolitano, Paura (2006), Antoni (2009), Napolitano (2016).

GSBB - Generalized Seasonal Block Bootstrap (Dudek et al. (2014a))
- $(X_1, \dots, X_n)$ - a sample from the periodic (in distribution or moments) time series with period $d$
- $b = b_n$ - block length
- $n = lb$, $l \in \mathbb{N}$
- Block $B_i$ is defined as $B_i = (X_i, \dots, X_{i+b-1})$, where $i = 1, \dots, n-b+1$.
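With this notation, GSBB fills the bootstrap sample block by block: the block placed at position $t$ is drawn uniformly from the blocks that start at $t, t \pm d, t \pm 2d, \dots$ and fit inside the sample, so every resampled observation keeps its season. The following is a minimal sketch of that selection rule, assuming 0-based indexing and $n = lb$. The function names `gsbb_candidates` and `gsbb_sample` are hypothetical and are not the authors' code.

```python
import random


def gsbb_candidates(t, n, d, b):
    """0-based starts of blocks that may be placed at position t.

    A candidate starts in the same season as t (same residue mod d)
    and lies fully inside the sample (start <= n - b).
    """
    return list(range(t % d, n - b + 1, d))


def gsbb_sample(x, d, b, rng=None):
    """Draw one GSBB pseudo-sample of the same length as x."""
    rng = rng or random.Random()
    n = len(x)
    if b < 1 or n % b != 0:
        raise ValueError("this sketch assumes n = l * b for an integer l")
    sample = []
    for t in range(0, n, b):  # positions 0, b, 2b, ...
        k = rng.choice(gsbb_candidates(t, n, d, b))
        sample.extend(x[k:k + b])
    return sample
```

For $d = 12$, $b = 15$, $n = 60$, `gsbb_candidates(15, 60, 12, 15)` returns the 0-based starts 3, 15, 27, 39, which correspond to the blocks $B_4, B_{16}, B_{28}, B_{40}$ in the 1-based notation of the slides.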
[Figure: sample path of a periodic series with $d = 12$, $b = 15$, $n = 60$.]

GSBB illustration for $d = 12$, $b = 15$, $n = 60$:
- First step: candidate blocks $B_1, B_{13}, B_{25}, B_{37}$; choice $B_1^* = B_{13}$.
- Second step: candidate blocks $B_4, B_{16}, B_{28}, B_{40}$; choice $B_2^* = B_4$.
- Third step: candidate blocks $B_7, B_{19}, B_{31}, B_{43}$; choice $B_3^* = B_{31}$.
- Fourth step: candidate blocks $B_{10}, B_{22}, B_{34}, B_{46}$; choice $B_4^* = B_{46}$.

[Figure: the resulting bootstrap sample obtained by joining $B_1^*, B_2^*, B_3^*, B_4^*$.]

Comments:
- GSBB generalizes earlier bootstrap methods:
  - Seasonal Block Bootstrap (block length is an integer multiple of the period length) - Politis (2001)
  - Periodic Block Bootstrap (block length is smaller than the period length) - Chan et al. (2004)
- GSBB: no relation between the block length and the period length is required.

GSBB consistency results
PC time series:
- overall mean, seasonal means (Dudek et al. (2014a))
- seasonal variances, autocovariance function (Dudek (2016), submitted)
- Fourier coefficients of the mean and the autocovariance functions (Dudek et al. (2014b))
Period $d \to \infty$ as sample size $n \to \infty$:
- overall mean (Dudek et al. (2014a))
- Fourier coefficients of the mean and the autocovariance functions (Dudek (2016))

TBB - Tapered Block Bootstrap (Paparoditis and Politis (2001))
Method designed for stationary time series; a modification of the Moving Block Bootstrap (Künsch (1989)).
- $(X_1, \dots, X_n)$ - a sample from the stationary time series
- $b = b_n$ - block length
- $n = lb$, $l \in \mathbb{N}$

Tapered Block Bootstrap algorithm
- center the data, i.e. let $\tilde{X}_t = X_t - \bar{X}$, where $\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i$;
- from the set $\{B_1, \dots, B_{n-b+1}\}$, where $B_i = (\tilde{X}_i, \dots, \tilde{X}_{i+b-1})$, select $l$ blocks randomly with replacement. The probability of choosing any block is $1/(n-b+1)$;
- join the selected blocks to get $X^* = (B_1^*, \dots, B_l^*)$;
- for $m = 0, \dots, l-1$ and $j = 1, \dots, b$ set
$$Y^*_{mb+j} := w_b(j) \frac{\sqrt{b}}{\|w_b\|_2} X^*_{mb+j},$$
where $w_n(t) = w\left(\frac{t-0.5}{n}\right)$ and $w: \mathbb{R} \to [0, 1]$ is symmetric about $t = 0.5$, nondecreasing for $t \in [0, 0.5]$, with $w(t) > 0$ for $t$ in a neighbourhood of $0.5$.

TBB advantages
In the setting of linear or approximately linear statistics, the usual block resampling methods (MBB, CBB, Stationary Block Bootstrap) have an MSE of the variance estimator of order $O(n^{-2/3})$, while the TBB has an MSE of order $O(n^{-4/5})$, outperforming these methods.

HOW TO APPLY TBB FOR PERIODIC DATA?
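Before turning to the periodic case, the tapering step of the TBB for stationary data can be sketched as follows. This is an illustration under stated assumptions: it uses a trapezoidal window, which is one window satisfying the conditions on $w$ but is not necessarily the one used in the talk; the parameter `c` and its default value are hypothetical; $\|w_b\|_2$ is taken to be $\left(\sum_{j=1}^{b} w_b(j)^2\right)^{1/2}$; and $n = lb$ is assumed.

```python
import math
import random


def trapezoid(u, c=0.43):
    """Trapezoidal window on [0, 1], symmetric about 0.5 (assumes 0 < c <= 0.5)."""
    if u < 0.0 or u > 1.0:
        return 0.0
    if u < c:
        return u / c
    if u > 1.0 - c:
        return (1.0 - u) / c
    return 1.0


def taper_weights(b, c=0.43):
    """Return sqrt(b) * w_b(j) / ||w_b||_2 for j = 1, ..., b."""
    w = [trapezoid((j - 0.5) / b, c) for j in range(1, b + 1)]
    norm = math.sqrt(sum(v * v for v in w))
    return [math.sqrt(b) * v / norm for v in w]


def tbb_sample(x, b, rng=None, c=0.43):
    """Draw one tapered pseudo-sample Y* from a (stationary) series x."""
    rng = rng or random.Random()
    n = len(x)
    if b < 1 or n % b != 0:
        raise ValueError("this sketch assumes n = l * b for an integer l")
    mean = sum(x) / n
    centered = [v - mean for v in x]  # tapering is applied to centered data
    scale = taper_weights(b, c)
    out = []
    for _ in range(n // b):
        k = rng.randrange(n - b + 1)  # uniform over the n - b + 1 blocks
        out.extend(s * v for s, v in zip(scale, centered[k:k + b]))
    return out
```

With this normalization the squared weights of one block sum to $b$, so tapering downweights observations near the block edges without changing the overall scale of a block.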
Generalized Seasonal Tapered Block Bootstrap (GSTBB) algorithm - seasonal means case
Let $(X_1, \dots, X_n)$ be a sample from the periodic (in distribution or moments) time series with period $d$; $b$ - block length, $n = lb$.
- define $\tilde{X}_t = X_t - \hat{\mu}_{\langle t \rangle}$, where $\langle t \rangle = (t \bmod d)$ denotes the season associated with $t$;
- the bootstrap sample $X_1^*, \dots, X_n^*$ is generated by applying the GSBB to the sample $\tilde{X}_1, \dots, \tilde{X}_n$;
- for $m = 0, \dots, l-1$ and $j = 1, \dots, b$ set
$$Y^*_{mb+j} := w_b(j) \frac{\sqrt{b}}{\|w_b\|_2} X^*_{mb+j}.$$

Comments
- In practical applications, the GSTBB should not be used with $b \leq d$ such that $d = kb$ for $k \in \mathbb{N}$, especially for simultaneous confidence intervals. In such a case, the GSTBB provides coverage probabilities that are too high or too low. If $b = d$, observations from the first and the last season are always used with lower weights. By contrast, if $b = 2d$, lower weights are no longer assigned to all observations from these seasons, and this negative effect disappears.
- The tapering idea is applied to the residuals obtained after removing the seasonal means from the data. The usual TBB was designed for stationary time series, so to obtain zero-mean observations it was enough to subtract the overall mean from each observation.
- For other characteristics (e.g., Fourier coefficients of the autocovariance function), further modifications of the algorithm are essential.

GSTBB consistency results
PC time series:
- overall mean, seasonal means (Dudek et al. (2016))
- Fourier coefficients of the mean and the autocovariance functions (Dudek et al. (2016)) - modified algorithm!
GSTBB advantages:
- compared with the GSBB, its actual coverage probability curves are often flatter and closer to the nominal coverage level, especially for short samples.

Problems
- GSBB requires knowledge of the period length, but sometimes the period length is not known, or the considered signal is a composition of two components with incommensurable periods. To model such data, almost periodically correlated (APC) processes are used. These are processes whose mean and autocovariance functions are almost periodic.
- The MBB method, in contrast to the GSBB, does not keep the periodic structure of the original sample. Applied directly to PC data, it often provides inconsistent estimators. The main problem is that, given the MBB sample, one cannot identify which seasons the bootstrap observations come from. The only case in which the MBB can be used for PC/APC time series is the overall mean case.

Extension of MBB (EMBB) (Dudek (2015))
Idea: modify the MBB to make it suitable for PC and APC data.

EMBB algorithm
Let $\{X_t, t \in \mathbb{Z}\}$ be a PC or APC time series. Define a bivariate series $Y_i = (X_i, i)$ and then apply the MBB to the sample $(Y_1, \dots, Y_n)$ to obtain $(Y_1^*, \dots, Y_n^*)$.

Comments:
- The second coordinate of the series $Y_1^*, \dots, Y_n^*$ preserves the information on the original time indices of the chosen observations.
- Using it, one may define consistent estimators of time-domain and frequency-domain characteristics of PC/APC time series.
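The EMBB idea of carrying the original time index along with each observation can be sketched as follows. This is a minimal illustration assuming 0-based indices and $n = lb$. The seasonal-mean estimator shown, which groups bootstrap observations by original index modulo $d$, is one natural way to use the second coordinate and is an assumption here, not a formula quoted from the talk. All function names are hypothetical.

```python
import random


def embb_sample(x, b, rng=None):
    """Apply the MBB to the pairs Y_i = (X_i, i) and return the resampled pairs."""
    rng = rng or random.Random()
    n = len(x)
    if b < 1 or n % b != 0:
        raise ValueError("this sketch assumes n = l * b for an integer l")
    pairs = [(value, i) for i, value in enumerate(x)]
    out = []
    for _ in range(n // b):
        k = rng.randrange(n - b + 1)  # uniform over the n - b + 1 blocks
        out.extend(pairs[k:k + b])
    return out


def bootstrap_seasonal_means(y_star, d):
    """Seasonal means of a bootstrap sample, using the ORIGINAL time indices.

    Returns None for a season that happens not to appear in the sample.
    """
    sums = [0.0] * d
    counts = [0] * d
    for value, i in y_star:
        season = i % d
        sums[season] += value
        counts[season] += 1
    return [sums[s] / counts[s] if counts[s] else None for s in range(d)]
```

Because the season is read from the stored index rather than from the position in the bootstrap sample, the blocks can be drawn without any constraint on their starting points, which is what makes the approach usable when the period is unknown or the data are APC.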
EMBB consistency results
- PC time series: seasonal means, seasonal variances, autocovariance function (Dudek et al. (2016) - submitted)
- PC/APC time series: Fourier coefficients of the mean and the autocovariance functions (Dudek (2015))

Circular versions of GSBB, GSTBB, EMBB

Usual MBB: block $B_i$ is defined as $B_i = (X_i, \dots, X_{i+b-1})$, where $i = 1, \dots, n-b+1$.

[Figure: overlapping blocks $B_1, B_2, B_3, \dots, B_{n-b+1}$ marked along the time axis $1, \dots, n$.]

Problem: observations in the center of the considered sample are present in $b$ different blocks, while observations from the beginning and the end of the sample appear less often; e.g., $X_1$ belongs only to $B_1$. This often increases the bias of the considered estimator.

Circular Block Bootstrap (CBB) - Politis and Romano (1992)
- the CBB is a modification of the MBB;
- data are treated as wrapped on a circle;
- each observation is present in the same number of blocks;
- we have exactly $n$ blocks of length $b$.

[Figure: blocks $B_1$, $B_{n-b+1}$ and $B_{n-b+5}$ marked on the time axis $1, \dots, n$ for the circular scheme.]

Circular versions of GSBB, GSTBB, EMBB: the only change in the algorithms is the set of blocks available for selection, which is now of the form $\{B_1, \dots, B_n\}$. All rules for selecting the blocks remain unchanged.
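The effect of wrapping the data on a circle can be checked directly by counting how many blocks contain each observation. The following is a minimal sketch with hypothetical helper names and 0-based indexing.

```python
def circular_blocks(x, b):
    """All n circular blocks of length b; late blocks wrap around to x[0]."""
    n = len(x)
    return [[x[(i + j) % n] for j in range(b)] for i in range(n)]


def coverage_counts(n, b, circular):
    """Number of blocks that contain each of the n observations."""
    starts = range(n) if circular else range(n - b + 1)
    counts = [0] * n
    for i in starts:
        for j in range(b):
            counts[(i + j) % n] += 1
    return counts
```

For $n = 10$ and $b = 3$, the non-circular scheme covers the first and the last observation only once, whereas in the circular scheme every observation belongs to exactly $b = 3$ blocks.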
For stationary time series:
- the CBB usually provides an unbiased estimator, $E^*(\hat{\theta}^*) = \hat{\theta}$.

PC/APC time series:
- If $n = lb$, the circular EMBB estimators of the overall mean, the seasonal means, the seasonal variances and the Fourier coefficients of the mean function (PC and APC case) are unbiased;
- If additionally $n = wd$, then the corresponding cGSBB estimators are also unbiased;
- Independently of the bootstrap approach, $\hat{a}^*(\lambda, \tau)$ for $\tau \neq 0$ is always biased.

In general:
- Circular block bootstrap algorithms reduce the computational cost compared with the standard approaches;
- Circular bootstrap algorithms are simpler and clearer.

Some open problems
- modification of the Stationary Bootstrap for PC/APC time series (joint project with D. Politis and E. Paparoditis);
- block length choice;
- bootstrap applicability for data with jitter effect (joint project with D. Dehay and M. Elbadaoui);
- testing (joint project with Ł. Lenart);
- choice of bootstrap approach for a particular problem;
- heavy-tailed data;
- ...

Other projects
- bootstrap in testing the smoothness parameter of a density function (joint project with B. Ćmiel);
- application of harmonizable processes for EEG data analysis (joint project with J. Aston, D. Dehay and J.M. Freyermuth);
- bootstrap confidence sets in the matrix completion problem (joint project with A. Carpentier, J.M. Freyermuth and M. Zhilova);

References

ANTONI, J. (2009). Cyclostationarity by examples. Mech. Syst. Signal Process., 23(4), 987-1036.
CHAN, V., LAHIRI, S.N. and MEEKER, W.Q. (2004). Block bootstrap estimation of the distribution of cumulative outdoor degradation. Technometrics, 46, 215-224.
DUDEK, A.E. (2015). Circular block bootstrap for coefficients of autocovariance function of almost periodically correlated time series. Metrika, 78(3), 313-335.
DUDEK, A.E. (2016). First and second order analysis for periodic random arrays using block bootstrap methods. Electron. J. Statist., 10(2), 2561-2583.
DUDEK, A.E. (2016). GSBB and MBB for periodic characteristics of PC time series - submitted.
DUDEK, A.E., LEŚKOW, J., PAPARODITIS, E. and POLITIS, D. (2014a). A generalized block bootstrap for seasonal time series. J. Time Ser. Anal., 35, 89-114.
DUDEK, A.E., MAIZ, S. and ELBADAOUI, M. (2014b). Generalized Seasonal Block Bootstrap in frequency analysis of cyclostationary signals. Signal Process., 104C, 358-368.
DUDEK, A.E., PAPARODITIS, E. and POLITIS, D. (2016). Generalized Seasonal Tapered Block Bootstrap. Statistics and Probability Letters, 115, 27-35.
GARDNER, W.A., NAPOLITANO, A. and PAURA, L. (2006). Cyclostationarity: half a century of research. Signal Processing, 86, 639-697.
GLADYSHEV, E.G. (1961). Periodically Correlated Random Sequences. Soviet Mathematics, 2, 385-388.
HURD, H.L. and MIAMEE, A.G. (2007). Periodically Correlated Random Sequences: Spectral Theory and Practice. Wiley.
KÜNSCH, H. (1989). The jackknife and the bootstrap for general stationary observations. Ann. Statist., 17, 1217-1241.
LENART, Ł., LEŚKOW, J. and SYNOWIECKI, R. (2008). Subsampling in testing autocovariance for periodically correlated time series. J. Time Ser. Anal., 29, 995-1018.
NAPOLITANO, A. (2016). Cyclostationarity: New trends and applications. Signal Process., 120, 385-408.
PAPARODITIS, E. and POLITIS, D. (2001). Tapered block bootstrap. Biometrika, 88, 1105-1119.
POLITIS, D.N. (2001). Resampling time series with seasonal components, in Frontiers in Data Mining and Bioinformatics: Proceedings of the 33rd Symposium on the Interface of Computing Science and Statistics, Orange County, California, June 13-17, pp. 619-621.
POLITIS, D.N. and ROMANO, J.P. (1992). A circular block-resampling procedure for stationary data. Exploring the Limits of Bootstrap, Wiley Ser. Probab. Math. Statist., Wiley, New York, pp. 263-270.
SYNOWIECKI, R. (2007). Consistency and application of moving block bootstrap for nonstationary time series with periodic and almost periodic structure. Bernoulli, 13(4), 1151-1178.

This project has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 655394.