An Application of Bootstrapping to Stein's Two-Stage Sequential Sampling Procedure

Basil M. de Silva
Royal Melbourne Institute of Technology, Department of Statistics and Operations Research
GPO Box 2476V, Melbourne 3001, Australia
[email protected]

Arpana Roy
Royal Melbourne Institute of Technology, Department of Statistics and Operations Research
GPO Box 2476V, Melbourne 3001, Australia
[email protected]

1. Introduction

Stein's two-stage procedure (Stein, 1945) is considered to be the most interesting and practical sequential procedure for hypothesis testing and estimation. This procedure estimates the required sample size, $N$, using a small number of observations, $m$. However, the procedure tends to oversample, and the amount of oversampling increases with increasing $(N - m)$. To overcome this oversampling problem, Mukhopadhyay (1980) introduced a modified two-stage procedure, where $m$ is computed to suit the preset tolerance values. It was found that this modified two-stage procedure is asymptotically first order efficient (Ghosh and Mukhopadhyay, 1981) but not second order efficient (Mukhopadhyay and Solanky, 1994). Thus the modified procedure also tends to oversample.

In this paper we apply the bootstrap resampling method to the original two-stage procedure given by Stein (1945). This study examines the application of the bootstrap to the two-stage procedure using three specific cases: the fixed-width confidence interval, minimum risk point estimation and the linear regression problem. The asymptotic results and the simulation results for these cases show that a significant reduction in oversampling can be achieved by applying the bootstrapping technique. Due to the page limitation, we limit our discussion to fixed-width interval estimation of the population mean.

2. Fixed-Width Confidence Interval

Let $X_1, X_2, \ldots$ be a sequence of independent and identically distributed (iid) random variables from a distribution with unknown mean $\mu$.
For any fixed $n$ and $d\ (> 0)$, a $100(1 - \alpha)\%$ $(0 < \alpha < 1)$ confidence interval for $\mu$ is given by $I_{n,d} = (\bar{X}_n - d,\ \bar{X}_n + d)$ with $P(\mu \in I_{n,d}) \ge 1 - \alpha$. Thus, the optimal sample size, $n^*$, required to achieve the confidence interval is given by

(1)   $n^* = \xi_{\alpha/2}^2\, \sigma^2\, d^{-2}$

where $\sigma^2$ is the population variance, $\Phi(\xi_{\alpha/2}) = 1 - \alpha/2$ and $\Phi(z)$ is the cumulative distribution function of $N(0, 1)$. When $\sigma$ is unknown, the two-stage procedure estimates the required sample size, $N$, from an initial random sample of size $m$ as follows:

(2)   $N = \max\left\{ m,\ \left\langle \left( \xi_{\alpha/2}\, S_m\, d^{-1} \right)^2 \right\rangle + 1 \right\}$

where $S_m^2$ is the sample variance of the initial sample and $\langle k \rangle$ represents the largest integer less than $k$. The stopping rule in (2) is valid for both Stein's two-stage procedure and the modified two-stage procedure. However, in Stein's case $m$ is a predetermined fixed value, whereas in the modified two-stage procedure it is computed from the following (Mukhopadhyay, 1980), for a fixed constant $\gamma > 0$:

(3)   $m = \max\left\{ 2,\ \left\langle \left( \xi_{\alpha/2}\, d^{-1} \right)^{2/(\gamma + 1)} \right\rangle + 1 \right\}$

3. Application of Bootstrapping

Note that the two-stage procedures described above are based on the normal approximation, and the accuracy of the stopping rule depends on this approximation. Thus we replace $\xi_{\alpha/2}$ by the bootstrap critical point $\xi^*_{\alpha/2}$. Consider a bootstrap sample, $X_1^*, X_2^*, \ldots, X_m^*$, of $X_1, X_2, \ldots, X_m$. Let $\bar{X}_m^*$ and $S_m^{*2}$ be the sample mean and sample variance of the bootstrap sample. Now $\xi^*_{\alpha/2}$ is the $(1 - \alpha)$th quantile of the distribution of $\left| \sqrt{m}\, (\bar{X}_m^* - \bar{X}_m) / S_m^* \right|$.

Table 1. Simulation Results of the 95% Confidence Interval for $\mu$ using the mixture population $0.6\,N(0, 1) + 0.4\,N(5, 1)$ (15000 Simulations and 1000 Bootstraps)

                   Two-Stage             Modified           with Bootstrap
  n*      d      N     s_N    p        N     s_N    p        N     s_N    p
 1000   .164   1138   1.81  .957     1041    .94  .951     1000   1.73  .944
  500   .232    571    .91  .956      537    .63  .955      501    .86  .942
  200   .367    229    .36  .956      234    .44  .957      201    .35  .939
   50   .733     58    .09  .948       57    .33  .945       51    .08  .933

REFERENCES

Ghosh, M. and Mukhopadhyay, N. (1981). Consistency and asymptotic efficiency of two-stage and sequential estimation procedures. Sankhya A 43, 220-227.

Mukhopadhyay, N. (1980). A consistent and asymptotically efficient two-stage procedure to construct fixed-width confidence intervals for the mean. Metrika 27, 281-284.

Mukhopadhyay, N. and Solanky, T. K. S. (1994). Multistage Selection and Ranking Procedures: Second Order Asymptotics. Marcel Dekker, New York.

Stein, C. (1945). A two sample test for a linear hypothesis whose power is independent of the variance. Annals of Mathematical Statistics 16, 243-258.

SUMMARY

The bootstrap methodology is applied to Stein's two-stage procedure by means of three particular cases: the fixed-width confidence interval, minimum risk point estimation and the linear regression problem. It is well known that the two-stage procedure oversamples, but we show in this paper that the amount of oversampling can be reduced significantly by applying the bootstrap. Asymptotic results were established for the procedure used with the bootstrap. Moderate and small sample behaviour was analysed by means of a large-scale simulation study.
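
As an illustration of stopping rule (2) and the bootstrap critical point described in Section 3, the following Python sketch computes the second-stage sample size once with the normal critical point and once with a bootstrap critical point. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the pilot size m = 30, the random seed and the empirical-quantile convention are all hypothetical choices made here; only the mixture population, the 95% level, the value d = .367 and the 1000 bootstrap resamples are taken from Table 1.

```python
import math
import random
import statistics


def normal_critical_point(alpha):
    """Point xi with Phi(xi) = 1 - alpha/2 for the standard normal cdf Phi."""
    return statistics.NormalDist().inv_cdf(1.0 - alpha / 2.0)


def largest_integer_below(k):
    """<k>: the largest integer strictly less than k, as used in rule (2)."""
    return math.ceil(k) - 1


def two_stage_size(pilot, d, critical_point):
    """Rule (2): N = max{m, <(c * S_m / d)^2> + 1} for a critical point c."""
    m = len(pilot)
    s_m = statistics.stdev(pilot)  # sample standard deviation, divisor m - 1
    return max(m, largest_integer_below((critical_point * s_m / d) ** 2) + 1)


def bootstrap_critical_point(pilot, alpha, n_boot, rng):
    """(1 - alpha)th quantile of |sqrt(m) (mean* - mean) / S*| over resamples."""
    m = len(pilot)
    mean_m = statistics.fmean(pilot)
    stats = []
    for _ in range(n_boot):
        resample = rng.choices(pilot, k=m)  # draw m values with replacement
        s_star = statistics.stdev(resample)
        if s_star == 0.0:
            continue  # assumption of this sketch: skip degenerate resamples
        t_star = math.sqrt(m) * (statistics.fmean(resample) - mean_m) / s_star
        stats.append(abs(t_star))
    stats.sort()
    # Assumed quantile convention: the ceil((1 - alpha) * B)-th order statistic.
    index = math.ceil((1.0 - alpha) * len(stats)) - 1
    return stats[max(0, min(len(stats) - 1, index))]


def draw_mixture(rng):
    """One draw from the mixture 0.6 N(0, 1) + 0.4 N(5, 1) of Table 1."""
    return rng.gauss(0.0, 1.0) if rng.random() < 0.6 else rng.gauss(5.0, 1.0)


rng = random.Random(2024)                       # hypothetical seed
pilot = [draw_mixture(rng) for _ in range(30)]  # hypothetical pilot size m = 30

xi = normal_critical_point(0.05)
xi_star = bootstrap_critical_point(pilot, 0.05, 1000, rng)

n_normal = two_stage_size(pilot, 0.367, xi)
n_boot = two_stage_size(pilot, 0.367, xi_star)
print("normal critical point:", round(xi, 3), "N =", n_normal)
print("bootstrap critical point:", round(xi_star, 3), "N =", n_boot)
```

A single pilot sample says nothing about average oversampling; reproducing figures comparable to Table 1 would require repeating this over many simulated pilot samples and averaging the resulting N, which the sketch leaves out.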