An Application of Bootstrapping to Stein's Two-Stage
Sequential Sampling Procedure
Basil M. de Silva
Royal Melbourne Institute of Technology, Department of Statistics and Operations Research
GPO Box 2476V
Melbourne 3001, Australia
[email protected]
Arpana Roy
Royal Melbourne Institute of Technology, Department of Statistics and Operations Research
GPO Box 2476V
Melbourne 3001, Australia
[email protected]
1. Introduction
Stein's two-stage procedure (Stein, 1945) is considered to be the most interesting and
practical sequential procedure for hypothesis testing and estimation. This procedure estimates
the required sample size, N, using a small number of observations, m. However, the procedure
tends to oversample, and the amount of oversampling increases with increasing (N - m). To
overcome this oversampling problem, Mukhopadhyay (1980) introduced a modified two-stage
procedure, where m is computed to suit the preset tolerance values. It was found that this
modified two-stage procedure is asymptotically first-order efficient (Ghosh and Mukhopadhyay,
1981) but not second-order efficient (Mukhopadhyay and Solanky, 1994). Thus the modified
procedure also tends to oversample.
In this paper we apply the bootstrap resampling method to the original two-stage procedure
given by Stein (1945). This study examines the application of the bootstrap to the two-stage
procedure using three specific cases: the fixed-width confidence interval, minimum risk point
estimation and the linear regression problem. The asymptotic results and the simulation results
for these cases show that a significant reduction in oversampling can be achieved by applying
the bootstrapping technique. Due to the page limitation, we limit our discussion to fixed-width
interval estimation for the population mean.
2. Fixed-Width Confidence Interval
Let $X_1, X_2, \ldots$ be a sequence of independent and identically distributed (iid) random
variables from a distribution with unknown mean $\mu$ and variance $\sigma^2$. For any fixed $n$
and $d$ $(> 0)$, a $100(1 - \alpha)\%$ $(0 < \alpha < 1)$ confidence interval for $\mu$ is given by
$I_{n,d} = (\bar{X}_n - d, \bar{X}_n + d)$ with $P(\mu \in I_{n,d}) \ge 1 - \alpha$.
Thus, the optimal sample size, $n^*$, required to achieve the confidence interval is given by
$$ n^* = \xi_{\alpha/2}^2 \, \sigma^2 \, d^{-2} \tag{1} $$
where $\Phi(\xi_{\alpha/2}) = 1 - \alpha/2$ and $\Phi(z)$ is the cumulative distribution function
of $N(0, 1)$. When $\sigma$ is unknown, the two-stage procedure estimates the required sample
size, $N$, from an initial random sample of size $m$ as follows:
$$ N = \max\left\{ m, \; \left\langle \left( \xi_{\alpha/2} \, S_m \, d^{-1} \right)^2 \right\rangle + 1 \right\} \tag{2} $$
where $S_m^2$ is the sample variance of the initial sample and $\langle k \rangle$ represents the
largest integer less than $k$. The stopping rule in (2) is valid for both Stein's two-stage
procedure and the modified two-stage procedure. However, in Stein's case $m$ is a predetermined
fixed value, whereas in the modified two-stage procedure it is computed from the following
(Mukhopadhyay, 1980):
$$ m = \max\left\{ 2, \; \left\langle \left( \xi_{\alpha/2} \, d^{-1} \right)^{2/(\gamma + 1)} \right\rangle + 1 \right\} \tag{3} $$
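The sample-size rules (1)-(3) can be sketched in Python as follows. This is only an illustration, not the authors' code: the function names, the default value `gamma=1.0` for the exponent constant in (3), and the reading of "largest integer less than k" as `ceil(k) - 1` are assumptions made here.

```python
import math
from statistics import NormalDist, variance


def critical_point(alpha):
    """Normal critical point xi with Phi(xi) = 1 - alpha/2."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def largest_int_below(k):
    """Largest integer strictly less than k (assumed reading of <k>)."""
    return math.ceil(k) - 1


def optimal_n(sigma, d, alpha=0.05):
    """Rule (1): optimal sample size when sigma is known (not rounded)."""
    return (critical_point(alpha) * sigma / d) ** 2


def two_stage_N(pilot, d, alpha=0.05):
    """Rule (2): final sample size estimated from a pilot sample of size m."""
    m = len(pilot)
    s_m = math.sqrt(variance(pilot))  # sample standard deviation of the pilot
    k = (critical_point(alpha) * s_m / d) ** 2
    return max(m, largest_int_below(k) + 1)


def modified_m(d, alpha=0.05, gamma=1.0):
    """Rule (3): pilot size for the modified procedure.

    gamma is a hypothetical default chosen for illustration only.
    """
    k = (critical_point(alpha) / d) ** (2 / (gamma + 1))
    return max(2, largest_int_below(k) + 1)
```

For instance, `two_stage_N([1, 2, 3, 4, 5], d=1.0)` applies rule (2) to a five-point pilot sample, and `modified_m(0.164)` gives the pilot size that rule (3) would prescribe under the assumed `gamma`.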
3. Application of Bootstrapping
Note that the two-stage procedures described above are based on the normal approximation,
and the accuracy of the stopping rule will depend on this approximation. Thus we replace
$\xi_{\alpha/2}$ by the bootstrap critical point $\xi^*_{\alpha/2}$. Consider a bootstrap sample,
$X^*_1, X^*_2, \ldots, X^*_m$, of $X_1, X_2, \ldots, X_m$. Let $\bar{X}^*_m$ and $S^{*2}_m$ be the
sample mean and sample variance of the bootstrap sample. Now $\xi^*_{\alpha/2}$ is the
$(1 - \alpha)$th quantile of the distribution of $|\sqrt{m}\,(\bar{X}^*_m - \bar{X}_m)/S^*_m|$.
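A minimal Monte Carlo sketch of this bootstrap critical point is given below. It is an illustration under stated assumptions rather than the authors' implementation: the function names, the empirical-quantile convention (the `ceil((1 - alpha) * B)`-th order statistic), and the choice to skip resamples with zero variance are all assumptions made here.

```python
import math
import random
from statistics import mean, variance


def bootstrap_critical_point(pilot, alpha=0.05, n_boot=1000, rng=None):
    """Estimate the (1 - alpha)th quantile of |sqrt(m)(mean* - mean)/S*|."""
    rng = rng or random.Random()
    m = len(pilot)
    xbar = mean(pilot)
    stats = []
    for _ in range(n_boot):
        resample = rng.choices(pilot, k=m)  # draw m values with replacement
        s_star = math.sqrt(variance(resample))
        if s_star == 0.0:
            continue  # assumption: skip degenerate resamples
        stats.append(abs(math.sqrt(m) * (mean(resample) - xbar) / s_star))
    if not stats:
        raise ValueError("pilot sample has no variation")
    stats.sort()
    idx = min(len(stats) - 1, math.ceil((1 - alpha) * len(stats)) - 1)
    return stats[idx]


def bootstrap_two_stage_N(pilot, d, alpha=0.05, n_boot=1000, rng=None):
    """Two-stage rule with the normal critical point replaced by the bootstrap one."""
    xi_star = bootstrap_critical_point(pilot, alpha, n_boot, rng)
    s_m = math.sqrt(variance(pilot))
    k = (xi_star * s_m / d) ** 2
    # (largest integer less than k) + 1, floored at the pilot size m
    return max(len(pilot), (math.ceil(k) - 1) + 1)
```

Passing a seeded `random.Random` instance as `rng` makes the estimate reproducible; increasing `n_boot` reduces the Monte Carlo error in the estimated quantile.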
Table 1. Simulation results for the 95% confidence interval for $\mu$ using the
mixture population $0.6N(0, 1) + 0.4N(5, 1)$ (15000 simulations and 1000 bootstraps)

                   Two-Stage             Modified          with Bootstrap
  n*      d      N     s_N     p      N     s_N     p      N     s_N     p
1000    .164   1138   1.81   .957   1041    .94   .951   1000   1.73   .944
 500    .232    571    .91   .956    537    .63   .955    501    .86   .942
 200    .367    229    .36   .956    234    .44   .957    201    .35   .939
  50    .733     58    .09   .948     57    .33   .945     51    .08   .933
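As a consistency check on the first two columns of Table 1, and assuming that the first column lists the optimal sample size from (1) computed with the normal critical point 1.96, the variance of the mixture population can be worked out directly:

$$
\begin{aligned}
E(X) &= 0.6 \cdot 0 + 0.4 \cdot 5 = 2, \\
E(X^2) &= 0.6\,(1 + 0^2) + 0.4\,(1 + 5^2) = 11, \\
\sigma^2 &= E(X^2) - \{E(X)\}^2 = 11 - 4 = 7, \\
\frac{1.96^2 \cdot 7}{d^2} &\approx 999.8,\; 499.6,\; 199.7,\; 50.1
\quad \text{for } d = 0.164,\; 0.232,\; 0.367,\; 0.733 .
\end{aligned}
$$

These values agree with the tabulated 1000, 500, 200 and 50 after rounding.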
REFERENCES
Ghosh, M. and Mukhopadhyay, N. (1981). Consistency and asymptotic efficiency of two-stage
and sequential estimation procedures. Sankhya A 43, 220-227.
Mukhopadhyay, N. (1980). A consistent and asymptotically efficient two-stage procedure to
construct fixed-width confidence intervals for the mean. Metrika 27, 281-284.
Mukhopadhyay, N. and Solanky, T. K. S. (1994). Multistage Selection and Ranking Procedures:
Second-Order Asymptotics. Marcel Dekker, New York.
Stein, C. (1945). A two-sample test for a linear hypothesis whose power is independent of the
variance. Annals of Mathematical Statistics 16, 243-258.
SUMMARY
The bootstrap methodology is applied to Stein's two-stage procedure by means of
three special cases: the fixed-width confidence interval, minimum risk point estimation
and the linear regression problem. It is well known that the two-stage procedure
oversamples, but we show in this paper that the amount of oversampling can be
reduced significantly by applying the bootstrap. Asymptotic results were established
for the procedure used with the bootstrap. The moderate- and small-sample behaviours
were analysed by means of a large-scale simulation study.
