htestimate.Rd
Calculates Horvitz-Thompson estimate with different methods for variance estimation such as Yates and Grundy, Hansen-Hurwitz and Hajek.
htestimate(y, N, PI, pk, pik, method = 'yg')
y | vector of observations |
---|---|
N | integer for population size |
PI | square matrix of second order inclusion probabilities with |
pk | vector of first order inclusion probabilities of length |
pik | an optional vector of first order inclusion probabilities of length |
method | method to be used for variance estimation. Options are |
For using methods 'yg'
or 'ht'
has to be provided matrix PI
, and for 'hh'
and 'ha'
has to be specified vector pk
of inclusion probabilities.
Additionally, for Hajek method 'ha'
can be specified pik
. Unless, an approximate Hajek method is used.
The function htestimate
returns a value, which is a list consisting of the components
is a list of call components: y
observations, N
population size, PI
inclusion probabilities, pk
inclusion probabilities of sample, pik
full inclusion probabilities and method
method for variance estimation
mean estimate
standard error of the mean estimate
Kauermann, Goeran/Kuechenhoff, Helmut (2010): Stichproben. Methoden und praktische Umsetzung mit R. Springer.
Juliane Manitz
#> id district population cases #> Min. : 1001 LK Aachen : 1 Min. : 34719 Min. : 0.00 #> 1st Qu.: 5877 LK Ahrweiler : 1 1st Qu.: 104553 1st Qu.: 9.00 #> Median : 8331 LK Aichach-Friedberg: 1 Median : 145130 Median : 27.00 #> Mean : 8468 LK Alb-Donau-Kreis : 1 Mean : 193910 Mean : 44.58 #> 3rd Qu.: 9778 LK Altenburger Land : 1 3rd Qu.: 244154 3rd Qu.: 59.00 #> Max. :16077 LK Altenkirchen : 1 Max. :1770629 Max. :410.00 #> (Other) :418# pps.sampling() set.seed(108506) pps <- pps.sampling(z=influenza$population,n=20,method='midzuno') sample <- influenza[pps$sample,] # htestimate() N <- nrow(influenza) # exact variance estimate PI <- pps$PI htestimate(sample$cases, N=N, PI=PI, method='yg')#> #> htestimate object: Estimator for samples with probabilities proportional to size #> Method of Yates and Grundy: #> #> Mean estimator: 40.36766 #> Standard Error: 8.059507 #>htestimate(sample$cases, N=N, PI=PI, method='ht')#> #> htestimate object: Estimator for samples with probabilities proportional to size #> Method of Horvitz-Thompson: #> #> Mean estimator: 40.36766 #> Standard Error: 8.227719 #># approximate variance estimate pk <- pps$pik[pps$sample] htestimate(sample$cases, N=N, pk=pk, method='hh')#> #> htestimate object: Estimator for samples with probabilities proportional to size #> Method of Hansen-Hurwitz (approximate variance): #> #> Mean estimator: 40.36766 #> Standard Error: 8.534792 #>pik <- pps$pik htestimate(sample$cases, N=N, pk=pk, pik=pik, method='ha')#> #> htestimate object: Estimator for samples with probabilities proportional to size #> Method of Hajek (approximate variance): #> #> Mean estimator: 40.36766 #> Standard Error: 8.262482 #># without pik just approximate calculation of Hajek method htestimate(sample$cases, N=N, pk=pk, method='ha')#> Warning: Without input of 'pik' just approximative calculation of Hajek method is possible.#> #> htestimate object: Estimator for samples with probabilities proportional to size #> Method of Hajek (approximate variance): #> #> Mean estimator: 40.36766 #> Standard Error: 8.244296 #># calculate confidence interval based on normal distribution for number of cases est.ht <- htestimate(sample$cases, N=N, PI=PI, method='ht') est.ht$mean*N#> [1] 17115.89lower <- est.ht$mean*N - qnorm(0.975)*N*est.ht$se upper <- est.ht$mean*N + qnorm(0.975)*N*est.ht$se c(lower,upper)#> [1] 10278.45 23953.33#> [1] 18900