University of Ljubljana Doctoral Programme in Statistics Methodology of Statistical Research Written examination June 29
Methodology of Statistical Research, 2020/2021, M. Perman, M. Pohar-Perme.

1. (25) Suppose the population is stratified into $K$ strata of sizes $N_1, \ldots, N_K$. Denote by $\mu_k$ the population mean in stratum $k$ and by $\sigma_k^2$ the population variance in stratum $k$ for $k = 1, 2, \ldots, K$. Let $\mu$ be the population mean for the whole population and $\sigma^2$ the population variance for the whole population. Suppose a stratified sample is taken with sample sizes in each stratum equal to $n_1, n_2, \ldots, n_K$. Let $\bar X_k$ be the sample mean in stratum $k$ and let

$$\bar X = \sum_{k=1}^{K} \frac{N_k}{N} \bar X_k = \sum_{k=1}^{K} w_k \bar X_k .$$

a. (5) Compute $E\left[(\bar X_k - \bar X)^2\right]$.

Solution: We compute

$$\begin{aligned}
E\left[(\bar X_k - \bar X)^2\right]
&= \mathrm{var}(\bar X_k - \bar X) + \left(E(\bar X_k - \bar X)\right)^2 \\
&= \mathrm{var}(\bar X_k) + \mathrm{var}(\bar X) - 2\,\mathrm{cov}(\bar X_k, \bar X) + (\mu_k - \mu)^2 \\
&= \frac{\sigma_k^2}{n_k} \cdot \frac{N_k - n_k}{N_k - 1}
 + \sum_{i=1}^{K} w_i^2 \cdot \frac{\sigma_i^2}{n_i} \cdot \frac{N_i - n_i}{N_i - 1}
 - 2 w_k \cdot \frac{\sigma_k^2}{n_k} \cdot \frac{N_k - n_k}{N_k - 1}
 + (\mu_k - \mu)^2 .
\end{aligned}$$

b. (10) Suggest an unbiased estimator for the quantity

$$\gamma^2 = \sum_{k=1}^{K} w_k (\mu_k - \mu)^2 .$$

Explain why the suggested estimator is unbiased.

Solution: Since we have unbiased estimators $\hat\sigma_k^2$ for $\sigma_k^2$, the quantity

$$\hat\gamma_k^2 = (\bar X_k - \bar X)^2
 - \frac{\hat\sigma_k^2}{n_k} \cdot \frac{N_k - n_k}{N_k - 1}
 - \sum_{i=1}^{K} w_i^2 \cdot \frac{\hat\sigma_i^2}{n_i} \cdot \frac{N_i - n_i}{N_i - 1}
 + 2 w_k \cdot \frac{\hat\sigma_k^2}{n_k} \cdot \frac{N_k - n_k}{N_k - 1}$$

is an unbiased estimator of $(\mu_k - \mu)^2$. Multiplying $\hat\gamma_k^2$ by $w_k$ and summing over $k$ we get an unbiased estimator $\hat\gamma^2$ of $\gamma^2$.

c. (10) Suggest an unbiased estimator of the population variance $\sigma^2$. Explain why your estimator is unbiased. Hint: check that

$$\sigma^2 = \sum_{k=1}^{K} w_k \sigma_k^2 + \sum_{k=1}^{K} w_k (\mu_k - \mu)^2 .$$
Solution: We write

$$\sigma^2 = \sum_{k=1}^{K} w_k \sigma_k^2 + \gamma^2 .$$

Since both terms on the right can be estimated in an unbiased way, we have that

$$\hat\sigma^2 = \sum_{k=1}^{K} w_k \hat\sigma_k^2 + \hat\gamma^2$$

is an unbiased estimator of $\sigma^2$.
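As a sanity check on the estimators in problem 1, the following Python sketch enumerates every possible stratified sample from a tiny made-up population and averages $\hat\sigma^2$ over them, which gives its exact expectation. It assumes (the solution does not state this) that $\sigma_k^2$ is the variance with divisor $N_k$ and that it is estimated by $\hat\sigma_k^2 = \frac{N_k - 1}{N_k} s_k^2$, where $s_k^2$ is the usual sample variance with divisor $n_k - 1$. The function name `sigma2_hat` and the data are illustrative only.

```python
from itertools import combinations, product
from statistics import mean, pvariance, variance


def sigma2_hat(samples, sizes):
    """Estimate sigma^2 from one stratified sample.

    samples[k] holds the sampled values of stratum k, sizes[k] is N_k.
    """
    N = sum(sizes)
    w = [Nk / N for Nk in sizes]
    xbar_k = [mean(s) for s in samples]
    xbar = sum(wk * m for wk, m in zip(w, xbar_k))
    # Assumed unbiased estimator of sigma_k^2 (divisor-N_k population variance).
    s2 = [(Nk - 1) / Nk * variance(s) for s, Nk in zip(samples, sizes)]
    # Estimated var(Xbar_k), including the finite population correction.
    v = [s2k / len(s) * (Nk - len(s)) / (Nk - 1)
         for s2k, s, Nk in zip(s2, samples, sizes)]
    var_xbar = sum(wk ** 2 * vk for wk, vk in zip(w, v))
    # gamma_hat^2 = sum_k w_k * gamma_hat_k^2, as in part b.
    gamma2 = sum(wk * ((m - xbar) ** 2 - vk - var_xbar + 2 * wk * vk)
                 for wk, m, vk in zip(w, xbar_k, v))
    return sum(wk * s2k for wk, s2k in zip(w, s2)) + gamma2


# Made-up population with K = 2 strata and sample sizes n_1 = n_2 = 2.
strata = [[1.0, 4.0, 6.0, 9.0], [10.0, 14.0, 15.0]]
n = [2, 2]
sizes = [len(s) for s in strata]

# All stratified samples are equally likely, so the plain average over
# them is the exact expectation of the estimator.
all_samples = list(product(*(combinations(s, nk) for s, nk in zip(strata, n))))
expected = mean(sigma2_hat(smp, sizes) for smp in all_samples)
true_sigma2 = pvariance([x for s in strata for x in s])
print(expected, true_sigma2)
```

The two printed numbers should agree up to rounding error, which is what unbiasedness means for this small population.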
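2. (25) Assume the data $x_1, x_2, \ldots, x_n$ are an i.i.d. sample from the distribution with density

$$f(x) = \frac{\alpha}{2} |x|^{\alpha - 1} e^{-|x|^{\alpha}}$$

for $\alpha > 0$.

a. (15) Write the equation for the maximum likelihood estimate of $\alpha$. Compute the Fisher information $I(\alpha)$. Assume as known that

$$\int_0^\infty x^{2\alpha - 1} \log^2 x \, e^{-x^{\alpha}} \, dx = \frac{\pi^2}{6\alpha^3} - \frac{(2 - \gamma)\gamma}{\alpha^3}$$

where $\gamma = 0.577216$ is the Euler constant.

Solution: The log-likelihood function is given by

$$\ell(\alpha \mid x_1, \ldots, x_n) = n \log \alpha - n \log 2 + (\alpha - 1) \sum_{k=1}^{n} \log |x_k| - \sum_{k=1}^{n} |x_k|^{\alpha} .$$

Setting the derivative to 0 we get the equation

$$\frac{n}{\alpha} + \sum_{k=1}^{n} \log |x_k| - \sum_{k=1}^{n} |x_k|^{\alpha} \log |x_k| = 0 .$$

For the Fisher information we compute, for a single observation $x$,

$$\ell'' = -\frac{1}{\alpha^2} - |x|^{\alpha} \log^2 |x| .$$

The integrand below is even, so the integral over the real line is twice the integral over $(0, \infty)$, and we get

$$\begin{aligned}
I(\alpha) &= \frac{1}{\alpha^2} + \frac{\alpha}{2} \int_{-\infty}^{\infty} |x|^{2\alpha - 1} \log^2 |x| \, e^{-|x|^{\alpha}} \, dx \\
&= \frac{1}{\alpha^2} + \frac{\pi^2}{6\alpha^2} - \frac{(2 - \gamma)\gamma}{\alpha^2} .
\end{aligned}$$

b. (10) Suppose you knew the maximum likelihood estimate $\hat\alpha$. Write explicitly the approximate 99% confidence interval for $\alpha$.

Solution: The approximate standard error is given by

$$\mathrm{se}(\hat\alpha) = \sqrt{\frac{1}{n I(\hat\alpha)}}$$

and the standard normal quantile is $z_{0.995} \approx 2.576$. The approximate confidence interval is

$$\hat\alpha \pm 2.576 \cdot \mathrm{se}(\hat\alpha) .$$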
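The likelihood equation in problem 2 has no closed-form solution, so a numerical sketch may help. The Python code below solves the score equation by bisection, which works because the score is strictly decreasing in $\alpha$, and then forms the approximate 99% interval. It uses the closed form $I(\alpha) = \left(1 + \pi^2/6 - (2 - \gamma)\gamma\right)/\alpha^2$, which is what the integral identity quoted in the problem yields when substituted into $1/\alpha^2 + \frac{\alpha}{2}\int |x|^{2\alpha-1}\log^2|x|\,e^{-|x|^\alpha}dx$. The function names, the bracketing interval and the data are assumptions made for illustration, and the data must not contain zeros.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329


def score(alpha, xs):
    """Derivative of the log-likelihood with respect to alpha."""
    logs = [math.log(abs(x)) for x in xs]
    return (len(xs) / alpha + sum(logs)
            - sum(abs(x) ** alpha * lg for x, lg in zip(xs, logs)))


def mle_alpha(xs, lo=1e-6, hi=50.0, tol=1e-12):
    """Root of the score by bisection.

    Assumes score(lo) > 0 > score(hi); the bracket is a hypothetical
    default that suits the example data, not a general guarantee.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid, xs) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def fisher_info(alpha):
    """Fisher information of one observation (closed form from the lead-in)."""
    return (1 + math.pi ** 2 / 6 - (2 - EULER_GAMMA) * EULER_GAMMA) / alpha ** 2


def ci99(xs):
    """MLE of alpha and the approximate (Wald) 99% confidence interval."""
    a = mle_alpha(xs)
    z = NormalDist().inv_cdf(0.995)  # about 2.576
    se = 1 / math.sqrt(len(xs) * fisher_info(a))
    return a, (a - z * se, a + z * se)


# Made-up data, for illustration only.
xs = [-1.3, 0.4, 0.9, -0.7, 1.1, -0.2, 0.6, 1.5, -0.8, 0.3]
alpha_hat, (lower, upper) = ci99(xs)
print(alpha_hat, lower, upper)
```

With only ten observations the normal approximation is rough; the sketch is meant to show the mechanics, not to recommend the interval for samples this small.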
3. (25) Assume the observations $x_1, \ldots, x_n$ are an i.i.d. sample from the $\Gamma(2, \theta)$ distribution with density

$$f(x) = \theta^2 x e^{-\theta x}$$

for $x > 0$ and $\theta > 0$.

a. (5) Find the maximum likelihood estimator for the parameter $\theta$.

Solution: The log-likelihood function is

$$\ell(\theta \mid x) = 2n \log \theta + \sum_{k=1}^{n} \log x_k - \theta \sum_{k=1}^{n} x_k .$$

Equating the derivative to 0 we get

$$\hat\theta = \frac{2n}{\sum_{k=1}^{n} x_k} = \frac{2}{\bar x} .$$

b. (10) For the testing problem $H_0 : \theta = 1$ versus $H_1 : \theta \neq 1$ find the Wilks's test statistic $\lambda$. Describe when you would reject $H_0$ given that the size of the test is $\alpha$ with $\alpha \in (0, 1)$.

Solution: By definition

$$\lambda = 2\ell(\hat\theta) - 2\ell(1) .$$

Using the maximum likelihood estimator $\hat\theta$ we get

$$\lambda = -4n \log \frac{\bar x}{2} + 2n(\bar x - 2) .$$

By Wilks's theorem, under $H_0$ the distribution of the test statistic $\lambda$ is approximately $\chi^2(1)$. The null hypothesis is rejected when $\lambda > c_\alpha$, where $c_\alpha$ is such that $P(\chi^2(1) \geq c_\alpha) = \alpha$.

c. (10) The function

$$f(y) = -4n \log \frac{y}{2} + 2n(y - 2)$$

is strictly decreasing on $(0, 2)$ and strictly increasing on $(2, \infty)$. Assume that for all $c > \min_{y > 0} f(y)$ you can find the two solutions of the equation $f(y) = c$. Can you use this information to give an exact test given $\alpha \in (0, 1)$? Describe the procedure. No calculations are required. Hint: by properties of the gamma distribution, $\bar X \sim \Gamma(2n, n\theta)$, with the second parameter a rate as in the density above.

Solution: Given the assumptions we can find such a $c_\alpha$ that under $H_0$ we have

$$P_{H_0}\left(f(\bar X) \geq c_\alpha\right) = \alpha .$$

Let $x_1 < x_2$ be the solutions of the equation $f(x) = c_\alpha$. The test that rejects $H_0$ when either $\bar X < x_1$ or $\bar X > x_2$ is exact.
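The exact procedure in part c can be carried out numerically. The Python sketch below is one possible implementation under stated assumptions, not the examiners' method: it uses the fact that under $H_0$ the sum $\sum_k X_k$ has the $\Gamma(2n, 1)$ (Erlang) distribution, whose distribution function is a finite sum, and it finds $c_\alpha$, $x_1$ and $x_2$ by bisection. All function names and the choices $n = 5$, $\alpha = 0.05$ are hypothetical.

```python
import math
from statistics import NormalDist


def f(y, n):
    """Wilks statistic as a function of the sample mean y."""
    return -4 * n * math.log(y / 2) + 2 * n * (y - 2)


def erlang_cdf(s, shape):
    """P(S <= s) for S ~ Gamma(shape, rate 1) with integer shape."""
    term, total = 1.0, 1.0  # j = 0 term of sum_j s^j / j!
    for j in range(1, shape):
        term *= s / j
        total += term
    return 1.0 - math.exp(-s) * total


def solve(func, lo, hi, iters=100):
    """Bisection; func(lo) and func(hi) must have opposite signs."""
    lo_positive = func(lo) > 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (func(mid) > 0) == lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def endpoints(c, n):
    """The two solutions x1 < 2 < x2 of f(y) = c, for c > 0 = min f."""
    x1 = solve(lambda y: f(y, n) - c, 1e-12, 2.0)
    hi = 4.0
    while f(hi, n) < c:  # widen the bracket until it contains x2
        hi *= 2
    x2 = solve(lambda y: f(y, n) - c, 2.0, hi)
    return x1, x2


def size(c, n):
    """Exact P(f(Xbar) >= c) under H0, using n * Xbar ~ Gamma(2n, 1)."""
    x1, x2 = endpoints(c, n)
    return erlang_cdf(n * x1, 2 * n) + 1.0 - erlang_cdf(n * x2, 2 * n)


def exact_critical_value(alpha, n):
    """c_alpha with exact size alpha; size(c, n) is decreasing in c."""
    return solve(lambda c: size(c, n) - alpha, 1e-9, 100.0)


n, alpha = 5, 0.05
c_exact = exact_critical_value(alpha, n)
x1, x2 = endpoints(c_exact, n)
# Approximate critical value from Wilks's theorem: the chi-square(1)
# quantile equals the squared normal quantile, about 3.84 for alpha = 0.05.
c_approx = NormalDist().inv_cdf(1 - alpha / 2) ** 2
print(c_exact, c_approx, x1, x2)
```

Comparing `c_exact` with `c_approx` shows how far the asymptotic $\chi^2(1)$ cutoff is from the exact one for a given small $n$.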
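4. (25) Assume the regression model

$$\mathbf{Y} = \mathbf{X}\beta + \epsilon$$

where $E(\epsilon) = 0$ and $\mathrm{var}(\epsilon) = \sigma^2 \Sigma$, where $\Sigma$ is an invertible known matrix and $\sigma^2$ is an unknown parameter.

a. (5) Show that

$$\hat\beta = \left(\mathbf{X}^T \mathbf{X}\right)^{-1} \mathbf{X}^T \mathbf{Y}$$

is an unbiased estimate of the parameter $\beta$.

Solution: We compute

$$E(\hat\beta) = \left(\mathbf{X}^T \mathbf{X}\right)^{-1} \mathbf{X}^T E(\mathbf{Y}) .$$

Since $E(\mathbf{Y}) = \mathbf{X}\beta$ we have $E(\hat\beta) = \beta$.

b. (5) Show that

$$\tilde\beta = \left(\mathbf{X}^T \Sigma^{-1} \mathbf{X}\right)^{-1} \mathbf{X}^T \Sigma^{-1} \mathbf{Y}$$

is an unbiased estimate of the parameter $\beta$.

Solution: We compute

$$E(\tilde\beta) = \left(\mathbf{X}^T \Sigma^{-1} \mathbf{X}\right)^{-1} \mathbf{X}^T \Sigma^{-1} E(\mathbf{Y}) .$$

Since $E(\mathbf{Y}) = \mathbf{X}\beta$ we have $E(\tilde\beta) = \beta$.

c. (5) Compute the covariance matrix $\mathrm{cov}(\hat\beta - \tilde\beta, \tilde\beta)$.

Solution: Denote

$$\mathbf{A} = \left(\mathbf{X}^T \mathbf{X}\right)^{-1} \mathbf{X}^T \quad \text{and} \quad \mathbf{B} = \left(\mathbf{X}^T \Sigma^{-1} \mathbf{X}\right)^{-1} \mathbf{X}^T \Sigma^{-1} .$$

In this notation

$$\mathrm{cov}(\mathbf{A}\mathbf{Y} - \mathbf{B}\mathbf{Y}, \mathbf{B}\mathbf{Y}) = (\mathbf{A} - \mathbf{B})\,\mathrm{cov}(\mathbf{Y}, \mathbf{Y})\,\mathbf{B}^T .$$

Note that $\mathrm{cov}(\mathbf{Y}, \mathbf{Y}) = \sigma^2 \Sigma$. It is straightforward to check that

$$(\mathbf{A} - \mathbf{B}) \Sigma \mathbf{B}^T = 0 .$$

Indeed, since $\Sigma$ is symmetric, $\Sigma \mathbf{B}^T = \mathbf{X}\left(\mathbf{X}^T \Sigma^{-1} \mathbf{X}\right)^{-1}$, so both $\mathbf{A} \Sigma \mathbf{B}^T$ and $\mathbf{B} \Sigma \mathbf{B}^T$ equal $\left(\mathbf{X}^T \Sigma^{-1} \mathbf{X}\right)^{-1}$.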
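d. (10) Which of the two estimators for $\beta$ is better? Explain.

Solution: Write as in the Gauss-Markov theorem

$$\begin{aligned}
\mathrm{var}(\hat\beta) &= \mathrm{var}(\hat\beta - \tilde\beta + \tilde\beta) \\
&= \mathrm{var}(\hat\beta - \tilde\beta) + \mathrm{var}(\tilde\beta) + 2\,\mathrm{cov}(\hat\beta - \tilde\beta, \tilde\beta) \\
&= \mathrm{var}(\hat\beta - \tilde\beta) + \mathrm{var}(\tilde\beta) .
\end{aligned}$$

Since $\mathrm{var}(\hat\beta - \tilde\beta)$ is positive semidefinite, $\mathrm{var}(\hat\beta) - \mathrm{var}(\tilde\beta)$ is positive semidefinite as well. Both estimators are unbiased, so this means that $\tilde\beta$ is the better estimator of $\beta$.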
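The identities in parts c and d can be checked exactly in a small special case. The Python sketch below assumes a single regressor and a diagonal $\Sigma$ (both simplifications chosen here so that no matrix inversion is needed; the problem itself is general) and uses exact rational arithmetic. The numbers are made up, and $\sigma^2$ is set to 1 since it only scales all variances.

```python
from fractions import Fraction as F

# Hypothetical single-regressor design and diagonal Sigma.
x = [F(1), F(2), F(3), F(4)]
s = [F(1), F(2), F(1, 2), F(3)]  # diagonal entries of Sigma

sxx = sum(xi * xi for xi in x)                     # X^T X
sxsx = sum(xi * xi / si for xi, si in zip(x, s))   # X^T Sigma^{-1} X

# Row vectors A = (X^T X)^{-1} X^T and B = (X^T S^{-1} X)^{-1} X^T S^{-1}.
A = [xi / sxx for xi in x]
B = [xi / si / sxsx for xi, si in zip(x, s)]

# (A - B) Sigma B^T, i.e. cov(beta_hat - beta_tilde, beta_tilde) / sigma^2.
cross = sum((a - b) * si * b for a, b, si in zip(A, B, s))

var_ols = sum(a * a * si for a, si in zip(A, s))               # A Sigma A^T
var_gls = sum(b * b * si for b, si in zip(B, s))               # B Sigma B^T
var_diff = sum((a - b) ** 2 * si for a, b, si in zip(A, B, s))

print(cross, var_ols, var_gls, var_diff)
```

Because the arithmetic is exact, `cross` is exactly zero and `var_ols` equals `var_gls + var_diff`, mirroring the decomposition used in part d.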