
University of Ljubljana, Doctoral Programme in Statistics
Methodology of Statistical Research
Written examination, September 3rd, 2018


Methodology of Statistical Research, 2017/2018. M. Perman, M. Pohar-Perme.
Written examination, September 3rd, 2018 (with solutions).

Name and surname:
ID number:

Instructions. Read carefully the wording of the problem before you start. There are four problems altogether. You may use an A4 sheet of paper and a mathematical handbook. Please write all the answers on the sheets provided. You have two hours.

[Grading table: Problems 1-4, parts a.-d., Total]

1. (25) A population of size $N$ is subdivided into $K$ groups of equal size $M$, so that $N = KM$. A sample is chosen in the following way: first a simple random sample of $k$ groups out of the $K$ groups is chosen; in the second step a simple random sample of size $m$ is chosen from every group selected in the first step. The total sample size is $n = km$. The selection procedures in the first and the second step are assumed to be independent, and the selection procedures in different groups are assumed to be independent as well. Denote by $\mu_i$ the population average in the $i$-th group and by $\sigma_i^2$ its population variance, for $i = 1, 2, \ldots, K$. Let $\mu$ be the population average and denote
\[
\sigma_b^2 = \frac{1}{K}\sum_{i=1}^{K}(\mu_i - \mu)^2 = \frac{1}{K}\sum_{i=1}^{K}\mu_i^2 - \mu^2 .
\]
Let $\bar X$ be the sample average.

a. (5) Show that $\bar X$ is an unbiased estimator of the population mean $\mu$.

Solution: We write
\[
\bar X = \frac{1}{k}\sum_{i=1}^{K} I_i \bar X_i ,
\]
where $\bar X_i$ is the sample average in group $i$ and $I_i$ is the indicator of the event that the $i$-th group is chosen. By assumption $I_i$ and $\bar X_i$ are independent, and $E(\bar X_i) = \mu_i$, where $\mu_i$ is the population mean in group $i$. Furthermore, $P(I_i = 1) = k/K$. It follows that
\[
E(\bar X) = \frac{1}{k}\sum_{i=1}^{K}\frac{k}{K}\,\mu_i = \mu .
\]

b. (5) Define
\[
I_i = \begin{cases} 1 & \text{if the $i$-th group is selected,}\\ 0 & \text{else.}\end{cases}
\]
Write
\[
\bar X = \frac{1}{k}\sum_{i=1}^{K}\bar X_i I_i ,
\]
where $\bar X_i$ is the sample average in the $i$-th group for $i = 1, 2, \ldots, K$. Compute the standard error of $\bar X$.

Solution: We know that, for $i \neq j$,
\[
\operatorname{cov}(I_i, I_j) = -\frac{k}{K}\left(1 - \frac{k}{K}\right)\cdot\frac{1}{K-1} .
\]

We have
\[
\operatorname{var}(\bar X_i I_i) = E(\bar X_i^2)E(I_i) - E(\bar X_i)^2E(I_i)^2 = \left(\operatorname{var}(\bar X_i) + \mu_i^2\right)\frac{k}{K} - \mu_i^2\,\frac{k^2}{K^2}
\]
and, for $i \neq j$,
\[
\operatorname{cov}(\bar X_i I_i, \bar X_j I_j) = E(\bar X_i\bar X_j I_i I_j) - E(\bar X_i I_i)E(\bar X_j I_j) = \mu_i\mu_j E(I_i I_j) - \mu_i\mu_j E(I_i)E(I_j) = \mu_i\mu_j\operatorname{cov}(I_i, I_j) .
\]
Since within each selected group the sample is a simple random sample of size $m$ from $M$ units, $\operatorname{var}(\bar X_i) = \sigma_i^2(M-m)/\bigl(m(M-1)\bigr)$. It follows that
\[
\operatorname{var}(\bar X) = \frac{1}{k^2}\sum_{i=1}^{K}\operatorname{var}(\bar X_i I_i) + \frac{1}{k^2}\sum_{i\neq j}\operatorname{cov}(\bar X_i I_i, \bar X_j I_j)
\]
\[
= \frac{1}{Kk}\sum_{i=1}^{K}\frac{\sigma_i^2(M-m)}{m(M-1)} + \frac{1}{Kk}\left(1 - \frac{k}{K}\right)\sum_{i=1}^{K}\mu_i^2 - \frac{1}{Kk}\left(1 - \frac{k}{K}\right)\frac{1}{K-1}\sum_{i\neq j}\mu_i\mu_j
\]
\[
= \frac{1}{Kk}\sum_{i=1}^{K}\frac{\sigma_i^2(M-m)}{m(M-1)} + \frac{1}{Kk}\left(1 - \frac{k}{K}\right)\cdot\frac{K^2\sigma_b^2}{K-1}
= \frac{1}{Kk}\sum_{i=1}^{K}\frac{\sigma_i^2(M-m)}{m(M-1)} + \frac{(K-k)\sigma_b^2}{k(K-1)} ,
\]
where the third equality uses $\sum_{i\neq j}\mu_i\mu_j = K^2\mu^2 - \sum_{i=1}^{K}\mu_i^2$ and the definition of $\sigma_b^2$. The standard error of $\bar X$ is the square root of this expression.

c. (10) Let $\bar X_i$ be the sample average in the $i$-th group. Compute $E(\bar X_i^2 I_i)$ and $E(\bar X^2)$.

Solution: By independence
\[
E(\bar X_i^2 I_i) = \left(\operatorname{var}(\bar X_i) + \mu_i^2\right)\frac{k}{K} = \frac{k}{K}\left(\frac{\sigma_i^2(M-m)}{m(M-1)} + \mu_i^2\right),
\]
and by part b.
\[
E(\bar X^2) = \operatorname{var}(\bar X) + E(\bar X)^2 = \frac{1}{Kk}\sum_{i=1}^{K}\frac{\sigma_i^2(M-m)}{m(M-1)} + \frac{(K-k)\sigma_b^2}{k(K-1)} + \mu^2 .
\]
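Parts a.-c. can be checked with a small Monte Carlo simulation of the two-stage design. The sketch below is not part of the exam; the population values and the choices of $K$, $M$, $k$, $m$ are arbitrary and serve only to compare the empirical mean, variance and second moment of $\bar X$ with the formulas above.

# Monte Carlo check of E(Xbar) = mu, of the variance formula from part b.,
# and of E(Xbar^2) from part c. Illustrative sketch only; population values are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
K, M, k, m = 10, 20, 4, 5                      # K groups of size M; sample k groups, m units each
population = rng.normal(loc=rng.normal(0, 3, K)[:, None], scale=2, size=(K, M))

mu_i = population.mean(axis=1)                 # group means
sigma2_i = population.var(axis=1)              # group variances (divisor M)
mu = population.mean()
sigma2_b = np.mean((mu_i - mu) ** 2)

def xbar_once():
    groups = rng.choice(K, size=k, replace=False)                                 # SRS of k groups
    draws = [rng.choice(population[g], size=m, replace=False) for g in groups]    # SRS within each group
    return np.mean(draws)

sims = np.array([xbar_once() for _ in range(200_000)])

var_formula = (sigma2_i * (M - m) / (m * (M - 1))).sum() / (K * k) \
              + (K - k) * sigma2_b / (k * (K - 1))
print("E(Xbar):   empirical %.4f  vs  mu          %.4f" % (sims.mean(), mu))
print("var(Xbar): empirical %.4f  vs  formula     %.4f" % (sims.var(), var_formula))
print("E(Xbar^2): empirical %.4f  vs  var + mu^2  %.4f" % ((sims ** 2).mean(), var_formula + mu ** 2))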

d. (5) Suggest an unbiased estimator of $\sigma_b^2$. Explain why it is unbiased.

Solution: Let $\hat\sigma_i^2$ be an unbiased estimate of $\sigma_i^2$ computed from the sample in group $i$. The sum
\[
\hat\gamma = \frac{1}{k}\sum_{i=1}^{K}\hat\sigma_i^2 I_i
\]
is observable with expectation
\[
E(\hat\gamma) = \frac{1}{K}\sum_{i=1}^{K}\sigma_i^2 .
\]
The sum
\[
\hat\delta = \frac{1}{k}\sum_{i=1}^{K}\bar X_i^2 I_i
\]
is observable and, by part c., has expectation
\[
E(\hat\delta) = \frac{1}{K}\sum_{i=1}^{K}\left(\frac{\sigma_i^2(M-m)}{m(M-1)} + \mu_i^2\right).
\]
It follows that
\[
\hat\delta - \frac{M-m}{m(M-1)}\,\hat\gamma
\]
is an unbiased estimator of
\[
\frac{1}{K}\sum_{i=1}^{K}\mu_i^2 = \sigma_b^2 + \mu^2 .
\]
On the other hand the difference
\[
\bar X^2 - \frac{M-m}{km(M-1)}\,\hat\gamma
\]
is observable and an unbiased estimator of
\[
\frac{(K-k)\sigma_b^2}{k(K-1)} + \mu^2 .
\]
The difference of these two estimators is therefore an unbiased estimator of
\[
\sigma_b^2 - \frac{(K-k)\sigma_b^2}{k(K-1)} = \frac{K(k-1)}{k(K-1)}\,\sigma_b^2 ,
\]
which is a known nonzero multiple of $\sigma_b^2$ whenever $k \geq 2$. Dividing by the constant $K(k-1)/\bigl(k(K-1)\bigr)$ produces an unbiased estimator of $\sigma_b^2$.
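Continuing the simulation sketch from parts a.-c., the combination of $\hat\gamma$, $\hat\delta$ and $\bar X^2$ built above can be checked to have mean close to $\sigma_b^2$. Again an illustrative sketch only; the population construction and design parameters are the same arbitrary choices as before, and $\hat\sigma_i^2$ is obtained from the within-group sample variance with the finite-population correction.

# Monte Carlo check that the part d. estimator is (approximately) unbiased for sigma_b^2.
import numpy as np

rng = np.random.default_rng(1)
K, M, k, m = 10, 20, 4, 5
population = rng.normal(loc=rng.normal(0, 3, K)[:, None], scale=2, size=(K, M))
mu_i = population.mean(axis=1)
mu = population.mean()
sigma2_b = np.mean((mu_i - mu) ** 2)

def sigma_b2_hat():
    groups = rng.choice(K, size=k, replace=False)
    samples = np.array([rng.choice(population[g], size=m, replace=False) for g in groups])
    xbar_i = samples.mean(axis=1)
    xbar = samples.mean()
    # s_i^2 * (M-1)/M is unbiased for sigma_i^2 (divisor M) under SRS without replacement
    gamma = samples.var(axis=1, ddof=1).mean() * (M - 1) / M
    delta = (xbar_i ** 2).mean()
    A = delta - (M - m) / (m * (M - 1)) * gamma               # unbiased for sigma_b^2 + mu^2
    B = xbar ** 2 - (M - m) / (k * m * (M - 1)) * gamma       # unbiased for (K-k) sigma_b^2 / (k(K-1)) + mu^2
    return (A - B) / (K * (k - 1) / (k * (K - 1)))            # divide by the known constant

est = np.array([sigma_b2_hat() for _ in range(100_000)])
print("mean of estimator %.4f  vs  sigma_b^2 %.4f" % (est.mean(), sigma2_b))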

2. (25) Suppose the observed values $x_1, y_1, x_2, y_2, \ldots, x_n, y_n$ have been created as random variables $X_1, Y_1, X_2, Y_2, \ldots, X_n, Y_n$ which are all independent and normally distributed. Assume $X_k, Y_k \sim N(\mu_k, \sigma^2)$ for $k = 1, 2, \ldots, n$. For different $k$ the random variables $X_k, Y_k$ may have different expectations, but the variance $\sigma^2$ is always the same.

a. (10) Find the maximum likelihood estimates for the parameters $\mu_1, \ldots, \mu_n$ and $\sigma$.

Solution: The log-likelihood function is
\[
\ell(\mu_1,\ldots,\mu_n,\sigma \mid x_1,\ldots,x_n,y_1,\ldots,y_n) = -2n\log\sigma - n\log(2\pi) - \sum_{k=1}^{n}\frac{(x_k-\mu_k)^2 + (y_k-\mu_k)^2}{2\sigma^2}
\]
and the partial derivatives are
\[
\frac{\partial\ell}{\partial\mu_k} = \frac{x_k + y_k - 2\mu_k}{\sigma^2},
\qquad
\frac{\partial\ell}{\partial\sigma} = -\frac{2n}{\sigma} + \sum_{k=1}^{n}\frac{(x_k-\mu_k)^2 + (y_k-\mu_k)^2}{\sigma^3} .
\]
Setting the partial derivatives to $0$ and solving gives
\[
\hat\mu_k = \frac{x_k + y_k}{2},
\qquad
\hat\sigma = \left(\frac{1}{4n}\sum_{k=1}^{n}(x_k - y_k)^2\right)^{1/2} .
\]

b. (10) Which of the estimators for the parameters $\mu_1, \ldots, \mu_n$ and $\sigma^2$ (not $\sigma$) are unbiased? Can they be fixed to be unbiased?

Solution: Replace the observed values $x_1, \ldots, x_n, y_1, \ldots, y_n$ by the random variables $X_1, \ldots, X_n, Y_1, \ldots, Y_n$. We have $E(X_k) = E(Y_k) = \mu_k$ and hence $E(\hat\mu_k) = \mu_k$: the estimators for $\mu_k$ are unbiased. The MLE for $\sigma^2$ is
\[
\hat\sigma^2 = \frac{1}{4n}\sum_{k=1}^{n}(X_k - Y_k)^2 .
\]
Since $X_k - Y_k \sim N(0, 2\sigma^2)$ we have $E\bigl((X_k - Y_k)^2\bigr) = \operatorname{var}(X_k - Y_k) = 2\sigma^2$. It follows that $E(\hat\sigma^2) = \sigma^2/2$. The MLE for $\sigma^2$ is biased but can be fixed easily: the estimator
\[
\tilde\sigma^2 = \frac{1}{2n}\sum_{k=1}^{n}(X_k - Y_k)^2
\]
is unbiased.
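A short simulation sketch (not part of the exam; the values of $n$, $\sigma$ and the means below are arbitrary) illustrating part b.: the MLE $\hat\sigma^2$ concentrates around $\sigma^2/2$, while $\tilde\sigma^2$ is centred at $\sigma^2$.

# Simulation sketch for Problem 2 b.: bias of the MLE for sigma^2 and the corrected estimator.
import numpy as np

rng = np.random.default_rng(2)
n, sigma = 50, 1.5
mu = rng.uniform(-5, 5, n)                           # arbitrary, possibly different, means

reps = 100_000
X = rng.normal(mu, sigma, size=(reps, n))
Y = rng.normal(mu, sigma, size=(reps, n))

sigma2_mle = ((X - Y) ** 2).sum(axis=1) / (4 * n)    # hat sigma^2
sigma2_unb = ((X - Y) ** 2).sum(axis=1) / (2 * n)    # tilde sigma^2

print("E(hat sigma^2)   ~ %.4f  (sigma^2/2 = %.4f)" % (sigma2_mle.mean(), sigma ** 2 / 2))
print("E(tilde sigma^2) ~ %.4f  (sigma^2   = %.4f)" % (sigma2_unb.mean(), sigma ** 2))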

c. (5) Compute the variances of all unbiased estimators from part b. Hint: if $Z \sim N(0,1)$ then $\operatorname{var}(Z^2) = 2$.

Solution: We have
\[
\operatorname{var}(\hat\mu_k) = \frac{\operatorname{var}(X_k) + \operatorname{var}(Y_k)}{4} = \frac{\sigma^2}{2} .
\]
From the hint we infer that $\operatorname{var}\bigl((X_k - Y_k)^2\bigr) = 8\sigma^4$, and hence
\[
\operatorname{var}(\tilde\sigma^2) = \frac{1}{4n^2}\sum_{k=1}^{n}\operatorname{var}\bigl((X_k - Y_k)^2\bigr) = \frac{2\sigma^4}{n} .
\]
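In the same spirit, a short sketch (arbitrary values of $n$, $\sigma$ and one of the means) comparing the empirical variances with the closed-form answers above:

# Simulation sketch for Problem 2 c.: empirical variances vs. the closed-form answers.
import numpy as np

rng = np.random.default_rng(3)
n, sigma, mu_1 = 50, 1.5, 2.0                     # arbitrary illustrative values; mu_1 is one of the means
reps = 200_000

X1 = rng.normal(mu_1, sigma, reps)
Y1 = rng.normal(mu_1, sigma, reps)
mu1_hat = (X1 + Y1) / 2
print("var(mu_hat):        %.4f  vs  sigma^2 / 2   = %.4f" % (mu1_hat.var(), sigma ** 2 / 2))

# X_k - Y_k ~ N(0, 2 sigma^2) whatever the means are, so the means can be taken to be 0 here
D = rng.normal(0, sigma, (reps, n)) - rng.normal(0, sigma, (reps, n))
sigma2_unb = (D ** 2).sum(axis=1) / (2 * n)
print("var(tilde sigma^2): %.5f  vs  2 sigma^4 / n = %.5f" % (sigma2_unb.var(), 2 * sigma ** 4 / n))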

3. (25) Suppose your observed values are pairs $(x_1, y_1), \ldots, (x_n, y_n)$. Assume the usual regression model, i.e. that the observed values are "created" as
\[
Y_k = \alpha + \beta x_k + \epsilon_k ,
\]
where the $\epsilon_k$ are assumed to be independent and identically distributed with $\epsilon_k \sim N(0, \sigma^2)$ for $k = 1, 2, \ldots, n$. We would like to test
\[
H_0\colon \beta = 0 \quad\text{against}\quad H_1\colon \beta \neq 0 .
\]

a. (15) The likelihood function is
\[
L = \prod_{k=1}^{n}\frac{1}{\sqrt{2\pi}\,\sigma}\,e^{-\frac{(y_k - \alpha - \beta x_k)^2}{2\sigma^2}} .
\]
Find the likelihood ratio statistic for the testing problem.

Solution: In the unrestricted case the likelihood function attains its maximum when $\hat\alpha$ and $\hat\beta$ are the least squares estimates and
\[
\hat\sigma^2 = \frac{1}{n}\sum_{k=1}^{n}(y_k - \hat\alpha - \hat\beta x_k)^2 .
\]
In the restricted case we get
\[
\tilde\alpha = \frac{1}{n}\sum_{k=1}^{n}y_k = \bar y
\quad\text{and}\quad
\tilde\sigma^2 = \frac{1}{n}\sum_{k=1}^{n}(y_k - \bar y)^2 .
\]
By Wilks the likelihood ratio statistic is twice the difference of the maximized log-likelihoods:
\[
\lambda = 2\bigl(\ell(\hat\alpha,\hat\beta,\hat\sigma) - \ell(\tilde\alpha,\tilde\sigma)\bigr)
= -2n\log\hat\sigma + 2n\log\tilde\sigma - \frac{1}{\hat\sigma^2}\sum_{k=1}^{n}(y_k - \hat\alpha - \hat\beta x_k)^2 + \frac{1}{\tilde\sigma^2}\sum_{k=1}^{n}(y_k - \bar y)^2
= n\log\frac{\tilde\sigma^2}{\hat\sigma^2} ,
\]
since each of the two sums equals $n$ times the corresponding variance estimate, so the last two terms cancel.

b. (10) What is the distribution of Wilks's $\lambda$ statistic?

Solution: In the restricted case we restrict one parameter out of three, so the number of restricted parameters is $r = 3 - 2 = 1$. It follows that, asymptotically under $H_0$, $\lambda \sim \chi^2(1)$.
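A sketch checking part b. numerically (not part of the exam; the values of $n$, $\alpha$, $\sigma$ and the covariates are arbitrary): simulate data under $H_0$, compute $\lambda = n\log(\tilde\sigma^2/\hat\sigma^2)$ in each replication, and compare the rejection rate at the $\chi^2(1)$ critical value with the nominal 5%.

# Simulation sketch for Problem 3: under H0 the statistic n*log(tilde sigma^2 / hat sigma^2)
# should be approximately chi^2(1); we check the size of the nominal 5% test.
import numpy as np

rng = np.random.default_rng(4)
n, alpha, sigma = 100, 1.0, 2.0
x = rng.uniform(0, 10, n)                         # fixed covariates

reps = 20_000
lam = np.empty(reps)
for r in range(reps):
    y = alpha + 0 * x + rng.normal(0, sigma, n)   # data generated under H0: beta = 0
    b, a = np.polyfit(x, y, 1)                    # least squares slope and intercept
    sigma2_hat = np.mean((y - a - b * x) ** 2)    # unrestricted MLE of sigma^2
    sigma2_til = np.mean((y - y.mean()) ** 2)     # restricted MLE of sigma^2
    lam[r] = n * np.log(sigma2_til / sigma2_hat)

crit = 3.841                                      # 0.95 quantile of chi^2(1)
print("rejection rate under H0: %.3f (nominal 0.05)" % (lam > crit).mean())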

4. (25) Suppose that we have the regression model
\[
Y_{i1} = \alpha + \beta x_{i1} + \epsilon_i ,\qquad
Y_{i2} = \alpha + \beta x_{i2} + \eta_i ,
\]
where $i = 1, 2, \ldots, n$, and that $E(\epsilon_i) = E(\eta_i) = 0$, $\operatorname{var}(\epsilon_i) = \operatorname{var}(\eta_i) = \sigma^2$ and $\operatorname{cov}(\epsilon_i, \eta_i) = \rho\sigma^2$ for some correlation coefficient $\rho \in (-1, 1)$. Further assume that the pairs $(\epsilon_1, \eta_1), (\epsilon_2, \eta_2), \ldots, (\epsilon_n, \eta_n)$ are independent.

a. (5) Denote
\[
\mathbf X = \begin{pmatrix}
1 & x_{11}\\
1 & x_{12}\\
1 & x_{21}\\
1 & x_{22}\\
\vdots & \vdots\\
1 & x_{n1}\\
1 & x_{n2}
\end{pmatrix}
\quad\text{and}\quad
\mathbf Y = \begin{pmatrix}
Y_{11}\\ Y_{12}\\ Y_{21}\\ Y_{22}\\ \vdots\\ Y_{n1}\\ Y_{n2}
\end{pmatrix}.
\]
Is
\[
\begin{pmatrix}\hat\alpha\\ \hat\beta\end{pmatrix} = (\mathbf X^T\mathbf X)^{-1}\mathbf X^T\mathbf Y
\]
an unbiased estimator of the two regression parameters? Explain.

Solution: By the assumptions
\[
E(\mathbf Y) = \mathbf X\begin{pmatrix}\alpha\\ \beta\end{pmatrix}.
\]
Using this and the rules for expectations,
\[
E\!\left(\begin{pmatrix}\hat\alpha\\ \hat\beta\end{pmatrix}\right) = (\mathbf X^T\mathbf X)^{-1}\mathbf X^T E(\mathbf Y) = \begin{pmatrix}\alpha\\ \beta\end{pmatrix},
\]
so the estimator is unbiased.

b. (10) Suggest an unbiased estimator of $\sigma^2$.

Solution: One possibility is to use only every second observation, i.e. the pairs $(x_{i1}, Y_{i1})$ for $i = 1, \ldots, n$. These satisfy the ordinary regression model with uncorrelated errors, so the usual unbiased estimator of $\sigma^2$ (the residual sum of squares divided by $n - 2$) can be used.

c. (10) Suppose that $\rho$ is known and define new pairs
\[
\tilde Y_{i1} = (\sqrt{1-\rho} + \sqrt{1+\rho})\,Y_{i1} + (\sqrt{1-\rho} - \sqrt{1+\rho})\,Y_{i2},\qquad
\tilde Y_{i2} = (\sqrt{1-\rho} - \sqrt{1+\rho})\,Y_{i1} + (\sqrt{1-\rho} + \sqrt{1+\rho})\,Y_{i2},
\]
\[
\tilde x_{i1} = (\sqrt{1-\rho} + \sqrt{1+\rho})\,x_{i1} + (\sqrt{1-\rho} - \sqrt{1+\rho})\,x_{i2},\qquad
\tilde x_{i2} = (\sqrt{1-\rho} - \sqrt{1+\rho})\,x_{i1} + (\sqrt{1-\rho} + \sqrt{1+\rho})\,x_{i2},
\]
and
\[
\tilde\epsilon_i = (\sqrt{1-\rho} + \sqrt{1+\rho})\,\epsilon_i + (\sqrt{1-\rho} - \sqrt{1+\rho})\,\eta_i,\qquad
\tilde\eta_i = (\sqrt{1-\rho} - \sqrt{1+\rho})\,\epsilon_i + (\sqrt{1-\rho} + \sqrt{1+\rho})\,\eta_i .
\]
Define $\tilde{\mathbf Y}$ and $\tilde{\mathbf X}$ accordingly. The new pairs satisfy the equations
\[
\tilde Y_{i1} = \alpha_1 + \beta\tilde x_{i1} + \tilde\epsilon_i ,\qquad
\tilde Y_{i2} = \alpha_1 + \beta\tilde x_{i2} + \tilde\eta_i ,
\]
where $\alpha_1 = 2\sqrt{1-\rho}\,\alpha$.

Argue that this new model satisfies the usual conditions for regression models. What is then the best linear unbiased estimator of the regression parameters $\alpha$ and $\beta$? Explain.

Solution: We need to check that $E(\tilde\epsilon_i) = E(\tilde\eta_i) = 0$, which follows immediately from linearity. A direct computation gives
\[
\operatorname{var}(\tilde\epsilon_i) = \operatorname{var}(\tilde\eta_i) = 4(1-\rho^2)\sigma^2
\qquad\text{and}\qquad
\operatorname{cov}(\tilde\epsilon_i, \tilde\eta_i) = 0 ,
\]
so the transformed errors are uncorrelated with equal variances. By the Gauss-Markov theorem the best linear unbiased estimator of $\alpha_1$ and $\beta$ is the least squares estimator
\[
\begin{pmatrix}\hat\alpha_1\\ \hat\beta\end{pmatrix} = (\tilde{\mathbf X}^T\tilde{\mathbf X})^{-1}\tilde{\mathbf X}^T\tilde{\mathbf Y} .
\]
But because $\alpha$ and $\alpha_1$ differ by a known constant factor, it follows that $\hat\alpha = \hat\alpha_1/(2\sqrt{1-\rho})$ is the best linear unbiased estimate of $\alpha$.
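A simulation sketch for Problem 4 (not part of the exam; design points and parameter values are arbitrary): it checks that stacked OLS is unbiased for $(\alpha, \beta)$ despite the within-pair correlation (part a.), that the residual variance from every second observation is unbiased for $\sigma^2$ (part b.), and that the transformed errors of part c. have variance $4(1-\rho^2)\sigma^2$ and zero covariance, with $\hat\alpha_1/(2\sqrt{1-\rho})$ recovering $\alpha$.

# Simulation sketch for Problem 4. All parameter values and design points are arbitrary.
import numpy as np

rng = np.random.default_rng(5)
n, alpha, beta, sigma, rho = 30, 1.0, 0.5, 1.0, 0.7
a, b = np.sqrt(1 - rho), np.sqrt(1 + rho)
x = rng.uniform(0, 10, (n, 2))                                 # columns x_{i1}, x_{i2}
X = np.column_stack([np.ones(2 * n), x.reshape(-1)])           # stacked design matrix of part a.
X1 = np.column_stack([np.ones(n), x[:, 0]])                    # design using every second observation
xt = np.column_stack([(a + b) * x[:, 0] + (a - b) * x[:, 1],
                      (a - b) * x[:, 0] + (a + b) * x[:, 1]])  # transformed covariates of part c.
Xt = np.column_stack([np.ones(2 * n), xt.reshape(-1)])
cov = sigma ** 2 * np.array([[1, rho], [rho, 1]])

reps = 20_000
coefs, s2, alpha_c = np.empty((reps, 2)), np.empty(reps), np.empty(reps)
for r in range(reps):
    e = rng.multivariate_normal([0, 0], cov, size=n)           # correlated pairs (eps_i, eta_i)
    y = alpha + beta * x + e
    # part a.: OLS on the stacked data
    coefs[r] = np.linalg.lstsq(X, y.reshape(-1), rcond=None)[0]
    # part b.: usual unbiased variance estimate from the first observation of each pair only
    res = y[:, 0] - X1 @ np.linalg.lstsq(X1, y[:, 0], rcond=None)[0]
    s2[r] = res @ res / (n - 2)
    # part c.: OLS after the transformation, then alpha = alpha_1 / (2 sqrt(1 - rho))
    yt = np.column_stack([(a + b) * y[:, 0] + (a - b) * y[:, 1],
                          (a - b) * y[:, 0] + (a + b) * y[:, 1]])
    alpha_c[r] = np.linalg.lstsq(Xt, yt.reshape(-1), rcond=None)[0][0] / (2 * a)

# properties of the transformed errors, checked on one large draw
e = rng.multivariate_normal([0, 0], cov, size=500_000)
et = np.column_stack([(a + b) * e[:, 0] + (a - b) * e[:, 1],
                      (a - b) * e[:, 0] + (a + b) * e[:, 1]])

print("part a.  E(alpha_hat, beta_hat) ~", coefs.mean(axis=0), " true:", (alpha, beta))
print("part b.  E(sigma2_hat) ~ %.4f   true sigma^2 = %.4f" % (s2.mean(), sigma ** 2))
print("part c.  var(eps~), var(eta~) ~", et.var(axis=0), " formula:", 4 * (1 - rho ** 2) * sigma ** 2)
print("part c.  cov(eps~, eta~) ~ %.4f  (should be about 0)" % np.cov(et.T)[0, 1])
print("part c.  E(alpha_1_hat / (2 sqrt(1-rho))) ~ %.4f  true alpha = %.4f" % (alpha_c.mean(), alpha))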

