
University of Ljubljana Doctoral Programme in Statistics Methodology of Statistical Research Written examination September 2


Academic year: 2022


University of Ljubljana, Doctoral Programme in Statistics
Methodology of Statistical Research
Written examination, September 2nd, 2019

Solutions

Name and surname:
ID number:

Instructions

Read the wording of each problem carefully before you start. There are four problems altogether. You may use an A4 sheet of paper and a mathematical handbook. Please write all the answers on the sheets provided. You have two hours.

Score table: Problems 1, 2, 3, 4 and Total, with parts a., b., c., d.

Methodology of Statistical Research, 2018/2019, M. Perman, M. Pohar-Perme

1. (25) In a population of size $N$ there are two types of units: A and B. If a unit is selected into a sample and is asked about its type, the response is not necessarily truthful. A unit of type A will say that it is of type A with probability $p_{AA}$ and that it is of type B with probability $p_{AB}$, and similarly for units of type B. Assume that units choose their responses independently of each other and independently of the sampling procedure. Assume all the probabilities $p_{XY}$ are known. Suppose a simple random sample of size $n$ is chosen from the population. Denote the proportion of units of type A by $a$ and the proportion of units of type B by $b$. We would like to estimate $a$ and $b$.

a. (5) Let $N_A$ be the number of units in the sample of type A and $N_B$ the number of units in the sample of type B. Let $M_A$ be the number of units in the sample who respond A, and $M_B$ the number of units in the sample who respond B. Calculate $E(M_A \mid N_A = n_A, N_B = n_B)$.

Solution: The units of type A in the sample respond A with probability $p_{AA}$. By assumption their responses are independent, so the conditional distribution of the number of truthful responses is binomial with parameters $n_A$ and $p_{AA}$. A similar argument holds for units of type B, which respond A with probability $p_{BA}$. It follows that

$$E(M_A \mid N_A = n_A, N_B = n_B) = n_A p_{AA} + n_B p_{BA}.$$

b. (5) Suggest unbiased estimates for the proportions $a$ and $b$. When is it possible to estimate $a$ and $b$?

Solution: By the formula for total expectation we have

$$E(M_A) = n a p_{AA} + n b p_{BA} \quad\text{and}\quad E(M_B) = n - E(M_A) = n a p_{AB} + n b p_{BB}.$$

If $p_{AA} p_{BB} - p_{AB} p_{BA} \neq 0$ we can solve for $a$ to get

$$a = \frac{(p_{BB} + p_{BA}) \frac{1}{n} E(M_A) - p_{BA}}{p_{AA} p_{BB} - p_{AB} p_{BA}}.$$

The quantity $M_A$ is observed. By linearity of expectation,

$$\hat a = \frac{(p_{BB} + p_{BA}) \frac{1}{n} M_A - p_{BA}}{p_{AA} p_{BB} - p_{AB} p_{BA}}$$

is an unbiased estimate of $a$, and $\hat b = 1 - \hat a$ is an unbiased estimate of $b$.
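The estimator from part b can be sketched in a few lines of Python. This is only an illustration, not part of the exam solution: the function name `estimate_a` and the numerical values are made up, and the sketch assumes every unit answers either A or B, so that $p_{AB} = 1 - p_{AA}$ and $p_{BB} = 1 - p_{BA}$.

```python
def estimate_a(m_a, n, p_aa, p_ba):
    """Estimate the proportion a from the number m_a of 'A' responses.

    Assumption: every unit answers A or B, so p_AB = 1 - p_AA, p_BB = 1 - p_BA.
    """
    p_ab = 1.0 - p_aa
    p_bb = 1.0 - p_ba
    det = p_aa * p_bb - p_ab * p_ba
    if abs(det) < 1e-12:
        # Both types answer A equally often, so responses say nothing about a.
        raise ValueError("p_AA*p_BB - p_AB*p_BA = 0: a and b cannot be estimated")
    return ((p_bb + p_ba) * m_a / n - p_ba) / det


# Plugging the expected count E(M_A) = n*(a*p_AA + b*p_BA) into the estimator
# returns a (up to rounding); for a linear estimator this is unbiasedness.
a, n, p_aa, p_ba = 0.3, 100, 0.9, 0.2  # hypothetical values
expected_m_a = n * (a * p_aa + (1 - a) * p_ba)
print(estimate_a(expected_m_a, n, p_aa, p_ba))
```

The guard on the determinant mirrors the identifiability condition in the solution: when $p_{AA} = p_{BA}$ the denominator vanishes and no estimate is available.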

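c. (5) Compute $\mathrm{var}(M_A \mid N_A = n_A, N_B = n_B)$. Use this to compute $\mathrm{var}(M_A)$. Hint: $N_A \sim \mathrm{HyperGeom}(n, Na, N)$.

Solution: Conditionally on $\{N_A = n_A, N_B = n_B\}$ the random variable $M_A$ is the sum of two independent binomial random variables. We have

$$\mathrm{var}(M_A \mid N_A = n_A, N_B = n_B) = n_A p_{AA}(1 - p_{AA}) + n_B p_{BA}(1 - p_{BA}).$$

Consequently

$$E(M_A^2 \mid N_A = n_A, N_B = n_B) = n_A p_{AA}(1 - p_{AA}) + n_B p_{BA}(1 - p_{BA}) + n_A^2 p_{AA}^2 + n_B^2 p_{BA}^2 + 2 n_A n_B p_{AA} p_{BA}.$$

We have $E(N_A) = na$, $E(N_B) = nb$,

$$E(N_A^2) = \frac{nab(N - n)}{N - 1} + n^2 a^2 \quad\text{and}\quad E(N_B^2) = \frac{nab(N - n)}{N - 1} + n^2 b^2,$$

and, since $N_B = n - N_A$,

$$E(N_A N_B) = n^2 ab - \frac{nab(N - n)}{N - 1}.$$

Unconditioning we get

$$E(M_A^2) = n a p_{AA}(1 - p_{AA}) + n b p_{BA}(1 - p_{BA}) + \left(\frac{nab(N - n)}{N - 1} + n^2 a^2\right) p_{AA}^2 + \left(\frac{nab(N - n)}{N - 1} + n^2 b^2\right) p_{BA}^2 + 2\left(n^2 ab - \frac{nab(N - n)}{N - 1}\right) p_{AA} p_{BA}.$$

We have $E(M_A) = n a p_{AA} + n b p_{BA}$ and consequently

$$\mathrm{var}(M_A) = n a p_{AA}(1 - p_{AA}) + n b p_{BA}(1 - p_{BA}) + nab\,\frac{(N - n)(p_{AA} - p_{BA})^2}{N - 1}.$$

d. (10) Compute the standard error of the unbiased estimate $\hat a$.

Solution: Since $\hat a$ is a linear function of $M_A$, we have

$$\mathrm{var}(\hat a) = \frac{(p_{BB} + p_{BA})^2\, \mathrm{var}(M_A)}{n^2 (p_{AA} p_{BB} - p_{AB} p_{BA})^2},$$

and the standard error is the square root of this expression.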
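As a sanity check on part c, the variance of $M_A$ can be computed directly from its exact distribution and compared with the closed form $n a p_{AA}(1 - p_{AA}) + n b p_{BA}(1 - p_{BA}) + nab(N - n)(p_{AA} - p_{BA})^2/(N - 1)$ that the law of total variance gives. The sketch below is illustrative only: the function names and the small population used are hypothetical, `K` stands for $Na$, and it again assumes $p_{AB} = 1 - p_{AA}$ and $p_{BB} = 1 - p_{BA}$.

```python
from math import comb, sqrt


def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def var_ma_exact(N, K, n, p_aa, p_ba):
    """Variance of M_A from its exact distribution (K = N*a units of type A)."""
    m1 = m2 = 0.0
    for n_a in range(max(0, n - (N - K)), min(n, K) + 1):
        # Hypergeometric weight of having n_a units of type A in the sample.
        w = comb(K, n_a) * comb(N - K, n - n_a) / comb(N, n)
        n_b = n - n_a
        for i in range(n_a + 1):          # type A units answering A
            for j in range(n_b + 1):      # type B units answering A
                pr = w * binom_pmf(i, n_a, p_aa) * binom_pmf(j, n_b, p_ba)
                m1 += pr * (i + j)
                m2 += pr * (i + j) ** 2
    return m2 - m1**2


def var_ma_formula(N, K, n, p_aa, p_ba):
    """Closed form obtained from the law of total variance."""
    a = K / N
    b = 1 - a
    return (n * a * p_aa * (1 - p_aa) + n * b * p_ba * (1 - p_ba)
            + n * a * b * (N - n) / (N - 1) * (p_aa - p_ba) ** 2)


def se_a_hat(N, K, n, p_aa, p_ba):
    """Standard error of a-hat; uses p_AA*p_BB - p_AB*p_BA = p_AA - p_BA."""
    return sqrt(var_ma_formula(N, K, n, p_aa, p_ba)) / (n * abs(p_aa - p_ba))


# Hypothetical small population: N = 20 units, 8 of type A, sample of size 6.
print(var_ma_exact(20, 8, 6, 0.9, 0.2), var_ma_formula(20, 8, 6, 0.9, 0.2))
```

The two printed numbers should agree up to floating-point rounding.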

2. (25) Assume that the data $x_1, x_2, \ldots, x_n$ are an i.i.d. sample from the discrete distribution given by

$$P(X_1 = x) = (x - 1) p^2 (1 - p)^{x - 2}$$

for $x = 2, 3, \ldots$

a. (10) Find the maximum likelihood estimate of the parameter $p$.

Solution: The log-likelihood function is

$$\ell(p, x) = \sum_{k=1}^n \log(x_k - 1) + 2n \log p + \log(1 - p) \sum_{k=1}^n (x_k - 2).$$

Taking derivatives we get

$$\frac{\partial \ell}{\partial p} = \frac{2n}{p} - \frac{\sum_{k=1}^n (x_k - 2)}{1 - p}.$$

Setting the derivative to 0 we get

$$\hat p = \frac{2n}{\sum_{k=1}^n x_k}.$$

b. (10) Compute the Fisher information matrix $I(p)$.

Solution: We compute the second derivative of the log-likelihood function for $n = 1$. We get

$$\frac{\partial^2 \ell}{\partial p^2} = -\frac{2}{p^2} - \frac{x_1 - 2}{(1 - p)^2}.$$

Replace $x_1$ by $X_1$, take the expectation and change the sign. We get

$$I(p) = \frac{2}{p^2} + \frac{E(X_1) - 2}{(1 - p)^2}.$$

To compute $E(X_1)$ we either notice that $X_1 \sim \mathrm{NegBin}(2, p)$ and hence $E(X_1) = 2/p$, or we write

$$E(X_1) = \sum_{k=2}^\infty k(k - 1) p^2 (1 - p)^{k - 2}$$

and notice that

$$\sum_{k=2}^\infty k(k - 1) x^{k - 2} = \frac{d^2}{dx^2}\left(\sum_{k=0}^\infty x^k\right) = \frac{2}{(1 - x)^3},$$

which gives us the same result. Finally,

$$I(p) = \frac{2}{p^2 (1 - p)}.$$

c. (5) Write the 95% confidence interval for the parameter $p$ based on $x_1, x_2, \ldots, x_n$.

Solution: The interval is

$$\hat p \pm 1.96 \cdot \frac{\hat p \sqrt{1 - \hat p}}{\sqrt{2n}}.$$
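The three parts of this problem fit together as $\hat p = 2n/\sum x_k$, $I(p) = 2/(p^2(1 - p))$ and a half-width of $1.96/\sqrt{n I(\hat p)}$. A minimal Python sketch, with hypothetical function names and a made-up data set, could look as follows.

```python
from math import sqrt


def fisher_information(p):
    """Fisher information of a single observation, I(p) = 2 / (p^2 (1 - p))."""
    return 2.0 / (p**2 * (1.0 - p))


def mle_with_ci(xs, z=1.96):
    """Return (p_hat, lower, upper) for observations xs taking values 2, 3, ..."""
    n = len(xs)
    p_hat = 2.0 * n / sum(xs)
    # Same as z / sqrt(n * I(p_hat)), written so that p_hat = 1 does not divide by 0.
    half_width = z * p_hat * sqrt(1.0 - p_hat) / sqrt(2.0 * n)
    return p_hat, p_hat - half_width, p_hat + half_width


# Made-up sample of size 4 with sum 12, so p_hat = 8/12.
print(mle_with_ci([2, 3, 4, 3]))
```

The interval is the usual Wald interval, so for small $n$ or $\hat p$ close to 1 its endpoints may fall outside $(0, 1)$.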

3. (25) Assume that your observations are pairs $(x_1, y_1), \ldots, (x_n, y_n)$. Assume the pairs are an i.i.d. sample from the density

$$f_{X,Y}(x, y) = e^{-x} \cdot \frac{1}{\sqrt{2\pi x}\,\sigma}\, e^{-\frac{(y - \theta x)^2}{2\sigma^2 x}}$$

for $\sigma > 0$, $x > 0$, $-\infty < y < \infty$. We would like to test the hypothesis $H_0: \theta = 0$ versus $H_1: \theta \neq 0$.

a. (10) Find the maximum likelihood estimates for $\theta$ and $\sigma$.

Solution: The log-likelihood function is

$$\ell(\theta, \sigma \mid x, y) = -\sum_{k=1}^n x_k - \frac{1}{2}\sum_{k=1}^n \log x_k - \frac{n}{2}\log(2\pi) - n \log \sigma - \sum_{k=1}^n \frac{(y_k - \theta x_k)^2}{2\sigma^2 x_k}.$$

Take partial derivatives to get

$$\frac{\partial \ell}{\partial \theta} = \sum_{k=1}^n \frac{y_k - \theta x_k}{\sigma^2}, \qquad \frac{\partial \ell}{\partial \sigma} = -\frac{n}{\sigma} + \sum_{k=1}^n \frac{(y_k - \theta x_k)^2}{\sigma^3 x_k}.$$

Set the partial derivatives to 0. From the first equation we have

$$\hat\theta = \frac{\sum_{k=1}^n y_k}{\sum_{k=1}^n x_k}$$

and from the second

$$\hat\sigma^2 = \frac{1}{n}\sum_{k=1}^n \frac{(y_k - \hat\theta x_k)^2}{x_k}.$$

b. (15) Find the likelihood ratio statistic for testing the above hypothesis. What is the approximate distribution of the test statistic under $H_0$?

Solution: If $\theta = 0$ the log-likelihood function attains its maximum for

$$\hat\sigma_0^2 = \frac{1}{n}\sum_{k=1}^n \frac{y_k^2}{x_k}.$$

Since $\lambda = 2\left[\ell(\hat\theta, \hat\sigma) - \ell(0, \hat\sigma_0)\right] = n \log(\hat\sigma_0^2 / \hat\sigma^2)$, it follows that

$$\lambda = -n \log\left(1 - \frac{\left(\sum_{k=1}^n y_k\right)^2}{\sum_{k=1}^n x_k \, \sum_{k=1}^n \frac{y_k^2}{x_k}}\right).$$

The approximate distribution of $\lambda$ is $\chi^2(1)$.
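The likelihood ratio statistic can be computed either from the two variance estimates, as $n \log(\hat\sigma_0^2/\hat\sigma^2)$ with $\hat\sigma_0^2 = \frac{1}{n}\sum y_k^2/x_k$ the estimate under $\theta = 0$, or from the closed form $-n \log\bigl(1 - (\sum y_k)^2 / (\sum x_k \sum y_k^2/x_k)\bigr)$. The following Python sketch does both on made-up data; the function names are hypothetical, and the p-value uses the identity $P(\chi^2_1 > \lambda) = \mathrm{erfc}(\sqrt{\lambda/2})$.

```python
from math import erfc, log, sqrt


def lrt_theta_zero(xs, ys):
    """Return (theta_hat, lambda, approximate p-value) for H0: theta = 0."""
    n = len(xs)
    theta_hat = sum(ys) / sum(xs)
    sigma2_hat = sum((y - theta_hat * x) ** 2 / x for x, y in zip(xs, ys)) / n
    sigma2_null = sum(y**2 / x for x, y in zip(xs, ys)) / n
    # max(...) guards against a tiny negative value caused by rounding.
    lam = max(n * log(sigma2_null / sigma2_hat), 0.0)
    p_value = erfc(sqrt(lam / 2.0))  # upper tail of chi-square with 1 d.f.
    return theta_hat, lam, p_value


def lrt_closed_form(xs, ys):
    """Closed-form expression for the same statistic."""
    n = len(xs)
    ratio = sum(ys) ** 2 / (sum(xs) * sum(y**2 / x for x, y in zip(xs, ys)))
    return -n * log(1.0 - ratio)


# Made-up data, for illustration only.
xs = [1.0, 2.0, 0.5, 1.5]
ys = [0.3, -0.4, 0.2, 1.0]
print(lrt_theta_zero(xs, ys), lrt_closed_form(xs, ys))
```

With only four observations the $\chi^2(1)$ approximation is rough; the example is meant to show the computation, not to support a conclusion.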

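4. (25) Let the regression model be given by

$$Y_{k1} = \alpha + \beta x_{k1} + \epsilon_{k1}, \qquad Y_{k2} = \alpha + \beta x_{k2} + \epsilon_{k2}$$

for $k = 1, 2, \ldots, n$. Assume that $E(\epsilon_{k1}) = E(\epsilon_{k2}) = 0$, $\mathrm{var}(\epsilon_{k1}) = \mathrm{var}(\epsilon_{k2}) = \sigma^2 + \tau^2$ and $\mathrm{cov}(\epsilon_{k1}, \epsilon_{k2}) = \sigma^2$. For $k \neq l$ we have $\mathrm{cov}(\epsilon_{ki}, \epsilon_{lj}) = 0$ for $i, j = 1, 2$. Assume that the ratio $\rho^2 = \tau^2/\sigma^2$ is known. Let

$$a = \rho + \sqrt{2 + \rho^2} \quad\text{and}\quad b = \rho - \sqrt{2 + \rho^2}.$$

a. (10) Compute $\mathrm{var}(a\epsilon_{k1} + b\epsilon_{k2})$, $\mathrm{var}(b\epsilon_{k1} + a\epsilon_{k2})$ and $\mathrm{cov}(a\epsilon_{k1} + b\epsilon_{k2}, b\epsilon_{k1} + a\epsilon_{k2})$.

Solution: We compute $\mathrm{var}(V\epsilon_k)$, where $\epsilon_k = (\epsilon_{k1}, \epsilon_{k2})^T$ and $V$ is the first matrix in the product below:

$$\mathrm{var}(V\epsilon_k) = V \begin{pmatrix} \sigma^2 + \tau^2 & \sigma^2 \\ \sigma^2 & \sigma^2 + \tau^2 \end{pmatrix} V^T = \sigma^2 \begin{pmatrix} 1 + \rho^2 & -1 \\ 0 & \rho\sqrt{2 + \rho^2} \end{pmatrix} \begin{pmatrix} 1 + \rho^2 & 1 \\ 1 & 1 + \rho^2 \end{pmatrix} \begin{pmatrix} 1 + \rho^2 & 0 \\ -1 & \rho\sqrt{2 + \rho^2} \end{pmatrix} = \sigma^2 \rho^2 (1 + \rho^2)(2 + \rho^2) \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$

b. (10) Find explicitly the best unbiased linear estimates of the parameters $\alpha$ and $\beta$.

Solution: For $k = 1, 2, \ldots, n$ define

$$\tilde Y_{k1} = (1 + \rho^2) Y_{k1} - Y_{k2}, \qquad \tilde Y_{k2} = \rho\sqrt{2 + \rho^2}\, Y_{k2},$$

and

$$\tilde Y = \begin{pmatrix} \tilde Y_{11} \\ \tilde Y_{12} \\ \tilde Y_{21} \\ \tilde Y_{22} \\ \vdots \\ \tilde Y_{n1} \\ \tilde Y_{n2} \end{pmatrix} \quad\text{and}\quad \tilde X = \begin{pmatrix} \rho^2 & (1 + \rho^2) x_{11} - x_{12} \\ \rho\sqrt{2 + \rho^2} & \rho\sqrt{2 + \rho^2}\, x_{12} \\ \rho^2 & (1 + \rho^2) x_{21} - x_{22} \\ \rho\sqrt{2 + \rho^2} & \rho\sqrt{2 + \rho^2}\, x_{22} \\ \vdots & \vdots \\ \rho^2 & (1 + \rho^2) x_{n1} - x_{n2} \\ \rho\sqrt{2 + \rho^2} & \rho\sqrt{2 + \rho^2}\, x_{n2} \end{pmatrix}.$$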

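Define also

$$\tilde\epsilon_{k1} = (1 + \rho^2)\epsilon_{k1} - \epsilon_{k2}, \qquad \tilde\epsilon_{k2} = \rho\sqrt{2 + \rho^2}\, \epsilon_{k2},$$

and

$$\tilde\epsilon = (\tilde\epsilon_{11}, \tilde\epsilon_{12}, \tilde\epsilon_{21}, \tilde\epsilon_{22}, \ldots, \tilde\epsilon_{n1}, \tilde\epsilon_{n2})^T.$$

For the model

$$\tilde Y = \tilde X \begin{pmatrix} \alpha \\ \beta \end{pmatrix} + \tilde\epsilon$$

the usual assumptions for linear regression models are met. By the Gauss-Markov theorem we have that

$$\begin{pmatrix} \hat\alpha \\ \hat\beta \end{pmatrix} = \left(\tilde X^T \tilde X\right)^{-1} \tilde X^T \tilde Y$$

is the BLUE for $\alpha$ and $\beta$.

c. (5) Give the standard error for the BLUE of $\beta$.

Solution: In the new model the $\tilde\epsilon_{ki}$ are uncorrelated with variance

$$\mathrm{var}(\tilde\epsilon_{ki}) = \sigma^2 \rho^2 (1 + \rho^2)(2 + \rho^2).$$

Denote $\tilde C = \left(\tilde X^T \tilde X\right)^{-1}$ and let $\tilde c_{22}$ be its $(2, 2)$ entry. By known formulae we have

$$\mathrm{se}(\hat\beta) = \sigma \rho \sqrt{(1 + \rho^2)(2 + \rho^2)}\, \sqrt{\tilde c_{22}}.$$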
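The decorrelation step behind this problem is easy to check numerically. The Python sketch below is an illustration with hypothetical names and arbitrary values of $\sigma^2$ and $\rho$. It applies two transformations to the covariance matrix $\sigma^2 \begin{pmatrix} 1 + \rho^2 & 1 \\ 1 & 1 + \rho^2 \end{pmatrix}$ of $(\epsilon_{k1}, \epsilon_{k2})$: the one used to define $\tilde Y_{k1}, \tilde Y_{k2}$, and the one with the coefficients $a$ and $b$ from the problem statement. Direct algebra (using $a^2 + b^2 = 4(1 + \rho^2)$ and $ab = -2$) gives a common variance of $\sigma^2\rho^2(1 + \rho^2)(2 + \rho^2)$ for the first and $4\sigma^2\rho^2(2 + \rho^2)$ for the second, with zero covariance in both cases.

```python
from math import sqrt


def matmul(A, B):
    """Plain-Python matrix product for small matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def transpose(A):
    return [list(row) for row in zip(*A)]


def transformed_cov(T, sigma2, rho):
    """Covariance matrix of T @ (eps_k1, eps_k2)^T under the model assumptions."""
    cov = [[sigma2 * (1 + rho**2), sigma2],
           [sigma2, sigma2 * (1 + rho**2)]]
    return matmul(matmul(T, cov), transpose(T))


sigma2, rho = 1.5, 0.7            # arbitrary illustrative values
s = sqrt(2 + rho**2)
V = [[1 + rho**2, -1.0], [0.0, rho * s]]   # transformation defining Y-tilde
a, b = rho + s, rho - s
W = [[a, b], [b, a]]                        # coefficients from part a

print(transformed_cov(V, sigma2, rho))  # diagonal: sigma2*rho^2*(1+rho^2)*(2+rho^2)
print(transformed_cov(W, sigma2, rho))  # diagonal: 4*sigma2*rho^2*(2+rho^2)
```

Both transformations remove the correlation within a pair, which is what allows ordinary least squares on the transformed data to give the BLUE.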
