
University of Ljubljana Doctoral Programme in Statistics Methodology of Statistical Research Written examination, February 1st, 2018

Academic year: 2022

Full text

University of Ljubljana
Doctoral Programme in Statistics
Methodology of Statistical Research
Written examination, February 1st, 2018

Name and surname:
ID number:

Instructions: Read the wording of each problem carefully before you start. There are four problems altogether. You may use an A4 sheet of paper and a mathematical handbook. Please write all the answers on the sheets provided. You have two hours.

Problem | 1 | 2 | 3 | 4 | Total

Methodology of Statistical Research, 2017/2018, M. Perman, M. Pohar-Perme

1. (25) Assume that every unit in a population of size N has values of two statistical variables X and Y. Denote the values by (x_1, y_1), ..., (x_N, y_N). Assume that the population mean μ_X and the population variance σ_X² of the variable X are known. Suppose a simple random sample of size n is selected from the population. Denote by (X_1, Y_1), ..., (X_n, Y_n) the sample values. The above assumptions mean that

  E(X_k) = μ_X  and  var(X_k) = σ_X²

for k = 1, 2, ..., n.

a. (10) Denote c = cov(X_1, Y_1). Compute cov(X_k, Y_l) for k ≠ l. Hint: what would cov(X_k, Y_1 + Y_2 + ... + Y_N) be? Use symmetry.

Solution: By symmetry the covariances cov(X_k, Y_l) are the same for all k ≠ l. The covariance in the hint is 0 because the sum Y_1 + Y_2 + ... + Y_N is a constant. By the properties of covariance

  cov(X_k, Y_k) + (N − 1) cov(X_k, Y_l) = 0,

and hence

  cov(X_k, Y_l) = −c/(N − 1).

b. (10) Assume the quantity c = cov(X_1, Y_1) is known. We would like to estimate the population mean μ_Y of the variable Y. The following estimator is proposed:

  μ̂_Y = Ȳ − (c/σ_X²)(X̄ − μ_X).

Argue that the estimator is unbiased and compute its variance.

Solution: The estimators X̄ and Ȳ are unbiased, so the claim follows by linearity of expectation. We compute

  var(μ̂_Y) = var(Ȳ) + (c²/σ_X⁴) var(X̄) − (2c/σ_X²) cov(Ȳ, X̄)
            = (σ_Y²/n)·(N − n)/(N − 1) + (c²/σ_X⁴)·(σ_X²/n)·(N − n)/(N − 1) − (2c/σ_X²)·(1/n²)·(nc − (n² − n)·c/(N − 1))
            = (N − n)/(N − 1) · (1/n) · (σ_Y² − c²/σ_X²).

c. (5) Assume the quantity c = cov(X_1, Y_1) is known. Another possible estimator of μ_Y is μ̃_Y = Ȳ, which is unbiased. Under which circumstances is the estimator

  μ̂_Y = Ȳ − (c/σ_X²)(X̄ − μ_X)

more accurate than the estimator μ̃_Y? Explain your answer.

Solution: Both estimators are unbiased, so we compare their variances. By part b,

  var(μ̂_Y) = var(μ̃_Y) − (N − n)/((N − 1)n) · c²/σ_X²,

so the variance of μ̂_Y is strictly smaller than the variance of μ̃_Y unless c = 0, in which case the two variances coincide. Hence μ̂_Y is more accurate whenever c ≠ 0.
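The identities in parts (a) and (b) can be verified exactly on a small made-up population by enumerating every simple random sample (a sketch; the population values below are arbitrary):

```python
from itertools import combinations
import numpy as np

# Made-up population values (N = 8); any values would do.
x = np.array([1.0, 2.0, 4.0, 4.5, 6.0, 7.0, 8.5, 9.0])
y = np.array([2.1, 3.9, 8.2, 9.1, 11.8, 14.2, 17.1, 18.3])
N, n = len(x), 3

mu_X, mu_Y = x.mean(), y.mean()
var_X, var_Y = x.var(), y.var()              # population variances (denominator N)
c = np.mean((x - mu_X) * (y - mu_Y))         # c = cov(X1, Y1)

# Part (a): average of (x_i - mu_X)(y_j - mu_Y) over ordered pairs i != j,
# which is cov(X_k, Y_l) for k != l under simple random sampling.
pairs = (np.outer(x - mu_X, y - mu_Y).sum() - N * c) / (N * (N - 1))
print(pairs, -c / (N - 1))                   # identical

# Part (b): enumerate every simple random sample of size n (all equally likely).
ests = []
for idx in combinations(range(N), n):
    xs, ys = x[list(idx)], y[list(idx)]
    ests.append(ys.mean() - c / var_X * (xs.mean() - mu_X))
ests = np.array(ests)

theory = (N - n) / ((N - 1) * n) * (var_Y - c ** 2 / var_X)
print(ests.mean(), mu_Y)                     # unbiased
print(ests.var(), theory)                    # variance formula holds exactly
```

Because the sample space is enumerated, the agreement is exact up to floating-point rounding, not just approximate.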

2. (25) Let the observed values x_1, x_2, ..., x_n be generated as independent, identically distributed random variables X_1, X_2, ..., X_n with distribution

  P(X_1 = x) = (θ − 1)^(x−1)/θ^x  for x = 1, 2, 3, ... and θ > 1.

a. (10) Find the MLE estimate of θ based on the observations.

Solution: The log-likelihood is

  ℓ(θ, x) = (Σ_{k=1}^n x_k − n) log(θ − 1) − (Σ_{k=1}^n x_k) log θ.

Taking the derivative we have

  ℓ'(θ, x) = (Σ_{k=1}^n x_k − n)/(θ − 1) − (Σ_{k=1}^n x_k)/θ = 0.

It follows that

  θ̂ = (1/n) Σ_{k=1}^n x_k = x̄.

b. (15) Write an approximate 99%-confidence interval for θ based on the observations. Assume as known that

  Σ_{x=1}^∞ x a^(x−1) = 1/(1 − a)²  for |a| < 1.

Solution: For a single observation we have

  ℓ''(θ, x) = −(x − 1)/(θ − 1)² + x/θ².

To find the Fisher information we need

  E(X_1) = Σ_{x=1}^∞ x (θ − 1)^(x−1)/θ^x.

Using the hint we get

  E(X_1) = (1/θ)·(1 − (θ − 1)/θ)^(−2) = θ.

Therefore

  I(θ) = −E(ℓ''(θ, X_1)) = (E(X_1) − 1)/(θ − 1)² − E(X_1)/θ² = 1/(θ − 1) − 1/θ = 1/(θ(θ − 1)).

An approximate 99%-confidence interval is

  θ̂ ± 2.576 · √(θ̂(θ̂ − 1)/n).
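Since P(X_1 = x) = (1/θ)(1 − 1/θ)^(x−1), this is the geometric distribution on {1, 2, ...} with success probability 1/θ, so the MLE and the interval can be sanity-checked by simulation (a sketch; θ, n and the repetition count are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
theta, n, reps = 3.0, 500, 2000
mles, covered = [], 0
for _ in range(reps):
    xs = rng.geometric(p=1.0 / theta, size=n)   # support {1, 2, 3, ...}
    t = xs.mean()                               # MLE: theta_hat = x_bar
    half = 2.576 * np.sqrt(t * (t - 1) / n)     # 2.576 = 0.995 normal quantile
    covered += (t - half <= theta <= t + half)
    mles.append(t)
print(covered / reps)   # close to the nominal 0.99
```

The empirical coverage is only approximately 0.99 because the interval relies on the asymptotic normality of the MLE.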

3. (25) Assume that your observations are pairs (x_1, y_1), ..., (x_n, y_n). Assume the pairs are an i.i.d. sample from the density

  f_{X,Y}(x, y) = e^(−x) · (1/(√(2πx) σ)) · e^(−(y − θx)²/(2σ²x))

for σ > 0, x > 0, −∞ < y < ∞. We would like to test the hypothesis

  H_0: θ = 0  versus  H_1: θ ≠ 0.

a. (10) Find the maximum likelihood estimates for θ and σ.

Solution: The log-likelihood function is

  ℓ(θ, σ | x, y) = −Σ_{k=1}^n x_k − (1/2) Σ_{k=1}^n log x_k − (n/2) log(2π) − n log σ − Σ_{k=1}^n (y_k − θx_k)²/(2σ²x_k).

Take partial derivatives to get

  ∂ℓ/∂θ = Σ_{k=1}^n (y_k − θx_k)/σ²,
  ∂ℓ/∂σ = −n/σ + Σ_{k=1}^n (y_k − θx_k)²/(σ³x_k).

Set the partial derivatives to 0. From the first equation we have

  θ̂ = Σ_{k=1}^n y_k / Σ_{k=1}^n x_k,

and from the second

  σ̂² = (1/n) Σ_{k=1}^n (y_k − θ̂x_k)²/x_k.

b. (15) Find the likelihood ratio statistic for testing the above hypothesis. What is the approximate distribution of the test statistic under H_0?

Solution: If θ = 0, the log-likelihood attains its maximum at

  σ̂₀² = (1/n) Σ_{k=1}^n y_k²/x_k.

At either maximum the log-likelihood equals the same constant minus n log σ̂, so the likelihood ratio statistic is λ = 2(ℓ(θ̂, σ̂) − ℓ(0, σ̂₀)) = n log(σ̂₀²/σ̂²). Expanding the square gives σ̂² = σ̂₀² − (Σ_{k=1}^n y_k)²/(n Σ_{k=1}^n x_k), and it follows that

  λ = −n log(1 − (Σ_{k=1}^n y_k)² / (Σ_{k=1}^n x_k · Σ_{k=1}^n y_k²/x_k)).

The approximate distribution of λ is χ²(1).
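A simulation sketch of part (b), with arbitrary n and σ: under H_0 the statistic λ should be approximately χ²(1), so rejecting when λ exceeds 3.841 (the 0.95 quantile of χ²(1)) should happen about 5% of the time:

```python
import numpy as np

rng = np.random.default_rng(2)
n, sigma, reps = 200, 1.5, 4000
rej = 0
for _ in range(reps):
    x = rng.exponential(size=n)                 # marginal density e^{-x}
    y = rng.normal(0.0, sigma * np.sqrt(x))     # H0: theta = 0, var(Y | X=x) = sigma^2 x
    lam = -n * np.log(1.0 - y.sum() ** 2 / (x.sum() * np.sum(y ** 2 / x)))
    rej += lam > 3.841
print(rej / reps)   # near the nominal 0.05
```

The argument of the logarithm lies in (0, 1) by the Cauchy–Schwarz inequality, so λ is always well defined and positive.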

4. (25) Assume the regression model

  Y_1 = α + βx_1 + ε_1
  Y_2 = α + βx_2 + ε_1 + ε_2
  ...
  Y_n = α + βx_n + ε_1 + ε_2 + ... + ε_n,

where we assume E(ε_k) = 0 and var(ε_k) = σ² for all k = 1, 2, ..., n, and cov(ε_k, ε_l) = 0 for k ≠ l. Assume that x_1, x_2, ..., x_n are all different.

a. (10) Find explicitly the best unbiased linear estimators of α and β.

Solution: Define U = (U_1, U_2, ..., U_n)ᵀ with U_1 = Y_1 and U_k = Y_k − Y_{k−1} for k = 2, ..., n, let Z be the n × 2 matrix with rows

  (1, x_1), (0, x_2 − x_1), (0, x_3 − x_2), ..., (0, x_n − x_{n−1}),

and set γ = (α, β)ᵀ and ε = (ε_1, ..., ε_n)ᵀ. Then U = Zγ + ε and the usual assumptions of the Gauss–Markov theorem are met. Denote

  Q_1 := Σ_{k=2}^n (x_k − x_{k−1})² = x_1² + 2 Σ_{k=2}^{n−1} x_k² + x_n² − 2 Σ_{k=2}^n x_{k−1}x_k,
  S_1 := Σ_{k=2}^n (x_k − x_{k−1})(Y_k − Y_{k−1}) = x_1Y_1 + 2 Σ_{k=2}^{n−1} x_kY_k + x_nY_n − Σ_{k=2}^n (x_{k−1}Y_k + x_kY_{k−1}),

and compute

  ZᵀZ = [ 1, x_1 ; x_1, x_1² + Q_1 ],
  (ZᵀZ)⁻¹ = (1/Q_1) [ x_1² + Q_1, −x_1 ; −x_1, 1 ],
  ZᵀU = (Y_1, x_1Y_1 + S_1)ᵀ.

By the Gauss–Markov theorem the BLUE for the parameter vector γ is

  γ̂ = (ZᵀZ)⁻¹ ZᵀU = (1/Q_1) (Q_1Y_1 − x_1S_1, S_1)ᵀ.

The best unbiased linear estimators for α and β are

  α̂ = Y_1 − x_1 S_1/Q_1,  β̂ = S_1/Q_1.
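The closed-form BLUE can be checked numerically against a generic least-squares solve of the transformed system U = Zγ + ε (a sketch on simulated data; α, β and the x values are made up):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 30
x = np.sort(rng.uniform(0.0, 10.0, size=n))
Y = 1.5 + 0.8 * x + np.cumsum(rng.normal(size=n))   # cumulative errors

# Differenced model: U_1 = Y_1, U_k = Y_k - Y_{k-1}.
U = np.concatenate(([Y[0]], np.diff(Y)))
Z = np.zeros((n, 2))
Z[0, 0], Z[0, 1] = 1.0, x[0]
Z[1:, 1] = np.diff(x)

# Closed-form BLUE from the solution above.
dx, dY = np.diff(x), np.diff(Y)
Q1 = np.sum(dx ** 2)
S1 = np.sum(dx * dY)
beta_hat = S1 / Q1
alpha_hat = Y[0] - x[0] * beta_hat

# Generic least squares on the transformed system gives the same answer.
gamma_ls, *_ = np.linalg.lstsq(Z, U, rcond=None)
print(np.allclose(gamma_ls, [alpha_hat, beta_hat]))   # True
```

Both routes solve the same normal equations, so they agree to floating-point precision.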

b. (5) Find explicitly the standard errors of the best unbiased linear estimates of α and β.

Solution: From

  var(Y_k − Y_{k−1}) = var(ε_k) = σ²  for k = 2, 3, ..., n

we get

  var(β̂) = (σ²/Q_1²) Σ_{k=2}^n (x_k − x_{k−1})² = σ²/Q_1.

Note that α̂ = Y_1 − x_1β̂. The random variables Y_1 and β̂ are uncorrelated because Y_1 depends on ε_1 only, while β̂ depends on ε_2, ..., ε_n only. We have

  var(α̂) = var(Y_1) + x_1² var(β̂) = σ² (1 + x_1²/Q_1).

The standard errors are the square roots, se(β̂) = σ/√Q_1 and se(α̂) = σ √(1 + x_1²/Q_1).

c. (5) Suggest an unbiased estimator of σ².

Solution: We use the transformed model and the known unbiased estimator for the standard regression model:

  σ̂² = (1/(n − 2)) ‖U − Zγ̂‖²
      = (1/(n − 2)) (U − Zγ̂)ᵀ(U − Zγ̂)
      = (1/(n − 2)) [ (Y_1 − α̂ − β̂x_1)² + Σ_{k=2}^n (Y_k − Y_{k−1} − β̂(x_k − x_{k−1}))² ]
      = (1/(n − 2)) Σ_{k=2}^n (Y_k − Y_{k−1} − (S_1/Q_1)(x_k − x_{k−1}))²,

where the first residual vanishes because α̂ = Y_1 − x_1β̂.

d. (5) Let α̃ and β̃ be the ordinary least squares estimators of the parameters α and β. Show that the estimators are unbiased and find their standard errors explicitly.

Solution: The two estimators form the vector γ̃ = (α̃, β̃)ᵀ = (XᵀX)⁻¹XᵀY = γ + (XᵀX)⁻¹Xᵀη, where X is the design matrix of the original model and η_k = ε_1 + ... + ε_k. As E(η) = 0 we have E(γ̃) = γ, so the estimators are unbiased. The standard errors are best expressed with matrices: we would need the diagonal elements of the matrix σ²(XᵀX)⁻¹XᵀΣX(XᵀX)⁻¹, where σ²Σ = cov(η). But a direct approach is quicker. Define

  S_x := Σ_{k=1}^n x_k,  S_xx := Σ_{k=1}^n x_k²,  S_Y := Σ_{k=1}^n Y_k,  S_xY := Σ_{k=1}^n x_kY_k,  Δ := nS_xx − S_x².
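A Monte Carlo sketch of parts (b) and (c), with made-up parameters: holding x fixed, the empirical variances of β̂ and α̂ over many replications should match σ²/Q_1 and σ²(1 + x_1²/Q_1), and σ̂² should average to σ²:

```python
import numpy as np

rng = np.random.default_rng(4)
n, sigma, reps = 20, 2.0, 40_000
x = np.linspace(1.0, 10.0, n)        # fixed design (arbitrary choice)
dx = np.diff(x)
Q1 = np.sum(dx ** 2)

a_hats, b_hats, s2 = [], [], []
for _ in range(reps):
    # Y_k = alpha + beta*x_k + eps_1 + ... + eps_k with alpha = 1, beta = 0.5
    Y = 1.0 + 0.5 * x + np.cumsum(rng.normal(0.0, sigma, size=n))
    dY = np.diff(Y)
    b = np.sum(dx * dY) / Q1                 # beta_hat = S1/Q1
    r = dY - b * dx                          # residuals of the differenced model
    a_hats.append(Y[0] - x[0] * b)           # alpha_hat = Y_1 - x_1*beta_hat
    b_hats.append(b)
    s2.append(np.sum(r ** 2) / (n - 2))      # sigma2_hat from part (c)

print(np.var(b_hats), sigma ** 2 / Q1)                     # var(beta_hat)
print(np.var(a_hats), sigma ** 2 * (1 + x[0] ** 2 / Q1))   # var(alpha_hat)
print(np.mean(s2), sigma ** 2)                             # unbiasedness of sigma2_hat
```

With 40 000 replications the empirical values agree with the formulas to within a few percent.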

We have

  α̃ = (S_xx S_Y − S_x S_xY)/Δ = (1/Δ) Σ_{k=1}^n (S_xx − S_x x_k) Y_k,
  β̃ = (n S_xY − S_x S_Y)/Δ = (1/Δ) Σ_{k=1}^n (n x_k − S_x) Y_k.

Writing Y_k = U_1 + U_2 + ... + U_k, where the random variables U_1, U_2, ..., U_n are uncorrelated with variance σ², and interchanging the order of summation, we get

  α̃ = (1/Δ) Σ_{k=1}^n Σ_{l=1}^k (S_xx − S_x x_k) U_l = (1/Δ) Σ_{l=1}^n ( Σ_{k=l}^n (S_xx − S_x x_k) ) U_l,
  β̃ = (1/Δ) Σ_{k=1}^n Σ_{l=1}^k (n x_k − S_x) U_l = (1/Δ) Σ_{l=1}^n ( Σ_{k=l}^n (n x_k − S_x) ) U_l.

The variances are

  var(α̃) = (σ²/Δ²) Σ_{l=1}^n ( Σ_{k=l}^n (S_xx − S_x x_k) )²
          = (σ²/Δ²) Σ_{j=1}^n Σ_{k=1}^n min{j, k} (S_xx − S_x x_j)(S_xx − S_x x_k),
  var(β̃) = (σ²/Δ²) Σ_{l=1}^n ( Σ_{k=l}^n (n x_k − S_x) )²
          = (σ²/Δ²) Σ_{j=1}^n Σ_{k=1}^n min{j, k} (n x_j − S_x)(n x_k − S_x).

Standard errors are obtained by taking square roots.
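A Monte Carlo sketch of part (d) for β̃ (made-up parameters; the check for α̃ is analogous): ordinary least squares remains unbiased under the cumulative-error model, and its variance matches the min{j, k} double sum:

```python
import numpy as np

rng = np.random.default_rng(5)
n, sigma, reps = 15, 1.0, 40_000
x = np.linspace(0.0, 7.0, n)         # fixed design (arbitrary choice)
Sx, Sxx = x.sum(), np.sum(x ** 2)
Delta = n * Sxx - Sx ** 2

w = n * x - Sx                       # coefficient of Y_k in beta_tilde
J = np.minimum.outer(np.arange(1, n + 1), np.arange(1, n + 1))   # min{j, k}
var_theory = sigma ** 2 / Delta ** 2 * (w @ J @ w)

b_tildes = []
for _ in range(reps):
    # true alpha = 2, beta = 0.3, cumulative errors
    Y = 2.0 + 0.3 * x + np.cumsum(rng.normal(0.0, sigma, size=n))
    b_tildes.append(np.sum(w * Y) / Delta)
b_tildes = np.array(b_tildes)

print(b_tildes.mean())               # close to the true beta = 0.3 (unbiased)
print(b_tildes.var(), var_theory)    # empirical vs theoretical variance
```

The matrix J holds cov(η_j, η_k)/σ² = min{j, k}, so `w @ J @ w` is exactly the double sum in the formula for var(β̃).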

