## Sigma W Session 5 Homework

**Math 181A (Winter 2018)**

## Lecture: MWF 11am--11:50 at PCYNH 122 --- NOTE: NEW CLASSROOM IS CSB 002

**Announcements:** (a) This course is intended to be taken as a sequence with 181B. Applied Math and Prob/Stat majors will be required either to take both or to take another stand-alone course; they are discouraged from taking only 181A. (b) MATH 185, which is required for the Prob/Stat major, is accepting students without the 181B prerequisite. (c) MATH 181A will be offered again in the Spring quarter. (d) MATH 181B will also be offered in the Spring quarter, and it is scheduled to be offered again next Fall.

**Instructor:** Prof. Dimitris Politis. Email: dpolitis@ucsd.edu. Tel.: 534-5861. Office: APM 5701. Office hours: MWF 10am-10:50am or by appointment.

**TA:** Denise Rava. Email: drava@ucsd.edu. Office: APM 6414. Office hours: Tue 4-6pm.

**Homework** Homework will be assigned online on Wednesday evenings and will be due the Thursday 8 days later. PLEASE LEAVE YOUR HOMEWORK IN THE DROPBOX IN THE BASEMENT BY 4PM EACH THURSDAY. Homework will no longer be collected during the TA discussions. Please write down the section number that you are enrolled in when you submit homework and quizzes.

**Class notes** Notes from class will be posted on Wednesday evenings together with the homework.

**Weekly quizzes** Every Friday, a 20-minute quiz will be given. The quizzes are a most important part of the class; they will focus on the material presented in class during the week. Quizzes cannot be made up if missed; however, the two lowest quiz scores will be dropped before calculating the quiz average for the final class grade. For the quizzes, please bring a blue book and your calculator; blue books can be re-used for the quizzes to save some trees. Quiz grading: spell name (and ID number) right = 3 pts; partially correct = 7 pts; totally correct = 10 pts.

**Final exam** The final exam will be comprehensive, covering the material from the whole course, with particular emphasis on the examples discussed in class and the homework problems. For the final exam, please bring 1-2 fresh blue books, your student ID, a calculator, and a crib sheet (2 sides OK!). The 181A final will take place at 11:30am on Monday, March 19, at CSB 002.

**Computing**

Students should familiarize themselves with the statistical language R; the sessions on Thursday, January 11th will focus on R. There are many online tutorials on R; just google "R tutorial". A gentle one is given here; see also the R handout from 2010 and the R handout from 2017. Click here to download R to your computer.

**Textbook**

*An Introduction to Mathematical Statistics and its Applications*, 6th ed., by Larsen and Marx; Prentice Hall.

THE PLAN FOR THIS QUARTER IS TO COVER CHAPTERS 5, 6, AND 7 FROM THE BOOK. HOWEVER, CH. 5.8 WILL BE POSTPONED TO THE LAST WEEK OF CLASSES.

**Other books**

Rice: *Mathematical Statistics and Data Analysis*, and Wackerly et al.: *Mathematical Statistics with Applications*.

**Web resource**

SticiGui (an interactive statistics textbook)

**Course Webpage**

The course webpage will be updated regularly and students are advised to check it often. In particular, the webpage will list the week's assigned homework, as well as the assigned reading material.

**Course Grade**

**Assignments/announcements:**

**Week 1 (Jan. 8)** Review of probability and expectation. Let's make a deal. Convergence of r.v.'s. Law of Large Numbers (LLN) and Central Limit Theorem (CLT). Monte Carlo simulation. Larsen/Marx: Ch. 3. SticiGui: Ch. 18 and 23. Check out the slides from the first day, the slides from the second day, and the slides from the third day. [Starting in week 2, class notes will be uploaded once a week, on Wednesday evenings.]

**HW 1 (due Thursday Jan 18):** 1. Use Chebyshev's inequality to prove the (weak) LLN for the sample mean of i.i.d. data with finite variance. 2. Use the R code to empirically check the LLN and CLT via simulation.

**Week 2 (Jan. 15)** **Monday is a holiday!** Parametric vs. nonparametric statistics. Estimation: Maximum Likelihood and Method of Moments. Larsen/Marx: Ch. 5.1-5.2. SticiGui: Ch. 25. Check out the slides from the 2nd week and the slides from the 2nd Friday.

**HW 2 (due Thursday Jan 25):** Larsen/Marx: ex. 5.2.6, 5.2.12, 5.2.15, 5.2.16, 5.2.19.

**Week 3 (Jan. 22)** Confidence intervals. Pivotal quantities, standardized vs. studentized sample mean. Larsen/Marx: Ch. 5.3. SticiGui: Ch. 26. Check out the slides from the 3rd week.

**HW 3 (due Thursday Feb 1):** Larsen/Marx: ex. 5.3.1, 5.3.5, 5.3.8, 5.3.15, 5.3.19, 5.3.23. [NOTE: all assigned exercises are the same in the 6th ed. of the book as in the 5th edition; the only exception is ex. 5.3.19.]

**Week 4 (Jan. 29)** Monte Carlo simulation. Properties of estimators: bias, efficiency, sufficiency, consistency. Larsen/Marx: Ch. 5.4, 5.6, 5.7. [Section 5.5 will be covered next week.] SticiGui: Ch. 25. Check out the slides from the 4th week.

**HW 4 (due Thursday Feb 8):** Larsen/Marx: ex. 5.4.2, 5.4.15, 5.4.17, 5.6.4, 5.7.6. NOTE: in ex. 5.7.6, the (59) appearing there is a typo; please disregard it.

Also do the following:

A. The R command rnorm() can be used to generate X1,...,Xn i.i.d. N(0,1). Use it to give an approximation to the N(0,1) tables by Monte Carlo simulation. In other words, generate X1,...,Xn for n=1,000, collect these values in a vector called X, and approximate the z-table 97.5% quantile by using the R command: quantile(X, 0.975). Repeat with n=10,000 to get a better approximation.
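This simulation can be sketched in R as follows (a minimal sketch; the sample quantile is random, so expect a value near, not exactly at, 1.96):

```r
set.seed(1)           # fix the seed so the run is reproducible
n <- 10000
X <- rnorm(n)         # X1,...,Xn i.i.d. N(0,1)
quantile(X, 0.975)    # Monte Carlo approximation of the 97.5% z-quantile
qnorm(0.975)          # exact value (about 1.96) for comparison
```

With n = 1,000 the approximation is noticeably rougher, which is the point of repeating with the larger n.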

B. The R command qt(p,m) returns the p-quantile of the t distribution with m degrees of freedom. Plot qt(0.975,m) as a function of m in order to verify that qt(0.975,m) is always bigger than 1.96 but converges to 1.96 as m increases.
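A sketch of this plot in R (run in a script, the plot is written to a graphics file):

```r
m <- 1:100
q <- qt(0.975, m)                    # 97.5% t-quantiles for m = 1,...,100 degrees of freedom
plot(m, q, type = "l", ylab = "qt(0.975, m)")
abline(h = qnorm(0.975), lty = 2)    # reference line at the normal quantile, about 1.96
all(q > qnorm(0.975))                # TRUE: the t-quantile always exceeds 1.96
```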

C. Obtain an approximation to the t distribution by Monte Carlo simulation. To fix ideas, let n=10.

Step S: Generate X1,...,Xn i.i.d. N(0,1) and compute the t-statistic T = sqrt(n) * barX / hatSigma, where barX is the sample mean and hatSigma the sample standard deviation, i.e., the square root of the sample variance.
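Step S can be written as a small R function (one possible sketch; the function name is just for illustration):

```r
step_S <- function(n = 10) {
  x <- rnorm(n)               # generate X1,...,Xn i.i.d. N(0,1)
  sqrt(n) * mean(x) / sd(x)   # T = sqrt(n) * barX / hatSigma
}
step_S()                      # one realization of the t-statistic
```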

Repeat Step S many times (say 499 times), and collect the 499 values of T in a vector denoted by VT. Plot a histogram of VT and compare it to a plot of the t distribution with m = n-1 degrees of freedom. Also compute the Monte Carlo approximation to qt(0.975, m) by using the R command: quantile(VT, 0.975).

**Week 5 (Feb. 5)** Confidence intervals based on an estimator's asymptotic distribution. Asymptotic distribution of the MOM estimator via the delta method. Asymptotic normality and efficiency of the MLE. Standard errors via the parametric bootstrap. Larsen/Marx: Ch. 5.5 and 5.9. SticiGui: Ch. 25 and 26. See also the excerpt from John Rice's book. Check out the slides from the fifth week and the bootstrap slide.

**HW 5 (due Thursday Feb 15):** HW5 consists of Ex. 5.5.3 plus the 8 exercises described below.

Ex. 1. Setup: Data X1,...,Xn i.i.d. from Exponential (\theta) where EX=\theta >0 and Var (X)= \theta^2.

Ex. 2. Setup: Data X1,...,Xn i.i.d. from Normal (\theta, 4).

Ex. 3. Setup: Data X1,...,Xn i.i.d. from Normal (0, \sigma^2); \theta=\sigma^2, i.e., the variance.

Ex. 4. Setup: Data X1,...,Xn i.i.d. from Normal (0, \sigma^2); \theta=\sigma >0, i.e., the standard deviation.

Let n=100. In all cases, assume that you observed a sample mean equal to 5 and a sample variance (using n-1 in the denominator) equal to 9.

For Exercises 1-4: construct a 95% confidence interval for parameter \theta using the asymptotic distribution of the MLE with variance given by the inverse of Fisher information.
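As an illustration of the recipe for Exercise 1 (the exponential setup, where the MLE of \theta is the sample mean and the Fisher information per observation is 1/\theta^2, so the asymptotic variance of the MLE is \theta^2/n), a sketch in R; the other setups follow the same pattern with their own Fisher information:

```r
n         <- 100
xbar      <- 5                    # observed sample mean (given in the problem)
theta_hat <- xbar                 # MLE of theta for Exponential(theta) with E[X] = theta
se        <- theta_hat / sqrt(n)  # sqrt of inverse Fisher information: theta_hat / sqrt(n)
ci <- theta_hat + c(-1, 1) * qnorm(0.975) * se
ci                                # roughly (4.02, 5.98)
```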

For Exercises 5-8: redo Exercises 1-4, but now use the parametric bootstrap in an R simulation to estimate the variance of the MLE. Compare the bootstrapped variance with the one obtained via the Fisher information.

**Description of parametric bootstrap algorithm:** For each of the four Setups, do the following: generate B = 200 pseudo-samples (each of size n). From each pseudo-sample, compute the value of the estimator, and collect the 200 pseudo-estimator values in a vector; the bootstrap estimator of variance is the sample variance of the 200 pseudo-estimator values. NOTE: in order to generate the pseudo-samples, you will need to use each corresponding Setup with the MLE plugged in instead of \theta (which is not known) in the data-generating mechanism.

**Week 6 (Feb. 12)** Hypothesis testing. Type I and type II errors. Simple and composite hypotheses. Larsen/Marx: 6.1, 6.2, 6.3, and the beginning of 6.4. SticiGui: Ch. 27. Check out the slides from the sixth week.

**HW 6 (due Thu Feb 22):** Larsen/Marx: Ex. 6.2.1, 6.2.5, 6.2.8, 6.2.9, 6.2.10, 6.3.3, and 6.3.9. [For ex. 6.3.9 part (b): just calculate the rejection probability when p equals 0.7, 0.5, and 0.3; make a rough graph of how the rejection probability depends on p using the 4 points calculated. You can do this by hand or using R.]

**Week 7 (Feb. 19)** **Monday Feb. 19 is a holiday!** Hypothesis testing. P-values. One-sided vs. two-sided tests. Larsen/Marx: 6.1, 6.2, 6.3, and 6.4. SticiGui: Ch. 27. Check out the slides from the seventh week.

**HW 7 (due Thu Mar 1):** Do the following four problems.

**A.** Let X1,...,Xn be i.i.d. Bernoulli(p) and consider testing H0: p = 1/2 vs. H1: p = 3/4 at level alpha = 0.05 using two possible test statistics: Sm = (X1+X2+...+Xm)/m and Sn = (X1+X2+...+Xn)/n, where m = n/2. Calculate the probabilities of type I and type II error for the test associated with statistic Sm and compare them with the corresponding ones using statistic Sn. What do you observe, i.e., which test statistic is preferable? Here, assume that n is a large even integer so that the normal approximation to the binomial is accurate for both Sm and Sn. For concreteness, just assume n = 100.
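One way to carry out this comparison in R, a sketch using the one-sided normal approximation (rejecting H0 for large values of the statistic, which is natural here since the alternative p = 3/4 exceeds the null p = 1/2; the function name is just for illustration):

```r
errors <- function(k, p0 = 0.5, p1 = 0.75, alpha = 0.05) {
  # Test based on Sk = (X1 + ... + Xk)/k, rejecting H0 when Sk > cutoff
  cutoff <- p0 + qnorm(1 - alpha) * sqrt(p0 * (1 - p0) / k)  # type I error = alpha by construction
  typeII <- pnorm((cutoff - p1) / sqrt(p1 * (1 - p1) / k))   # P(accept H0 | p = p1)
  c(cutoff = cutoff, typeI = alpha, typeII = typeII)
}
errors(50)   # test based on Sm with m = n/2 = 50
errors(100)  # test based on Sn with n = 100
```

Both tests have type I error 0.05 by construction, but the test based on all n observations has a much smaller type II error, so Sn is preferable.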

[By the way, a rule of thumb for when the Normal approximation to Binomial(n,p) is advisable is that n is large and the interval np ± 1.96 sqrt(np(1-p)) is a subset of the range [0,n]. If n is large but the interval condition is not satisfied, a Poisson approximation to the Binomial may be better suited than the Normal.]

**B.** Let X1,...,Xn be i.i.d. N(\mu, \sigma^2) with \sigma^2 = 9. Assume n = 100 and construct (and give a rough graph of) the power curve for the \alpha = 0.05 level test of H0: \mu = 10 vs. H1: \mu

