POPH90013 Biostatistics: Association of IgE Titres with Reaction Severity Assignment 1 Answer

Biostatistics POPH90013 Assignment 1, 2020 

The maximum mark for this assignment is 60. It forms 30% of the final grade. Your assignment will be submitted via Canvas as a Microsoft Word document. Unless you are asked to do so, please do not include any Stata output in your assignment document. Instead, format any results you want to show in a way that would be suitable for inclusion in a study report or research paper. 

Question 1 [Total 14 marks] 

Note: This question does not require Stata

The two figures in this question were originally published in the article by Benhamou AH et al. “Correlation between specific immunoglobulin E levels and the severity of reactions in egg allergic patients”, Pediatr Allergy Immunol (2008), issue 19, pp. 173–179. The authors used clinical data from 51 oral food challenges to egg, raw or cooked, performed between January 2003 and December 2005. The aim of this study was to determine whether specific immunoglobulin E (IgE) titres were associated with the severity of the reaction during a standardized egg challenge. Serum was obtained for quantification of egg white IgE antibody titres on the day of the food challenge, or less than 6 months before it. For illustration purposes, the egg white-specific IgE levels were used after log transformation.

Figure 1. Box and whisker plots for immunoglobulin E titres to egg white in patients with no clinical reaction, mild and moderate reactions, or severe reactions during a food challenge.

a) [6 marks] Describe how the distribution of log egg white IgE differs between the three groups in Figure 1. Limit comments to comparing the location, spread and maximum/minimum levels of log egg white IgE. Note: There is no need to report the numerical value of the summary statistics you use; instead you should refer to the name of the summary statistic you are comparing (e.g., median increases/decreases). 

b) [2 marks] From Figure 1, which group (no reaction, moderate or severe) has a skewed distribution for log egg white IgE? How can you tell from the box and whisker plot? 

c) [4 marks] Using Figure 1 only, draw three new box and whisker plots for egg white IgE, instead of log egg white IgE, in patients with no clinical reaction, moderate reactions, and severe reactions during the food challenge. From the three new box and whisker plots, which groups (no reaction, moderate or severe) have a skewed distribution for egg white IgE? Hint: Use the box and whisker plots in Figure 1 to estimate (i.e. make an educated guess based on what you see in the visual display of data) the relevant values of the median, 25th and 75th percentiles, and maximum and minimum values for each plot. Then, to get the egg white IgE, “remove” the log transformation by taking the exponent of these values (i.e., by using the e^x or the exp button on the calculator).
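The back-transformation described in the hint can be sketched in a few lines of Python. This is only an illustration: the five values below are hypothetical and are not read from Figure 1, and the sketch assumes the transformation was a natural log, as the reference to the exp button suggests.

```python
import math

# Hypothetical five-number summary on the log scale
# (illustrative values only, NOT estimated from Figure 1).
log_summary = {"min": -1.0, "q1": 0.0, "median": 1.0, "q3": 2.0, "max": 3.0}

# Assumption: natural log, so exponentiating each value undoes the transformation.
raw_summary = {name: math.exp(value) for name, value in log_summary.items()}

for name, value in raw_summary.items():
    print(f"{name}: {value:.2f}")

# On the log scale this box is symmetric about the median (1 unit each side).
# On the raw scale the upper half of the box becomes wider than the lower half.
lower_half = raw_summary["median"] - raw_summary["q1"]
upper_half = raw_summary["q3"] - raw_summary["median"]
print(f"lower half of box: {lower_half:.2f}, upper half of box: {upper_half:.2f}")
```

A box that looks symmetric on the log scale therefore tends to look right-skewed once the values are exponentiated, which is the pattern to look for when redrawing the plots.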

d) [2 marks] Based only on Figure 2 below, if you see a patient result with value of log egg white IgE equal to 1.1 KU/l, is it more likely that the patient was exposed to a raw egg or a cooked egg? Why?

Figure 2. Box and whisker plots for immunoglobulin E titres to raw or cooked egg white in patients with a positive food challenge. 

Question 2 [Total 10 marks] 

Note: This question does not require Stata

Table 1 (see below) gives the results of a randomised controlled trial comparing two types of weekly therapeutic venesection (blood removal) for the treatment of severely elevated serum ferritin (SF), a biochemical marker of iron overload disease. The first therapy was whole blood donation (call this therapy “Whole Blood”). The second therapy was “plasmapheresis” where, after the initial removal and separation of whole blood, the blood cells are returned to the body, so this therapy involves the net removal of plasma only (call this therapy “Plasma Only”). The aim of the trial was to determine whether removal of blood cells and plasma (“Whole Blood”), rather than just plasma alone (“Plasma Only”), was more effective in reducing SF. Investigators based their statistical analysis of the data from this trial on a two-group (“Whole Blood” and “Plasma Only”) comparison of sample means of SF. A difference in mean SF of 100 ng/L corresponds to a clinically relevant effect.

Table 1 

a) [4 marks] Using the data presented in Table 1, calculate and interpret a 95% confidence interval for the population mean difference in SF between the two therapies. 

b) [2 marks] Using the data presented in Table 1, calculate and interpret the P value for the null hypothesis that there is no difference in population mean SF between the two therapies.

c) [4 marks] What is your interpretation of the results of part (a) and part (b) above? Specifically, what do: (i) the difference between sample mean SF; (ii) the 95% confidence interval for the difference between the population mean SF; and (iii) the P value tell us about the effectiveness of “Whole Blood” versus “Plasma Only” as a treatment for severely elevated SF? 
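As a rough illustration of the arithmetic behind parts (a) and (b), the sketch below computes a confidence interval and a two-sided P value for a difference in means from summary statistics alone. The means, standard deviations and sample sizes are hypothetical and are not the Table 1 values. The function name `diff_in_means` is made up for this example, and it uses a large-sample normal (z) approximation; with small groups a t-based interval would be somewhat wider.

```python
from math import sqrt
from statistics import NormalDist


def diff_in_means(mean1, sd1, n1, mean2, sd2, n2, level=0.95):
    """Large-sample (z-based) CI and two-sided P value for mean1 - mean2."""
    diff = mean1 - mean2
    # Standard error of the difference between two independent sample means.
    se = sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    # Critical value of the standard normal (about 1.96 for a 95% interval).
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)
    # Two-sided P value for the null hypothesis of no difference.
    z_stat = diff / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))
    return diff, se, ci, p_value


# Hypothetical summary statistics in ng/L (NOT the values in Table 1).
diff, se, ci, p_value = diff_in_means(mean1=450.0, sd1=200.0, n1=50,
                                      mean2=520.0, sd2=220.0, n2=50)
print(f"difference = {diff:.1f}, SE = {se:.1f}")
print(f"95% CI = ({ci[0]:.1f}, {ci[1]:.1f}), P = {p_value:.3f}")
```

When interpreting output of this kind, compare the whole interval with both zero and the clinically relevant difference of 100 ng/L, rather than relying on the P value alone.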

Question 3 [Total 6 marks] 

Note: This question does not require Stata

A complete version of Table 2 (see below) was originally published in the article by Ghilotti F et al. “Obesity and risk of infections: results from men and women in the Swedish National March Cohort”, International Journal of Epidemiology, Vol. 48, Issue 6, December 2019, Pages 1783–1794.

Table 2. Baseline characteristics of participants in the Swedish National March Cohort, stratified by Body Mass Index (BMI) and gender. Numbers with percentages, or medians (Q2) with interquartile ranges (Q3 ; Q1).

Assume that height is normally distributed within the sample and within the population for each BMI category, and that the sample mean and the sample standard deviation computed in part (a) below are reasonable estimates of the corresponding population parameters.

a) ... and sample standard deviation of height (s) for the following two formulae:

Sample Standard Deviation of Height formulae

b) [2 marks] Using your answers from part (a), estimate the proportion of: 

• females with BMI in the range 18.5 ≤ BMI < 25 who are less than 167.5cm tall, and 

• males with BMI in the range 18.5 ≤ BMI < 25 who are between 160cm and 180cm tall. 

c) [2 marks] Calculate a range of height values within which the middle 95% of the following groups lie:

• females with BMI in the range 18.5 ≤ BMI < 25;

• males with BMI in the range 18.5 ≤ BMI < 25.
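The normal-distribution calculations asked for in parts (b) and (c) can be sketched with Python's standard library. The mean, standard deviation and cut-offs below are hypothetical and are not derived from Table 2; they only show the form of the calculation.

```python
from statistics import NormalDist

# Hypothetical sample mean and SD of height in cm (NOT derived from Table 2).
height = NormalDist(mu=170.0, sigma=7.0)

# Proportion below a single cut-off: P(X < 165).
p_below = height.cdf(165.0)

# Proportion between two cut-offs: P(160 < X < 180).
p_between = height.cdf(180.0) - height.cdf(160.0)

# Middle 95%: the 2.5th and 97.5th percentiles, i.e. mean -/+ about 1.96 SD.
lower, upper = height.inv_cdf(0.025), height.inv_cdf(0.975)

print(f"P(X < 165) = {p_below:.3f}")
print(f"P(160 < X < 180) = {p_between:.3f}")
print(f"middle 95% of heights: {lower:.1f} cm to {upper:.1f} cm")
```

By hand, the same results come from converting each cut-off to a z-score, (x − mean) / SD, and looking up the standard normal table.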

d) [1 mark] Calculate and interpret a 95% confidence interval for the population mean height in females with BMI in the range 18.5 ≤ BMI < 25. 
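A confidence interval for a population mean is built from the standard error, s/√n, rather than from the standard deviation itself, so it is much narrower than a range covering the middle 95% of individuals. The sketch below illustrates this with hypothetical values (the mean, SD and sample size are not taken from Table 2) and a large-sample normal approximation.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical summary statistics (NOT taken from Table 2).
mean, sd, n = 170.0, 7.0, 400

# Standard error of the sample mean.
se = sd / sqrt(n)

# 95% confidence interval for the population mean (z-based, large sample).
z_crit = NormalDist().inv_cdf(0.975)
ci = (mean - z_crit * se, mean + z_crit * se)

print(f"SE = {se:.2f} cm")
print(f"95% CI for the mean: {ci[0]:.1f} cm to {ci[1]:.1f} cm")
```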

Question 4 [Total 30 marks] 

To answer this question, you will need to use the Stata dataset vitaminD.dta, which can be downloaded from the folder “Assignment 1” in the Assessment area on Canvas.

Vitamin D is critical for regulation of important minerals found in the human body and is obtained from exposure to the sun. Insufficient levels of vitamin D (i.e., vitamin D deficiency) have been linked to increased risk of cardiovascular disease, cancer and asthma, among other conditions. In general, the best way to measure the concentration of vitamin D in a human body is through a blood test. A blood level of vitamin D that is less than 30 nmol/L is considered a serious deficiency.

This dataset on vitamin D levels comes from the book "Regression with Linear Predictors" by Per Kragh Andersen and Lene Theil Skovgaard, published in 2010, and has been modified from the original. The dataset is a subset of a large cross-sectional observational study on vitamin D concentration conducted in Europe. It contains 6 variables and 213 observations. A detailed description of all the variables in the data is below:

Table 3. Data dictionary for the vitamin D data

The study investigators are interested in addressing the following research question: 

“To estimate the difference in population mean vitamin D concentration between overweight/obese individuals and individuals in the normal BMI range, separately for (i) individuals who prefer the sun, and (ii) those that avoid the sun.” 

a) [1 mark] What type of data is each variable in Table 3? 

b) [1 mark] Complete all entries in the table below. 

 population mean vitamin D concentration

c) [2 marks] How many data values are missing for the variable vitd for each category of bmicat (e.g., of all the individuals that are overweight/obese, what is the number of data values missing for the variable vitd)? 

d) [2 marks] What is the mean vitamin D level (vitd) of all female participants? What is the mean vitamin D level (vitd) of all female participants with normal BMI? 

e) [2 marks] How many males in the data set have vitamin D levels (vitd) greater than 30 nmol/L? What is their median age? 

f) [1 mark] What category of sun exposure is most common in female participants with vitamin D level less than 50 nmol/L? 

g) [1 mark] Of all individuals that are in the normal BMI category, what percentage are female and what percentage are male? 

h) [1 mark] What percentage of all individuals in the data set are males who avoid the sun? What percentage of all individuals in the data set are males who prefer the sun? 

i) [2 marks] Use Stata to produce a histogram of vitamin D levels (vitd) for each of the four categories of BMI (bmicat) and sun exposure (sunexp); that is, one histogram for each of the following four strata: 

• normal BMI category and avoids the sun; 

• normal BMI category and prefers the sun; 

• overweight/obese BMI category and avoids the sun; 

• overweight/obese BMI category and prefers the sun. 

Look up the help file for the histogram function for the relevant options to make the following changes to the graph: 

• use the by() option to display the four histograms in a single plot;

• use 10 bars (or bins) per histogram; 

• add a plot of the normal distribution to each histogram.

Copy the graph directly into your assignment document by clicking on edit/copy in the Stata graph window. You may also use the “File -> Save As” feature in Stata to save the graph as an image that you can import into Microsoft Word.

j) [2 marks] Use Stata to produce an appropriate graph to display the relationship between vitamin D level (vitd) and sun exposure category (sunexp). Copy the graph directly into your assignment document by clicking on edit/copy in the Stata graph window. You may also use the “File -> Save As” feature in Stata to save the graph as an image that you can import into Microsoft Word. Based only on this graph, which sun exposure category has, on average, a higher median vitamin D level (vitd)?

k) [4 marks] Provide a table that summarises the distribution (sample size, range, mean, standard deviation, 25th / 50th / 75th percentiles, standard error of the sample mean) of vitamin D level (vitd) for each of the four strata defined by BMI category and sun exposure. Recall that the strata are: 

• normal BMI category and avoids the sun; 

• normal BMI category and prefers the sun; 

• overweight/obese BMI category and avoids the sun; 

• overweight/obese BMI category and prefers the sun.

Ensure that the table is formatted properly (please do not copy and paste directly from Stata output).

l) [1 mark] Serious vitamin D deficiency may be classified as a serum vitamin D level less than 30 nmol/L. Generate a new binary variable in Stata called vitd_def that classifies each person as having normal (coded as 0) or deficient (coded as 1) vitamin D level.

What is the observed proportion of individuals with serious vitamin D deficiency in each of the four strata defined by BMI category and sun exposure category? (e.g., Of all the individuals who have normal BMI and avoid the sun, what proportion have serious vitamin D deficiency?) 

m) [4 marks] Assume that you do not have access to the individual participant data for this study but are instead given only the summary statistics that you calculated in part (k). Using the normal distribution, estimate the proportion of individuals with vitamin D deficiency (i.e., vitd < 30 nmol/L) in each of the four “BMI category by sun exposure category” strata. 

Assume the following: 

• vitamin D concentration is normally distributed within each stratum; and 

• the sample mean and sample standard deviation calculated in part (k) for each stratum are good estimates of the corresponding population parameters.

How do these proportions based on the normal curve compare with the observed sample proportions obtained in part (l)? Is the assumption of a normal distribution appropriate for each stratum?
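The comparison between an observed proportion and one estimated from the normal curve can be sketched as follows. The ten values are hypothetical and are not drawn from vitaminD.dta. The sketch is written in Python purely to show the logic for a single stratum; the assignment itself expects the individual-level work to be done in Stata.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical vitamin D values in nmol/L for ONE stratum (NOT from vitaminD.dta).
vitd = [18.0, 25.0, 32.0, 41.0, 47.0, 52.0, 58.0, 66.0, 75.0, 90.0]
threshold = 30.0  # serious deficiency: vitd < 30 nmol/L

# Observed proportion: code each person 1 (deficient) or 0 (normal), then average.
vitd_def = [1 if value < threshold else 0 for value in vitd]
observed = sum(vitd_def) / len(vitd_def)

# Normal-based estimate: area below the threshold under a normal curve
# with the sample mean and sample standard deviation as its parameters.
expected = NormalDist(mean(vitd), stdev(vitd)).cdf(threshold)

print(f"observed proportion deficient: {observed:.3f}")
print(f"normal-based estimate:         {expected:.3f}")
```

A large gap between the two proportions in a stratum is one sign that the normal assumption may not describe that stratum well; the histograms from part (i) are another check.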

n) [2 marks] Calculate and interpret the difference in sample mean vitamin D level between individuals in the normal BMI category and individuals in the overweight/obese BMI category, separately for: individuals who prefer the sun, and individuals that avoid the sun. 

o) [2 marks] Calculate and interpret a 95% confidence interval for the difference in population mean vitamin D level between individuals in the normal BMI category and individuals in the overweight/obese BMI category, separately for: individuals who prefer the sun, and individuals that avoid the sun.

p) [2 marks] Calculate and interpret a P value for the null hypothesis that there is no difference in population mean vitamin D level between individuals in the normal BMI category and individuals in the overweight/obese BMI category, separately for: individuals who prefer the sun, and individuals that avoid the sun. 
