# Heart Disease Dataset Assessment Answer


# Section A - Heart Disease Dataset

Heart disease is the leading single cause of death in many countries across the globe, including Australia. It takes various forms, including coronary artery disease, arrhythmias and congenital heart defects, among many others.

For this section of the assignment you will be using the dataset “Assignment3_heart.dta”, a modified dataset from a heart disease study conducted in Cleveland with 300 participants. The aim of the study was to explore risk factors of heart disease.

Table 1: Description of the variables in the Assignment3_heart.dta dataset

# Question 1 [19 marks]

High blood pressure is more common among smokers. In this question we will explore the association between smoking (smoking) and blood pressure (bloodpressure, mmHg).

1. Using Stata, obtain the mean, standard deviation and standard error of the mean blood pressure for smokers and non-smokers in this study, and complete the table below. Write down the Stata command you used to obtain these statistics. [2 marks]

2. Calculate by hand the difference in mean blood pressure between smokers and non-smokers. [3 marks]
3. Calculate by hand the standard error of the difference in mean blood pressure between smokers and non-smokers. [3 marks]
4. Calculate by hand the 95% confidence interval for the population mean difference in blood pressure between smokers and non-smokers. [4 marks]
5. Calculate by hand a two-sided p-value for the null hypothesis that there is no difference in the population mean blood pressure between smokers and non-smokers. [2 marks]
6. Interpret the estimated mean difference in blood pressure between smokers and non-smokers, the corresponding 95% confidence interval for the population mean difference and the p-value calculated above, and comment on the association between smoking and blood pressure. [5 marks]
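
The hand calculations in this question (difference in means, its standard error, the 95% confidence interval and the two-sided p-value) can be checked with a short script. The following is a minimal sketch, not a prescribed method: it assumes the unequal-variance formula for the standard error and a normal (z) approximation, which may differ from the approach taught in the course. The function name and all input values are hypothetical placeholders, not results from Assignment3_heart.dta.

```python
from math import sqrt
from statistics import NormalDist


def mean_difference_summary(mean1, sd1, n1, mean0, sd0, n0, level=0.95):
    """Difference in means (group 1 minus group 0) with a large-sample CI and p-value."""
    diff = mean1 - mean0
    # Standard error of the difference, without assuming equal variances (assumption)
    se = sqrt(sd1**2 / n1 + sd0**2 / n0)
    # Normal critical value (about 1.96 for a 95% interval)
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)
    # Two-sided p-value for H0: population mean difference = 0
    z = diff / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return {"diff": diff, "se": se, "ci": ci, "z": z, "p": p}


# Hypothetical placeholder values, NOT taken from the study dataset
result = mean_difference_summary(mean1=135.0, sd1=18.0, n1=100,
                                 mean0=130.0, sd0=16.0, n0=200)
print(result)
```

Replacing the placeholder inputs with the means, standard deviations and group sizes reported by Stata gives a quick cross-check of each hand-calculated step.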

# Question 2 [18 marks]

A key risk factor of heart disease is smoking. In this question we will explore the association between smoking (smoking) and heart disease.

Heart disease is considered present if a patient has at least 1 major vessel with >50% narrowing (vessels). Generate a binary variable named heart_disease, coded as 1 (“Yes”) for at least 1 major vessel with >50% narrowing and 0 (“No”) for no major vessels with >50% narrowing. Use this new binary variable heart_disease for Question 2.
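
The recoding rule can be illustrated with a small sketch. It assumes that vessels holds a numeric count of major vessels with >50% narrowing and that missing values should stay missing; both are assumptions, since the variable description table is not reproduced here. The function name and example values are hypothetical.

```python
def code_heart_disease(vessels):
    """Return 1 ("Yes") if at least one major vessel is narrowed, else 0 ("No")."""
    if vessels is None:
        # Keep missing values missing rather than silently coding them as 0 or 1
        return None
    return 1 if vessels >= 1 else 0


# Hypothetical vessel counts, not taken from the study dataset
example_vessels = [0, 1, 3, None, 2, 0]
heart_disease = [code_heart_disease(v) for v in example_vessels]
print(heart_disease)
```

The explicit missing-value check matters when the same rule is written in Stata, because Stata treats missing numeric values as larger than any number, so a bare condition such as `vessels >= 1` would code missing observations as 1.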

1. Calculate using Stata, the proportion of smokers and non-smokers with heart disease in this study. [3 marks]
2. Calculate by hand, the risk ratio for the association between smoking and heart disease. [3 marks]
3. Calculate by hand the 95% confidence interval for the population risk ratio for the association between smoking and heart disease. [Hint: Use Stata to obtain the number of participants in each group (for example, the number of smokers with heart disease) required for this calculation.] [9 marks]
4. Interpret the estimated risk ratio for the association between smoking and heart disease and the corresponding 95% confidence interval for the population risk ratio calculated above, and comment on the association between smoking and heart disease. [3 marks]
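
The risk ratio and its confidence interval can be cross-checked from the four group counts. The sketch below assumes the usual log-scale method, in which the standard error of log(RR) is sqrt(1/a - 1/n1 + 1/c - 1/n0) and the interval is exponentiated back to the ratio scale; the course notes may present the formula differently. The function name and the counts are hypothetical placeholders, not values from Assignment3_heart.dta.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def risk_ratio_ci(a, n1, c, n0, level=0.95):
    """Risk ratio (exposed vs unexposed) with a log-scale confidence interval.

    a  = exposed participants with the outcome,   n1 = all exposed participants
    c  = unexposed participants with the outcome, n0 = all unexposed participants
    """
    risk1 = a / n1
    risk0 = c / n0
    rr = risk1 / risk0
    # Standard error of log(RR)
    se_log_rr = sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    # Build the interval on the log scale, then exponentiate
    lower = exp(log(rr) - z_crit * se_log_rr)
    upper = exp(log(rr) + z_crit * se_log_rr)
    return rr, (lower, upper)


# Hypothetical counts: 40 of 100 smokers and 50 of 200 non-smokers with heart disease
rr, (lower, upper) = risk_ratio_ci(a=40, n1=100, c=50, n0=200)
print(rr, lower, upper)
```

Because the interval is symmetric on the log scale, the estimate is the geometric mean of the two limits, which is a useful sanity check on a hand calculation.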

# Section B - Frailty Dataset

Frailty is a state of vulnerability resulting from a decline in physical and cognitive capabilities. Particularly in surgical and intensive care unit (ICU) patients, frailty predisposes to poor outcomes. Frailty is more prevalent among older patients and is associated with increased mortality, length of stay in hospital and post-operative complications.

For this section of the assignment you will be using the dataset “Assignment3_frailty.dta”, a random sample of 200 patients from an observational cohort study conducted in Melbourne. The aim of this study was to explore the risk factors of frailty and the association between frailty and other health outcomes.

Table 2. Description of the variables in the Assignment3_frailty.dta dataset.

# Question 3 [9 marks]

The study investigators were interested in exploring the association between frailty (frailtyindex) and a patient’s source of admission to hospital. In the current cohort, prior to admission patients were either living at home, in other hospitals or at assisted living care facilities (adm_source).

For the purpose of this question, use the adm_source variable to generate a binary variable named adm_source_bin, coded as 0 (“Home”) for patients living at home and 1 (“Other hospital/Assisted living”) for patients from other hospitals and assisted living facilities. Use this new binary variable adm_source_bin for Question 3.

1. Conduct an unpaired t-test in Stata and obtain the difference in the mean frailty index between patients from other hospitals/assisted living and patients living at home prior to hospitalisation, the corresponding 95% confidence interval for the population mean difference and the p-value for the null hypothesis that there is no difference in the population mean frailty index between patients from other hospitals/assisted living and patients living at home prior to hospitalisation. [4 marks]
2. Interpret the estimated mean difference in the frailty index between patients from other hospitals/assisted living and patients living at home prior to hospitalisation, the corresponding 95% confidence interval for the population mean difference and the p-value you obtained from the unpaired t-test above, and comment on the association between source of admission and frailty. [5 marks]
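
The quantities behind an unpaired t-test can be sketched from raw values. This example assumes the equal-variance (pooled) form of the test and stops at the t statistic and degrees of freedom; the p-value and exact confidence interval need the t distribution and are left to Stata. The function name and the frailty index values are hypothetical, not taken from Assignment3_frailty.dta.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t(group1, group0):
    """Pooled-variance two-sample t statistic for mean(group1) - mean(group0)."""
    n1, n0 = len(group1), len(group0)
    diff = mean(group1) - mean(group0)
    # Pooled variance: sample variances weighted by their degrees of freedom
    pooled_var = ((n1 - 1) * variance(group1) + (n0 - 1) * variance(group0)) / (n1 + n0 - 2)
    se = sqrt(pooled_var * (1 / n1 + 1 / n0))
    df = n1 + n0 - 2
    return diff, se, diff / se, df


# Hypothetical frailty index values
other_or_assisted = [0.30, 0.35, 0.25, 0.40]   # adm_source_bin == 1
home = [0.20, 0.15, 0.25, 0.10, 0.30]          # adm_source_bin == 0
diff, se, t_stat, df = pooled_t(other_or_assisted, home)
print(diff, se, t_stat, df)
```

Note that Stata's `ttest` with `by()` reports the difference as the first group minus the second (here, coded 0 minus coded 1), so the sign of the reported difference and the confidence limits may need to be reversed to match the direction asked for in the question.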

# Question 4 [9 marks]

One of the research questions of interest of this study was to explore if patients with frailty were more susceptible to diabetes with end organ damage.

In this study, patients with a frailty index score ≥0.25 were considered frail. Generate a binary variable named frailtyindex_bin, coded as 1 (“Frail”) for a frailty index score ≥0.25 and 0 (“Non-frail”) for a frailty index score <0.25. In this question we will explore the association between the outcome diabetes with end organ damage (