A Tutorial on GEE with Applications to Diabetes and Hypertension Data from a Complex Survey

  • Tahmina Akter University of Dhaka
  • Elizabeth Bianca Sarker University of Dhaka
  • Shafiqur Rahman University of Dhaka
Keywords: Correlated data, cluster survey, risk factors, GEE, diabetes, hypertension


Correlated data frequently arise in cross-sectional studies with a complex cluster design, because individuals from the same cluster or region share common characteristics. Analyzing such data with standard statistical methods that assume independent observations may produce misleading inferences. This article reviews generalized estimating equations (GEE) and their software implementations, and provides guidelines for using them in practice. To illustrate GEE, data from the 2011 Bangladesh Demographic and Health Survey, a two-stage complex cluster survey, were used to identify risk factors for diabetes and hypertension. The results suggest that age, current working status, education, socioeconomic status, and body mass index are significantly associated with hypertension and diabetes. Further, we found a significant positive correlation between responses from the same cluster, justifying the use of GEE.