Real World Health Care Data Analysis. Uwe Siebert

Читать онлайн книгу.

Real World Health Care Data Analysis - Uwe Siebert


Скачать книгу
rel="nofollow" href="#ulink_463b1ad9-0b58-5f97-ba1c-c3c8a5a6828d">3.5.1 Simulated REFLECTIONS Data

       3.5.2 Simulated PCI Data

       3.6 Summary

       References

      In this chapter, we present both the core data sets that are used as examples throughout the book and demonstrate how to simulate data to mimic an existing data set. Simulations are a common tool for examining and comparing the operating characteristics of different statistical methods. One must know the true value of the parameter of interest when assessing how well a particular method performs. In simulations, as opposed to a case study from actual data, the true parameter values are known, and one can test the performance of methods across various data scenarios specified by the research. However, real world data is very complex – with complex distributions and correlations amongst the many variables, missing data patterns, and so on. Often, published simulations are performed with a limited number of variables using known parametric functions to generate values along with simple or no correlations between covariates or missing data. Thus, simulations based on actual data that retain the complex correlations and missing data patterns, often called “plasmode simulations” (Gadbury et al. 2008, Franklin et al. 2014), can provide a superior test of how methods perform under real world data settings.

      This chapter is structured as follows. Sections 2 and 3 present background information about two observational studies (REFLECTIONS and Lindner) that serve as the basis of analyses throughout the book. Section 4 discusses options for simulating real world data from an existing study data set. Section 5 presents the SAS code and the analysis data sets generated for use in the later chapters.

      The Real World Examination of Fibromyalgia: Longitudinal Evaluation of Cost and Treatments (REFLECTIONS) study was a prospective observational study conducted between 2008 and 2011 at 58 clinical sites in the United States and Puerto Rico (Robinson et al. 2012). The primary objective of the study was to examine the burden of illness, treatment patterns, and outcomes for patients initiating new treatments for fibromyalgia. Data was collected via physician surveys, a clinical report form completed at the baseline office visit, and computer-assisted telephone patient interviews at five time points over the one-year study. The physician surveys collected information about the clinical site and lead physician, including physician demographics and practice characteristics. At the baseline visit, data from a thorough clinical summary of the patient was captured. This included demographics, medical history, socio-economic and work/disability status, and treatment. Phone surveys at baseline and throughout the study included information from the patient regarding changes in treatments and disease severity using multiple validated patient rating scales.

      The study enrolled a total of 1700 patients and 1575 met criteria for the analysis dataset. A summary of the demographics and baseline patient characteristics is provided in Section 3.5. One analysis conducted from the REFLECTIONS data was an examination of outcomes from patients initiating opioid treatments. Peng et al. (2015) used propensity score matching to compare Brief Pain Inventory (BPI) scores and other outcomes over the one-year follow-up period for patients initiating opioids versus those initiating other treatments for fibromyalgia. We use this example to demonstrate the creation of two simulated data sets based on the REFLECTIONS data: a one observation per patient data set used to demonstrate various propensity score-based analyses in Chapters 4–10 and a longitudinal analysis data set used to demonstrate marginal structural model and replicates analysis methods in Chapters 11 and 12.

      The Lindner study was also a prospective observational study (Kereiakes et al. 2000). It was conducted in 1997 at a single site, the Lindner Center for Research and Education, Christ Hospital, Cincinnati, Ohio. Lindner staff members used their research database system to store detailed patient data, including patient feedback and survival information from at least six consecutive months of telephone follow-up.

      Lindner doctors were high-volume practitioners of interventional cardiology involving percutaneous coronary intervention (PCI). Specifically, all Lindner operators performed >200 PCIs/year, and their average was 280 PCIs per operator in 1997. The only viable alternative to some PCI procedures is open-heart surgery, such as a coronary artery bypass graft (CABG).

      Follow-up analyses of 1472 consecutive PCIs performed at the Lindner Center in 1997 found that their research database contained the “initial” PCIs for 996 distinct patients. Of these patients, 698 (roughly 70% of the 996) had received usual PCI care augmented with planned or rescue use of a new “blood thinner” treatment and are considered the treated group in later analyses. On the other hand, 298 patients (roughly 30% of the 996) did not receive the blood thinner during their initial PCI at Lindner in 1997; these 298 patients constitute the “usual PCI care alone” treatment cohort (control group). Details of the variables included in the data set are provided in section 3.5.2. The simulated PCI15K data set is used in the example analyses of Chapter 7 (stratification), Chapter 14 (generalizability), and Chapter 15 (personalized medicine).

      The term “plasmode” has come to represent data that is based on real data (Gadbury et al. 2008). In our case, we wanted a data set that contained no actual patient data – in order that we could freely share and allow readers to implement the various approaches in this book without confidentiality or ownership issues. However, we also wanted data that was truly representative of real world health care research – maintaining the complex correlation structures and addressing common research interests. Thus, “plasmode” simulations based on the REFLECTIONS and Lindner studies were used to generate the data sets used in the remainder of this book. In particular, the method of rank transformations of Conover and Iman (1976) as implemented by Wicklin (2013) serves as the basis for the programs.

      The Peng et al. (2015) analysis from the REFLECTIONS study included 1575 patients in 3 treatment groups based on their treatment at initiation: opioid treatments (378), non-narcotic opioid like treatment (215), and all other treatments (982). Each patient had up to 5 visits including baseline. Tables 3.1 and 3.2 list the key variables in the original analysis data set from which the simulated data was formed.

      Table 3.1: List of Patient-wise Variables

Variable NameVariable Label
SubjIDSubject Number
CohortCohort
GenderGender
AgeAge in years
BMI_BBMI at Baseline
RaceRace
InsuranceInsurance
DrSpecialtyDoctor Specialty
ExerciseExercise
InptHospInpatient hospitalization in last 12 months
MissWorkOthOther missed paid work to help your care in last 12 months
UnPdCaregiverHave you used an unpaid caregiver in last 12 months
PdCaregiverHave you hired a caregiver in last 12 months
DisabilityHave you received disability income in last 12 months
SymDurDuration (in years) of symptoms
DxDurTime (in years) since initial Dx
TrtDurTime (in years) since initial Trtmnt
PhysicalSymp_BPHQ 15 total score at Baseline
FIQ_BFIQ Total Score at Baseline
GAD7_BGAD7 total score at Baseline
MFIpf_BMFI Physical Fatigue at Baseline
MFImf_BMFI Mental Fatigue at Baseline
CPFQ_BCPFQ Total Score at Baseline
ISIX_BISIX total score at Baseline
SDS_BSDS total score at Baseline

      Table 3.2: List of Visit-wise Variables

Variable NameVariable Label
VisitVisit
OPIynOpioids
Скачать книгу