Explore, with the developer’s technical experts and facilities, why the measure has leveled out in performance ratings. Have the measure submitted for maintenance review in three years.
This measure calculates the percentage of acute ischemic stroke or hemorrhagic stroke patients who arrive at the emergency department (ED) within two hours of the onset of symptoms and have a head computed tomography (CT) or magnetic resonance imaging (MRI) scan interpreted within 45 minutes of ED arrival. The measure is calculated using chart abstracted data, on a rolling, quarterly basis and is publicly reported, in aggregate, for one calendar year. The measure has been publicly reported, annually, by CMS as a component of its Hospital Outpatient Quality Reporting (OQR) Program since 2012.
-
-
1.5 Measure Type
1.6 Composite Measure
No
1.7 Electronic Clinical Quality Measure (eCQM)
1.8 Level Of Analysis
1.9 Care Setting
1.10 Measure Rationale
Not applicable; this measure is not a paired or grouped measure.
1.11 Measure Webpage
1.20 Testing Data Sources
1.25 Data Sources
This measure is derived from medical record abstraction (paper or electronic). This is not an eMeasure. Administrative claims are listed as a data source because the measure is calculated based on four consecutive quarters of hospital outpatient claims data.
An electronic data collection tool is available from vendors, or facilities can download the free CMS Abstraction & Reporting Tool (CART). Paper tools for manual abstraction are also available for the CART tool. These tools are posted on www.QualityNet.org.
-
1.14 Numerator
Emergency department (ED) acute ischemic stroke or hemorrhagic stroke patients arriving at the ED within 2 hours of the time last known well, with an order for a head CT or MRI scan whose time from ED arrival to interpretation of the Head CT scan is within 45 minutes of arrival.
1.14a Numerator Details
Time from ED arrival to interpretation of the head CT or MRI scan is within 45 minutes of arrival.
First, determine whether the patient encounter meets the denominator criteria (i.e., the patient is age 18 or older, was last known well within two hours of ED arrival, and had a head CT or MRI ordered). If so, assess the chart for the time the head scan was interpreted.
Next calculate the time difference between ED arrival and interpretation time of the head scan. The Head CT or MRI Scan Interpretation Date and Time is defined as the month, day, and year date and time (military time) represented in hours and minutes at which the earliest head CT or MRI scan interpretation was completed or reported.
-
1.15 Denominator
Emergency department acute ischemic stroke or hemorrhagic stroke patients arriving at the ED within two hours of the time last known well with an order for a head CT or MRI scan.
1.15a Denominator Details
First, the patient encounter must meet the stroke population criteria, which include an ED encounter identified by one of these six CPT® evaluation and management codes:
99281 Emergency department visit, new or established patient
99282 Emergency department visit, new or established patient
99283 Emergency department visit, new or established patient
99284 Emergency department visit, new or established patient
99285 Emergency department visit, new or established patient
99291 Critical care, evaluation and management
The encounter date must fall within the appropriate calendar year, and the patient must be 18 years or older with a principal diagnosis of ischemic or hemorrhagic stroke as identified by the detailed lists located in the Excel file titled “OP Table 8.0: Ischemic Hemorrhagic Stroke.”
If the patient encounter meets the stroke population criteria, it is evaluated for inclusion in the denominator. For the denominator, first assess the Emergency Department Acute Ischemic Stroke or Hemorrhagic Stroke patient to determine the “Time Last Known Well”. The last known well is defined as the time prior to hospital arrival at which the patient was last known to be without the signs and symptoms of the current stroke or at his or her baseline state of health. Next, calculate the difference between the time the patient arrived in the ED and the time of the last known well. If the difference is 120 minutes or less, determine whether an order for a head CT or MRI scan exists.
-
1.15b Denominator Exclusions
Patients are excluded if they are less than 18 years of age, expired in the ED, left the ED against medical advice or discontinued care, or have a Discharge Code that is not documented or could not be determined.
1.15c Denominator Exclusions Details
Patients excluded are those who meet any of the following criteria:
- less than 18 years of age at the start of the encounter
- expired (discharge code = 6)
- left the emergency department against medical advice or discontinued care (discharge code = 7)
- discharge code is not documented or was unable to be determined (discharge code = 8)
-
OLD 1.12 MAT output not attached
Attached
1.13 Attach Data Dictionary
1.13a Data dictionary not attached
Yes
1.16 Type of Score
1.17 Measure Score Interpretation
Better quality = Higher score
1.18 Calculation of Measure Score
This measure calculates the percentage of acute ischemic stroke or hemorrhagic stroke patient encounters in which the patient arrives at the ED within two hours of the last known well/onset of symptoms and has a head CT or MRI interpreted within 45 minutes of ED arrival. The measure is calculated based on four consecutive quarters of hospital outpatient encounter claims data, as follows:
1. Check E/M Code; if on Table 1.0, proceed
2. Calculate Patient Age (Outpatient Encounter Date minus Birthdate)
3. Check Patient Age; if >= 18, proceed
4. Check ICD-10-CM Principal Diagnosis Code; if on Table 8.0, proceed
5. Check Discharge Code; exclude any patients with code 6, 7, or 8
6. Check for a Head CT or MRI Scan Order; if “Yes,” proceed
7. Check Last Known Well documented; if “Yes,” proceed
8. Check Date Last Known Well; if not an Unable to Determine (UTD) value, proceed
9. Check Time Last Known Well; if not a UTD value, proceed
10. Check Arrival Time; if not a UTD value, proceed
11. Calculate Last Known Well Minutes measurement value (Outpatient Encounter Date and Arrival Time minus Date Last Known Well and Time Last Known Well, in minutes)
12. Check Last Known Well Minutes measurement value; if >= 0 min and <= 120 min, record as the denominator and proceed
13. Check Head CT or MRI Scan Interpretation Date; if not a UTD value, proceed
14. Check Head CT or MRI Scan Interpretation Time; if not a UTD value, proceed
15. Calculate Head CT/CTA or MRI/MRA Scan Minutes measurement value (Head CT or MRI Scan Interpretation Date and Head CT or MRI Scan Interpretation Time minus Outpatient Encounter Date and Arrival Time, in minutes)
16. Check Head CT/CTA or MRI/MRA Scan Minutes measurement value; if >= 0 min and <= 45 min, record as the numerator
17. Aggregate denominator and numerator counts by Medicare provider number. Measure = numerator counts / denominator counts [The value should be recorded as a percentage]
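The calculation steps above can be sketched in code. The following Python is a minimal illustration only, not the measure calculation contractor's implementation. The `Encounter` class, its field names, and the `stroke_dx_codes` argument (standing in for the contents of OP Table 8.0, which are not reproduced here) are hypothetical. A UTD value is modeled as `None`, and the sketch assumes such cases cannot proceed to the corresponding time calculation. Aggregation by Medicare provider number is left out for brevity.

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Iterable, Optional, Set

# The six CPT E/M codes listed in the denominator details.
ED_EM_CODES = {"99281", "99282", "99283", "99284", "99285", "99291"}
# Discharge codes 6 (expired), 7 (left AMA / discontinued care), 8 (not documented / UTD).
EXCLUDED_DISCHARGE_CODES = {6, 7, 8}


@dataclass
class Encounter:  # hypothetical record layout, for illustration only
    em_code: str
    birthdate: date
    encounter_date: date
    principal_dx: str
    discharge_code: int
    scan_ordered: bool
    lkw_documented: bool
    arrival: Optional[datetime]           # None models a UTD arrival time
    last_known_well: Optional[datetime]   # None models a UTD date or time
    scan_interpreted: Optional[datetime]  # None models a UTD date or time


def age_in_years(born: date, on: date) -> int:
    """Whole years between birthdate and encounter date."""
    return on.year - born.year - ((on.month, on.day) < (born.month, born.day))


def minutes_between(start: datetime, end: datetime) -> float:
    return (end - start).total_seconds() / 60


def in_denominator(e: Encounter, stroke_dx_codes: Set[str]) -> bool:
    """Population, exclusion, and last-known-well checks."""
    if e.em_code not in ED_EM_CODES:
        return False
    if age_in_years(e.birthdate, e.encounter_date) < 18:
        return False
    if e.principal_dx not in stroke_dx_codes:
        return False
    if e.discharge_code in EXCLUDED_DISCHARGE_CODES:
        return False
    if not (e.scan_ordered and e.lkw_documented):
        return False
    if e.last_known_well is None or e.arrival is None:
        return False
    return 0 <= minutes_between(e.last_known_well, e.arrival) <= 120


def in_numerator(e: Encounter) -> bool:
    """Interpretation within 45 minutes of arrival; call only for denominator cases."""
    if e.scan_interpreted is None or e.arrival is None:
        return False
    return 0 <= minutes_between(e.arrival, e.scan_interpreted) <= 45


def measure_score(encounters: Iterable[Encounter], stroke_dx_codes: Set[str]) -> Optional[float]:
    """Percentage of denominator encounters that meet the numerator, or None if empty."""
    denominator = [e for e in encounters if in_denominator(e, stroke_dx_codes)]
    if not denominator:
        return None
    numerator = sum(in_numerator(e) for e in denominator)
    return 100.0 * numerator / len(denominator)
```

In this sketch, an encounter whose scan interpretation time is UTD stays in the denominator but does not count toward the numerator, which lowers the facility's score.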
1.19 Measure Stratification Details
Not Applicable; this measure is not stratified.
1.26 Minimum Sample Size
Eleven is the minimum number of cases required for public reporting.
-
Most Recent Endorsement Activity
Initial Recognition and Management Fall 2023
Initial Endorsement
Last Updated
-
Steward
Centers for Medicare & Medicaid Services
Steward Organization POC Email
Steward Organization URL
Steward Organization Copyright
N/A
Measure Developer Secondary Point Of Contact
Erin Buchanan
Mathematica
1100 1st Street, NE, 12th Floor
Washington, DC 20002
United States
Measure Developer Secondary Point Of Contact Email
-
-
-
2.1 Attach Logic Model
2.2 Evidence of Measure Importance
Powers et al. (2019) and the AHA/ASA Clinical Guidelines Writing Group published updated clinical guideline recommendations for the management of acute ischemic stroke, which support the measure’s intent. Several strategies in the guidelines have demonstrated improvement in door-to-imaging times (e.g., Emergency Medical Services activation, assessment, and management of patients). Other strategies, such as telemedicine and teleradiology, can improve access to care. Non-contrast CT and MRI remain effective in excluding intracerebral hemorrhage before intravenous alteplase administration, which aligns with OP-23. To identify patients who may benefit from mechanical thrombectomy between 6 and 24 hours after last known well time, the guidelines also recommend computed tomography angiography or magnetic resonance (MR) angiography with diffusion-weighted magnetic resonance imaging with or without MR perfusion. (Citation: Powers, W. J., Rabinstein, A. A., Ackerson, T., Adeoye, O. M., Bambakidis, N. C., Becker, K., Biller, J., Brown, M., Demaerschalk, B. M., Hoh, B., Jauch, E. C., Kidwell, C. S., Leslie-Mazwi, T. M., Ovbiagele, B., Scott, P. A., Sheth, K. N., Southerland, A. M., Summer, D., & Tirschwell, D. L. (2019). Guidelines for the early management of patients with acute ischemic stroke: 2019 update to the 2018 guidelines for the early management of acute ischemic stroke: A guideline for healthcare professionals from the American Heart Association/American Stroke Association. Stroke, 50(12), e344–e418. https://doi.org/10.1161/STR.0000000000000211)
The updated guidelines include the following recommendations:
Recommendation 1: All patients with suspected acute stroke should receive emergency brain imaging evaluation on first arrival to a hospital before initiating any specific therapy to treat AIS.
Recommendation 2: Systems should be established so that brain imaging studies can be performed as quickly as possible in patients who may be candidates for IV fibrinolysis or mechanical thrombectomy or both.
The benefit of IV alteplase is time dependent, with earlier treatment within the therapeutic window leading to bigger proportional benefits. A brain imaging study to exclude ICH is recommended as part of the initial evaluation of patients who are potentially eligible for these therapies. With respect to endovascular treatment, a pooled analysis of 5 randomized trials comparing EVT with medical therapy alone, in which the majority of the patients were treated within 6 hours, found that the odds of improved disability outcomes at 90 days (as measured by the mRS score distribution) declined with longer time from symptom onset to arterial puncture. The 6- to 16- and 6- to 24-hour treatment window trials, which used advanced imaging to identify a relatively uniform patient group, showed limited variability of treatment effect with time in these highly selected patients. The absence of detailed screening logs in these trials limits estimations of the true impact of time in this population. To ensure that the highest proportion of eligible patients presenting in the 6- to 24-hour window have access to mechanical thrombectomy, evaluation and treatment should be as rapid as possible. Reducing the time interval from ED presentation to initial brain imaging can help to reduce the time to treatment initiation. Studies have shown that median or mean door-to-imaging times of ≤20 minutes can be achieved in a variety of different hospital settings.
Recommendation 3: Noncontrast CT (NCCT) is effective to exclude ICH before IV alteplase administration.
Recommendation 4: Magnetic resonance (MR) imaging (MRI) is effective to exclude ICH before IV alteplase administration.
Recommendation 5: (new recommendation) CTA with CTP or MR angiography (MRA) with diffusion-weighted magnetic resonance imaging (DW-MRI) with or without MR perfusion is recommended for certain patients. In many patients, the diagnosis of ischemic stroke can be made accurately on the basis of the clinical presentation and either a negative NCCT or one showing early ischemic changes, which can be detected in the majority of patients with careful attention. NCCT scanning of patients with acute stroke is effective for the rapid detection of acute ICH. NCCT was the only neuroimaging modality used in the National Institute of Neurological Disorders and Stroke (NINDS) rt-PA (Recombinant Tissue-Type Plasminogen Activator) trials and in ECASS (European Cooperative Acute Stroke Study) III and is therefore sufficient neuroimaging for decisions about IV alteplase in most patients. Immediate CT scanning provides high value for patients with acute stroke. MRI was as accurate as NCCT in detecting hyperacute intraparenchymal hemorrhage in patients presenting with stroke symptoms within 6 hours of onset when gradient echo sequences were used. In patients who awake with stroke or have unclear time of onset >4.5 hours from baseline or last known well, MRI to identify diffusion-positive fluid-attenuated inversion recovery (FLAIR)–negative lesions can be useful for selecting those who can benefit from IV alteplase administration within 4.5 hours of stroke symptom recognition. CTA with CTP or MRA with DW-MRI with or without MR perfusion is useful for selecting candidates for mechanical thrombectomy between 6 and 24 hours after last known well.
Waqas et al. (2019) reviewed clinical practice guidelines and literature and recommended that emergency departments (1) develop specific protocols to triage patients based on whether a patient is admitted to the ED via an emergency medical services (EMS) transport, ED walk-in, or in-hospital stroke; (2) initiate imaging orders including non-contrast brain computed tomography (CT) scans, CT angiograms, CT perfusion imaging, and/or magnetic resonance imaging; (3) interpret scans within 20 minutes of presentation (based on the Stroke Process Time Metrics recommended by the Society of Neurointerventional Surgery); and (4) coordinate care transitions with ED facilities or an appropriate stroke center (Waqas 2019). Overall, this article reinforces the intent of OP-23 to provide timely stroke diagnosis and recommends strategies hospitals can take, such as developing context-specific protocols and coordinating care within the ED, to reduce the time from door to imaging results interpretation. (Citation: Waqas, M., Vakharia, K., Munich, S., Morrison, J., Mokin, M., Levy, E., & Siddiqui, A. (2019). Emergency Room Triage of Acute Ischemic Stroke. Neurosurgery, 85(suppl_1).S38-S46. https://doi.org/10.1093/neuros/nyz067)
-
2.6 Meaningfulness to Target Population
Lang et al. performed a cohort study to examine the benefit of tPA on patient-reported outcomes and health care utilization 6 months after stroke by comparing patients who received tPA as part of usual stroke management with patients who would have received tPA had they arrived at the hospital within the therapeutic time window. Data were collected from surveys 6 months after stroke using standardized patient-reported outcome measures and questions about health care utilization. Demographic and medical data were acquired from hospital records. The tPA (n = 78) and control (n = 156) groups were matched across variables, except for stroke severity, which was better in the control group; subsequent analyses controlled for this mismatch. The tPA group reported better physical function, communication, cognitive ability, depressive symptomatology, and quality of life/participation compared with the control group, and fewer people in the tPA group reported skilled nursing facility stays, emergency department visits, and rehospitalizations after their stroke. Lang et al. found that the use of tPA provides a large benefit to the daily lives of people with ischemic stroke. (Reference: Lang C, Bland M, Cheng N, Corbetta M, et al. A case-control study of the effectiveness of tissue plasminogen activator on 6 month patient-reported outcomes and health care utilization. Journal of Stroke and Cerebrovascular Diseases: The Official Journal of National Stroke Association. 2014; 23(10):2914–2919).
-
Table 1. Performance Scores by Decile
Performance Gap: Mean Performance Score, N of Entities, N of Persons / Encounters / Episodes
Overall 74.10 1431 28174
Minimum 5.56 1 18
Decile_1 36.70 161 2866
Decile_2 56.54 130 2350
Decile_3 64.80 139 2663
Decile_4 71.41 144 2798
Decile_5 76.00 149 3165
Decile_6 79.66 142 2985
Decile_7 83.10 151 2972
Decile_8 87.06 137 3227
Decile_9 91.89 141 2490
Decile_10 97.96 137 2658
Maximum 100 78 1237
-
-
-
3.1 Feasibility Assessment
Not applicable during the Fall 2023 cycle.
3.3 Feasibility Informed Final MeasureNot applicable. This measure is being submitted for maintenance.
-
3.4a Fees, Licensing, or Other Requirements
No fees, licensure, or other requirements are necessary to use this measure; however, CPT codes, descriptions, and other data are copyright 2022 American Medical Association. All rights reserved. CPT is a registered trademark of the American Medical Association. Applicable FARS\DFARS Restrictions Apply to Government Use. Fee schedules, relative value units, conversion factors, and/or related components are not assigned by the AMA, are not part of CPT, and the AMA is not recommending their use. The AMA does not directly or indirectly practice medicine or dispense medical services. The AMA assumes no liability for data contained or not contained herein.
3.4 Proprietary InformationProprietary measure or components (e.g., risk model, codes), without fees
-
-
-
4.1.3 Characteristics of Measured Entities
Data for both Clinical Data Abstraction Center (CDAC) and Clinical Data Warehouse (CDW) was obtained for 01-01-2018 through 12-31-2021 exclusive of January 1 through June 30, 2020 arrival date times due to COVID-19 considerations. The CDAC data contained 2,654 patients in 968 facilities and CDW contained 213,527 patients in 3,881 facilities. The data presented below in table 2 represents additional characteristics of the data used for testing.
Table 2. Characteristics of Facilities Meeting Minimum Case Count
Characteristics CDAC CDW
Date Collected 2018-01-01 to 2021-12-31 2018-01-01 to 2021-12-31
Sampled Population 2,654 213,527
Number of Facilities 968 3,881
Denominator Cases 1,650 139,865
Numerator Cases 1,195 104,023
Level of Analysis Facility Level Facility Level
4.1.1 Data Used for Testing
This measure was tested using patient record data abstracted from paper records, claims, and electronic health records stored in the Clinical Data Warehouse (CDW) and the Clinical Data Abstraction Center (CDAC). Data were obtained for 01-01-2018 through 12-31-2021, exclusive of January 1 through June 30, 2020 arrival date times. There are no differences in data for different aspects of testing.
4.1.4 Characteristics of Units of the Eligible Population
The data presented below in Table 3 represent characteristics of patients included in the testing analysis. There are no differences in data for different aspects of testing. The majority of patients were white, non-Hispanic, and aged 60-79, and had an ischemic stroke. There was a fairly even split between male and female patients.
Table 3. Patient Characteristics among Facilities Meeting Minimum Case Count
Groups Number of patients (CDW) Performance Rates (CDW) Number of patients (CDAC) Performance Rates (CDAC)
Sex - - - -
Female 105286 73.41% 1313 70.98%
Male 108194 75.33% 1339 73.84%
Unknown Sex 47 66.67% 2 100%
Age - - - -
18-39 8885 63.48% 133 57.89%
40-59 52041 71.82% 640 68.95%
60-79 103634 75.19% 1262 73.51%
80 and Older 48967 77.66% 619 77.69%
Race - - - -
Asian 4704 73.22% 45 77.27%
Black or African American 26352 72.58% 359 71.62%
Unknown or Other 14339 71.15% 171 75.45%
White 168132 74.95% 2079 72.22%
Ethnicity - - - -
Hispanic/Latino 15977 68.56% 203 68.50%
Not Hispanic/Latino 197550 74.81% 2451 72.75%
Diagnosis - - - -
Hemorrhagic stroke 47385 66.38% 593 61.20%
Ischemic stroke 166142 76.93% 2061 76.19%
4.1.2 Differences in Data
Not applicable. There are no differences in data for different aspects of testing.
-
4.2.1 Level(s) of Reliability Testing Conducted
4.2.2 Method(s) of Reliability Testing
Reliability was calculated in accordance with the signal-to-noise method discussed in The Reliability of Provider Profiling: A Tutorial (2009). This approach calculates the ability of the measure to distinguish between facility performance. We calculated the signal-to-noise ratio for each facility meeting the minimum case count of 11, established by the measure calculation contractor during the data collection period, with higher scores indicating greater reliability. The reliability score is estimated using a beta-binomial model, which is appropriate for the reliability testing of pass/fail measures. The reliability score for each facility is a function of the facility’s sample size and score on the measure, and the variance across facilities.
Adams JL. The reliability of provider profiling: a tutorial. Santa Monica, CA: RAND Corporation. 2009. Retrieved from http://www.rand.org/pubs/technical_reports/TR653.
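As a rough illustration of the signal-to-noise approach described above, the sketch below computes a per-facility reliability score under a beta-binomial model. It assumes the beta parameters `alpha` and `beta` have already been estimated from the distribution of facility scores (the fitting step is not shown), and it uses the common formulation in which the between-facility variance is the variance of the fitted beta distribution and the within-facility variance is p(1 − p)/n. The function names are hypothetical, and this is not the developer's code.

```python
def between_facility_variance(alpha: float, beta: float) -> float:
    """Variance of a Beta(alpha, beta) distribution (the 'signal')."""
    total = alpha + beta
    return (alpha * beta) / (total ** 2 * (total + 1))


def facility_reliability(passed: int, cases: int, alpha: float, beta: float) -> float:
    """Signal-to-noise reliability for one facility.

    passed / cases is the facility's observed pass rate; alpha and beta are
    assumed to come from a beta-binomial fit across all facilities.
    """
    if cases <= 0:
        raise ValueError("cases must be positive")
    p = passed / cases
    signal = between_facility_variance(alpha, beta)
    noise = p * (1 - p) / cases  # sampling variance of the facility's rate
    return signal / (signal + noise)
```

Under this formulation, reliability rises with the number of cases, and a facility scoring exactly 0% or 100% has zero estimated sampling variance and therefore a reliability of 1.00.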
4.2.3 Reliability Testing Results
Table 4 displays the distribution of signal-to-noise scores from 2021. Higher scores denote greater reliability. Reliability scores ranged from 0.43 to 1.00, and the mean reliability score was 0.68.
Table 4. Results of Reliability Testing Based on Signal-to-Noise Analysis
Year: 2021
Number of Facilities: 1431
Mean: 0.68
Standard Deviation: 0.15
Min: 0.43
5th Percentile: 0.45
10th Percentile: 0.48
25th Percentile: 0.56
50th Percentile: 0.67
75th Percentile: 0.77
90th Percentile: 0.87
95th Percentile: 1.00
Max: 1.00
Table 2. Accountable Entity–Level Reliability Testing Results by Denominator-Target Population Size
Accountable Entity-Level Reliability Testing Results: Reliability, N of Entities, N of Persons / Encounters / Episodes
Overall 0.68 1431 28174
Minimum 0.43 15 165
Decile_1 0.46 144 1682
Decile_2 0.52 153 2180
Decile_3 0.57 134 2006
Decile_4 0.61 143 2404
Decile_5 0.65 146 2688
Decile_6 0.69 143 2885
Decile_7 0.72 138 3177
Decile_8 0.77 144 3576
Decile_9 0.83 143 4272
Decile_10 0.96 143 3304
Maximum 1.00 78 1237
4.2.4 Interpretation of Reliability Results
While there is no universal standard cut-off for signal-to-noise reliability, a reliability of 0.70 is considered the acceptable threshold. Our 2021 results, a median reliability score of 0.67 and a mean reliability score of 0.68, approach the 0.7 cut-off, indicating moderate reliability. Our results also align with the Draft Acceptable Reliability Thresholds suggested by the National Quality Forum (NQF) Scientific Methods Panel (SMP) in 2021 which propose the threshold of 0.6 ≥ 0.9 for adequate reliability. Our results indicate that the measure is able to identify true differences in performance between individual facilities.
-
4.3.1 Level(s) of Validity Testing Conducted
4.3.2 Type of accountable entity-level validity testing conducted
4.3.3 Method(s) of Validity Testing
Data Element Validity.
We assessed the data element validity of the measure by calculating a rate of agreement between facility abstraction (sourced from the CDW) and auditor (CDAC) abstraction for each of the data elements used to calculate the measure. The analysis used data element values for 1548 denominator cases abstracted by CDAC, which were previously abstracted by facilities. We then used Gwet’s AC-1 statistic to account for chance agreement. A Gwet’s AC-1 statistic less than 0.5 indicates a fair agreement, 0.5-0.8 indicates a medium effect size, and greater than or equal to 0.8 indicates a large effect size.
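For readers unfamiliar with Gwet's AC1, the sketch below shows one way the statistic can be computed for two abstractions of the same cases (for example, the facility value and the CDAC value for a single data element). It follows the standard two-rater formula: observed agreement minus chance agreement, divided by one minus chance agreement, with chance agreement based on the average marginal proportion for each category. The function name is hypothetical, and the sketch assumes the set of categories is the set of values actually observed, which may differ from how the developer handled categories.

```python
from collections import Counter
from typing import Hashable, Sequence


def gwet_ac1(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Gwet's AC1 agreement coefficient for two raters on the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    n = len(rater_a)
    categories = set(rater_a) | set(rater_b)
    if len(categories) < 2:
        # Only one value was ever recorded: agreement is complete and the
        # chance-agreement term is undefined, so report 1.0 by convention.
        return 1.0
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    chance = 0.0
    for category in categories:
        # Average of the two raters' marginal proportions for this category.
        pi = (count_a[category] + count_b[category]) / (2 * n)
        chance += pi * (1 - pi)
    chance /= len(categories) - 1
    return (observed - chance) / (1 - chance)
```

Because the chance term shrinks when one category dominates, AC1 tends to stay close to percent agreement for highly skewed data elements, which is consistent with the near-identical percent agreement and AC1 values reported for this measure.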
Hypothesis-driven validity.
We assessed the validity of the measure through literature-informed hypothesis testing. Based on our review of the literature (references 1 and 2 below), we anticipated that female patients would have a longer arrival to CT interpretation time than male patients. In addition to the t-statistic to detect statistical differences, we calculated Cohen’s d to show whether a difference is meaningful in practice.
1. Sex and Race‐Ethnic Disparities in Door‐to‐CT Time in Acute Ischemic Stroke: The Florida Stroke Registry. Sai P. Polineni MPH, Enmanuel J. Perez MD, PhD, Kefeng Wang MS, Carolina M. Gutierrez PhD, Jeffrey Walker MBA‐HCM, Dianne Foster RN, BSN, MBA, Chuanhui Dong PhD, Negar Asdaghi MD, Jose G. Romano MD, Ralph L. Sacco MD, MS, Tatjana Rundek MD, PhD, for the Florida Stroke Registry
2. Predictors of Time From Hospital Arrival to Initial Brain-Imaging Among Suspected Stroke Patients. Kathryn M. Rose, PhD, Wayne D. Rosamond, PhD, Sara L. Huston, PhD, Carol V. Murphy, RN, MPH, and Charles H. Tegeler, MD
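The two statistics named in the hypothesis-driven approach can be sketched as follows. This is an illustration under assumptions: the source does not say which form of the t-test was used, so Welch's unequal-variance version is shown, and Cohen's d uses the pooled standard deviation. The function names and the inputs (lists of arrival-to-interpretation times in minutes for each group) are hypothetical.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence


def cohens_d(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Standardized mean difference using the pooled sample standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


def welch_t(group_a: Sequence[float], group_b: Sequence[float]) -> float:
    """Welch's t statistic (assumed here; the source does not name the variant)."""
    standard_error = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error
```

The t statistic grows with sample size while Cohen's d does not, so with many thousands of patients a difference can be statistically significant even when d is close to zero.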
4.3.4 Validity Testing Results
As demonstrated in Table 5, percent agreement ranged from 85% to 100%. Head CT/MRI Scan Interpretation Time had a percent agreement of 85% and a Gwet’s AC1 score of 0.83. Head CT/MRI Scan Order, Last Known Well, Principal ICD code, E/M Code, Date Last Known Well (LKW), and Head CT/MRI Scan Interpretation Date had complete agreement (100%) and Gwet’s AC1 scores of 1.
Table 5. Data Element Validity for Categorical Variables, Non-categorical Variables, and Constructed Outcomes
Variable n Percent Agreement Gwet’s AC1
Discharge Code 1548 98% 0.98
Head CT/MRI Scan Order 1548 100% 1.00
Last Known Well 1548 100% 1.00
Principal ICD code 1548 100% 1.00
E/M Code 1548 100% 1.00
Arrival time 1548 99% 0.99
Date Last Known Well (LKW) 1548 100% 1.00
Time LKW 1548 93% 0.93
Head CT/MRI Scan Interpretation Date 1548 100% 1.00
Head CT/MRI Scan Interpretation Time 1548 85% 0.83
Numerator 1548 97% 0.97
Denominator 1548 100% 1.00
Hypothesis-driven validity
Table 6 shows that in 2021, the mean difference between females and males was 2.83 with a t-score of 2.47, p-value of 0.01 and Cohen’s d of 0.06.
Table 6. Empirical Validity Analysis of Differences between Males and Females
Year: 2021
Category: Patient Sex
Value: Female vs. Male
Mean Difference: 2.83
Confidence Interval Lower Limit: 0.58
Confidence Interval Upper Limit: 5.07
t: 2.47
p: 0.01
Cohen’s d: 0.06
4.3.5 Interpretation of Validity Results
Data Element Validity.
Results demonstrated that the agreement between the data source and the gold standard is high, and the measure score correctly reflects the quality of care provided by identifying differences in quality. We used Gwet’s AC1 statistic to account for agreement by chance, a more robust measure of concordance than overall agreement.
Hypothesis-driven validity.
For 2021, the difference between females and males was statistically significant; however, the Cohen’s d of 0.06 indicates that the effect size is very small, well below the conventional 0.2 threshold for a small effect. The groups differ by 0.06 standard deviations. From these results, we conclude that the differences by sex between ED arrival and Head CT/MRI scan are statistically significant. This conclusion aligns with the literature, which indicates that stroke signs are not always identified as quickly in women as they are in men.
-
4.4.1 Methods used to address risk factors
4.4.1b If an outcome or resource use measure is not risk adjusted or stratified
Not applicable. This measure is not an outcome or resource use measure and is not risk adjusted or stratified.
-
-
-
5.1 Contributions Towards Advancing Health Equity
Optional question
-
-
-
6.1.3 Current Use(s)
6.1.4 Program Details
The CMS Hospital Outpatient Quality Reporting Program, https://www.cms.gov/medicare/quality-initiatives-patient-assessment-instruments/hospitalqualityinits/hospitaloutpatientqualityreportingprogram, The Hospital OQR Program is a pay for quality data reporting program implemented by CMS for outpatient hospital services. In addition to providing hos, National, The publicly reported values (on Hospital Compare) are calculated for all facilities participating in the Hospital OQR Program in the United States th
-
6.2.1 Actions of Measured Entities to Improve Performance
In order to improve performance on this measure, measured entities must educate their providers around following guidelines for diagnosing and treating an acute ischemic stroke. These actions do not cause undue burden to the measure entities.
6.2.2 Feedback on Measure Performance
Feedback received from stakeholders (via the ServiceNow tool) is used to revise the measure specifications. Following receipt of a suggestion to adjust the specifications, a literature review is performed to determine whether the proposed change aligns with the empirical evidence base for the measure, and feedback from the expert work group is obtained to evaluate the change to the specifications. To date, stakeholders have raised no significant concerns about the measure specifications through ServiceNow. In addition, stakeholders may submit comments on the measure through the Outpatient Prospective Payment System (OPPS) annual rule-making process. No comments were received for this measure during the most recent OPPS rule-making cycle.
6.2.3 Consideration of Measure Feedback
To date, we have received no significant feedback about the measure specifications.
6.2.4 Progress on Improvement
Summary statistics of performance scores during the January 1, 2018 through December 31, 2021 data collection periods are provided in the Gap section. In 2015, the average hospital score was 71.28% among 1276 hospitals. In 2016, the average change in hospital scores was 1.43% and the average hospital score was 73.27% among 1401 hospitals. In 2017, the average change in hospital scores was 1.64% and the average hospital score was 74.33% among 1507 hospitals. In 2018, the average change in hospital scores was 0.26% and the average hospital score was 73.21% among 1607 hospitals. In 2019, the average change in hospital scores was 0.28% and the average hospital score was 73.73% among 1592 hospitals. In 2020, the average change in hospital scores was 0.54% and the average hospital score was 75.89% among 502 hospitals. In 2021, the average change in hospital scores was 1.42% and the average hospital score was 71.53% among 1492 hospitals.
Performance scores have remained stable over the years, showing continued room for improvement. As noted in prior submissions, the number of patients receiving high-quality healthcare as performance on the measure improves is larger than the number of cases captured by the measure because a hospital can choose to report only a sample of cases.
6.2.5 Unexpected Findings
We did not identify any unintended consequences during measure testing. Similarly, no evidence of unintended consequences to individuals or populations has been reported by external stakeholders since its implementation. We will continue to monitor the potential for unintended consequences through an annual review of the literature as well as an ongoing review of stakeholder comments and inquiries. The risk in advancing measures that address timeliness is that there may be a decrease in testing performance to avoid measurement; however, this is not likely because diagnostic results must be assessed to ensure a proper diagnosis.
-
-
-
CBE #0661 Staff Assessment
Importance
Strengths:
- Updated clinical guidelines cited support the measure concept, including recommendations that suspected stroke patients receive brain imaging studies as soon as possible after arriving at the ED, that non-contrast CT and MRI are both effective at ruling out ICH before treatment, and that certain patients benefit from MRA/CTA (Powers et al. 2019), and a systematic review of guidelines makes similar recommendations (Waqas et al. 2019). The success of treatment with IV alteplase is time dependent, and developers cite a pooled analysis of 5 RCTs, in which most patients were treated with EVT within 6 hours of stroke onset, showing that the odds of improved disability outcomes at 90 days declined with longer time from symptom onset to arterial puncture (reference not provided).
- Mean hospital scores are stable in the 71-75% range from 2015 to 2020, showing room for improvement. Performance scores ranged from a minimum of 5.56% to maximum of 100%. Developer notes that performance scores are not limited to facilities present each year, indicating the limited number of facilities in 2020 skewed performance scores higher.
- Meaningfulness to patients was demonstrated by citing one study using patient survey and matched controls, which showed that treatment with tPA within the therapeutic time window was associated with better physical function, communication, cognitive ability, depressive symptomatology, and quality of life/participation compared with control, and fewer SNF stays, ED visits, and readmissions (Lang et al., 2014).
Limitations:
- Limitation of the pooled analysis is that 6-16 and 6-24 hour window trials showed limited variability of treatment effect with time, which the developers interpreted as evidence for the importance of rapid imaging (reference not provided).
- Sample in Lang et al. was relatively small (tPA (n = 78); control (n = 156))
Rationale:
The 2019 clinical guidelines, a systematic review of guidelines, and a pooled analysis of 5 randomized controlled trials support the measure concept, emphasizing the importance of rapid brain imaging in suspected stroke patients visiting the ED, the availability of effective imaging tests, and the benefits of early treatment. Meaningfulness to patients was demonstrated in a small sample study using patient surveys, which showed improved function and quality of life, and reduced utilization among patients who received early tPA treatment. Performance scores show room for improvement and substantial variability.
Feasibility Acceptance
Feasibility AcceptanceStrengths:
- Developer does not state this explicitly but data elements are listed and all seem plausible as normally collected during the normal course of care. Developer states that “No fees, licensure, or other requirements are necessary to use this measure; however, CPT codes, descriptions, and other data are copyright 2022 American Medical Association [AMA].”
Limitations:
- Developers do not mention whether AMA copyright is a potential burden or barrier for providers. Under the SA data description, developer notes “Proprietary measure or components (e.g., risk model, codes).”
Rationale:
The measure appears to be feasible insofar as data elements are collected in the normal course of care and no fees or licensure are required; however, it is not clear how the AMA copyright or proprietary components could impact providers.
Scientific Acceptability
Scientific Acceptability ReliabilityStrengths:
- The measure is well defined and specified.
- Accountable entity-level (i.e., measure score) reliability testing was estimated using signal-to-noise analysis on a 2021 dataset consisting of 28,174 persons across 1,431 facilities meeting the minimum count of 11 cases. A decile table of reliability by population size is provided. The median reliability is 0.68. The mean of the 3rd decile is 0.57, and the mean of the 4th decile is 0.61, which indicates that 65-70% of the entities have a reliability >0.6.
Limitations:
- Approximately 30-35% of entities have reliability less than 0.6, likely facilities with a low denominator size. Consider mitigation for entities with low case counts.
Rationale:
Measure score reliability testing (accountable entity-level) was performed. However, reliability is <0.6 for 30-35% of entities. Possible mitigation strategies to improve these estimates include:
- Apply the empirical approaches outlined in the report MAP 2019 Recommendations from the Rural Health Technical Expert Panel Final Report, https://www.qualityforum.org/WorkArea/linkit.aspx?LinkIdentifier=id&ItemID=89673
- Consider a higher minimum case volume.
- Extend the time frame.
- Focus on applying mitigation at the lower volume providers.
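To illustrate why a higher minimum case volume or a longer time frame would be expected to raise reliability, the following is a minimal sketch of a signal-to-noise calculation under a beta-binomial model. It is not the developer's code: the function names and the `ALPHA` and `BETA` values are hypothetical, chosen only so that the implied mean pass rate is near the reported 70s% range.

```python
def between_facility_variance(alpha: float, beta: float) -> float:
    """Variance of a Beta(alpha, beta) distribution, used here as the 'signal'."""
    s = alpha + beta
    return (alpha * beta) / (s * s * (s + 1.0))


def signal_to_noise_reliability(passed: int, cases: int,
                                alpha: float, beta: float) -> float:
    """Reliability = signal / (signal + noise) for one facility.

    Noise is the binomial sampling variance of the facility's observed rate.
    Note: a facility with a rate of exactly 0 or 1 gets zero noise here,
    a known quirk of this simple formulation.
    """
    if cases <= 0:
        raise ValueError("cases must be positive")
    rate = passed / cases
    signal = between_facility_variance(alpha, beta)
    noise = rate * (1.0 - rate) / cases
    return signal / (signal + noise)


# Hypothetical beta parameters (assumption, not from the submission).
ALPHA, BETA = 7.3, 2.7

# Same observed pass rate, different denominators.
small = signal_to_noise_reliability(8, 11, ALPHA, BETA)
large = signal_to_noise_reliability(32, 44, ALPHA, BETA)
print(f"11 cases: {small:.2f}; 44 cases: {large:.2f}")
```

Because the noise term shrinks as the denominator grows, pooling more quarters or raising the minimum case count increases the estimate for a given facility, which is the rationale behind the mitigation options listed above.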
Scientific Acceptability ValidityStrengths:
- Data element validity testing was previously performed for claims, EHR, and paper records.
- Data element validity was assessed by calculating rate of agreement (% agreement and Gwet AC-1) in all data elements used to calculate the measure between facility (CDW) abstracted and auditor abstracted (CDAC, the gold standard) data, using a sample of 1548 denominator cases from CDAC, 2021 data. Minimum agreement was 85% / .83 (head CT/MRI interpretation time); all other elements had 93% / .93 – 100% / 1.0 agreement (large effect).
- Developer conducted accountable entity-level (measure score) validity testing, where the developer hypothesized that female patients would have a longer arrival to CT interpretation time compared to male patients. Mean difference showed longer time for female patients, Cohen’s D of 0.06 (moderate effect size), which the developer claims aligns with literature showing stroke signs are not identified as quickly in women.
- No risk adjustment since the measure is a process measure.
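For readers unfamiliar with the two agreement statistics reported for data element validity, the sketch below computes percent agreement and Gwet's AC1 for two abstractors. It assumes each data element has been reduced to categorical values (for example, match flags) for the same cases; the function names and the example data are hypothetical and do not come from the submission.

```python
from collections import Counter


def percent_agreement(a, b):
    """Share of cases on which the two abstractors recorded the same value."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty rating lists of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def gwet_ac1(a, b):
    """Gwet's AC1 for two raters and categorical ratings."""
    pa = percent_agreement(a, b)
    categories = set(a) | set(b)
    q = len(categories)
    if q < 2:
        # Only one category was ever used, so agreement is trivially perfect.
        return 1.0
    n = len(a)
    count_a, count_b = Counter(a), Counter(b)
    # pi_k: average of the two raters' marginal proportions for category k.
    chance = 0.0
    for k in categories:
        pi_k = (count_a[k] / n + count_b[k] / n) / 2.0
        chance += pi_k * (1.0 - pi_k)
    pe = chance / (q - 1)
    return (pa - pe) / (1.0 - pe)


# Hypothetical abstraction results for four cases (1 = criterion recorded as met).
facility = [1, 1, 1, 1]
auditor = [1, 1, 1, 0]
print(percent_agreement(facility, auditor), gwet_ac1(facility, auditor))
```

AC1 is often preferred over Cohen's kappa when one category dominates, since its chance-agreement term stays small in that situation.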
Limitations:
- Hypothesis testing confirms the expected difference in performance between men and women (worse for women), which aligns with expectation based on slower identification of strokes in women, but does not address the rationale for differences at the entity level. That is, it remains unclear whether the difference reflects a clinical practice concern or underlying patient characteristics.
Rationale:
Data element validity testing demonstrates high agreement between facility data (CDW) and auditor data (CDAC, the gold standard). Hypothesis testing confirms the expected difference in performance between men and women (worse for women), which aligns with expectation based on slower identification of strokes in women, but does not address why there are differences at the entity level. The committee may consider asking the developer to speak to this further.
Equity
EquityStrengths:
- None
Limitations:
- Developer did not address this optional criterion.
Rationale:
Developer did not address this optional criterion.
Use and Usability
Use and UsabilityStrengths:
- Currently in use in the CMS Hospital Outpatient Quality Reporting Program.
- Measured entities are expected to educate providers on guidelines for diagnosing and treating ischemic stroke.
- Feedback is collected through ServiceNow and also through the annual rulemaking process; if warranted, a literature review is performed to evaluate whether the proposed specification change aligns with the evidence base; developers report that no significant concerns were received from stakeholders (to date) or through public comment (over the most recent OPPS rulemaking cycle)
- Performance scores continue to show room for improvement.
- Developer reports no unexpected findings or unintended consequences.
Limitations:
- No mention of feedback reports or similar mechanism for informing providers of their performance.
- While performance scores show room for improvement, they have been stable from 2015 to 2021 (range: 71.28% - 75.89%). In addition, the number of hospitals reporting each year changes considerably (range: 502 - 1607), making it difficult to interpret changes in the rate (e.g., the highest rate was reported in the year with the fewest reporting hospitals). Finally, developers note that the number of patients receiving high quality care is larger than the number of cases captured since a hospital can choose to report only a sample, but developers do not indicate how many hospitals are reporting samples or their sample sizes.
Rationale:
This measure is currently in use in the CMS Hospital Outpatient Quality Reporting Program. To improve quality, hospitals are expected to educate providers on guidelines for diagnosing and treating ischemic stroke; the developers do not mention potential QI mechanisms such as providing performance reports to providers.
This measure continues to show room for improvement; however, the rate has remained largely stable from 2015-2021 and possible reasons for the lack of improvement are not articulated. Developers report no unexpected findings.
Summary
-
Head CT or MRI within 45 Minutes of ED Arrival for Ischemic Stroke
Importance
ImportanceMultiple studies have clearly shown the importance of timely imaging and intervention in acute ischemic stroke to improve short and long term outcomes. This measure supports those clinical guidelines.
Feasibility Acceptance
Feasibility AcceptanceIt is likely as feasible as other measures that are abstracted. That said, it does appear to have many elements that need to be reviewed; a provider or facility without software for this (which could be expensive) will need manual extraction, which can also be expensive and time consuming.
Scientific Acceptability
Scientific Acceptability ReliabilityNo other comments aside from what PQM noted.
Scientific Acceptability ValidityNo other comments
Equity
EquityWasn't really commented on or addressed.
Use and Usability
Use and UsabilityI would also agree with PQM's assessment on this section. The lack of improvement with the measure being in place is puzzling.
Summary
Ultimately, I think the measure is met, but there are questions around equity. Also, no improvement seems to have been noted even though the measure has been in place for some time, and there is a lack of clarity on how to improve on the measure.
I look forward to a discussion of Use and Usability.
Importance
ImportanceAgree with staff assessment.
Feasibility Acceptance
Feasibility AcceptanceAgree with staff assessment.
Scientific Acceptability
Scientific Acceptability ReliabilityAgree with staff assessment.
Scientific Acceptability ValidityAgree with staff assessment.
Equity
EquityAgree with staff assessment.
Use and Usability
Use and UsabilityThere has not been much movement on this measure over many years. Could the developers address why this is the case? Are there nuances to the definition of the measure like key exclusion groups that were not accounted for? For example, I don't see any exclusions for patient transfers. If an ED performs head imaging, and transfers to another ED within that 2 hour timeframe for Neurology or Neurosurgery consultation, for example, the second ED may not require additional STAT head imaging.
Summary
See my comments above on use and usability.
Stable performance and not improving.
Importance
ImportanceClinical guidelines have been in place for a long time. Critical measure for the recovery of the patient.
Feasibility Acceptance
Feasibility AcceptanceData elements are being collected
Scientific Acceptability
Scientific Acceptability ReliabilityMeasure well defined and data is stable without improvement
Scientific Acceptability ValidityValidity testing performed
Equity
EquityNot addressed, but there should be no equity variable for the population.
Use and Usability
Use and UsabilityPart of public reporting. The data may not be used as needed since the data is stable in the 70s% and since this has been in place for a number of years more progress should have been made.
Summary
Part of public reporting. The data may not be used as needed since the data is stable in the 70s% and since this has been in place for a number of years more progress should have been made. This measure must be part of the 5 Star rating and available to the public to view.
Measure seems important, would like more information
Importance
ImportanceThere is strong evidence supporting the value of this measure. I support this measure but would like to see the reason why there is a 45-minute time limit for the CT scan or MRI. I am unable to find it in the literature cited.
Feasibility Acceptance
Feasibility AcceptanceThere is strong evidence that this is a feasible measure. The need for the measure is established by the data: the average score collected over the years has been fairly stable, even though a higher score indicates better patient outcomes. However, the developer fails to clarify whether the proprietary information would be considered a burden to the implementer.
Scientific Acceptability
Scientific Acceptability ReliabilityAgree with staff assessment.
Scientific Acceptability ValidityAgree with staff assessment.
Equity
EquityNot addressed in the information submitted.
Use and Usability
Use and UsabilityThe usability of this measure is well established by the availability for reporting after several years. The need and benefits are clear. However, I would like to understand why the scores have been stagnant for years.
Summary
As for this measure, it seems feasible and usable. However, I would like to understand better why the 45-minute post-arrival time frame was selected and why the scores have been stagnant over the years as reported.
CBE #0661
Importance
ImportanceThe importance of timely identification of stroke type and appropriate treatment is clearly essential.
I would like to know more about what might interfere with rapid access to scanning. Are there in-hospital factors (e.g., crowded waiting rooms and difficulty identifying potential stroke sufferers, especially those who do not arrive via ambulance)? Education factors (how to expand education to patients/families about the signs of stroke)? SDOH factors (rural areas with lengthy travel times to facilities, racial differences in responsiveness)?
Feasibility Acceptance
Feasibility AcceptanceAgree with staff assessment
Scientific Acceptability
Scientific Acceptability ReliabilityNothing to add to staff assessment.
Scientific Acceptability ValidityNothing to add to staff assessment.
Equity
EquityNothing to add to staff assessment.
The fact that response time and identification of stroke symptoms are slower for women than for men is quite serious and should be further studied and rectified at some point.
Use and Usability
Use and UsabilityAgree with staff assessment.
Summary
No comments.
CBE #0661
Importance
ImportanceEvidence and clinical guidelines linking rapid treatment to improved outcomes, appropriate logic model, measure still has ample room for improvement. Meaningfulness to patients also addressed. Perhaps developer could clarify if guideline recommendations were graded.
Feasibility Acceptance
Feasibility AcceptanceRoutine data appear to be generated during the course of care, but the measure as specified requires abstraction of data. Since it has been in use for a long time, it is feasible to report. Would be interested to know from the developer if there is a path to support implementation of an eCQM that could be less burdensome on facilities.
Scientific Acceptability
Scientific Acceptability ReliabilityMedian and mean reliability scores are above the SMP threshold of 0.6. However, reliability for about 1/3 of facilities is below the 0.6 threshold, and E&M guidance suggests that not just the median or mean should be considered. Increasing the minimum case count from 11 may improve reliability.
Scientific Acceptability ValidityData element validity testing indicates high agreement with the gold standard across critical data elements, with interpretation time having the lowest agreement at 85% or 0.83. Hypothesis-driven validity testing results and interpretation do not seem to align: the developer interprets a Cohen’s d of 0.06 as moderate, whereas the common threshold for a moderate effect is 0.5. Perhaps the developer could clarify. Otherwise validity seems appropriate.
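The point about effect size can be made concrete with a short sketch that computes Cohen's d from two samples using the pooled standard deviation and labels it with Cohen's conventional benchmarks (0.2 small, 0.5 medium, 0.8 large). The function names and the arrival-to-interpretation times below are hypothetical and are not data from the submission.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Standardized mean difference (a minus b) using the pooled sample SD."""
    n_a, n_b = len(group_a), len(group_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two observations")
    var_a = statistics.variance(group_a)  # sample variance (n - 1)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b)
                          / (n_a + n_b - 2))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / pooled_sd


def effect_size_label(d):
    """Label |d| using Cohen's conventional benchmarks."""
    magnitude = abs(d)
    if magnitude < 0.2:
        return "negligible"
    if magnitude < 0.5:
        return "small"
    if magnitude < 0.8:
        return "medium"
    return "large"


# Hypothetical minutes from ED arrival to scan interpretation (illustration only).
female_minutes = [38, 52, 41, 47, 60, 35]
male_minutes = [36, 50, 40, 45, 58, 33]
d = cohens_d(female_minutes, male_minutes)
print(round(d, 2), effect_size_label(d))
print(effect_size_label(0.06))  # the value reported in the submission
```

Under these conventional benchmarks, a d of 0.06 falls below even the "small" threshold, which is the mismatch raised in the comment above.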
Equity
EquityRecognize this is an optional criterion. Was significance testing conducted on stratified results? Data appear to be available to address.
Use and Usability
Use and UsabilityMeasure is in use in the Outpatient Quality Reporting Program. Measure scores appear to generally be stable over the timeframe, with some improvement noted early in implementation. Measure has a feedback mechanism with no unintended consequences identified. Would like to understand the developer's rationale for lack of improvement.
Summary
N/A
Head CT or MRI w/in 45" ED Arrival for Ischemic Stroke
Importance
ImportanceAgree with PQM assessment, including the recommendation that suspected stroke patients receive brain imaging studies as soon as possible. Studies, as well as personally witnessed patient success stories, have shown the importance of timely imaging/intervention in acute ischemic stroke to improve patient outcomes.
The mean hospital scores, although stable, do show room for improvement overall.
Feasibility Acceptance
Feasibility AcceptanceThe measure appears to be feasible insofar as data elements are collected in the normal course of care. This measure is currently in use, therefore sites should have processes in place to monitor and drive performance.
Scientific Acceptability
Scientific Acceptability ReliabilityThe measure is well defined and specified
Consider the PQM staff suggestions of a higher minimum case volume and an extended time frame. For lower-volume sites, attention should go to how to keep this best practice a focus for providers.
Scientific Acceptability ValidityAgree with the limitations for the hypothesis testing confirming a difference in performance between male and female patients. How can this measure help sites identify this inequity and drive actions to promote better/quicker recognition for female patients? Is there data showing that improvements with this measure have been made for one gender more than the other (e.g., has the male time dropped while the female time has not, or has the female time improved but still not reached the male timeframe)?
Equity
EquityDeveloper did not address this optional criterion
Use and Usability
Use and UsabilityAgree - Performance scores continue to show room for improvement but have remained largely stable from 2015-2021. Would be interested to know how many sites have increased over the 6 years or remain at a stable level or have declined. Is this measure driving the expected improvements in diagnosing/treating ischemic stroke patients since results have been in the low 70% range for over 5 years?
Summary
Overall, the research behind the measure to drive decreasing the time for diagnosis and treatment is valid. I would like to learn more about what can be done with this measure to actually drive improvement, as the metrics have only been stable for a few years now.
NA
Importance
ImportanceThere's strong evidence for this measure for the care of suspected stroke patients in the ED.
Feasibility Acceptance
Feasibility AcceptanceThe developer did not mention the copyright implication of the measure on physicians and/or hospital systems.
Scientific Acceptability
Scientific Acceptability ReliabilityThe measure is suitable and accurately specified.
Scientific Acceptability ValidityThe measure is valid.
Equity
EquityIt's not discussed, even though there are questions about why no data on equity was reported.
Use and Usability
Use and UsabilityI agree with the staff assessment.
Summary
NA
NA
Importance
Importanceagree
Feasibility Acceptance
Feasibility Acceptanceagree
Scientific Acceptability
Scientific Acceptability Reliabilityagree
Scientific Acceptability Validityagree
Equity
Equityalthough past research shows a gender difference in which women have a greater order-to-read time, this needs to be more fully researched and addressed
Use and Usability
Use and Usabilityagree
Summary
NA
This measure impacts quality…
Importance
ImportanceThe data and literature continue to show that time to treatment improves patient outcomes and quality of life. This is supported by the Lang article that discusses patient reported outcomes and quality of life
Feasibility Acceptance
Feasibility AcceptancePaper abstraction is time consuming.
Scientific Acceptability
Scientific Acceptability ReliabilitySupport PQM comments
Scientific Acceptability ValiditySupport PQM comments
Equity
EquityWhile they have some data on gender, there is nothing on race and ethnicity. Their data comes from one source, the Florida State Stroke registry. I would think there is more race and ethnicity data for strokes, and it would have been good to see.
Use and Usability
Use and UsabilityLike others have mentioned, I am concerned with the lack of change from 2015-2021. If this measure continues on, during the next maintenance review process, there should be scrutiny and discussion if there continues to be minimal change.
Summary
This measure impacts quality of life for patients and families, so it is important. It is worth highlighting that there have not been any unintended consequences since this measure was implemented.
This is important measure…
Importance
ImportanceThe measure showed the importance of IV tPA treatment within 2 hours of ischemic stroke for patient outcomes. Therefore, the concept of assessing quality of care by checking "percentage of acute ischemic stroke or hemorrhagic stroke patients who arrive at the emergency department (ED) within two hours of the onset of symptoms and have a head computed tomography (CT) or magnetic resonance imaging (MRI) scan interpreted within 45 minutes of ED arrival" could be important.
Feasibility Acceptance
Feasibility AcceptanceIt seems that the data are collectable from existing databases.
Scientific Acceptability
Scientific Acceptability ReliabilityAgree with staff assessment.
Scientific Acceptability Validityagree with staff assessment.
Equity
EquityEquity not addressed by developer.
Use and Usability
Use and UsabilityThe rates measured remained stable during the data collection years. The developers should include strategies to improve rates; otherwise there is no point in collecting the data.
Summary
This is an important measure, but in order for it to be useful it should address ways to improve performance.
CT MRI Results45 - CBE ID 0661
Importance
Importance- The developer presents the existing quality measure including measure characteristics and specifications. While this has a high level of detail, it does not outline the importance of this measure.
- There are two citations of literature, including one consensus guideline (Powers et al.) and a review of the guideline by Waqas et al. These citations expand the numerator performance definition to 20 minutes from ED entry to scan interpretation and consider a time from last known well of 4.5/6/16/24 hours, rather than the measure's values of 45 minutes until scan interpretation and 2 hours from last known well. These represent different measures, with no rationale for the current measure standard except that it is an existing standard.
- The Lang et al study supports the importance of this measure to patients.
- In exploring any performance gaps, there was no performance improvement shown in the last five years per the developer, which provides a continued opportunity for improvement. Beyond this opportunity to keep working on it, no improvement begs the question: why continue to invest resources in measurement for a measure that does not change? This does not mean stroke treatment is not important, and an opportunity exists to improve quality. It may mean the current measure is not an important or effective way to improve that quality gap.
Feasibility Acceptance
Feasibility AcceptanceThis is a maintenance measure that is currently being carried out.
While the AMA CPT use is identified as a cost/fee, the developer does not describe other potential requirements that may impact feasibility. This measure, however, requires chart abstraction to report as specified which may be a significant fee or cost.
Scientific Acceptability
Scientific Acceptability ReliabilityData were abstracted from two existing data sets for a 3.5 year non-continuous period between 2018-2021. Patient level descriptive data are presented including age, sex, race, ethnicity, and diagnosis. Because this is a facility level analysis, facility characteristics would be valuable to consider. Urban/exurban, ownership status, size, teaching, etc. would correlate to the unit of analysis for the measure. Availability of a specialty stroke service may also influence the quality of care and outcome that the measure seeks to address.
Beta-binomial for facility is used and briefly described. This is consistent with the norm around facility level testing and the interpretation follows that norm.
Scientific Acceptability ValidityEmpiric validity testing is presented in two ways.
One of the validity tests presented is a comparison of the data warehouse (CDW) to the audit group (CDAC). As presented, this seems to be more of a reliability test based on two observations of the data. A use of these two sources for validity testing would be to demonstrate the validity of the audit group methodology in comparison to the data warehouse.
The data on patient sex may be an opportunity for discriminant validity to be measured and reported, but this is not adequately described.
Otherwise, face validity may be easier to demonstrate with the expert guidelines used in the importance section.
Equity
EquityThis is an optional element so met was selected. The developer chose not to consider equity in the submission. This is an opportunity for the next maintenance submission.
Use and Usability
Use and UsabilityThe measure is currently in use as part of the Hospital OQR.
While the measure has not found any improvement since 2018, the developer suggests a way to change this is "provider education". Other potential opportunities for improvement could be system changes such as the physical workflow in the emergency department including the location of the CT and radiology department or changing IT workflows or standard order sets. Involvement of non-provider staff and public health level education programs are other opportunities.
As described above, however, this measure has shown no improvement since 2018. It is not clear, based on this, how the resources invested in the measure are positively impacting quality. While this is not all the developers' responsibility, there may be opportunities to refine the measure to better measure and impact quality in stroke care.
Summary
The care of stroke, particularly the detection of hemorrhagic versus embolic stroke, is important for patients, especially those who have a higher burden of vascular diseases in their communities.
The current measure has not demonstrated improvement at least since 2018. The measure has not clearly distinguished where improvement has occurred. There may be an opportunity to refine how we measure/evaluate stroke care in order to improve outcomes. While this measure should be kept, for the next maintenance period, the measurement should be refined to identify improvement.
measure seems useful but no improvement
Importance
Importancestrongly agree with preliminary assessment
Feasibility Acceptance
Feasibility Acceptancescoring many hospitals over many years also demonstrates feasibility
Scientific Acceptability
Scientific Acceptability ReliabilityReliability seems adequate in most instances. As noted in the preliminary assessment, further investigation of the cause(s) of lower reliability scores in a sizeable group of entities is warranted.
Scientific Acceptability Validitythe validity testing results were reassuring that validity is adequate and specific threats to validity weren't identified
Equity
Equitynot addressed
Use and Usability
Use and UsabilityNeed a rationale for the lack of improvement. Are there any examples of improved performance anywhere?
Summary
main concern is apparent lack of improvement followed by or maybe related to reliability for some facilities
Head CT or MRI Scan Results
Importance
ImportanceDescription of patient input does not support the conclusion that time-to-interpretation is meaningful for patients (evidence is about tPA administration, not interpretation of images).
Feasibility Acceptance
Feasibility AcceptanceData for the measure are generated during care. Measure uses data from EHRs or other electronic sources. Measure is being implemented.
Scientific Acceptability
Scientific Acceptability ReliabilitySpecifications are clear. Reliability results are fair overall, although concerning for quite a few facilities. Low case volume may be the reason for the lower reliability numbers although this isn’t clear from the submission. If so, there may not be an “easy fix” since sampling for the measure is proportional to size, with the smallest facilities reporting on all cases (although one option would be to extend the measure timeframe to >1 year).
Scientific Acceptability ValidityValidity testing was conducted for data elements and measure score, with adequate results for both.
Equity
EquityThe validity testing results indicate a difference in performance rates for men vs. women. Other subpopulation results were shown, although differences between subgroups were not tested. Additional analysis (especially by insurance status) would be helpful.
Use and Usability
Use and UsabilityMeasure is in use. There has been no substantial feedback on the measure or indications of unexpected findings. Improvement is positive although slight. Utility for low-volume providers in an accountability application likely is questionable, although the measure should be useful for quality improvement for many low-volume providers.
Summary
This measure meets most of the requirements for re-endorsement. The developer should solicit patient input about the meaningfulness of the timeliness of imaging interpretation for patients. Reliability appears to be less than adequate for some facilities (likely those with low case-volume); while this may impact the decision to use the measure in certain programs, I don’t think it should disqualify the measure for re-endorsement.
Important Measure
Importance
ImportanceClinical guidelines have been in place for quite some time. This measure supports early identification of ICH vs ischemic stroke to begin appropriate initial treatment.
Feasibility Acceptance
Feasibility AcceptanceAgree with staff assessment
Scientific Acceptability
Scientific Acceptability Reliabilityno additional comments
Scientific Acceptability Validityno additional comments
Equity
Equitynot addressed
Use and Usability
Use and UsabilityThe measure has been in place for several years in the Outpatient Quality Reporting Program with minimal or no improvement in recent years. I would like to see the measure steward address barriers to improvement.
Summary
I believe this measure is still valid and important to measure.
Curious case of stagnant performance
Importance
ImportanceClinical guidelines and systematic reviews clearly support the measure as constructed.
Feasibility Acceptance
Feasibility AcceptanceSuccessfully used in federal reporting.
Scientific Acceptability
Scientific Acceptability ReliabilityAgree with staff assessment that a mitigation strategy for facilities with a low denominator is appropriate; personally my target would be no lower than a .5 reliability score, so something like a 15th percentile and below cutoff.
Scientific Acceptability ValidityData element validity is clearly established, but I would have preferred a different approach to empirical validity of the measure score, ideally by correlating this performance measure with other scores on related measures.
Equity
EquityNot addressed.
Use and Usability
Use and UsabilityThe measure is in use, and performance scores have been made available and feedback has been solicited. However, performance scores are arguably worsening over time since implementation, and training providers is the only recommended intervention. No meaningful feedback was apparently raised. At a subsequent maintenance review, if performance and available interventions remain as-is, the measure should no longer be considered usable.
Summary
The key here is whether the developers can articulate a clear case for why the measure should continue to be used if measure scores have not improved, and interventions to improve scores are limited with no evidence presented of a successful implementation.
Comments on 0661
Importance
ImportanceClinical guidelines provide support for this measure concept; The measure developers identified a gap in performance (deciles show variation). Adherence to the process results in improved functional/QOL outcomes for patients
Feasibility Acceptance
Feasibility AcceptanceNeither of these were identified in the submission:
- Near-term paths are specified to support routine and electronic data capture with an implementable data collection strategy OR
- Required data are routinely generated and used during care, required data are available in EHRs or other electronic sources, and the data collection strategy can be implemented
Scientific Acceptability
Scientific Acceptability ReliabilityConcerns with the reliability testing -- approximately 30-35% of entities have reliability less than 0.6, likely facilities with a low denominator size.
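As a rough illustration of why low-denominator facilities tend to fall below a reliability threshold, the sketch below uses a common signal-to-noise formulation: reliability = between-facility variance / (between-facility variance + within-facility variance), with the binomial sampling variance p(1 - p)/n standing in for the within-facility term. This is not the developer's testing code. The function names, the 0.70 performance rate, and the 0.02 between-facility variance are hypothetical values chosen only for illustration.

```python
import math


def snr_reliability(n, p, between_var):
    """Signal-to-noise reliability for one facility.

    n           -- facility denominator (number of eligible cases)
    p           -- facility performance rate, 0 < p < 1
    between_var -- variance of true rates across facilities (assumed known)
    """
    within_var = p * (1 - p) / n  # binomial sampling variance of the rate
    return between_var / (between_var + within_var)


def min_denominator(target, p, between_var):
    """Smallest denominator whose reliability reaches `target` (0 < target < 1)."""
    return math.ceil(target / (1 - target) * p * (1 - p) / between_var)


# Hypothetical inputs, for illustration only.
P_RATE = 0.70
BETWEEN_VAR = 0.02

for cases in (10, 16, 50):
    print(cases, round(snr_reliability(cases, P_RATE, BETWEEN_VAR), 3))

print("min cases for 0.6:", min_denominator(0.6, P_RATE, BETWEEN_VAR))
```

Under these assumed inputs, reliability rises with the denominator, which is why a higher minimum case count is one way to lift the lowest-scoring entities above a threshold.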
Scientific Acceptability ValidityNo concerns.
Equity
EquityDid not answer.
Use and Usability
Use and UsabilityThere has been little improvement over the last 6 years on this measure. What are the plans to drive improvement?
Summary
Comments on 0661
Door to Imaging Times
Importance
ImportanceThis measure has been in use for a number of years. Logic indicates that faster door-to-imaging times yield better patient outcomes. It is curious that the measure performance has grown stagnant.
Feasibility Acceptance
Feasibility AcceptanceThis measure has been abstracted for many years. The real trick will be getting abstraction done soon enough to provide physicians with relatively concurrent data from which to learn and improve.
Scientific Acceptability
Scientific Acceptability ReliabilityAgree with staff assessment.
Scientific Acceptability ValidityAgree with staff assessment.
Equity
EquityDeveloper did not address this optional criterion.
Use and Usability
Use and UsabilityProcess improvement is definitely a challenge with this measure. I look forward to hearing more comments on this point. This measure has been in use in CMS HOQR since 2012 and is publicly reported, but it has not shown improvement since 2015, for reasons that are not articulated.
To improve quality, hospitals are expected to educate providers on guidelines for diagnosing and treating ischemic stroke, but the developers do not mention potential QI mechanisms such as providing performance reports to providers. Manually abstracted measures also have a lagged feedback loop to providers unless sites take steps to facilitate this improvement point.
Summary
This is an important measure and I look forward to this discussion. This measure has been in CMS HOQR for the last 11 years, and performance settled around 2015. I would like to know what the thoughts are on driving improvement in this area, as the developer does not specify.
A few questions for developer
Importance
ImportanceEvidence suggests measure would lead to improved outcomes. Credible link between structure, process, outcome. Evidence matches specifications of the measure. Empirical studies provided. Continued variation in measure scores. Patients surveyed find the measure valuable.
Feasibility Acceptance
Feasibility AcceptanceRoutinely generated from electronic sources. I assume AMA statements are about CPT codes used to capture data, which is the case for many measures and is not a major concern.
Scientific Acceptability
Scientific Acceptability ReliabilityTechnically, the threshold is not met, hence the rating. Agree with staff assessment that accountable entity-level reliability ratings fall short of thresholds and with the potential fix of increasing minimum thresholds to improve reliability. Would hesitate to suggest extending the timeframe due to the measure's context and the rolling, quarterly nature of result collection. This is because, when considered along with the data element validity testing results, I think some of the potential issues with reliability may be addressed. Would like to hear from others who know more about SNR testing drawbacks. Can the results be considered in concert? Or must that be entirely separate?
Scientific Acceptability ValidityWould like to know a little more about any issues with data element "Head CT/MRI Scan Interpretation Time" because this seems essential. But Gwet's AC1 (which appears to be a valid use of this test from outside research--would have liked more explanation of this choice) results suggest there isn't a strong concern. Would like to know if developer has a plan for addressing this difference in this data element, though. Hypothesis testing makes sense to me although would appreciate more explanation of the results.
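For readers less familiar with the statistic named above, the following is a minimal sketch of Gwet's AC1 for two abstractors rating the same records, using the standard two-rater definition: AC1 = (pa - pe) / (1 - pe), where pa is observed agreement and pe = (1 / (Q - 1)) * sum over categories of pi_q * (1 - pi_q), with pi_q the average of the two raters' marginal proportions. The function name and the sample ratings are hypothetical; this is not the developer's analysis.

```python
from collections import Counter


def gwet_ac1(rater_a, rater_b):
    """Gwet's AC1 agreement coefficient for two raters and nominal categories."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    counts = Counter(rater_a) + Counter(rater_b)  # pooled category counts
    q = len(counts)
    if q < 2:
        # Only one category was ever used; treated here as perfect agreement
        # (a convention assumed for this sketch).
        return 1.0
    # Observed proportion of records on which the raters agree.
    pa = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from the averaged marginal proportions.
    pe = sum((c / (2 * n)) * (1 - c / (2 * n)) for c in counts.values()) / (q - 1)
    return (pa - pe) / (1 - pe)


# Hypothetical example: 1 = element abstracted as present, 0 = absent.
original = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]
reabstracted = [1, 1, 1, 1, 1, 1, 1, 1, 0, 1]
print(round(gwet_ac1(original, reabstracted), 3))
```

In this made-up example the raters agree on 8 of 10 records and AC1 is about 0.76, whereas Cohen's kappa on the same data would be slightly negative because one category dominates. That sensitivity of kappa to skewed prevalence is a commonly cited reason for choosing AC1, though the submission's own rationale is not stated here.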
Equity
EquityNot required.
Use and Usability
Use and UsabilityAgree with staff assessment--would like to know more about why results have not improved over time.
Summary
Would like just a little more explanation for a few findings. See above.
Head Scan/CT Stroke
Importance
ImportanceThe developer cites recommendations that support the measure. In addition, the developer cited a survey to show meaningfulness to patients who would be impacted by this process measure.
Feasibility Acceptance
Feasibility AcceptanceThere are no concerns regarding feasibility as the measure is being submitted for maintenance.
Scientific Acceptability
Scientific Acceptability ReliabilityIn agreement with staff assessment.
Scientific Acceptability ValidityIn agreement with staff assessment.
Equity
EquityNot addressed by the developer as this was optional.
Use and Usability
Use and UsabilityIn agreement with staff assessment.
Summary
none
Important measure but could perhaps use upgrades
Importance
ImportancePrompt imaging and accurate treatment of stroke patients are essential. This is an important measure that has been in place for some years. The important question raised is why there has been no improvement in the years it has been in place.
Feasibility Acceptance
Feasibility AcceptanceThis appears to have a history of effective data collection. It is, however, a relatively complex measure that would be time-consuming to do manually.
Scientific Acceptability
Scientific Acceptability ReliabilityHigher minimum case volume should be considered.
Scientific Acceptability ValidityGender differences in performance need more exploration.
Equity
EquityNot addressed.
Use and Usability
Use and UsabilityThe measure does not seem to have had an effect on performance. Would like some discussion of the reasons for this and ways it could be addressed. Would also like more details on the reporting of samples.
Summary
This is an important measure but the lack of effect over the years should be addressed.
-
N/A