
Head CT or MRI Scan Results for Acute Ischemic Stroke or Hemorrhagic Stroke Patients who Received Head CT or MRI Scan Interpretation within 45 minutes of ED Arrival

CBE ID
0661
Endorsement Status
E&M Committee Rationale/Justification

Explore, with the developer’s technical experts and facilities, why the measure’s performance ratings have leveled out. Have the measure submitted for maintenance review in three years.

1.0 New or Maintenance
Previous Endorsement Cycle
Is Under Review
No
Next Maintenance Cycle
Fall 2026
1.6 Measure Description

This measure calculates the percentage of acute ischemic stroke or hemorrhagic stroke patients who arrive at the emergency department (ED) within two hours of the onset of symptoms and have a head computed tomography (CT) or magnetic resonance imaging (MRI) scan interpreted within 45 minutes of ED arrival. The measure is calculated using chart abstracted data, on a rolling, quarterly basis and is publicly reported, in aggregate, for one calendar year. The measure has been publicly reported, annually, by CMS as a component of its Hospital Outpatient Quality Reporting (OQR) Program since 2012.

    Measure Specs
      General Information
      1.7 Measure Type
      1.7 Composite Measure
      No
      1.3 Electronic Clinical Quality Measure (eCQM)
      1.8 Level of Analysis
      1.10 Measure Rationale

      Not applicable; this measure is not a paired or grouped measure.

      1.25 Data Source Details

      This measure is derived from medical record abstraction (paper or electronic). This is not an eMeasure. Administrative claims are listed as a data source as the measure is calculated based on four consecutive quarters of hospital outpatient claims data. 

      An electronic data collection tool is available from vendors, or facilities can download the free CMS Abstraction & Reporting Tool (CART). Paper tools for manual abstraction are also available for CART. These tools are posted on www.QualityNet.org.

      1.14 Numerator

      Emergency department (ED) acute ischemic stroke or hemorrhagic stroke patients arriving at the ED within two hours of the time last known well, with an order for a head CT or MRI scan, whose time from ED arrival to interpretation of the head CT or MRI scan is within 45 minutes.

      1.14a Numerator Details

      Time from ED arrival to interpretation of the head CT or MRI scan is within 45 minutes of arrival.

      First, determine whether the patient encounter meets the denominator criteria (i.e., the patient is age 18 or older, was last known well within two hours of ED arrival, and had a head CT or MRI ordered). If so, assess the chart for the time the head scan was interpreted.

       

      Next calculate the time difference between ED arrival and interpretation time of the head scan. The Head CT or MRI Scan Interpretation Date and Time is defined as the month, day, and year date and time (military time) represented in hours and minutes at which the earliest head CT or MRI scan interpretation was completed or reported.

      1.15 Denominator

      Emergency department acute ischemic stroke or hemorrhagic stroke patients arriving at the ED within two hours of the time last known well with an order for a head CT or MRI scan.

      1.15a Denominator Details

      First, the patient encounter must meet the stroke population criteria which includes an ED encounter identified by one of these six CPT® evaluation and management codes: 

       

        99281 Emergency department visit, new or established patient

        99282 Emergency department visit, new or established patient

        99283 Emergency department visit, new or established patient

        99284 Emergency department visit, new or established patient

        99285 Emergency department visit, new or established patient

        99291 Critical care, evaluation and management

      The encounter date must fall within the appropriate calendar year, and the patient must be 18 years or older with a principal diagnosis of ischemic or hemorrhagic stroke, as identified by the detailed lists in the Excel file titled “OP Table 8.0: Ischemic Hemorrhagic Stroke.”

      If the patient encounter meets the stroke population criteria, it is evaluated for inclusion in the denominator. For the denominator, first assess the Emergency Department Acute Ischemic Stroke or Hemorrhagic Stroke patient to determine the “Time Last Known Well”. The time last known well is defined as the time prior to hospital arrival at which the patient was last known to be without the signs and symptoms of the current stroke or at his or her baseline state of health. Next, calculate the difference between the time the patient arrived in the ED and the time last known well. If the difference is 120 minutes (two hours) or less, determine whether an order for a head CT or MRI scan exists.

      1.15b Denominator Exclusions

      Patients are excluded if they are less than 18 years of age, expired in the ED, left the ED against medical advice or discontinued care, or have a Discharge Code that was not documented or could not be determined.

      1.15c Denominator Exclusions Details

      Patients excluded are those who meet any of the following criteria: 

      • less than 18 years of age at the start of the encounter
      • expired (discharge code = 6)
      • left the emergency department against medical advice or discontinued care (discharge code = 7)
      • discharge code is not documented or was unable to be determined (discharge code=8)
      1.13a Attach Data Dictionary
      1.16 Type of Score
      1.17 Measure Score Interpretation
      Better quality = Higher score
      1.18 Calculation of Measure Score

      This measure calculates the percentage of acute ischemic stroke or hemorrhagic stroke patient encounters where the arrival time to the ED is within two hours of the last known well/onset of symptoms and have a head CT or MRI interpreted within 45 minutes of ED arrival. The measure is calculated based on four consecutive quarters of hospital outpatient encounter claims data, as follows: 

      1. Check E/M Code; if on Table 1.0 proceed 
      2. Calculate Patient Age (Outpatient Encounter Date minus Birthdate) 
      3. Check Patient Age; if >= 18, proceed 
      4. Check ICD-10-CM Principal Diagnosis Code; if on Table 8.0, proceed 
      5. Check Discharge Code; exclude any patients with code 6, 7, or 8 
      6. Check for a Head CT or MRI Scan Order; if “Yes,” proceed 
      7. Check Last Known Well documented; if “Yes,” proceed 
      8. Check Date Last Known Well; if a non-Unable to Determine (UTD) value, proceed 
      9. Check Time Last Known Well; if a non-UTD value, proceed 
      10. Check Arrival Time; if a non-UTD value, proceed 
      11. Calculate Last Known Well measurement value: Outpatient Encounter Date and Arrival Time minus Date Last Known Well and Time Last Known Well (in minutes) 
      12. Check Last Known Well Minutes measurement value; if >= 0 min and <= 120 min, record as the denominator and proceed 
      13. Check Head CT or MRI Scan Interpretation Date; if a non-UTD value, proceed 
      14. Check Head CT or MRI Scan Interpretation Time; if a non-UTD value, proceed 
      15. Calculate Head CT/CTA or MRI/MRA measurement value: Head CT or MRI Scan Interpretation Date and Head CT or MRI Scan Interpretation Time minus Outpatient Encounter Date and Arrival Time (in minutes) 
      16. Check Head CT/CTA or MRI/MRA scan Minutes measurement value; if >= 0 min and <= 45 min, record as the numerator 
      17. Aggregate denominator and numerator counts by Medicare provider number. Measure = numerator counts / denominator counts [The value should be recorded as a percentage]
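The calculation steps above can be sketched in code. The following is a minimal illustration for a single facility's encounters, not the CMS implementation: the `Encounter` record and its field names are hypothetical, `None` stands in for an Unable to Determine (UTD) value, and it assumes that a UTD Last Known Well or arrival value removes the case from the measure while a UTD interpretation value leaves the case in the denominator only.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Encounter:
    """Hypothetical abstracted record; field names are illustrative only."""
    em_code_on_table: bool                 # step 1: E/M code is on the code table
    age_years: int                         # steps 2-3: age at encounter
    dx_on_table_8: bool                    # step 4: principal diagnosis on Table 8.0
    discharge_code: int                    # step 5: 6, 7, or 8 are excluded
    scan_ordered: bool                     # step 6: head CT or MRI order exists
    lkw_documented: bool                   # step 7: Last Known Well documented
    last_known_well: Optional[datetime]    # None stands in for UTD
    arrival: Optional[datetime]            # None stands in for UTD
    interpretation: Optional[datetime]     # None stands in for UTD


def minutes_between(start: datetime, end: datetime) -> float:
    return (end - start).total_seconds() / 60


def classify(e: Encounter) -> str:
    """Return 'excluded', 'denominator' (denominator only), or 'numerator'."""
    if not e.em_code_on_table or e.age_years < 18 or not e.dx_on_table_8:
        return "excluded"
    if e.discharge_code in (6, 7, 8):
        return "excluded"
    if not e.scan_ordered or not e.lkw_documented:
        return "excluded"
    # Assumption: UTD Last Known Well or arrival values drop the case.
    if e.last_known_well is None or e.arrival is None:
        return "excluded"
    if not 0 <= minutes_between(e.last_known_well, e.arrival) <= 120:
        return "excluded"
    # Assumption: a UTD interpretation time counts as not meeting the numerator.
    if e.interpretation is None:
        return "denominator"
    scan_minutes = minutes_between(e.arrival, e.interpretation)
    return "numerator" if 0 <= scan_minutes <= 45 else "denominator"


def measure_score(encounters) -> Optional[float]:
    """Percentage of denominator cases that meet the numerator."""
    labels = [classify(e) for e in encounters]
    denominator = sum(label != "excluded" for label in labels)
    numerator = sum(label == "numerator" for label in labels)
    return 100.0 * numerator / denominator if denominator else None
```

Numerator cases are counted in the denominator as well, so the score is the share of eligible encounters interpreted within 45 minutes. Grouping encounters by provider number before calling `measure_score` would correspond to step 17.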

      1.19 Measure Stratification Details

      Not Applicable; this measure is not stratified.

      1.26 Minimum Sample Size

      Eleven is the minimum number of cases required for public reporting. 

       

      Most Recent Endorsement Activity
      Initial Recognition and Management Fall 2023
      Initial Endorsement
      Last Updated
      Steward Organization
      Centers for Medicare & Medicaid Services
      Steward POC email
      Steward Organization Copyright

      N/A

      Steward Address

      Perry Lazar
      7500 Security Boulevard
      Baltimore, MD 21244
      United States

      Measure Developer POC

      Erin Buchanan
      Mathematica
      1100 1st Street, NE, 12th Floor
      Washington, DC 20002
      United States

        Evidence
        2.1 Attach Logic Model
        2.2 Evidence of Measure Importance

        Powers et al. (2019) and the AHA/ASA Clinical Guidelines Writing Group published updated clinical guideline recommendations for the management of acute ischemic stroke, which support the measure’s intent. Several strategies in the guidelines have demonstrated improvement in door-to-imaging times (e.g., Emergency Medical Services activation, assessment, and management of patients). Other strategies, such as telemedicine and teleradiology, can improve access to care. Non-contrast CT and MRI remain effective in excluding intracerebral hemorrhage before intravenous alteplase administration, which aligns with OP-23. To identify patients who may benefit from mechanical thrombectomy between 6 and 24 hours after last known well time, the guidelines also recommend computed tomography angiography or magnetic resonance (MR) angiography with diffusion-weighted magnetic resonance imaging with or without MR perfusion. (Citation: Powers, W. J., Rabinstein, A. A., Ackerson, T., Adeoye, O. M., Bambakidis, N. C., Becker, K., Biller, J., Brown, M., Demaerschalk, B. M., Hoh, B., Jauch, E. C., Kidwell, C. S., Leslie-Mazwi, T. M., Ovbiagele, B., Scott, P. A., Sheth, K. N., Southerland, A. M., Summer, D., & Tirschwell, D. L. (2019). Guidelines for the early management of patients with acute ischemic stroke: 2019 update to the 2018 guidelines for the early management of acute ischemic stroke: A guideline for healthcare professionals from the American Heart Association/American Stroke Association. Stroke, 50(12), e344–e418. https://doi.org/10.1161/STR.0000000000000211)

         

        The updated guidelines include the following recommendations:

        Recommendation 1: All patients with suspected acute stroke should receive emergency brain imaging evaluation on first arrival to a hospital before initiating any specific therapy to treat AIS.

        Recommendation 2: Systems should be established so that brain imaging studies can be performed as quickly as possible in patients who may be candidates for IV fibrinolysis or mechanical thrombectomy or both.

        The benefit of IV alteplase is time dependent, with earlier treatment within the therapeutic window leading to bigger proportional benefits. A brain imaging study to exclude ICH is recommended as part of the initial evaluation of patients who are potentially eligible for these therapies. With respect to endovascular treatment, a pooled analysis of 5 randomized trials comparing EVT with medical therapy alone in which the majority of the patients were treated within 6 hours found that the odds of improved disability outcomes at 90 days (as measured by the mRS score distribution) declined with longer time from symptom onset to arterial puncture. The 6- to 16- and 6- to 24-hour treatment windows trials, which used advanced imaging to identify a relatively uniform patient group, showed limited variability of treatment effect with time in these highly selected patients. The absence of detailed screening logs in these trials limits estimations of the true impact of time in this population. To ensure that the highest proportion of eligible patients presenting in the 6- to 24-hour window have access to mechanical thrombectomy, evaluation and treatment should be as rapid as possible. Reducing the time interval from ED presentation to initial brain imaging can help to reduce the time to treatment initiation. Studies have shown that median or mean door-to-imaging times of ≤20 minutes can be achieved in a variety of different hospital settings.

        Recommendation 3: Noncontrast CT (NCCT) is effective to exclude ICH before IV alteplase administration. 

        Recommendation 4: Magnetic resonance (MR) imaging (MRI) is effective to exclude ICH before IV alteplase administration.

        Recommendation 5: (new recommendation) CTA with CTP or MR angiography (MRA) with diffusion-weighted magnetic resonance imaging (DW-MRI) with or without MR perfusion is recommended for certain patients. In many patients, the diagnosis of ischemic stroke can be made accurately on the basis of the clinical presentation and either a negative NCCT or one showing early ischemic changes, which can be detected in the majority of patients with careful attention. NCCT scanning of patients with acute stroke is effective for the rapid detection of acute ICH. NCCT was the only neuroimaging modality used in the National Institute of Neurological Disorders and Stroke (NINDS) rt-PA (Recombinant Tissue-Type Plasminogen Activator) trials and in ECASS (European Cooperative Acute Stroke Study) III and is therefore sufficient neuroimaging for decisions about IV alteplase in most patients. Immediate CT scanning provides high value for patients with acute stroke. MRI was as accurate as NCCT in detecting hyperacute intraparenchymal hemorrhage in patients presenting with stroke symptoms within 6 hours of onset when gradient echo sequences were used. In patients who awake with stroke or have unclear time of onset >4.5 hours from baseline or last known well, MRI to identify diffusion-positive fluid-attenuated inversion recovery (FLAIR)–negative lesions can be useful for selecting those who can benefit from IV alteplase administration within 4.5 hours of stroke symptom recognition. CTA with CTP or MRA with DW-MRI with or without MR perfusion is useful for selecting candidates for mechanical thrombectomy between 6 and 24 hours after last known well.

         

        Waqas et al. (2019) reviewed clinical practice guidelines and literature and recommended that emergency departments (1) develop specific protocols to triage patients based on whether a patient is admitted to the ED via emergency medical services (EMS) transport, ED walk-in, or in-hospital stroke; (2) initiate imaging orders including non-contrast brain computed tomography (CT) scans, CT angiograms, CT perfusion imaging, and/or magnetic resonance imaging; (3) interpret scans within 20 minutes of presentation (based on the Stroke Process Time Metrics recommended by the Society of Neurointerventional Surgery); and (4) coordinate care transitions with ED facilities or an appropriate stroke center. Overall, this article reinforces the intent of OP-23 to provide timely stroke diagnosis and recommends strategies hospitals can take, such as developing context-specific protocols and coordinating care within the ED, to reduce the time from door to imaging results interpretation. (Citation: Waqas, M., Vakharia, K., Munich, S., Morrison, J., Mokin, M., Levy, E., & Siddiqui, A. (2019). Emergency Room Triage of Acute Ischemic Stroke. Neurosurgery, 85(suppl_1), S38-S46. https://doi.org/10.1093/neuros/nyz067)

        2.6 Meaningfulness to Target Population

        Lang et al. performed a cohort study to examine the benefit of tPA on patient-reported outcomes and health care utilization 6 months after stroke by comparing patients who received tPA as part of usual stroke management with patients who would have received tPA had they arrived at the hospital within the therapeutic time window. Data were collected from surveys 6 months after stroke using standardized patient-reported outcome measures and questions about health care utilization. Demographic and medical data were acquired from hospital records. The tPA (n = 78) and control (n = 156) groups were matched across variables, except for stroke severity, which was better in the control group; subsequent analyses controlled for this mismatch. The tPA group reported better physical function, communication, cognitive ability, depressive symptomatology, and quality of life/participation compared with the control group, and fewer people in the tPA group reported skilled nursing facility stays, emergency department visits, and rehospitalizations after their stroke. Lang et al. found that the use of tPA provides a large benefit to the daily lives of people with ischemic stroke. (Reference: Lang C, Bland M, Cheng N, Corbetta M, et al. A case-control study of the effectiveness of tissue plasminogen activator on 6 month patient-reported outcomes and health care utilization. Journal of Stroke and Cerebrovascular Diseases: The Official Journal of National Stroke Association. 2014; 23(10):2914–2919).

        Table 1. Performance Scores by Decile
        Performance Gap

        Group        Mean Performance Score    N of Entities    N of Persons / Encounters / Episodes
        Overall      74.10                     1431             28174
        Minimum      5.56                      1                18
        Decile_1     36.70                     161              2866
        Decile_2     56.54                     130              2350
        Decile_3     64.80                     139              2663
        Decile_4     71.41                     144              2798
        Decile_5     76.00                     149              3165
        Decile_6     79.66                     142              2985
        Decile_7     83.10                     151              2972
        Decile_8     87.06                     137              3227
        Decile_9     91.89                     141              2490
        Decile_10    97.96                     137              2658
        Maximum      100                       78               1237
          Equity
          3.1 Contributions Toward Advancing Health Equity

          Optional question

            Feasibility
            4.1 Feasibility Assessment

            Not applicable during the Fall 2023 cycle.

            4.3 Feasibility Informed Final Measure

            Not applicable. This measure is being submitted for maintenance.

            4.4 Proprietary Information
            Proprietary measure or components (e.g., risk model, codes), without fees
            4.4a Fees, Licensing, or Other Requirements

            No fees, licensure, or other requirements are necessary to use this measure; however, CPT codes, descriptions, and other data are copyright 2022 American Medical Association. All rights reserved. CPT is a registered trademark of the American Medical Association. Applicable FARS\DFARS Restrictions Apply to Government Use. Fee schedules, relative value units, conversion factors, and/or related components are not assigned by the AMA, are not part of CPT, and the AMA is not recommending their use. The AMA does not directly or indirectly practice medicine or dispense medical services. The AMA assumes no liability for data contained or not contained herein.

             

              Testing Data
              5.1.1 Data Used for Testing

              This measure was tested using patient record data abstracted from paper records, claims, and electronic health records stored in the Clinical Data Warehouse (CDW) and the Clinical Data Abstraction Center (CDAC). Data were obtained for 01-01-2018 through 12-31-2021, excluding arrival dates from January 1 through June 30, 2020. There are no differences in data for different aspects of testing.

              5.1.2 Differences in Data

              Not applicable. There are no differences in data for different aspects of testing. 

              5.1.3 Characteristics of Measured Entities

               Data for both Clinical Data Abstraction Center (CDAC) and Clinical Data Warehouse (CDW) was obtained for 01-01-2018 through 12-31-2021 exclusive of January 1 through June 30, 2020 arrival date times due to COVID-19 considerations. The CDAC data contained 2,654 patients in 968 facilities and CDW contained 213,527 patients in 3,881 facilities. The data presented below in table 2 represents additional characteristics of the data used for testing.

               

              Table 2. Characteristics of Facilities Meeting Minimum Case Count

              Characteristics         CDAC                        CDW
              Date Collected          2018-01-01 to 2021-12-31    2018-01-01 to 2021-12-31
              Sampled Population      2,654                       213,527
              Number of Facilities    968                         3,881
              Denominator Cases       1,650                       139,865
              Numerator Cases         1,195                       104,023
              Level of Analysis       Facility Level              Facility Level
               

               

              5.1.4 Characteristics of Units of the Eligible Population

              The data presented below in Table 3 represent characteristics of patients included in the testing analysis. There are no differences in data for different aspects of testing. The majority of patients were white, non-Hispanic, aged 60-79, and had an ischemic stroke. There was a fairly even split between male and female patients.

               

              Table 3. Patient Characteristics among Facilities Meeting Minimum Case Count

               

              Groups                         Number of patients (CDW)    Performance Rates (CDW)    Number of patients (CDAC)    Performance Rates (CDAC)
              Sex
                Female                       105286                      73.41%                     1313                         70.98%
                Male                         108194                      75.33%                     1339                         73.84%
                Unknown Sex                  47                          66.67%                     2                            100%
              Age
                18-39                        8885                        63.48%                     133                          57.89%
                40-59                        52041                       71.82%                     640                          68.95%
                60-79                        103634                      75.19%                     1262                         73.51%
                80 and Older                 48967                       77.66%                     619                          77.69%
              Race
                Asian                        4704                        73.22%                     45                           77.27%
                Black or African American    26352                       72.58%                     359                          71.62%
                Unknown or Other             14339                       71.15%                     171                          75.45%
                White                        168132                      74.95%                     2079                         72.22%
              Ethnicity
                Hispanic/Latino              15977                       68.56%                     203                          68.50%
                Not Hispanic/Latino          197550                      74.81%                     2451                         72.75%
              Diagnosis
                Hemorrhagic stroke           47385                       66.38%                     593                          61.20%
                Ischemic stroke              166142                      76.93%                     2061                         76.19%
               

              5.2.2 Method(s) of Reliability Testing

              Reliability was calculated in accordance with the signal-to-noise method discussed in The Reliability of Provider Profiling: A Tutorial (2009). This approach calculates the ability of the measure to distinguish between facility performance. We calculated the signal-to-noise ratio for each facility meeting the minimum case count of 11, established by the measure calculation contractor during the data collection period, with higher scores indicating greater reliability. The reliability score is estimated using a beta-binomial model, which is appropriate for the reliability testing of pass/fail measures. The reliability score for each facility is a function of the facility’s sample size and score on the measure, and the variance across facilities.

              Adams JL. The reliability of provider profiling: a tutorial. Santa Monica, CA: RAND Corporation. 2009. Retrieved from http://www.rand.org/pubs/technical_reports/TR653.
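The signal-to-noise calculation described above can be illustrated with a short sketch. It follows the general form in the Adams tutorial, where reliability is the between-facility variance divided by the sum of between-facility and within-facility variance. The function names are hypothetical, and the beta-binomial parameters `alpha` and `beta` are assumed to have been estimated separately from all facilities' data; the source does not report their values.

```python
def between_facility_variance(alpha: float, beta: float) -> float:
    """Variance of a Beta(alpha, beta) distribution of true facility rates."""
    total = alpha + beta
    return (alpha * beta) / ((total + 1) * total ** 2)


def facility_reliability(numerator: int, denominator: int,
                         alpha: float, beta: float) -> float:
    """Signal-to-noise reliability for one facility.

    numerator / denominator is the facility's observed pass rate;
    alpha and beta are assumed inputs, not values taken from the source.
    """
    p = numerator / denominator
    within = p * (1 - p) / denominator          # binomial sampling noise
    between = between_facility_variance(alpha, beta)
    return between / (between + within)


# Illustrative values only: alpha=3 and beta=1 are made up for the example.
example = facility_reliability(numerator=15, denominator=20, alpha=3.0, beta=1.0)
```

Under this formula, reliability rises with the facility's case count, and a facility scoring exactly 0% or 100% has zero within-facility variance and therefore a reliability of 1.00.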
               

              5.2.3 Reliability Testing Results

              Table 4 displays the distribution of signal to noise scores from 2021. Higher scores denote greater reliability. Reliability scores ranged from 0.43 to 1.00 and mean reliability score was 0.68. 

               

              Table 4. Results of Reliability Testing Based on Signal-to-Noise Analysis

              Year: 2021
              Number of Facilities: 1431
              Mean: 0.68
              Standard Deviation: 0.15
              Min: 0.43
              5th Percentile: 0.45
              10th Percentile: 0.48
              25th Percentile: 0.56
              50th Percentile: 0.67
              75th Percentile: 0.77
              90th Percentile: 0.87
              95th Percentile: 1.00
              Max: 1.00

               

              5.2.4 Interpretation of Reliability Results

              While there is no universal standard cut off for signal to noise, a reliability of 0.70 is considered the acceptable threshold for reliability. Our results for 2021 of a median reliability score of 0.67 and mean reliability score of 0.68 approach the 0.7 cut off indicating moderate reliability. Our results also align with the Draft Acceptable Reliability Thresholds suggested by the National Quality Forum (NQF) Scientific Methods Panel (SMP) in 2021 which propose the threshold of 0.6 ≥ 0.9 for adequate reliability. Our results indicate that the measure is able to identify true differences in performance between individual facilities.

               

              Table 2. Accountable Entity Level Reliability Testing Results by Denominator, Target Population Size
              Accountable Entity-Level Reliability Testing Results

              Group        Reliability    N of Entities    N of Persons / Encounters / Episodes
              Overall      0.68           1431             28174
              Minimum      0.43           15               165
              Decile_1     0.46           144              1682
              Decile_2     0.52           153              2180
              Decile_3     0.57           134              2006
              Decile_4     0.61           143              2404
              Decile_5     0.65           146              2688
              Decile_6     0.69           143              2885
              Decile_7     0.72           138              3177
              Decile_8     0.77           144              3576
              Decile_9     0.83           143              4272
              Decile_10    0.96           143              3304
              Maximum      1.00           78               1237
              5.3.3 Method(s) of Validity Testing

              Data Element Validity.

              We assessed the data element validity of the measure by calculating a rate of agreement between facility abstraction (sourced from the CDW) and auditor (CDAC) abstraction for each of the data elements used to calculate the measure. The analysis used data element values for 1548 denominator cases abstracted by CDAC, which were previously abstracted by facilities. We then used Gwet’s AC-1 statistic to account for chance agreement. A Gwet’s AC-1 statistic less than 0.5 indicates a fair agreement, 0.5-0.8 indicates a medium effect size, and greater than or equal to 0.8 indicates a large effect size. 
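As an illustration of the agreement statistics named above, the sketch below computes percent agreement and Gwet's AC1 for two abstractors rating the same cases. It uses the standard two-rater AC1 formula, in which chance agreement is based on the average share of ratings falling in each category. The function names and the sample ratings are invented for the example and are not drawn from the CDW or CDAC data.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b) -> float:
    """Share of cases on which the two abstractors recorded the same value."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def gwet_ac1(rater_a, rater_b) -> float:
    """Gwet's AC1 for two raters and any number of nominal categories."""
    observed = percent_agreement(rater_a, rater_b)
    categories = set(rater_a) | set(rater_b)
    if len(categories) < 2:
        raise ValueError("AC1 needs at least two observed categories")
    n = len(rater_a)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # pi is the average proportion of ratings that fall in each category.
    chance = sum(
        pi * (1 - pi)
        for pi in ((counts_a[c] + counts_b[c]) / (2 * n) for c in categories)
    ) / (len(categories) - 1)
    return (observed - chance) / (1 - chance)


# Made-up ratings for four cases (e.g., "was a scan ordered?").
facility = ["Y", "Y", "N", "N"]
auditor = ["Y", "N", "N", "N"]
agreement = percent_agreement(facility, auditor)
ac1 = gwet_ac1(facility, auditor)
```

Unlike raw percent agreement, AC1 subtracts the agreement expected by chance, which is why the two columns in a data element validity table can differ slightly for the same element.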

               

               

              Hypothesis-driven validity. 

              We assessed the validity of the measure through literature informed hypothesis testing. Based on our reviews of literature,1,2 we anticipated that female patients would have a longer arrival to CT interpretation time than male patients. In addition to the t-statistic to detect statistical differences, we calculated Cohen’s D to show whether a difference is meaningful in practice or not.

              1. Sex and Race‐Ethnic Disparities in Door‐to‐CT Time in Acute Ischemic Stroke: The Florida Stroke Registry. Sai P. Polineni MPH, Enmanuel J. Perez MD, PhD, Kefeng Wang MS, Carolina M. Gutierrez PhD, Jeffrey Walker MBA‐HCM, Dianne Foster RN, BSN, MBA, Chuanhui Dong PhD, Negar Asdaghi MD, Jose G. Romano MD, Ralph L. Sacco MD, MS, Tatjana Rundek MD, PhD, and for the Florida Stroke Registry
              2. Predictors of Time From Hospital Arrival to Initial Brain-Imaging Among Suspected Stroke Patients. Kathryn M. Rose, PhD, Wayne D. Rosamond, PhD, Sara L. Huston, PhD, Carol V. Murphy, RN, MPH, and Charles H. Tegeler, MD
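The comparison described in the hypothesis-driven validity paragraph can be sketched as follows. The source does not state which form of the t-statistic or Cohen's d was used, so this example assumes the pooled-standard-deviation versions of both. The function names and the arrival-to-interpretation minutes are invented for illustration.

```python
from math import sqrt
from statistics import mean, variance


def pooled_sd(x, y) -> float:
    """Pooled standard deviation of two independent samples."""
    nx, ny = len(x), len(y)
    return sqrt(((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2))


def cohens_d(x, y) -> float:
    """Standardized mean difference (assumed pooled-SD form)."""
    return (mean(x) - mean(y)) / pooled_sd(x, y)


def t_statistic(x, y) -> float:
    """Two-sample t-statistic (assumed equal-variance form)."""
    nx, ny = len(x), len(y)
    return (mean(x) - mean(y)) / (pooled_sd(x, y) * sqrt(1 / nx + 1 / ny))


# Made-up arrival-to-interpretation times in minutes, not study data.
female_minutes = [32, 41, 38, 45, 50]
male_minutes = [30, 36, 35, 42, 44]
effect_size = cohens_d(female_minutes, male_minutes)
t_value = t_statistic(female_minutes, male_minutes)
```

The t-statistic grows with sample size for a fixed difference in means, while Cohen's d does not, which is why the two are reported together: a large sample can make a small difference statistically detectable without making it meaningful in practice.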

                 
              5.3.4 Validity Testing Results

              As demonstrated in table 5, percent agreement ranged from 85% - 100%. Head CT/MRI Scan Interpretation Time had a percent agreement and Gwet’s AC1 score at 85% and 0.83 respectively. Head CT/MRI Scan Order, Last Known Well, Principal ICD code, E/M Code, Date Last Known Well (LKW), and Head CT/MRI Scan Interpretation Date had complete agreement (100%) and Gwet’s AC1 scores of 1. 

               

              Table 5. Data Element Validity for Categorical Variables, Non-categorical Variables, and Constructed Outcomes

              Variable                                n      Percent Agreement   Gwet’s AC1
              Discharge Code                          1548   98%                 0.98
              Head CT/MRI Scan Order                  1548   100%                1.00
              Last Known Well                         1548   100%                1.00
              Principal ICD code                      1548   100%                1.00
              E/M Code                                1548   100%                1.00
              Arrival time                            1548   99%                 0.99
              Date Last Known Well (LKW)              1548   100%                1.00
              Time LKW                                1548   93%                 0.93
              Head CT/MRI Scan Interpretation Date    1548   100%                1.00
              Head CT/MRI Scan Interpretation Time    1548   85%                 0.83
              Numerator                               1548   97%                 0.97
              Denominator                             1548   100%                1.00
               

              Hypothesis-driven validity

              Table 6 shows that in 2021, the mean difference between females and males was 2.83 with a t-score of 2.47, p-value of 0.01 and Cohen’s d of 0.06.

               

              Table 6. Empirical Validity Analysis of Differences between Males and Females

              Year   Category      Value             Mean Difference   CI Lower Limit   CI Upper Limit   t      p      Cohen’s d
              2021   Patient Sex   Female vs. Male   2.83              0.58             5.07             2.47   0.01   0.06
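The reported confidence interval can be approximately recovered from the other values in Table 6. The sketch below assumes a 95% interval under a large-sample normal approximation (an assumption; the source does not state how the interval was computed) and backs out the standard error as the mean difference divided by t. The function name is hypothetical.

```python
from statistics import NormalDist


def ci_from_t(mean_diff, t_stat, level=0.95):
    """Approximate confidence interval from a mean difference and its t-statistic."""
    se = mean_diff / t_stat                      # implied standard error
    z = NormalDist().inv_cdf(0.5 + level / 2)    # about 1.96 for a 95% interval
    return mean_diff - z * se, mean_diff + z * se


# Values reported in Table 6 for 2021.
lower, upper = ci_from_t(2.83, 2.47)
print(round(lower, 2), round(upper, 2))
```

The result lands within about 0.01 of the reported limits of 0.58 and 5.07; the small gap is consistent with rounding in the published mean difference and t-statistic.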

              5.3.5 Interpretation of Validity Results

              Data Element Validity.

              Results demonstrated that the agreement between the data source and the gold standard is high, and the measure score correctly reflects the quality of care provided by identifying differences in quality. We used Gwet’s AC1 statistic, a more robust measure of concordance than overall agreement, to account for agreement by chance.

               

              Hypothesis-driven validity. 

              For 2021, there was a difference between females and males, and that difference was statistically significant; based on the Cohen’s d of 0.06, the effect size of that difference is moderate. The groups differ by 0.06 standard deviations. From these results, we conclude that the differences by sex in time between ED arrival and Head CT/MRI scan are statistically significant. This conclusion aligns with the literature, which indicates that stroke signs are not always identified as quickly in women as they are in men.

              5.4.1 Methods Used to Address Risk Factors
              5.4.1b Rationale For No Adjustment or Stratification

              Not applicable. This measure is not an outcome or resource use measure and is not risk adjusted or stratified.

                Use
                6.1.4 Program Details
                Name of the program and sponsor
                The CMS Hospital Outpatient Quality Reporting Program
                Purpose of the program
                The Hospital OQR Program is a pay-for-quality-data-reporting program implemented by CMS for outpatient hospital services. It provides hospitals with a financial incentive to report their quality of care measure data.
                Geographic area and percentage of accountable entities and patients included
                National
                Applicable level of analysis and care setting

                The publicly reported values (on Hospital Compare) are calculated for all facilities participating in the Hospital OQR Program in the United States that meet minimum case count requirements. Facilities eligible to report this measure are subject to the Outpatient Prospective Payment System (OPPS) guidelines.

                6.2.1 Actions of Measured Entities to Improve Performance

                To improve performance on this measure, measured entities must educate their providers on following guidelines for diagnosing and treating an acute ischemic stroke. These actions do not cause undue burden to the measured entities.

                6.2.2 Feedback on Measure Performance

                Feedback received from stakeholders (via the ServiceNow tool) is used to revise the measure specifications. Following receipt of a suggestion to adjust the specifications, a literature review is performed to determine if the proposed change aligns with the empirical evidence base for the measure; feedback from the expert work group is obtained to evaluate the change to the specifications. To date, we have received no significant concerns raised by stakeholders about the measure specifications through ServiceNow. In addition, stakeholders may submit comments on the measure through the Outpatient Prospective Payment System (OPPS) annual rule-making process. No comments were received for this measure during the most recent OPPS rule-making cycle.

                6.2.3 Consideration of Measure Feedback

                To date, we have received no significant feedback about the measure specifications.

                6.2.4 Progress on Improvement

                Summary statistics of performance scores during the January 1, 2018 through December 31, 2021 data collection periods are provided in the Gap section. Average hospital scores, average changes in hospital scores from the prior year, and the number of hospitals for 2015 through 2021 were as follows:

                Year   Average Hospital Score   Average Change in Hospital Scores   Hospitals
                2015   71.28%                   n/a                                 1276
                2016   73.27%                   1.43%                               1401
                2017   74.33%                   1.64%                               1507
                2018   73.21%                   0.26%                               1607
                2019   73.73%                   0.28%                               1592
                2020   75.89%                   0.54%                               502
                2021   71.53%                   1.42%                               1492

                Performance scores have remained stable over the years, showing continued room for improvement. As noted in prior submissions, the number of patients receiving high-quality healthcare as performance on the measure improves is larger than the number of cases captured by the measure because a hospital can choose to report only a sample of cases.

                6.2.5 Unexpected Findings

                We did not identify any unintended consequences during measure testing. Similarly, no evidence of unintended consequences to individuals or populations has been reported by external stakeholders since the measure’s implementation. We will continue to monitor the potential for unintended consequences through an annual review of the literature as well as an ongoing review of stakeholder comments and inquiries. The risk in advancing measures that address timeliness is that testing may decrease in order to avoid measurement; however, this is unlikely given the need to assess diagnostic results to ensure a proper diagnosis.

                  Public Comments
                  First Name
                  Matthew
                  Last Name
                  Pickering

                  Submitted by MPickering01 on Mon, 01/08/2024 - 10:52

                  Permalink

                  Importance

                  Importance Rating
                  Importance

                  Strengths:

                  • Updated clinical guidelines cited support the measure concept, including recommendations that suspected stroke patients receive brain imaging studies as soon as possible after arriving at the ED, that non-contrast CT and MRI are both effective at ruling out ICH before treatment, and that certain patients benefit from MRA/CTA (Powers et al. 2019), and a systematic review of guidelines makes similar recommendations (Waqas et al. 2019). The success of treatment with IV alteplase is time dependent, and developers cite a pooled analysis of 5 RCTs showing that treatment with EVT within 6 hours of stroke onset found odds of improved disability outcomes at 90 days (reference not provided).
                  • Mean hospital scores are stable in the 71-75% range from 2015 to 2020, showing room for improvement. Performance scores ranged from a minimum of 5.56% to maximum of 100%. Developer notes that performance scores are not limited to facilities present each year, indicating the limited number of facilities in 2020 skewed performance scores higher.
                  • Meaningfulness to patients was demonstrated by citing one study using patient survey and matched controls, which showed that treatment with tPA within the therapeutic time window was associated with better physical function, communication, cognitive ability, depressive symptomatology, and quality of life/participation compared with control, and fewer SNF stays, ED visits, and readmissions (Lang et al., 2014).

                   

                  Limitations:

                  • Limitation of the pooled analysis is that 6-16 and 6-24 hour window trials showed limited variability of treatment effect with time, which the developers interpreted as evidence for the importance of rapid imaging (reference not provided).
                  • Sample in Lang et al. was relatively small (tPA (n = 78); control (n = 156))

                   

                  Rationale:

                  The 2019 clinical guidelines and a systematic review of 5 randomized-controlled trials presented support the measure concept, emphasizing the importance of rapid brain imaging in suspected stroke patients visiting the ED, the availability of effective imaging tests, and the benefits of early treatment. Meaningfulness to patients was demonstrated in a small sample study using patient surveys, which showed improved function and quality of life, and reduced utilization among patients who received early tPA treatment. Performance scores show room for improvement and substantial variability.

                  Feasibility Acceptance

                  Feasibility Rating
                  Feasibility Acceptance

                  Strengths:

                  • Developer does not state this explicitly but data elements are listed and all seem plausible as normally collected during the normal course of care. Developer states that “No fees, licensure, or other requirements are necessary to use this measure; however, CPT codes, descriptions, and other data are copyright 2022 American Medical Association [AMA].” 

                  Limitations:

                  • Developers do not mention whether  AMA copyright is a potential burden or barrier for providers. Under the SA data description, developer notes “Proprietary measure or components (e.g., risk model, codes)."

                  Rationale:

                  The measure appears to be feasible insofar as data elements are collected in the normal course of care and no fees or licensure is required; however, it is not clear how the AMA copyright or proprietary components could impact providers.

                  Scientific Acceptability

                  Scientific Acceptability Reliability Rating
                  Scientific Acceptability Reliability

                  Strengths:

                  • The measure is well defined and specified.
                  • Accountable entity-level (i.e., measure score) reliability testing was estimated using signal-to-noise analysis on a 2021 dataset consisting of 28,174 persons across 1,431 facilities meeting the minimum count of 11 cases. A decile table of reliability by population size is provided. The median reliability is 0.68. The mean of the 3rd decile is 0.57, and the mean of the 4th decile is 0.61, which indicates that 65-70% of the entities have a reliability >0.6.

                  Limitations:

                  • Approximately 30-35% of entities have reliability less than 0.6, likely facilities with a low denominator size. Consider mitigation for entities with low case counts.

                  Rationale:

                  Measure score reliability testing (accountable entity-level) was performed. However, reliability is <0.6 for 30-35% of entities. Some possible mitigation strategies to improve these estimates could be to:

                  • Apply the empirical approaches outlined in the report, MAP 2019 Recommendations from the Rural Health Technical Expert Panel Final Report, https://www.qualityforum.org/WorkArea/linkit.aspx?LinkIdentifier=id&ItemID=89673
                  • Consider a higher minimum case volume.
                  • Extend the time frame.
                  • Focus on applying mitigation at the lower volume providers.
                  Scientific Acceptability Validity Rating
                  Scientific Acceptability Validity

                  Strengths:

                  • Data element validity testing was previously performed for claims, EHR, and paper records.
                  • Data element validity was assessed by calculating rate of agreement (% agreement and Gwet AC-1) in all data elements used to calculate the measure between facility (CDW) abstracted and auditor abstracted (CDAC, the gold standard) data, using a sample of 1548 denominator cases from CDAC, 2021 data. Minimum agreement was 85% / .83 (head CT/MRI interpretation time); all other elements had 93% / .93 – 100% / 1.0 agreement (large effect).
                  • Developer conducted accountable entity-level (measure score) validity testing, where the developer hypothesized that female patients would have a longer arrival to CT interpretation time compared to male patients. Mean difference showed longer time for female patients, Cohen’s D of 0.06 (moderate effect size), which the developer claims aligns with literature showing stroke signs are not identified as quickly in women.
                  • No risk adjustment since the measure is a process measure.

                  Limitations:

                  • Hypothesis testing confirms the expected difference in performance between men and women (worse for women), which aligns with expectation based on slower identification of strokes in women, but does not address the rationale for differences at the entity level. That is, it remains unclear whether the difference reflects a clinical practice concern or underlying patient characteristics.

                  Rationale:

                  Data element validity testing demonstrates high agreement between facility data (CDW) and auditor data (CDAC, the gold standard). Hypothesis testing confirms the expected difference in performance between men and women (worse for women), which aligns with expectation based on slower identification of strokes in women, but does not address why there are differences at the entity level. The committee may consider asking the developer to speak to this further.

                  Equity

                  Equity Rating
                  Equity

                  Strengths:

                  • None

                  Limitations:

                  • Developer did not address this optional criterion.

                  Rationale:

                  Developer did not address this optional criterion.

                  Use and Usability

                  Use and Usability Rating
                  Use and Usability

                  Strengths:

                  • Currently in use in the CMS Hospital Outpatient Quality Reporting Program.
                  • Measured entities are expected to educate providers on guidelines for diagnosing and treating ischemic stroke.
                  • Feedback is collected through ServiceNow and also through the annual rulemaking process; if warranted, a literature review is performed to evaluate whether the proposed specification change aligns with the evidence base; developers report that no significant concerns were received from stakeholders (to date) or through public comment (over the most recent OPPS rulemaking cycle)
                  • Performance scores continue to show room for improvement.
                  • Developer reports no unexpected findings or unintended consequences.

                  Limitations:

                  • No mention of feedback reports or similar mechanism for informing providers of their performance.
                  • While performance scores show room for improvement, they have been stable from 2015 to 2021 (range: 71.28% - 75.89%). In addition, the number of hospitals reporting each year changes considerably (range: 502 - 1607), making it difficult to interpret changes in the rate (e.g., the highest rate was reported in the year with the fewest reporting hospitals). Finally, developers note that the number of patients receiving high quality care is larger than the number of cases captured since a hospital can choose to report only a sample, but developers do not indicate how many hospitals are reporting samples or their sample sizes.

                  Rationale:

                  This measure is currently in use in the CMS Hospital Outpatient Quality Reporting Program. To improve quality, hospitals are expected to educate providers on guidelines for diagnosing and treating ischemic stroke; they do not mention potential QI mechanisms such as providing performance reports to providers.

                  This measure continues to show room for improvement; however, the rate has remained largely stable from 2015-2021 and possible reasons for the lack of improvement are not articulated. Developers report no unexpected findings.

                  Summary

                  N/A

                  First Name
                  Kory
                  Last Name
                  Anderson

                  Submitted by Kory on Fri, 01/12/2024 - 18:25

                  Permalink

                  Importance

                  Importance Rating
                  Importance

                  Multiple studies have clearly shown the importance of timely imaging and intervention in acute ischemic stroke to improve short and long term outcomes.  This measure supports those clinical guidelines.

                  Feasibility Acceptance

                  Feasibility Rating
                  Feasibility Acceptance

                  It is likely as feasible as other measures that are abstracted.  That said, it does appear to have a lot of elements that need to be looked at, which if a provider/facility does not have software to do (which could be expensive) they will need manual extraction, which can also be expensive and time consuming.

                  Scientific Acceptability

                  Scientific Acceptability Reliability Rating
                  Scientific Acceptability Reliability

                  No other comments aside from what PQM noted.

                  Scientific Acceptability Validity Rating
                  Scientific Acceptability Validity

                  No other comments

                  Equity

                  Equity Rating
                  Equity

                  Wasn't really commented on or addressed.

                  Use and Usability

                  Use and Usability Rating
                  Use and Usability

                  I would also agree with PQM's assessment on this section.  The lack of improvement with the measure being in place is puzzling.

                  Summary

                  Ultimately, I think the measure meets the criteria, but there are questions around equity. Also, there doesn't seem to be improvement noted even though the measure has been in place for some time, and there is a lack of clarity on how to improve on the measure.

                  First Name
                  Ray
                  Last Name
                  Dantes

                  Submitted by Dr. Ray Dantes on Mon, 01/15/2024 - 10:57

                  Permalink

                  Importance

                  Importance Rating
                  Importance

                  Agree with staff assessment.

                  Feasibility Acceptance

                  Feasibility Rating
                  Feasibility Acceptance

                  Agree with staff assessment.

                  Scientific Acceptability

                  Scientific Acceptability Reliability Rating
                  Scientific Acceptability Reliability

                  Agree with staff assessment.

                  Scientific Acceptability Validity Rating
                  Scientific Acceptability Validity

                  Agree with staff assessment.

                  Equity

                  Equity Rating
                  Equity

                  Agree with staff assessment.

                  Use and Usability

                  Use and Usability Rating
                  Use and Usability

                  There has not been much movement on this measure over many years. Could the developers address why this is the case? Are there nuances to the definition of the measure like key exclusion groups that were not accounted for? For example, I don't see any exclusions for patient transfers. If an ED performs head imaging, and transfers to another ED within that 2 hour timeframe for Neurology or Neurosurgery consultation, for example, the second ED may not require additional STAT head imaging.

                  Summary

                  See my comments above on use and usability.

                  Importance

                  Importance Rating
                  Importance

                  Clinical guidelines have been in place for a long time. This is a critical measure for the recovery of the patient.

                  Feasibility Acceptance

                  Feasibility Rating
                  Feasibility Acceptance

                  Data elements are being collected

                  Scientific Acceptability

                  Scientific Acceptability Reliability Rating
                  Scientific Acceptability Reliability

                  Measure well defined and data is stable without improvement

                  Scientific Acceptability Validity Rating
                  Scientific Acceptability Validity

                  Validity testing performed

                  Equity

                  Equity Rating
                  Equity

                  Not addressed, but there should be no equity variable for the population.

                  Use and Usability

                  Use and Usability Rating
                  Use and Usability

                  Part of public reporting.  The data may not be used as needed since the data is stable in the 70s% and since this has been in place for a number of years more progress should have been made.

                  Summary

                  Part of public reporting.  The data may not be used as needed since the data is stable in the 70s% and since this has been in place for a number of years more progress should have been made.  This measure must be part of the 5 Star rating and available to the public to view.  

                  First Name
                  Tamaire
                  Last Name
                  Ojeda

                  Submitted by Tamaire Ojeda on Mon, 01/15/2024 - 16:37

                  Permalink

                  Importance

                  Importance Rating
                  Importance

                  There is strong evidence supporting the value of this measure. I support this measure but would like to see the reason why there is a 45-minute time limit for the CT scan or MRI. I am unable to find it in the literature cited.

                  Feasibility Acceptance

                  Feasibility Rating
                  Feasibility Acceptance

                  There is strong evidence that this is a feasible measure. There is a need for the measure established with data of collected score average over the year being fairly stable, even though a higher score indicates better patient outcomes. However, the developer fails to clarify if the proprietary information would be considered a burden to the implementer. 

                  Scientific Acceptability

                  Scientific Acceptability Reliability Rating
                  Scientific Acceptability Reliability

                  Agree with staff assessment. 

                  Scientific Acceptability Validity Rating
                  Scientific Acceptability Validity

                  Agree with staff assessment. 

                  Equity

                  Equity Rating
                  Equity

                  Not addressed in the information submitted. 

                  Use and Usability

                  Use and Usability Rating
                  Use and Usability

                  The usability of this measure is well established by the availability for reporting after several years. The need and benefits are clear. However, I would like to understand why the scores have been stagnant for years. 

                  Summary

                  As far as this measure, it seems like it is feasible and usable. However, I would like to understand better why the 45 minutes post arrival time frame was selected and why the scores over the years have been stagnant as reported. 

                  First Name
                  Matthew
                  Last Name
                  Pickering

                  Submitted by MPickering01 on Fri, 01/19/2024 - 08:22

                  Permalink

                  Importance

                  Importance Rating
                  Importance

                  The importance of timely identification of stroke type and appropriate treatment is clearly essential.

                   

                  I would like to know more about what might interfere with rapid access to scanning. Are there in-hospital factors (e.g., crowded waiting rooms and difficulty identifying potential stroke sufferers, especially those who do not arrive via ambulance)? Education factors (how to expand education to patients/families about the signs of stroke)? SDOH factors (rural areas with lengthy travel times to facilities, racial differences in responsiveness)?

                  Feasibility Acceptance

                  Feasibility Rating
                  Feasibility Acceptance

                  Agree with staff assessment

                  Scientific Acceptability

                  Scientific Acceptability Reliability Rating
                  Scientific Acceptability Reliability

                  Nothing to add to staff assessment.

                  Scientific Acceptability Validity Rating
                  Scientific Acceptability Validity

                  Nothing to add to staff assessment.

                  Equity

                  Equity Rating
                  Equity

                  Nothing to add to staff assessment.

                   

                  The fact that response time and identification of stroke symptoms is slower for women than for men is quite serious and should be further studied and rectified at some point.

                  Use and Usability

                  Use and Usability Rating
                  Use and Usability

                  Agree with staff assessment.

                  Summary

                  No comments.

                  First Name
                  Kyle
                  Last Name
                  Campbell

                  Submitted by Kyle Campbell on Fri, 01/19/2024 - 14:24

                  Permalink

                  Importance

                  Importance Rating
                  Importance

                  Evidence and clinical guidelines linking rapid treatment to improved outcomes, appropriate logic model, measure still has ample room for improvement. Meaningfulness to patients also addressed. Perhaps developer could clarify if guideline recommendations were graded.

                   

                  Feasibility Acceptance

                  Feasibility Rating
                  Feasibility Acceptance

                  Routine data appear to be generated during the course of care, but the measure as specified requires abstraction of data. Since it has been in use for a long time, it is feasible to report. Would be interested to know from the developer if there is a path to support implementation of an eCQM that could be less burdensome on facilities.

                  Scientific Acceptability

                  Scientific Acceptability Reliability Rating
                  Scientific Acceptability Reliability

                  Median and mean reliability scores are above the SMP threshold of 0.6. However, reliability for about 1/3 of facilities is below the 0.6 threshold, and E&M guidance suggests that not just the median or mean should be considered. Increasing the minimum case count from 11 may improve reliability.

                  Scientific Acceptability Validity Rating
                  Scientific Acceptability Validity

                  Data element validity testing indicates high agreement with the gold standard across critical data elements, with interpretation time having the lowest agreement at 85% or 0.83. Hypothesis-driven validity testing results and interpretation do not seem to align, as the developer interprets a Cohen’s d of 0.06 as moderate, whereas the common threshold for a moderate Cohen’s d effect is 0.5. Perhaps the developer could clarify. Otherwise validity seems appropriate.

                  Equity

                  Equity Rating
                  Equity

                  Recognize this is an optional criterion. Was significance testing conducted on stratified results? Data appear to be available to address.

                  Use and Usability

                  Use and Usability Rating
                  Use and Usability

                  Measure is in use in the Outpatient Quality Reporting Program. Measure scores appear to be generally stable over the timeframe, with some improvement noted early in implementation. Measure has a feedback mechanism with no unintended consequences identified. Would like to understand the developer's rationale for lack of improvement.

                  Summary

                  N/A

                  First Name
                  Tammy
                  Last Name
                  Love

                  Submitted by Tammy Jean Love on Fri, 01/19/2024 - 15:04

                  Permalink

                  Importance

                  Importance Rating
                  Importance

                  Agree with the PQM assessment, including the recommendation that suspected stroke patients receive brain imaging studies as soon as possible. Studies, as well as personal success stories of patients I have witnessed, have shown the importance of timely imaging/intervention in acute ischemic stroke to improve patient outcomes.

                  The mean hospital scores, although stable, do show room for improvement overall.

                  Feasibility Acceptance

                  Feasibility Rating
                  Feasibility Acceptance

                  The measure appears to be feasible insofar as data elements are collected in the normal course of care. This measure is currently in use, therefore sites should have processes in place to monitor and drive performance. 

                  Scientific Acceptability

                  Scientific Acceptability Reliability Rating
                  Scientific Acceptability Reliability

                  The measure is well defined and specified.

                  Consider the suggestions by PQM staff to set a higher minimum case volume or extend the time frame. For lower-volume sites, attention should be given to how to keep this best practice a focus for providers.

                  Scientific Acceptability Validity Rating
                  Scientific Acceptability Validity

                  Agree with the limitations for the hypothesis testing confirming a difference in performance for male vs female patients.  How can this measure help sites identify this inequity and drive actions to promote better/quicker recognition for female patients? Are there data showing that improvements on this measure have been made for one gender more than the other (e.g., has the male time dropped but the female time has not, or has the female time improved but still not reached the same timeframe as for males)?

                  Equity

                  Equity Rating
                  Equity

                  Developer did not address this optional criterion

                   

                  Use and Usability

                  Use and Usability Rating
                  Use and Usability

                  Agree - Performance scores continue to show room for improvement but have remained largely stable from 2015-2021.  Would be interested to know how many sites have increased over the 6 years or remain at a stable level or have declined. Is this measure driving the expected improvements in diagnosing/treating ischemic stroke patients since results have been in the low 70% range for over 5 years?

                  Summary

                  Overall, the research behind the measure to drive decreasing the time for diagnosis and treatment is valid. I would like to learn more about what can be done with this measure to actually drive improvement, as the metrics have only been stable for a few years now.

                  First Name
                  Kobi
                  Last Name
                  Ajayi

                  Submitted by Kobi on Sat, 01/20/2024 - 02:15


                  Importance

                  Importance Rating
                  Importance

                  There's strong evidence for this measure for the care of suspected stroke patients in the ED.

                  Feasibility Acceptance

                  Feasibility Rating
                  Feasibility Acceptance

                  The developer did not mention the copyright implication of the measure on physicians and/or hospital systems.

                  Scientific Acceptability

                  Scientific Acceptability Reliability Rating
                  Scientific Acceptability Reliability

                  The measure is suitable and accurately specified.

                  Scientific Acceptability Validity Rating
                  Scientific Acceptability Validity

                  The measure is valid.

                  Equity

                  Equity Rating
                  Equity

                  It's not discussed, even though there are questions about why no data on equity was reported. 

                  Use and Usability

                  Use and Usability Rating
                  Use and Usability

                  I agree with the staff assessment.

                  Summary

                  NA

                  First Name
                  Marianne
                  Last Name
                  Kraemer

                  Submitted by Marianne on Sat, 01/20/2024 - 17:21


                  Importance

                  Importance Rating
                  Importance

                  agree

                  Feasibility Acceptance

                  Feasibility Rating
                  Feasibility Acceptance

                  agree

                  Scientific Acceptability

                  Scientific Acceptability Reliability Rating
                  Scientific Acceptability Reliability

                  agree

                  Scientific Acceptability Validity Rating
                  Scientific Acceptability Validity

                  agree

                  Equity

                  Equity Rating
                  Equity

                  Although past research shows a gender difference in which women have a greater order-to-read time, this needs to be more fully researched and addressed.

                  Use and Usability

                  Use and Usability Rating
                  Use and Usability

                  agree

                  Summary

                  NA

                  First Name
                  Carole
                  Last Name
                  Hemmelgarn

                  Submitted by Carole Hemmelgarn on Sun, 01/21/2024 - 10:52


                  Importance

                  Importance Rating
                  Importance

                  The data and literature continue to show that time to treatment improves patient outcomes and quality of life. This is supported by the Lang article that discusses patient-reported outcomes and quality of life.

                   

                  Feasibility Acceptance

                  Feasibility Rating
                  Feasibility Acceptance

                  Paper abstraction is time consuming.

                  Scientific Acceptability

                  Scientific Acceptability Reliability Rating
                  Scientific Acceptability Reliability

                  Support PQM comments

                  Scientific Acceptability Validity Rating
                  Scientific Acceptability Validity

                  Support PQM comments

                  Equity

                  Equity Rating
                  Equity

                  While they have some data on gender, there is nothing on race and ethnicity. Their data come from one source, the Florida State Stroke registry. I would think there are more race and ethnicity data for strokes, and it would have been good to see them.

                  Use and Usability

                  Use and Usability Rating
                  Use and Usability

                  Like others have mentioned, I am concerned with the lack of change from 2015-2021. If this measure continues on, during the next maintenance review process, there should be scrutiny and discussion if there continues to be minimal change.

                  Summary

                  This measure impacts quality of life for patients and families, so it is important. It is worth highlighting that there have not been any unintended consequences since this measure was implemented.

                  First Name
                  Talia
                  Last Name
                  Sasson

                  Submitted by Talia Sasson on Sun, 01/21/2024 - 15:39


                  Importance

                  Importance Rating
                  Importance

                  The measure submission showed the importance of IV tPA treatment within 2 hours of ischemic stroke to patient outcomes. Therefore, the concept of assessing quality of care by checking the "percentage of acute ischemic stroke or hemorrhagic stroke patients who arrive at the emergency department (ED) within two hours of the onset of symptoms and have a head computed tomography (CT) or magnetic resonance imaging (MRI) scan interpreted within 45 minutes of ED arrival" could be important.

                  Feasibility Acceptance

                  Feasibility Rating
                  Feasibility Acceptance

                  It seems that the data are collectable from an existing database.

                  Scientific Acceptability

                  Scientific Acceptability Reliability Rating
                  Scientific Acceptability Reliability

                  Agree with staff assessment. 

                  Scientific Acceptability Validity Rating
                  Scientific Acceptability Validity

                  agree with staff assessment.

                  Equity

                  Equity Rating
                  Equity

                  Equity not addressed by developer.

                  Use and Usability

                  Use and Usability Rating
                  Use and Usability

                  The rates measured remained stable during the data collection years. The developers should include strategies to improve rates; otherwise there is no point in collecting the data.

                  Summary

                  This is an important measure, but for it to be useful it should address ways to improve performance.

                  First Name
                  Kent
                  Last Name
                  Bream

                  Submitted by Kent Bream M.D. on Sun, 01/21/2024 - 17:41


                  Importance

                  Importance Rating
                  Importance
                  1. The developer presents the existing quality measure including measure characteristics and specifications. While this has a high level of detail, it does not outline the importance of this measure. 
                  2. There are two literature citations: one consensus guideline (Powers et al.) and a review of the guideline by Waqas et al.  These citations use a numerator performance definition of 20 minutes from ED entry to scan interpretation and consider times from last known well of 4.5/6/16/24 hours, rather than the measure's values of 45 minutes to scan interpretation and 2 hours from last known well. These represent different measures, with no rationale given for the current measure standard except that it is an existing standard.
                  3. The Lang et al study supports the importance of this measure to patients. 
                  4. In exploring any performance gaps, there was no performance improvement shown in the last five years per the developer, which provides a continued opportunity for improvement. Beyond this opportunity to keep working on it, the lack of improvement begs the question of why resources should continue to be invested in measurement for a measure that does not change. This does not mean stroke treatment is not important, and an opportunity exists to improve quality. It may mean the current measure is not an important or effective way to improve that quality gap.

                  Feasibility Acceptance

                  Feasibility Rating
                  Feasibility Acceptance

                  This is a maintenance measure that is currently being carried out. 

                   

                  While the AMA CPT use is identified as a cost/fee, the developer does not describe other potential requirements that may impact feasibility. This measure, however, requires chart abstraction to report as specified, which may carry a significant fee or cost.

                  Scientific Acceptability

                  Scientific Acceptability Reliability Rating
                  Scientific Acceptability Reliability

                  Data were abstracted from two existing data sets for a 3.5-year non-continuous period between 2018 and 2021. Patient-level descriptive data are presented, including age, sex, race, ethnicity, and diagnosis. Because this is a facility-level analysis, facility characteristics would be valuable to consider. Urban/exurban location, ownership status, size, teaching status, etc. would correlate to the unit of analysis for the measure. Availability of a specialty stroke service may also influence the quality of care and outcome that the measure seeks to address.

                   

                  A beta-binomial model at the facility level is used and briefly described. This is consistent with the norm for facility-level testing, and the interpretation follows that norm.
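
                  As a rough illustration of why low case counts pull reliability down under a beta-binomial signal-to-noise approach, the sketch below computes reliability as between-facility variance divided by the sum of between-facility and within-facility variance. The alpha and beta values are hypothetical, not the developer's fitted parameters, and the function names are assumptions for illustration only.

```python
def between_facility_variance(alpha, beta):
    """Variance of a Beta(alpha, beta) distribution of true facility rates."""
    total = alpha + beta
    return (alpha * beta) / (total ** 2 * (total + 1))


def signal_to_noise_reliability(rate, n_cases, alpha, beta):
    """Reliability = signal / (signal + noise) for one facility.

    signal: between-facility variance implied by the beta distribution
    noise:  binomial sampling variance of the facility's observed rate
    """
    signal = between_facility_variance(alpha, beta)
    noise = rate * (1 - rate) / n_cases
    return signal / (signal + noise)


# Hypothetical parameters: Beta(7, 3) has a mean rate of 0.70.
ALPHA, BETA = 7, 3

for n in (11, 30, 100):
    r = signal_to_noise_reliability(0.70, n, ALPHA, BETA)
    print(f"n={n:>3}  reliability={r:.2f}")
```

                  With these assumed parameters, a facility at the 11-case minimum lands below the 0.6 threshold while larger facilities clear it, which is the pattern the reliability comments describe.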

                  Scientific Acceptability Validity Rating
                  Scientific Acceptability Validity

                  Empiric validity testing is presented in two ways.

                   

                  One of the validity tests presented is a comparison of the data warehouse (CDW) to the audit group (CDAC). As presented, this seems to be more of a reliability test based on two observations of the data. A use of these two sources for validity testing would be to demonstrate the validity of the audit group methodology in comparison to the data warehouse.

                   

                  The data on patient sex may be an opportunity for discriminant validity to be measured and reported, but this is not adequately described.

                   

                  Otherwise, face validity may be easier to demonstrate with the expert guidelines used in the importance section. 

                  Equity

                  Equity Rating
                  Equity

                  This is an optional element so met was selected. The developer chose not to consider equity in the submission. This is an opportunity for the next maintenance submission. 

                  Use and Usability

                  Use and Usability Rating
                  Use and Usability

                  The measure is currently in use as part of the Hospital OQR. 

                   

                  While the measure has not found any improvement since 2018, the developer suggests a way to change this is "provider education". Other potential opportunities for improvement could be system changes such as the physical workflow in the emergency department including the location of the CT and radiology department or changing IT workflows or standard order sets. Involvement of non-provider staff and public health level education programs are other opportunities. 

                   

                  As described above, however, this measure has shown no improvement since 2018. It is not clear, based on this, how the resources invested in the measure are positively impacting quality. While this is not all the developer's responsibility, there may be opportunities to refine the measure to better measure and impact quality in stroke care.

                  Summary

                  The care of stroke, particularly the detection of hemorrhagic versus embolic stroke, is important for patients, especially those who have a higher burden of vascular diseases in their communities.

                   

                  The current measure has not demonstrated improvement at least since 2018. The measure has not clearly distinguished where improvement has occurred. There may be an opportunity to refine how we measure/evaluate stroke care in order to improve outcomes. While this measure should be kept for the next maintenance period, the measurement should be refined to identify improvement.

                  First Name
                  Gregary
                  Last Name
                  Bocsi

                  Submitted by Greg Bocsi on Mon, 01/22/2024 - 02:36


                  Importance

                  Importance Rating
                  Importance

                  strongly agree with preliminary assessment 

                  Feasibility Acceptance

                  Feasibility Rating
                  Feasibility Acceptance

                  scoring many hospitals over many years also demonstrates feasibility 

                  Scientific Acceptability

                  Scientific Acceptability Reliability Rating
                  Scientific Acceptability Reliability

                  Reliability seems adequate in most instances.  As noted in the preliminary assessment, further investigation of the cause(s) of lower reliability scores in a sizeable group of entities is warranted.

                  Scientific Acceptability Validity Rating
                  Scientific Acceptability Validity

                  the validity testing results were reassuring that validity is adequate and specific threats to validity weren't identified 

                  Equity

                  Equity Rating
                  Equity

                  not addressed

                  Use and Usability

                  Use and Usability Rating
                  Use and Usability

                  Need a rationale for the lack of improvement.  Are there any examples of improved performance anywhere?

                  Summary

                  main concern is apparent lack of improvement followed by or maybe related to reliability for some facilities 

                  First Name
                  Karen
                  Last Name
                  Johnson

                  Submitted by Karen Johnson on Mon, 01/22/2024 - 11:09


                  Importance

                  Importance Rating
                  Importance

                  Description of patient input does not support the conclusion that time-to-interpretation is meaningful for patients (evidence is about tPA administration, not interpretation of images).

                  Feasibility Acceptance

                  Feasibility Rating
                  Feasibility Acceptance

                  Data for the measure are generated during care. Measure uses data from EHRs or other electronic sources.  Measure is being implemented. 

                  Scientific Acceptability

                  Scientific Acceptability Reliability Rating
                  Scientific Acceptability Reliability

                  Specifications are clear. Reliability results are fair overall, although concerning for quite a few facilities. Low case volume may be the reason for the lower reliability numbers although this isn’t clear from the submission. If so, there may not be an “easy fix” since sampling for the measure is proportional to size, with the smallest facilities reporting on all cases (although one option would be to extend the measure timeframe to >1 year).  

                  Scientific Acceptability Validity Rating
                  Scientific Acceptability Validity

                  Validity testing was conducted for data elements and measure score, with adequate results for both.  

                  Equity

                  Equity Rating
                  Equity

                  The validity testing results indicate a difference in performance rates for men vs. women.  Other subpopulation results were shown, although differences between subgroups were not tested.  Additional analysis (especially by insurance status) would be helpful.

                  Use and Usability

                  Use and Usability Rating
                  Use and Usability

                  Measure is in use.  There has been no substantial feedback on the measure or indications of unexpected findings.  Improvement is positive although slight.  Utility for low-volume providers in an accountability application likely is questionable, although the measure should be useful for quality improvement for many low-volume providers.

                  Summary

                  This measure meets most of the requirements for re-endorsement.  The developer should solicit patient input about the meaningfulness of the timeliness of imaging interpretation for patients.  Reliability appears to be less than adequate for some facilities (likely those with low case-volume); while this may impact the decision to use the measure in certain programs, I don’t think it should disqualify the measure for re-endorsement.

                  First Name
                  AC
                  Last Name
                  Comiskey

                  Submitted by Ashley Comiskey on Mon, 01/22/2024 - 12:19


                  Importance

                  Importance Rating
                  Importance

                  Clinical guidelines have been in place for quite some time.  This measure supports early identification of ICH vs ischemic stroke to begin appropriate initial treatment.  

                  Feasibility Acceptance

                  Feasibility Rating
                  Feasibility Acceptance

                  Agree with staff assessment

                  Scientific Acceptability

                  Scientific Acceptability Reliability Rating
                  Scientific Acceptability Reliability

                  no additional comments

                  Scientific Acceptability Validity Rating
                  Scientific Acceptability Validity

                  no additional comments

                  Equity

                  Equity Rating
                  Equity

                  not addressed

                  Use and Usability

                  Use and Usability Rating
                  Use and Usability

                  The measure has been in place for several years in the Outpatient Quality Reporting Program with minimal or no improvement in recent years.  I would like to see the measure steward address barriers to improvement.  

                  Summary

                  I believe this measure is still valid and important to measure.  

                  First Name
                  Jean-Luc
                  Last Name
                  Tilly

                  Submitted by Jean-Luc Tilly on Mon, 01/22/2024 - 13:38


                  Importance

                  Importance Rating
                  Importance

                  Clinical guidelines and systematic reviews clearly support the measure as constructed.

                  Feasibility Acceptance

                  Feasibility Rating
                  Feasibility Acceptance

                  Successfully used in federal reporting.

                  Scientific Acceptability

                  Scientific Acceptability Reliability Rating
                  Scientific Acceptability Reliability

                  Agree with staff assessment that a mitigation strategy for facilities with a low denominator is appropriate; personally my target would be no lower than a .5 reliability score, so something like a 15th percentile and below cutoff.

                  Scientific Acceptability Validity Rating
                  Scientific Acceptability Validity

                  Data element validity is clearly established, but I would have preferred a different approach to empirical validity of the measure score, ideally by correlating this performance measure with other scores on related measures.

                  Equity

                  Equity Rating
                  Equity

                  Not addressed.

                  Use and Usability

                  Use and Usability Rating
                  Use and Usability

                  The measure is in use, and performance scores have been made available and feedback has been solicited. However, performance scores are arguably worsening over time since implementation, and training providers is the only recommended intervention. No meaningful feedback was apparently raised. At a subsequent maintenance review, if performance and available interventions remain as-is, the measure should no longer be considered usable.

                  Summary

                  The key here is whether the developers can articulate a clear case for why the measure should continue to be used if measure scores have not improved, and interventions to improve scores are limited with no evidence presented of a successful implementation. 

                  First Name
                  Matt
                  Last Name
                  Austin

                  Submitted by Matt Austin on Mon, 01/22/2024 - 15:57


                  Importance

                  Importance Rating
                  Importance

                  Clinical guidelines provide support for this measure concept. The measure developers identified a gap in performance (deciles show variation).  Adherence to the process results in improved functional/QOL outcomes for patients.

                  Feasibility Acceptance

                  Feasibility Rating
                  Feasibility Acceptance

                  Neither of these were identified in the submission:

                   

                  • Near-term paths are specified to support routine and electronic data capture with an implementable data collection strategy OR
                  • Required data are routinely generated and used during care, required data are available in EHRs or other electronic sources, and the data collection strategy can be implemented

                  Scientific Acceptability

                  Scientific Acceptability Reliability Rating
                  Scientific Acceptability Reliability

                  Concerns with the reliability testing --  approximately 30-35% of entities have reliability less than 0.6, likely facilities with a low denominator size.

                  Scientific Acceptability Validity Rating
                  Scientific Acceptability Validity

                  No concerns.

                  Equity

                  Equity Rating
                  Equity

                  Did not answer.

                  Use and Usability

                  Use and Usability Rating
                  Use and Usability

                  There has been little improvement over the last 6 years on this measure.  What are the plans to drive improvement?

                  Summary

                  Comments on 0661

                  First Name
                  Jill
                  Last Name
                  Blazier

                  Submitted by Jill Blazier on Mon, 01/22/2024 - 16:02


                  Importance

                  Importance Rating
                  Importance

                  This measure has been in use for a number of years. Logic indicates that faster door-to-imaging times yield better patient outcomes. It is curious that the measure performance has grown stagnant.

                  Feasibility Acceptance

                  Feasibility Rating
                  Feasibility Acceptance

                  This measure has been abstracted for many years. The real trick will be getting abstraction done soon enough to provide physicians with relatively concurrent data from which to learn and improve. 

                  Scientific Acceptability

                  Scientific Acceptability Reliability Rating
                  Scientific Acceptability Reliability

                  Agree with staff assessment.

                  Scientific Acceptability Validity Rating
                  Scientific Acceptability Validity

                  Agree with staff assessment.

                  Equity

                  Equity Rating
                  Equity

                  Developer did not address this optional criterion.

                  Use and Usability

                  Use and Usability Rating
                  Use and Usability

                  Process improvement is definitely a challenge with this measure. I look forward to hearing more comments on this point. This measure, currently in use in CMS HOQR, has been present since 2012 and is publicly reported, but it has not shown improvement since 2015 for reasons that are not articulated.

                   

                  To improve quality, hospitals are expected to educate providers on guidelines for diagnosing and treating ischemic stroke, but the developers do not mention potential QI mechanisms such as providing performance reports to providers. Manually abstracted measures also have a lagged feedback loop to providers unless sites take steps to facilitate this improvement point. 

                  Summary

                  This is an important measure and I look forward to this discussion. This measure has been in CMS HOQR for the last 11 years and performance settled around 2015. I would like to know what are the thoughts around driving improvement in this area, as the developer does not specify. 

                  First Name
                  Hannah
                  Last Name
                  Ingber

                  Submitted by Hannah Ingber on Mon, 01/22/2024 - 17:36


                  Importance

                  Importance Rating
                  Importance

                  Evidence suggests measure would lead to improved outcomes. Credible link between structure, process, outcome. Evidence matches specifications of the measure. Empirical studies provided. Continued variation in measure scores. Patients surveyed find the measure valuable.

                  Feasibility Acceptance

                  Feasibility Rating
                  Feasibility Acceptance

                  Routinely generated from electronic sources. I assume AMA statements are about CPT codes used to capture data, which is the case for many measures and is not a major concern. 

                  Scientific Acceptability

                  Scientific Acceptability Reliability Rating
                  Scientific Acceptability Reliability

                  Technically, the threshold is not met, hence rating. Agree with staff assessment of accountable entity-level reliability ratings falling short of thresholds and potential fix of increasing minimum thresholds to improve reliability. Would hesitate to suggest extending timeframe due to measure's context and rolling, quarterly nature of result collection. This is because, when considered along with the data element validity testing results, I think some of the potential issues with reliability may be addressed. Would like to hear from others who know more about SNR testing drawbacks. Can the results be considered in concert? Or must that be entirely separate? 

                  Scientific Acceptability Validity Rating
                  Scientific Acceptability Validity

                  Would like to know a little more about any issues with data element "Head CT/MRI Scan Interpretation Time" because this seems essential. But Gwet's AC1 (which appears to be a valid use of this test from outside research--would have liked more explanation of this choice) results suggest there isn't a strong concern. Would like to know if developer has a plan for addressing this difference in this data element, though. Hypothesis testing makes sense to me although would appreciate more explanation of the results.

                  Equity

                  Equity Rating
                  Equity

                  not required.

                  Use and Usability

                  Use and Usability Rating
                  Use and Usability

                  Agree with staff assessment--would like to know more about why results have not improved over time.

                  Summary

                  Would like just a little more explanation for a few findings. See above.

                  First Name
                  Selena
                  Last Name
                  McCord

                  Submitted by Selena McCord on Mon, 01/22/2024 - 19:58

                  Permalink

                  Importance

                  Importance Rating
                  Importance

                  The developer cites recommendations that support the measure. In addition, the developer cited a survey to show meaningfulness to patients who would be impacted by this process measure.

                  Feasibility Acceptance

                  Feasibility Rating
                  Feasibility Acceptance

                  There are no concerns regarding feasibility as the measure is being submitted for maintenance. 

                  Scientific Acceptability

                  Scientific Acceptability Reliability Rating
                  Scientific Acceptability Reliability

                  In agreement with staff assessment.

                  Scientific Acceptability Validity Rating
                  Scientific Acceptability Validity

                  In agreement with staff assessment.

                  Equity

                  Equity Rating
                  Equity

                  Not addressed by the developer as this was optional.

                  Use and Usability

                  Use and Usability Rating
                  Use and Usability

                  In agreement with staff assessment.

                  Summary

                  None.

                  First Name
                  Helen
                  Last Name
                  Haskell

                  Submitted by Helen W Haskell on Mon, 01/22/2024 - 21:53

                  Permalink

                  Importance

                  Importance Rating
                  Importance

                  Prompt imaging and accurate treatment of stroke patients are essential. This is an important measure that has been in place for some years. The important question raised is why there has been no improvement in the years it has been in place.

                  Feasibility Acceptance

                  Feasibility Rating
                  Feasibility Acceptance

                  This appears to have a history of effective data collection. It is, however, a relatively complex measure that would be time-consuming to do manually.

                  Scientific Acceptability

                  Scientific Acceptability Reliability Rating
                  Scientific Acceptability Reliability

                  Higher minimum case volume should be considered.

                  Scientific Acceptability Validity Rating
                  Scientific Acceptability Validity

                  Gender differences in performance need more exploration.

                  Equity

                  Equity Rating
                  Equity

                  Not addressed.

                  Use and Usability

                  Use and Usability Rating
                  Use and Usability

                  The measure does not seem to have had an effect on performance. Would like some discussion of the reasons for this and ways it could be addressed. Would also like more details on the reporting of samples.

                  Summary

                  This is an important measure but the lack of effect over the years should be addressed.