Kidney Health Evaluation

CBE ID

4315e

1.5 Project

Initial Recognition and Management

Endorsement Status

Not Endorsed

E&M Committee Rationale/Justification

Not Endorsed due to No Consensus

1.0 New or Maintenance

New

Previous Endorsement Cycle

Spring 2024

Is Under Review

1.6 Measure Description

Percentage of patients aged 18-85 years with a diagnosis of diabetes who received a kidney health evaluation defined by an Estimated Glomerular Filtration Rate (eGFR) AND Urine Albumin-Creatinine Ratio (uACR) within the 12-month measurement period

Measure Specs

General Information

1.7 Measure Type

Process

1.7 Composite Measure

1.3 Electronic Clinical Quality Measure (eCQM)

Yes

1.8 Level of Analysis

Clinician: Individual

1.9 Care Setting

Clinician Office/Clinic

1.10 Measure Rationale

Chronic Kidney Disease (CKD) is a major driver of morbidity, mortality and high healthcare costs in the United States. Currently, 37 million American adults have CKD and millions of others are at increased risk (National Kidney Foundation [NKF], 2022), with an estimated population prevalence growing to nearly 17% among Americans aged 30 years and older by the year 2030 (Saran et al., 2019; Hoerger et al., 2015). Total Medicare spending in 2016 on both CKD and End-Stage Renal Disease (ESRD) was over $114 billion, comprising 23% of total Medicare fee-for-service spending overall with costs increasing exponentially with advancing CKD (Saran et al., 2019; Nichols et al., 2020). In the US from 2002-2016, the burden of CKD, defined as years of life lost, years living with disability, disability-adjusted life years, and deaths, outpaced changes in the burden of disease for other conditions (Bowe et al., 2018). Patients with CKD are readmitted to the hospital more frequently than those without diagnosed CKD (Saran et al., 2019). CKD is the 9th leading cause of death in the US and is the fastest growing non-communicable disease in terms of burden, largely due to death (Hoerger et al., 2015; Bowe et al., 2018). This public health issue is driven largely by the impact of diabetes, the most common comorbid risk factor for CKD (Saran et al., 2019; Bowe et al., 2018).

The intent of this process measure is to improve rates of guideline-concordant kidney health evaluation in patients with diabetes to more consistently identify and potentially treat or delay progression of CKD in this high-risk population. Annual kidney health evaluation in patients with diabetes to determine risk of CKD using estimated glomerular filtration rate (eGFR) and urine albumin-creatinine ratio (uACR) is recommended by clinical practice guidelines (American Diabetes Association, 2023; de Boer et al., 2022; NKF, 2007; NKF, 2012) and has been a focus of various local and national health care quality improvement initiatives, including Healthy People 2030 (Healthy People 2030, 2023). However, performance of these tests in patients with diabetes remains low, with rates that vary across Medicare (41.8%) and private insurers (49.0%) (Saran et al., 2019; Alfego et al., 2021; Stempneiwicz et al., 2021). Low rates of detection of CKD in a population of patients with diabetes have been demonstrated to be associated with low patient awareness of their own kidney health status (Szczech et al., 2014). Indeed, 90% of individuals with CKD are unaware of their condition due to under-recognition and under-diagnosis (Saran et al., 2019; Centers for Disease Control and Prevention, 2023). Currently, an individual’s lifetime probability of developing CKD is relatively high, reaching 54% for someone currently aged 30-49 years (Hoerger et al., 2015). Regular kidney health evaluations, utilizing both eGFR and uACR, provide an opportunity to improve identification and potential reversal of worsening kidney function, particularly in high-risk populations, such as those with diabetes.

This measure replaces and improves upon the previous Merit-based Incentive Payment System (MIPS) medical attention for nephropathy measure. It is more specific because it requires use of the eGFR and uACR tests to assess a patient’s kidney health.

References:

Alfego, D., Ennis, J., Gillespie, B., Lewis, M.J., Montgomery, E., Ferrè, S., … Letovsky, S. (2021). Chronic kidney disease testing among at-risk adults in the U.S. remains low: Real-world evidence from a National Laboratory database. Diabetes Care, 44(9), 2025-2032. https://doi.org/10.2337/dc21-0723

American Diabetes Association Professional Practice Committee. (2023). Chronic kidney disease and risk management: Standards of medical care in diabetes—2023. Diabetes Care, 46(Supplement_1), S191-S202. https://doi.org/10.2337/dc23-S011

Bowe, B., Xie, Y., Li, T., Mokdad, A. H., Xian, H., Yan, Y.,... Al-Aly, Z. (2018). Changes in the US burden of chronic kidney disease from 2002 to 2016. JAMA Network Open, 1(7).

Centers for Disease Control and Prevention. Chronic Kidney Disease in the United States. (2023). Retrieved from: https://www.cdc.gov/kidneydisease/publications-resources/ckd-national-f…

de Boer, I.H., Khunti, K., Sadusky, T., Tuttle, K.R., Neumiller, J.J., Rhee, Bakris, G. (2022). Diabetes management in chronic kidney disease: a consensus report by the American Diabetes Association (ADA) and Kidney Disease: Improving Global Outcomes (KDIGO). Kidney International, 102(5):974-989. doi: 10.1016/j.kint.2022.08.012

Healthy People 2030. Retrieved from: https://health.gov/healthypeople/objectives-and-data/browse-objectives/…

Hoerger, T. J., Simpson, S. A., Yarnoff, B. O., Pavkov, M. E., Burrows, N. R., Saydah, S. H., . . . Zhuo, X. (2015). The future burden of CKD in the United States: A simulation model for the CDC CKD Initiative. American Journal of Kidney Diseases, 65(3), 403-411. doi:10.1053/j.ajkd.2014.09.023

National Kidney Foundation. (2007). KDOQI Clinical practice guidelines and clinical practice recommendations for diabetes and chronic kidney disease. Retrieved from: https://www.kidney.org/sites/default/files/docs/diabetes_ajkd_febsuppl_…

National Kidney Foundation. (2012). KDOQI Clinical practice guideline for diabetes and CKD: 2012 Update. Retrieved from: http://www.kidney.org/sites/default/files/docs/diabetes-ckd-update-2012…

National Kidney Foundation. (2022). About chronic kidney disease. Retrieved from: https://www.kidney.org/atoz/content/about-chronic-kidney-disease

Nichols, G.A., Ustyugova, A., Déruaz-Luyet, A., O’Keeffe-Rosetti, M., & Brodovicz, K.G. (2020). Health care costs by type of expenditure across eGFR stages among patients with and without diabetes, cardiovascular disease, and heart failure. Journal of the American Society of Nephrology, 31(7), 1594-1601. DOI: https://doi.org/10.1681/asn.2019121308

Saran R. B., Abbott K. C., Agodoa, L.Y.C., Bragg-Gresham, J., Balkrishnan, R., Shahinian, V. (2019). US renal data system 2018 annual data report: Epidemiology of kidney disease in the United States. American Journal of Kidney Diseases, 73(3). DOI: https://doi.org/10.1053/j.ajkd.2019.01.001

Stempneiwicz, N., Vassalotti, J.A., Cuddeback, J.K., Ciemins, E., Storfer-Isser, A., Sang, Y., … Coresh, J. (2021). Chronic kidney disease testing among primary care patients with type 2 diabetes across 24 U.S. health care organizations. Diabetes Care, 44(9), 2000-2009. DOI: https://doi.org/10.2337/dc20-2715

Szczech, L. A., Stewart, R. C., Su, H., Deloskey, R. J., Astor, B. C., Fox, C. H., . . . Vassalotti, J. A. (2014). Primary care detection of chronic kidney disease in adults with type-2 diabetes: The ADD-CKD study (Awareness, detection and drug therapy in type 2 diabetes and chronic kidney disease). PLoS ONE, 9(11). DOI: https://doi.org/10.1371/journal.pone.0110535 

1.11 Measure Webpage

https://ecqi.healthit.gov/ecqm/ec/2023/cms0951v1

1.20 Types of Data Sources

Electronic Health Records

1.25 Data Source Details

Practices collect EHR data using certified electronic health record technology (CEHRT). The MAT output, which includes the human-readable and XML artifacts of the clinical quality language (CQL) for the measure, is contained in the attached eCQM specifications. No additional tools are used for data collection for eCQMs.

Exclusions

1.15b Denominator Exclusions

Patients with a diagnosis of ESRD active during the measurement period.

Patients with a diagnosis of CKD Stage 5 active during the measurement period.

Patients who have an order for or are receiving hospice or palliative care.

1.15c Denominator Exclusions Details

The denominator exclusions are patients who meet the following criteria during the measurement period:

Patients with a diagnosis of ESRD active during the measurement period
Patients with a diagnosis of CKD Stage 5 active during the measurement period
Patients who have an order for or are receiving hospice or palliative care

All data elements necessary to calculate these denominator exclusions are defined in the eCQM specifications, and the associated value sets are available in the Value Set Authority Center (VSAC):

valueset "Chronic Kidney Disease, Stage 5" (2.16.840.1.113883.3.526.3.1002)
valueset "End Stage Renal Disease" (2.16.840.1.113883.3.526.3.353)
valueset "Hospice Care Ambulatory" (2.16.840.1.113883.3.526.3.1584)
valueset "Hospice Diagnosis" (2.16.840.1.113883.3.464.1003.1165)
valueset "Hospice Encounter" (2.16.840.1.113883.3.464.1003.1003)
valueset "Palliative Care Diagnosis" (2.16.840.1.113883.3.464.1003.1167)
valueset "Palliative Care Encounter" (2.16.840.1.113883.3.464.1003.101.12.1090)
valueset "Palliative Care Intervention" (2.16.840.1.113883.3.464.1003.198.12.1135)
code "Discharge to healthcare facility for hospice care (procedure)" ("SNOMEDCT Code (428371000124100)")
code "Discharge to home for hospice care (procedure)" ("SNOMEDCT Code (428361000124107)")
code "Functional Assessment of Chronic Illness Therapy - Palliative Care Questionnaire (FACIT-Pal)" ("LOINC Code (71007-9)")
code "Hospice care [Minimum Data Set]" ("LOINC Code (45755-6)")
code "Yes (qualifier value)" ("SNOMEDCT Code (373066001)")

To access the value sets for the measure, please visit the Value Set Authority Center (VSAC), sponsored by the National Library of Medicine, at https://vsac.nlm.nih.gov/.
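The measure description and denominator exclusions above can be sketched as a small calculation. This is a minimal illustration, not the eCQM logic itself (which is defined in CQL in the eCQM specifications). The `Patient` fields, the `age_on` and `performance_rate` names, and the choice to evaluate age at the start of the measurement period are all assumptions made for the example; the sketch also omits any encounter requirement and treats exclusions as simple flags rather than value-set lookups.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Patient:
    """Hypothetical, simplified patient record for illustration only."""
    birth_date: date
    has_diabetes: bool
    egfr_dates: List[date] = field(default_factory=list)
    uacr_dates: List[date] = field(default_factory=list)
    has_esrd: bool = False
    has_ckd_stage_5: bool = False
    hospice_or_palliative: bool = False


def age_on(birth_date: date, on_date: date) -> int:
    """Age in completed years on a given date."""
    before_birthday = (on_date.month, on_date.day) < (birth_date.month, birth_date.day)
    return on_date.year - birth_date.year - int(before_birthday)


def performance_rate(patients: List[Patient], start: date, end: date) -> Optional[float]:
    """Percentage of eligible patients with BOTH an eGFR and a uACR in the period."""

    def tested(dates: List[date]) -> bool:
        return any(start <= d <= end for d in dates)

    denominator = 0
    numerator = 0
    for p in patients:
        # Initial population: diabetes diagnosis, aged 18-85
        # (age evaluated at period start: an assumption of this sketch).
        if not p.has_diabetes or not 18 <= age_on(p.birth_date, start) <= 85:
            continue
        # Denominator exclusions: ESRD, CKD Stage 5, hospice or palliative care.
        if p.has_esrd or p.has_ckd_stage_5 or p.hospice_or_palliative:
            continue
        denominator += 1
        # Numerator requires both tests, not either one.
        if tested(p.egfr_dates) and tested(p.uacr_dates):
            numerator += 1
    return None if denominator == 0 else 100.0 * numerator / denominator
```

Because an excluded patient is removed from the denominator rather than counted as a miss, failing to capture an exclusion can only lower or leave unchanged the calculated rate.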

Importance

Evidence

2.1 Attach Logic Model

NKF Kidney Health Evaluation Measure Logic Model Gaps and Testing 5-3-24.pdf

2.2 Evidence of Measure Importance

This clinical quality measure is based on two evidence-based clinical guidelines from the National Kidney Foundation (NKF), from 2007 and 2012, and the American Diabetes Association (ADA), from 2023. These guidelines explicitly recommended eGFR and uACR laboratory testing in patients with a diagnosis of diabetes. Kidney health evaluations, utilizing both eGFR and uACR tests, among patients with a diagnosis of diabetes provide an opportunity to improve identification and potential reversal of worsening kidney function.

The following evidence statements are quoted verbatim from the referenced clinical guidelines and other sources, where applicable:

American Diabetes Association Professional Practice Committee, 2023

At least annually, urinary albumin (e.g., spot urinary albumin-to-creatinine ratio) and estimated glomerular filtration rate should be assessed in people with type 1 diabetes with duration of ≥5 years and in all people with type 2 diabetes regardless of treatment. Level of Evidence: B

National Kidney Foundation, 2007 and 2012

Patients with diabetes should be screened annually for Diabetic Kidney Disease (DKD). Initial screening should commence:

5 years after the diagnosis of type 1 diabetes; (Quality of Evidence: A) or
From diagnosis of type 2 diabetes. (Quality of Evidence: B)

Screening should include:

Measurements of urinary albumin-creatinine ratio (ACR) in a spot urine sample; (Quality of Evidence: B)
Measurement of serum creatinine and estimation of GFR. (Quality of Evidence: B)

References:

National Kidney Foundation. (2012). KDOQI Clinical practice guideline for diabetes and CKD: 2012 Update. Retrieved from: http://www.kidney.org/sites/default/files/docs/diabetes-ckd-update-2012…

Measure Impact

2.3 Anticipated Impact

CKD is asymptomatic at onset. Clinicians and patients can only learn about the presence of CKD through routine testing for CKD among people who are at risk.

At this writing, 90% of people living with chronic kidney disease remain undetected in primary care settings, including as many as 40% of people whose kidneys have already failed (Centers for Disease Control and Prevention, 2023; Szczech et al., 2014). Overall care for people with a CKD diagnosis is also suboptimal, as most people do not receive guideline-directed medical therapy (GDMT) (Tummalapalli et al., 2019). Approximately 50% of people with advanced CKD receive nephrology care (United States Renal Data System, 2021).

Recent publications suggest that improvements in CKD testing, particularly the use of urine albumin-creatinine ratio, increase the probability of people with CKD receiving GDMT in primary care settings (Chu et al., 2023). Studies of newer interventions, such as sodium-glucose cotransporter 2 inhibitors (SGLT2i) (Heerspink et al., 2020; Herrington et al., 2023; Perkovic et al., 2019) or non-steroidal mineralocorticoid receptor antagonists (nsMRA) (Bakris et al., 2020), have demonstrated significant reductions in CKD progression, associated cardiovascular events, and related utilization. A recent retrospective cohort study illustrated that good CKD disease management in CKD Stage 3 and Stage 4 (defined as CKD testing, diagnosis, risk factor management, and use of basic interventions to address proteinuria) could yield as much as a 40% reduction in inpatient hospitalization, a 30% reduction in emergency room visits, and a decrease in monthly healthcare costs of as much as 17% (Li et al., 2023).

Improved CKD testing creates opportunities to slow CKD progression, reduce the rising cardiovascular risk associated with it, and reduce utilization, while few adverse events are associated with the two widely available, inexpensive tests used in this measure.

References:

Bakris GL, Agarwal R, Anker SD, et al. Effect of Finerenone on Chronic Kidney Disease Outcomes in Type 2 Diabetes. New England Journal of Medicine 2020;383(23):2219-2229. DOI: 10.1056/NEJMoa2025845.

Centers for Disease Control and Prevention. Chronic Kidney Disease in the United States, 2023. Retrieved from: https://www.cdc.gov/kidneydisease/publications-resources/ckd-national-f…

Chu CD, Xia F, Du Y, et al. Estimated Prevalence and Testing for Albuminuria in US Adults at Risk for Chronic Kidney Disease. JAMA Netw Open 2023;6(7):e2326230. (In eng). DOI: 10.1001/jamanetworkopen.2023.26230.

Heerspink HJL, Stefánsson BV, Correa-Rotter R, et al. Dapagliflozin in Patients with Chronic Kidney Disease. N Engl J Med 2020;383(15):1436-1446. (In eng). DOI: 10.1056/NEJMoa2024816.

Herrington WG, Staplin N, Wanner C, et al. Empagliflozin in Patients with Chronic Kidney Disease. N Engl J Med 2023;388(2):117-127. (In eng). DOI: 10.1056/NEJMoa2204233.

Li Y, Barve K, Cockrell M, et al. Managing comorbidities in chronic kidney disease reduces utilization and costs. BMC Health Serv Res 2023;23(1):1418. (In eng). DOI: 10.1186/s12913-023-10424-8.

Perkovic V, Jardine MJ, Neal B, et al. Canagliflozin and Renal Outcomes in Type 2 Diabetes and Nephropathy. N Engl J Med 2019;380(24):2295-2306. (In eng). DOI: 10.1056/NEJMoa1811744.

Tummalapalli SL, Powe NR, Keyhani S. Trends in Quality of Care for Patients with CKD in the United States. Clinical Journal of the American Society of Nephrology 2019;14(8):1142-1150. DOI: 10.2215/cjn.00060119.

United States Renal Data System. 2021 USRDS Annual Data Report: Epidemiology of kidney disease in the United States. National Institutes of Health, National Institute of Diabetes and Digestive and Kidney Diseases, Bethesda, MD, 2021. (https://adr.usrds.org/2021)

2.5 Health Care Quality Landscape

This measure replaces and improves upon the previous Merit-based Incentive Payment System (MIPS) medical attention for nephropathy measure. The rationale in section 1.10 provides additional context on why this measure is needed.

2.6 Meaningfulness to Target Population

This measure was developed with input from a technical expert panel (TEP), which included patient and caregiver representation. These individuals were also trained in measure development prior to participating on the TEP. Generally, patients express the wish to have been made aware of their kidney health, or diagnosed with CKD, earlier, as it would have given them opportunities to make better lifestyle choices and to engage in decision-making.

Equity

3.1 Contributions Toward Closing Care Gaps

As shown in Table 4 (Characteristics of denominators), the patient demographic distribution varies. Patients’ ages ranged from 19 to 85 years. More specifically, 10.4% of patients were aged 18-45, 15.1% aged 46-55, 25.8% aged 56-65, 31.0% aged 66-75, and 17.7% aged 76-85. Female patients made up 50.7%; by race, 85.5% were White and 7.6% were Black; by Hispanic ethnicity, N/HL patients made up 91.7%. The demographic characteristics are therefore representative.

We also analyzed performance scores by each demographic variable; the results are presented in Table 9. The distribution of performance scores differs across the categories of each variable. For example, among age groups, patients aged 56-65 and 66-75 have higher performance scores; there is no obvious difference between male and female patients; and the median score is 7.5 for White patients and 0.0 for Black patients.

Two recent studies of Americans with diabetes demonstrated disparities in uACR testing, with lower testing rates among Black or African American patients and socioeconomically disadvantaged groups (Medicare-Medicaid dually eligible, lower neighborhood income, lower education status) (Ferrè et al., 2023; Bhave et al., 2024). This incrementally contributes to kidney health inequities that include higher rates of type 2 diabetes, more rapid CKD progression, lower access to kidney protective therapies such as sodium-glucose cotransporter 2 inhibitors (SGLT2i), reduced access to nephrology services, and kidney failure replacement inequities (default in-center hemodialysis and lower access to patient-centric home dialysis and kidney transplant).

See Tables 4 & 9 found in the attachment labeled: NKF Kidney Health Evaluation Measure Logic Model Gaps and Testing.

References:

Ferrè, S., A. Storfer-Isser, K. Kinderknecht, E. Montgomery, M. Godwin, A. Andrews, S. Dunning, M. Barton, D. Roman, J. Cuddeback, N. Stempniewicz, C. D. Chu, D. S. Tuot and J. A. Vassalotti. Fulfillment and Validity of the Kidney Health Evaluation Measure for People with Diabetes. Mayo Clin Proc Innov Qual Outcomes. 2023 Aug 29;7(5):382-391. doi: 10.1016/j.mayocpiqo.2023.07.002. eCollection 2023 Oct.

Bhave NM, Han Y, Steffick D, Bragg-Gresham J, Zivin K, Burrows NR, Pavkov ME, Tuot D, Powe NR, Saran R. Assessing trends and variability in outpatient dual testing for chronic kidney disease with urine albumin and serum creatinine, 2009-2018: a retrospective cohort study in the Veterans Health Administration System. BMJ Open. 2024 Feb 12;14(2):e073136. doi: 10.1136/bmjopen-2023-073136.

Feasibility

4.1 Feasibility Assessment

All of the data used to calculate this measure are routinely generated and used during care delivery. Using the feasibility scorecard, all data elements were assessed to determine the extent to which the data could be collected from the electronic health record systems of two ambulatory practices. Findings and refinements to the measure based on this evaluation are described in section 4.3.

4.2 Attach Feasibility Scorecard

NKF Kidney Health Evaluation Feasibility Scorecard rev.2 5-3-2024.xlsx

4.3 Feasibility Informed Final Measure

Each practice provided information about data availability, data accuracy, data standards, workflow, and burden to collect and report data. All required data elements were able to be collected, although some data elements were not currently able to be captured in structured fields. Several data elements were captured via free text. Barriers to feasibility are primarily related to the ability to capture data elements for hospice and palliative care in an ambulatory setting. Without the ability to capture all applicable denominator exclusion data elements, the overall performance calculation may be impacted, as the denominator or eligible population would not be reduced by the number of patients for whom the measure is not applicable. Therefore, it could mean a lower calculated performance score. However, it is also possible that some patients who would have met the elements not captured also possess one or more of the other exclusions that are feasible, so they would be appropriately removed. Data elements critical to the calculation of the measure score are feasible and the measure was considered feasible for implementation. The measure was also tested in BONNIE in 2020 and the measure logic performs as expected. The measure has 100% coverage and all 19 of the test cases are passing, within the BONNIE system.

Proprietary Information

4.4 Proprietary Information

Proprietary measure or components (e.g., risk model, codes), without fees

4.4a Fees, Licensing, or Other Requirements

Physician Performance Measures (Measures) and related data specifications developed by the National Kidney Foundation (NKF) are intended to facilitate quality improvement activities by health care professionals.

These Measures are intended to assist health care professionals in enhancing quality of care. These Measures are not clinical guidelines and do not establish a standard of medical care and have not been tested for all potential applications. NKF encourages testing and evaluation of its Measures.

Measures are subject to review and may be revised or rescinded at any time by NKF. The measures may not be altered without prior written approval from NKF. The measures, while copyrighted, can be reproduced and distributed, without modification, for noncommercial purposes. Commercial use is defined as the sale, license, or distribution of the measures for commercial gain, or incorporation of the measures into a product or service that is sold, licensed, or distributed for commercial gain. Commercial uses of the measures require a license agreement between the user and NKF. Neither NKF nor its members shall be responsible for any use of the measures.

THESE MEASURES AND SPECIFICATIONS ARE PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND.

Limited proprietary coding is contained in the Measure specifications for convenience. Users of the proprietary code sets should obtain all necessary licenses from the owners of these code sets.

CPT(R) contained in the Measure specifications is copyright 2004-2023 American Medical Association. LOINC(R) is copyright 2004-2023 Regenstrief Institute, Inc. This material contains SNOMED Clinical Terms(R) (SNOMED CT[R]) copyright 2004-2023 International Health Terminology Standards Development Organisation. ICD-10 is copyright 2023 World Health Organization. All Rights Reserved.

The PCPI’s and American Medical Association (AMA)’s significant past efforts and contributions to the development and updating of the measure are acknowledged.

Due to technical limitations, registered trademarks are indicated by (R) or [R].

Scientific Acceptability

Testing Data

5.1.1 Data Used for Testing

Two ambulatory test sites were selected with different electronic health record systems (EHRs) (Allscripts and AthenaNet) in the Midwest. Site 1 included 59 clinicians and 1 clinician provided care at Site 2. Data on 2,950 patients from these two practices from January 1, 2019 to December 31, 2019 were used for the reliability and validity testing.

5.1.2 Differences in Data

Data element validity testing used the same data described in section 5.1.1, but only a sample of 85 patients randomly selected at each site (170 patients total) was used for the medical record abstraction.

5.1.3 Characteristics of Measured Entities

A total of 60 entities (i.e., clinicians) are included in the measure. As shown in Table 3, the median size, measured as the number of patients per clinician, is 24, with an interquartile range from 5 to 67; the minimum is only 1 patient and the maximum is 306 patients.

See Table 3 found in the attachment labeled: NKF Kidney Health Evaluation Measure Logic Model Gaps and Testing.

5.1.4 Characteristics of Units of the Eligible Population

A total of 2,950 patients are included in the measure analysis. The mean (STD) age is 63.2 years (13.0 years); the median age is 65 years, with an interquartile range from 55 to 72 years; the minimum and maximum ages are 19 and 85 years, respectively. Other characteristics, including age group, sex, race, and Hispanic ethnicity, are presented in Table 4.

See Table 4 found in the attachment labeled: NKF Kidney Health Evaluation Measure Logic Model Gaps and Testing.

Reliability

5.2.1 Level(s) of Reliability Testing Conducted

Accountable entity level (i.e., measure score) (e.g., signal-to-noise analysis)

5.2.2 Method(s) of Reliability Testing

To assess signal-to-noise, we employed the beta-binomial model as described by JL Adams in “The Reliability of Provider Profiling” (Adams, JL. The reliability of provider profiling: A tutorial. RAND Health, 2009). Using the techniques detailed in that document, we estimated the clinician-to-clinician variance (the signal) and the within-clinician variance (the noise). The ratio of the signal to the total variance (signal plus noise) then produced an estimate of the reliability for each clinician, where a reliability of 0 implies that all variability is due to measurement error, while a reliability of 1 indicates that all variability is due to real differences in performance. Lower levels of reliability (0.7-0.8) are considered acceptable for drawing conclusions about groups. Generally speaking, 0.8 is a very good reliability. The distribution of reliability estimates across all clinicians was examined. The reliability equation is given below and results are presented in Table 5.

Reliability = clinician-to-clinician variance / (clinician-to-clinician variance + within-clinician variance)
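The signal-to-noise calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used for testing: it takes the beta-binomial shape parameters `alpha` and `beta` as given inputs (fitting them to the clinician data is out of scope here), uses the variance of the corresponding beta distribution as the clinician-to-clinician variance, and uses the binomial sampling variance p(1 - p)/n as the within-clinician variance. The function names and example values are hypothetical.

```python
def between_clinician_variance(alpha: float, beta: float) -> float:
    """Variance of a Beta(alpha, beta) distribution: the 'signal'."""
    total = alpha + beta
    return (alpha * beta) / (total ** 2 * (total + 1))


def clinician_reliability(numerator: int, denominator: int,
                          alpha: float, beta: float) -> float:
    """Reliability = signal / (signal + noise) for one clinician.

    numerator / denominator is the clinician's observed performance rate;
    the noise term is the binomial sampling variance of that rate.
    """
    p = numerator / denominator
    signal = between_clinician_variance(alpha, beta)
    noise = p * (1 - p) / denominator
    return signal / (signal + noise)


# Hypothetical example: a clinician with 12 of 24 patients meeting the
# numerator, under assumed shape parameters alpha = beta = 1.
example = clinician_reliability(12, 24, alpha=1.0, beta=1.0)
```

Under this formulation, a clinician whose observed rate is exactly 0% or 100% has a noise term of zero and therefore a reliability of 1, regardless of panel size, so reliability values of 1.0 for small panels should be read with that in mind.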

See Table 5 found in the attachment labeled: NKF Kidney Health Evaluation Measure Logic Model Gaps and Testing.

5.2.3 Reliability Testing Results

As shown in Table 5, the mean reliability is 0.74 and the STD is 0.27; the median reliability is 0.80, with an interquartile range from 0.64 to 1.0, across the 60 clinicians with at least one patient meeting the denominator criteria. The distribution of the reliability score by decile, along with the total number of entities and total patients in each decile range, is provided in Table 6.

See Tables 5 & 6 found in the attachment labeled: NKF Kidney Health Evaluation Measure Logic Model Gaps and Testing.

5.2.3a Attach Additional Reliability Testing Results

NKF Kidney Health Evaluation Measure Logic Model Gaps and Testing 5-3-24_1.pdf

5.2.4 Interpretation of Reliability Results

Reliability is the measure of whether you can tell one clinician from another. As discussed above, a reliability of 1 indicates that all variability is due to real differences in performance. Our median reliability of 0.8 allows us to draw conclusions about clinician performance. In addition, about 72% of clinicians (43 out of 60) had reliability >0.7; 18% of clinicians (11 out of 60) had reliability between 0.54 and 0.66; and only 10% of clinicians (6 out of 60) had reliability <0.3. The small number of clinicians with lower reliability could be due to different reasons, such as a small practice with few patients or larger variances in performance scores.

Validity

5.3.1 Level(s) of Validity Testing Conducted

Person or encounter level (i.e., data element) (e.g., sensitivity and specificity)

5.3.3 Method(s) of Validity Testing

We conducted data element validity testing for two practices with two EHR systems: Site 1 (Allscripts) and Site 2 (AthenaNet). Each practice was asked to implement the eCQM specifications and produce an electronic report of the data elements required for the measure. Trained abstractors then reviewed the medical record for the presence or absence of the same data elements on a randomly selected set of patients, and the results from the electronic report were compared against the medical record (gold standard). The sample size for this analysis was 85 patients randomly selected at each site, 170 patients total.

First, for each data element, we calculated the percentage of agreement between the data used in the analysis and the data from the gold standard. We defined “agreement” as both sources reporting the same value. Second, we calculated the Kappa coefficient, which is a measure of interrater agreement. When there is perfect agreement between the two ratings, the kappa coefficient equals +1. When the observed agreement exceeds chance agreement, the value of kappa is positive, and its magnitude reflects the strength of agreement. The minimum value of kappa is between –1 and 0, depending on the marginal proportions. A value of kappa higher than 0.75 can be considered (arbitrarily) as "excellent" agreement, while a value lower than 0.4 indicates "poor" agreement.
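The two agreement statistics described above can be sketched for a single binary data element (present or absent). This is an illustrative sketch only; the function names are hypothetical, and the inputs are assumed to be two equal-length lists of booleans, one from the electronic report and one from chart abstraction.

```python
from typing import List, Optional


def percent_agreement(ehr: List[bool], chart: List[bool]) -> float:
    """Percentage of patients for whom both sources report the same value."""
    matches = sum(1 for e, c in zip(ehr, chart) if e == c)
    return 100.0 * matches / len(ehr)


def cohens_kappa(ehr: List[bool], chart: List[bool]) -> Optional[float]:
    """Cohen's kappa: (observed - expected) / (1 - expected).

    Returns None when chance agreement is 1 (both sources report the same
    single value for every patient), where kappa is undefined.
    """
    n = len(ehr)
    observed = sum(1 for e, c in zip(ehr, chart) if e == c) / n
    p_ehr_yes = sum(ehr) / n
    p_chart_yes = sum(chart) / n
    # Agreement expected by chance, from the marginal proportions.
    expected = p_ehr_yes * p_chart_yes + (1 - p_ehr_yes) * (1 - p_chart_yes)
    if expected == 1.0:
        return None
    return (observed - expected) / (1 - expected)
```

Kappa can be near zero or negative even when percent agreement is high, which happens when one value dominates in both sources; this is why both statistics are reported side by side.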

In addition, we also calculated sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) for the individual data elements. Using the diabetes diagnosis data element as an example, these analyses tell us the following:

Sensitivity is the probability that a person is reported as having a diagnosis of diabetes among those who truly have that diagnosis documented in the “gold standard”.
Specificity is the probability that a person is reported as not having a diagnosis of diabetes among those who truly do not have that diagnosis documented in the “gold standard”.
PPV is the probability that a person truly has a diagnosis of diabetes among those reported as having diabetes.
NPV is the probability that a person truly does not have a diagnosis of diabetes among those reported as not having that diagnosis.
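The four statistics above all derive from the same 2x2 table of electronic report versus gold standard. A minimal sketch follows, assuming two equal-length lists of booleans for one data element; the function name and dictionary keys are hypothetical.

```python
from typing import Dict, List, Optional


def diagnostic_metrics(ehr: List[bool], chart: List[bool]) -> Dict[str, Optional[float]]:
    """Sensitivity, specificity, PPV and NPV of the electronic report,
    treating chart abstraction as the gold standard."""
    tp = sum(1 for e, c in zip(ehr, chart) if e and c)          # reported, truly present
    fp = sum(1 for e, c in zip(ehr, chart) if e and not c)      # reported, truly absent
    fn = sum(1 for e, c in zip(ehr, chart) if not e and c)      # not reported, truly present
    tn = sum(1 for e, c in zip(ehr, chart) if not e and not c)  # not reported, truly absent

    def ratio(num: int, den: int) -> Optional[float]:
        # Undefined when the conditioning group is empty.
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # conditioned on truly present
        "specificity": ratio(tn, tn + fp),  # conditioned on truly absent
        "ppv": ratio(tp, tp + fp),          # conditioned on reported present
        "npv": ratio(tn, tn + fn),          # conditioned on reported absent
    }
```

Sensitivity and specificity condition on the gold standard, while PPV and NPV condition on what was reported, so the pairs can differ substantially when a data element is rare.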

Overall agreement rates for the numerator, denominator, and exclusions were also calculated to better demonstrate how the data elements collectively perform at those levels.

5.3.4 Validity Testing Results

The test results are presented in Table 7 for Site 1 and Site 2, respectively. Testing of the denominator indicated an overall agreement rate of 94%, with an agreement rate of 91% at site 1 and 98% at site 2. Agreement is very high as expected.

Testing for the exclusions indicated an overall agreement rate of 87%, with an agreement rate of 84% at site 1 and 91% at site 2. There were 21 cases where data was available through abstraction and not electronic reporting. Exclusions include indicators often reported in inpatient settings such as palliative and/or hospice care. In many healthcare systems including ambulatory only, these elements are available for electronic reporting and agreement is expected to be higher. Agreement is high as expected.

Testing for the numerator indicated an overall agreement rate of 50%, with an agreement rate of 69% at site 1 and 31% at site 2. There were 78 cases where data was available through abstraction and not electronic reporting. Of these 78 cases, 57 cases were attributed to site 2. The overall agreement and the agreement for site 1 are moderate.

Analysis of the individual data elements demonstrated that the percentage of agreement ranges from 38.8% to 100% for Site 1 and from 68.2% to 100% for Site 2. Most of these percentages are greater than 50% (only one element had an agreement rate of 38.8%). Kappa ranges from -0.018 to 1.00.

See Table 7 found in the attachment labeled: NKF Kidney Health Evaluation Measure Logic Model Gaps and Testing.

5.3.4a Attach Additional Validity Testing Results

NKF Kidney Health Evaluation Measure Logic Model Gaps and Testing 5-3-24_2.pdf

5.3.5 Interpretation of Validity Results

Overall agreement rates for the data elements required for the denominator and exclusions were very high and high, respectively, and agreement is considered moderate for those data elements required for the numerator.

Several factors impacted the numerator agreement. One is that uACR results are sometimes reported by the laboratory as “unable to calculate” because the urine albumin concentration is below the analytic detectable threshold. Another is that the clinical sites are part of a system that only provides ambulatory care and have more limitations in accessing laboratory data in discrete fields than is typically seen in other healthcare systems. Much of the eGFR and uACR testing is completed through smaller laboratories, and results are often scanned into the EHR or arrive from the laboratory with limited data in discrete fields. These discrepancies do not highlight an issue with the data flow from the EHR to the measure, but rather with the data transfer from the laboratory to the EHR, where the data is less granular and therefore more challenging to capture.

A recent publication provided a solution for the “unable to calculate” laboratory reporting problem by recommending that results be rounded and that a numerical value with a less-than sign be reported, which allows for electronic data capture (Miller et al., 2019). The NKF is collaborating with the NKF Laboratory Engagement Initiative to implement this solution (National Kidney Foundation, 2024). Regarding the scanning issue, we are confident that the data in most large healthcare systems will be available in discrete fields, and that systems facing these barriers can develop appropriate data workflows and implement appropriate mapping and labelling of laboratory tests as specified in this measure. Many systems implementing existing eCQMs have demonstrated the ability to make the necessary changes to enable electronic reporting.
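
To illustrate why a numerical value with a less-than sign supports electronic data capture while free text does not, the following minimal Python sketch parses a result string into a comparator and a number. The function name parse_result, the LabValue type, and the example strings are hypothetical; the sketch does not reproduce the exact reporting format recommended by Miller et al. or any specific laboratory interface.

```python
import re
from typing import NamedTuple, Optional


class LabValue(NamedTuple):
    comparator: str  # "<", ">", or "=" when no sign is given
    value: float


# Optional "<" or ">" followed by a plain decimal number.
_PATTERN = re.compile(r"^\s*([<>]?)\s*(\d+(?:\.\d+)?)\s*$")


def parse_result(text: str) -> Optional[LabValue]:
    """Return a structured value, or None if the text is not machine-readable."""
    match = _PATTERN.match(text)
    if match is None:
        # Free text such as "unable to calculate" cannot be stored discretely.
        return None
    return LabValue(match.group(1) or "=", float(match.group(2)))


# Hypothetical result strings for illustration only.
print(parse_result("<3"))                   # structured: comparator and number
print(parse_result("27.5"))                 # structured: plain number
print(parse_result("unable to calculate"))  # not structured
```

Under this sketch, a result reported with a less-than sign can still be stored in a discrete field and counted toward the numerator, whereas the free-text result cannot.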

References:

Miller WG, Bachmann LM, Fleming JK, Delanghe JR, Parsa A, Narva AS; Laboratory Working Group of the National Kidney Disease Education Program and the IFCC Working Group for Standardization of Albumin in Urine. Recommendations for Reporting Low and High Values for Urine Albumin and Total Protein. Clin Chem. 2019 Feb;65(2):349-350. doi: 10.1373/clinchem.2018.297861. Epub 2018 Nov 20.

National Kidney Foundation. (2024). Laboratory Engagement Initiative. Retrieved from: https://www.kidney.org/content/laboratory-engagement-initiative-lei

5.3.2 Type of Accountable Entity Level Validity Testing Conducted (derived)

Not applicable/accountable entity level validity testing not conducted

Use & Usability

Use

6.1.1 Current Status

In use

6.1.2 Current or Planned Use(s)

Payment Program

6.1.3 Program Details

Name of the program and sponsor

Centers for Medicare & Medicaid Services (CMS) Merit-Based Incentive Payment System (MIPS)

Geographic area and percentage of accountable entities and patients included

National

Applicable level of analysis and care setting

Individual clinician and clinician group; outpatient/ambulatory care.

Usability

6.2.1 Actions of Measured Entities to Improve Performance

Two actions entities must take to ensure successful performance on this measure include education of cross-functional clinical care teams and engagement with laboratory leadership to ensure accurate calculation and reporting of kidney profile results.

Fulfillment of the KED measure for measured entities is under approximately 40% for the population with diabetes who should be receiving annual eGFR and uACR testing according to clinical practice guideline recommendations (Ferrè et al., 2023). To increase fulfillment, clinical care team education on the importance of screening at-risk populations and targeted albuminuria testing is important. Once clinical care teams are engaged, collaboration with laboratory leadership is essential to confirm that measured values are being captured accurately, as issues may arise in the calculation of the uACR if urine albumin levels are below a detectable range.

Reference:

Ferrè S, Storfer-Isser A, Kinderknecht K, et al. Fulfillment and Validity of the Kidney Health Evaluation Measure for People with Diabetes. Mayo Clin Proc Innov Qual Outcomes. 2023;7(5):382-391. Published 2023 Aug 29. doi:10.1016/j.mayocpiqo.2023.07.002

6.2.2 Feedback on Measure Performance

Feedback on current measure performance and implementation is received through public inquiries via the Jira-issue response process governed by Mathematica and the Centers for Medicare & Medicaid Services (CMS) for measure stewards of eCQMs in the quality reporting programs. Significant feedback obtained on the measure during past maintenance cycles has focused on expansion of several value sets, expansion of the initial population age range to increase alignment with the National Committee for Quality Assurance (NCQA) Healthcare Effectiveness Data and Information Set (HEDIS) version of this measure, and clarifying the timing of the denominator.

6.2.3 Consideration of Measure Feedback

Based on feedback received on measure specifications and implementation, we modified the measure to increase the age range of the initial population from 18-75 to 18-85. We also added several new LOINC codes to the eGFR value set.

6.2.4 Progress on Improvement

This measure was implemented in MIPS beginning in 2023 and we are not yet able to track progress on improvement. NKF will continue to evaluate the degree to which clinicians are able to demonstrate higher rates in future years.

6.2.5 Unexpected Findings

Unexpected findings during the implementation of this measure include differences in measure application and testing ability in rural versus urban settings. Specifically, NKF received a request to consider including urine protein testing because one practice in a rural area in Texas identified that some laboratory services in their area did not include uACR; therefore, they were unable to meet the measure. Because the evidence outlined in the Kidney Disease: Improving Global Outcomes (KDIGO) clinical practice guidelines clearly shows that assays to measure albumin are more precise and sensitive than assays to measure urine protein, NKF was unable to agree to this expansion (KDIGO, 2024). To date, only one comment voicing this concern has been submitted through the CMS processes, and as a result we do not believe that the issue is widespread. NKF will continue to monitor feedback and performance results to ensure that the measure continues to perform as intended.

References:

Kidney Disease: Improving Global Outcomes (KDIGO) CKD Work Group. KDIGO 2024 Clinical Practice Guideline for the Evaluation and Management of Chronic Kidney Disease. Kidney Int. 2024;105(4S): S117–S314.

Comments

Public Comments

Public Comment Shared During May 29 Listening Session

This is an important measure. I'm glad it's an eCQM. And that you [the developer] added the secondary to the eGFR. I'm glad that there's another indicator there, because I think that's important. There's some questions on eGFR with some of the different race and ethnicities. So thank you.

Organization

Janice Tufte

align MIPS with Star Ratings

UnitedHealthcare recommends this measure be modified to align with NCQA’s Kidney Health Evaluation for Patients With Diabetes (KED) HEDIS measure. Similar to CMS’s initiatives with the Universal Foundation, using the same measure in MIPS and Medicare Advantage Star Ratings will reduce provider burden. The NCQA version more precisely measures kidney health evaluation.

Organization

UnitedHealthcare

4315e Kidney Health Evaluation

The American Medical Association (AMA) questions whether this measure produces scores that can be considered sufficiently reliable, given that the minimum reliability was 0.042, which falls well below what the AMA considers to be acceptable. We believe that measures must achieve a minimum reliability of at least 0.7 in order to achieve endorsement. We are also concerned with the low agreement rates for the two data elements required for the numerator and believe that additional testing is needed to demonstrate improved validity of the underlying data.

Organization

American Medical Association

Comment on the 4315e measure on Kidney Health Evaluation

The 2023 American Diabetes Association guidelines suggest that older adults with diabetes merit individualized recommendations that consider life expectancy, comorbidity burden, and greater prevalence of cognitive and functional limitations (American Diabetes Association, Recommendation 13.8, https://diabetesjournals.org/care/article/46/Supplement_1/S216/148044/1… [diabetesjournals.org]). However, the age range for this measure is quite large and does not account for these considerations.

There is precedent for this modification of guidelines given the American Diabetes Association recommends less-stringent glycemic goals for older adults who have significant co-morbidity burden, cognitive impairment, or functional impairment. (Recommendation 13.10, American Diabetes Association, Standards for Care in Diabetes—2023). Similar recommendations were also made by the National Kidney Foundation (Recommendation 2.3, KDOQI Diabetes Guideline: 2012 Update at www.kidney.org/sites/default/files/docs/diabetes-ckd-update-2012.pdf [kidney.org]).

To reduce the likelihood of inappropriate and low-value testing, the American Geriatrics Society (AGS) recommends lowering the age limit to 75 years (e.g., screen individuals from 18-75 years). The average life expectancy for men is approximately 75 years and for women is approximately 80 years (Centers for Disease Control, https://www.cdc.gov/nchs/fastats/life-expectancy.htm [cdc.gov]). While an imperfect approach, such a change would reduce the likelihood of including persons with less than five years of life expectancy. If desired, this recommendation could be stratified by sex, with a cutoff of 70 years for men and 75 years for women though this may hinder implementation.

Organization

American Geriatrics Society

Staff Preliminary Assessment

CBE #4315e Staff Assessment

Importance

Importance Rating

Met

Importance

Strengths:

The developer provides a logic model illustrating that annual kidney health evaluations using eGFR and uACR laboratory testing in diabetic patients can lead to increased patient awareness, early CKD diagnosis/treatment, and eventually, a decreased incidence of progression to kidney failure and cardiovascular disease. The developer highlights that CKD is asymptomatic at onset and that 90% of people living with CKD remain undetected in primary care settings.
The developer cites clinical practice guidelines from two sources, National Kidney Foundation (2007 and 2012) and American Diabetes Association (2023), that recommend eGFR and uACR laboratory testing in patients with diabetes at least annually. Guidelines were rated as moderate or high with level of evidence from well-conducted studies.
The developer assessed performance scores using data for calendar year 2019 in 60 clinicians across two practices. The difference between the minimum (0%) and maximum (100%) scores, as well as the range of mean scores across deciles (0% to 40.7%) suggest variations in performance among entities being measured.
The developer notes that this measure is intended to replace and improve upon the previous MIPS measure, Diabetes: Medical Attention for Nephropathy as it requires eGFR and uACR.
The developer explains that the measure was developed with input from patient and caregiver representatives serving on the Technical Expert Panel. Patients expressed that earlier awareness of their kidney health would have allowed them opportunities for decision making and lifestyle choices, suggesting the measure’s utility and meaningfulness to the patient population.

Limitations:

The NKF and ADA clinical practice guidelines specify annual screening should commence in patients with type 1 diabetes 5 years after diagnosis and from diagnosis in patients with type 2 diabetes. The measure specification includes patients with a diagnosis of diabetes at the start of the measurement period, regardless of the duration of the diagnosis. It is unclear if this nuance was discussed with a technical expert panel or other experts and why this decision was made.

Rationale:

There is a business case for the measure along with supporting evidence for the importance of the measured process with demonstrated gap in performance.

Closing Care Gaps

Feasibility Assessment

Feasibility Assessment Rating

Not met but addressable

Feasibility Assessment

Strengths:

The developer conducted a feasibility assessment in two ambulatory practices representing two electronic health record vendors, Allscripts and AthenaHealth. The developer provided the required eCQM Feasibility Scorecard. The developer states that all required data elements used to calculate this measure are routinely generated and used during care delivery. The developer notes that, while copyrighted, the measure can be reproduced and distributed (without modification) for noncommercial purposes and used commercially with a license agreement between the user and NKF, indicating that the measure can be used without substantial burden.

Limitations:

The developer explained that several data elements used in the measure were available in the electronic health record (e.g., as a provider note or scanned document) but not as a structured field. These data elements are used to exclude patients who are receiving hospice care or to capture patients in the denominator. The practices included in the feasibility assessment did not provide hospice services, home healthcare services, or outpatient office consultations. However, the developer anticipates that because these data elements are coded in a similar manner as Encounter, Performed: Annual Wellness Visit (a feasible data element), other practices who do care for hospice patients or provide these services would be able to feasibly capture and report the data elements. Additional feasibility testing to assess feasibility of these data elements in other sites should be considered by the developer.

Rationale:

The developer conducted a feasibility assessment across two ambulatory practices representing two electronic health record vendors, Allscripts and AthenaHealth. The developer identified some data elements that, while available in the electronic health record, were not in structured fields. The developer anticipates that these data elements will be feasible to collect in practices that provide home health care services, outpatient consultations, or care for hospice patients as they are coded in a similar manner as Encounter, Performed: Annual Wellness Visit (a feasible data element). Additional feasibility testing to assess feasibility of these data elements in other sites should be considered by the developer. The measure can be reproduced and distributed (without modification) for noncommercial purposes and used commercially with a license agreement between the user and NKF, indicating that the measure can be used without substantial burden.

Scientific Acceptability

Scientific Acceptability Reliability Rating

Not met but addressable

Scientific Acceptability Reliability

Strengths:

The measure is clear and well defined.
Signal-to-noise reliability for at least 75% of clinicians is above the threshold of 0.6 with a median of 0.8.

Limitations:

Low number of entities in reliability calculations.
At least six clinicians included in the reliability analysis had one patient each. These clinicians would have reliability equal to one because the within variability (noise) for these clinicians would be zero since there are not multiple patients.

Rationale:

Table 3 shows that the 10th percentile for number of patients per clinician is equal to one, meaning at least six clinicians included in the analysis have only one patient each. These clinicians would each have reliability equal to one because they do not have multiple patients so within-clinician variability is zero.
To confirm that signal-to-noise reliability truly meets the threshold, clinicians included in the reliability analysis should have multiple patients in order to estimate within-clinician variability.
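
The single-patient effect described above can be sketched with a simple signal-to-noise calculation in Python. This is an illustration only: it assumes reliability is computed as between-clinician variance divided by the sum of between-clinician variance and a binomial within-clinician variance, which may differ from the developer's estimator. The function name signal_to_noise and the between-clinician variance of 0.04 are hypothetical.

```python
def signal_to_noise(var_between: float, numerator: int, denominator: int) -> float:
    """Reliability = signal / (signal + noise) for one clinician.

    var_between: between-clinician variance (signal); must be > 0.
    numerator / denominator: the clinician's observed performance rate.
    """
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    p = numerator / denominator
    # Binomial sampling variance of the observed rate (noise).
    var_within = p * (1 - p) / denominator
    return var_between / (var_between + var_within)


# Hypothetical between-clinician variance for illustration only.
VAR_BETWEEN = 0.04

# One patient: the rate is 0 or 1, so the estimated noise is zero and
# reliability is exactly 1 regardless of the true signal.
one_patient_met = signal_to_noise(VAR_BETWEEN, numerator=1, denominator=1)
one_patient_not_met = signal_to_noise(VAR_BETWEEN, numerator=0, denominator=1)

# Twenty patients with 8 meeting the numerator: noise is non-zero.
twenty_patients = signal_to_noise(VAR_BETWEEN, numerator=8, denominator=20)

print(one_patient_met, one_patient_not_met, round(twenty_patients, 3))
```

Under these assumptions, single-patient clinicians contribute reliability values of one that reflect the estimator rather than true measurement precision, which is consistent with the recommendation to restrict the analysis to clinicians with multiple patients.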

Scientific Acceptability Validity Rating

Not met but addressable

Scientific Acceptability Validity

Strengths:

The developer conducted data element validity testing in two ambulatory practices representing two electronic health record vendors, Allscripts and AthenaHealth. The developer randomly sampled 170 patients across the two sites, and trained abstractors compared the results from electronic implementation of the measure to the medical record (gold standard). The developer found percent agreement rates to be high for the denominator (94% overall agreement) and denominator exclusions (84% overall agreement).

Limitations:

The developer found percent agreement rates for the numerator to be moderate (50% overall agreement). The developer provided kappa values for critical data elements, several of which were lower than 0.4 (indicating “poor” agreement): eGFR (sites 1 and 2), uACR (sites 1 and 2), and CKD Stage 5 (site 2). The developer explained several factors that impacted numerator agreement, including 1) uACR results sometimes being reported by the lab as “unable to calculate” because concentration levels are below the detectable threshold and 2) issues with laboratory data not being in discrete fields. To address the “unable to calculate” issue, the developer described a proposed solution in which results are rounded and a numerical value with a less-than sign is reported, which allows for electronic data capture. Additionally, the developer explained that while the clinical sites that participated in testing are part of a system that provides ambulatory care and have more limitations in accessing laboratory data in discrete fields, in many healthcare systems (including ambulatory only) these elements are available for electronic reporting and agreement is expected to be higher. Additional testing to assess numerator validity in other sites should be considered by the developer.

Rationale:

The developer conducted data element validity testing in two ambulatory practices representing two electronic health record vendors, Allscripts and AthenaHealth. The developer found percent agreement rates to be high for the denominator (94% overall agreement) and denominator exclusions (84% overall agreement) and moderate for the numerator (50% overall agreement). Kappa values for two data elements critical to the calculation of the numerator (eGFR and uACR) were lower than 0.4 (indicating “poor” agreement). The developer explained that while the clinical sites that participated in testing are part of a system that provides ambulatory care and have more limitations in accessing laboratory data in discrete fields, in many healthcare systems (including ambulatory only) these elements are available for electronic reporting and agreement is expected to be higher. Additional testing to assess numerator validity in other sites should be considered by the developer.

Use and Usability

Committee Independent Review

Valuable measure, with need of further evaluation

do not support

Great concept that needs re-testing

Support if issues can be addressed

Strongly Support

CBE #4315e - Kidney Health Evaluation

Do not support

Important subject but…

support